Gentoo Forums
Gentoo Forums
Gentoo Forums
Quick Search: in
Wasting hours in an offhand way
View unanswered posts
View posts from last 24 hours

Goto page 1, 2  Next  
Reply to topic    Gentoo Forums Forum Index Gentoo Chat
View previous topic :: View next topic  
Author Message
pjp
Administrator
Administrator


Joined: 16 Apr 2002
Posts: 20067

PostPosted: Thu Jun 08, 2023 8:15 pm    Post subject: Wasting hours in an offhand way Reply with quote

I've been working on a program to process command line arguments.

A function converts an argv string to a number, and returns that number. Mixing valid return values with some "magic" value indicating error didn't seem like a good approach. So I used a custom true / false error type to indicate failure in converting the string to a value.

Handling a section of optional arguments with optional values appeared to have a logic error I couldn't find. I rewrote that section a couple of different ways, creating different errors in the process.

I finally thought I had a cleaner approach to the logic and adapted that example to the program. And, it still didn't produce correct results. It was failing to convert a number when it should have been able to. I finally discovered that the number was being converted correctly. It wasn't being used because I didn't reset the error condition to false for use with the next argument.

Yeah. I spent too many hours on that across multiple days.

I'll eventually try doing it with pointer math again. At least the errors incrementing the pointer too much were obvious :)
_________________
Quis separabit? Quo animo?
Back to top
View user's profile Send private message
gorg86
Apprentice
Apprentice


Joined: 20 May 2011
Posts: 299

PostPosted: Thu Jun 08, 2023 8:58 pm    Post subject: Reply with quote

Stuff like that happens to all of us :)
Back to top
View user's profile Send private message
pjp
Administrator
Administrator


Joined: 16 Apr 2002
Posts: 20067

PostPosted: Thu Jun 08, 2023 9:18 pm    Post subject: Reply with quote

I know, I was just particularly annoyed because there isn't a lot I can learn from it. I wasn't even noticing that part of the code, despite using it and checking that it was "working." I suppose coming up with cleaner / easier to read logic helped convince me that it wasn't a persistent logic problem. It wasn't until I confirmed (again) that the function received the argv string correctly that I looked more closely at the error value. I had previously even noticed that the value was being ignored, leaving it at the initialized zero. But "it was the logic error" causing that failure.

I suppose my lesson learned is to improve at evaluating code to be clearer and becoming more proficient at rewrites. I had already not liked that entire function, but was only having problems with one section.
_________________
Quis separabit? Quo animo?
Back to top
View user's profile Send private message
szatox
Advocate
Advocate


Joined: 27 Aug 2013
Posts: 3131

PostPosted: Thu Jun 08, 2023 10:14 pm    Post subject: Reply with quote

The lesson here is:
"Always initialize variables" ;)

I've recently stumbled over some video on rules for programming space devices - things which must not fail, because nobody's driving there to unbrick them.
Well, I'm not THAT much of a programmer, so not sure how well the whole list fares, but one of said rules said "declare variables in the smallest scope possible".
Back to top
View user's profile Send private message
Hu
Moderator
Moderator


Joined: 06 Mar 2007
Posts: 21607

PostPosted: Fri Jun 09, 2023 1:28 am    Post subject: Reply with quote

I agree with minimal scope, but I would argue that always initializing a variable is actually counterproductive. You are better off only initializing the variable when you have a useful value to assign to it, and enabling compiler warnings to detect when the variable might be used uninitialized. I've dealt with buggy programs where the bug was that the variable was initialized at declaration, usually to zero, then never given a real value later. Since the variable was initialized, the compiler did not warn that it was read back without being given a meaningful value. If the variable had been written only when a real value was available, the compiler could have warned when it found a path that read an uninitialized value.
Back to top
View user's profile Send private message
pjp
Administrator
Administrator


Joined: 16 Apr 2002
Posts: 20067

PostPosted: Fri Jun 09, 2023 2:15 am    Post subject: Reply with quote

szatox wrote:
The lesson here is:
"Always initialize variables" ;)

I've recently stumbled over some video on rules for programming space devices - things which must not fail, because nobody's driving there to unbrick them.
Well, I'm not THAT much of a programmer, so not sure how well the whole list fares, but one of said rules said "declare variables in the smallest scope possible".
I'd be curious to see that video.

In this case, initialization wasn't the problem. The first two lines of the function declare and define the Error variable with a false (no error) value, followed by a pointer to that variable. The issue was that I didn't reset the value after or before each subsequent reuse. At least that doesn't seem like an initialization issue as I understand initialization. My fix was to reset it to false at the beginning of the str_to_int function.


As for smallest scope declaration, I'm warming up to that, such as in for (int i = 0; ...). Although even there, I've come across wanting access to i outside of the loop. As Hu mentioned, I've also come across recommendations against always initializing variables. I understand the concept, but not necessarily the nuance. Is the main purpose to not initialize only during development? Or is it situational, that can only be understood with experience? How does that compare to using "debug values" such as DEADBEEF? Generally I just initialize.

With my issue regarding the error variable, definition was in the smallest scope, which happened to be at the beginning of the function. The subsequent "work" of the function makes sure argc is valid, followed by processing of argv values where the error variable is used.

For my use, I could use multiple error values and localize them to if blocks, but that seems less desirable to me. Even more so if I try to turn the code into a generic solution.
_________________
Quis separabit? Quo animo?
Back to top
View user's profile Send private message
Hu
Moderator
Moderator


Joined: 06 Mar 2007
Posts: 21607

PostPosted: Fri Jun 09, 2023 2:55 am    Post subject: Reply with quote

Assuming we are discussing C/C++, modern compilers can warn if the compiler proves the existence of a path which will read a local variable without first writing to it. Initializing the variable to any placeholder value, whether 0 or DEADBEEF, convinces the compiler that the variable has been written, and prevents it from warning about a read-before-write because it sees a write and cannot determine that this write is not useful. Initializing at declaration is fine when the initialization value is useful, but counterproductive if the value is a placeholder you intended to replace on every path through the function. In the latter case, writing to the variable only when you have something useful to put there helps the compiler warn you about paths where you failed to write anything useful.

For programs where you omit an initialization at declaration, I prefer to have it omitted unconditionally, and rely on compiler warnings to prevent you ever releasing a build where the variable is uninitialized. Recent compilers have an option to generate a hidden initialization for such variables, and since it is generated by the compiler, it does not count as a "write" for the purpose of the read-before-write warning, which only considers writes that are actually visible in the source code.
Back to top
View user's profile Send private message
stefan11111
l33t
l33t


Joined: 29 Jan 2023
Posts: 922
Location: Romania

PostPosted: Fri Jun 09, 2023 5:03 am    Post subject: Reply with quote

Couldn't you have just used atoi or strtol?
If you don't want to use errno, just check if when the function returns 0, it's because of an error, or because it's actually 0.
pjp wrote:
I'd be curious to see that video.

Probably this:
https://www.youtube.com/watch?v=GWYhtksrmhE
pjp wrote:

As for smallest scope declaration, I'm warming up to that, such as in for (int i = 0; ...). Although even there, I've come across wanting access to i outside of the loop.

Afaik, c90 doesn't support this. Though this only matters if extreme portability is a concern. I generally define things like this, as c99 compatibility is usually enough.
pjp wrote:

As Hu mentioned, I've also come across recommendations against always initializing variables. I understand the concept, but not necessarily the nuance. Is the main purpose to not initialize only during development? Or is it situational, that can only be understood with experience? How does that compare to using "debug values" such as DEADBEEF? Generally I just initialize.

I usually initialize only if I need to. I don't want to waste cpu cycles if I don't have to.
_________________
My overlay: https://github.com/stefan11111/stefan_overlay
INSTALL_MASK="/etc/systemd /lib/systemd /usr/lib/systemd /usr/lib/modules-load.d *udev* /usr/lib/tmpfiles.d *tmpfiles* /var/lib/dbus /usr/bin/gdbus /lib/udev"
Back to top
View user's profile Send private message
GDH-gentoo
Veteran
Veteran


Joined: 20 Jul 2019
Posts: 1528
Location: South America

PostPosted: Fri Jun 09, 2023 1:20 pm    Post subject: Reply with quote

pjp wrote:
The first two lines of the function declare and define the Error variable with a false (no error) value, followed by a pointer to that variable. The issue was that I didn't reset the value after or before each subsequent reuse. [...] My fix was to reset it to false at the beginning of the str_to_int function.

It is quite weird for me to see a programming error getting described in prose, instead of with code snippets :head scratch:

How did the error variable 'persist' between successive uses? Is there a pointer to a global variable (i. e. with static storage duration) involved?
_________________
NeddySeagoon wrote:
I'm not a witch, I'm a retired electronics engineer :)
Ionen wrote:
As a packager I just don't want things to get messier with weird build systems and multiple toolchains requirements though :)
Back to top
View user's profile Send private message
pjp
Administrator
Administrator


Joined: 16 Apr 2002
Posts: 20067

PostPosted: Sat Jun 10, 2023 3:13 am    Post subject: Reply with quote

Hu wrote:
Assuming we are discussing C/C++, modern compilers can warn if the compiler proves the existence of a path which will read a local variable without first writing to it. Initializing the variable to any placeholder value, whether 0 or DEADBEEF, convinces the compiler that the variable has been written, and prevents it from warning about a read-before-write because it sees a write and cannot determine that this write is not useful. Initializing at declaration is fine when the initialization value is useful, but counterproductive if the value is a placeholder you intended to replace on every path through the function. In the latter case, writing to the variable only when you have something useful to put there helps the compiler warn you about paths where you failed to write anything useful.

For programs where you omit an initialization at declaration, I prefer to have it omitted unconditionally, and rely on compiler warnings to prevent you ever releasing a build where the variable is uninitialized. Recent compilers have an option to generate a hidden initialization for such variables, and since it is generated by the compiler, it does not count as a "write" for the purpose of the read-before-write warning, which only considers writes that are actually visible in the source code.
Interesting, thank you. C for now, although I plan to try "C++ without the bad parts" eventually.

I don't know if I've ever initialized with anything other than placeholder values, so I'll try not doing that. Allowing the compiler to find problems seems like a good thing.
_________________
Quis separabit? Quo animo?
Back to top
View user's profile Send private message
pjp
Administrator
Administrator


Joined: 16 Apr 2002
Posts: 20067

PostPosted: Sat Jun 10, 2023 3:29 am    Post subject: Reply with quote

stefan11111 wrote:
Couldn't you have just used atoi or strtol?
If you don't want to use errno, just check if when the function returns 0, it's because of an error, or because it's actually 0.
strtol does the actual conversion. My str_to_int wraps it and the the error handling. The example in 'man strtol' has 10 lines of error handling alone. Conversion is done multiple times during the same execution of the calling function. For now at least, the str_to_int wrapper seems like an improvement.

stefan11111 wrote:
Probably this:
https://www.youtube.com/watch?v=GWYhtksrmhE
Thanks! I'll watch it soon.

[quote="stefan11111"]Afaik, c90 doesn't support this. Though this only matters if extreme portability is a concern. I generally define things like this, as c99 compatibility is usually enough.
Quote:
I still don't like it. It just seems "wrong" to not have everything at the beginning. I don't have any meaningful experience, but I did start a long time ago, so it is still fairly ingrained.

[quote="stefan11111"]I usually initialize only if I need to. I don't want to waste cpu cycles if I don't have to.
Havning had no experience to inform a decision, I've gone with whatever seemed to be the "consensus," which at least for a while seemed to lean toward initialize. Given the apparent compiler advantage of not initializing with placeholders, it seems worthwhile to break that habit.
_________________
Quis separabit? Quo animo?
Back to top
View user's profile Send private message
pjp
Administrator
Administrator


Joined: 16 Apr 2002
Posts: 20067

PostPosted: Sat Jun 10, 2023 3:50 am    Post subject: Reply with quote

GDH-gentoo wrote:
It is quite weird for me to see a programming error getting described in prose, instead of with code snippets :head scratch:
:) I initially thought to post the 3 lines of code, then decided it was pointless. I was originally going to write something like "I did initialize them", which I then thought could be inferred to have a negative tone. So I added more detail. *shrug*

GDH-gentoo wrote:
How did the error variable 'persist' between successive uses? Is there a pointer to a global variable (i. e. with static storage duration) involved?
The value is within scope for each use.

Code:
int argv_handler(... stuff ...) {
    struct Error err = { .code = F };
    struct Error errp = &err;

    if (argc == ...) {
        ... do stuff ...
        if (the next argv could optionally be a value) {
            i = str_to_int(whatever, *errp)
            ... do stuff ...
        }
    }
    ... do stuff...
    // including other argc == checks and additional calls to str_to_int
}
Having written that out, it would have been easier to continue from where I said it was in scope with: the argv_handler function calls str_to_int multiple times depending on the arguments, which are evaluated with a series of 'if argc ==' checks (among others). I almost deleted the code block and the comments afterward. But I guess I prefer to avoid terseness that could easily be misconstrued.
_________________
Quis separabit? Quo animo?
Back to top
View user's profile Send private message
pjp
Administrator
Administrator


Joined: 16 Apr 2002
Posts: 20067

PostPosted: Sat Jun 10, 2023 5:29 am    Post subject: Reply with quote

stefan11111 wrote:
Probably this:
https://www.youtube.com/watch?v=GWYhtksrmhE
Interesting. I do some of them already, and aim to do others.

That do / while loops don't have a built in limiter syntax makes me wonder how often they're useful. At some point it became apparent that I should always put in a hard limit to prevent infinite loops.

I hope to avoid the heap as long as possible (or at least only use it when it is truly necessary).

I also try to limit function size. Although that can be a challenge even when still only doing "one action" (depending on what is meant by one action).

I try to check return values, but I had forgotten about printf. Doing so is in partly why I have str_to_int wrapping strtol.

And I use -Wall -Wextra and -pedantic. I'll add -Werror (I thought I had it, so maybe I removed it for some reason).


The part about not using recursion reminded me of a related NASA story. I remembered incorrectly in thinking it had to do with recursion, but it did involve LISP.

https://corecursive.com/lisp-in-space-with-ron-garret/

Thanks for the video!
_________________
Quis separabit? Quo animo?
Back to top
View user's profile Send private message
stefan11111
l33t
l33t


Joined: 29 Jan 2023
Posts: 922
Location: Romania

PostPosted: Sat Jun 10, 2023 9:24 am    Post subject: Reply with quote

pjp wrote:

That do / while loops don't have a built in limiter syntax makes me wonder how often they're useful. At some point it became apparent that I should always put in a hard limit to prevent infinite loops.

That Hard limit is an extra check, which wastes cpu cycles and can add up if you use it for every loop.
pjp wrote:

I hope to avoid the heap as long as possible (or at least only use it when it is truly necessary).

That's a double-edged sword. It's better to use the stack whenever possible, but alloca() is not POSIX and VLAs...
Wikipedia wrote:
Programming languages that support VLAs include Ada, Algol 68 (for non-flexible rows), APL, C99 (although subsequently relegated in C11 to a conditional feature, which implementations are not required to support;

It comes down to portability, malloc is more portable, but the others are better.
If you want to go down this rabbit hole even further, see:
https://www.man7.org/linux/man-pages/man2/brk.2.html
pjp wrote:

I also try to limit function size. Although that can be a challenge even when still only doing "one action" (depending on what is meant by one action).

I find that comes naturally. Most function I write are less that 40 lines of code. If a function makes sense to be larger, I make it larger, but that is often not the case.
pjp wrote:

I try to check return values, but I had forgotten about printf. Doing so is in partly why I have str_to_int wrapping strtol.

That is a good habit, though I don't check every function, as often the way to handle failure in some functions is to keep going with the regular control flow.
pjp wrote:

And I use -Wall -Wextra and -pedantic. I'll add -Werror (I thought I had it, so maybe I removed it for some reason).

I try to limit warnings and also check my lints. However, I don't let them dictate what I should write.
If I see a wall of useless warnings that I don't want to fix, I will disable them to focus on what's important.
At times, I will even use -w.
I find that using -Werror for any program even remotely large is an exercise in patience.
_________________
My overlay: https://github.com/stefan11111/stefan_overlay
INSTALL_MASK="/etc/systemd /lib/systemd /usr/lib/systemd /usr/lib/modules-load.d *udev* /usr/lib/tmpfiles.d *tmpfiles* /var/lib/dbus /usr/bin/gdbus /lib/udev"
Back to top
View user's profile Send private message
no101
n00b
n00b


Joined: 10 Oct 2022
Posts: 11
Location: Piney Woods

PostPosted: Sat Jun 10, 2023 11:22 am    Post subject: Reply with quote

In my opinion, where you went wrong was initializing the struct to the OK condition rather than the error condition. You can return the struct at any time and it has a proper error value. If you forget to clear the error flag when you set the result value, you can immediately see that proper input returns an error.

My approach would be to use an "out parameter" and only return the error/success. Something like
Code:

int string_to_long(char *string, long *result) {
    int error_code = -1;

    ... valididate value ...

    result = value;
    error_code = 0;

   return error_code;
}

int main(void) {
    long result;
    if (string_to_long("1234", &result) != 0) {
        handle_error();
    }
}


Okay, I lie, yesterday, I actually wrote a similar function for parsing a numeric argument and I just exit instead of returning an error. What's the point in trying to continue with broken command line arguments?
Code:

void
usage(void)
{
   fprintf(stderr, "\nUsage: cbeer [minutes]\n"
         "    where minutes is less than %d\n\n", INT_MAX/60);
   exit(EXIT_FAILURE);
}

unsigned long
get_ul(const char *str)
{
   char *endptr;
   errno = 0;

   unsigned long val = strtoul(str, &endptr, 0);
   if (errno != 0) {
      perror("strtoul");
      usage();
   }

   if (endptr == &str[0]) {
      fprintf(stderr, "Error: found no minutes in '%s'\n", str);
      usage();
   }

   if (val > INT_MAX/60) {
      usage();
   }

   return val;
}


If you find yourself trying to shoehorn everything into the return value, try to think about other ways to do it.

Also, there's a lurking problem with your posted sample
Code:

int argv_handler(... stuff ...) {
    struct Error err = { .code = F };
    struct Error errp = &err;

You know that both err and errp go out of scope when you return right? While you could return err directly (no pointer), any access to errp outside the function is undefined behavior: even if it "works", it's wrong and can blow up in your face at any time.
Back to top
View user's profile Send private message
szatox
Advocate
Advocate


Joined: 27 Aug 2013
Posts: 3131

PostPosted: Sat Jun 10, 2023 12:23 pm    Post subject: Reply with quote

pjp wrote:
With my issue regarding the error variable, definition was in the smallest scope, which happened to be at the beginning of the function. The subsequent "work" of the function makes sure argc is valid, followed by processing of argv values where the error variable is used.

Quote:
It wasn't being used because I didn't reset the error condition to false for use with the next argument.

Well, you suffered from state leaking to subsequent invocations, right? The variable value being not what you expected. And you solved the problem by re-initializing your variable.
I wonder why you'd want to clear that error condition though, instead of just returning the error and giving up. You had some recovery procedure in place?

no101 wrote:
In my opinion, where you went wrong was initializing the struct to the OK condition rather than the error condition. You can return the struct at any time and it has a proper error value. If you forget to clear the error flag when you set the result value, you can immediately see that proper input returns an error.

When I process a bunch of steps, I sometimes put a counter or them tie them to particular bits in an integer, so the return value would tell me where to look for problems.
How do you know you should clear the error condition?
no101 wrote:
Okay, I lie, yesterday, I actually wrote a similar function for parsing a numeric argument and I just exit instead of returning an error. What's the point in trying to continue with broken command line arguments?
Yes, exactly, thank you.

stefan11111 wrote:
Probably this:
https://www.youtube.com/watch?v=GWYhtksrmhE
Yeah, something along these lines.
Why did I think it was roughly a 20 min video? Never mind, it's close enough.

Hu wrote:
I agree with minimal scope, but I would argue that always initializing a variable is actually counterproductive. You are better off only initializing the variable when you have a useful value to assign to it, and enabling compiler warnings to detect when the variable might be used uninitialized.
Well, you have a point there, it depends on how the variable is used, and having the compiler warn you about doing something stupid surely helps avoid simple mistakes.
Still, in context of this thread, it remind me this old joke:
Q: John got 6 apples from his grandma, and then gave 2 apples to Marry. How many apples does John have now?
A: -23402985302
Back to top
View user's profile Send private message
Hu
Moderator
Moderator


Joined: 06 Mar 2007
Posts: 21607

PostPosted: Sat Jun 10, 2023 3:54 pm    Post subject: Reply with quote

Agreed, there are ways to write the program such that leaving out initialization causes you problems the compiler fails to catch. For example:
Code:
int helper(int *);
int wrong() {
   int value;
   int err;
   err = helper(&value);
   return !err;
}
This program does not warn, even under -Wall -Wextra, because helper might (or might not) read value before writing to it. The compiler will not warn, because helper might be:
Code:
int helper(int *x) {
    *x = 5;
    return 0;
}
Or it could be:
Code:
int helper(int *x) {
    *x += 5;
    return 0;
}
The first version is safe. The second version is wrong for how wrong uses it. Since the definition of helper is not visible, the compiler cannot prove which version you are using, so it errs on the side of expecting you used a safe version.

You can rewrite this in a way that provokes compiler warnings for mistakes, but it is a bit ugly.
Code:
struct Combo
{
   int err, value;
};

struct Combo helper();
int wrong() {
   struct Combo c = helper();
   return !c.err;
}

struct Combo helper()
{
   int err, value;
   return (struct Combo){
      .err = err,
      .value = value,
   };
}
The construction of a temporary at the end seems to be necessary to provoke the warning. The following is simpler and equally wrong, but does not warn:
Code:
struct Combo
{
   int err, value;
};

struct Combo helper();
int wrong() {
   struct Combo c = helper();
   return !c.err;
}

struct Combo helper()
{
   struct Combo r;
   r.value = 0;
   return r;
}
The value of r.err is never set, so this program is wrong, but gcc-13 -Wall -Wextra -O2 accepts it silently. Incidentally, it seems to return with both members zeroed, but that looks to me like the compiler just picking a specific way to be undefined, rather than actually being guaranteed by the language. Disabling optimizations causes it to return with r.err uninitialized instead.
Back to top
View user's profile Send private message
no101
n00b
n00b


Joined: 10 Oct 2022
Posts: 11
Location: Piney Woods

PostPosted: Sun Jun 11, 2023 11:36 am    Post subject: Reply with quote

szatox wrote:
How do you know you should clear the error condition?

You should clear the error condition when you have a valid value to return. In most cases, you can check the return value of the function call but the strtol family of functions face the same problem as pjp is wrestling with: you need the whole range for the return value so you need to signal errors another way.

First check errno because that value will likely get trashed on the next syscall but even when when errno == 0, you can't be sure the value is valid. Next you compare endptr to the start of the string. If they're the same, you didn't extract any numbers at all from the string and should return an error. If both of those are okay, then you know that the value returned from strtol should be valid and you can clear the error flag, set the return value, and return the struct.

I'm not really a fan of using structs for errors but it seems to work pretty well in Go.
Back to top
View user's profile Send private message
GDH-gentoo
Veteran
Veteran


Joined: 20 Jul 2019
Posts: 1528
Location: South America

PostPosted: Sun Jun 11, 2023 5:57 pm    Post subject: Reply with quote

pjp wrote:
Code:
int argv_handler(... stuff ...) {
    struct Error err = { .code = F };
    struct Error errp = &err;

    if (argc == ...) {
        // ...
            i = str_to_int(whatever, *errp)
        // ...
    }
    // ...
}

So I wasn't that off. There is a pointer to a variable that persists between the processing of successive command line arguments, err.

no101 wrote:
In my opinion, where you went wrong was initializing the struct to the OK condition rather than the error condition.

In my opinion, there should not be a persistent err struct in the first place. The determination of whether a command line argument has a correct format and is usable or not is independent of all other arguments, so there is no need to 'remember' the status. However, my inclination would have been to use strtol(), as others have said, which already signals errors through both errno and the "endptr" (second) argument... and errno has exactly the same property as pjp's err structure: it is persistent, and also not modified if the call to strtol() is successful. So I would have ended up in the same situation, i. e. needing to assign to errno in order to 'forget' what happened in the last call to strtol().

There is also a question to ask that defines what the code should do: does a bad argument invalidate the entire command line and result in an unrecoverable situation, or is it possible to continue the processing of the arguments following the bad one? If it is an unrecoverable situation, then yes, as others have also said, just exiting might be a valid approach.

Assuming that the answer is "no, this is not fatal, just continue processing", my version would look similar to no101's:

Code:
$ cat test.c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

int no_error_or_explain(int i, char *arg, char *endptr, long result) {
   if (endptr == arg || errno == EINVAL) {
      fprintf(stderr, "No conversion performed for argument %d\n", i);
      return 0;
   } else if (errno == ERANGE) {
      fprintf(stderr, "Argument %d out of range\n", i);
      return 0;
   } else if (*endptr) {
       fprintf(stderr, "Partial conversion done for argument %d, value may or may not be usable: %ld\n", i, result);
       return 0;
   } else return 1;
}

int main(int argc, char *argv[]) {
   for (int i = 1; i < argc; i++) {
      errno = 0;
      char *endptr;
      long result = strtol(argv[i], &endptr, 10);
      if (no_error_or_explain(i, argv[i], endptr, result)) printf("Argument %d = %ld\n", i, result);
   }
   return 0;
}

$ ./test 10 -15 24 111111111111111111111 1534asdf asdfgh
Argument 1 = 10
Argument 2 = -15
Argument 3 = 24
Argument 4 out of range
Partial conversion done for argument 5, value may or may not be usable: 1534
No conversion performed for argument 6

Hu wrote:
The construction of a temporary at the end seems to be necessary to provoke the warning. The following is simpler and equally wrong, but does not warn:

I guess that assigning to one member is enough to silence the warning, even if the others are initialized to indeterminate values :-/ The version of helper() that returns a compound literal produces a warning because local variables err and value are used without an explicitly assigned initial value. In the initialization of the compound literal's members.
_________________
NeddySeagoon wrote:
I'm not a witch, I'm a retired electronics engineer :)
Ionen wrote:
As a packager I just don't want things to get messier with weird build systems and multiple toolchains requirements though :)
Back to top
View user's profile Send private message
no101
n00b
n00b


Joined: 10 Oct 2022
Posts: 11
Location: Piney Woods

PostPosted: Mon Jun 12, 2023 9:24 pm    Post subject: Reply with quote

GDH-gentoo wrote:

In my opinion, there should not be a persistent err struct in the first place. The determination of whether a command line argument has a correct format and is usable or not is independent of all other arguments, so there is no need to 'remember' the status. However, my inclination would have been to use strtol(), as others have said, which already signals errors through both errno and the "endptr" (second) argument... and errno has exactly the same property as pjp's err structure: it is persistent, and also not modified if the call to strtol() is successful. So I would have ended up in the same situation, i. e. needing to assign to errno in order to 'forget' what happened in the last call to strtol().


Yeah, good point. I thought I saw something and got carried away.You're right, for a persistent structure, initialization is basically irrelevant. Sorry for the noise.
Back to top
View user's profile Send private message
pjp
Administrator
Administrator


Joined: 16 Apr 2002
Posts: 20067

PostPosted: Mon Jun 12, 2023 11:16 pm    Post subject: Reply with quote

stefan11111 wrote:
That Hard limit is an extra check, which wastes cpu cycles and can add up if you use it for every loop.
For now, it is unlikely to matter, and I perceive the benefit to worth the negligible trade-off. I will eventually learn how to evaluate whether or not something I've written is causing performance issues and go from there. At this point, not having a difficult to stop infinite loop seems better than the alternative. Also, once the loop has been demonstrated to work correctly, the check can be removed.

stefan11111 wrote:
That's a double-edged sword. It's better to use the stack whenever possible, but alloca() is not POSIX and VLAs...
I'm not familiar with alloca, and I haven't used VLAs yet. I vaguely recall that might be one of the additions some disfavor using. I have no opinion, but would look into the issues before relying on them. In the meantime, not having to deal with malloc unless necessary is preferred :)

stefan11111 wrote:
I find that comes naturally. Most function I write are less that 40 lines of code. If a function makes sense to be larger, I make it larger, but that is often not the case.
I haven't done enough to say that it comes naturally, but I think I have a natural disinclination for "noise."

stefan11111 wrote:
I try to limit warnings and also check my lints. However, I don't let them dictate what I should write.
If I see a wall of useless warnings that I don't want to fix, I will disable them to focus on what's important.
At times, I will even use -w.
I find that using -Werror for any program even remotely large is an exercise in patience.
I'll have to keep -w in mind. I'm thinking I was using -Werror and disabled it.


no101 wrote:
In my opinion, where you went wrong was initializing the struct to the OK condition rather than the error condition. You can return the struct at any time and it has a proper error value. If you forget to clear the error flag when you set the result value, you can immediately see that proper input returns an error.
That's a good point, and I believe I have done that at times. In this specific instance, I think I caused that problem "intentionally" as I think I used the OK condition thinking it would be easier to use. I hadn't considered that default fail isn't as bad as it sounds. I'll try to adopt that approach unless there's a clear reason to not for a given problem.

no101 wrote:
If you find yourself trying to shoehorn everything into the return value, try to think about other ways to do it.
That's what led to my use of an Error struct. I originally was trying to use it for a return value, encapsulating the error and value in the same return. I quickly realized that was as bad as G0 :). I also found recommendations against doing that with return values, so I came up with this current solution. In this case since "errno" values and what is returned from strtol could be valid numbers to return, I indeed could not use return to handle both. Thus the pointer to hold the error condition.

no101 wrote:
Also, there's a lurking problem with your posted sample
Code:

int argv_handler(... stuff ...) {
struct Error err = { .code = F };
struct Error errp = &err;

You know that both err and errp go out of scope when you return right? While you could return err directly (no pointer), any access to errp outside the function is undefined behavior: even if it "works", it's wrong and can blow up in your face at any time.
Once argv_handler is done, those values don't matter. In argv_handler, it determines to send an argv value to str_to_int along with errp. str_to_int converts the string and updates errp. When argv_handler returns (to main) all options and values (if any) have been set. main then continues depending on those options and values.
_________________
Quis separabit? Quo animo?
Back to top
View user's profile Send private message
C5ace
Guru
Guru


Joined: 23 Dec 2013
Posts: 472
Location: Brisbane, Australia

PostPosted: Tue Jun 13, 2023 12:13 am    Post subject: Reply with quote

pjp:

I use 'getopt_long()'.
See example from glibc source:
Code:
25.2.4 Example of Parsing Long Options with getopt_long

#include <stdio.h>
#include <stdlib.h>
#include <getopt.h>

/* Flag set by ‘--verbose’. */
static int verbose_flag;

int
main (int argc, char **argv)
{
  int c;

  while (1)
    {
      static struct option long_options[] =
        {
          /* These options set a flag. */
          {"verbose", no_argument,       &verbose_flag, 1},
          {"brief",   no_argument,       &verbose_flag, 0},
          /* These options don’t set a flag.
             We distinguish them by their indices. */
          {"add",     no_argument,       0, 'a'},
          {"append",  no_argument,       0, 'b'},
          {"delete",  required_argument, 0, 'd'},
          {"create",  required_argument, 0, 'c'},
          {"file",    required_argument, 0, 'f'},
          {0, 0, 0, 0}
        };
      /* getopt_long stores the option index here. */
      int option_index = 0;

      c = getopt_long (argc, argv, "abc:d:f:",
                       long_options, &option_index);

      /* Detect the end of the options. */
      if (c == -1)
        break;

      switch (c)
        {
        case 0:
          /* If this option set a flag, do nothing else now. */
          if (long_options[option_index].flag != 0)
            break;
          printf ("option %s", long_options[option_index].name);
          if (optarg)
            printf (" with arg %s", optarg);
          printf ("\n");
          break;

        case 'a':
          puts ("option -a\n");
          break;

        case 'b':
          puts ("option -b\n");
          break;

        case 'c':
          printf ("option -c with value `%s'\n", optarg);
          break;

        case 'd':
          printf ("option -d with value `%s'\n", optarg);
          break;

        case 'f':
          printf ("option -f with value `%s'\n", optarg);
          break;

        case '?':
          /* getopt_long already printed an error message. */
          break;

        default:
          abort ();
        }
    }

  /* Instead of reporting ‘--verbose’
     and ‘--brief’ as they are encountered,
     we report the final status resulting from them. */
  if (verbose_flag)
    puts ("verbose flag is set");

  /* Print any remaining command line arguments (not options). */
  if (optind < argc)
    {
      printf ("non-option ARGV-elements: ");
      while (optind < argc)
        printf ("%s ", argv[optind++]);
      putchar ('\n');
    }

  exit (0);
}


You can download the source of a full working application from:
linux.c5ace.com:

MXcleaner for Linux x86-64 Bit (installer included).zip (24KB) Version 2.0.0
MXcleaner source code (GPL-3) Version 2.0.0
MXcleaner How-To

See 'get_cmd_line.c'.
Use your IDE's debugger to step trough the function. Line 104 main -> p_cfg_no = get_cmd_line(argc, argv);
_________________
Observation after 30 years working with computers:
All software has known and unknown bugs and vulnerabilities. Especially software written in complex, unstable and object oriented languages such as perl, python, C++, C#, Rust and the likes.
Back to top
View user's profile Send private message
pjp
Administrator
Administrator


Joined: 16 Apr 2002
Posts: 20067

PostPosted: Tue Jun 13, 2023 12:30 am    Post subject: Reply with quote

szatox wrote:
Well, you suffered from state leaking to subsequent invocations, right? The variable value being not what you expected. And you solved the problem by re-initializing your variable.
I wonder why you'd want to clear that error condition though, instead of just returning the error and giving up. You had some recovery procedure in place?
The state is created and initialized in argv_handler() and passed to str_to_int() where the error condition is updated. argv_handler only has one invocation. str_to_int is invoked multiple times.

Regarding "giving up". The error wasn't a critical / failure error. The argv value can be either an option (foo, bar, baz) or a value that could be an integer. The integer value may not be required. After foo, there may be bar. So when I determine argv is foo, I then evaluate whether or not the NEXT argv is an integer for foo, or another option, bar.

In the process of figuring out the problem, I rewrote that section of code, so that's no longer how it works. Now, I check for "foo" and "bar" at the same time. If bar isn't the next argv, it should be an integer. So now the error condition is indeed a failure. I hadn't realized the rewrite had that affect until writing this response. That code was confusing, so I knew it had to be rewritten eventually. The bug just made it happen sooner, and the rewrite helped me find the bug. So I guess it worked out in the end.
_________________
Quis separabit? Quo animo?
Back to top
View user's profile Send private message
pjp
Administrator
Administrator


Joined: 16 Apr 2002
Posts: 20067

PostPosted: Tue Jun 13, 2023 1:09 am    Post subject: Reply with quote

GDH-gentoo wrote:
So I wasn't that off. There is a pointer to a variable that persists between the processing of successive command line arguments, err.
You're probably correct in principle. I think I misunderstood the original question. Referring to an in scope variable as "persisting" seemed odd. Neither is it a global nor does it use static. Processing successive command line arguments happens in a single scope, and does not use the pointer to make those evaluations. The pointer is only used by another function to indicate strtol failed to convert the string to an integer.

GDH-gentoo wrote:
However, my inclination would have been to use strtol(), as others have said,
I've replied at least once that str_to_int uses strtol.

GDH-gentoo wrote:
which already signals errors through both errno and the "endptr" (second) argument... and errno has exactly the same property as pjp's err structure: it is persistent, and also not modified if the call to strtol() is successful. So I would have ended up in the same situation, i. e. needing to assign to errno in order to 'forget' what happened in the last call to strtol().
Uhm, OK. The reason I put strtol into a separate function was due to multiple error checks required. strtol is used multiple times. Because of the error checking, that warrants the wrapper function. Since I'm already using strtol, and you indicate that using it creates the same situation, I'm very confused about the point you are making.

GDH-gentoo wrote:
There is also a question to ask that defines what the code should do: does a bad argument invalidate the entire command line and result in an unrecoverable situation, or is it possible to continue the processing of the arguments following the bad one? If it is an unrecoverable situation, then yes, as others have also said, just exiting might be a valid approach.
Originally the code allowed for multiple commands with optional values. If a value to an option was not provided, then that argv may be a valid option. "foo 10 bar 20" vs "foo bar 20". bar is still valid even if it has to be evaluated to possibly have been an integer. Anyway, after the rewrite, that is no longer true (see one of my recent posts that may have provided more detail).

GDH-gentoo wrote:
Assuming that the answer is "no, this is not fatal, just continue processing", my version would look similar to no101's:
OK, I think I'm less confused now :)
_________________
Quis separabit? Quo animo?
Back to top
View user's profile Send private message
pjp
Administrator
Administrator


Joined: 16 Apr 2002
Posts: 20067

PostPosted: Tue Jun 13, 2023 1:20 am    Post subject: Reply with quote

C5ace wrote:
pjp:

I use 'getopt_long()'.

<snip>

You can download the source of a full working application from:
linux.c5ace.com:

MXcleaner for Linux x86-64 Bit (installer included).zip (24KB) Version 2.0.0
MXcleaner source code (GPL-3) Version 2.0.0
MXcleaner How-To

See 'get_cmd_line.c'.
Use your IDE's debugger to step trough the function. Line 104 main -> p_cfg_no = get_cmd_line(argc, argv);
Just last night I deleted some file I had no recollection of downloading. It was 'MXcleaner-2.0.0.tar' :) I'm guessing you mentioned it elsewhere as well.

Having used getopt with shell scripts, I've come to associate it with only negative experiences. I no longer use it in shell scripts. When I'm done, I may try rewriting it to use getopt. I also looked at popt, which seemed more complicated than I felt like trying to understand.
_________________
Quis separabit? Quo animo?
Back to top
View user's profile Send private message
Display posts from previous:   
Reply to topic    Gentoo Forums Forum Index Gentoo Chat All times are GMT
Goto page 1, 2  Next
Page 1 of 2

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum