Clarification of "Avoid heap memory allocation"

Post by pjp » Wed Feb 19, 2020 5:54 am

The Power of 10: Rules for Developing Safety-Critical Code
3. Avoid heap memory allocation.
There is a PDF reference which includes more information:
Rule 3: Do not use dynamic memory allocation after initialization.

Rationale : This rule appears in most coding guidelines for safety-critical software. The reason is simple: Memory
allocators, such as malloc, and garbage collectors often have unpredictable behavior that can significantly impact
performance.

A notable class of coding errors also stems from the mishandling of memory allocation and free routines: forgetting to free
memory or continuing to use memory after it was freed, attempting to allocate more memory than physically available,
overstepping boundaries on allocated memory, and so on. Forcing all applications to live within a fixed, preallocated area
of memory can eliminate many of these problems and make it easier to verify memory use.

Note that the only way to dynamically claim memory in the absence of memory allocation from the heap is to use stack
memory. In the absence of recursion, an upper bound on the use of stack memory can be derived statically, thus making
it possible to prove that an application will always live within its resource bounds.
In the first line, what is meant by "after initialization"? I'm guessing that it is not referring to literal initialization such as "int foo = 10;"? My guess is that if I determine that I need 20MB of memory to do work, I can allocate that when the program starts, but I should not later realloc any additional memory? The reference to "after initialization" seems vague, so I'm guessing it is something specific the expected audience would know.

Next issue. In the last paragraph, where it mentions using stack memory, how is this considered safe? It is my understanding that stack memory is easy to overflow. I've never seen functions to query available stack memory to make sure it is safe to use. I also believe other programs may use memory in the stack, so this recommendation seems particularly unsafe, and therefore strange given that the JPL included this in their standard.

My presumption is that there is a minimum level of experience for those statements to make sense, and that I most certainly lack that amount of experience :).

Thanks.
Quis separabit? Quo animo?
Post by virtguru » Wed Feb 19, 2020 12:47 pm

It's referring to memory management. In C, for example, stack variables are automatically freed, hence the term automatic variables. However, when using stack variables, the compiler needs to know how big the variables are beforehand. This can be limiting due to the size of the stack space.

This is where the heap comes in. You can create variables on the heap, but the compiler doesn't know when you're done with them, so you need to release them yourself ('memory management').

Code:

free(p);
Forgetting to release heap variables when they are no longer needed creates leaks. There are different ways of handling this, depending on the language; the above is just an example.
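To make the allocate-and-release cycle concrete, here is a minimal, self-contained C sketch; the function names (make_buffer, demo) are invented for illustration:

```c
#include <stdlib.h>

/* Allocate a buffer of n ints on the heap; the caller owns it. */
int *make_buffer(size_t n) {
    int *p = malloc(n * sizeof *p);  /* may return NULL on failure */
    return p;
}

/* Every successful malloc needs exactly one matching free, or the memory leaks. */
void demo(void) {
    int *p = make_buffer(100);
    if (p == NULL)
        return;      /* allocation failed; nothing to free */
    p[0] = 42;       /* use the memory */
    free(p);         /* release it; using p after this is undefined behaviour */
}
```

Every path that obtains a non-NULL pointer from malloc must eventually reach exactly one free: forgetting it leaks, and freeing twice or using the pointer afterwards is undefined behaviour.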
Post by GDH-gentoo » Wed Feb 19, 2020 2:06 pm

pjp wrote:In the first line, what is meant by "after initialization"?
I take it to mean "after program startup".
pjp wrote:My guess is that if I determine that I need 20MB of memory to do work, I can allocate that when the program starts, but I should not later realloc any additional memory?
I'm not sure, it appears to be discouraging dynamic allocation, period.
pjp wrote:In the last paragraph, where it mentions to use stack memory, how is this considered safe?
As virtguru said, "safe" in the context of memory usage by a program, as opposed to dynamic memory allocation and management. My interpretation is that this recommends preferring objects of automatic storage duration (e.g. local variables in a function) as the way of allowing no longer used memory to be freed. Compilers generate stack management code themselves as a result, whereas dynamic memory has to be managed explicitly by the program, with the possibility of making the mentioned classes of programming mistakes.
pjp wrote:It is my understanding that stack memory is easy to overflow.
Well, this doesn't protect you from other classes of programming errors. I suppose that's why there are 9 more rules :)
Post by krinn » Wed Feb 19, 2020 2:41 pm

I think the idea is "always use the stack when you can"; and int z; is preferable to int *z = malloc(sizeof(int));

and the second idea is to declare any heap allocation from the start (I think they mean within your program's initialization, but not after, so at the start of main() or in your init()-style function)
with this idea behind it: it is better to have a program eating 10MB from start to end than a program that eats 2MB plus (up to 8MB sometimes); so instead of declaring 2MB on the heap, and later using 8MB on the heap for something and releasing it, you should declare the 10MB from the start,
because it will be easier to know what you have to free in your exit step (everything that needs to be freed is visible in the "init" part of your program, rather than hidden in the many functions your program uses)

It appears counterproductive at first: if you malloc in a function, you can free before that function exits, but with jumps, conditional exits and so on, it becomes tricky to always free everything you malloc'd before the function really exits, and you end up with lots of frees for the same variable (you add a free at each return condition, and any return without a free leaks memory); whereas if the function uses a variable you declared in your "init" step, you free it in your exit step
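A sketch of the pattern described above, with all heap allocation confined to one init step and all freeing to one exit step; the names (app_init, app_exit, g_work) and the 10MB figure are purely illustrative:

```c
#include <stdlib.h>

/* All dynamic memory the program will ever need, claimed once at startup. */
static char *g_work = NULL;
enum { WORK_SIZE = 10 * 1024 * 1024 };  /* 10MB, fixed for the whole run */

/* Returns 0 on success, -1 if the allocation failed (so we fail early). */
int app_init(void) {
    g_work = malloc(WORK_SIZE);
    return g_work ? 0 : -1;
}

/* The single matching free; no other function allocates or frees. */
void app_exit(void) {
    free(g_work);
    g_work = NULL;
}
```

Everything that runs between app_init() and app_exit() reuses g_work instead of calling malloc, so a reviewer only has to check two functions to verify the program's memory management.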
Post by pjp » Wed Feb 19, 2020 10:26 pm

@virtguru: Thanks. I understand the general concepts, but it was the particulars covered in the rule that weren't clear.
GDH-gentoo wrote:I take it to mean "after program startup".
I'm not sure, it appears to be discouraging dynamic allocation, period.
"After program startup" is also vague :). I think they are discouraging "unplanned" use of dynamic allocation. To me, "after initialization" seems to imply a specific time when it is acceptable to allocate dynamic memory. Naturally, the reason they would discourage use of dynamic memory is the set of problems associated with poor manual memory management: performance (poorly timed cleanup), memory leaks, or perhaps crashing. I don't have the technical knowledge to go into detail, but I get the general idea. That's where I get lost on the recommendation of using the stack. If normal usage of the stack doesn't incorporate mechanisms to ensure the code doesn't cause many of those same problems, the advantage is lost on me. It seems that poorly managed use of the stack could lead to some of the same problems they intend to avoid by minimizing use of the heap. Local / stack is all well and good, until I exceed that seemingly very short supply.
GDH-gentoo wrote:Well, this doesn't protect you from other classes of programming errors. I suppose that's why there are 9 more rules :)
The others don't directly address the issue. Avoiding recursion does, but the other rules don't solve the problem of using more memory than is available in the stack. That's why I thought there were methods I hadn't seen to make sure it was safe to use the stack in any given situation.
Post by pjp » Wed Feb 19, 2020 10:36 pm

krinn wrote:I think the idea is "always use the stack when you can"; and int z; is preferable to int *z = malloc(sizeof(int));

and the second idea is to declare any heap allocation from the start (I think they mean within your program's initialization, but not after, so at the start of main() or in your init()-style function)

with this idea behind it: it is better to have a program eating 10MB from start to end than a program that eats 2MB plus (up to 8MB sometimes); so instead of declaring 2MB on the heap, and later using 8MB on the heap for something and releasing it, you should declare the 10MB from the start,
because it will be easier to know what you have to free in your exit step (everything that needs to be freed is visible in the "init" part of your program, rather than hidden in the many functions your program uses)
OK, that's what I was thinking too, but I've not seen how that is handled. After I had posted, I think I came across a reference to a __setup function of some sort, so that seems to be the likely direction intended by that part of the rule. I guess finding out how to manually manage single allocations of memory is something I'll have to add to my list. So I allocate a given amount of memory during that phase... rhetorically then, I still have to manage what accesses the memory, which will presumably be used by more than one thing throughout the life of the program. Or at least the same chunk of memory will be overwritten at times throughout the life of a program run.
krinn wrote:It appears counterproductive at first: if you malloc in a function, you can free before that function exits, but with jumps, conditional exits and so on, it becomes tricky to always free everything you malloc'd before the function really exits, and you end up with lots of frees for the same variable (you add a free at each return condition, and any return without a free leaks memory); whereas if the function uses a variable you declared in your "init" step, you free it in your exit step
I've read about the "mess" that is making sure to free memory and when to do it, so it makes sense from that standpoint. I'm thinking of it as unplanned vs. planned memory usage (an example might be unplanned travel vs. planned travel).

Thanks for the comments!
Post by alamahant » Thu Feb 20, 2020 12:56 am

Maybe it means avoiding instantiating classes using the "new" keyword.
I think by default C++ creates objects on the stack, unless explicitly told to use the heap via the "new" keyword.
Then supposedly one has to "delete" these objects at the proper time, lest it lead to a memory leak.
But I think this can be safely handled by the use of smart pointers (scoped or shared), via:

Code:

#include <memory>
Post by GDH-gentoo » Thu Feb 20, 2020 2:21 am

pjp wrote:"after program startup". Is also vague :).
I meant this as opposed to the other interpretation of "initialization" you gave, initialization as in "initialization of a variable in a declaration".
pjp wrote:I think they are discouraging "unplanned" use of dynamic allocation. To me, "after initialization" seems to imply a specific time when it is acceptable to allocate dynamic memory.
The rest of the paragraphs (behaviour of memory allocators, errors in doing memory management), and the phrase "the absence of dynamic memory allocation, stipulated by the third rule" in the section "Following the rules", make me think they are discouraging it altogether. In "safety-critical code", which is what these rules are about according to the introduction.
pjp wrote:It seems that poorly managed use of the stack could lead to some of the same problems they intend to avoid by minimizing use of the heap. Local / stack is all well and good, until I exceed that seemingly very short supply.
Maybe I don't understand what you mean by "poorly managed use of the stack". Use of the stack "just happens" when the compiler translates language constructs to code.

This rule doesn't look to me like anything more profound than "avoid dynamic memory allocation". Especially with the emphasis throughout the document on facilitating tools-based checks.
pjp wrote:
GDH-gentoo wrote:Well, this doesn't protect you from other classes of programming errors. I suppose that's why there are 9 more rules :)
The others don't directly address the issue. Avoiding recursion does, but the other rules don't solve the problem of using more memory than is available in the stack.
Without recursion and without dynamic memory allocation, the memory used by the program is bounded. That's the point they make in the last sentence of the last paragraph. The compiler can know what it'll take to accommodate it.
alamahant wrote:May be it means avoiding instantiating classes using the "new" keyword..
[...]
But I think this can be safely handled by the use of smart pointers (scoped or shared)by :

Code:

#include <memory>
That's C++, different language.
Post by Hu » Thu Feb 20, 2020 3:38 am

The ideal for safety-critical code is to be able to formally prove the program to be correct in the face of all possible inputs. As the number of possible program states increases, the difficulty of proving correctness also increases. In the typical desktop program, the number of possible states is so great that formally proving the program correct is infeasible. While it would be nice to have verification methods that can handle complex desktop programs, the alternative, which these guidelines pursue, is to make the program simple enough that it can be verified with the verification methods we know. To that end, the rules seek to minimize the number of program states.

The heap is complex, and dynamic allocations of non-static sizes (say, the length of a user-chosen string) make the problem even more complex. In this context, "after initialization" means that the program should run some set of defined startup code, do any dynamic allocations it needs, then declare that all initialization is done and the program is "ready for duty." As much as possible, the program would not be entrusted with anything safety-critical until after it declares "ready for duty." In the case of a self-driving car, it should declare "ready" before it even disengages the vehicle's parking brake. Under that rule, any complicated or non-verifiable code is guaranteed to happen before the car does anything dangerous, so a fault during that code merely prevents the car from traveling, but cannot cause it to do something dangerous while traveling at speed.

Although not articulated clearly in the quoted rule, I think the authors also intended that any dynamically allocated memory become effectively static after startup. Allocating your own arena and running your own allocator out of it are a violation of the spirit, if not the letter, of the rule. Each dynamic allocation should have a clearly defined lifetime that can be easily verified. It's probably permitted for the block to serve more than one purpose, if the verifier can prove that there is no overlap or conflict. For example, a logging subsystem could use the same block of memory for composing every output line, since it would compose a line, flush it out of the program to storage, then compose another line. However, it might not be legal to use a given block of memory to receive network input from multiple sources, because it may not be practical to verify that the program only ever receives from one source at a time.

The stack is a per-thread private resource, so other programs are not relevant. The relevant point is whether the verifier can prove that the program has a maximum stack depth in the worst case scenario. Forbidding recursion is one way to achieve that. If recursion is needed, but is carefully bounded (say, not more than N frames deep, and the recursive path never calls other code that could start its own recursion), that can also work, although the verification is harder.

Stack memory can be very easy or very hard to overflow, depending on the program. If you have no recursion, and all stack-local variables have compile-time provable maximum sizes, then a static verifier can determine, without running the program, the worst-case stack usage. Similarly, if all stack-local variables have compile-time provable minimum sizes, then a static verifier can determine whether the program might experience a stack buffer overflow. For this purpose, the verifier can cheat: any time it cannot prove that an overflow is impossible, it can call the program unsafe, without proving that an overflow is possible. Since the goal is to prove the program safe, erring on the side of caution is acceptable.
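A small illustration of why worst-case stack usage is statically derivable under these constraints: with no recursion and only fixed-size locals, each function's frame size is a compile-time constant, so the deepest call chain gives a provable bound (GCC's -fstack-usage option, for instance, reports per-function frame sizes). The functions below are invented for illustration:

```c
/* No recursion, fixed-size locals: every frame size is known at compile time. */
static int leaf(int x) {
    int tmp[16];               /* fixed-size array: frame size is a constant */
    for (int i = 0; i < 16; i++)
        tmp[i] = x + i;
    return tmp[15];
}

static int mid(int x) {
    int buf[8];                /* another compile-time-constant frame */
    buf[0] = leaf(x);
    return buf[0];
}

/* Worst case = frame(top_level) + frame(mid) + frame(leaf): a static bound. */
int top_level(int x) {
    return mid(x);
}
```

Since the call graph (top_level → mid → leaf) and all frame sizes are fixed, a verifier can sum the deepest chain without ever running the program.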
Post by pjp » Thu Feb 20, 2020 3:57 am

alamahant wrote:May be it means avoiding instantiating classes using the "new" keyword..
I think by default cpp creates objects in the stack,unless explicitly told to use the heap via the "new" keyword.
Then supposedly one has to "delete" these objects at a proper time lest leading to a memory leak.
But I think this can be safely handled by the use of smart pointers (scoped or shared)by :

Code:

#include <memory>
Perhaps. I was looking at C when I came across the rules, so I may have presumed them to be related to C. The page states "These rules are a complement to the MISRA C guidelines and have been incorporated into the greater set of JPL coding standards." And reference 2 is "JPL C Coding Standard". The PDF also seems to reference C a bunch of times, but I didn't find any references to C++.
Post by pjp » Thu Feb 20, 2020 4:24 am

GDH-gentoo wrote:
pjp wrote:I think they are discouraging "unplanned" use of dynamic allocation. To me, "after initialization" seems to imply a specific time when it is acceptable to allocate dynamic memory.
The rest of the paragraphs (behaviour of memory allocators, errors in doing memory management), and the phrase "the absence of dynamic memory allocation, stipulated by the third rule" in the section "Following the rules", make me think they are discouraging it altogether. In "safety-critical code", which is what these rules are about according to the introduction.
My first thought was that they were discouraging it completely as well. However, that just doesn't seem to make sense. How many significant programs can really operate only on the stack? Given that this is referencing possibly all embedded or specialized devices, perhaps that is reasonable. However, "after initialization" to me implies there is a specific point at which it is acceptable. Otherwise I think it would explicitly state not to use it.
GDH-gentoo wrote:
pjp wrote:It seems that poorly managed use of the stack could lead to some of the same problems they intend to avoid by minimizing use of the heap. Local / stack is all well and good, until I exceed that seemingly very short supply.
Maybe I don't understand what you mean by "poorly managed use of the stack". Use of the stack "just happens" when the compiler translates language constructs to code.
Right. That's kind of my point. Either I was unfamiliar with methods to "safely use" the stack, or a program can try to use / request more memory than is available in the stack. That would appear to create a situation where the stack has some of the same problems that exist with dynamic memory allocation. That was a key point of my confusion. I read it as "Don't use dynamic memory allocation because <problems>. Use the stack instead, which may also exhibit some of the same problems; they'll just occur unexpectedly." :me being confused: My understanding of malloc and related functions is that I can handle their failure to allocate memory. With the stack, it is just up in the air whether or not there will be anything available. That sounds more like a game of chance.
GDH-gentoo wrote:Without recursion and without dynamic memory allocation, the memory used by the program is bounded. That's the point they make in the last sentence of the last paragraph. The compiler can know what it'll take to accommodate it.
OK, so it compiles fine. Does that translate to never running out of stack memory every time it runs, wherever it runs (presuming it is capable of starting and running at all)? My knowledge of the stack isn't strong, but I was under the impression that multiple programs could compete for usage of available stack memory.
Post by pjp » Thu Feb 20, 2020 4:40 am

Hu wrote:...
Thanks, that seems to confirm at least some of what I was thinking to be true.
Hu wrote:Although not articulated clearly in the quoted rule, I think the authors also intended that any dynamically allocated memory become effectively static after startup. Allocating your own arena and running your own allocator out of it are a violation of the spirit, if not the letter, of the rule. Each dynamic allocation should have a clearly defined lifetime that can be easily verified. It's probably permitted for the block to serve more than one purpose, if the verifier can prove that there is no overlap or conflict. For example, a logging subsystem could use the same block of memory for composing every output line, since it would compose a line, flush it out of the program to storage, then compose another line. However, it might not be legal to use a given block of memory to receive network input from multiple sources, because it may not be practical to verify that the program only ever receives from one source at a time.
That makes sense too. I don't have the experience, so I was just trying to come up with a possibility for using the heap before but not after initialization.
Hu wrote:The stack is a per-thread private resource, so other programs are not relevant. The relevant point is whether the verifier can prove that the program has a maximum stack depth in the worst case scenario. Forbidding recursion is one way to achieve that. If recursion is needed, but is carefully bounded (say, not more than N frames deep, and the recursive path never calls other code that could start its own recursion), that can also work, although the verification is harder.

Stack memory can be very easy or very hard to overflow, depending on the program. If you have no recursion, and all stack-local variables have compile-time provable maximum sizes, then a static verifier can determine, without running the program, the worst-case stack usage. Similarly, if all stack-local variables have compile-time provable minimum sizes, then a static verifier can determine whether the program might experience a stack buffer overflow. For this purpose, the verifier can cheat: any time it cannot prove that an overflow is impossible, it can call the program unsafe, without proving that an overflow is possible. Since the goal is to prove the program safe, erring on the side of caution is acceptable.
And this helps a lot. I'm not sure where I picked up the idea that the stack was shared... I was probably mixing something else in with the stack. However, I do see vague warnings about overflowing the stack which seems unpredictable at runtime.

Now I know I need to learn more about the stack.

Thanks all!
Post by Ant P. » Thu Feb 20, 2020 6:48 am

pjp wrote:My understanding of malloc and related functions is that I can handle its failure to allocate memory.
I think what the advice is getting at is: if you're going to do something that can fail, fail early. The worst time for a malloc to fail is in the middle of a program when you've already burned substantial CPU to get there and can't back off (just ask anyone who's ever had browser builds OOMing at the final LTO step).

Plus if you can do one big allocation up front for what you know you'll need, instead of always doing small ones when they turn up, that avoids a bunch of syscalls and memory fragmentation (another big contributor to OOM).
Post by eccerr0r » Thu Mar 05, 2020 8:37 am

So are we or are we not allowed to use the stack?

If not, this is misguided. Stack use (or more specifically, functional programming with no side effects) is one way that software can be proven for correctness if needed. Once side effects are introduced, it becomes much harder to prove correctness. Using heap memory is one big side effect ...

Really, I have the feeling that the words "stack overflow" and "stack smashing" are scaring people away from using the stack. A stack overflow shouldn't happen unless you ran out of memory. Stack smashing, well, is yet another problem that needs to be dealt with, and one that C has no inherent protection against.

Honestly IMHO, just get your pointers straight and all will be fine. Hah. Easier said than done, but it gives the most flexibility.

(BTW,

Code:

void func(int a) {
    int localarr[a];
}
This would be dynamic allocation of an array on the stack, in the local scope of func, but I don't know why this seems grammatically wrong to me; perhaps I'm too used to an older C standard where doing this was actually illegal? In any case, you still need to make sure you don't smash the stack, which is very easy to do if you dynamically allocate like this...
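One way to keep such a stack allocation from smashing anything is to bound the size explicitly before the VLA is declared; this is only a sketch, and the cap value is arbitrary:

```c
#include <string.h>

#define FUNC_MAX_ELEMS 256  /* arbitrary illustrative cap on the VLA size */

/* Returns 0 on success, -1 if the requested size is refused. */
int func(int a) {
    if (a <= 0 || a > FUNC_MAX_ELEMS)
        return -1;           /* refuse sizes outside the known stack budget */
    int localarr[a];         /* C99 VLA, now with a provable upper bound */
    memset(localarr, 0, (size_t)a * sizeof localarr[0]);
    return 0;
}
```

(VLAs like this were introduced in C99 and made optional again in C11, which matches the recollection that older C made them illegal.)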
Intel Core i7 2700K/Radeon Firepro W2100/24GB DDR3/800GB SSD
What am I supposed watching?
Post by krinn » Thu Mar 05, 2020 11:45 am

eccerr0r wrote:Stack overflow shouldn't happen unless you ran out of memory.
I don't think so. The stack is a defined memory size; the compiler knows its value and limit. The stack overflow is going outside the range of a variable on the stack, and crushing stack area used by other variables.
If we assume stack addresses are assigned sequentially, then
char crushme[1]; would be next to your localarr[a] allocation, and memcpy(localarr, overflow, sizeof(overflow)) will be a stack overflow if sizeof(overflow) is int*int+1; it will kill crushme's value, and this even if the compiler has seen that int*int+char is small enough to be handled on the stack.
Post by eccerr0r » Thu Mar 05, 2020 5:47 pm

Perhaps the terminology is being confused here. I define a stack overflow as when the stack page is fully consumed and no more space can be allocated to it; stack smashing is the case described, where an element on the stack, data or a return address, is corrupted due to the overreach of a pointer.

Going beyond the end of an array is an array overflow, but it is not clear whether that destroys stack values, because the array may not necessarily be on the stack. If it is an automatic variable, then yes, it would be on the stack.
Post by GDH-gentoo » Thu Mar 05, 2020 7:57 pm

pjp wrote:
GDH-gentoo wrote:Without recursion and without dynamic memory allocation, the memory used by the program is bounded. That's the point they make in the last sentence of the last paragraph. The compiler can know what it'll take to accommodate it.
OK, so it compiles fine. Does that translate to never running out of stack memory every time it runs [...]?
As I understand it, and not considering the case of VLAs as eccerr0r points out, yes, unless available memory in the execution environment is, or can become, less than the program's (predictable) upper bound on usage.
eccerr0r wrote:So are we or are we not allowed to use the stack?
The document discourages use of the heap; why do you say that?
eccerr0r wrote:[...] scaring people from using the stack.
As if a program could avoid using the stack?
eccerr0r wrote:Hah. Easier said than done [...]
That's why they propose rules :wink:
eccerr0r wrote: (BTW,

Code:

void func(int a) {
    int localarr[a];
}
This would be dynamic allocation of an array on the stack in local scope of func, [...]
I've seen critiques of VLAs because the standard does not (to this date, I think?) specify a way to indicate to the program that the allocation cannot be done. I wonder why it cannot mandate that array-to-pointer conversion yield a null pointer in that case, that the sizeof operator yield 0, and that any other use of the VLA result in undefined behaviour.
Post by Hu » Fri Mar 06, 2020 1:58 am

eccerr0r wrote:So are we or are we not allowed to use the stack?
Stack use is permitted, and encouraged, provided that the use can be proved safe. If the verifier cannot prove a safe upper bound on the amount of stack required, then the program will likely be considered "not proven safe" and rejected. As I noted in an earlier post, verifiers have a large opportunity to cheat here. They don't need to prove the program is unsafe, which in the general case could be quite hard. Instead, they try to prove it is safe, and if they run into a construct that the verifier cannot prove is safe, assume the program is unsafe and reject it, even if a better verifier would have proved the program to be safe. (For this purpose, "better" could mean a flow control tracer that does a better job of recognizing and pruning out impossible paths, or that traces deep logic trees farther before bailing out with a "tree too complicated" error, or that has better cross-module contract analysis, etc.)
eccerr0r wrote:Stack overflow shouldn't happen unless you ran out of memory.
Yes, but if the verifier cannot accurately predict worst-case stack usage, how can you be sure that the device will not run out of memory? That's why some verifiers ban constructs that are difficult to prove safe.
eccerr0r wrote:Going beyond the end of an array is an array overflow, but it is not clear whether that destroys stack values, because the array may not necessarily be on the stack. If it is an automatic variable, then yes, it would be on the stack.
Also, depending on the nature of the variables involved and the extent of the smash, there may be unused padding bytes immediately after the smashed variable, for the purpose of starting the next real variable on a nice alignment boundary. In such a case, a small smash may cause no harm at all, because the only bytes it trashes were ignored anyway. It's never wise to rely on this, but it can happen in non-contrived programs. I believe such programs are still considered to be "ill-formed, no diagnostic required" since they write beyond the end of an allocation.

Code: Select all

#include <string.h>

void f(void) {
    char c[1];
    int i[4];       /* adjacent variable; may or may not be laid out right after c */
    (void)i;
    strcpy(c, "a"); /* copies two bytes ("a" plus the terminating '\0') into a
                       one-byte buffer; the overflowing byte might land in
                       alignment padding and cause no visible harm */
}
Top
eccerr0r
Watchman
Watchman
Posts: 10239
Joined: Thu Jul 01, 2004 6:51 pm
Location: almost Mile High in the USA
Contact:
Contact eccerr0r
Website

  • Quote

Post by eccerr0r » Sun Mar 22, 2020 4:13 pm

What is this "recursion depth checker"? Does it exist only for specific languages, or is it only theoretical?

Formal verification would be nice, but which compilers actually check stack depth at compile time? Is it actually being done?

Right now I'd imagine this is only done at runtime, where a counter could be used.

I would think formal verification of stack depth is not being done except for very specific purposes or in theory (like designing an efficient algorithm); otherwise, running out of stack memory is not detected until runtime, just like running out of heap memory.
Intel Core i7 2700K/Radeon Firepro W2100/24GB DDR3/800GB SSD
What am I supposed watching?
Top
AlexJGreen
Tux's lil' helper
Tux's lil' helper
Posts: 149
Joined: Wed Sep 19, 2018 5:37 pm

  • Quote

Post by AlexJGreen » Sun Mar 22, 2020 7:56 pm

_
Last edited by AlexJGreen on Mon Dec 28, 2020 3:15 am, edited 1 time in total.
Top
Hu
Administrator
Administrator
Posts: 24403
Joined: Tue Mar 06, 2007 5:38 am

  • Quote

Post by Hu » Sun Mar 22, 2020 9:08 pm

Tail call optimization only works if the recursive step is the last one in the function. If the function needs to do additional work after the recursive step, a tail call optimization may not be possible.

Recursion depth can be checked in any language, if you constrain the program such that a complete call graph can be constructed. If you have a complete call graph, and any recursion that occurs is strictly bounded, then you can compute the maximum required stack depth without executing the program. Since this thread is about guidelines for designing software for use in critical applications, I expect there would be interest in proving that the stack cannot be exhausted by any execution path.
Top
eccerr0r
Watchman
Watchman
Posts: 10239
Joined: Thu Jul 01, 2004 6:51 pm
Location: almost Mile High in the USA
Contact:
Contact eccerr0r
Website

  • Quote

Post by eccerr0r » Mon Mar 23, 2020 7:22 am

I'd like to know of specific software that actually does compile-time formal verification of stack depth. Theoretically it's possible, but I'd assume the general case is intractable (arbitrary data input makes it a subset of the halting problem).

An example would be a simple recursive binary tree insertion. In the worst case the stack depth grows with the depth of the data inserted, but if the data is not bounded and not available at compile time, how can a bound be proven? And what program can be used to check this?

Or are you thinking of some other checker that assumes precooked data from a run profile? That would no longer be formal.
Top
krinn
Watchman
Watchman
User avatar
Posts: 7476
Joined: Fri May 02, 2003 6:14 am

  • Quote

Post by krinn » Mon Mar 23, 2020 10:54 am

any compiler can check it easily: grep "Max stack size" /proc/$(pidof X)/limits
but i'm not quite sure it's a compiler task, more a profiler one
Top
eccerr0r
Watchman
Watchman
Posts: 10239
Joined: Thu Jul 01, 2004 6:51 pm
Location: almost Mile High in the USA
Contact:
Contact eccerr0r
Website

  • Quote

Post by eccerr0r » Mon Mar 23, 2020 4:13 pm

Uh... any program that requires the kernel to report stack size is runtime, not compile time.

Formal verification means checking at compile/build time: you determine problems when you run 'gcc' or 'rustc' or whatever on the software, without needing to run the resultant binary. Depending on instrumentation of the binary is not 'formal'.

Agreed, you can tell the stack size with profiling, but that doesn't formally prove the stack will never grow past a certain size, because the result is dataset-dependent.
Top
GDH-gentoo
Advocate
Advocate
User avatar
Posts: 2115
Joined: Sat Jul 20, 2019 7:02 pm
Location: South America

  • Quote

Post by GDH-gentoo » Mon Mar 23, 2020 8:29 pm

Let's see. For common implementations of a C language "execution environment", with the "no recursion" rule, I believe that maximum stack depth == maximum number of nested function calls, which can be determined by parsing the program. No need to run it. If the stack frame size is known (I believe that it is of fixed size, but I don't know how it is determined), maximum stack depth * stack frame size == upper bound on stack memory usage.

Would that be correct?

I don't know about other compiled languages.
Top