toralf Developer
Joined: 01 Feb 2004 Posts: 3922 Location: Hamburg
|
Posted: Tue Apr 23, 2013 3:24 pm Post subject: "u64 &= ~bitmask;" where's the pitfall ? |
|
|
I'm trying to understand this explanation from Linus: http://article.gmane.org/gmane.linux.kernel/1479420 but cannot reproduce it with this code snippet: Code: | /*
*
* http://article.gmane.org/gmane.linux.kernel/1479420
*
*/
#include <stdio.h>
int
main () {
unsigned long ul = -1;
unsigned short us = 7;
printf("%u %u\n", ul, us);
ul &= ~us;
printf("%u %u\n", ul, us);
}
|
Please could somebody enlighten me ?
Last edited by toralf on Wed Apr 24, 2013 8:23 am; edited 2 times in total |
|
mv Watchman
Joined: 20 Apr 2005 Posts: 6747
|
Posted: Tue Apr 23, 2013 4:08 pm Post subject: |
|
|
AFAIK gcc considers the result of ~ always as signed, independent of the type of the argument. So maybe Linus refers to other compilers. I am not sure what is defined by the standard.
[b]Edit:[/b] This shows the problem, of course: Code: | ul &= (unsigned short)(~us); |
|
|
toralf Developer
Joined: 01 Feb 2004 Posts: 3922 Location: Hamburg
|
Posted: Tue Apr 23, 2013 5:43 pm Post subject: |
|
|
mv wrote: | This shows the problem, of course: Code: | ul &= (unsigned short)(~us); |
| yep - that's probably the (compiler?) problem. |
|
Hu Moderator
Joined: 06 Mar 2007 Posts: 21633
|
Posted: Wed Apr 24, 2013 1:07 am Post subject: |
|
|
Using =sys-devel/gcc-4.6.3 on amd64, I can reproduce the problem Linus describes. Code: | #include <stdio.h>
unsigned long f(unsigned long ul, unsigned int ui)
{
ul &= ~ui;
return ul;
}
unsigned long g(unsigned long ul, unsigned long ui)
{
ul &= ~ui;
return ul;
}
unsigned long h(unsigned long ul, unsigned int ui)
{
ul &= ~(unsigned long)ui;
return ul;
}
unsigned long i(unsigned long ul, unsigned int ui)
{
ul &= (unsigned long)~ui;
return ul;
}
int main()
{
printf("%lx\n", f(0xf0f0f0f0f0f0f0f0, 0xff));
printf("%lx\n", g(0xf0f0f0f0f0f0f0f0, 0xff));
printf("%lx\n", h(0xf0f0f0f0f0f0f0f0, 0xff));
printf("%lx\n", i(0xf0f0f0f0f0f0f0f0, 0xff));
return 0;
} |
Code: | $ ./bit
f0f0f000
f0f0f0f0f0f0f000
f0f0f0f0f0f0f000
f0f0f000
| As I understand the mail from Linus, he believes that programmers expect f() and g() to return the same results. As this output shows, they do not. |
|
mv Watchman
Joined: 20 Apr 2005 Posts: 6747
|
Posted: Wed Apr 24, 2013 6:47 am Post subject: |
|
|
Hu wrote: | Using =sys-devel/gcc-4.6.3 on amd64, I can reproduce the problem Linus describes. |
Strange: I can reproduce it in your example, too. Why doesn't toralf's example behave the same way? (BTW: yes, I added the "l" in the output of toralf's example.)
It seems as if the same code behaves differently in a subroutine than when the compiler can optimize something away.
So maybe the behavior is not standardized in either direction. |
|
toralf Developer
Joined: 01 Feb 2004 Posts: 3922 Location: Hamburg
|
Posted: Wed Apr 24, 2013 5:27 pm Post subject: |
|
|
Well, this explains - or rather, shows - it: Code: | $> cat bitmask.c
#include <stdio.h>
int
main () {
unsigned long ul = 0xF0F0F0F0;
unsigned short us = 0xFF;
printf("%lx %x\n", ul, us);
ul &= ~us;
printf("%lx %x\n", ul, us);
ul &= (unsigned short) (~us);
printf("%lx %x\n", ul, us);
}
$> gcc bitmask.c -o bitmask && ./bitmask
f0f0f0f0 ff
f0f0f000 ff
f000 ff
|
|
|
Yamakuzure Advocate
Joined: 21 Jun 2006 Posts: 2284 Location: Adendorf, Germany
|
Posted: Thu Apr 25, 2013 6:49 am Post subject: |
|
|
To me, the reason Linus is puzzled seems to be the idea that the NOT operator is applied first and the value is widened (with zeros) afterwards.
But this is not the case: gcc first widens the type and then applies the NOT operation. This is rather simple: the operand of ~ is implicitly promoted to int before the complement is taken. So this line is actually the same as: Code: | 0xF0F0F0F0 &= 0xFFFFFF00 |
toralf wrote: | Code: | ul &= (unsigned short) (~us); |
| This simply eliminates some of the added bits and is the same as: Code: | 0xF0F0F000 &= 0x0000FF00 | Or both examples step-by-step: Code: | 1: 0xF0F0F0F0 &= ~0xFF
=> 0xF0F0F0F0 &= ~((int)0xFF)
=> 0xF0F0F0F0 &= ~(0x000000FF)
=> 0xF0F0F0F0 &= 0xFFFFFF00
= 0xF0F0F000
2: 0xF0F0F000 &= (int)(unsigned short)(~0xFF)
=> 0xF0F0F000 &= (int)(unsigned short)(~((int)0xFF))
=> 0xF0F0F000 &= (int)(unsigned short)(~(0x000000FF))
=> 0xF0F0F000 &= (int)(unsigned short)0xFFFFFF00
=> 0xF0F0F000 &= (int)0xFF00
=> 0xF0F0F000 &= 0x0000FF00
= 0x0000F000 | So basically there is nothing mysterious here.
For some impact of this on comparisons where one of two values is NOT'ed, see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38341 - First comment by Richard Biener. It might shed some light onto it: Richard Biener wrote: | note that integer promotion is done on the operand(!) of ~. So
u1 == (u8_t)(~u2)
is equal to
(int)u1 == (int)(u8_t)(~(int)u2) | (The evil detail of the above comparison, and the reason why gcc issues a warning for these, is that the compiler might reorder the casts during optimization, eventually eliminating (some) casts, and eventually eliminating the comparison entirely, as it is always false.) _________________ Important German:- "Aha" - German reaction to pretend that you are really interested while giving no f*ck.
- "Tja" - German reaction to the apocalypse, nuclear war, an alien invasion or no bread in the house.
|
|
mv Watchman
Joined: 20 Apr 2005 Posts: 6747
|
Posted: Fri Apr 26, 2013 11:38 pm Post subject: |
|
|
Yamakuzure wrote: | gcc first expands the type and then applies the NOT operation. |
But this contradicts Hu's example in f where we have identical code and types but different behaviour. |
|