64-bit C programming - brad's life - LiveJournal
Brad Fitzpatrick


64-bit C programming [Mar. 1st, 2004|10:25 am]
Brad Fitzpatrick
Can anybody recommend a good resource for how big C data types are when compiling on 32-bit vs. 64-bit platforms?

On both, what is:

int
void*
long
long long
printf formats

Apparently memcached has some 32-bit limits on 64-bit platforms, and I want to understand the problem before I apply some random "makes it work" patch.

Comments:
From: taral
2004-03-01 11:06 am (UTC)
On a 64-bit integer and pointer system:

void * = 64
int = 32
long = 64
long long = 64
From: gen_witt
2004-03-01 01:20 pm (UTC)
Are you sure? int is defined to be the fastest integer format for the machine, which for a 64bit integer machine would be 64bits.

This is so heavily compiler-dependent that the simplest way to go about it is to typedef your own u32, u64, i32, i64, etc. I'm told autoconf does this in some way, but I never did figure out how to use it.

Other ways of figuring it out: you can look in the limits.h file for UINT_MAX, or do a quick and dirty printf( "%zu\n", sizeof( int ) ); (sizeof yields a size_t, so the format is %zu rather than %d). Hope something here helps.
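A minimal sketch of that quick check, covering the types asked about in the post. The macro BITS and the function print_type_widths are names made up for this example, not something from the thread. The numbers it prints are storage sizes in bits and depend entirely on the compiler and ABI in use.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Storage size of a type in bits (sizeof counts bytes of CHAR_BIT bits). */
#define BITS(type) (sizeof(type) * CHAR_BIT)

/* Minimum widths the C standard guarantees on every platform. */
static_assert(BITS(int) >= 16, "int is at least 16 bits");
static_assert(BITS(long) >= 32, "long is at least 32 bits");
static_assert(BITS(long long) >= 64, "long long is at least 64 bits");

/* Print the widths for the current build. BITS() has type size_t,
   so the matching printf conversion is %zu. */
void print_type_widths(void)
{
    printf("int       = %zu bits\n", BITS(int));
    printf("long      = %zu bits\n", BITS(long));
    printf("long long = %zu bits\n", BITS(long long));
    printf("void *    = %zu bits\n", BITS(void *));
}
```

Calling print_type_widths() from main on each target (for example, once built with -m32 and once with -m64 under gcc) shows which widths actually change.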
From: taral
2004-03-01 07:49 pm (UTC)
POSIX defines u_int32_t and so on.

int should be the fastest type, but it turns out so much code depends on it being 32-bit that that is now the de facto standard. "long" is usually the fastest integer type on the system now.
From: haran
2004-03-01 08:11 pm (UTC)
"int should be the fastest type, but it turns out so much code depends on it being 32-bit that that is now the de facto standard. 'long' is usually the fastest integer type on the system now."
"Because so much code depends on it" seems like a really lame reason for keeping ints at 32-bit.
Are programmers now expected to use long for loop counters, indexing and so forth?
Maybe gcc has some compiler switch to downgrade int to 32-bits when building code that relies on this?
From: jeffr
2004-03-02 02:00 am (UTC)
Backward/forward compatibility is often a huge deciding factor in whether or not a technology will survive. Unless you're Apple, who switched processor architectures without their users really noticing (68k to PPC). Rarely does anyone have control over the hardware, operating system, tool chain, and software.

What's the point in using 64bit ints where you don't need 64bits? Do you often have loops that require more than 4 billion iterations? Do you have arrays that are larger than 4Gb * sizeof(type)?
From: haran
2004-03-02 04:36 am (UTC)
"Do you often have loops that require more than 4 billion iterations?"
Of course not. But by the same token, 32 bits are also overkill. We use them (as opposed to short or char) because, theoretically, using variables of the machine's native word size produces optimal results.
Granted, the performance boost isn't much, but you're still breaking the C standard by not making sizeof(int)==machine word size.
Yes, backwards compatibility is important and it's easy to make false assumptions when coding (esp. in bitwise manipulation of integers), but I honestly believe that when porting, you should have to go through the code and correct these errors.
From the POV of the end-user, compatibility isn't a problem, is it? Existing programs should run on 64-bit platforms. Only they wouldn't be optimized.
It's only when the developer tries to retarget his code for 64-bit machines, that the problem comes.
In which case, I still believe it's up to the developer to fix the mistakes. I'm sure there are compiler flags to issue warnings about potential issues.
From: jeffr
2004-03-02 01:53 pm (UTC)
Section 6.2.5, bullet 5 of ISO/IEC 9899:1999E (c99) says:
"A plain int object has the natural size suggested by the architecture of the execution environment (large enough to contain any value in the range INT_MIN to INT_MAX as defined in the header <limits.h>)."

Please show me the section that makes using something other than the largest supported integer word a violation of the standard.

Most machines have many native word sizes. x86 processors support the full range of math functions on bytes, words, and dwords. Using 32-bit integers on a 64-bit machine by default produces faster, more compact code in the common case. The "performance boost" is a pessimization in almost all cases.

What you're advocating is not economical, which is most likely why it was not done. No one wants to pay a developer to review every line of code for simple type errors when a single change to the ABI renders it unnecessary.

Furthermore, the programmers who made the 32-bit/64-bit mistakes in the first place have probably silenced the compiler errors that they originally got when making these mistakes by using casts. I know because I've ported a lot of code to 64-bit platforms. Usually you just have to run it and wait for a truncated pointer to pop up and cause a core dump.

There's also the problem of the wealth of open source software. Who's going to pay those developers to port forward? BSD/Linux on 64bit platforms wouldn't have been half as successful if we had to rework every line of opensource code.
From: jeffr
2004-03-01 11:00 pm (UTC)
Actually u_int*_t is old BSD style. POSIX and C99 define uint*_t and int*_t. Notice the lack of an extra underscore.
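As a rough illustration of the C99 names, here is a sketch that uses uint64_t from <stdint.h> and prints it with the PRIu64 macro from <inttypes.h>. The helper format_u64 is a hypothetical name chosen for this example. It assumes a C11 compiler for static_assert.

```c
#include <assert.h>
#include <inttypes.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* The exact-width types are the same size on 32-bit and 64-bit builds. */
static_assert(sizeof(uint32_t) * CHAR_BIT == 32, "uint32_t is 32 bits");
static_assert(sizeof(uint64_t) * CHAR_BIT == 64, "uint64_t is 64 bits");

/* Format a 64-bit value portably. PRIu64 expands to whatever conversion
   matches uint64_t on this platform (for example "lu" or "llu").
   Returns the number of characters snprintf wanted to write. */
int format_u64(char *buf, size_t len, uint64_t value)
{
    return snprintf(buf, len, "%" PRIu64, value);
}
```

A value above 4 billion, such as UINT64_C(5000000000), formats the same way on ILP32 and LP64 builds, which is the point of using the fixed-width name instead of long.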

32bit math is often faster on 64bit machines than 64bit math is. I'm certain that this is the case for the opteron and the alpha, which are the 64bit platforms that I most often program from.

Specifically, division and multiplication take half as long for the more narrow integer types. Not to mention the shorter opcodes on amd64, and the lack of 64bit immediates for many integer operations requiring extra instructions.

Really the only thing that could be more expensive is the cost of sign or zero extension, which is always free. In practice the extra cost of 64bit math on 64bit machines isn't an issue. The real issue is the extra memory bandwidth required and memory used from having larger pointers.
From: gen_witt
2004-03-01 01:40 pm (UTC)

Icky.

It would appear that gcc, using -m64 to compile for x86-64 (or IA-32e, whatever you want to call it), yields 32-bit ints. I cannot tell from the gcc docs for IPF (IA-64). Doesn't IA-64 take a penalty for operating on 32-bit values?
From: denshi
2004-03-01 10:22 pm (UTC)
void * = 40 on AMD64 machines, IIRC. Leaving 24 delicious type-identification bits for SBCL.
From: taral
2004-03-01 10:28 pm (UTC)
That's why I qualified it as for 64-bit pointer systems.
From: denshi
2004-03-01 10:31 pm (UTC)
Oh, I wasn't correcting, just pointing out a new toy.

You down at Mojo's atm?
From: taral
2004-03-01 10:44 pm (UTC)
Nope. I'm at home. Jonathan is here, packing up to leave.
From: krow
2004-03-01 12:22 pm (UTC)
Print formats are off between linux and solaris I believe.

I am working on long double at the moment, and it is a mess :(
From: evan
2004-03-01 12:32 pm (UTC)
iirc configure scripts usually test some of these (see krow's comment) on a platform-by-platform basis.

if you wanna go to that, i'd be happy to hack something up.
it'd be easy to steal glib's gint32, gint64 tests.
From: xaosenkosmos
2004-03-01 02:14 pm (UTC)
Here's an excellent resource on 64-bit porting from Sun. I had to look over it not too long ago, but didn't actually run into any real problems to fix, so I have no real input, sorry.
From: caladri
2004-03-01 02:59 pm (UTC)
It depends on the ISA and the ABI. Where are you seeing the problems? That can isolate your problems, or help to... In general, make copious use of warning flags and try to compile on 64-bit platforms.
From: jeffr
2004-03-01 10:48 pm (UTC)
There is a standard notation for expressing the width of the datatypes on a particular platform. X86 is ILP32 while amd64 is LP64. Virtually every 64bit platform other than windows is LP64. Windows is LLP64.

If you haven't guessed the characters stand for Int, Long, and Pointer. Often when porting to a 64bit architecture people refer to "LP64" problems, which usually arise when someone casts an int to a pointer or a pointer to an int.
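One way to see which model a given build uses is to compare the three widths directly. This is only a sketch: data_model is a made-up helper, and the ILP64 case is an addition not mentioned in this thread (it is the name commonly used when int is also 64 bits).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* long long is at least 64 bits under every one of these models. */
static_assert(sizeof(long long) * CHAR_BIT >= 64, "long long >= 64 bits");

/* Name the data model of the current build from the widths of
   int, long, and pointers. Anything unrecognized is "unknown". */
const char *data_model(void)
{
    size_t i = sizeof(int) * CHAR_BIT;
    size_t l = sizeof(long) * CHAR_BIT;
    size_t p = sizeof(void *) * CHAR_BIT;

    if (i == 32 && l == 32 && p == 32) return "ILP32";
    if (i == 32 && l == 64 && p == 64) return "LP64";
    if (i == 32 && l == 32 && p == 64) return "LLP64";
    if (i == 64 && l == 64 && p == 64) return "ILP64";
    return "unknown";
}
```

Printing data_model() from a small test program on each target is a quick sanity check before hunting for int/pointer casts.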

If you have to do pointer math, you want to use the uintptr_t type. This is an integer that is big enough to hold any pointer value. There is also a ptrdiff_t which can hold the results of pointer math, but I don't ever really use that.
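A small sketch of both types in use. The helpers align_up and element_distance are hypothetical names for illustration. It assumes the platform provides uintptr_t (the C standard makes it optional, though common compilers supply it) and that converting the integer back yields a usable pointer, which is implementation-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* uintptr_t has to be wide enough to carry a pointer value. */
static_assert(sizeof(uintptr_t) >= sizeof(void *), "uintptr_t holds a pointer");

/* Round a pointer up to the next multiple of align (a power of two).
   The arithmetic is done in uintptr_t, never in int, so the upper
   bits of a 64-bit address are not truncated. */
void *align_up(void *p, size_t align)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t mask = (uintptr_t)align - 1;
    return (void *)((addr + mask) & ~mask);
}

/* Distance in elements between two pointers into the same array.
   Subtracting pointers produces a ptrdiff_t. */
ptrdiff_t element_distance(const int *first, const int *last)
{
    return last - first;
}
```

Writing the same rounding with an int cast compiles with at most a warning on LP64 and then truncates the address, which is the kind of bug described above.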

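For printf formats when you're using a 64-bit type on all platforms, you want the 'j' modifier together with a cast to intmax_t, since that is the type %jd expects. Like printf("%jd", (intmax_t)foo); where foo is, say, an off_t.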
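A minimal sketch of that pattern, assuming a POSIX system for off_t. The function format_offset is a made-up name for this example. The value is widened to intmax_t before it reaches the format, so the call is correct whether off_t is 32 or 64 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>   /* off_t (POSIX) */

/* intmax_t is the widest signed integer type, so the cast loses nothing. */
static_assert(sizeof(intmax_t) >= sizeof(off_t), "intmax_t can hold any off_t");

/* Format a file offset portably with the 'j' length modifier.
   Returns the number of characters snprintf wanted to write. */
int format_offset(char *buf, size_t len, off_t offset)
{
    return snprintf(buf, len, "%jd", (intmax_t)offset);
}
```

For unsigned values the equivalent is "%ju" with a cast to uintmax_t.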

Hope that helps.
From: brad
2004-03-01 11:07 pm (UTC)
This helps tremendously, thanks!