Down with 64 bit pointers, up with 32 bit indices

29/01/2010

lately I’ve been experimenting with a trick to save space on 64 bit architectures: rather than using pointers (which take 8 bytes of memory! 8! bytes! omg!), I’m using 32 bit indices into a single, gigantic 32 gig block I allocate at startup. as in:

u64 *bigblock=malloc(32*1024*1024*1024); // at startup. or choose a number smaller than 32 :)

struct foo { int x, y, whatever, whatever; };

// later…

u32 myfoo=MyAllocatorForBigBlock(sizeof(foo)); // using a custom allocator of your choice…

((foo*)(bigblock+myfoo))->x=100;

((foo*)(bigblock+myfoo))->y=200;

with some macro goodness, or in C++, a template class that looks a bit like a smart pointer but is in fact not smart, those casts I made explicit above go away it can be made to look really neat:

P<foo> myfoo=P<foo>::Alloc(); // P<foo> is really a 32 bit index just like before.

myfoo->x=100;myfoo->y=200; // the overloaded -> operator does the bigalloc+index thing for us

The indices in my case assume an 8 byte stride, ie all allocations happen on 8 byte boundaries (that’s why bigblock is a u64*), so you can address 4*8=32 gigs with 32 bit indices. For an in-memory database (my use case, as it happens), that’s plenty. I don’t want it swapping anyway, and 32 gig machines are reasonable these days. The compiler seems to do the right thing and generate quite efficient code (eg it keeps bigblock in a register (ecx), puts indices in say edx and does things like mov [ecx+edx*8+4],200)

another cute side benefit is that all the pointers in your system, now that they’re indices from a base address, are sort of ‘position independent’… you can mmap() or fwrite() the entire bigblock to and from disk, and next time you boot, it doesn’t matter if the base address moves, it all just works. no need for pointer fixups or anything like that.  win!

kinda makes 64 bit linux/windows feel a bit more like a nice embedded system with a predictable memory map and nice small 4 byte pointers. lovely!