Log in

No account? Create an account
Relocating memory allocator - brad's life — LiveJournal [entries|archive|friends|userinfo]
Brad Fitzpatrick

[ website | bradfitz.com ]
[ userinfo | livejournal userinfo ]
[ archive | journal archive ]

Relocating memory allocator [May. 26th, 2003|01:53 pm]
Brad Fitzpatrick
So, it turns out the memcached takes up no CPU and makes the site blazing fast (even though we're only caching a few object types so far!).

Unfortunately, it did what we suspected: died hard when its address space got fragmented. (In the meantime I restarted the daemons with much smaller sizes, accounting for the bloat which accumulates over a few hours....)

I jotted out notes for an arena allocation scheme where we only malloc large, fixed-size pages and handle our own placement and and recompaction within those pages. However, I can't believe this hasn't been done before.

I want a memory allocator that lets you register a hook when it wants to move allocations around to defragment, so you can update pointers and whatnot. Anybody know of one?

From: jeffr
2003-05-28 06:05 pm (UTC)
That adds two system calls for every access to a cached object. I guess the application could be written with a two tiered cache. The first tier could be locked always and the second would have to be migrated to the first and locked before use.

Even so, this seems like a very imposing interface. You have to page align all objects and all objects must be page size. Otherwise you pin down unrelated objects.

It still seems to me that the signal is more flexible and less intrusive. It's also less code.

It's your deal though, have fun with it.
(Reply) (Parent) (Thread)
[User Picture]From: taral
2003-05-28 10:07 pm (UTC)
Oh, sure. The signal is more flexible and less intrusive. It's also less effective. I'll do a few experiments on my end and tell you what I end up with.
(Reply) (Parent) (Thread)
From: jeffr
2003-05-29 11:16 am (UTC)

I do like to beat a dead horse.

By effectiveness I assume you mean the amount of memory reclaimed.

In the signal based solution the memory reclaimed is limited by the application because it has to voluntarily give it up.

In the other solution the memory reclaimed is limited by the application because it has to mark it as reclaimable and unlock it.

The second solution also seems more costly in the kernel. You have to dig through your vm datastructures finding something suitable to remove and then make sure you can remove it. So you also maintain the idea of pages that can be freed in two places instead of one.

This also restricts you to a coarse grain LRU page replacement policy. If the page reclaiming is done in the application it can exploit application specific knowledge to build a free list that is smarter than LRU. It is coarse grain because most VMs don't reset the accessed bits on the page with any great frequency while in the application perfect LRU order could be maintained by simply using a tail queue.

Anyway, I'm sure both solutions will work well enough. I've already bikesheded this discussion enough. I'll try to find some other outlet for my post coffee programming discussion frenzy.
(Reply) (Parent) (Thread)
[User Picture]From: taral
2003-05-29 03:14 pm (UTC)

So do I

Actually, effectiveness is "probability of avoiding swapping out data".

Yes, bikeshed indeed...
(Reply) (Parent) (Thread)