
> why were you trying to hand allocations over to malloc()?

So that they could be released with free(). For historical reasons, most libraries on Linux don't provide foo_free() functions for freeing the objects they return. Everyone is supposed to use free(), under the tacit assumption that only one version of libc is loaded in the process and that everyone uses it. The Windows world has a somewhat better culture in this regard.
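For illustration, here is a minimal sketch of the foo_free() convention mentioned above. The library "foo", the foo_obj type, and foo_new() are hypothetical names made up for this example, not a real API. The point is only that the library frees with the same allocator it allocated with, so callers never need to know which one that is.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical library "foo": all names here are illustrative only. */
typedef struct foo_obj {
    char *name;
} foo_obj;

/* Allocate with whatever allocator the library itself is linked against. */
foo_obj *foo_new(const char *name)
{
    foo_obj *obj = malloc(sizeof *obj);
    if (obj == NULL)
        return NULL;

    size_t len = strlen(name) + 1;
    obj->name = malloc(len);
    if (obj->name == NULL) {
        free(obj);
        return NULL;
    }
    memcpy(obj->name, name, len);
    return obj;
}

/* Release through the same allocator; callers never call free() on a foo_obj. */
void foo_free(foo_obj *obj)
{
    if (obj == NULL)
        return;
    free(obj->name);
    free(obj);
}
```

A caller would pair foo_new("abc") with foo_free(), and the library stays free to change its internal allocator later without breaking anyone.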

> Why not entirely replace malloc() for the process?

Now that's just rude.



> So that they could be released with free().

Well, yes, you have to replace free() and calloc()/realloc() too. Sorry, didn't think that needed spelling out.

> > Why not entirely replace malloc() for the process?

> Now that's just rude.

Isn't that generally what alternate custom allocators do, though? Like dmalloc and jemalloc?


Well, it's one thing when you, as the author of a program, link libraries into your code and then pick a custom allocator, which the libraries you've picked will (transparently) use. It's an altogether different thing when you, as the author of a library, decide to use a custom allocator and then, since your users probably can't be persuaded to use a custom foo_free() that you could export, you hijack libc's malloc implementation from anyone who links against you and replace it with your own. I personally think it's just rude.

So, with those constraints, it's either memcpy-ing into a buffer that you get from the global malloc(), or trying to remap the addresses that buffer spans onto your own buffer. I thought the latter could be faster starting with moderately large buffers, but since I couldn't make it work, I couldn't benchmark it, so nothing came of it.
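A rough sketch of the memcpy option described above, under simplifying assumptions: the private buffer is a fixed-size static array with a bump pointer, and arena_append() and arena_finalize() are hypothetical names. The remapping variant is not shown, since the comment above says it could not be made to work.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical private arena: a fixed static buffer plus a bump pointer. */
static unsigned char arena[4096];
static size_t arena_used;

/* Append bytes to the arena; returns 0 on success, -1 if they do not fit. */
int arena_append(const void *data, size_t len)
{
    if (len > sizeof arena - arena_used)
        return -1;
    memcpy(arena + arena_used, data, len);
    arena_used += len;
    return 0;
}

/* Copy the finished contents into a block from the global malloc(), so the
 * caller can release it with plain free(). Resets the arena on success. */
void *arena_finalize(size_t *out_len)
{
    size_t len = arena_used;
    void *result = malloc(len ? len : 1);  /* avoid a zero-size request */
    if (result == NULL)
        return NULL;
    memcpy(result, arena, len);
    *out_len = len;
    arena_used = 0;
    return result;
}
```

The cost is one extra copy of the whole result at finalize time, which is the part that remapping was meant to avoid for large buffers.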


Ah.

One other approach that does occur is to just malloc() a 16GiB buffer to begin with. Only pages you touch should end up being backed by RAM(/swap). Then your finalize() operation is just a realloc() to "shrink" the buffer down to its final size. Any decent allocator should keep the data where it is, and just make the now-unused tail portion of address space available again, without ever having needed to back it.
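Roughly, the reserve-then-shrink idea could look like the sketch below. reserve_buffer() and finalize_buffer() are hypothetical helper names, and the reservation size is a parameter because a 16 GiB malloc() can fail outright depending on the system's overcommit settings. Note that the C standard allows realloc() to move the block even when shrinking, so staying in place is an allocator behavior you would have to check, not a guarantee.

```c
#include <stdlib.h>

/* Reserve a large buffer up front. With lazy page allocation, only the pages
 * that are actually touched should end up backed by memory. May return NULL,
 * for example when the kernel refuses to overcommit that much. */
void *reserve_buffer(size_t reserve_size)
{
    return malloc(reserve_size);
}

/* "Shrink" the reservation down to the bytes actually used. realloc() is
 * permitted to move the data; if it fails, the original block is still valid,
 * so fall back to returning it unchanged. */
void *finalize_buffer(void *buf, size_t final_size)
{
    void *shrunk = realloc(buf, final_size ? final_size : 1);
    return shrunk != NULL ? shrunk : buf;
}
```

Either way, the result can be handed to a caller who releases it with plain free().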


It's possible, absolutely, but if you allocate N such fragments one after another, do some allocations in them, and then finalize them from first to last, the gaps left behind will each be smaller than 16 GiB, so they can only be reused for smaller allocations. And IIRC allocators with special support for huge allocations (e.g. using so-called huge pages) do not reuse that memory for "normal-sized" allocations (although I could be wrong about this).

So it's a tradeoff: this fragmentation is not that bad, but it's still noticeable in a sufficiently long-running program, because 16 GiB is 2**34, so you can only make 16 Ki such allocations before you hit the 2**48 limit. If only you could simply remap them!
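Spelling out that arithmetic as a tiny sketch, taking the 2**48 figure and the 16 GiB reservation size from the comment above as given (the macro and variable names are made up for illustration):

```c
#include <stdint.h>

/* Assumptions taken from the comment above. */
#define ADDRESS_SPACE_BITS 48   /* the 2**48 limit */
#define RESERVATION_BITS   34   /* 16 GiB = 2**34 bytes */

/* Number of 16 GiB reservations that fit: 2**(48 - 34) = 2**14 = 16 Ki. */
static const uint64_t max_reservations =
    UINT64_C(1) << (ADDRESS_SPACE_BITS - RESERVATION_BITS);
```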



