MMAP - What is it?
Summary
- Naturally searches for free space and defragments as it allocates
- Good for a shared memory space that reads from a physical file
- Handles larger memory spaces
- Optimized for SMP usage
MMAP is technically only a reader, as it must read the allocation from a physical file into memory. This removes the need MALLOC has to zero out a space before use. When allocating, it can mark the memory space as RO or RW. If you choose to mark it RW, you will need a writer thread to sync to the physical file every so often, or you risk losing data (this may or may not be an issue).
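As a rough illustration of the RO/RW and sync behaviour described above, here is a minimal sketch using Python's standard `mmap` module, which wraps the operating system's file-mapping call. The scratch file, the one-page size and the function name `demo_mapping` are made up for the demo; this is not how MySQL itself maps memory.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE  # size of one memory page on this machine


def demo_mapping():
    """Map a scratch file RW, change it, sync it, then re-map it RO."""
    fd, path = tempfile.mkstemp()  # hypothetical backing file for the demo
    try:
        os.write(fd, b"\0" * PAGE)  # the file must already be as large as the mapping

        # RW mapping: writes go to the mapped memory first ...
        rw = mmap.mmap(fd, PAGE, access=mmap.ACCESS_WRITE)
        rw[0:5] = b"hello"
        rw.flush()  # ... and flush() asks for them to be written back to the file
        rw.close()

        # RO mapping: contents come from the file, and writes are refused
        ro = mmap.mmap(fd, PAGE, access=mmap.ACCESS_READ)
        data = bytes(ro[0:5])
        try:
            ro[0:1] = b"X"
            writable = True
        except TypeError:  # raised when assigning into a read-only map
            writable = False
        ro.close()
        return data, writable
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(demo_mapping())
```

In a long-running server the `flush()` call is the part a writer thread would repeat every so often.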
While this sounds exceptionally useful, we must remember that what is good for "Case A" can be bad in "Case B". For example, the kernel controls how many pages MMAP may map, so any allocation of this space must go through the kernel. For a shared memory segment, let’s say one holding the contents of a “frm” file in memory, this is useful and not an issue. For session level memory, however, it causes all allocations, de-allocations, syncs and other functions to route out of user-land and into the kernel. This is a minor hit in general, but if you are doing several thousand questions per second it can add up. It is tantamount to spin-rounds: they are not bad in themselves, but too many of them will cause CPU% and LOAD to increase, causing more rounds in a situation that gets worse and worse as the queue grows.
It also has the ability to scale much better across multiple CPUs. This sounds perfect, but we must recall that each session is bound to a specific CPU (even in NUMA). As such, a user-land allocator that is optimized for UMP (uniprocessor) use can actually yield better results.
The last major point I would like to make is that MMAP does a lot of natural searching and defragmentation; this helps make sure it writes each allocation in a sequential manner. It does, however, have a cost associated with hunting for and finding memory to use.
MALLOC - What is it?
Summary
- Uses a binning system for memory allocations < 512k, removing the need for fragmentation detection
- Will pass to MMAP if a given allocation needs to grow past 512k
- Optimized for the UMP situation, where you have a single reader/writer updating the pointer
- Does not need to “read” the allocation from a file to write it into memory
MALLOC is a much simpler allocator: by its simplest definition, it requests a given heap size and reserves it. To make use of this allocation, you would write your data into it and pad the remainder with zeros. This step can instead be done during allocation, so that MALLOC presents a pointer to a pre-zeroed data space. As zero-padding multiple pages can become expensive, the design sends anything > 512k to MMAP rather than MALLOC.
MALLOC naturally likes to use a binning system, so setting a session level buffer such as “sort_buffer_size” to 256k vs. 512k will have little impact. Let me explain. If you have one 512k allocation, it will use a full bin. Next let us allocate 128k: this will put an 8-byte pad between the bins and create a new bin of 512k, using 128k of it. Next we add a 256k request: as 512-128>=256, it still fits, so we will not make a new bin but use that bin. This math assumes you have set up malloc(8), not (16) or (128); those are outside of this topic. Just know that they would affect things.
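The bin arithmetic above can be sketched as a toy first-fit model. This is only an illustration of the 512k example in the text, not the allocator's real algorithm: the function name `place` is made up, sizes are in KB, and the 8-byte pad between bins is ignored.

```python
BIN_KB = 512  # bin size taken from the example in the text


def place(requests_kb, bin_kb=BIN_KB):
    """First-fit placement: return the KB used in each bin after all requests."""
    bins = []  # bins[i] = KB already used in bin i
    for size in requests_kb:
        if size > bin_kb:
            raise ValueError("larger than a bin; the text hands these to MMAP")
        for i, used in enumerate(bins):
            if bin_kb - used >= size:  # e.g. 512 - 128 >= 256
                bins[i] = used + size
                break
        else:
            bins.append(size)  # nothing fits: open a new bin
    return bins


if __name__ == "__main__":
    print(place([512, 128, 256]))  # [512, 384]
```

The 512k request fills the first bin, the 128k request opens a second one, and the 256k request lands in the second bin's remaining space, so only two bins exist at the end.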
This technique makes it much easier for the allocator to determine where there is space for its next allocation, or whether it needs a new bin. Thus we have removed the need for defragmentation functions in the allocator, and the need for “seeking” for spaces large enough.
Next, and MOST important, we come to the single biggest difference between MALLOC and MMAP. MALLOC, as it is just setting a heap size (while possibly also zeroing said size out), has no need to write the data to a file before reading it into the memory pointer, which is exactly what MMAP does. This means we can save some valuable space, at the price of having no flexibility in resizing the allocation. MMAP could sync the file and re-read it to get a new allocation of the new size, larger or smaller.
Conclusion
Without spending a few days getting into the finer points, the basic conclusion is:
If you only need a static size in a single thread, MALLOC wins by removing some overhead and the need to hit the kernel.
If you are sharing the memory across multiple threads, and you may need to write and re-read the data into memory again, MMAP can be superior.
Additionally, when you have threads from multiple CPUs where count > 4, you will see better performance from MMAP, as it understands multiple CPUs better than MALLOC.
For our purposes in this talk (Session Level Buffers), if possible we would prefer MALLOC.
Putting it to use:
Now that we have determined which allocator is the correct one to use, let us talk about real world application.
I tend to like the following code sample:
SET @kilo_bytes = 1024;
SET @mega_bytes = @kilo_bytes * 1024;
SET @giga_bytes = @mega_bytes * 1024;
SELECT
( @@key_buffer_size + @@query_cache_size + @@tmp_table_size + @@innodb_buffer_pool_size + @@innodb_additional_mem_pool_size + @@innodb_log_buffer_size) / @giga_bytes as GLOBAL_MEMORY
, (@@max_connections * (@@read_buffer_size + @@read_rnd_buffer_size + @@sort_buffer_size + @@join_buffer_size + @@binlog_cache_size + @@thread_stack)) / @giga_bytes as MAX_CONNECTION_MEMORY
, @@max_connections as MAX_CONNECTIONS
-- Searched CASE (no operand after CASE), so the size comparison itself picks the label
, CASE WHEN @@read_buffer_size > 512*1024 THEN 'MMAP' ELSE 'MALLOC' END as READ_BUFFER_TYPE
, CASE WHEN @@read_rnd_buffer_size > 512*1024 THEN 'MMAP' ELSE 'MALLOC' END as READ_RND_BUFFER_TYPE
, CASE WHEN @@sort_buffer_size > 512*1024 THEN 'MMAP' ELSE 'MALLOC' END as SORT_BUFFER_TYPE
, CASE WHEN @@join_buffer_size > 512*1024 THEN 'MMAP' ELSE 'MALLOC' END as JOIN_BUFFER_TYPE;
This will output something like:
KeyName: | Value: |
---|---|
GLOBAL_MEMORY | 19.73 GB |
MAX_CONNECTION_MEMORY | 96.87 MB |
MAX_CONNECTIONS | 100 |
READ_BUFFER_TYPE | MALLOC |
READ_RND_BUFFER_TYPE | MALLOC |
SORT_BUFFER_TYPE | MALLOC |
JOIN_BUFFER_TYPE | MALLOC |
This says I am currently using MALLOC on my dev setup.
However, if sort_buffer_size were at its default of 2M, that line item would say MMAP.
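To try out candidate values without touching a running server, the same 512k rule can be sketched in a few lines of Python. The function names and the sample settings are made up for illustration; only the 512k cut-off and the four buffer names come from the query above.

```python
THRESHOLD = 512 * 1024  # 512k cut-off used in the query above

PER_SESSION = ("read_buffer_size", "read_rnd_buffer_size",
               "sort_buffer_size", "join_buffer_size")


def buffer_type(size_bytes, threshold=THRESHOLD):
    """Label a buffer the way the query does: above the cut-off means MMAP."""
    return "MMAP" if size_bytes > threshold else "MALLOC"


def report(variables):
    """Map each session level buffer name to its expected allocator."""
    return {name: buffer_type(variables[name]) for name in PER_SESSION}


if __name__ == "__main__":
    # Hypothetical settings, with sort_buffer_size at 2M
    settings = {
        "read_buffer_size": 128 * 1024,
        "read_rnd_buffer_size": 256 * 1024,
        "sort_buffer_size": 2 * 1024 * 1024,
        "join_buffer_size": 128 * 1024,
    }
    for name, kind in report(settings).items():
        print(name, kind)
```

With these sample values only sort_buffer_size crosses the cut-off and is labelled MMAP; a buffer of exactly 512k stays on MALLOC because the comparison is strictly greater-than.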