• Not actually a use-after-free bug, just memory corruption due to a non-aligned pointer. (Calculated from the unsound math in safe code you mentioned.)

    Non-aligned memory access in CPUs is a “don’t care” condition: the processor can do whatever is most convenient for its implementation. The memory corruption typically comes from caching. Cache lines have a fixed size, in this case 128 bytes. Let’s say you want to fetch 16 bytes of data, and the processor specifies, for simplicity, that all data of this width you fetch must have an address divisible by 16. This guarantees that the data in question lies entirely within a single cache line, so the memory controller doesn’t have to fetch two cache lines for a single load, which would be slower for no good reason.
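
    To make the alignment argument concrete, here is a minimal Rust sketch. It assumes the 128-byte cache line and 16-byte access from the example above; `is_aligned` and `crosses_cache_line` are hypothetical helper names for illustration, not part of any real API, and the addresses are plain integers rather than real pointers.

    ```rust
    /// Cache line size in bytes (assumption: taken from the example above).
    const CACHE_LINE: usize = 128;

    /// Returns true if `addr` is a multiple of `width`.
    fn is_aligned(addr: usize, width: usize) -> bool {
        addr % width == 0
    }

    /// Returns true if an access of `width` bytes starting at `addr`
    /// touches more than one cache line.
    fn crosses_cache_line(addr: usize, width: usize) -> bool {
        let first_line = addr / CACHE_LINE;
        let last_line = (addr + width - 1) / CACHE_LINE;
        first_line != last_line
    }
    ```

    With these definitions, every 16-byte access at an address divisible by 16 stays inside one line, because 16 divides 128 evenly. An access at address 127 is not aligned and spills into the next line.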

    Now suppose we violate this contract and load through a pointer whose offset is 15 mod 16. The processor fetches the cache line where the data starts, slots it into the cache memory, then reads from the cache memory, and you get back either:

    1. 7/8 of the time: the entire correct data, because the pointer is not also at the end of the cache line (a 128-byte line has eight offsets that are 15 mod 16, and only the last one, 127, runs past the end of the line)
    2. 1/8 of the time: the first byte of the correct data and 15 bytes of other data from the cache; exactly which data is governed by the caching behavior of the L1 cache and can be difficult to predict.
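
    The 7/8 versus 1/8 split can be checked by brute force. This is only a sketch under the same assumptions as the example (128-byte line, 16-byte access, offset 15 mod 16); the function names are made up for illustration.

    ```rust
    /// Assumptions taken from the example: 128-byte cache line, 16-byte access.
    const CACHE_LINE: usize = 128;
    const WIDTH: usize = 16;

    /// Returns (number of offsets that cross the line, number of candidate offsets),
    /// where the candidates are all offsets within one line that are 15 mod 16.
    fn count_crossing_offsets() -> (usize, usize) {
        let candidates: Vec<usize> = (0..CACHE_LINE)
            .filter(|&offset| offset % WIDTH == 15)
            .collect();
        let crossing = candidates
            .iter()
            .filter(|&&offset| offset + WIDTH > CACHE_LINE)
            .count();
        (crossing, candidates.len())
    }

    /// Number of bytes of the access that still lie in the line where it starts.
    fn bytes_in_first_line(offset: usize) -> usize {
        WIDTH.min(CACHE_LINE - offset)
    }
    ```

    Here `count_crossing_offsets()` returns `(1, 8)`, and `bytes_in_first_line(127)` returns `1`, matching the "first byte correct, 15 bytes from elsewhere" case.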

    This affects both loads and stores, so a store can overwrite other cache lines, and doing so may mark them dirty, depending on the implementation. When lines marked dirty (either from this or from some other write) are evicted from the cache, they are committed to the L2 cache or, if the cache is write-through, also to memory. This is the most likely source of the memory corruption, to my understanding.