Being close to the last allocation doesn't matter. What matters is the memory being handed to the application, and that memory was last touched long ago and is unlikely to still be in cache. If your new generation is larger than the L3 cache, it will have to be fetched from main memory every time you cross into the next 64-byte cache line. I believe a smart CPU will notice the sequential pattern and prefetch to reduce the cache-miss latency. But a high allocation rate will still use a lot of memory bandwidth and thrash the caches.
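To make the point concrete, here is a rough toy model, not a measurement of any real collector. It bump-allocates through a new generation several times and counts misses in a simple LRU cache of 64-byte lines. The function name `count_misses`, the 16-byte object size, and the fully associative LRU cache with no prefetching are all assumptions made for illustration.

```python
from collections import OrderedDict

LINE = 64  # assumed cache line size in bytes


def count_misses(nursery_bytes, cache_bytes, passes=3, obj_bytes=16):
    """Bump-allocate through the nursery `passes` times.

    Returns the number of cache-line misses in each pass, using a
    fully associative LRU cache model (a simplification; no prefetcher).
    """
    cache = OrderedDict()          # keys are cache-line indices, in LRU order
    capacity = cache_bytes // LINE
    misses_per_pass = []
    for _ in range(passes):
        misses = 0
        # Each step is one allocation: the bump pointer advances by obj_bytes.
        for addr in range(0, nursery_bytes, obj_bytes):
            line = addr // LINE
            if line in cache:
                cache.move_to_end(line)        # hit: mark as most recently used
            else:
                misses += 1                    # miss: line comes from main memory
                cache[line] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict least recently used line
        misses_per_pass.append(misses)
    return misses_per_pass


if __name__ == "__main__":
    print(count_misses(4096, 8192))    # nursery fits in cache: [64, 0, 0]
    print(count_misses(16384, 8192))   # nursery is 2x the cache: [256, 256, 256]
```

In this model, a nursery that fits in the cache only misses on the first pass, while a nursery larger than the cache misses on every new line in every pass, because the line it is about to reuse has always just been evicted.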
An extreme case of that problem happens when a garbage-collected app gets swapped out: touching the heap then means faulting pages back in from disk, and performance drops to virtually zero.