Game Security: Is Your Memory Okay?

@codewiz · December 19, 2013 · 11 min read

640K ought to be enough for anybody.

– Bill Gates

It seems there was a time when Bill Gates, a pioneering computer engineer who amassed the greatest fortune anyone has made with computers, thought 640K of memory should be enough to run every program in the world. Whether he actually made this statement is uncertain, but what's clear is that a considerable amount of time has passed since that mindset was common. In that time, CPUs have evolved, operating systems have changed, and so have programs. Nowadays, a program that runs in only 640K would be the real surprise.

Modern programs use an unimaginable amount of memory, and games are perhaps the prime example. Games remain among the applications that demand the most memory and CPU resources. Game security software has to run alongside these resource-hungry programs, so it operates under various constraints, and memory is one of them. In one case, a game could not ship a large content patch because its security solution consumed too much memory, and the publisher switched to our product as a result. Regardless, it's undeniable that cohabitation is not easy between resource-intensive games and security solutions that must use memory to do their job.

CPU issues have often been raised, but memory issues have not been highlighted as much, leading us to believe that all was well with memory usage. However, a hot new game that requires more memory than previous titles proved that assumption wrong. We had to forego some key features while working on making memory usage more efficient to accommodate this popular game, and the process was honestly more interesting than I expected, prompting me to share a bit about it here.

#0

The initial issue that was raised was memory fragmentation. In reality, we don't use an inordinate amount of memory, so if fragmentation was truly the issue, there could only be one culprit: ASLR (Address Space Layout Randomization). We have mentioned in previous posts that we use ASLR to make code analysis difficult by dispersing code throughout memory. However, this could be problematic if large contiguous memory spaces are required. We turned off this option, and the game worked fine. It appeared the problem had been solved, but I thought, "At some later date, I will have to develop an ASLR algorithm that minimizes the impact on fragmentation..."

#1

But the issue that seemed to have disappeared had only been delayed. Over time, as the game continued to be played, the fragmentation problem resurfaced. Even with ASLR turned off, our own allocation and release cycles could still contribute to fragmentation.

A large-scale construction project began with the determination not to add even a speck to the game's memory fragmentation. We developed a memory pool and restructured all of our chunk allocations to come from that pool. The rationale was this: even if allocation and release cycles continued, they would occur within this predetermined pool, minimizing the impact on the game and keeping fragmentation out of the game's own address space.
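
The idea, in rough strokes, looks something like the sketch below. This is only an illustration for this post, not our actual code: the names PoolInit and PoolAlloc are made up, and a real pool also needs a free list, reuse of released chunks, and thread safety. The point is simply that the address space is reserved once up front and individual chunks are committed inside it.

#include <windows.h>

// Hypothetical pool sketch: one reserved region, bump-style commits inside it.
static BYTE*  g_poolBase;   // start of the reserved region
static SIZE_T g_poolSize;   // total reserved size
static SIZE_T g_poolUsed;   // bytes committed and handed out so far

BOOL PoolInit(SIZE_T size)
{
    // Reserve the whole pool once; physical pages are committed on demand.
    g_poolBase = (BYTE*)VirtualAlloc(NULL, size, MEM_RESERVE, PAGE_READWRITE);
    g_poolSize = size;
    g_poolUsed = 0;
    return g_poolBase != NULL;
}

void* PoolAlloc(SIZE_T bytes)
{
    // Round up to the 4 KB page size so the commit below stays page-aligned.
    SIZE_T rounded = (bytes + 0xFFF) & ~(SIZE_T)0xFFF;
    if (g_poolUsed + rounded > g_poolSize)
        return NULL;                        // pool exhausted

    void* p = VirtualAlloc(g_poolBase + g_poolUsed, rounded,
                           MEM_COMMIT, PAGE_READWRITE);
    if (p != NULL)
        g_poolUsed += rounded;
    return p;
}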

Disabling ASLR and concentrating allocations in a specific area by using a memory pool actually has nothing to do with the security aspects I mentioned in previous posts. However, in the real world, functionality rightly takes precedence over security. After all, if the game cannot be played, what's the point of worrying about catching cheaters?

The "massive" construction project wasn't as extensive as I initially thought; there weren't that many places using chunk memory. When the pool project was wrapping up, I boldly allocated a pool size of 100 megabytes. We had two pools, so effectively, we were starting the game with an initial memory space footprint of 200 megabytes. Although we never use that much memory, I didn't want to risk affecting game fragmentation, so I went with a generous size. At that point, I thought fragmention was the problem, so it seemed like a wise decision.

#2

But this "massive" reconstruction was still insufficient for satisfying the requirements of the new hot game we were working with. It was at this point I was able to speak with the game programmers and learned that the real issue wasn't fragmentation; the game was running on a tight memory budget, almost maxing out at 2 GB. They mentioned that if a third-party program used too much space, problems would arise, and it'd be ideal if the program could operate within a 64 MB margin. Essentially, the problem was not just fragmentation but also an overall shortage of memory space.

In an age where memory is so cheap, it may seem odd to fret about memory capacity, especially when even phones ship with more than 2 GB of memory. However, we are not talking about physical memory here but about address space. Address space is the range of addresses a process can map at any one time; having 16 GB of RAM doesn't mean a single program can use it all at once. That becomes possible with a 64-bit CPU, a 64-bit operating system, and a 64-bit application, but many games are not ready for that environment yet, so we are constrained by 32-bit applications. In a typical Windows environment, the address space available to a 32-bit application is limited to 2 GB.
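
You can check these limits yourself with GetSystemInfo. A throwaway snippet like the one below (mine, written for this post, not part of any product) prints the lowest and highest addresses available to the process along with the page size and allocation granularity; on a normal 32-bit process the maximum application address comes out just below the 2 GB mark.

#include <windows.h>
#include <stdio.h>

int main(void)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);

    // Usable address range for applications, plus the page size and the
    // 64 KB allocation granularity discussed below.
    printf("min app address   : %p\n", si.lpMinimumApplicationAddress);
    printf("max app address   : %p\n", si.lpMaximumApplicationAddress);
    printf("page size         : 0x%lx\n", si.dwPageSize);
    printf("alloc granularity : 0x%lx\n", si.dwAllocationGranularity);
    return 0;
}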

The most blatant waste of address space is often due to improper use of the VirtualAlloc function. In a 32-bit environment, Windows uses 64 KB (0x10000) as its allocation unit and 4 KB (0x1000) as its page size. Because of this 64 KB allocation unit, calling VirtualAlloc with a size that is not a multiple of 64 KB leaves unusable address space behind, which is known as slack.

PVOID ptr = VirtualAlloc(NULL, 0x8000, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);

Consider this simple allocation code, where VirtualAlloc is called to allocate 32 KB. If memory is available, the call succeeds and ptr receives a valid address. Normally there's no issue with such a call, but when address space is scarce, even this becomes a luxury. As mentioned earlier, Windows allocates on 64 KB boundaries, so the call above leaves the 32 KB following ptr unusable and dead. To eliminate this slack, memory should be used diligently, as below: the following code reserves a 64 KB region and commits physical memory only for the first 32 KB, leaving the option to commit the remaining 32 KB later when needed.

// Reserve a full 64 KB region so no address space is left stranded.
PVOID ptr = VirtualAlloc(NULL, 0x10000, MEM_RESERVE, PAGE_READWRITE);
// Commit physical memory for only the first 32 KB; the remaining 32 KB
// can be committed later with another MEM_COMMIT call inside the region.
VirtualAlloc(ptr, 0x8000, MEM_COMMIT, PAGE_READWRITE);

To manage allocations this meticulously and avoid slack, you essentially end up writing a dedicated pool memory manager API. If the purpose is simply to allocate readable/writable space, it's more practical to use the heap APIs: internally they obtain large chunks via VirtualAlloc, and the performance hit is rarely as significant as most programmers assume. And once you factor in the effort of implementing your own memory manager, the two approaches end up looking rather similar.
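
As a plain illustration of that alternative (not our actual code), something like the following lets the heap manager pack many buffers into the large regions it obtains from VirtualAlloc internally, instead of each buffer burning its own 64 KB granule:

#include <windows.h>

void Example(void)
{
    // A private, growable heap. The heap manager acquires address space
    // from VirtualAlloc in large chunks and packs allocations into them,
    // so the 64 KB granularity is paid per chunk, not per allocation.
    HANDLE hHeap = HeapCreate(0, 0, 0);
    if (hHeap == NULL)
        return;

    void* buf1 = HeapAlloc(hHeap, HEAP_ZERO_MEMORY, 0x8000);   // 32 KB
    void* buf2 = HeapAlloc(hHeap, HEAP_ZERO_MEMORY, 0x11000);  // 68 KB

    // ... use the buffers ...

    HeapFree(hHeap, 0, buf2);
    HeapFree(hHeap, 0, buf1);
    HeapDestroy(hHeap);
}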

#4

The pool size was intentionally set large, so changing the 100 MB value to just 10 MB would suffice. We worked on adjusting the pool size to an optimal minimum plus a margin, and in order to assist the game developers with their dilemma, we decided to inspect their memory space. My first task was to extract memory maps from situations where the game ceased functioning. Thankfully, the diligent game programmers provided us with a complete dump file, which enabled me to readily extract a detailed memory map like the one below.

  BaseAddr EndAddr+1 RgnSize     Type       State      Protect       Usage
-------------------------------------------------------------------------------------------
*        0    10000    10000             MEM_FREE    PAGE_NOACCESS   Free 
*    10000    20000    10000 MEM_MAPPED  MEM_COMMIT  PAGE_READWRITE   
*    20000    21000     1000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE   
*    21000    30000     f000             MEM_FREE    PAGE_NOACCESS   Free 
*    30000    34000     4000 MEM_MAPPED  MEM_COMMIT  PAGE_READONLY   ActivationContextData 
*    34000    40000     c000             MEM_FREE    PAGE_NOACCESS   Free 
*    40000    42000     2000 MEM_MAPPED  MEM_COMMIT  PAGE_READONLY   ActivationContextData 
*    42000    50000     e000             MEM_FREE    PAGE_NOACCESS   Free 

...

An extensive amount of log output was produced. It was evident the game was using a substantial amount of memory. I separated out the free areas to check whether allocations could have failed, sorting them by size. The largest free chunk was 0xF00000, an area allowing a 15 MB allocation. However, the game seemed to use memory chunks approaching 20 MB, and if a chunk of that size had been requested, the allocation could have failed.
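
The map above came from the dump, but the same kind of listing can be produced inside a live process by walking the address space with VirtualQuery. The sketch below, written for this post rather than taken from our tools, walks every region and reports the size of the largest free block, which is essentially the same check: find the biggest hole.

#include <windows.h>
#include <stdio.h>

void DumpLargestFreeRegion(void)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);

    BYTE*  addr    = (BYTE*)si.lpMinimumApplicationAddress;
    SIZE_T largest = 0;

    // Walk the process address space one region at a time.
    while (addr < (BYTE*)si.lpMaximumApplicationAddress)
    {
        MEMORY_BASIC_INFORMATION mbi;
        if (VirtualQuery(addr, &mbi, sizeof(mbi)) == 0)
            break;

        if (mbi.State == MEM_FREE && mbi.RegionSize > largest)
            largest = mbi.RegionSize;

        addr = (BYTE*)mbi.BaseAddress + mbi.RegionSize;
    }

    // %Ix is the MSVC format specifier for SIZE_T.
    printf("largest free region: 0x%Ix bytes\n", largest);
}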

There were memory fragments scattered here and there, as the game programmer had mentioned. In the list below, regions such as 6d10c000 and 4df31000 are free, but the areas adjacent to them are already in use, so they remain isolated fragments.

* 7b320000 7c220000   f00000             MEM_FREE    PAGE_NOACCESS     Free 
* 712f6000 72180000   e8a000             MEM_FREE    PAGE_NOACCESS     Free 
* 7e1c0000 7f030000   e70000             MEM_FREE    PAGE_NOACCESS     Free 
* 6d10c000 6def0000   de4000             MEM_FREE    PAGE_NOACCESS     Free 
* 4df31000 4ed00000   dcf000             MEM_FREE    PAGE_NOACCESS     Free 
* 4baf1000 4c560000   a6f000             MEM_FREE    PAGE_NOACCESS     Free 
* 67570000 67d40000   7d0000             MEM_FREE    PAGE_NOACCESS     Free 

...

Next, I separated out the allocated areas to investigate. I believed the problem was not the total amount but the allocation unit, and that addressing the VirtualAlloc slack could save significant memory. The log results seemed to support my theory, with countless allocations of sizes like 0x12000, 0x11000, and so on. Bingo~ By eliminating the slack in these areas, the game could free up much more space for its own memory usage.

* 584d0000 5998a000  14ba000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE      
* 568c2000 57500000   c3e000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE  
* 5e3b0000 5ee5c000   aac000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE  

...

* 440b0000 440c2000    12000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE  
*  dda0000  ddb1000    11000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE  
*  dde0000  ddf1000    11000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE  
*  de20000  de31000    11000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE  
*  de40000  de51000    11000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE  
*  de60000  de71000    11000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE  

...

#5

If I had had more time, I might have rushed to prove this right away. Offering a security solution bundled with a memory manager API does sound cool. However, an important external meeting was on the schedule, so the idea had to wait. In retrospect, not rushing into it likely saved me.

Back at work, when I began drafting the memory manager API, it turned out to be more complicated than expected. We could have wired our pool manager directly to an API for the game to call, but the hard part was ensuring that hooking VirtualAlloc would not lead to recursive calls inside the hook function. That meant crafting entirely new code that avoided all dynamic allocation, a tiresome task. It also wasn't reasonable to ask the game company to apply an untested solution.
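
To give a sense of the reentrancy problem, the fragment below is a hypothetical VirtualAlloc hook that uses a thread-local flag so that any allocation our own code makes inside the hook goes straight to the original function instead of back into the hook. The hook installation itself (trampolines, patching, and so on) is omitted, and none of this is what actually shipped; it only illustrates why the code gets tiresome.

#include <windows.h>

// Pointer to the original VirtualAlloc, filled in by the (omitted) hook
// installation code.
static LPVOID (WINAPI *g_origVirtualAlloc)(LPVOID, SIZE_T, DWORD, DWORD);

// Per-thread flag used to detect re-entry into the hook.
static __declspec(thread) BOOL t_inHook = FALSE;

LPVOID WINAPI HookedVirtualAlloc(LPVOID addr, SIZE_T size,
                                 DWORD type, DWORD protect)
{
    // If our own bookkeeping below allocates memory, that call must not
    // re-enter this hook, or we recurse forever.
    if (t_inHook)
        return g_origVirtualAlloc(addr, size, type, protect);

    t_inHook = TRUE;
    // ... redirect the request into the pool / record statistics here ...
    LPVOID result = g_origVirtualAlloc(addr, size, type, protect);
    t_inHook = FALSE;

    return result;
}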

What to do? As I pondered this, a bright idea struck me: let's first determine whether this really makes sense. Exactly how much slack space could be reclaimed with this change? If the theoretical benefit looked promising, it would be worth discussing with the game company. To find out, I began categorizing the allocated memory by size. Since the game was using VirtualAlloc for allocations larger than 32 KB, I started by investigating allocations in the range of 32 KB (0x8000) to 128 KB (0x20000). The counts for each size in this range are as follows.

00008000: 30
00009000: 39
0000a000: 33
0000b000: 11
0000c000: 35
0000d000: 20
0000e000: 51
0000f000: 240
00010000: 4996
00011000: 403
00012000: 16
00013000: 10
00014000: 8
00015000: 12
00016000: 4
00017000: 5
00018000: 6
00019000: 3
0001a000: 4
0001b000: 4
0001c000: 4
0001d000: 6
0001e000: 1
0001f000: 12
00020000: 7

With a perfect memory pool, how much slack space could be saved? The 403 allocations of 0x11000 seemed to smile at me. But the joy was short-lived: the calculation disproved my assumptions. The total space used by the allocations in the list was 404 MB, with only 27 MB wasted as slack, a far cry from what I had hoped. Of course, 27 MB is not negligible compared to 404 MB. Encouraged by that ratio, I extended the calculation to the entire set of allocations, only to find that the total used space was 1331 MB, with a mere 40 MB wasted as slack. A shock, to say the least.
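
The arithmetic behind those numbers is simple: round each allocation size up to the 64 KB granularity, and the difference is slack. A back-of-the-envelope check along these lines (with only a few entries of the table above hard-coded as an example) is all it takes:

#include <stdio.h>

int main(void)
{
    // Sample entries only; the real input was the full size histogram.
    struct { unsigned size; unsigned count; } hist[] = {
        { 0x10000, 4996 },   // exactly one 64 KB granule, no slack
        { 0x11000,  403 },   // 68 KB: wastes 60 KB per allocation
        { 0x0f000,  240 },   // 60 KB: wastes  4 KB per allocation
    };

    unsigned long long used = 0, slack = 0;
    for (int i = 0; i < 3; ++i)
    {
        // Round each size up to the 64 KB allocation granularity.
        unsigned long long rounded =
            ((unsigned long long)hist[i].size + 0xFFFF) & ~0xFFFFULL;
        used  += (unsigned long long)hist[i].size * hist[i].count;
        slack += (rounded - hist[i].size) * hist[i].count;
    }

    printf("used  = %llu MB\n", used  >> 20);
    printf("slack = %llu MB\n", slack >> 20);
    return 0;
}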

#6

Mother used to say, "Use Python when possible, dear. Remember that there's nothing as foolish as premature optimization. Your time is always more valuable than the CPU." She was wise. Not jumping the gun seems to have been the right choice after all. While 40 MB is not negligible and may well be a tipping point, assuming that an extra 20 MB or so would have let the game keep running by cycling through deallocation and reallocation is stretching it too far.

At the close of the issue, this thought occurred to me: the 64-bit era is not far off. Just as we moved to 32-bit architectures to break free from the 1 MB limit, the 2 GB space now also feels too constrained. Given that smartphone CPUs are heading for 64-bit, it's certainly indicative of the times.

Now that we've somewhat guaranteed stability for cohabitation, it's time to win back security by segmenting the pool and randomizing allocations within it. It feels like walking along a Möbius strip. What is this feeling, huh?

If you've read through to the end of this post, now might be a good time to relax with a cup of coffee...

Truth is, slack space isn't all that bad. It generously allows for buffer overruns with grace, heh.

Just like memory, and life in general, it's good to have some spare room.
