JVM Memory Model and Garbage Collection
A well-known aspect of running code in languages that target the JVM is garbage collection: the mechanism of running background threads to release unused memory and avoid fragmentation.
In Java programs, memory management is abstracted away by the JVM. When we start a JVM process, the kernel allocates memory to it as virtual memory addresses that map to user-space memory. (Note: I am not sure how much is allocated at start and whether this is fixed or grows on demand; I will update this in future edits.)
Once memory is allocated to the JVM process, it is up to the JVM to manage the memory blocks, i.e. to allocate memory and reclaim unused memory.
In this blog I will try to describe the JVM memory model and how garbage collection works, and point out various JVM parameters that can be used to configure GC settings. I will also try to mention a few tools that help in debugging the JVM in production.
Note: This is a WIP blog. I will update this as I learn more.
Java memory model:
The memory used by the JVM can be broken into the following components:
MetaSpace
JIT Code cache
Thread Stacks
Heap
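To see these regions on a running JVM, the standard java.lang.management API exposes them as memory pools. The snippet below is a minimal sketch: the class name MemoryPools is only illustrative, and the pool names it prints (for example "Metaspace" or "CodeCache") depend on the JVM version and the collector in use. Thread stacks are not reported as a pool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {
    // Returns one line per memory pool: "<HEAP|NON_HEAP> <pool name> used=<bytes>".
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            long used = (usage == null) ? -1 : usage.getUsed();
            lines.add(pool.getType().name() + " " + pool.getName() + " used=" + used);
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

On a typical HotSpot JVM the HEAP lines correspond to Eden, the survivor space and the old generation, while the NON_HEAP lines cover Metaspace and the JIT code cache.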
MetaSpace:
Up to JDK 7 this region was known as PermGen (Permanent Generation) space, so named because its contents were expected to live for the lifetime of the application. JDK 8 replaced it with Metaspace, which is allocated from native memory rather than from the heap.
This is the memory space used to store the class files loaded by the JVM class loaders. Since a typical Java application tends to link to various libraries, all of these need to be loaded into Metaspace. It is designed to grow based on need.
But issues arise if the class loaders end up loading more class data than fits in physical RAM. Since Linux assigns only virtual memory addresses to a process, if swap is enabled in the kernel options some of the process's memory will be moved to the swap file. This will degrade the performance of the Java process running on the JVM.
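Alternatively, JVM arguments can be specified to limit the size of Metaspace. The JVM then fails with java.lang.OutOfMemoryError: Metaspace when the limit is exceeded.
-XX:MaxMetaspaceSize=128m // caps Metaspace at 128 MB
(-XX:MetaspaceSize, by contrast, is not a cap: it sets the usage threshold at which a GC is first triggered to unload classes.)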
Note: in typical long-running server applications, the number of loaded classes increases during server bootstrap and then remains fairly static. The exceptions are applications that load classes at runtime, e.g. over the network, which is in fact dangerous from a security standpoint.
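One way to check whether the loaded class count really does level off after bootstrap is to poll the standard ClassLoadingMXBean. This is a small sketch; the class name LoadedClassStats and the array layout are just choices made for this example.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

class LoadedClassStats {
    // Returns {currently loaded, total loaded since JVM start, unloaded since JVM start}.
    static long[] snapshot() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return new long[] {
            bean.getLoadedClassCount(),
            bean.getTotalLoadedClassCount(),
            bean.getUnloadedClassCount()
        };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        System.out.println("loaded now:   " + s[0]);
        System.out.println("loaded total: " + s[1]);
        System.out.println("unloaded:     " + s[2]);
    }
}
```

Logging this snapshot periodically in a server should show the "loaded now" figure flattening out once startup is finished; a figure that keeps climbing is a hint that something is generating or loading classes at runtime.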
JIT CodeCache:
If you recall Java 101 basics, Java takes your source code in *.java files and compiles it to bytecode in *.class files. This is still not machine-specific code (machine-specific code varies with your CPU architecture and OS). Bytecode is run on the JVM, so the JVM has a JIT (just-in-time) compiler which converts bytecode to machine-specific code. If the JVM determines that a block of code is executed frequently, it keeps the compiled native code in the code cache, which avoids the cost of interpreting or recompiling it each time.
The basic JVM argument for setting the code cache size is -XX:InitialCodeCacheSize=32m (the maximum is set with -XX:ReservedCodeCacheSize).
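To watch the JIT at work, the standard CompilationMXBean reports how much time the JVM has spent compiling so far. The sketch below runs a deliberately hot method and prints the compilation time before and after. The names JitTime and hotLoop are made up for this example, and the numbers printed will differ from machine to machine.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

class JitTime {
    // Accumulated JIT compilation time in milliseconds, or -1 if the JVM
    // has no JIT compiler or does not support this measurement.
    static long totalCompilationMillis() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null || !jit.isCompilationTimeMonitoringSupported()) {
            return -1;
        }
        return jit.getTotalCompilationTime();
    }

    // A small method that becomes "hot" when called many times.
    static long hotLoop(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i % 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        long before = totalCompilationMillis();
        long sink = 0;
        for (int i = 0; i < 20_000; i++) {
            sink += hotLoop(1_000);
        }
        long after = totalCompilationMillis();
        System.out.println("result: " + sink);
        System.out.println("JIT time before: " + before + " ms, after: " + after + " ms");
    }
}
```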
ThreadStacks:
The part of JVM memory used to store a stack for each thread running in the JVM.
More on this will be updated later.
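In the meantime, here is a small sketch that makes the per-thread stack visible: it recurses inside a thread until the stack overflows and reports how many frames fit. It uses the Thread constructor that takes a stack size; note that the JVM treats this size only as a hint and may round it or ignore it on some platforms. The class name StackDepth is just for this example. (The default stack size for threads can be set with the -Xss option.)

```java
class StackDepth {
    // Recurses until the stack overflows, counting frames in depth[0].
    private static void recurse(int[] depth) {
        depth[0]++;
        recurse(depth);
    }

    // Runs the recursion in a thread created with the requested stack size
    // and returns how many frames were pushed before StackOverflowError.
    static int measure(long stackSizeBytes) {
        int[] depth = {0};
        Runnable probe = () -> {
            try {
                recurse(depth);
            } catch (StackOverflowError expected) {
                // The thread ran out of stack; depth[0] holds the frame count.
            }
        };
        Thread t = new Thread(null, probe, "stack-probe", stackSizeBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while probing", e);
        }
        return depth[0];
    }

    public static void main(String[] args) {
        System.out.println("256 KB stack: " + measure(256 * 1024) + " frames");
        System.out.println("2 MB stack:   " + measure(2 * 1024 * 1024) + " frames");
    }
}
```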
Heap:
The major part of the JVM memory, which is used to store objects.
So when you do:
String a = "test";
You are creating a String object "test", an instance of the String class, in which the characters of "test" are held in a char[]. (JDK 9 and later use a byte[] instead, which makes Latin-1 strings smaller than this estimate.)
Size of the char array = 2N + 24; // number of chars N = 4 => 32 bytes
Total size of the object "test" ⇒ 32 + 16 (object overhead) + 4 (hash) + 4 (buffer) = 56 bytes. (The size is an approximate estimate; I could be wrong in the calculation of object overhead and array.)
So the JVM needs to allocate about 56 bytes in the heap in order to process the literal "test". Another 8 bytes are needed for the reference "String a", which stores the address of the object "test".
Heap size in jvm can be controlled using -Xms and -Xmx (or: -XX:InitialHeapSize and -XX:MaxHeapSize)
Eg: java -Xms128m -Xmx2g MyApp
Note: this heap is cleaned up by the garbage collector, which lets us keep creating new objects without worrying about destroying unused ones. If the live objects no longer fit within the configured MaxHeapSize, the JVM throws an OutOfMemoryError.
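To check what the heap settings turned out to be inside a running program, java.lang.Runtime exposes three figures. This is a minimal sketch (the class name HeapLimits is only for this example); maxMemory() corresponds roughly to -Xmx, though the exact value reported can be slightly lower depending on the collector.

```java
class HeapLimits {
    // Returns {max heap the JVM will attempt to use, heap currently reserved, free part of it}, in bytes.
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.maxMemory(), rt.totalMemory(), rt.freeMemory() };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        long mb = 1024L * 1024L;
        System.out.println("max   (roughly -Xmx): " + s[0] / mb + " MB");
        System.out.println("total (current heap): " + s[1] / mb + " MB");
        System.out.println("free  (within total): " + s[2] / mb + " MB");
    }
}
```

Running it as java -Xms128m -Xmx2g HeapLimits should show a total that starts near 128 MB and a max near 2 GB.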
The heap is broken down into 2 regions:
Young Gen
Old Gen (Tenured region)
YoungGen, as the name implies, is the region where all objects are created and where short-lived objects live.
YoungGen is further broken into 2 components:
Eden
Survivor Spaces (S0 and S1)
Eden is where all objects are created first.
Garbage Collection:
There are 2 kinds of GC: minor GC and major GC.
Minor GC:
Note: some details given below can vary based on the type of garbage collector in use.
Minor GC runs when the JVM is not able to get enough memory from Eden.
Minor GC finds the blocks of memory which are still in use by walking the thread stacks (and other GC roots) and following their pointers to memory locations. (Note this implies threads need to be paused during this run so that the pointers don't change. Hence minor GC does create a stop-the-world pause, which is usually very short.)
Minor GC can run on a single thread or on multiple threads, depending on the collector in use.
Once it has identified the objects in Eden that are still in use, it runs a copy collection: all live data is moved to a survivor space, and Eden can then be cleared as a whole.
The survivor space consists of 2 survivor spaces (S0 and S1) which are equal in size.
At any given point in time one survivor space is the ToSpace and the other is the FromSpace.
So if S0 is the ToSpace in the first run and S1 is the FromSpace, and both are empty, the objects in Eden that survive the first GC are copied to S0.
In the second run the roles swap and S1 is the ToSpace, which means that both the data from Eden which is not collected and the old data in S0 which survived the first GC are copied to S1.
Note: If an object is too big to fit into S1 or S0 during either collection, it is pushed into the Tenured region. (This is called premature promotion, one of the issues you can run into if you are not cognizant of your data size and have not allocated sufficient survivor space and Eden space.)
After each collection the object's age is incremented, so a long-living object will flip-flop between the S0 and S1 regions at each minor GC.
If an object's age exceeds a configured tenuring threshold, it is copied over to the OldGen/Tenured region. By default this threshold is 15.
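The copy-and-age behaviour described above can be sketched as a toy model. To be clear, this is not how HotSpot is implemented: the class ToyYoungGen, its fields, the explicit "live" flag and the capacity counted in objects rather than bytes are all simplifications invented for this example. It only mirrors the rules in the text: live objects are copied to the ToSpace, their age goes up by one per collection, objects that reach the threshold or do not fit are promoted, and the two survivor spaces swap roles.

```java
import java.util.ArrayList;
import java.util.List;

class ToyYoungGen {
    static final class Obj {
        final String name;
        int age = 0;
        boolean live = true; // stands in for "reachable from a GC root"
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> fromSpace = new ArrayList<>();
    List<Obj> toSpace = new ArrayList<>();
    final List<Obj> oldGen = new ArrayList<>();
    final int survivorCapacity;   // max objects per survivor space (toy unit)
    final int tenuringThreshold;  // age at which an object is promoted

    ToyYoungGen(int survivorCapacity, int tenuringThreshold) {
        this.survivorCapacity = survivorCapacity;
        this.tenuringThreshold = tenuringThreshold;
    }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o); // all new objects start in Eden
        return o;
    }

    void minorGc() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(fromSpace);
        for (Obj o : candidates) {
            if (!o.live) {
                continue; // dead objects are simply not copied
            }
            o.age++;
            if (o.age >= tenuringThreshold || toSpace.size() >= survivorCapacity) {
                oldGen.add(o); // old enough, or ToSpace is full (premature promotion)
            } else {
                toSpace.add(o);
            }
        }
        eden.clear();
        fromSpace.clear();
        List<Obj> tmp = fromSpace; // swap roles for the next collection
        fromSpace = toSpace;
        toSpace = tmp;
    }

    public static void main(String[] args) {
        ToyYoungGen gen = new ToyYoungGen(2, 2);
        gen.allocate("a");
        gen.allocate("b");
        gen.allocate("c").live = false;
        gen.minorGc();
        gen.allocate("d");
        gen.allocate("e");
        gen.allocate("f");
        gen.minorGc();
        System.out.println("survivors: " + gen.fromSpace.size() + ", old gen: " + gen.oldGen.size());
    }
}
```

In the main method, "f" ends up in the old generation after a single collection only because the survivor space is already full, which is the premature promotion case mentioned in the note above.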
We know that to configure the heap space to be used we can use
-Xms512m -Xmx512m
Now to configure the portion of the heap to be used by YoungGen we can use
-XX:NewRatio=2 -XX:SurvivorRatio=8
These options are not quite straightforward and need some explanation.
NewRatio=2 means that, out of the total heap allocated, the ratio OldGen / YoungGen == NewRatio.
Hence NewRatio=2 implies ⅓ of heap memory will be used for YoungGen. In this case close to 171 MB will be used for YoungGen and 341 MB for OldGen.
Now SurvivorRatio=8 means that the ratio Eden / one survivor space == SurvivorRatio, so each survivor space is ⅛ the size of Eden. YoungGen is Eden plus two survivor spaces, i.e. 8 + 1 + 1 = 10 parts.
Hence Eden would be 8/10 of the YoungGen space and each survivor space 1/10.
For the above example that would mean
Eden size would be 8/10 * 171 MB = 137 MB approx
Survivor spaces S0 and S1 will each be about 17 MB
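The arithmetic can be written out as a small helper. It assumes the usual HotSpot definitions, NewRatio = OldGen / YoungGen and SurvivorRatio = Eden / one survivor space, and it ignores the alignment and rounding a real JVM applies, so treat the results as estimates. The class name GenerationSizes is made up for this example.

```java
class GenerationSizes {
    // Returns {young, old, eden, oneSurvivor} in bytes for the given settings.
    static long[] compute(long heapBytes, int newRatio, int survivorRatio) {
        long young = heapBytes / (newRatio + 1);      // old : young = newRatio : 1
        long old = heapBytes - young;
        long survivor = young / (survivorRatio + 2);  // eden : s0 : s1 = ratio : 1 : 1
        long eden = young - 2 * survivor;
        return new long[] { young, old, eden, survivor };
    }

    static String mb(long bytes) {
        return String.format("%.1f MB", bytes / (1024.0 * 1024.0));
    }

    public static void main(String[] args) {
        long[] s = compute(512L * 1024 * 1024, 2, 8);
        System.out.println("young:    " + mb(s[0]));
        System.out.println("old:      " + mb(s[1]));
        System.out.println("eden:     " + mb(s[2]));
        System.out.println("survivor: " + mb(s[3]) + " each");
    }
}
```

For a 512 MB heap with NewRatio=2 and SurvivorRatio=8 this gives a young generation of about 171 MB, of which Eden takes roughly 137 MB and each survivor space roughly 17 MB.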
Note: the above options are relative ways of tuning the generation sizes. Alternatively we can specify an explicit size for the young generation, which overrides NewRatio, using
-XX:NewSize=171m -XX:MaxNewSize=171m
-XX:MaxTenuringThreshold=15 // changes the tenuring threshold after which an object moves to the Tenured region; the default is 15.
Major GC:
If the Tenured region fills up, the JVM needs to trigger a major GC, or full GC.
System.gc() and Runtime.getRuntime().gc() suggest that the JVM initiate a GC; the JVM is free to ignore the request.
A full GC removes unused objects in the Tenured region and also tries to reclaim space in Metaspace: classes whose class loader is no longer reachable (and which therefore have no live objects on the heap) can be unloaded.
If a Metaspace threshold is provided (-XX:MetaspaceSize), a GC is triggered to reclaim Metaspace when usage reaches that threshold.
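To see how often minor and major collections actually run, the standard GarbageCollectorMXBean reports a count and accumulated time per collector. This is a minimal sketch with an illustrative class name (GcCounters). The collector names it prints depend on which GC is selected, and since System.gc() is only a request, the counts may or may not change after the call.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcCounters {
    // Returns one line per collector: "<name> count=<collections> timeMs=<accumulated ms>".
    // A value of -1 means the JVM does not report that figure.
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + " count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println("before: " + describe());
        System.gc(); // a suggestion only; the JVM may ignore it
        System.out.println("after:  " + describe());
    }
}
```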
Jvm cmd line tools: (TODO will update this shortly)
jstack
jmap
jps
jcmd
jinfo
java -XX:+PrintFlagsFinal -XX:+UseG1GC -version
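Until this section is filled in, a rough in-process counterpart to inspecting a JVM's flags from the command line is the standard RuntimeMXBean, which returns the arguments the JVM was started with. The sketch below filters them down to the memory and -XX options discussed in this post. The class name JvmFlags and the filter rule are choices made for this example only.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

class JvmFlags {
    // Arguments passed to the JVM itself (not to the application's main method).
    static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    // Keeps only heap size options and -XX options.
    static List<String> memoryFlags(List<String> args) {
        return args.stream()
                .filter(a -> a.startsWith("-Xms") || a.startsWith("-Xmx") || a.startsWith("-XX:"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("all JVM args: " + inputArguments());
        System.out.println("memory flags: " + memoryFlags(inputArguments()));
    }
}
```

Starting it with java -Xms128m -Xmx2g -XX:+UseG1GC JvmFlags should list those three options back.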












