How does the Garbage Collector work?
The Garbage Collector runs in three phases: the Mark Phase, the Clean Phase, and the Compact Phase.
In the Mark Phase, the Garbage Collector creates a list of live objects by traversing the object graph from its roots. At the end of this phase, every object that is not in this list is considered a candidate for deletion.
The Clean Phase is also known as the ‘Relocating Phase’. In this phase, the objects marked for deletion are cleaned up, and the references to the objects in the live list are updated to point to the new memory addresses that the objects will be moved to during the Compact Phase.
In the Compact Phase, the managed heap is compacted: the space occupied by the dead objects is released and the surviving objects are moved to their new locations. Figure-6 illustrates the states of a memory heap during the phases of the Garbage Collector.
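The three phases can be sketched with a toy model. This is only an illustration, not how the CLR is implemented: the `ToyObject` and `ToyGc` types are hypothetical, the "heap" is a plain list, and an "address" is just a slot index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model only: the real CLR heap works on raw memory, not on lists.
public sealed class ToyObject
{
    public string Name;
    public int Address;                                  // slot index in the toy heap
    public List<ToyObject> References = new List<ToyObject>();
    public ToyObject(string name) { Name = name; }
}

public static class ToyGc
{
    // Mark: collect every object reachable from the roots.
    public static HashSet<ToyObject> Mark(IEnumerable<ToyObject> roots)
    {
        var live = new HashSet<ToyObject>();
        var pending = new Stack<ToyObject>(roots);
        while (pending.Count > 0)
        {
            var current = pending.Pop();
            if (!live.Add(current)) continue;            // already visited
            foreach (var child in current.References) pending.Push(child);
        }
        return live;
    }

    // Relocate: compute the new address of each live object, keeping heap order.
    public static Dictionary<ToyObject, int> PlanRelocation(
        List<ToyObject> heap, HashSet<ToyObject> live)
    {
        var plan = new Dictionary<ToyObject, int>();
        int next = 0;
        foreach (var obj in heap)
            if (live.Contains(obj)) plan[obj] = next++;
        return plan;
    }

    // Compact: drop dead objects and move survivors to their planned addresses.
    public static List<ToyObject> Compact(
        List<ToyObject> heap, Dictionary<ToyObject, int> plan)
    {
        var compacted = heap.Where(plan.ContainsKey).ToList();
        foreach (var obj in compacted) obj.Address = plan[obj];
        return compacted;
    }
}
```

In this sketch, a heap holding A, B, and C, where only A is a root and A references C, ends up with two survivors after the three calls: A stays in slot 0 and C moves from slot 2 to slot 1.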
Large Object Heap (LOH)
The Large Object Heap is a special memory segment in the managed heap used to store objects of 85,000 bytes or more, such as large XML or JSON strings or large byte arrays. The idea behind the LOH is to avoid performance losses during promotions between generations, because copying large objects in memory is very expensive.
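One way to observe this from code is to ask the runtime which generation a freshly allocated array belongs to. The helper below is a hypothetical sketch; it relies on `GC.GetGeneration`, which reports LOH objects as generation 2 because the LOH is collected together with generation 2.

```csharp
using System;

public static class LohDemo
{
    // Allocates a byte array of the given length and returns the generation
    // the runtime reports for it immediately after allocation.
    public static int GenerationOfNewArray(int length)
    {
        var buffer = new byte[length];
        return GC.GetGeneration(buffer);
    }
}
```

With the default threshold, `LohDemo.GenerationOfNewArray(100_000)` should report generation 2, while a small array such as `LohDemo.GenerationOfNewArray(1_000)` typically starts its life in generation 0.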
By default, the objects in the LOH are not compacted. This means that the LOH becomes fragmented over time. The white spaces in Figure-7 illustrate these memory fragments.
The Garbage Collector uses a linked list to keep track of the fragmented (free) sections in the memory. When a new large object is created, the Garbage Collector checks whether the new object fits into one of the free sections that it tracks. If not, the new object is appended at the end of the memory region. Although searching the list has a potential performance cost, this approach is still much cheaper than copying whole objects in memory.
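The allocation strategy described above can be sketched as a small first-fit free list. This is a toy model under stated assumptions, not the CLR's actual allocator: `ToyLargeObjectHeap` is a hypothetical type and addresses are plain integers.

```csharp
using System;
using System.Collections.Generic;

// Toy model: addresses are plain integers, not real memory.
public sealed class ToyLargeObjectHeap
{
    // Each entry is a free gap: (start address, size in bytes).
    private readonly LinkedList<(int Start, int Size)> _freeList =
        new LinkedList<(int Start, int Size)>();
    private int _end;   // first unused address at the end of the region

    public int End => _end;

    // Records a gap left behind by a collected large object.
    public void Free(int start, int size) => _freeList.AddLast((start, size));

    // Returns the address chosen for a new object of the given size.
    public int Allocate(int size)
    {
        for (var node = _freeList.First; node != null; node = node.Next)
        {
            var gap = node.Value;
            if (gap.Size < size) continue;               // gap too small, keep looking

            if (gap.Size == size) _freeList.Remove(node);
            else node.Value = (gap.Start + size, gap.Size - size);  // shrink the gap
            return gap.Start;
        }

        int address = _end;                              // no gap fits: append at the end
        _end += size;
        return address;
    }
}
```

In this sketch, a request that fits into a tracked gap reuses that gap, and a request that fits nowhere grows the region at its end, which is why the toy heap (like the real LOH) can stay fragmented.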
Starting with .NET Framework 4.5.1, it is possible to ask the Garbage Collector to compact the LOH. The details can be found in the documentation of GCSettings.LargeObjectHeapCompactionMode.
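A minimal sketch of requesting a one-time LOH compaction is shown below. The `LohCompaction` wrapper class is hypothetical; `GCSettings.LargeObjectHeapCompactionMode`, `GCLargeObjectHeapCompactionMode.CompactOnce`, and `GC.Collect` are the documented APIs.

```csharp
using System;
using System.Runtime;

public static class LohCompaction
{
    public static void CompactLargeObjectHeapOnce()
    {
        // Request that the next blocking generation 2 collection also compacts the LOH.
        GCSettings.LargeObjectHeapCompactionMode =
            GCLargeObjectHeapCompactionMode.CompactOnce;

        // Force a full blocking collection so the compaction happens now
        // instead of at some indeterminate later time.
        GC.Collect();
    }
}
```

According to the documentation, the setting reverts to `Default` once the compaction has taken place, so the request has to be repeated for each compaction. Because compacting the LOH means copying large objects, it is probably best reserved for cases where fragmentation is a measured problem.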
Some .NET reference types need to use native resources, such as a file or a network connection, which are unmanaged objects. Unmanaged means that this type of object doesn’t reside on the managed heap but is handled by the operating system in the unmanaged heap instead. Since the Garbage Collector can only manage the objects placed in the managed heap, releasing those native resources requires additional techniques.
Implementing the Finalize method is one of these techniques. The Finalize method is used for performing cleanups on native resources before the holding object is collected by the Garbage Collector.
Figure-8 gives an example of the Finalize method. In C#, a Finalize method is declared as a finalizer, which has the same name as the class it is defined in, prefixed with a tilde (~) character.
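For readers without access to the figure, a finalizer might look like the following sketch. The `NativeBuffer` class is a hypothetical example and not the code from the figure; it uses `Marshal.AllocHGlobal` to stand in for a native resource.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a block of unmanaged memory.
public class NativeBuffer
{
    private IntPtr _memory;

    public NativeBuffer(int size)
    {
        _memory = Marshal.AllocHGlobal(size);   // allocated outside the managed heap
    }

    public bool IsAllocated => _memory != IntPtr.Zero;

    // Finalizer: the C# compiler turns this into an override of Object.Finalize.
    ~NativeBuffer()
    {
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);       // release the native resource
            _memory = IntPtr.Zero;
        }
    }
}
```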
Figure-9 illustrates the memory layout of an executing process at certain points in time. The process has four objects in the managed heap, and two of those (Object-1 and Object-3) have a Finalize method. The Garbage Collector creates one entry in the Finalization queue for each object that has a Finalize method. Therefore, the Finalization queue has two items, one for each finalizable object.
To get a better understanding, let’s assume that at a certain point during the execution, the objects Object-1 and Object-2 are dead and need to be collected.
Since Object-2 doesn’t have a Finalize method, it will be collected on the next run of the Garbage Collector as usual. Object-1, however, has a Finalize method, so the Garbage Collector follows a different approach for this object. Here, the F-Reachable queue comes into play. Recall that the purpose of the F-Reachable queue is to handle the garbage objects that have a Finalize method. Since Object-1 is dead, its entry in the Finalization queue is moved to the F-Reachable queue, as shown in Figure-11.
Cleaning up objects
From now on, Object-1 waits for its Finalize method to be executed. The CLR is in charge of executing the Finalize methods. It has a dedicated high-priority thread for this purpose, called the Finalizer Thread. Whenever a new finalizable object is added to the F-Reachable queue, the Finalizer Thread wakes up and executes the Finalize method of the object. After its Finalize method has been executed, the object is removed from the F-Reachable queue as well. On the next run of the Garbage Collector, the object is ready to be collected and the memory it occupies will be reclaimed.
Figures 12 and 13 illustrate the status of the memory after Object-1 has been removed from the F-Reachable queue and collected from the managed heap. This is how the Finalization process works in a .NET environment.
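The walk-through above can be replayed with a small simulation. This is a toy model, not the CLR's implementation: `ToyFinalization` is a hypothetical type, objects are represented by their names, and the caller tells the collector which objects are dead.

```csharp
using System;
using System.Collections.Generic;

public sealed class ToyFinalization
{
    public readonly List<string> Heap = new List<string>();
    public readonly List<string> FinalizationQueue = new List<string>();
    public readonly Queue<string> FReachableQueue = new Queue<string>();

    public void Allocate(string name, bool hasFinalizer)
    {
        Heap.Add(name);
        if (hasFinalizer) FinalizationQueue.Add(name);   // registered at creation
    }

    // One Garbage Collector run; 'dead' lists the objects that are no longer referenced.
    public void Collect(ISet<string> dead)
    {
        foreach (var name in new List<string>(Heap))     // iterate over a copy
        {
            if (!dead.Contains(name)) continue;          // still alive
            if (FReachableQueue.Contains(name)) continue; // finalizer has not run yet

            if (FinalizationQueue.Remove(name))
                FReachableQueue.Enqueue(name);           // survives this run, awaits finalizer
            else
                Heap.Remove(name);                       // memory reclaimed
        }
    }

    // Stands in for the Finalizer Thread: "runs" each finalizer and empties the queue.
    public void RunFinalizerThread()
    {
        while (FReachableQueue.Count > 0) FReachableQueue.Dequeue();
    }
}
```

With four objects allocated as in Figure-9 and Object-1 and Object-2 passed as dead, the first `Collect` removes Object-2 from the heap but only moves Object-1 to the F-Reachable queue. After `RunFinalizerThread`, a second `Collect` finally removes Object-1, which shows why a finalizable object needs at least two collections before its memory is reclaimed.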
Although this approach is important for releasing native resources and reclaiming memory, it has some disadvantages. With this technique, it isn’t possible to control or to know when a Finalize method will be executed, because the execution of Finalize methods is entirely under the control of the Finalizer Thread. In addition, a faulty Finalize method can block the Finalizer Thread. If the Finalizer Thread is blocked, it may not be possible to reclaim the memory of finalizable objects. This is why it is better to avoid finalizers unless you really need them and you know what you are doing. Instead of a Finalize method, the Dispose pattern should be preferred.
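A minimal sketch of the Dispose pattern is shown below. The `NativeBufferOwner` class is a hypothetical example that owns a block of unmanaged memory; the shape (a public `Dispose`, a protected `Dispose(bool)`, and a finalizer as a safety net) follows the commonly documented pattern.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical owner of a block of unmanaged memory.
public class NativeBufferOwner : IDisposable
{
    private IntPtr _memory;
    private bool _disposed;

    public NativeBufferOwner(int size)
    {
        _memory = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed => _disposed;

    // Deterministic cleanup, called by the user or by a 'using' statement.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);   // cleanup is done, so the finalizer can be skipped
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed) return;       // safe to call more than once

        // 'disposing' is true when called from Dispose(); managed members
        // would be disposed here in that case. This sketch has none.
        Marshal.FreeHGlobal(_memory);
        _memory = IntPtr.Zero;
        _disposed = true;
    }

    // Safety net in case Dispose() was never called.
    ~NativeBufferOwner()
    {
        Dispose(false);
    }
}
```

Used as `using (var owner = new NativeBufferOwner(1024)) { /* work with the buffer */ }`, the native memory is released as soon as the block ends, and the call to `GC.SuppressFinalize` means the object does not have to go through the F-Reachable queue at all.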