From this post on knowing about the heap, we moved to this post which gave brief thought on the GC. Continuing the understanding of heap, I will share what I have gathered for garbage collection's algorithms. It is upto to choice of people to select and implement algorithm(s) and technique(s) in the JVM they build. To remind, Android OS has Dalvik (till 4.x) and ART (from 5.x) JVMs. For understanding and carrying out the test investigation for RAM consumption, being aware of the GC algorithm and techniques is a plus point.
Roots and Garbage Collection
The GC algorithm will be doing these two things.
- It must find the dead objects and live objects
- It must reclaim the heap space used by the dead objects and make it available for application/program
Finding the dead objects is accomplished by identifying set of root and enumerating it to determine the reachability from the roots. An object is reachable if there is a path of references from the roots by which the program/application can access the object. The reachable objects are called 'live' objects and those which are not reachable is called 'dead' object i.e. the garbage. Any objects referred by a live objects is also reachable and it is 'live' object.
What may be the GC algorithm chosen in the JVM, the JVM will stop the running application when it has to run the GC. When this happens, every thread except for the threads needed for the GC will stop their task for a while. The interrupted threads will resume only after the GC is completed.
So there is a need for fine tuning the algorithm and techniques of GC used in the JVM, such that the interrupted time will be as small as possible. Dalvik and ART are no exception from this.
Source of the "Roots" in JVM
The executing program/application will have access to root, always. The root set in a JVM will be based on the implementation. Few of the source of roots can be as below
- object references in local variables
- operand stack of any stack frame
- object references in any class variables
- strings stored in the heap (from loaded classes) -- for example, class name, method name, field name
Approach to identify live objects among dead objects
Reference counting and Tracing are the two common approach for identifying the live objects (reachable) from the dead objects (not reachable) in the heap.
The reference counting GC differentiates the live objects from the dead objects by keeping a count for each object in the heap. This count keep the track of numbers of references to that object. The tracing GC will trace out the graph of references starting from the root nodes. The objects that are encountered during the trace, will be marked and it is known be live object. The one which are not marked in the tracing are the dead objects and needed to removed from the heap.
Further learning and exploring
It will be not in scope to write more about the different algorithms and techniques with this blog post. However, use the below to learn more about the GC algorithm, types and techniques. It will help in knowing how the ART better.
- Generational Collectors -- Weak generational hypothesis; Young generation; Old generation; Permanent generation;
- Serial GC
- Parallel GC
- Parallel Compacting GC
- Concurrent Mark and Sweep GC
- Garbage First GC
- Reference counting collectors
- Tracking collectors
- Compacting collectors
- Copying Collectors
- Adaptive Collectors
- Remembered sets and popular objects
- Reference objects -- reference object and its referent
- Reachability state changes and use of Cache, Canonicalizing mapping, and Pre-Mortem clean up
No comments:
Post a Comment
Please, do write your comment on the read information. Thank you.