Tuesday, November 29, 2016

Android: Test investigating for RAM consumption - Bit more on Garbage Collection


From this post on knowing about the heap, we moved to this post which gave brief thought on the GC. Continuing the understanding of heap, I will share what I have gathered for garbage collection's algorithms.  It is upto to choice of people to select and implement algorithm(s) and technique(s) in the JVM they build.  To remind, Android OS has Dalvik (till 4.x) and ART (from 5.x) JVMs. For understanding and carrying out the test investigation for RAM consumption, being aware of the GC algorithm and techniques is a plus point.


Roots and Garbage Collection

The GC algorithm will be doing these two things.
  1. It must find the dead objects and live objects
  2. It must reclaim the heap space used by the dead objects and make it available for application/program
Finding the dead objects is accomplished by identifying set of root and enumerating it to determine the reachability from the roots. An object is reachable if there is a path of references from the roots by which the program/application can access the object. The reachable objects are called 'live' objects and those which are not reachable is called 'dead' object i.e. the garbage.  Any objects referred by a live objects is also reachable and it is 'live' object.

What may be the GC algorithm chosen in the JVM, the JVM will stop the running application when it has to run the GC. When this happens, every thread except for the threads needed for the GC will stop their task for a while. The interrupted threads will resume only after the GC is completed.

So there is a need for fine tuning the algorithm and techniques of GC used in the JVM, such that the interrupted time will be as small as possible. Dalvik and ART are no exception from this.


Source of the "Roots" in JVM

The executing program/application will have access to root, always. The root set in a JVM will be based on the implementation. Few of the source of roots can be as below
  • object references in local variables
  • operand stack of any stack frame
  • object references in any class variables
  • strings stored in the heap (from loaded classes) -- for example, class name, method name, field name


Approach to identify live objects among dead objects

Reference counting and Tracing are the two common approach for identifying the live objects (reachable) from the dead objects (not reachable) in the heap.

The reference counting GC differentiates the live objects from the dead objects by keeping a count for each object in the heap.  This count keep the track of numbers of references to that object. The tracing GC will trace out the graph of references starting from the root nodes.  The objects that are encountered during the trace, will be marked and it is known be live object. The one which are not marked in the tracing are the dead objects and needed to removed from the heap.


Further learning and exploring

It will be not in scope to write more about the different algorithms and techniques with this blog post. However, use the below to learn more about the GC algorithm, types and techniques. It will help in knowing how the ART better.
  1. Generational Collectors -- Weak generational hypothesis; Young generation; Old generation; Permanent generation; 
  2. Serial GC
  3. Parallel GC
  4. Parallel Compacting GC
  5. Concurrent Mark and Sweep GC
  6. Garbage First GC
  7. Reference counting collectors
  8. Tracking collectors
  9. Compacting collectors
  10. Copying Collectors
  11. Adaptive Collectors
  12. Remembered sets and popular objects
  13. Reference objects -- reference object and its referent
  14. Reachability state changes and use of Cache, Canonicalizing mapping, and Pre-Mortem clean up


No comments:

Post a Comment

Please, do write your comment on the read information. Thank you.