Thursday, January 26, 2017

The Testability and a Tester



In a recent discussion with fellow testers, the topic was "testability". While fellow testers asked, "What is testability?", I had a question of my own, which I shared in the discussion -- "What is codability?".


Codability and Testability

I see the two as interrelated. If something is codable, then it is testable to a degree. What is that degree? That is the question of interest. But what does one accomplish by knowing that degree? This question is the basis for the curiosity, or the wish, to know about testability.

Say I have a product feature to implement. The feature has cases that I need to handle at run time for the different situations a user can encounter. I can code for the situations I can think of and that are a priority from the point of view of using the product. But what about the other situations, the ones I have not seen or not thought of: will they hurt my product? What is the complexity level of the code I'm writing for this feature, and what are the subtle risks in the code I have written? Written code always carries risk with it.

So, can I say that testability is influenced by multiple factors and not just by the tests identified? I think so. With that in mind, it is essential to understand that "testability is the ease with which I can test the system in a given situation".


Factors influencing testability

A few factors that can potentially influence the degree of ease in learning the test challenge, identifying the test, approaching the test and executing it are listed below:
  • The testing skills of a tester
  • The programming skills of a tester
  • The technology skills of a tester
  • The experience of a tester in testing such systems
  • What is known by the tester about the product -- purpose of the product; people who are building the product; people who will use or are using the product; the features and functionality of the product; technology of the product; code written for the product; how the product is built within the limitations set by the business team; information available about the product; process, practice and culture followed by the teams building the product; availability of the system and its present state
  • Time available for system development
  • Time available for testing
  • The level to which testability aids are built into, and can be provided in, the system under test, and how complex or simple those aids are
  • Measures taken to improve the system and include testability aids in it
  • Existing tools or utilities for testing the product or code, or the ability to build such utilities
  • The freedom given to the tester and to testing to control and guide the testing activity
  • Information expected out of testing and priority of it
  • Relationship with fellow testers, fellow programmers, business teams and anyone else who is involved and interested in the system being programmed and tested
  • The audience interested in the test report or outcome of testing, and their expectations


Testability and Test Coverage

The test coverage accomplished by a tester in a given context is related to the skill set of the tester and to the testability factors of the system and environment.

While I have understood this, I tried to learn the scale of testability by keeping test coverage as the base. Here is what I have observed for now.
Keeping test coverage as the base scale might mislead me about what I can accomplish in coverage. With my skills and the other influencing factors, the testability may make it easy for me to accomplish the 10th level of coverage, while the first three levels are not easy to cover. The reverse is also possible. Here, I can take a long time to learn how I could have taken a shorter path to the same coverage, given the testability available to me in that context, had I learned about its complexity level first. By the time I reach the intended level of coverage, I would have made moves that do not favor the business timeline, learned from them, and repeated a few things. This is not wrong, but I have consumed time that I could have invested in testing to accomplish much more coverage. The same is represented in the image below.




Keeping test coverage as the base and assessing testability on the vertical scale, while marking in parallel the breadth and width of coverage that I want to accomplish or can accomplish, here is what I observe for now:
Understanding the test coverage I want to reach through my testing and automation, so that I provide the information expected, is a challenge each time. The first and simplest reason to quote is the 'testability' for that coverage mark. Once I get an idea of the skills I need to build and use in such situations, it helps me make strategic decisions in test execution and its management. To me, this is very important as a strategic base for testing. It helps me, and allows me, to test the perceived level of testability itself for the marked coverage boundaries or milestones.



This base of strategic thinking in learning and identifying testability can be used across the timeline of a project: not just during test execution, but also in pre-execution and post-execution. It gives an idea of how the tester has to be equipped for the changing needs of testing and engineering. For me, this is working in the contexts I'm exposed to, and in a few cases I had to add a few vectors alongside test coverage, such as specific technology and programming factors. I'm experimenting with this in varied projects that have different technology and skill challenges.


Visibility out of testability

With this, for now I see,

  • Testability is related to codability, programmer, tester, environment, process, people, situation and priorities
  • Codability gives the first hint about testability and the skills needed to test the same
  • Testability influences the test coverage
  • Testability influences the time taken by a tester to test the system in a given context
  • Just as codability requires the skills of a programmer, testability requires the skills of a tester

Note: When automation is leveraged together with testability, it can do wonders in assisting the testing.



Wednesday, November 30, 2016

A day ends in seeing crash and hearing "No crash! It works."



This is a case where I got actively involved only on the second day to complete my investigation, after learning of my negligence and assumption. What happened in this case stopped the testing for a day.

A day with no testing and no programming is a cost that has not turned into any benefit to the business! I heard the same from the business team.


Happening in the case

One of my fellow testers had got a project assignment, and the expectation was to carry out a performance test. The latest release was deployed and available for testing, and it was used. While the fellow tester started the task from the tester's desk, the business team started using the product from their place.

The tester noticed the product crashing after a few actions. There was no clue as to why; the only message available said, "Something went wrong!" Seeing this, the tester rolled back the latest installation and did a fresh installation. Yet the behavior and the message were the same. This made the tester try a different test setup, which showed the same behavior and message.

By this time, I had been told about this behavior.


Observing the happening

The fellow tester walked me through the context of the task, its priority and the expected time to finish it. Following that walkthrough, the tester described what was happening in the test environment and the observations recorded for the crash.

With this, the tester said that the business team was not facing this behavior, but here the product was crashing after a few actions on launching it.

Hearing this, we wanted to make sure the product version and hardware setup were similar, or close enough, for a fair comparison. Everything looked symmetrical in terms of product version and setup.

I asked for the log and noticed parsing failing at the client end. Running through the stack trace in the log, it was evident to me that there was a problem in processing the data. Which data? That was the question. The stack trace said the data was wrong. But the tester's claim was that the same data was being used by both the testing team and the business team. Then why did only the tester experience the crash?

A product crashing is a common sight. The interesting aspect of a crash is knowing its root. I said this to the tester and asked them to observe the data transmitted and update me on it.


How negligent I was here!

I knew, from the investigation so far, that we were very close to knowing the root cause of the problem. But from here, after telling the tester what to do and analyze next in the investigation, I moved back to my own practice. This was my mistake. I should have joined the tester in completing it.

That next part of the investigation did not happen for the whole day, and it never came back to me. I heard that the business team was using the product without any problem. At the least, I should have inquired with the engineering team, which I did not.


Next day's fresh listening

I noticed the discussions between the business and engineering teams. The calculation was about the time that had gone by with no testing and no clue about the crash. After a few minutes, I heard from the tester that it works if a different network is chosen. This had been tried in the morning and, since it worked, the other network line was being used.

That shook me strongly, and my eyes went straight to the engineering team. I asked, "How? That should not be a problem at all, going by the network information I see here. It should be something else, not the network."

I could not convince myself that changing the network would solve the crash. The product simply did not exhibit the behavior, for some reason. What was the reason? I started to converse with the product again.


Going back to the undone

I requested the data from monitoring the traffic between the client and server. This was the one part of the investigation that was supposed to continue. My bad: I had assumed this was done and that the engineering team saw no problem here. I did not think of asking for it and looking at it on the previous day, because I assumed.

This time, I was very keen to look, and I requested that the environment be set up for monitoring the data. We used different networks and observed the data sent, received and parsed.


What's the problem?

From the previous day's investigation it was evident to me that the client could not parse the data and assign it to the object. The consequence was a NullPointerException and the product crashing.

Now, with the data, I started analyzing every bit of data and, line by line, the information in the log. The problem is that the product is unable to handle the data at this transition point if it is not from the server. But why? Isn't that an expected situation? I left the question with the engineering team for their discussion.

The request from the client did not reach the server at all. The web monitoring and filtering system (WMFS) in the network did not let the request reach the server. The response came from the WMFS to the client.

The other mystery was how it could work on one network and not on another, while the service line is the same. On one network, all of the first three requests got their responses from the WMFS. On the other network, only the first request got its response from the WMFS, while the second and third responses were from the server. This showed that, though the first request fails, the second and third requests are crucial.


Beyond the corners of the problem

I see no harm in having a WMFS in place. A customer who procures the product can have a WMFS in place. The product cannot tell its users not to use a WMFS.

The product (client module) should be able to handle any response it receives, whether from the server or from another source, and should have an identity mechanism with a server token for parsing the data.
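To make the idea concrete, here is a minimal sketch in Java of a client-side check that runs before any parsing. It is only an illustration under assumptions of mine: the header name X-Server-Token, the class ResponseGuard and the method acceptBody are hypothetical, and the product's real identity mechanism may look quite different.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Optional;

final class ResponseGuard {
    // Hypothetical header name: the real product's identity mechanism is not described here.
    static final String TOKEN_HEADER = "X-Server-Token";

    /**
     * Returns the body only when the response carries the token the client expects
     * from its own server. Otherwise returns empty, so the caller never parses
     * (or dereferences) data that came from somewhere else, such as a filter page.
     */
    static Optional<String> acceptBody(Map<String, String> headers, String body, String expectedToken) {
        if (headers == null || body == null || body.isEmpty()) {
            return Optional.empty();
        }
        String token = headers.get(TOKEN_HEADER);
        if (token == null || !token.equals(expectedToken)) {
            return Optional.empty();
        }
        return Optional.of(body);
    }

    public static void main(String[] args) {
        Map<String, String> fromServer = Collections.singletonMap(TOKEN_HEADER, "abc123");
        Map<String, String> fromFilter = Collections.singletonMap("Content-Type", "text/html");

        // A response carrying the expected token is accepted for parsing.
        System.out.println(acceptBody(fromServer, "{\"items\":[]}", "abc123").isPresent());
        // A block page from an intermediary has no token and is rejected.
        System.out.println(acceptBody(fromFilter, "<html>Blocked</html>", "abc123").isPresent());
    }
}
```

With a guard like this, a response from an intermediary results in a handled "not from server" path instead of a crash further down in the parsing code.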


Learning

For the engineering team, it is an alert to handle data in all the corners that can be identified. For me, the learning was not to be negligent: do not hand over an investigation if you are not going to ask for an update after a few minutes.

For the business team, the learning is to follow up on an investigation that has started and see it to closure.


Tuesday, November 29, 2016

Android: Test investigating for RAM consumption - Bit more on Garbage Collection


From this post on knowing about the heap, we moved to this post, which gave a brief thought on GC. Continuing the understanding of the heap, I will share what I have gathered on garbage collection algorithms. It is up to the people who build a JVM to select and implement the algorithm(s) and technique(s) in it. As a reminder, Android has the Dalvik (till 4.x) and ART (from 5.x) runtimes in the role of the JVM. For understanding and carrying out a test investigation of RAM consumption, being aware of the GC algorithms and techniques is a plus.


Roots and Garbage Collection

A GC algorithm has to do these two things:
  1. It must find the dead objects and live objects
  2. It must reclaim the heap space used by the dead objects and make it available for application/program
Finding the dead objects is accomplished by identifying a set of roots and determining reachability from those roots. An object is reachable if there is a path of references from the roots by which the program/application can access the object. The reachable objects are called 'live' objects, and those which are not reachable are called 'dead' objects, i.e. the garbage. Any object referred to by a live object is also reachable, and so it is a 'live' object.

Whatever the GC algorithm chosen in the JVM, the JVM will typically pause the running application at some point when it has to run the GC. When this happens, every thread except the threads needed for the GC stops its task for a while. The interrupted threads resume only after that GC work is completed.

So there is a need to fine-tune the GC algorithm and techniques used in the JVM so that the interruption time is as small as possible. Dalvik and ART are no exception to this.


Source of the "Roots" in JVM

The executing program/application always has access to the roots. The root set in a JVM depends on the implementation. A few of the sources of roots can be as below:
  • object references in local variables
  • operand stack of any stack frame
  • object references in any class variables
  • strings stored in the heap (from loaded classes) -- for example, class name, method name, field name


Approach to identify live objects among dead objects

Reference counting and tracing are the two common approaches for distinguishing the live (reachable) objects from the dead (not reachable) objects in the heap.

A reference counting GC differentiates the live objects from the dead objects by keeping a count for each object in the heap. This count keeps track of the number of references to that object. A tracing GC traces out the graph of references starting from the root nodes. The objects encountered during the trace are marked and are known to be live objects. The ones not marked during the trace are the dead objects and need to be removed from the heap.
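The tracing idea can be sketched in a few lines of Java. This is a toy model of the mark step on a hand-built object graph, not how Dalvik, ART or any real JVM implements it; the names ToyTracer, Obj and mark are made up for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ToyTracer {
    /** A toy heap object: a name plus its outgoing references. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();

        Obj(String name) {
            this.name = name;
        }
    }

    /** Mark step: returns every object reachable from the given roots. */
    static Set<Obj> mark(Collection<Obj> roots) {
        Set<Obj> marked = new HashSet<>();
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (marked.add(current)) {          // first visit: mark it as live
                pending.addAll(current.refs);   // then follow its references
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        Obj d = new Obj("d"), e = new Obj("e");
        a.refs.add(b);
        b.refs.add(c);
        d.refs.add(e);   // d and e refer to each other,
        e.refs.add(d);   // but no root refers to either of them

        Set<Obj> live = mark(Arrays.asList(a));
        for (Obj o : Arrays.asList(a, b, c, d, e)) {
            System.out.println(o.name + " -> " + (live.contains(o) ? "live" : "dead"));
        }
    }
}
```

Here a, b and c come out live, while d and e come out dead. The d and e pair also shows a known limit of plain reference counting: each still has a count of one because of the other, so a collector that relies only on counts would not reclaim them, whereas tracing from the roots does find them unreachable.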


Further learning and exploring

It is not in the scope of this blog post to write more about the different algorithms and techniques. However, use the list below to learn more about GC algorithms, types and techniques. It will help in knowing ART better.
  1. Generational Collectors -- Weak generational hypothesis; Young generation; Old generation; Permanent generation
  2. Serial GC
  3. Parallel GC
  4. Parallel Compacting GC
  5. Concurrent Mark and Sweep GC
  6. Garbage First GC
  7. Reference counting collectors
  8. Tracing collectors
  9. Compacting collectors
  10. Copying Collectors
  11. Adaptive Collectors
  12. Remembered sets and popular objects
  13. Reference objects -- reference object and its referent
  14. Reachability state changes and use of Cache, Canonicalizing mapping, and Pre-Mortem clean up


Android: Test investigating for RAM consumption - Garbage Collection



Having got the brief information on the heap from this post, I will move on to garbage collection in the JVM. I have tried to understand garbage collection, and here is brief information on the same.


Garbage Collection

When an object is no longer referenced by a program, the heap space it occupies can be recycled so that the space is made available for subsequent new objects.

The JVM has to keep track of which objects are being referenced by the executing program, and finalize and free unreferenced objects at run time. This activity will likely need more CPU time than would be required if the program explicitly freed unnecessarily occupied memory. How to optimize the program's CPU usage is key in designing the product, so that the product can still do well when the CPU is loaded in a given context.


Heap Fragmentation and Garbage Collection

Heap fragmentation occurs gradually as the program executes. New objects are allocated and unreferenced objects are freed. The freed portions of heap memory are left in between the portions occupied by live objects. In this case, a request to allocate a new object may have to be filled by extending the size of the heap, even though there is sufficient total unused space in the existing heap. This happens when there is not enough contiguous free heap space into which the newly created object can fit.
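A small simulation in Java shows why the total free space is not the number that matters for one allocation. This is a toy model of my own: the heap is an array of equal-sized slots, and the names ToyHeap, totalFree, largestFreeRun and fitsWithoutGrowing are hypothetical. A real heap is of course managed very differently.

```java
final class ToyHeap {
    /** Counts free slots; a value of true in the array means "occupied by a live object". */
    static int totalFree(boolean[] slots) {
        int free = 0;
        for (boolean used : slots) {
            if (!used) {
                free++;
            }
        }
        return free;
    }

    /** Length of the longest run of adjacent free slots. */
    static int largestFreeRun(boolean[] slots) {
        int best = 0;
        int run = 0;
        for (boolean used : slots) {
            run = used ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    /** An object of the given size fits only if one contiguous free run can hold it. */
    static boolean fitsWithoutGrowing(boolean[] slots, int size) {
        return largestFreeRun(slots) >= size;
    }

    public static void main(String[] args) {
        // Live objects and freed gaps alternate: a fragmented heap.
        boolean[] heap = {true, false, true, false, true, false, true, false};

        System.out.println("total free slots: " + totalFree(heap));
        System.out.println("largest free run: " + largestFreeRun(heap));
        System.out.println("object of 3 slots fits: " + fitsWithoutGrowing(heap, 3));
    }
}
```

In this layout four slots are free in total, but the largest contiguous run is one slot, so an object needing three slots does not fit unless the heap grows or the live objects are compacted.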

Swapping may be required to service the growing heap, and this can degrade the performance of the executing program. Now, I will leave it to you to consider what the impact of heap fragmentation will be, and how, on a desktop and on a mobile device running with less RAM. Have you ever noticed a message saying "Out Of Memory" or "running out of memory"?
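While investigating, a rough view of how close a process is to its heap limit can be read from the standard java.lang.Runtime methods totalMemory(), freeMemory() and maxMemory(). The sketch below is only a starting point; the class name HeapHeadroom and its two helper methods are my own, and the numbers include garbage that has not been collected yet, so they are approximate.

```java
final class HeapHeadroom {
    /** Bytes of heap currently in use (allocated heap minus the free part of it). */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Bytes the heap may still take before reaching the runtime's maximum. */
    static long headroomBytes() {
        return Runtime.getRuntime().maxMemory() - usedBytes();
    }

    public static void main(String[] args) {
        System.out.println("used heap (bytes): " + usedBytes());
        System.out.println("headroom (bytes):  " + headroomBytes());
    }
}
```

Logging these two values at interesting points of a test session gives a coarse trend line; a heap dump is still needed to see which objects are holding the memory.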

With garbage collection of dead objects, the problem is that it can impact the performance of the application/program. The JVM has to keep track of the objects being referenced by the executing program/application and finalize the unreferenced objects (dead objects) at run time. All this happens on the fly while one uses the program/application. This activity can consume CPU scheduling time. What if the program/application itself freed the memory? Won't that consume CPU, then? Both consume CPU, but how much is the question. It depends on the code of the application/program and on the hardware as well. I'm not sure whether one can control the CPU scheduling of the garbage collection process that frees the dead objects. If that is possible, programmers must be facing constraints and limitations in doing so; otherwise, by this time, most apps would have behaved better in a GC environment if CPU scheduling were fully under the control of the application's code.



Graphical Representation of GC and Heap Fragmentation






























Figure-1: Garbage collection contextual elements





























Figure-2: Heap Fragmentation from garbage collection



Android: Test investigating for RAM consumption - Heap and Heap Dump



The process of allocating new objects and removing the unused objects to make space for new allocations is termed memory management.


Object Memory Management - Heap and Stack

Hearing the two terms stack and heap can bring confusion about memory management. Though I will not go into detail about each here, I will briefly share what I have understood.

In a JVM, 'stack' memory holds the frames of method calls with their local variables (primitive values and object references), and 'heap' is used for dynamic allocation of objects. For the stack, the size of each frame is fixed when the program is compiled; the frame itself is allocated when the method is called and released when it returns. For the heap, memory is allocated at run time, and the size of the heap is limited by the configured maximum and ultimately by the virtual memory. Both the 'heap' and the 'stack' are stored in RAM.


Shallow Heap and Retained Heap

Moving ahead, in 'heap' we will hear the words -- Shallow Heap and Retained Heap.

  • Shallow heap -- is the amount of memory consumed by one object. That is, the amount of memory allocated to store the object itself, without taking the objects it references into consideration. The shallow heap size of a set of objects is the sum of the shallow sizes of all objects in the set.
  • Retained heap -- is the amount of memory consumed by all the objects that are kept alive by the objects at the root of the retained set. The retained heap size of an object is its "shallow heap size" added to the "shallow heap size of the objects that are accessible, directly or indirectly, only from this object".
    • Retained set -- is the set of objects which will be removed by garbage collection when an object, or multiple objects, is garbage collected.
In other words, the shallow heap of an object is the size of the object in the heap. The retained size of the same object is the amount of heap memory that is freed when this object is garbage collected.
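The difference between the two sizes is easy to see on a toy object graph. The Java sketch below computes a retained size by comparing what is reachable from the roots with and without the object in question. It is an illustration under my own assumptions: the class RetainedSizeSketch, its Obj type and the byte sizes are invented, and heap analysis tools compute this far more efficiently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class RetainedSizeSketch {
    /** A toy heap object with a made-up shallow size and outgoing references. */
    static final class Obj {
        final String name;
        final int shallowBytes;
        final List<Obj> refs = new ArrayList<>();

        Obj(String name, int shallowBytes) {
            this.name = name;
            this.shallowBytes = shallowBytes;
        }
    }

    /** Objects reachable from the roots when 'removed' is treated as absent (null removes nothing). */
    static Set<Obj> reachable(Collection<Obj> roots, Obj removed) {
        Set<Obj> seen = new HashSet<>();
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (current != removed && seen.add(current)) {
                pending.addAll(current.refs);
            }
        }
        return seen;
    }

    /** Retained size: sum of shallow sizes of everything that becomes unreachable once 'target' is gone. */
    static int retainedBytes(Collection<Obj> roots, Obj target) {
        Set<Obj> before = reachable(roots, null);
        Set<Obj> after = reachable(roots, target);
        int total = 0;
        for (Obj o : before) {
            if (!after.contains(o)) {
                total += o.shallowBytes;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Obj root = new Obj("root", 8);
        Obj a = new Obj("a", 16), b = new Obj("b", 24), c = new Obj("c", 8), d = new Obj("d", 32);
        root.refs.addAll(Arrays.asList(a, d));
        a.refs.addAll(Arrays.asList(b, c));
        d.refs.add(c);   // c is shared between a and d

        List<Obj> roots = Arrays.asList(root);
        System.out.println("a: shallow " + a.shallowBytes + ", retained " + retainedBytes(roots, a));
        System.out.println("d: shallow " + d.shallowBytes + ", retained " + retainedBytes(roots, d));
    }
}
```

In this graph, a has a shallow size of 16 and a retained size of 40, because b (24) is reachable only through a. The shared object c is not counted for either a or d, since removing just one of them still leaves c reachable through the other.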

From these two resources (1 and 2), I have understood the details of shallow heap and retained heap. I feel they are simple and well explained; I have been referring to them for years.


Heap dump

It is a snapshot of the memory of a Java process at a given point in time. The dump consists of Java objects and classes. The memory references for the objects and classes are available in the dump taken. If garbage collection occurs at the time of obtaining the dump, then the obtained heap dump will have details and information about the remaining (existing & used, existing & unused) objects and classes.

The heap dump does not tell who created an object or where it was created. Depending on the type of heap dump, the information available in the dump will be as below.

  1. Classes -- Classloader, name, super class, static fields
  2. Objects -- Class, primitive value and references, fields
  3. Thread Stacks and variables -- The information about the call stacks of threads at the time of obtaining the heap dump, and the information about local variables and objects.
  4. GC Roots -- The objects which are defined to be reachable by the JVM