Tuesday, November 28, 2023

Behind the Every Test Data, There is a ?!

 

Read this blog post to have a perspective about the Test Data and Test Data Management.  The point is, if I'm not aware of a test and what does it tell me to explore, I cannot think of a Test Data.

That said, if I know what I should be evaluating as part of performance, why, when and how, this will help me to come up with a thought for identifying the tests and its test data for the same. 

The ninth question from season two of 100 Days of Skilled Testing is:

What role does data management play in performance testing, and how do you ensure the availability of suitable test data sets?


Testing and "Ensure"

We test and have tests in testing, because, there is no "sure" and "ensure" idea in software.  But, we presume on a rational basis upon, "if, these are this", in a given context when the software processes.

Now, ask yourself, how can we ensure the availability of suitable test data sets?

In my opinion, the Test Data is often misunderstood.  This is the primary problem and should be the first problem, when asked "what are the challenges in creating the test data?".

When you read the concluding lines of this blog post, you will learn why I say this.


Test Data and Immunity

In my opinion and experience in practicing the Test Engineering, I see, the Test Data should be a viral strain and it should have its variants.  When this test data is used to test [experiment, test investigate, and debug], how do the software and its ecosystem respond?

  • Does the software and its ecosystem is immune to this test data?
    • Does it exhibit any risks and problems?
      • If yes, then, do the purpose of my testing and automation is accomplished with this test data?
This puts me back to question, what is the purpose [intent] of my test?  It drives me to derive the test data which helps me to know -- What am I supposed to learn and on priority?  With this, I get an idea for what kind of test data I should be creating knowing its pattern.

If the system is immune to Test Data and not reveling anything new in the context, I classify this pattern of test data as "Immune" to the context.

In my practice and research work in Test Engineering and Software Testing, to start, I categorize Test Data into two areas.
  1. Immune
  2. Not Immune
Further, I have categories, under these two, where I classify the Test Data deterministically for the context.   Get in touch if you want to learn more about this.  I'm just one ping away!

The tests should help me to evaluate for the immunity and also non-immunity; both are essential and necessity.  

The credit is to me for such classification of Test Data.  It is my research work out of my practice.

Note that, Test Data is not just the input [characters or files] entered or given to a system.  Test Data has its association to tech stacks, infrastructure, ecosystem, business workflows and people.  To craft such Test Data, one has to have the understanding of the system and its internals, and, the problem it solves by knowing how it solves.



Performance Testing and Test Data

  1. What is that I'm testing as part of performance?
  2. What do I want to evaluate in the name of performance?
  3. What part of the system is evaluated for its performance?
    • Should I evaluate this in isolation or as a wholeness of the system?
  4. What domain knowledge and information I should have when testing for performance?
  5. What system's architecture and internal details I should understand and be aware to test for performance?
  6. Is this the first delivery?  Or, do we have this system running in the production?
    • If it is first delivery,
      • How will I create the test data to suit the consumers of this application?
      • What are the key workflows of business that we should be evaluating for its performance?
      • Do all workflows and sub-systems need the evaluation for performance, and on priority?
      • How do I map the fragmentation of users and their data [with its patterns]?
      • What are the infrastructure and ecosystem characteristics that should be part of the test data identified?
      • Does caching have any effect if the same pattern of data is used?
    • If it is a running version in production
      • Can I refer to the DB to figure out the pattern for the particular workflow that I'm evaluating?
      • How can I match the test data to have the production data's characteristics and attributes?
  7. What is the backup strategy for the Test Data?
    • How do I version control the Test Data?
    • Which version of the Test Data I should be using?
  8. What is the threshold I'm targeting with Test Data?
    • What should be the size of the data in DB when I make the IO and RW operations?
    • What should be the network capability when I make the IO and RW operations?
    • What should be the hardware capability when I make the IO and RW operations?
    • What should be the geographical traffic and its pattern when I make the IO and RW operations?
    • More of such factors will be considered when identifying and deriving the test data.
  9. What is the client error yielding Test Data that I should have for the workflow?
  10. What is the server error yielding Test Data that I should have for the workflow?
  11. What is the redirection yielding Test Data that I should have for the workflow?
  12. What is the no-response and no-change Test Data that I should have for the workflow?
And, more.  It is simple; get in touch to discuss and know beyond the listed.



To conclude and stop here, all these questions, do not ensure or assure or make sure that I will have test data for evaluating a characteristic of performance.
  • It helps me to know:
    • What are the tests I should be doing?
    • What kind of preparation I should be having in my practice to create the Test Data for these tests?

The, Test Data should challenge the available Testability and its limits.  If it is not doing, then, we are having a test data no doubt about it; but, it is of shallow. Shallow!?

One has to ask self, "Is this sufficient enough and effective Test Data for the system [and workflow] I'm testing?"

The, Test Data should drive the engineering team to add more layers of Testability into the system.




Friday, November 24, 2023

Test Data is Not [Equal To] a Test

 

Not every release is a critical release!

There are releases which are critical from technical and business interests.  In such critical releases, most of my time is spent on understanding the technicalities of the system and the test data identification on identifying what are the priority tests.  Identify and building the test data takes major chunk of the testing time. This effort has helped me, testing, stakeholders, projects and products immensely.


Test Data != Test

Looks like the word "Test Data Management" has become a buzz and marketing word.  In how the word "Test Data Management" is pushed, to me it looks like the intent is missing for -- how to identify the tests and its test data?

  • If I cannot think of tests, how can I think of Test Data?
    • How can I think of tests, the right tests in the context, and the priority tests out of it? 
      • If I cannot get this, how can I get the Test Data for all these tests?

Before talking of Test Data Management, we need to talk about the intent and goal of the test.  If I have a clarity here, by being aware of the product's technical internals, architecture and its eco system, I can think of test data in a better sense.  For first, the PaaS which provides Test Data Management solution has to push and enforce for learning and mentioning -- What is the intent of the test?

When I know, for what to evaluate and how should I be evaluating it, those test intent will lead me to the Test Data and its dimensions which should be vectoring the tests.  

Test Data is not [equal to] a Test.  Test Data is one of the variance, in fact, it is a multi-dimensioned variance that a test will [need to] have.  While a test has one deterministic objective, the test data will have multi-dimensioned vectors that instruments a test.

Test Data has its role in a test, and, it is a critical role.  Hence, we practicing Test Engineers talk and spend our time on Test Data in equal importance as we give our time to identifying, designing and execution of tests.

Test Data is not a test; but the test data is an expression of the test's intent.  Test Data is one of the byproducts of a test.  Having this clarity is important.

Note this, the test data is not the only expression of the test's intent.  A test will have multiple touch points; these touch points express, advocate and can show the intent of a test.  A test data is also a heuristic as a test is.

 

Test Data Management

The word "management" is underrated and not attempted to understand from context where it is being spoken.  How do this word sound -- "Test Data Leadership"?

In engineering, especially in the software engineering, the word "management" inherently talks primarily about design.  On day-to-day operation, the management designs its strategy and approach to solve the problem and challenges.  

When said "Test Data Management" in software engineering, it is about strategizing and approaching the problem of identifying and categorizing (subsets) the data to test.  A kind of leadership at one layer.  It has its role and critical to the deterministic outcome for a test.

In machine learning, we can see this categorization -- training data and test data.  Then, is training data not a test data?  It is!  The training data is designed [managed] to train and this data as well identifies the problem as the training progresses.

The data which are identified, designed, categorized and collected as Test Data, are well sampled data in the context, for the intent of a test.

To conclude, Test Data Management is one of the correlation in software engineering to solve a problem; it is contextual in how and what tests and testing is executed.  The different teams and organizations has their own way of managing the test data.  How they do it, it is their way of doing it.  We have to look Test Data Management from point of Test Engineering and not as an another buzz word to sell just as a product and business's service.