Testing Garage: AMYQ: Approaching the Solution to Test Data Challenges of Shrini Kulkarni

In this post, I briefly reason through the questions shared by Shrini Kulkarni in this AMYQ. Here are the challenges/questions Shrini shared:

#1 challenge - setting up data in upstream systems to suit the test cases that need to be run. There is the AUT and there are upstream systems. In a corporate setup, individual teams are set up for each application. Hence getting another team to set up some data in another system often involves a lot of manual effort and red tape.

#2 Reserving test data created in AUT or upstream systems for specific team's use so that other teams do not change it.

My Understanding of These Challenges

AUT is in place and it has upstream systems.
There are multiple teams for each AUT in an org.

Say teams A, B, and C, for the sake of discussion here.

Team A has difficulty when it wants to create Test Data in Team B's space, and vice versa.

The difficulties for Team A can be:

Not permitted to create data
No awareness and understanding of Team B's system
Cannot make progress in testing unless there is data created for Team A
The Test Data created by Team B is not shared with Team A for multiple reasons, such as data pollution and corruption that disturb Team B's test cycles
And, more ...

How to reserve Test Data created in Upstream system for use of a specific team?

How to make sure other team does not use it or edit it or delete it?

This is my understanding of Shrini's challenges.

Conway's Law and My Experiences

Conway's Law says,

The architecture of a system will be determined by the communication and organization structures of the company.

The inverse of Conway's Law says,

The organizational structure of a company is determined by the architecture of its product.

Both laws stated above hold for microservices teams and the teams that consume their services.

I have experienced,

Each microservices team creates its own set of test data.
This test data is not shared with any other team.
The other teams are not allowed to create data in their space.
No team is aware of what others are doing in tests and what they are using to test.
The teams work in silos.

This leads to friction, aggression and unhealthy communication between teams. So where is the possibility of having Test Data in an upstream system that every other team can consume to run its tests?

Do you see Conway's Law and the inverse of Conway's Law at play in how these teams are set up, orchestrated and communicate while building one product?

What's the Problem?

Here is my understanding of the Problem Statement from the challenges shared by Shrini,

How to set up Test Data in an upstream system to suit the test cases that need to be run? And, how to ensure Test Data created by one team is not modified by other teams?

If you look closely, the outcome described in the previous section is not a technical problem.

It is a collaboration and communication problem. It arises from a presumption: if other teams use one's test data, it affects that team's work and delivery.

Yes, it can have a bad impact if the test data is not created, edited and managed attentively by the teams when everyone shares and consumes it. So the respective teams' engineering managers and directors resist the creation of data in their team's space, as it impacts their pipeline and delivery.

How Did I Solve It?

In the contexts where I work, I have different upstream systems that I'm supposed to interact with to get data. I then use this data to test the service offered by the product.

But, creating test data in another team's space is not allowed!

I have approached and solved this problem in multiple ways in different organizations. The approach below worked with most clients, so I share it here. It is like a Design Pattern; I can reuse its structure across multiple teams to solve similar problems.

I came up with an approach to create a suite having the endpoints of all these upstream systems. And, I need not say how difficult and painful it was to get the endpoints and their details from the teams. It was a marathon circus for me to get them!

Here is how I approached the solution:

I understood my teams were comfortable with Postman.
For solving this problem, I found Postman simpler and quicker than writing an automation project with libraries like RestAssured and request.

The team can run the Postman Collection from their own systems quickly.

I created a Test Data Inventory

The inventory has the Test Data which the test teams engineered for their testing and automation
This data is not just for functional testing; it has Test Data for security, performance and accessibility too.

I built the Postman Collection having the endpoints of these upstream systems.

The collection is version-controlled like any other code in the organization.
Meaningful commits are made and pushed, as and when the need comes up.

I crafted an environment file and wrote scripts that set a variable holding the stamp

This stamp tells that it is test data created by an automation run, for this version of the regression cycle, and from this team.
The stamp is a maximum of 10 characters, and it is dynamic for each run to avoid duplicate data.

I said it is dynamic, not random!
These dynamic characters have a meaning and an intent.
Note, when I say dynamic, I mean it is created for that iteration of Test Data
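
To make the idea of a meaningful, non-random stamp concrete, here is a minimal sketch in Python. It is not the Postman script described above; the layout (one source character, two team characters, three cycle digits, four base-36 run characters) and the names `make_stamp` and `parse_stamp` are assumptions made only for illustration.

```python
import string

# Hypothetical stamp layout, 10 characters in total:
#   1 char  : source ("A" = created by an automation run)
#   2 chars : team code
#   3 chars : regression cycle version, zero-padded
#   4 chars : run number within the cycle, in base 36
BASE36 = string.digits + string.ascii_uppercase


def to_base36(number: int, width: int) -> str:
    """Encode a non-negative integer as fixed-width base 36."""
    if not 0 <= number < 36 ** width:
        raise ValueError(f"run number must fit in {width} base-36 characters")
    digits = []
    for _ in range(width):
        number, remainder = divmod(number, 36)
        digits.append(BASE36[remainder])
    return "".join(reversed(digits))


def make_stamp(team: str, cycle: int, run: int) -> str:
    """Build a stamp that changes per run but is never random."""
    team_code = team.strip().upper()[:2]
    if len(team_code) != 2 or not team_code.isalnum():
        raise ValueError("team must start with two alphanumeric characters")
    if not 0 <= cycle <= 999:
        raise ValueError("cycle must be between 0 and 999")
    return f"A{team_code}{cycle:03d}{to_base36(run, 4)}"


def parse_stamp(stamp: str) -> dict:
    """Recover the meaning carried by a stamp built with make_stamp."""
    if len(stamp) != 10 or stamp[0] != "A":
        raise ValueError("not an automation stamp in the assumed layout")
    return {
        "team": stamp[1:3],
        "cycle": int(stamp[3:6]),
        "run": int(stamp[6:], 36),
    }


# Example: team "payments", regression cycle 42, run 7 of that cycle
# make_stamp("payments", 42, 7) -> "APA0420007"
```

Because every character is derived from the team, the cycle and the run, anyone who finds a stamped record in an upstream system can tell who created it and for which run, which a random suffix cannot offer.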

The team uses this created test data, so it ends up not touching the test data of other teams.

This let us move further with our tests and automation in the pipeline. Note that I have multiple upstream systems to which the AUT speaks. Not disturbing them is a challenge!

I maintain the inventory of test data, with versions of the test data crafted sprint over sprint. We pull the test data of different versions from the inventory as and when it is needed for the intent of the test and automation.
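
A versioned inventory like this can be sketched as a small in-memory structure. This is only an illustration under assumptions: the class names `DataSetVersion` and `Inventory`, the fields, and the idea of one version number per sprint are hypothetical, and a real inventory would live in version control or a shared store.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class DataSetVersion:
    """One version of a named test data set (hypothetical shape)."""
    name: str
    version: int          # assumed to increase sprint over sprint
    purpose: str          # e.g. functional, security, performance, accessibility
    records: Tuple[dict, ...]


class Inventory:
    """Keeps every version of every data set so any of them can be pulled later."""

    def __init__(self) -> None:
        self._sets: Dict[Tuple[str, int], DataSetVersion] = {}

    def add(self, data_set: DataSetVersion) -> None:
        key = (data_set.name, data_set.version)
        if key in self._sets:
            # Versions are immutable: publish a new version instead of overwriting.
            raise ValueError(f"{data_set.name} v{data_set.version} already exists")
        self._sets[key] = data_set

    def versions(self, name: str) -> List[int]:
        return sorted(v for (n, v) in self._sets if n == name)

    def get(self, name: str, version: Optional[int] = None) -> DataSetVersion:
        """Return the requested version, or the latest one when none is given."""
        available = self.versions(name)
        if not available:
            raise KeyError(f"no test data named {name!r}")
        chosen = available[-1] if version is None else version
        if chosen not in available:
            raise KeyError(f"{name!r} has no version {chosen}")
        return self._sets[(name, chosen)]


inventory = Inventory()
inventory.add(DataSetVersion("customers", 1, "functional", ({"id": 1},)))
inventory.add(DataSetVersion("customers", 2, "security", ({"id": 1}, {"id": 2})))

latest = inventory.get("customers")        # version 2
older = inventory.get("customers", 1)      # pulled for an earlier intent
```

Refusing to overwrite an existing version keeps older regression cycles reproducible, since a test that was written against version 1 can still pull exactly that data.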

All that said, the teams continue to work in silos without much collaboration. Who has to solve this? Though I can solve this, my role and pay grade do not allow me to do so effectively. I have solved it within my teams and not at the organization level. It is an engineering and culture problem which has to be addressed at the organization level.

To summarize,

Test data creation and its inventory management, with versions of test data, is part of an organization's engineering culture.

It has to be adopted consciously so that it helps teams move swiftly in building resilient and usable systems.

Test Data and its engineering is a project within an engineering project.

It will be no wonder if LLMs offer a business solution solely on Test Data in the coming days!
Test the data offered by such LLMs before consuming it.

Practice in the space of test data and testing the test data.
Engineering has Data Coverage just as it has Test Coverage.

What's your Data Coverage?

2 comments:

Ravisuriya, March 11, 2025 at 1:03 PM
References:

1. https://en.wikipedia.org/wiki/Conway%27s_law

2. https://en.wikipedia.org/wiki/Melvin_Conway

Ravisuriya, March 11, 2025 at 2:22 PM
I have to share this with all of you.

Shrini's blog was my gateway into the Software Testing universe. If I had not come across this blog in 2007, I'm not sure where I would have been in my practice today.

Thanks, Shrini for writing your blog. It opened me to the Universe of Software Testing.

The other person, who actually took me to Shrini's blog, is my colleague Kantharaja MP.

I'm grateful to these two people. :)