
Tuesday, March 17, 2026

45 Seconds of Confusion: When a Familiar GUI Fails the Human Eye

 

In meetings, we often hear the same line, 

"That's not a bug.  Report it as an enhancement."

Sometimes the observed behavior never even makes it to the enhancement list.  

But what happens when the problem is not about functionality, but about how users experience the GUI and its usability?

My peer Dhanasekar Subramaniam (DS) recently published a blog post about a UI design that delayed his use of an app.  This made me curious.  How could a UI that an engineering team had approved and used slow down a user?

I decided to test investigate the design.  On testing and analyzing the UI behavior and usability, I discovered something interesting -- the GUI looked exactly as coded, but behaved differently to the human eye.

I went through the same usability and experience problem, but because I was already conscious of this behavior, I could identify it quickly.

If you are an SDET or Test Engineer, this blog will help you develop new perspectives when analyzing GUI problems.  If you are a manager or decision-maker, it highlights why seemingly small GUI problems should not be ignored.



When a Simple Task Creates Anxiety

Late at night, around 11 PM, the users opened the cab booking app Rapido, just as they had done many times before.  The goal was simple -- book a cab and reach the bus stop.

But something unexpected happened.

This time, the users could not figure out how to book the cab.

Seconds started passing.  The GUI looked familiar, yet the action to book the ride was not obvious.  More than 45 seconds went by as they tried to understand what to do next.

The situation made it worse.

It was 11 PM, the bus departure time was getting closer, and the users did not know how to proceed because of the app's GUI.

That moment, when the user knows the app and knows the task yet cannot complete it, creates anxiety.

So the question is,

Why did two tech-savvy users, using an iPhone and familiar with the app, spend more than 45 seconds trying to figure out how to book a cab?


45 seconds for a task that usually takes less than 5 makes the problem feel bigger instantly.



Understanding the Cause of Anxiety


Here is how I started to learn and understand:
  1. I installed the Rapido app on an Android phone.
  2. I had no ride history with Rapido.
  3. I signed in for the first time.
  4. I landed on the Ride screen.
    • I saw three addresses listed which I had not chosen. I could save these as favorites; I did not.
On the Ride screen, I could not see where to enter or select the pick-up location and destination.


TL;DR -- Here, in short, is what caused the confusion which led to the anxiety.
  1. The text in the search text field.
  2. The color contrast of the search text field.
  3. The color contrast of the view showing the three addresses.
  4. The color contrast between items 2 and 3.
  5. The user not being able to identify that it is a tappable search text field.

What to fix?

  1. Rephrase the search text.
    • "Enter pickup location" works like a charm; refer to Pic-4 in this blog post.
  2. Use better color contrast for the three GUI elements.
    • The GUI colors and contrast should have ΔE ≥ 3 -- good and preferred.
  3. Highlight the search text field so that the user is prompted to tap it and enter or select the pick-up and drop locations.
  4. When experimenting with A/B test configs, the GUI design should still follow the suggested GUI design and color engineering practices.
In usability and user experience, what is not noticed is as good as not present.

Continue reading the sections below for detailed information on the usability and user experience problems.

If you want the technical analysis alone, jump to the sub-section -- Why It Fails - Mathematical Analysis and Human Brain.  It explains why the present Rapido app's GUI design and colors confuse the human brain and eye.

No wonder the users got anxious when booking the cab at 11 PM!




The First Usability Pitfall in the GUI


Now continue to read with attention.
  1. I looked at the top of the screen.  I saw the text "Where are you going?".
  2. Below the text, I saw three locations listed which I had not chosen or entered, and which were not of current interest to me.
Ah! That confused me.  Why?

I looked closely at my mobile screen, that is, the Ride screen.


Pic-1:  The confusing text and three locations displayed
  1. I see a search text field.
  2. I see a search icon next to the text field.
  3. The search text field has the text -- "Where are you going?"

This is the first usability pitfall in the confusing GUI.

Why am I asked where I am going, and why are three locations listed that I did not enter or choose?





The Second Usability Pitfall in the GUI


In the below image, Pic-2, I see:
  1. There is no prominent visual difference in the layouts of
    • The search text field
      • Color hex code #FFF8FAFC
    • The three locations displayed
      • Color hex code #FFF6FAFF
  2. My brain could not perceive the difference between these two layouts right away.



Pic-2: The color contrast of the GUI elements.


The colors of these two layouts are almost the same.
  • This added to the confusion.
  • My brain was perplexed, trying to work out what was happening.
  • I was wasting time here learning how to book a cab.
    • Is this what Rapido needs as a business?
    • Or does it need a user to book a cab right away on opening the app?
    • Won't this experience drive the user to a competitor's app (Ola, Uber, Namma Yatri)?
If my brain cannot perceive the difference and is still processing what is happening, is that a good user experience?

Forget about the user experience.  Is this UI design and engineering serving its purpose?  I will leave that to your thoughts.

Further, the space between these two layouts is white, #FFFFFFFF.  This makes the confusion much stronger.  Why?
  • All three GUI components sit on one main view.
  • To the human eye and brain, the colors of these three GUI components blend into one rather than three distinct GUI elements.

This is the second usability pitfall in the confusing GUI.
Not being able to distinguish these three GUI elements quickly is a problem.  Why does the app have confusing color engineering for these three GUI components?  Why did the GUI design not highlight the search text field as tappable?  Why is the search text confusing when combined with the GUI color?

Had the GUI components carried distinguishing, contrasting colors, this confusion would not have arisen.




Rapido's Competitor GUI and Usability


Rapido's competitors have a similar GUI, but theirs is more intuitive with the search text and the color contrast of the GUI components.  Refer to the pic below.

In the Ola and Uber apps,
  • The search text is straightforward and easy to understand.
  • The search text reads close to the context of using the app.
  • Importantly, the search text field is easily distinguishable.
The search text and the distinct search text field make me understand that I should tap on it and enter the pick-up and destination locations.



Pic-3: The search text field and GUI in Ola, Uber and Rapido apps.




The Two GUIs of Ride Screen


When test investigating, I experienced two GUIs of the Ride screen.

The other GUI looks better in terms of usability and prompted me to tap on the search text field.  But my question is: when do I get this screen?

It could be an A/B test parameter coming into the app that shows a different Ride screen.  I did not pick this one up for debugging as it looked better.

In the picture below, screen 2 shows better search text.  Also, it does not have the three locations that I see in screen 1.



Pic-4: The two different Ride screens of Rapido.



Test Investigation & Analysis - Why My Brain & Eyes Took 45+ Seconds?

This section has the details of my debugging, test investigation, and analysis.  I have put my eyes, brain, smartphone, reasoning, and the Rapido app under evaluation.

If you are in a Test Engineer or SDET role, this should be super helpful when you are testing a GUI.  Do not skip it!


Human Eyes and Cones for Blue Shades


I learned that the human eye has three cone types, L, M, and S; below are their sensitivities.
  • L-cones are long-wavelength cones; they are sensitive to reddish light.
  • M-cones are medium-wavelength cones; they are sensitive to greenish light.
  • S-cones are short-wavelength cones; they are sensitive to bluish light.

You remember, I shared the hex color codes for the two GUI components, that is,
  • #FFF8FAFC
  • #FFF6FAFF



Pic-5:  #FFF8FAFC color

Pic-6:  #FFF6FAFF color


Between the above two color swatches, we have a white (#FFFFFFFF) background, as in the Rapido app.
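
To make the closeness concrete, here is a quick Python sketch (not part of the original measurement) that decodes the two ARGB hex codes into per-channel RGB values; the difference is only a few units out of 255 per channel.

```python
# Decode the two ARGB hex codes quoted above into per-channel RGB values.
def argb_hex_to_rgb(code: str) -> tuple:
    """Decode a '#AARRGGBB' string; the alpha byte is ignored."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (2, 4, 6))

search_field = argb_hex_to_rgb("#FFF8FAFC")   # the search text field
address_view = argb_hex_to_rgb("#FFF6FAFF")   # the view with the three addresses

print(search_field, address_view)                           # (248, 250, 252) (246, 250, 255)
print([a - b for a, b in zip(search_field, address_view)])  # per-channel delta: [2, 0, -3]
```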

These two hex color codes explain my observation.
  • I struggled to distinguish the subtle color difference, especially in this range.
  • So did the two users who were booking the cab at 11 PM.  Why?
    • We humans have fewer S-cones, and they are less sensitive.  Hence, small changes in blue/cyan hues are hard to see.

But small changes in red/green are easier to detect.

Have you seen the sky at night when an airplane is flying?

You can see the red light of the airplane even though the distance between sea level and the airplane is around 10 to 13 km.

Why does the plane use a red light and not blue or any shade of blue?  I hope this triggers your eyes and mind now.

With this simple daily-life example in mind, what do you make of the two blue shades discussed here, with minimal difference between them and placed next to each other as GUI components in a mobile application?

To add to the complexity, the hardware and display capabilities of smartphone models vary, even within the same OEM.  You see how critical UI engineering is in software design!


Display Behavior of Smart Phones


Even before your eyes and mine see the color, the smartphone's display (hardware + software) processes it.

That is, smartphones:

  • Quantize colors (round off values)
  • Use OLED sub-pixels
  • Apply gamma corrections

This can lead rgb(246, 250, 252) and rgb(248, 250, 252) to produce the same emitted light.  Why?  The display hardware rounds or merges the small difference.  This is another reason why, on the Android device I used and on the iPhone the other two users had, we could not differentiate between the two GUI elements of the Rapido app.


Viewing Angle Makes It Even Worse


I was holding my smartphone at about 180 degrees to the ground -- that is, viewing the device at an angle.


Pic-7:  Holding the smartphone at about 180 degrees to the ground.

At an angle,
  • The contrast reduces
  • The colors shift
  • The subpixels blur

So even the small difference that might exist becomes visually flattened; small hue differences are flattened by the panel optics.  This effect is common in cyan/blue hues.

Further, our human visual system averages nearby pixels.  Two adjacent colors like #FFF8FAFC and #FFF6FAFC are interpreted by the brain as a single averaged blue.



Why It Fails - Mathematical Analysis and Human Brain


In color science, the term "Empfindung" is used when talking about the experience of a color.  It is a German word meaning sensation or perceived difference.

In UI and design engineering it appears as ΔE, where Δ (delta) means change or difference, and E stands for Empfindung -- the perceptual sensation.

A professional UI engineering rule enforces a minimum color difference of 3 to 5 RGB units, or a perceptual metric of ΔE > 3, to ensure UI elements remain distinguishable.

For the two colors in discussion here, #FFF8FAFC and #FFF6FAFC, the CIEDE2000 color difference formula gives ΔE ≈ 2.0 to 2.3; a minimal sketch reproducing this check follows below.
  • This range is interpreted as slightly noticeable -- borderline perception.
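
Here is a minimal sketch of that check in Python, assuming the third-party colormath package; the exact ΔE depends on the tool, the white point, and which sampled hex pair is used, so the printed number may differ somewhat from the range quoted above.

```python
# Reproduce the CIEDE2000 comparison for the two near-white blues discussed above.
from colormath.color_objects import sRGBColor, LabColor
from colormath.color_conversions import convert_color
from colormath.color_diff import delta_e_cie2000

def argb_to_lab(code: str) -> LabColor:
    """Convert a '#AARRGGBB' hex string (alpha ignored) to CIE Lab via sRGB."""
    code = code.lstrip("#")
    r, g, b = (int(code[i:i + 2], 16) for i in (2, 4, 6))
    return convert_color(sRGBColor(r, g, b, is_upscaled=True), LabColor)

lab_search = argb_to_lab("#FFF8FAFC")     # search text field
lab_addresses = argb_to_lab("#FFF6FAFC")  # three-address view

delta_e = delta_e_cie2000(lab_search, lab_addresses)
print(f"CIEDE2000 delta E = {delta_e:.2f}")   # compare against the ΔE >= 3 guideline
```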

But both colors in discussion here have very high lightness (almost white) and low chroma (very low saturation) -- this is critical.
  • For such colors, human sensitivity to differences drops significantly.
  • Despite ΔE ≈ 2, in reality users will not notice the difference, especially on mobile phones.

The smartphone display may map the two colors discussed here to the same or near-identical output.  Why?
  • Display quantization
    • The R, G, and B values are near the maximum in the colors above.
      • The maximum channel value is 255.
      • In our case, the two colors are rgb(246, 250, 252) and rgb(248, 250, 252).
    • Rounding and gamma correction in the hardware and software of a mobile device compress the difference.
  • OLED screens
    • On smartphones with OLED screens, at high brightness levels,
      • The subpixel differences become less distinguishable.
        • +3 in the blue channel may not produce a visible shift.
        • -2 in the red channel may be completely lost.
  • Viewing conditions
    • On smartphones, the brightness varies, ambient light interferes, and the viewing angle shifts color.
    • As a result, ΔE ≈ 2 is often perceived as identical by the human brain.
      • That is, the human brain cannot differentiate between the colors.

Using these two colors, #FFF8FAFC and #FFF6FAFC, for buttons, states, and backgrounds is risky and leaves users unable to distinguish them reliably.

For users with accessibility concerns and conditions, ΔE ≈ 2 is effectively invisible.  It fails practical usability and experience expectations.


The final outcome of the test investigation and debugging is:
  1. The two colors used are not helpful and are unreliable.
  2. They are not suitable for distinguishing GUI elements.
  3. Interactive GUI design needs stronger contrast.
  4. ΔE ≈ 2.0 to 2.3 is borderline and unreliable.
    • The range 2.0 to 2.3 may be OK only for subtle background variations.
    • In this case it failed; all three of us had difficulty and trouble understanding the GUI.
  5. Use colors and contrasts with ΔE ≥ 3.

Use the below as a reference (heuristic) for standard perception thresholds; a small helper encoding it follows the list.
  • ΔE < 1
    • Interpretation: not visible.
  • ΔE 1 to 2
    • Interpretation: barely noticeable.
  • ΔE 2 to 3
    • Interpretation: slightly noticeable.
    • But it does not serve mobile app engineering.
  • ΔE > 3
    • Interpretation: clearly visible.
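
Encoded as a tiny Python helper, the heuristic could look like this; useful, for example, when flagging GUI color pairs in an automated report.

```python
# A small helper encoding the heuristic perception thresholds listed above.
def interpret_delta_e(delta_e: float) -> str:
    if delta_e < 1:
        return "not visible"
    if delta_e < 2:
        return "barely noticeable"
    if delta_e < 3:
        return "slightly noticeable - not enough for mobile app GUI elements"
    return "clearly visible"

print(interpret_delta_e(2.2))   # the borderline value discussed in this post
print(interpret_delta_e(3.5))   # meets the ΔE >= 3 recommendation
```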

On the lighter side, refer to the pic below.  Let me know what the Empfindung of your eyes is for the two colors discussed, together with the white background.


Pic-8:  The screenshot of this blog post on my mobile screen.



The three colors #FFFFFFFF, #FFF8FAFC, and #FFF6FAFF appear to merge and look like one color.  Don't they?

You can try an experiment with the above pic.  Look at it while increasing and decreasing the brightness and contrast of your screen (smartphone or monitor), both in a lit room and in the dark.  What is your experience?

I hope this is sufficient data to understand the seriousness of the problem discussed in this post.




What's the Fix?

  1. For mobile app engineering, the recommendation for GUI color and contrast is
    • ΔE ≥ 3  -- good and preferred
    • ΔE ≥ 5  -- safe
  2. Use better text in the search text field.
    • "Enter pickup location" looks better and prompts the user to tap on it.
  3. Distinguish and highlight the search text field GUI component prominently.
  4. When experimenting with A/B test configs, the GUI design should still follow the suggested GUI design and color engineering practices.
These fixes also benefit users with accessibility concerns and conditions.  A sketch of how the contrast recommendation could be checked automatically follows below.
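
For SDETs, one hedged way to automate the ΔE ≥ 3 check is to sample the colors of adjacent GUI elements from a screenshot and assert the recommendation.  In the Python sketch below, the screenshot file name and pixel coordinates are placeholders for illustration, not values from the Rapido app; the ΔE calculation assumes the colormath package.

```python
# Hypothetical sketch: sample two GUI element colors from an app screenshot
# (e.g. captured by Appium/Selenium) and assert the ΔE >= 3 recommendation.
from PIL import Image
from colormath.color_objects import sRGBColor, LabColor
from colormath.color_conversions import convert_color
from colormath.color_diff import delta_e_cie2000

def pixel_lab(image: Image.Image, xy: tuple) -> LabColor:
    """Read one pixel and convert it to CIE Lab via sRGB."""
    r, g, b = image.convert("RGB").getpixel(xy)
    return convert_color(sRGBColor(r, g, b, is_upscaled=True), LabColor)

screenshot = Image.open("ride_screen.png")         # placeholder screenshot file
search_field = pixel_lab(screenshot, (540, 320))   # placeholder coordinates
address_view = pixel_lab(screenshot, (540, 520))   # placeholder coordinates

delta_e = delta_e_cie2000(search_field, address_view)
assert delta_e >= 3, f"GUI elements too close in color: delta E = {delta_e:.2f}"
```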



For any questions or further information on this, please do connect with me.  I'm just one ping away!



Wednesday, January 26, 2022

Automation Strategy - How to Automate Validation of the Data Displayed in a Web UI Table

 

Use Case to Automate and a Problem Statement


I read the question below in The Test Tribe's Discord community server.  As I read it, I realized we all go through this when we think of automating a use case.  Credit for this question goes to Avishek Behera.


Picture: Description of a use case to automate and a problem statement

Here is the copy-paste of the use case and problem statement posted by Avishek Behera:

Hello everyone, here is a use case I came across while having a discussion on automating it.

A webpage has a table containing different columns,let's say employees table with I'd, name, salary , date, etc

It also has pagination in UI, we can have 20 rows in one screen or navigate to next 20 records based on how many total records present,it could go about 10 + pages and so on....


Problem statement:

How to validate the data displayed in table are correctly displayed as per column header , also correct values like names, amount etc. Use case is to validate data.

The data comes from an underlying service, different endpoints basically.

Now it's not about automation but about right and faster approach to test this data.

What are different ways can we think of?

I know this is a basic scenario but since I was thinking of different possible solutions.

One way my friend suggested to use selenium, loop through tables ,get values ,assert with expected. Then it is time consuming, is it right approach just to validate data using selenium?


These are the useful attributes of this question:

  1. It presented the context and the context information to the reader
  2. The availability of context information gave an idea of
    • What the API would look like
    • The request and its type
    • The response and its details
    • The consumer of this API
  3. It helped me imagine and visualize how the data would be interpreted by consumers to render it
  4. I got to see what Avishek is looking for in the said context


Interpreting the Use Case and Problem Statement


What is it?
  • It looks like the consumer is a web UI
    • The mention of the Selenium library supports this interpretation
  • The response has data which is displayed in a table-like web UI
  • The table can display anywhere from no data to multiple rows of data
  • Pagination is available for this result in the UI
    • Whether pagination is available in the API request and response is not clear from the problem description
    • 20 rows are shown on one page of the table
    • The number of pages in the table can be more than one
    • The response will have a filter flag
      • I assume the data displayed in the table can be validated accordingly
    • The response will have the data on the number of result pages
      • This makes the result on a page a fixed length
      • That is, 20 results on each page, and I cannot choose the number on the UI
      • The response will have the offset or a value that tells the number of records displayed and/or returned in the response
  • Is it a GET or POST request?
    • This is not said in the problem description
    • But from the way the problem is described, it looks like a GET request
    • But should I assume that it is an HTTP request?
      • I assume it for now!
  • I assume the data is received in JSON format by the consumer
  • I assume the data returned by the endpoint or the service is sorted before being returned
    • The consumer then need not process, sort, and filter the response before displaying it
    • If the consumer has to process the response, then filter, sort, and display it,
      • it would be a heavy operation on the client and on the client-side automation for this use case

If it is something other than an HTTP request, it should not matter much.  The underlying working and representation may remain similar to HTTP requests and responses unless the data is transferred in a binary format.



Automation Strategy for the Use Case


The key question I ask myself here is:
What is the expectation from automating this use case?

How I automate and where I automate are important, but they come later, after answering the key question.  The key is in knowing:
  • What is the expectation by automating this use case? 
  • What am I going to do from the outcome of this automation? 
  • What if the outcome of the automation gives me False Positive information and feedback?

These questions help me to see:
  • How should I weigh and prioritize the automation of this use case?
  • How should I approach automating this use case to be close to precise and accurate with deterministic attributes?
  • What and whose problem am I solving from automating this use case?

This gives me a lead into different approaches for automating the same use case, and helps in picking the best one for the context.

That said, the use case shared by Avishek Behera is not a problem or a challenge with the Selenium library or any other similar libraries.  Also, it is not a problem or a challenge with libraries used in the automation of web requests and responses.



Challenges in the Problem Statement


I do not see any problem in automating the use case.  But there are challenges in approaching the automation of this use case.

On the web UI, if I automate on the data returned, filtered, sorted, and displayed, it is a heavy task for automation.  Eventually, this is a very good candidate to soon become a fragile test.

Are the below the expectations from automating this use case?
  • To have a fragile test
  • To have high code maintenance for this use case
  • To do heavy rework in the automation when the web UI changes
  • To complicate the deterministic attributes of this use case's automation
If these are not the expectations, then picking an approach that has lower cost and maintenance is a need.

The challenges here are:
  • It is an automation strategy and approach challenge
  • It is a sampling challenge
    • Yes, automation at its best is also sampling, not just testing
  • It is about having better data, state, and responses, which helps to have accuracy in the deterministic attributes of automation
    • To know if it is a:
      • true positive
      • false positive
      • true negative
      • false negative
      • an error
      • not processable
  • The layer where we want to automate
    • The layers which we want to use together in automation, and how much
  • To what extent do we automate to get the information and the confidence that if this sampling works, then most data should work in this context of the system?
  • The availability of test data that helps me to evaluate faster and confidently

Whatever the system has under the hood, that is, GraphQL, gRPC, REST API, or any other technology stack services, one has to work out how to make a request, go through the response, and analyze it in context.  Just as testing depends on context, automation also depends on context.  In fact, context drives testing and automation better when it is included.



My Approach to Automate this Use Case


I will not automate the functional flow of this use case entirely on the web UI.  My thought is to have those tests which are more reliable and whose results influence and drive decisions.

This thought has nothing to do with the Test Automation Pyramid and its advocacy of having a minimal number of tests at the UI layer and many more at the integration (or service) layer.  I'm looking for what works best in the context, and for where to have the tests that give me information and feedback so that I have the confidence to decide and act.

To start, I identify the below functional tests for the said use case:
  1. Does the endpoint exist and serve?
  2. Assuming it is HTTP, what HTTP methods does this endpoint serve?
  3. What does the endpoint serve when it has no data to return?
    • The different HTTP status codes this endpoint is programmed to return, and those it is not programmed to return but still does
  4. What inputs (data, state, and event) does this endpoint need to return the data?
  5. In what format and how is the input sent in the request?
  6. In what format will the response be returned from the endpoint?
  7. Is the response sorted and filtered by the endpoint?
  8. How does the response look when there is no data available for a key?
  9. What if certain keys and their values are not available in the response?  How does that impact the client when displaying the data in a table?
    • For example,
      • No filter data is returned, or it is invalid for the consumer to process
      • No sorted data is returned, or it is invalid for the consumer to process
      • No pagination data is returned, or it is invalid for the consumer to process
      • A contract mismatch between provider and consumer for the data returned
        • What does the web UI then show in the table?
      • Any locale or environment-specific data format and its conversion when the client consumes the data returned by the endpoint
      • The data sorted by the consumer and by the provider differs
      • The data is sorted on a state by the endpoint, and that state might change at any time while being consumed by the consumer
      • Is it a one-time response or lazy loading?
        • If it is a lazy response, does the response have the key which tells the number of pages?
      • and more cases as we explore ...
  10. and more tests as we explore ...

Should we automate all of these tests?  Maybe not, per the business needs.  Imagine the complexity it carries when automating all these tests at the UI level.  But there are a few cases that need to be automated at the UI level.

Then, should we look at the table rows on all the different pages to test in this automation?  No!  We can sample, and thereby try to evaluate with as minimal data as possible.  This highlights the importance and usefulness of test data preparation and availability.  While preparing the test data is a skill, using minimal test data to sample is also a skill.


API Layer Test


I have a straightforward first case here.  That is, to evaluate the following (a minimal test sketch follows this list):
  1. The key (table header) and its value are returned as expected
    • Is it filtered?
    • If yes, is it filtered on the key I want?
    • Is it sorted after filtering?
    • There is no null or missing value for a key that must have a value in any case
    • The data count (usually the JSON array object), that is, the number of rows
    • The page index and current offset value
    • The number of result pages returned by the endpoint
  2. Can I accomplish this with an API test?
    • Yes, I can, and it will be efficient for the given context
  3. I will have five to ten test data records which will help me to know if the data is sorted and filtered
  4. Another test will be to receive more than 10 rows and see how the data looks when filtered and sorted
    • Especially in the case of lazy loading
    • I will try to evaluate the filtering and sorting with minimal data
    • I will have my test data available for this in the system
UPDATE: I missed this point, so I am adding it as an update.  I'm exploring the Selenium 4 feature where I can use the dev tools and monitor the network.  If I can accomplish what I need and it is simple in the context, this will help.
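
For reference, one way to watch network calls from a Selenium 4 test with Chrome is through the performance log, which surfaces Chrome DevTools Protocol events; this is a Chrome-specific sketch, and whether it fits is for the context to decide.

```python
# Sketch: capture Network.responseReceived events via Chrome's performance log.
import json
from selenium import webdriver

options = webdriver.ChromeOptions()
options.set_capability("goog:loggingPrefs", {"performance": "ALL"})
driver = webdriver.Chrome(options=options)

driver.get("https://app.example.test/employees")   # placeholder page

# Each performance log entry wraps a CDP event; keep only the responses.
for entry in driver.get_log("performance"):
    event = json.loads(entry["message"])["message"]
    if event["method"] == "Network.responseReceived":
        response = event["params"]["response"]
        print(response["status"], response["url"])

driver.quit()
```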


UI Layer Test


I have a straightforward case here as well to evaluate in the given context (a minimal Selenium sketch follows after the note below):
  1. I assume the provider and consumer abide by the contract
    • If not, then this is not an automation problem
    • It is a culture and practice problem to address and fix
  2. I assume the returned data is sorted on the filter; the web UI just consumes it for display
    • If not, I will understand why the client is doing the heavy work to filter and sort
      • What makes it this way?
    • You see, this is not an automation problem; it is a design challenge that can become a problem for the product, not just for automation
  3. Asserting the data in the web UI table:
    • I will keep minimal data on the UI to assert, that is, not more than 4 or 5 rows
    • These rows should have data that tells me the displayed order is sorted and filtered
      • Let's call the above two checks one test
    • To evaluate pagination, that is, the number of result pages, I will use the API response and assert the same on the web UI
      • Let's call the above another test, the second test
      • Again, the test data will be the key here
    • To see if the pagination is interactive and navigable on the UI, I perform an action to navigate to page number 'n'
      • If it is lazy loading, I will have to think about how to test the table refresh
        • Mostly I will not assert for data
          • In testing the endpoint,
            • I would have validated the results returned and their length
        • I will assert the number of rows in the table now
      • Let's call this a third test
  4. I will not do the data validations and their heavy assertions on the web UI unless I have no other way
    • That is not a good approach to pick either
    • One test will evaluate just one aspect; I do not club multiple checks into one test

Note: The purpose of the test is not to check whether the web UI is loading the same rows on all pages.  If that were the purpose, it would be another test, and I would try to keep minimal assertions on the web UI.
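
A minimal sketch of the first UI-layer test in Python, assuming Selenium with Chrome; it keeps the web UI assertions light by comparing only the row count and the order of one column against the provider response.  The page URL, API URL, locators, and JSON keys are hypothetical.

```python
# UI-layer sketch: assert minimal table data against the provider response.
import requests
from selenium import webdriver
from selenium.webdriver.common.by import By

API_URL = "https://api.example.test/employees"     # placeholder endpoint
PAGE_URL = "https://app.example.test/employees"    # placeholder web page

def test_table_renders_provider_data():
    # Expected data comes from the provider, kept deliberately small (4-5 rows).
    expected = requests.get(API_URL, params={"page": 1}, timeout=10).json()["data"]

    driver = webdriver.Chrome()
    try:
        driver.get(PAGE_URL)
        rows = driver.find_elements(By.CSS_SELECTOR, "table#employees tbody tr")

        # Minimal assertions: row count and the order of the name column.
        assert len(rows) == len(expected)
        rendered_names = [
            row.find_element(By.CSS_SELECTOR, "td.name").text for row in rows
        ]
        assert rendered_names == [item["name"] for item in expected]
    finally:
        driver.quit()
```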


The Parallel Learnings


If observed closely, the outcome of automation and its effectiveness is not just dependent on, and directly proportional to, how we write the automation.  It is also dependent on:
  • The design of the system (& product)
  • The environment and maintenance
  • The test data and maintenance
  • The way we sequence the tests to execute in automation
  • Where and how we automate
  • The person and team doing the automation
    • The organization's thought process and vision for testing and automation
    • The organization's expectation from testing and automation
    • How, why, and what the people, organization, and customers understand for testing and automation
  • Time and resources for testing and automation
  • The automation strategy and approach
  • More importantly, the system having and providing
    • Testability
    • Automatability
    • Observability


Note: This is not the only way to approach the automation of this use case.  I shared the one that looks best for the context.