Testing Garage: Web

Thursday, May 29, 2025

Browser Compatibility - What Problems Are You Witnessing In 2025?

One of my long-standing areas of work in Software Engineering is testing for Browser Compatibility. But is it still as big a problem as it was between 2000 and 2017?

Today's browsers have to, and do, adhere to the common guidelines and standards of the W3C. Likewise, the web development artefacts [libraries, frameworks, languages, etc.] in use are expected to adhere to these standards and guidelines.

In 2012, I laid out the evolving model of Test Architecture for Browser Compatibility. The teams across Aditi Technologies used this solution and it evolved gradually. It is the 'go to' test solution and approach to strategically test and automate for a web application's browser compatibility.

Those were the days when the browsers and web development did not adhere to standards and guidelines. As a consequence, the web pages exhibited problems and unexpected [intermittent] behaviors on major browsers and their versions.

I understand and am aware that a browser compatibility behavior can range from a GUI difference to a functional blocker. Further, it can extend from the web page's performance behavior to accessibility and security differences. I witnessed all of this when testing web applications for browser compatibility back then.

I cannot forget one version of Internet Explorer crashing while another did not when using the same web pages.

Maybe, on reading this, you are thinking of me as an old tester. Is it so?

2025 and Browser Compatibility

I'm curious to know and understand from you. So, I ask this question to you.

Have you experienced and reported browser compatibility behaviors recently? If so, what were they?

Please do not share the project and tech details.

Please share what behavior you noticed as part of browser compatibility and which quality criteria of the web page were impacted.

I'm one ping away on Google Hangout and on an email. Please do share! This will help me to contribute back to the Software Engineering and Testing community.

Friday, February 2, 2024

Deep Link and its Testing via Automation

I get these questions consistently from my fellow testers and the community.

How to automate the mobile apps and web applications using Deep Links?
How to automate the business flows using Deep Links?
How to achieve end-to-end testing of business flows using Deep Links?
How to automate scenarios in mobile apps using Deep Links?
What is the best approach to automate the mobile apps using Deep Links?
What is the best practice to automate using the Deep Links?

And more questions along the same lines.

No Deep Dive into - What is Deep Link?

A hyperlink in HTML is a kind of deep link within a website or to another website.

Deep Link is known by different names for the web, Android apps, and iOS apps. All these names share the same understanding and intent at some point.

Deep Links are URIs that take me directly to a specific part (activity or fragment) of the app that I'm using or testing. The Deep Link carries an intent which tells where I will be taken on using it.

When we dive deeper technically into testing and automating Deep Links, I will share more insights into their internals.

Deep Link and Challenges

This question is discussed with me often:

How to do end-to-end testing using the Deep Link?

Automation of a mobile app using Deep Link poses a challenge which is not experienced in a web application.

One such challenge is, say you have not installed the mobile app. [This is solvable!]

On using a Deep Link, I should be taken to the App Store or the Play Store based on the app.
I have to install the app.

Post this, in the traditional automation, I should start traversing the business work flows via GUI.
Is this adding to the flakiness aspect of automation via GUI?

When we talk so much about flakiness and how to avoid (not prevent) it, should we exercise business workflows when automating using Deep Link? What are you thinking? Let me know!

Scoping of Automation Using Deep Link

Back to the fundamentals.

We have to automate, no escape from it. Let us automate what must be automated!
Let us not fall into the trap of "Automate everything!"

For today, I'm in this mindset and attitude,

What we automate depends on the objective or goal that we want to accomplish.

Each test should have a precise and deterministic goal.

A test via automation is no exception to this.
A test defined in automation should be precise, deterministic and have a single objective - Single Responsibility Principle.

What is the objective of my testing via automation for the Deep Link? This defines the scope and extent of my automation. This will minimize the number of checks that I do using Deep Link.

The purpose of Deep Link is to take me to a specific part of the mobile app.

Should end-to-end flows or workflow exercises be included in the Deep Link tests?

If included, am I not complicating the testing via automation?

Automation using Deep Link

I ask this question to myself and to my team.

What is the goal of testing via automation using Deep Link?

This question helps me to pick the minimal and necessary flow actions. It has led, and still leads, me to define minimal tests for Deep Link based on what we want to learn from automating it.

To me, the purpose of Deep Link is not end-to-end testing. Its purpose is to answer,

Am I taken to the intended state and data when I use the Deep Link?

I have kept the test intent to this.

With this, I have come up with tests that have the minimal, must-have evaluations and assertions to learn whether the app responds to the Deep Link. This is what the business wants when the Deep Links are created.

The app usage and workflow function is not a problem statement of Deep Link in a general context.

Deep Link is not for end-to-end. It is to take you from one point to another, that's it.
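To make the single-objective idea concrete, here is a minimal sketch in Java. It only answers "did I land on the intended state and data?" and nothing more. The scheme garageapp, the link layout garageapp://<screen>/<itemId>, and the class names are my assumptions for illustration; in a real test the observed screen and item would be read from the app under test.

```java
import java.net.URI;

// Hypothetical link layout: garageapp://<screen>/<itemId> (an assumption, not a real app).
final class DeepLinkTarget {
    final String screen;  // the part of the app the link points to
    final String itemId;  // the data that part of the app should show

    DeepLinkTarget(String screen, String itemId) {
        this.screen = screen;
        this.itemId = itemId;
    }

    // Reads the intended screen and item out of the Deep Link itself.
    static DeepLinkTarget parse(String deepLink) {
        URI uri = URI.create(deepLink);
        String path = uri.getPath();
        String itemId = (path == null || path.length() <= 1) ? "" : path.substring(1);
        return new DeepLinkTarget(uri.getHost(), itemId);
    }
}

final class DeepLinkCheck {
    // The one objective: am I taken to the intended state and data?
    static boolean landedOnTarget(String deepLink, String observedScreen, String observedItemId) {
        DeepLinkTarget expected = DeepLinkTarget.parse(deepLink);
        return expected.screen.equals(observedScreen) && expected.itemId.equals(observedItemId);
    }

    public static void main(String[] args) {
        // The observed values are hard-coded here; a real test would ask the app for them.
        System.out.println(landedOnTarget("garageapp://orders/42", "orders", "42"));
        System.out.println(landedOnTarget("garageapp://orders/42", "home", ""));
    }
}
```

Notice there is no workflow traversal in this check. If the business flow after landing needs testing, that is a different test with a different objective.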

Are you automating using Deep Link?

Sunday, December 25, 2022

HTTP Request Methods - DOT 3P HCG

Today, in the morning session with a mentee, she asked, "I have difficulty in remembering all the HTTP request methods and what they do. How can I make it simple?"

I had the same question at the end of 2009 when I started testing applications built on HTTP.

Learning, and Registering the Learning

When I read, I forget it, because it is not yet registered in me consciously. How to learn in a way so that it registers in me? I had this question. Especially, when I started my career, I had this challenge.

In my college days, I formed tricks and hacks to remember things, and the mnemonic was one of them. In 2008, I came across mnemonics in Software Testing. I saw practitioners in Software Testing using mnemonics as a learning technique, and to register and retrieve the learning.

I repeat my learning in multiple approaches until I understand a concept. Then I form a layer where I make it simple for me to register it, in me, and to retrieve.

I applied the same with the HTTP request methods. It became simple to me to recall and use it in my test designs when needed.

DOT 3P HCG

I helped myself by framing the mnemonic DOT 3P HCG in 2010. I had difficulty in recalling the HCG part. For this, I said to myself -- head, chest, and gut. With that, HCG registered smoothly. Finally, I could recall all the HTTP request methods with this mnemonic.

DOT 3P HCG stands for:

D: DELETE

to delete the resource specified

O: OPTIONS

describes the communication options for the target resource

T: TRACE

used for diagnostic purpose and does a loop-back test along the path to target resource

P: POST

to submit an entity to specified resource

P: PUT

to create or replace (upload/update) the entity that is saved on the server at a specified endpoint

P: PATCH

to do a partial modification to a resource

H: HEAD

ask for a response which is identical to GET but without a response body

For example, fetching the expiry date in a response header so that it can be used in the next request's header or payload

C: CONNECT

to establish a tunnel with an endpoint or server for communication

G: GET

to request a representation (an information copy) of the specified resource
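The mnemonic can also be kept as data, which I find handy when writing test utilities. Below is a small sketch in Java; the enum and class names are my own for this illustration, and the purpose texts paraphrase the list above.

```java
// A sketch of the mnemonic as data; names here are illustrative, not from any library.
enum HttpMethod {
    DELETE("delete the specified resource"),
    OPTIONS("describe the communication options for the target resource"),
    TRACE("run a loop-back diagnostic test along the path to the target resource"),
    POST("submit an entity to the specified resource"),
    PUT("create or replace the entity stored at the specified endpoint"),
    PATCH("apply a partial modification to a resource"),
    HEAD("ask for the same headers as GET but without a response body"),
    CONNECT("establish a tunnel to an endpoint or server"),
    GET("request a representation of the specified resource");

    final String purpose;

    HttpMethod(String purpose) {
        this.purpose = purpose;
    }
}

final class Dot3pHcg {
    // Joins the first letter of every method, in declaration order.
    static String initials() {
        StringBuilder letters = new StringBuilder();
        for (HttpMethod method : HttpMethod.values()) {
            letters.append(method.name().charAt(0));
        }
        return letters.toString();
    }

    public static void main(String[] args) {
        System.out.println(initials()); // DOTPPPHCG, read as DOT 3P HCG
        for (HttpMethod method : HttpMethod.values()) {
            System.out.println(method.name().charAt(0) + ": " + method + " - " + method.purpose);
        }
    }
}
```

Declaring the methods in mnemonic order is the whole trick here: the order of the enum constants is the order I recall them in.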

As the names of the HTTP request methods are verb-like, I can easily recall the purpose of each method. I shared the same today with a mentee. She could register it in a minute and recall these HTTP request methods and their purposes.

She is happy and says it is so simple now to recall the HTTP methods and its purpose.

Thursday, September 22, 2022

WebDriver: Tracing the Interface WebDriver - Part 2

In the previous post of this WebDriver series, I shared a gist about what WebDriver does and how. In this blog post as Part 2 of this series, I'm sharing a bit more details on WebDriver and RemoteWebDriver.

From there, we will see how AppiumDriver is related to WebDriver -- which extends the interface SearchContext.

This blog post is written as part of 21Days21Tips from The Test Chat. The tip shared in this post is to know more about WebDriver internals and how it associates with RemoteWebDriver and AppiumDriver.

This should help in understanding the Selenium APIs better and where they come from. It helps in having a better mental model of the Selenium WebDriver and how we want to structure the instructions in the tests and utilities we write.

SearchContext and WebDriver

Picture: Representation of SearchContext and hierarchy of WebDriver

The SearchContext is the parent interface in the WebDriver hierarchy

The subinterfaces of SearchContext are

WebDriver
WebElement

This SearchContext defines two methods

findElement(By by)

Modifier and Type is: WebElement
It finds the first WebElement using the given method

findElements(By by)

Modifier and Type is: java.util.List<WebElement>
It finds all elements within the current context using the given mechanism

Note: I'm referring to Java APIs of Selenium in this blog post
More details of this can be found here.

Note: Selenium's Ruby client describes the Interface SearchContext as this.

The WebDriver provides the below methods:

close()
findElement(By by)
findElements(By by)
get(java.lang.String url)
getCurrentUrl()
getPageSource()
getTitle()
getWindowHandle()
getWindowHandles()
manage()
navigate()
quit()
switchTo()

More details of these methods can be found here.

RemoteWebDriver and AppiumDriver

Further, we see the class RemoteWebDriver implements the interface WebDriver. Today, the WebDriver and RemoteWebDriver communicate using standard W3C specifications.

That way, all the modern browsers which adhere to the W3C specification should not have (much) trouble when using WebDriver and RemoteWebDriver to mimic the user action on them. We see ChromiumDriver, ChromeDriver, FirefoxDriver, EdgeDriver, SafariDriver, and OperaDriver extending RemoteWebDriver; ChromeDriver and EdgeDriver do so through ChromiumDriver.

This hints us to know and learn:

Why we declare the WebDriver first
And then instantiate the browser's driver
And later, how we use the WebDriver instance to drive action (mimic the user action) on the browser using the respective browser's driver

When we want to automate using Selenium Grid, we make use of RemoteWebDriver to drive the action between the client and server.
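To see why the variable is declared as WebDriver while the object is a browser's driver, here is a toy model in Java. These are NOT Selenium's types: every name carries a Toy prefix on purpose, and the methods are heavily simplified assumptions that only mirror the shape of the hierarchy described above.

```java
// Toy stand-ins for illustration only; Selenium's real interfaces have many more methods.
interface ToySearchContext {
    String findElement(String id);
}

interface ToyWebDriver extends ToySearchContext {
    void get(String url);
    String getCurrentUrl();
}

// Plays the role of RemoteWebDriver: one implementation shared by all browsers.
class ToyRemoteWebDriver implements ToyWebDriver {
    private final String browserName;
    private String currentUrl = "about:blank";

    ToyRemoteWebDriver(String browserName) {
        this.browserName = browserName;
    }

    public void get(String url) {
        currentUrl = url; // a real driver would send a command to the browser here
    }

    public String getCurrentUrl() {
        return currentUrl;
    }

    public String findElement(String id) {
        return browserName + " element #" + id;
    }
}

class ToyChromeDriver extends ToyRemoteWebDriver {
    ToyChromeDriver() { super("chrome"); }
}

class ToyFirefoxDriver extends ToyRemoteWebDriver {
    ToyFirefoxDriver() { super("firefox"); }
}

final class DriverTypingSketch {
    // Written once against the interface, so it works with any browser's driver.
    static String openAndReport(ToyWebDriver driver, String url) {
        driver.get(url);
        return driver.getCurrentUrl();
    }

    public static void main(String[] args) {
        ToyWebDriver driver = new ToyChromeDriver(); // interface on the left, browser on the right
        System.out.println(openAndReport(driver, "https://testingGarage.blogspot.com"));
        driver = new ToyFirefoxDriver();             // swap the browser, keep the test code
        System.out.println(driver.findElement("greeting_textbox"));
    }
}
```

The test code depends only on the interface, so switching the browser changes one line and not the tests.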

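The class AppiumDriver extends RemoteWebDriver from the Selenium project. It does not extend WebElement, which is an interface; in older versions of the Java client, WebElement appears only as the bound of AppiumDriver's generic type parameter. Further, AppiumDriver has its own methods to interact with the mobile elements. More details about the Java Client of AppiumDriver can be found here.

The subclasses of AppiumDriver are:

AndroidDriver
IOSDriver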
WindowsDriver

21 Days 21 Tips -- #day17

Here are my pointers to fellow test engineers

Interface SearchContext is top in the hierarchy of the WebDriver interface
Interface SearchContext defines the scope of a search

Should I search for the element in the whole page

using WebDriver object

Or, should I search within a containing element

using WebElement object

We can notice methods returning the type WebElement

RemoteWebDriver implements the interface WebDriver
The modern browsers' drivers extend the class RemoteWebDriver
AppiumDriver extends the class RemoteWebDriver

AndroidDriver extends AppiumDriver
IOSDriver extends AppiumDriver
WindowsDriver extends AppiumDriver
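The pointer about searching the whole page versus searching within a containing element can be sketched with a toy model in Java. The Mini types below are my stand-ins, not Selenium's classes; they only show how one findElement contract gives two different search scopes.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for the idea behind SearchContext: one contract, two scopes.
interface MiniSearchContext {
    MiniElement findElement(String id);
}

class MiniElement implements MiniSearchContext {
    final String id;
    final List<MiniElement> children = new ArrayList<>();

    MiniElement(String id, MiniElement... kids) {
        this.id = id;
        for (MiniElement kid : kids) {
            children.add(kid);
        }
    }

    // Searches only the descendants of this element, depth first.
    public MiniElement findElement(String wantedId) {
        for (MiniElement child : children) {
            if (child.id.equals(wantedId)) {
                return child;
            }
            MiniElement deeper = child.findElement(wantedId);
            if (deeper != null) {
                return deeper;
            }
        }
        return null; // simplification: Selenium throws NoSuchElementException instead
    }
}

class MiniDriver implements MiniSearchContext {
    private final MiniElement root;

    MiniDriver(MiniElement root) {
        this.root = root;
    }

    // Searches the whole page, starting from the root.
    public MiniElement findElement(String wantedId) {
        return root.id.equals(wantedId) ? root : root.findElement(wantedId);
    }
}

final class ScopedSearchSketch {
    static MiniElement samplePage() {
        return new MiniElement("page",
                new MiniElement("header", new MiniElement("search_box")),
                new MiniElement("footer", new MiniElement("contact_link")));
    }

    public static void main(String[] args) {
        MiniDriver driver = new MiniDriver(samplePage());
        MiniElement header = driver.findElement("header");
        System.out.println(driver.findElement("contact_link") != null); // whole page scope
        System.out.println(header.findElement("contact_link") != null); // header scope only
    }
}
```

Searching from the header does not find the footer's link, which is why choosing between the driver and an element as the starting point matters.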

For more understanding of the SearchContext and WebDriver, refer to the Git repository of SeleniumHQ below:

SeleniumHQ Repository -- https://github.com/SeleniumHQ/selenium
Selenium Java Client

SearchContext

https://github.com/SeleniumHQ/selenium/blob/trunk/java/src/org/openqa/selenium/SearchContext.java

WebDriver

https://github.com/SeleniumHQ/selenium/blob/trunk/java/src/org/openqa/selenium/WebDriver.java

RemoteWebDriver

https://github.com/SeleniumHQ/selenium/blob/trunk/java/src/org/openqa/selenium/remote/RemoteWebDriver.java

ChromiumDriver

https://github.com/SeleniumHQ/selenium/blob/trunk/java/src/org/openqa/selenium/chromium/ChromiumDriver.java

The understanding below should give a mental model of how the call happens in Selenium's library:

WebDriver and browser's driver instantiation
The order in which it is instantiated and used in programming to automate actions on the browser

If you notice, the automation we do is more about programming than about Selenium's library. We use and extend the Selenium library in our programming to mimic the action on the browsers and mobile apps.

Friday, April 1, 2022

WebDriver: Clarifying the Confusion on Why and What is the WebDriver - Part 1

I had a question "What is WebDriver and why should I use it to automate on a browser?" I tried to understand it and relate its presence in code written using Selenium. I see this question among the test engineers who are starting the practice of automation on browsers.

And, most of us get confused with WebDriver, WebDriverManager, and WebdriverIO. These are not the same, but they all work in the same space, that is, automation on the web and mobile.

Along the way, I learned that understanding WebDriver is fundamental to the practice of automation on web browsers. The same idea is taken to the automation of mobile apps using Appium.

I'm sharing this learning of mine as a part of 21Days21Tips, the initiative from The Test Chat community. The tip here is to assist by providing clarity around the WebDriver and why we use it in automation on a browser.

What is WebDriver?

The WebDriver is part of the Selenium library and we use it every time when we are trying to do any interaction with and upon a browser. It is also a language binding and helps to write the browser controlling code. For example, if I pick Selenium's Java WebDriver,

it provides the APIs that I consume to control the actions on the web page displayed on a browser
likewise, if I pick Selenium's Python WebDriver it provides me the APIs that I consume to automate my actions on a browser

I code here using Python

That said, the WebDriver is a set of APIs and to be precise it is an object-oriented API adhering to the W3C standards. As a result, the WebDriver drives the browsers effectively today as all popular browsers adhere to the W3C standards. HTTP is used as the transport protocol.

Understanding the WebDriver

On a higher level, this is what WebDriver does:

The tests we write make use of WebDriver API
This WebDriver API carries the commands (written in the test) to interact with the browser's driver
On receiving the commands, the browser's driver communicates natively with the browser, translating the commands so that the browser emulates the action.
The browser returns the response to its driver
The browser's driver will transfer information to the WebDriver
Then, WebDriver shows the information to a user who is running the test

Examples of browser's driver are:

chromedriver of Chrome
geckodriver of Firefox

Representation of Selenium WebDriver's Communication

The instructions (commands) that I pass via WebDriver's object are translated to stateless information. That is, there is no state maintained between the client and the browser's driver.

Representation of Selenium's WebDriver SPI & Browser Interaction

When the code enters into Stateless Programming Interface (SPI), it is called into a process that breaks down what the element is, by using the unique identification and then calling the command. For example, let us look into the below statements to understand what the code looks like at SPI:

Code written using WebDriver API:

WebElement greetBox = driver.findElement(By.id("greeting_textbox"));
greetBox.sendKeys("Welcome to Testing Garage's Blog");

SPI:

findElement(using="id", value="greeting_textbox")
sendKeys(element="greetBox", value="Welcome to Testing Garage's Blog");

Note: The findElement and sendKeys are the commands provided by Selenium's WebDriver API to find the web element on the web page and enter the text into the web element. The browser's driver receives these commands and data, then emulates the command (a user action) on the browser, and carries back the response to WebDriver.
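To picture what travels over HTTP for the two statements above, here is a rough sketch in Java of the requests defined by the W3C WebDriver specification for finding an element and sending keys to it. The class names, the session id "s1" and the element id "e1" are my assumptions for illustration; real ids are opaque strings issued by the browser's driver. As far as I understand, the W3C protocol has no "id" locator strategy, so clients send By.id as a CSS selector.

```java
// A sketch of W3C WebDriver requests as plain data; nothing is sent over the network.
final class WireCommand {
    final String httpMethod;
    final String path;
    final String jsonBody;

    WireCommand(String httpMethod, String path, String jsonBody) {
        this.httpMethod = httpMethod;
        this.path = path;
        this.jsonBody = jsonBody;
    }

    // Find Element: POST /session/{session id}/element
    static WireCommand findElementById(String sessionId, String id) {
        // An id lookup is expressed as the CSS selector "#<id>".
        String body = "{\"using\":\"css selector\",\"value\":\"#" + id + "\"}";
        return new WireCommand("POST", "/session/" + sessionId + "/element", body);
    }

    // Element Send Keys: POST /session/{session id}/element/{element id}/value
    static WireCommand sendKeys(String sessionId, String elementId, String text) {
        // Simplification: the text is not JSON-escaped, so keep quotes and backslashes out of it.
        String body = "{\"text\":\"" + text + "\"}";
        return new WireCommand("POST",
                "/session/" + sessionId + "/element/" + elementId + "/value", body);
    }

    @Override
    public String toString() {
        return httpMethod + " " + path + " " + jsonBody;
    }
}

final class WireCommandSketch {
    public static void main(String[] args) {
        // "s1" and "e1" are hypothetical ids used only for this printout.
        System.out.println(WireCommand.findElementById("s1", "greeting_textbox"));
        System.out.println(WireCommand.sendKeys("s1", "e1", "Welcome to Testing Garage's Blog"));
    }
}
```

Each request names the session and the element it acts on, so the browser's driver can handle a command without relying on what the previous command carried.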

21 Days 21 Tips -- #day13

Here are my pointers to fellow test engineers who are confused about WebDriver

WebDriverManager and WebdriverIO are not WebDriver

But all of these are around automation of the web and mobile

WebDriver interface helps in

Control of the browser
Identification and selection of web elements on the web page
Provides assistance to debug

Browser Level API

driver.manage().window().maximize();
driver.get("https://testingGarage.blogspot.com");
driver.navigate().back();
driver.navigate().forward();
driver.getWindowHandle();
driver.getWindowHandles();

Few Page Level API

driver.findElement(By by)
driver.findElements(By by)
driver.getCurrentUrl();
driver.getTitle();
driver.getPageSource();

If you notice, we use these APIs to automate the browser

The tests we write use these APIs of Selenium WebDriver along with the assertion

Why are we using "driver" in the above commands?

This is another question and confusion among fellow test engineers starting to practice automation
I will share this in the next tip :)

This understanding of WebDriver, and the why and how it is instantiated (in the next post) will help you to be comfortable in starting to read the test code written using Selenium.

Sunday, April 26, 2020

Web: Debugging the Errors using Browser's Utilities

My curiosity rises when I come across problems that have no first-hand clues. I get in and start debugging, looking first at what is available. Here is one such case, which was mentioned in the Facebook Group of The Test Tribe community. It had a screenshot of a browser (not sure which browser) with a URL and an error message for an untitled tab -- "This webpage was reloaded because a problem occurred."

I can't guess what went on there! But I see that the page had to be reloaded. The reason for it can be anything. But I will take at least two into consideration to start:

Browser crash, and
DOM failed and paint did not happen (for 'n' reasons, which are not known at the time of witnessing it).

My questions to start looking for insights:

How will I know what actions I made with the browser?
How will I know what was loaded and not loaded?
How will I know if there were any errors in loading the web page?
How will I know if there was a crash in the browser? (Assuming this is a rare case, but it cannot be denied. I have witnessed browser crashes when I tested for cross browser compatibility of web applications)

Typically I prefer the browsers in this order when I want to assess the event actions and performance -- Chrome, Firefox, and any other browser. Note that how the event actions and performance are handled differs from browser to browser.

Opening the Chrome URLs below in one tab of Chrome, while using the web application in another tab, records and collects information that can be useful to debug. That way I don't have to make note of my actions in parallel, while I test in this context:

chrome://user-actions (This records what actions I'm doing with my open browser window instance)
chrome://history (This shows the places I have visited)
chrome://net-export (I use this when I actually need it to debug much deeper, else I will not enable it. This can have influence on the performance of a browser when enabled.)
chrome://crashes (lists the crashes of browser)

Make note of what you are collecting in the file when you use the net-export utility in Chrome. Credentials and private data can get recorded in the file as well. If this is to be shared with someone, know the risk of doing it. Likewise, in Firefox, referring to the available "about:" pages is useful. For example, about:crashes

To know whether the resources were requested and whether they loaded, the commands below in the console of Chrome and Firefox help:

performance.getEntriesByType("resource");
performance.getEntriesByType("navigation");

I have not found IE giving this detailed information. Or, I'm not aware of it. If you are aware of it, please share. I have not debugged much in Safari in recent times. Usually, the technical heuristic is if "looks to work" in Chrome, "more likely it looks to work" in Safari. Need to look at the CSS particulars, in specific with Safari.

These are a few things which I keep set up before I test web applications in a browser. With this setup, the possibility of getting the required information is high -- to debug and to report the bug with tech details.

In IE, I will debug from the point where the error occurs by following the stack trace details. When done together with a programmer, it helps both of us very much. Apart from the above, on searching the web, I found several reasons stated for this error, from cache to an invalid system time on the machine where the browser is being used. I will cross check quickly for them as well, while I have this setup in place and collecting the required information.
