Testing Garage

Monday, January 3, 2022

The Automation Strategy Problem; Not an Appium Challenge

In The Test Tribe's forum, I read a post that described the problem given in the paragraphs and picture below. On looking into it, I saw it could become a blog post about automation strategy.

Maybe 10 years back I would have asked the same question. That's the learning curve. Even today, I end up thinking for a while and asking myself -- how to test it and how to automate it.

I want to share how this problem can be looked at from the perspective of testing and automation, and then how to approach automating it.

Folks. I've two issues on Appium automation which needs your help.

1. I'm working on a ecommerce website where a payment method is integrated (lets take the example as PhonePe). When i try to place the order in the mobile website with payment method as PhonePe, the payment method app will be opened and I've to complete the payment using it and I'm navigated back to the browser. Issue is - How can i switch context between the mobile browser and the app? I tried using driver.startActivity() but on performing any other actions, it errors out.

2. Since i need to use the browser to place order and the payment using the payment app, I tried to set up the driver instance with browserName and app as the desired capabilities together. But on running the test, it errors out - browserName and app can't be used together. How can i approach this problem? Anyone who has automated such flows?

Apologies, i'm pretty new to Appium and so, please excuse my ignorance.

Picture: Problem Statement - Description of Scenario & Challenge

Understanding the Scenario and Functional Flow

I observe the below in the said scenario:

It is a website; it also has a mobile website
It has got a payment option integrated
The Appium Desired Capabilities defined have both browserName and app

browserName -- name of the mobile web browser used in automation; it is an empty string if automating an app
app -- the path of an app to be automated

When using a mobile website on a mobile device -- assuming it is a mobile web app

On selecting a payment app -- assuming it is a native app

The context changes to payment app UI
On completing the payment, the context changes to the mobile website

Challenges Described in the Functional Flow

I see these as challenges:

How to handle this said scenario in automation using Appium?
How to switch context between mobile browser and the mobile app?
Using driver.startActivity() yields an error on performing any other action

The error is observed on any UI action made after calling the above method

From the description, the error is thrown when running the automation

And, when changing the context back to the mobile website from the payment app

driver.startActivity() takes two arguments -- the app's package name and the activity to be started. What was passed for the package name and the activity name is not clear from the problem description.

If the mobile browser is used to launch the mobile website and mimic the action, what is passed as the app's package name and activity in driver.startActivity()? This is not mentioned and is unclear to me.

Also, what is set for browserName and app in the desired capabilities is not clear.
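The capabilities in question are not shown in the post, so the following is only a sketch of what a mobile-web session might declare, assuming Android with Chrome. The device name is a placeholder, and validate_caps is a hypothetical helper (not part of Appium) that mirrors the error the post reports when browserName and app are given together.

```python
# Capabilities for a mobile-web session: browserName is set, app is left out.
# All values are illustrative assumptions, not taken from the forum post.
mobile_web_caps = {
    "platformName": "Android",
    "appium:automationName": "UiAutomator2",
    "appium:deviceName": "Android Emulator",  # placeholder device name
    "browserName": "Chrome",
}


def validate_caps(caps):
    """Hypothetical pre-check: reject a browser and an app in one session."""
    has_browser = bool(caps.get("browserName"))
    has_app = bool(caps.get("appium:app") or caps.get("app"))
    if has_browser and has_app:
        raise ValueError("browserName and app cannot be used together")
    return caps
```

A check like this only makes the conflict visible before a session is requested; it does not answer how the payment app should be handled.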

A Common Use Case

In recent years this has become a common use case, both in mobile native apps that have a web view and in websites that have payment transactions. For example, when making a payment in a native app, the payment gateway's web view shows a list of payment choices. On successful payment, the view switches from the web view back to the native view.

Questions on Reading the Problem Statement:

I have the below questions on reading the problem description:

Why did it throw errors on any action after calling driver.startActivity()?

driver.startActivity() starts an Android activity using the package name and the activity name

Is the context not picked correctly when switching from web view to native and then back to web view?

But it is a mobile website, which means it is opened in a mobile browser, right?

Nowhere is it mentioned to be a hybrid app, i.e. the mobile website installed as an app

Does this mobile website maintain its context when switching to a native app (payment app), and then changing the context back to the (web view) mobile browser?

This takes me to seek clarity for:

Is the mobile website an installed hybrid app? Or is it a regular website that also has a mobile website, accessed on a mobile browser?
Is it possible to switch the context from a web page in the mobile web browser to a native app, and vice versa?

I need to explore it; I'm unsure of it
Reading the desired capabilities, it looks like this can be done

That is, switching context from the mobile web browser to the native app, and back to the mobile browser from the native app, looks possible
I need to explore this to be sure of it

Code Snippets for Context Switching

Refer to this page for details on using the web view with Appium. The below code snippets show how to find the contexts of the web and native views, and how to switch to them.

Snippet illustrating the change of context to Web view

Snippet illustrating the change of context to Native view
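A minimal sketch of the two switches named above, assuming the Appium Python client, where driver.contexts lists the available context names and driver.switch_to.context(name) selects one. The helper names are hypothetical, and the "WEBVIEW" prefix and "NATIVE_APP" name are assumptions about how the contexts are named on the device under test. Note that this switches between views inside one session; it does not by itself move control to a different app.

```python
def switch_to_webview(driver):
    """Switch to the first context whose name starts with WEBVIEW."""
    for name in driver.contexts:
        if name.startswith("WEBVIEW"):
            driver.switch_to.context(name)
            return name
    raise RuntimeError(f"No web view context found in {driver.contexts}")


def switch_to_native(driver):
    """Switch back to the native context."""
    driver.switch_to.context("NATIVE_APP")
    return "NATIVE_APP"
```

Printing driver.contexts before switching is a quick way to see which names the session actually exposes.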

But, What's Actually the Problem?

If automated as described in the problem statement, do we end up with a problem? I see that yes, we will:

We need to maintain our automation so that it can drive the payment app UI at any time

If the UI of the payment app changes, we need to maintain the code

Do we have a stage environment of the payment app in this case?

If we test the mobile website in stage and make the transaction in the production payment app,

Can we continue like this in each test iteration?
If yes, how long can we continue to use the production payment app and pay?
Will there be any transaction fee charged each time by the payment app?

Can this become a financial cost to the business, client, or stakeholders?
What other costs should I bear for using this approach?

I need to know:

What is it that I want to learn from the use case or scenario by automating it?
What would be the impact if the test did not help me learn what I want to learn from automating this use case and scenario?
Should I be testing the payment app along with my app?

As I write UI automation to handle the web view of the payment gateway and then the native payment app, it becomes part of this test. Should I do that?

What information, risk, and problem discovery do I miss if I do not automate the payment app flows?

Is it okay for the business and product if I miss any information here or if I do not test the flow in the payment app?
How to arrive at this decision?

The decision here needs to be rational. But being rational alone may not always help. Can I be reasonable here when I'm deciding, or when influencing stakeholders who are deciding?

This is an Automation Strategy Problem!

Seen this way, this is first of all not an Appium problem. It is a problem of -- what to automate, how to automate, when to automate, how much to automate, and why automate. That is, it is a problem of automation strategy: how to approach and execute it.

To me it is a problem of how to approach and execute the automation of a payment transaction, and not a problem of automation library usage and implementation.

How can I Approach the Automation Here?

I will first learn whether this payment scenario should be automated on the UI layer at all. If yes, why? Then I will have the below questions

Can I use the developer APIs of the payment service to test and complete the transaction?

If yes, then

Can I use the stage APIs of the payment service to simulate the transaction flow and its completion?

If I just use APIs, I will not know the functional experience of transactions in the native payment app. Is this okay?

I, and the product I test, do not have control over the payment system and its apps

When I have no control over it at any point in time, should I test it as part of my system? If I did so, should my product also take on the probabilities and complexities of the payment system?

Having this information is good!
But what can I do with that information?

Do I have the authority to change or fix the payment system with that information?
If yes, good; if no, is the time and resource spent on this a value return to my stakeholders and their business?

It is wise to state that I'm not including and testing the payment system and its transactions as a part of my system

Because my system does not have control over the payment system by any means

If the API used for initiating the transaction is functional and usable, then I do not have to worry, technically, about the functional perspective of transactions

We will have to work on whether the payment-initiating web view is functional in my native app and in my website or mobile website

From here on, control of the payment and any transaction problem that arises are in the realm of the payment system

In the test report

I will include the stage payment API request and its response with data

By talking to the payment app organization, we may get developer API access on stage to test our system against their stage
Talk to the payment app organization!
We can also mock the payment API to an extent, and say in the test report that this is a mock result

If we rely on the mock, we can miss changes in the payment system
I will keep mocking as the last approach, just to complete a business flow; it will not be my pick unless someone wants to see a business flow completed in a test

Have a test that tells about the functional and usable aspects of the payment page in a mobile website, and of the payment web view in a native/hybrid app
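The mocking idea above can be sketched as follows. This is only an illustration under assumptions: MockPaymentGateway, PaymentResult, place_order, and the status strings are hypothetical names, not any payment provider's API. The point it shows is that the result carries a source field, so the test report can state plainly that the outcome came from a mock and not from the stage payment API.

```python
from dataclasses import dataclass


@dataclass
class PaymentResult:
    order_id: str
    status: str
    source: str  # "mock" here; a stage client would report "stage"


class MockPaymentGateway:
    """Hypothetical stand-in for the payment service's API."""

    def __init__(self, outcome="SUCCESS"):
        self.outcome = outcome
        self.requests = []  # recorded so the report can include request data

    def initiate(self, order_id, amount):
        self.requests.append({"order_id": order_id, "amount": amount})
        return PaymentResult(order_id, self.outcome, source="mock")


def place_order(gateway, order_id, amount):
    """The part under my control: initiate payment and map it to an order state."""
    result = gateway.initiate(order_id, amount)
    state = "CONFIRMED" if result.status == "SUCCESS" else "PAYMENT_FAILED"
    return state, result
```

Because place_order only depends on an object with an initiate method, a client for the stage payment API could be passed in its place when that access is available.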

Benefits of this Approach

I and my tests will have clarity on what is in my control and what is not
When I have control, the tests and automation can be well maintained
The flaky areas can be identified; I can decide whether to eliminate them from the automation or not
It helps to identify what is my problem and what is a problem I have no authority over
With this approach, the tests and automation provide clarity when we uncover a risk or problem

While I know the benefits, I must also know the cost of this approach.

Sunday, December 26, 2021

Before Identifying and Listing My Tests

I read the below query in TTC's Telegram chat. A discussion had started on this thread and fellow members were responding. Then I read this line and it made me look into it -- "The question was we have to use valid username and password..and perform a negative testcase".

The Default Thinking and Applying Interface

Including me, I see it is common for us to subconsciously visualize a problem statement in terms of a Graphical User Interface. When I ask why that is so, maybe it is rooted in our subconscious thinking, i.e. in first-order, second-order, or any order of thinking.

I want to attempt approaching it by reminding and asking myself the below questions:

Is it a GUI specific problem?
Is it a problem that is tied to the context of GUI?
What does this question encapsulate within and open as an interface?
What forms do these interfaces take when I step outside a specific interface?
Should I stick to one interface to learn and attempt this problem?

Identify the Tests and Framing of Tests

We test to learn

Does the system do what it is supposed to do, and how, why, and when?
When does the system not do what it is supposed to do, and how and why?

Should I call these Negative Tests? That is not what I am sharing in this post.

To me, these are tests that help me learn when the system responds and behaves in a way other than I expected.

I can start by identifying the straightforward use cases for introducing an error (a human-introduced error) at a given state/data/event, and then look at the behavior of the system. It is good when we can keep identifying and ideating use cases.
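As one illustration of an error at a given state, and not tied to any GUI, here is a minimal sketch where the username and password are valid but the account state makes the authentication fail. The authenticate function, the account fields, and the reason strings are all hypothetical; a real system would define its own.

```python
def authenticate(accounts, username, password):
    """Hypothetical system under test: returns (ok, reason)."""
    account = accounts.get(username)
    if account is None or account["password"] != password:
        return False, "invalid_credentials"
    if account["locked"]:
        return False, "account_locked"
    if account["password_expired"]:
        return False, "password_expired"
    return True, "ok"


# Same valid credentials in every case; only the account state differs.
accounts = {
    "active": {"password": "s3cret", "locked": False, "password_expired": False},
    "locked": {"password": "s3cret", "locked": True, "password_expired": False},
    "expired": {"password": "s3cret", "locked": False, "password_expired": True},
}

outcomes = {name: authenticate(accounts, name, "s3cret") for name in accounts}
```

Each entry in outcomes answers one question put to the system: with correct credentials, what does it do when the state is not what the happy path assumes?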

We get limited by use cases as we continue to think in use cases. That said, we will surely identify and frame tests within the identified use cases. But we need tests that help us learn when the system fails to do what it is supposed to do.

To supplement this, there is another way, which I use. I do not say this is the only way to supplement. I use multiple approaches to supplement and identify the tests. When I do so, I put questions to the system with the help of these tests and evaluate the system's response.

Questions to Identify the Priority Tests

I learn and understand the system each time, to identify better tests. And each time I learn something new about the system that I did not know.

When I'm asked a question in an interview, I ask for details that help me test better or demonstrate my deliverable better. I will watch the questions that I ask!

If I were the candidate who got this question in an interview, I would ask the below questions. When I learn enough for the initial tests, I will pause the questions and move on to identify and frame the tests using the responses I got.

These questions will surely help me to be precise and close to the context, which better demonstrates my testing skills. If it is not close, then there is a problem (or a difference) between my presentation and the expectations in the interview. I will have to address it with the help of the interviewer.

Questions:

What is the interface where I'm entering the username and password?

Where is this authentication used?
On UI (if so which UI), or CLI, or touch interface, or what is its interface type?
At which layer of the system is this authentication used?

What is the format of the username and password?
What is used as the authorization identity on successful authentication?

What happens if my authentication is not successful in the UI you want me to test?
How do I understand that the UI is communicating to me that my authentication is not successful?

How is this authentication processed?
Where is the authentication mapped to authorization and stored for reference?
What protocol is used to communicate during authentication?

What protocol and communication order is used to grant and revoke authorization?

Who uses this authentication and authorization?

To know the different means of doing the same

Is there any other form of authentication that grants me the authorization?

Do these different entry points of authentication update my authorization?
Will I have different authorization data to authenticate? If yes, how are the data, states, and events maintained for my authentication and account?

What languages and Unicode are supported by this system?

Will the languages and Unicode used in the system have any impact when I try to authorize by changing the language and Unicode? How does the system understand these differences and maintain one state of data with authorization?

Are there any computing differences for authentication and authorization on big-endian and little-endian machines? If yes, how and for what context of the system's behavior, processing, and decision?
Where and how are the authentication and authorization details processed, stored, and presented back?

Is there any specific reason for doing it in this particular way?
How have you strengthened the authentication process to grant the authorization?

For example, 1FA, 2FA, nFA, what else?

Does any other system use your authentication to authenticate and authorize?
Do you use SSO for authentication and authorization?
What testability layer do I have that I can make use of to support and identify the tests?

Does this testability layer help me to identify more tests and also classify them?

I can keep generating questions like this. But I will have to pause and start working on what the questions offer me.

With the help of these questions, I can learn more about the system before attempting to identify the tests and frame them. This also pulls out any risk or problem area that looks important and of priority.

I have eased my work to an extent when I know:

the target surface area to start my work
what it takes and brings back, and how

In this context, I would have started this way!