Application & Script Independent Framework:

N4N (The Need for Data Normalization)

Abstract

This paper presents a modern test automation framework that uses normalization of test data and object metadata (object name, object type and object properties) as a technique to minimize the rework and maintenance effort whenever there is a change in the requirements, design or even the business flow. It leverages various RDBMS concepts to extract maximum benefit; these concepts are commonly practiced in application development but often ignored in test automation efforts. The use of database views provides abstraction and flexibility to the overall architecture. The key is to make the automation project independent of changes in the application (design or requirements), enabling automation scripts to continue working with the least possible maintenance effort. The paper provides the proposed solution, architecture, workflow and guidelines to be followed by test teams in order to implement the framework.

Keywords: Advanced Automation Frameworks, Normalization, Software Test Automation, Software Testing

1. Introduction

Test automation is one of the fastest growing areas in software engineering and has been very successful in attracting the attention of clients and managers. Every company spends a considerable amount of money on acquiring the best automation tools and automation experts to make its projects successful. On the other hand, the world is becoming more and more dynamic, and so are the requirements. With the evolution of development models like Agile, SCRUM and RUP, changes in requirements are possible at almost any phase; but are the available automation frameworks good enough to cope with all these problems? The answer is no. We always have to keep in mind that “requirement changes are inevitable”. Most automation projects end up with the maintenance phase taking more effort than the initial automation effort. Changes in requirements, arriving as change requests, through emails or over the phone, often result in a lot of automation maintenance effort. It generally takes 4-5 releases to get a return on investment (ROI) on automation, which has a significant impact on the client’s decision whether to go for automation at all.

2. Problems

2.1 Redundancy

Typically, in the data-driven testing context, different automation consultants create and use test data for their respective functional modules. Much of this data is common across modules, and maintaining multiple copies of the same data ultimately leads to redundancy. In particular, whenever there is an update, every data pool containing the redundant fields needs to be updated.

Data normalization is absent from the test data pools supported by QA tools, as they are stored in Excel, CSV or flat-file formats.

Scenario 1: Consider 500 data pools created for 1000 different test cases by different users. The users might have unknowingly created common fields in multiple data pools. If a common object changes in the next release, the tester needs to revisit all the data pools to change the duplicated entries. This process is time consuming and error prone.

2.2 Inconsistency

It is generally very difficult to manage a big application that involves millions of test data records. This often leads to inconsistency, which is one of the prime reasons for script failures.

Scenario 1: If the object name, which is used for object identification, is changed or renamed, then all the scripts using that object will abort. The tester then has to find all occurrences of the changed object in all the scripts and replace them with the new value, which is a tedious job.

Scenario 2: If the object type is changed, all the scripts using that object will abort. Here again, the user has to find all occurrences of the changed object type in all the scripts and replace them with new statements, which is very time consuming.

Root Cause Analysis

Figure 1. Root Cause Analysis

2.3 Iterative software development

Given today's sophisticated software systems, it is not possible to sequentially define the entire problem at the beginning, design the entire solution, build the software and then test the product at the end. An iterative approach is required, one that allows for increased understanding of the problem through successive refinements, enabling incremental growth and resulting in an effective solution over multiple iterations. With every release of a project an executable is produced. As projects progress and the application grows iteratively, the test suite grows as well.

2.4 Lack of experienced automation resources

There is a scarcity of automation experts who are technically sound and also have a tester’s mindset.

2.5 Change Requests

Requirements change dynamically during the later stages of a project, which affects the scripts already automated.

2.6 Maintenance effort

Generally the maintenance effort exceeds the initial automation effort. In the end, team members feel it is better to redo everything from scratch rather than carry out maintenance. Many of us have felt the same while maintaining someone else’s scripts, or sometimes even our own.

2.7 Tedious

Automation becomes tedious and difficult to continue as the complexity of the functionality grows. When automation candidates are lengthy and contain too many verification points, a test that could be executed manually in 30-45 minutes can take a whole day to automate.

2.8 Too many failures

It becomes almost impossible to run the suites uninterrupted, as scripts fail for trivial reasons; unattended playback becomes theoretical. I have often observed people leaving for home after starting the suites, hoping to see the test log the next morning, but the suite does not run for long and fails because of some silly mistake in the code.

2.9 Difficult to debug

It is often difficult to debug a script written by someone else, or even your own script a few days or months after coding it. To make a small change, you may need to run through 300-400 lines of code if no documentation has been done.

3. Proposed Solution

ASIF (Application & Script Independent Framework) is an automation framework that has evolved with all the problems mentioned above in mind. It leverages RDBMS concepts and argues that the overall effort can be reduced drastically, so that ROI can be achieved in the first release itself. Another important aspect is that automation does not really need people who are strong in programming. This method stores all the object properties and the test data to be used in a database.

A database schema is designed before creating the automation scripts. The schema captures the inter-dependencies among the data to avoid redundancy. Normalization makes this approach much more effective than the traditional one.

This framework requires the development of base tables, which are independent of the test automation tool used to execute the tests. These tables are also independent of the common library functions that “drive” the application under test. The script simply calls the common functions, which retrieve the data from the database and in turn feed it to the application.
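As an illustrative sketch (using SQLite as a stand-in for the chosen database, with a hypothetical `login_view` and columns `object_name`, `object_type`, `test_data`), a common library function that retrieves data from a view and feeds it to the application might look like:

```python
import sqlite3

def fetch_steps(conn, view_name):
    """Return (object_name, object_type, test_data) rows for one test case.

    The test script never hard-codes object properties; it only names
    the view that corresponds to its test case.
    """
    cur = conn.execute(
        f"SELECT object_name, object_type, test_data FROM {view_name}")
    return cur.fetchall()

def drive_application(conn, view_name, perform_action):
    """Feed every step of the view to a tool-specific action callback."""
    return [perform_action(name, obj_type, data)
            for name, obj_type, data in fetch_steps(conn, view_name)]

# Minimal demo: an in-memory database stands in for the test database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE login_screen"
             " (object_name TEXT, object_type TEXT, test_data TEXT)")
conn.execute("INSERT INTO login_screen VALUES ('txtUser', 'TextBox', 'alice')")
conn.execute("CREATE VIEW login_view AS SELECT * FROM login_screen")

actions = drive_application(conn, "login_view",
                            lambda n, t, d: f"{t}[{n}] <- {d}")
```

In a real framework `perform_action` would call the automation tool’s API (typing into a text box, clicking a button); here it simply echoes the intended action.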

Architecture Diagram

Figure 2. Architecture Diagram

This paper is organized by the normal steps that we all use for our automation frameworks, with special notes on the considerations and challenges that are particular to test automation:

Implementation
Advantages
Conclusion
References

4. Implementation

4.1. Analysis:

Perform size estimation, resource estimation and schedule planning, identify the expected changes and select the development model. This approach yields maximum benefit especially when an iterative model is used.

4.2. Identify the automation candidates

Select automation candidates from the entire set of test cases. It is best to automate regression test suites, as they need to be executed for each build, in all the subsequent releases.

4.3. Identify all the possible screens in the application

These are the screens to be navigated across the application in order to execute the selected test cases.

4.4. Record objects properties for the selected screens

Use the tool to capture object properties such as object name, object type and recognition method for the screens within scope.

Implementation Stages

Figure 3. Implementation Stages

4.5. Select an appropriate Database

Choose a database depending on the maximum number of users in the team.

4.6. Design database schema & create base tables for all the selected screens

Design the base tables, selecting appropriate data types, field lengths and a standard naming convention. Populate the tables with the captured properties.
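A minimal sketch of one base table, again with SQLite standing in for the chosen database and a hypothetical `tbl_Login` layout (one row per object, in interaction order):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical naming convention: tbl_<ScreenName>, one row per object.
conn.execute("""
    CREATE TABLE tbl_Login (
        step        INTEGER PRIMARY KEY,  -- order of interaction
        object_name TEXT NOT NULL,        -- name captured by the tool
        object_type TEXT NOT NULL,        -- TextBox, Button, ...
        test_data   TEXT                  -- value to enter, if any
    )""")
conn.executemany(
    "INSERT INTO tbl_Login VALUES (?, ?, ?, ?)",
    [(1, "txtUserName", "TextBox", "alice"),
     (2, "txtPassword", "TextBox", "secret"),
     (3, "btnSignIn",   "Button",  None)])
row_count = conn.execute("SELECT COUNT(*) FROM tbl_Login").fetchone()[0]
```

The `step` column preserves the interaction order, so a script replaying the table drives the screen in the same sequence a manual tester would.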

4.7. Create View by joining required base tables

 Design views by joining Base tables, based on the test case flow.
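One way to sketch such a view (hypothetical names throughout; here the “join” is modeled as a UNION of the base tables in the order the test case visits the screens, with an explicit sequence column to keep the flow deterministic):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two base tables for the two screens this test case visits.
conn.execute("CREATE TABLE tbl_Login  (step INTEGER, object_name TEXT, test_data TEXT)")
conn.execute("CREATE TABLE tbl_Search (step INTEGER, object_name TEXT, test_data TEXT)")
conn.execute("INSERT INTO tbl_Login  VALUES (1, 'txtUserName', 'alice')")
conn.execute("INSERT INTO tbl_Search VALUES (1, 'txtQuery', 'laptops')")
# The view stitches the screens together in test-case order.
conn.execute("""
    CREATE VIEW vw_TC_001 AS
        SELECT 1 AS seq, step, 'Login' AS screen, object_name, test_data
        FROM tbl_Login
        UNION ALL
        SELECT 2, step, 'Search', object_name, test_data
        FROM tbl_Search
""")
flow = [r[0] for r in conn.execute(
    "SELECT screen FROM vw_TC_001 ORDER BY seq, step")]
```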

4.8. Include common library

Common functions are to be used across all the test cases. These functions are independent of the projects and writing them is a one-time activity.

4.9. Write scripts for test cases

Include the library files and just give calls to the common functions in the library. Scripts are restricted to just a couple of lines that call reusable functions.

Test Database Implementation Steps

Figure 4. Test Database Implementation Steps

Architecture Diagram

Figure 5. Architecture Diagram

Database Design Diagram

Figure 6. Database Design Diagram

4.10 Schema Design:

The following activities comprise schema design:

4.10.1 Base Tables: These tables contain the object names, object types and test data for each screen in the application. The first row of every base table always contains the object names and object types.

4.10.2 Logical Views: We can design views by joining different tables according to the need of any particular Test Case.

Scenario 1: Change in Application-Under-Test:

If an object name or an object type is changed by a developer in any of the subsequent releases, then the tester has to open the corresponding base table and make the changes only at one place. The changes made to the base tables will automatically get reflected in all the views using the base table and all the scripts will work in exactly the same way they worked in previous releases.
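This single-point update can be sketched as follows (SQLite stand-in, hypothetical names): two views read from the same base table, and one UPDATE on the base table is visible through both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_Login (object_name TEXT, test_data TEXT)")
conn.execute("INSERT INTO tbl_Login VALUES ('txtUserName', 'alice')")
# Many views (one per test case) read from the same base table.
conn.execute("CREATE VIEW vw_TC_001 AS SELECT * FROM tbl_Login")
conn.execute("CREATE VIEW vw_TC_002 AS SELECT * FROM tbl_Login")
# A developer renames the object in the next release: fix it in ONE place.
conn.execute("UPDATE tbl_Login SET object_name = 'txtUserId'"
             " WHERE object_name = 'txtUserName'")
names = [conn.execute(f"SELECT object_name FROM {view}").fetchone()[0]
         for view in ("vw_TC_001", "vw_TC_002")]
```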

Scenario 2: Change in the Test Cases or flow

A view corresponds to one particular test case; if a test case changes, the tester just has to modify the corresponding view and need not worry about the scripts. Whatever is changed in the view gets reflected in the joined base tables. This reduces the effort when a particular step in a test case is modified or added.

Generation Automation Flow

Figure 7. Generation Automation Flow

5. Advantages

5.1. Normalization:

Designing the database schema with normalization eliminates the chances of inconsistency and redundancy to a large extent. We can reduce effort by using the advantages of an RDBMS and can reduce the number of connections in any script. The approach uses “views”, which are nothing but logical combinations of tables. A common set of tables (the database) is used by all the users instead of each user creating individual data pools.
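A small sketch of the normalization payoff (hypothetical `credentials` and `module_data` tables): shared data is stored once and referenced by key, so one update serves every module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Instead of repeating the login credentials in every module's data pool,
# they live once and are referenced by key (a normalized layout).
conn.execute("CREATE TABLE credentials"
             " (cred_id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("CREATE TABLE module_data"
             " (module TEXT, cred_id INTEGER REFERENCES credentials)")
conn.execute("INSERT INTO credentials VALUES (1, 'alice')")
conn.executemany("INSERT INTO module_data VALUES (?, 1)",
                 [("Orders",), ("Billing",)])
# One update serves every module that shares the credential.
conn.execute("UPDATE credentials SET username = 'bob' WHERE cred_id = 1")
users = [r[0] for r in conn.execute(
    "SELECT c.username FROM module_data m"
    " JOIN credentials c ON m.cred_id = c.cred_id")]
```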

5.2 Application & Script independent:

We must strive to develop a single framework that grows and continuously improves with the future projects that challenge us. This approach makes the test script completely independent of the application. Script creation becomes a one-time effort, and the maintenance effort is reduced when changes are made.

ASIF Break-Even Point

Figure 8. ASIF Break-Even Point

Fortunately, this heavy initial investment is mostly a one-time deal. Once in place, ASIF is arguably the easiest of the existing automation frameworks to use, and it provides great potential for long-term success.

Scenario 1: Assume that 1000 test cases share the same precondition of logging in to the application. If the object name for Login changes, the users need not touch 1000 scripts; with ASIF they just change the object name in the single base table holding the Login screen’s information, and the change is reflected in all 1000 scripts.

Effort Reduction using ASIF

Figure 9. Effort Reduction using ASIF

5.3. Data storage and Retrieval efficiency

A database can handle large amounts of data efficiently, and data retrieval is much faster. In case of changes, data and object names need to be changed in only one place, which makes this approach very attractive. It also avoids connecting to more than one data source from the same script.

5.4. Views & Queries

Views can be created from base tables, with each view corresponding to one test case. This makes the script very simple to understand, as it fetches data from only one source, customized to the test case. A view does not consume space, as it is just a logical table. In case of any change, only the views need be updated, and all the base tables associated with them are automatically updated, and vice versa.

Scenario 1: Consider an object on a screen stored in a single base table, with multiple views accessing that table. Scripts refer to the corresponding logical views to get the required object name, object type and test data. If the object type changes, the tester just changes the object’s details in the corresponding base table, and the change is reflected in all the views and hence in the scripts.

5.5. Logical view and flow of the application

ASIF provides the user with a logical view of the overall application, which is easier to grasp for new resources, reviewers and, most importantly, the clients.

The framework requires the design of a database schema, which is a time-consuming process and needs a good understanding of database concepts. Once designed, however, the schema can be reused across builds with new change requests, and making further changes is neither difficult nor time consuming, which makes the framework very useful in the long run.

5.6. Modularity

Scripts are broken down into modules that are independent of each other, so the same module can be called from multiple scripts; this also makes debugging the scripts very easy.

This insulates the application from modifications in the component and provides modularity in the application design. The test script modularity applies the principle of abstraction or encapsulation in order to improve the maintainability and scalability of automated test suites.

Performance based on Complexity

Figure 10. Performance based on Complexity

6. Conclusion

ASIF requires relatively more effort the first time, but it greatly reduces the maintenance time for subsequent releases. This is ideal especially when the development team follows an Iterative / Agile / SCRUM / RUP model and the regression test cases need to be automated and executed for each and every build to ensure that the functionality works correctly.


Raj Kamal

Microsoft Business Intelligence Engineering, Hyderabad, India

Rajkamal@microsoft.com