System testing plays a major role in building customer satisfaction with a product, as the user base of an application grows day by day.
In current practice, we have approaches and methodologies designed to validate the coverage of test cases in functional (black box) testing. For example, software engineering practices such as test-driven development let you write just enough code to make the tests pass. Such methods work well to ensure coverage for unit testing as well as functional testing. While approaches exist for functional testing, however, the same is missing for white box or stress testing. How does one validate stress testing or system testing?

Additionally, project test planners and project leads plan testing based on their experience and on the results of the last test execution. There is no formal data, derived from previous executions of the project, to support their plan. This incurs drawbacks during the execution phase, such as reconfiguring machines and modifying test buckets and test scripts, leading to more time spent testing the project, considerable waste of hardware and human resources, and sometimes test planners revisiting the initial plan.
The aim of the project “Improving quality of QA stress testing and Feedback mechanism” is to provide formal data to support the decisions taken during planning. This data can be refined over successive releases to get closer to the goals. The project has two stages: first, we define the metrics to collect the required data; second, we retrieve information from the data and give feedback on the current quality of testing.
A. What is done in the project to achieve the goal:
First, we define the characteristics of a test project taken as an example.
A quality test project goes through several phases during product testing: a planning phase, where test planners determine release schedules and hardware and human resources; a design stage, where test scripts and machines are set up; an execution stage, where tests are run and results verified; and a final feedback stage, where we see what was beneficial and what caused problems during the test runs.
B. Understanding the currently running workload
During the test runs, first understand the currently running workload. The product being tested will have subcomponents. What is the stress level on each component? Use performance monitoring tools to obtain this data, and build a table of load generated per component.
Secondly, gather statistics on how many defects or problems are found during the tests of these components. Extend the workload table above with the number and severity of defects for each component.
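The two steps above can be sketched as follows. This is a minimal illustration, assuming per-interval load readings from a performance monitor and a defect list per component; all component names and numbers are made up for the example.

```python
# Hypothetical sketch: combine per-component load readings with defect
# counts into one feedback table, as described in the two steps above.

def build_component_table(load_samples, defects):
    """load_samples: {component: [load reading per interval]};
    defects: {component: [(defect_id, severity), ...]}."""
    table = {}
    for component, samples in load_samples.items():
        found = defects.get(component, [])
        table[component] = {
            "avg_load": sum(samples) / len(samples),
            "peak_load": max(samples),
            "defects": len(found),
            # severity 1 is the highest, so the "worst" severity is the minimum
            "max_severity": min((sev for _, sev in found), default=None),
        }
    return table

# Illustrative monitoring data for two components:
load = {"VMM": [40, 55, 60], "TCPIP": [10, 12, 8]}
bugs = {"VMM": [("D-101", 2), ("D-102", 1)]}
table = build_component_table(load, bugs)
```

A row of this table directly answers the planner's questions: how hard a component was driven, and whether that stress yielded defects.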
In this project, we present the idea of employing a feedback mechanism using metrics such as the level of load put on the system per test case, the number of defects along with their severity, the component affected, etc.
The idea is to collect data from these different spaces and use it to take decisions on validating, refining, and planning tests.
By referring to TABLE I and TABLE II and applying our proposed selection criteria, we can see whether our test cases are actually putting load on the required component, whether they are putting enough load and should be refined, or whether they are still good as is because they yield defects.
Test planners can now use the table data to:
1. Validate the stress test cases based on their performance
2. Validate, refine, and prioritize test cases, achieving quality testing.
In short, this is a methodology for a feedback mechanism in a system test environment.
In system testing, it is important to know how much a specific component is being stressed, i.e., which test case puts load on which component.
The idea proposed here is a new form of feedback mechanism in the system test cycle. This feedback mechanism helps in identifying and building a table of the stress generated on the system versus the test case executed, in taking a data-driven approach to planning the machine configuration required to run a set of test cases, and in optimizing the test cycle time. Moreover, knowing which component(s) are stressed by which test case(s) or set of test cases helps a tester plan testing effectively and do creative or ad hoc data-driven testing; it can also give pointers while recreating a defect or while trying to stress specific component(s).
Looking at the same data from a different point of view, we can also identify which test cases are not stressing the system enough. This helps us refine the test script or test bucket.
To achieve this feedback mechanism, we propose the team do the following:
1) Step 1 – Identify workload on various components
Taking the stress testing of a Unix-based operating system as an example test product:
a) Identify a performance tool that can give the clock ticks used by a function, e.g. tprof on AIX.
b) Collect the function-level details about kernel and user space per unit time.
c) Map these functions to operating system components such as VMM, TCPIP, etc.
d) Combine the details collected for each component per unit time (say, data collected every minute) over the complete cycle. E.g., if data is collected every minute and the test run time is 72 hours, combine all the per-minute reports over the 72 hours for each component to get the complete run data.
e) Repeat these steps for every test case, and then for the combination of test cases the team has been using so far.
f) Plot a table of test case(s) versus component, filled with the number of CPU ticks taken by each component while that test case was executing, resulting in TABLE I.
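Steps (b) through (f) can be sketched as follows. This is a minimal illustration: the function-to-component map and the report format are assumptions for the example, not the actual output format of tprof.

```python
# Hypothetical sketch of steps (b)-(f): map per-function tick counts
# (as a tprof-style report might give) to OS components, and sum them
# over all per-minute reports of a run to get one TABLE I row.

FUNC_TO_COMPONENT = {          # step (c): function -> component map
    "vm_alloc": "VMM",
    "vm_free": "VMM",
    "tcp_send": "TCPIP",
    "vfs_read": "filesystem",
}

def combine_reports(reports):
    """reports: list of {function: ticks} dicts, one per minute (step (d))."""
    totals = {}
    for report in reports:
        for func, ticks in report.items():
            comp = FUNC_TO_COMPONENT.get(func, "other")
            totals[comp] = totals.get(comp, 0) + ticks
    return totals

# Two one-minute reports collected while one test case was running:
minute1 = {"vm_alloc": 120, "tcp_send": 30}
minute2 = {"vm_free": 80, "tcp_send": 50, "vfs_read": 10}
row = combine_reports([minute1, minute2])
```

Running this once per test case (step (e)) yields one row per test case; the rows together form TABLE I.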
TABLE I. (1ST HOUR)
2) Feedback from Step 1:
The above example is for the 1st hour of the run; we can continue to collect data for the next few hours and combine it.
It is evident from the above that if we have to stress the sysproc component, we need to use the T2 test bucket. Similarly, for filesystem we will use the T4 test bucket.
3) Step 2 – Load alone is not the only criterion for refining a test bucket; therefore we propose an additional matrix: defect, severity, and component versus test bucket:
This matrix will tell us whether, irrespective of the load generated, a bucket is still able to catch bugs in the system, whether due to interactions, unstable functionality, timing, etc.
Assign reasonable weightage points to the severity of defects and derive final points for each test bucket Tn.
A severity 1 defect can be assigned 3 points, a severity 2 defect 2 points, and a severity 3 defect 1 point. Final points for a test bucket = the sum, over severities, of (number of defects × severity points).
4) Feedback from Step 2:
We can grade the test buckets relative to one another as HIGH, MEDIUM, or LOW based on the final points achieved.
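The Step 2 scoring and grading can be sketched as follows. The severity weights come from the text above; the HIGH/MEDIUM thresholds and the defect counts are illustrative assumptions, since the paper grades buckets relative to one another.

```python
# Hypothetical sketch of the Step 2 scoring: weight defects by severity
# (sev 1 -> 3 pts, sev 2 -> 2 pts, sev 3 -> 1 pt), sum the points per
# test bucket, then grade HIGH/MEDIUM/LOW. Thresholds are assumptions.

SEVERITY_POINTS = {1: 3, 2: 2, 3: 1}

def bucket_points(defects_by_severity):
    """defects_by_severity: {severity: defect count} for one bucket."""
    return sum(SEVERITY_POINTS[sev] * count
               for sev, count in defects_by_severity.items())

def grade(points, high=10, medium=5):
    if points >= high:
        return "HIGH"
    if points >= medium:
        return "MEDIUM"
    return "LOW"

# Say T2 found two sev-1 defects and one sev-3; T4 found one sev-2:
t2 = bucket_points({1: 2, 3: 1})   # 2*3 + 1*1 = 7 points
t4 = bucket_points({2: 1})         # 2 points
```

With these numbers, T2 grades MEDIUM and T4 grades LOW, even though T4 may put more load on its component; both signals feed the refinement decision.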
With the data collected from the above procedure, we get the following benefits:
a) Prioritize testing
A possible approach to prioritizing the test cases:
b) Effectively plan the testing in terms of the machine configuration required, and set tests to run on it.
Index: H = High, L = Low, M = Medium, N/A = Not applicable
By building tables for different configurations, we will know how each test case performs in each configuration.
The configuration in which a test case puts maximum load is good for its testing.
For a configuration in which the test case puts lesser load, we can review whether to add some additional test cases to it.
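Choosing a configuration from these tables can be sketched as follows; the configuration names and load figures are illustrative assumptions.

```python
# Hypothetical sketch: given the load a test case generates on its
# target component under several machine configurations, pick the
# configuration where it generates the most load.

def best_configuration(load_by_config):
    """load_by_config: {config_name: load on the target component}."""
    return max(load_by_config, key=load_by_config.get)

# Load T2 puts on sysproc under three hypothetical configurations:
t2_loads = {"2-CPU/4GB": 35, "8-CPU/16GB": 72, "4-CPU/8GB": 58}
chosen = best_configuration(t2_loads)
```

The remaining configurations, where T2 leaves capacity unused, are the candidates for adding further test cases alongside it.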
c) Achieve effective data-driven testing
By referring to TABLE I, we will know which test case is optimal to run on which machine.
A tester can also attempt a defect recreate effectively, since we know which test case stresses which component and set of functions; hence we can target the required function effectively.
d) Refine/optimize test scripts
Knowing the load put on the system and the number of defects found by a test case or test bucket, we can optimize it to put more load on the system, or keep it as is if it already finds a good number of defects.
e) Testers can come up with new test scenarios, or combinations of test cases, with data supporting their ideas.
With the data in table form, a tester can try new combinations of test cases to stress the system in different ways.
They can even focus on running test cases that interact with multiple components and share some common components, to see if their interaction is stable under stress.