Agile testing is a practice not fully described by many Agile methodologies, even though it is the basis for continuous delivery. The only way to provide continuous feedback to the client is to have high-quality software running all the time by using automation and formal end-to-end tests, while maintaining a robust architecture through intense refactoring. This paper is about how a software development company used agile tests and evolved them for more than 20 years. In this story, we will focus on one of the biggest challenges: How to scale tests and have quick feedback when running fast. And then, how can we continue to run faster and faster?
We conclude with a step-by-step guide for those who are having difficulties scaling tests and want to build a continuous, quick, and reliable quality process.
“Quality is not an act, it is a habit” —Aristotle
We firmly believe in the above sentence. The quality of a product cannot be perceived as an extra cost; it is an integral part of its structure. It should not be optional.
After more than 20 years of developing software, we have learned that the sooner you invest in tests, the better, more secure, cheaper, and more robust your software will be. So, that’s our way.
As developers, we discovered the power of software test engineering long before the Agile Manifesto. Using an evolutionary and empirical process, we began by using private and manual tests. We evolved to formal software tests and automation. We built frameworks for creating new test models and types, and evolved our architecture to support complex and large-scale software development.
Today we act as agile coaches for big companies. We help them in agile transformations, always using tests as the basis for any continuous deployment process.
Our story begins in 1998. At that time we were part of the software development company Objective Solutions, working on a product that had been in production for around 4 years. This software was running at a huge telecom company in Brazil. Its core function was to manage hundreds of thousands of subscriptions.
Objective Solutions was responsible for software maintenance and improvements. We had around 50 collaborators, 20% of them working on manual quality assurance. This was a high percentage dedicated to tests, but unfortunately, it was not enough to prevent critical bugs, production stoppages, and sometimes chaos in the deployment process.
We needed a more robust solution to testing and quality. So, after some study and a small time investment, we developed our first Test Framework Tool that allowed the team to automate the scripts and make the test phase run faster. It was the birth of our first tool and the most crucial step towards supporting software delivery on the large scale we have today.
Figure 1. Test Framework Screen
3. RUNNING: HOW CAN WE GO FAST?
Most companies, when trying to move to a test-driven culture, give up when the running time for tests increases dramatically, slowing their development cycle. This is one of the biggest problems, especially because the most common way to implement tests is to run ‘integrated tests’ that record system actions and attempt to reproduce these scenarios during the execution of the test suite.
We cannot recommend fixing this by focusing only on unit testing. Integrated tests are really what is important to the user. To the system user it doesn’t matter if our classes and methods are clean and elegant. They want to validate that the features ‘end-to-end’ in the system are running well. And that must be the primary focus of an Agile Test culture.
So, what can we do? How can we make it easy to build a test process that runs more efficiently?
Here is a little bit of our history and how we evolved our process, tools, and practices, answering important questions step by step along the way.
3.1 The database schema evolution
When we started our software company in 1995, we ran our tests manually using ‘client populated’ database copies. Our initial approach was to think that we should run our tests in an environment as close to the production environment as possible.
That was a mistake.
The client database was hard to replicate, the data was highly customizable, and it was very complex to maintain consistency while tests changed the state of the database during their execution. As a consequence, it took a long time to recreate new test environments.
Our solution to this problem was to develop a ‘database schema evolution’ framework - a mechanism that maintains the version of the database in sync with the system version. This tool was also responsible for creating a new clean database every time we needed one.
That technique enabled us to also achieve a new level of quality gate: the capacity to refactor the system when there is a database dependency. For instance, when you want to break some large class into several smaller classes, you can also synchronize this code with the migration of database tables.
This approach, in conjunction with production environment versioning, provided the control for what changes must be updated in the database to run some test scenarios, or to deploy something to the production environment.
We changed our deploy process because of our new test process requirements, and that increased our trust in the consistency of the data schema with the software system version.
Today many off-the-shelf frameworks make this process easy for various languages and databases. Don’t hesitate to use them or create your own.
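The schema evolution mechanism described in this section can be sketched as follows. This is a minimal illustration, not the framework we built: the `MIGRATIONS` list, the `schema_version` table, and the table names are assumptions made for the example, and SQLite stands in for the production database only to keep the sketch self-contained.

```python
import sqlite3

# Hypothetical ordered migrations: (version, SQL). Table names are illustrative.
MIGRATIONS = [
    (1, "CREATE TABLE subscription (id INTEGER PRIMARY KEY, customer TEXT)"),
    (2, "ALTER TABLE subscription ADD COLUMN plan TEXT DEFAULT 'basic'"),
    (3, "CREATE TABLE invoice (id INTEGER PRIMARY KEY, subscription_id INTEGER, amount REAL)"),
]


def current_version(conn):
    """Return the schema version recorded in the database (0 if none)."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn, target=None):
    """Apply every pending migration, in order, up to `target`."""
    if target is None:
        target = MIGRATIONS[-1][0]
    applied = current_version(conn)
    for version, sql in MIGRATIONS:
        if applied < version <= target:
            conn.execute(sql)
            conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
    conn.commit()
    return current_version(conn)


def fresh_database(target=None):
    """Create a new, clean database at the schema version a system release expects."""
    conn = sqlite3.connect(":memory:")
    migrate(conn, target)
    return conn
```

Calling `fresh_database(target=2)` yields a clean database matching system version 2, and a later `migrate(conn)` brings the same database up to the latest version, which is the property that keeps test and production schemas in step with the code.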
Figure 2. Test performance after Database Schema completed
3.2 The fake GUI tests
One of the first types of test improvement is the automation of scripts to run the ‘integrated tests’. It is central to regression tests executed by the QA team.
Well, those tests are usually the ones that take the most time to run. The main cause is that screen updates require a lot of computation and computer resources. Prior to the year 2000, this was much more complicated.
Our solution to this performance bottleneck? We asked ourselves a basic question: while we need the screen to operate the system, do we really need to see it while running the test? The answer was no. So, we changed the software architecture of our screen components to modify their behavior. When in test mode, the GUI components simply discard the draw commands and don’t update the screen anymore.
This was a deep change, but it was a simple solution to our performance problem, and it cut our test running time in half.
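The idea of a GUI component that discards draw commands in test mode can be sketched as follows. The `Screen` and `TextField` classes and the `test_mode` flag are hypothetical names chosen for the illustration; they are not the components of our system.

```python
class Screen:
    """Stand-in for a real drawing surface; it only counts draw calls."""

    def __init__(self):
        self.draw_calls = 0

    def draw_text(self, x, y, text):
        self.draw_calls += 1  # a real screen would do expensive rendering here


class TextField:
    """A widget whose state changes normally but whose drawing can be switched off."""

    def __init__(self, screen, test_mode=False):
        self.screen = screen
        self.test_mode = test_mode
        self.value = ""

    def set_value(self, value):
        self.value = value  # the behavior the tests care about
        self._repaint()

    def _repaint(self):
        if self.test_mode:
            return  # test mode: discard the draw command, never touch the screen
        self.screen.draw_text(0, 0, self.value)
```

A test script still drives the widget through `set_value` and inspects `value`, so the functional path is exercised while the rendering cost disappears.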
After this problem was solved, we were running faster. But as the number of test scenarios grew, we identified a new bottleneck: I/O operations.
Figure 3. Test performance after fake GUI tests
3.3 The environments tree
Although discarding the commit command increased our test performance, we were still spending a lot of time interacting with the database. So, we started some changes in the test framework architecture. We measured that more than 85% of test run time was spent creating the base dataset. That dataset could not be discarded, but maybe it could be optimized.
Our solution: create an environment tree of datasets, in which each test scenario is attached to the dataset it depends on, similar to the figure below. Follow the number sequence to identify the order of execution:
Figure 4. Each color represents a dataset where a subset of scripts can run
Using this approach, scenarios with the same dependencies could be run in the same set, optimizing the time to populate the data. This greatly decreased the setup time for the test scripts simply by organizing the test execution sequences.
One important note is that we used a mechanism available in our database systems (Oracle and SQL Server): ‘the savepoint’. Using this, we could easily discard the last sets and create others.
With dataset dependency behind us, we ran across another issue: client customizations and parameters.
Figure 5. Test performance after environments tree completed
3.4 The database or memory
Exercising our continuous ‘asking why’ questions, we asked, “Why do we really need the database?” Again the answer was similar to the last question and so, we asked, “Can we discard the database in test mode?”
Well, here the answer was not so simple.
While the ‘commit change’ only intercepted a command and was not system dependent, the database is usually a part of the system architecture, and to remove it you need to be careful with the consequences.
Using a parsimonious evolutionary approach, we started a change to remove direct dependencies from our system to the database and enable it to run directly in memory.
In memory mode there are no multi-user transactions and no fault tolerance, and we are limited by available RAM. But it was sufficient for testing purposes.
This change was different from the other approaches because in this case we were aware of the impact of database independence. The system might behave differently from how it was designed. So our solution was to implement a flag: enable/disable database. The developers and QAs could choose which mode they wanted to use: database or memory.
During the build process, database mode is mandatory, to be sure that we are running in an environment close to production.
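The database/memory flag can be sketched as two interchangeable stores behind one interface. The class names, the `make_store` factory, and the `is_build` parameter are assumptions made for this illustration; SQLite again stands in for the real database server.

```python
import sqlite3


class MemorySubscriptionStore:
    """Memory mode: fast, single-user, lost when the process ends."""

    def __init__(self):
        self._plans = {}

    def save(self, sub_id, plan):
        self._plans[sub_id] = plan

    def plan_of(self, sub_id):
        return self._plans.get(sub_id)


class DatabaseSubscriptionStore:
    """Database mode: same interface, backed by SQL."""

    def __init__(self, conn):
        self._conn = conn
        conn.execute("CREATE TABLE IF NOT EXISTS subscription (id INTEGER PRIMARY KEY, plan TEXT)")

    def save(self, sub_id, plan):
        self._conn.execute(
            "INSERT OR REPLACE INTO subscription (id, plan) VALUES (?, ?)", (sub_id, plan)
        )
        self._conn.commit()

    def plan_of(self, sub_id):
        row = self._conn.execute(
            "SELECT plan FROM subscription WHERE id = ?", (sub_id,)
        ).fetchone()
        return row[0] if row else None


def make_store(use_database, is_build=False):
    """Honor the developer's flag, except in the build, where the database is mandatory."""
    if is_build or use_database:
        return DatabaseSubscriptionStore(sqlite3.connect(":memory:"))
    return MemorySubscriptionStore()


def upgrade_plan(store, sub_id):
    """Business rule under test; it never knows which store it is talking to."""
    if store.plan_of(sub_id) == "basic":
        store.save(sub_id, "premium")
```

The same test script can then run against either mode, which is what lets developers iterate quickly in memory while the build still validates against the database.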
The four topics above are related to technological changes to increase the performance of running tests.
However, sometime between 2002 and 2003, we discovered that this approach was not sufficient when you need to run thousands of scenarios. We were still spending too much time.
Figure 6. Test performance after DB or memory option implemented
3.5 The client configuration
Our company developed one big product to serve big telecoms in Brazil: an SMS/Billing system designed for the core business of those companies.
This type of product usually shares the same codebase across installations, yet each customer configures hundreds or thousands of parameters according to their context. The question was: How can we cover all possible parameter combinations in tests? It is an exponential problem.
Our solution: We exercised the ‘Why’ game again, “Why do we need to test all combinations? Can we get by testing only the currently supported configurations?”
So, we created a module to separate the parameter configurations from the system load and a mechanism to load a configuration at test time. At test time we just load the configurations from real clients and run all the scripts. The idea is simple: we run tests using all real client configurations, no fake configurations anymore.
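The difference between testing every parameter combination and testing only real client configurations can be sketched as follows. The client names, parameters, and the `monthly_charge` rule are invented for the example and do not come from our product.

```python
def combinations_for(n_boolean_params):
    """Number of configurations if every on/off parameter combination were tested."""
    return 2 ** n_boolean_params


# Hypothetical configurations, standing in for those loaded from real installations.
CLIENT_CONFIGS = {
    "telecom_a": {"proration": True, "tax_rate": 0.10},
    "telecom_b": {"proration": False, "tax_rate": 0.25},
}


def monthly_charge(price, days_used, days_in_month, config):
    """Illustrative billing rule whose result depends on client parameters."""
    base = price * days_used / days_in_month if config["proration"] else price
    return round(base * (1 + config["tax_rate"]), 2)


def run_for_all_clients(scenario, configs):
    """Run one test scenario once per real client configuration."""
    return {client: scenario(config) for client, config in configs.items()}


def half_month_scenario(config):
    return monthly_charge(100.0, 15, 30, config)
```

With only 20 on/off parameters, `combinations_for(20)` is already 1,048,576 configurations, while `run_for_all_clients(half_month_scenario, CLIENT_CONFIGS)` runs the scenario exactly once per supported client.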
While the first four topics above are about technology, the next two are about architecture. At some point, you find that no further big change will make you more efficient; you need to increase your power of execution. Let’s talk about it.
Figure 7. Test performance after client configuration parameterization
3.6 The distribution
After the use of ‘environments tree’ (Figure 5) and ‘client load policy’ (Figure 7), we were very close to a big change in our approach to running automated tests. We were able to distribute to different machines. Since each machine could be responsible for one or a set of ‘environment branches,’ we could run a complete suite of tests in distributed mode.
Our implementation of this was to simply create a mechanism to consolidate the results of a unique execution set. Having a server to orchestrate the distribution enabled a new way to run our tests and increased our performance by an order of magnitude.
With this change, the execution time of 10,000 tests dropped from 22 hours to 3 hours.
This change occurred around 2009 and has been one of the most significant improvements in our test culture.
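Distributing environment branches across machines and consolidating the results can be sketched as below. This is only an illustration under assumptions: threads stand in for separate machines, the greedy assignment is one possible policy rather than the one our server used, and all names and durations are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def assign_branches(branch_minutes, n_workers):
    """Greedy split: longest branch first, always to the least-loaded worker."""
    workers = [{"branches": [], "minutes": 0} for _ in range(n_workers)]
    ordered = sorted(branch_minutes.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, minutes in ordered:
        target = min(workers, key=lambda w: w["minutes"])
        target["branches"].append(name)
        target["minutes"] += minutes
    return workers


def consolidate(results):
    """Merge per-branch results into one report for the whole execution set."""
    total = {"passed": 0, "failed": 0}
    for result in results:
        total["passed"] += result["passed"]
        total["failed"] += result["failed"]
    return total


def run_distributed(branch_minutes, n_workers, run_branch):
    """Run each worker's branches in parallel and return the consolidated report."""
    plan = assign_branches(branch_minutes, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        batches = pool.map(lambda w: [run_branch(b) for b in w["branches"]], plan)
        results = [result for batch in batches for result in batch]
    return consolidate(results)
```

Here `run_branch` is whatever executes all scripts of one environment branch and returns its pass/fail counts; the wall-clock time of the suite becomes the load of the busiest worker rather than the sum of all branches.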
The distribution (Figure 4) in 2011 was running on some clusters in our own company. At that time, our main product had 15,000 functional tests and the number of tests was growing fast.
We started running the entire test suite during the night, and it took around 6 hours on a cluster. For a long period, this was our reality: we only knew the complete result the next day.
Next, we increased the number of machines in the clusters and changed to running the tests continuously on them. As a result, we could sometimes have the feedback in just a few hours.
Figure 8. Test performance after DB distribution
3.7 The virtualization!
Finally, in 2013 we ‘discovered the Matrix’. With the cloud environment running well, we changed our approach to move everything to the cloud. It was not simply a new infrastructure. To use the power of the cloud environment and reduce costs, we changed our server to control the setup and teardown of the virtual machines in the cloud.
We designed an orchestrator to manage ‘spot instances’ in the cloud. With that, we were able to start and stop new environments according to our configuration for budget and performance.
This started a new age. Now, we could intercept all the code commits in our master branch automatically and then start a new environment to run the entire test suite. We could decide if we wanted dozens or hundreds of dedicated machines to process the tests and choose how many minutes to wait until we get feedback. This is only an economic decision. Now we can say we truly have a scaled automated testing process.
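The economic decision between number of machines and waiting time can be sketched with simple arithmetic. Every number and name below is hypothetical (including the per-machine startup time and the hourly price); the sketch assumes the suite splits evenly across machines and that each machine is billed for the whole feedback window.

```python
import math


def plan_fleet(total_test_minutes, target_feedback_minutes,
               startup_minutes, price_per_machine_hour):
    """Return (machines, cost) needed to finish the suite within the target time."""
    usable = target_feedback_minutes - startup_minutes
    if usable <= 0:
        raise ValueError("target feedback time must exceed machine startup time")
    # Each machine can run `usable` minutes of tests inside the feedback window.
    machines = math.ceil(total_test_minutes / usable)
    billed_hours = machines * target_feedback_minutes / 60
    return machines, round(billed_hours * price_per_machine_hour, 2)
```

Under these assumptions, a suite needing 3,000 machine-minutes with a 5-minute startup and a price of 0.10 per machine-hour takes 120 machines at a cost of 6.0 for 30-minute feedback, or 55 machines at a cost of 5.5 for 60-minute feedback: faster feedback mostly costs the extra startup overhead.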
Figure 9. Test performance after moving to the cloud
Today our main product has more than 50,000 automated functional tests and a large number of unit and performance tests. Every test runs after every code commit made by our developers. We are not limited by the growing number of test scenarios or by the number of developers working in parallel.
Finally, a continuous feedback system!
Figure 10. Current test performance
4. WINS WE DIDN'T PLAN
Along our journey implementing Agile Testing, we discovered many peripheral gains that we had not planned for but that support the quality of our software nonetheless.
4.1 Live documentation
By creating new scenarios for every new feature or bug fix, we are automatically documenting the system behavior.
It is a live document about the features of the system and it is the best way to learn how the software works.
In our company, we deliver the test scripts to the client together with the software, as its documentation. Because the scripts use a DSL (Domain-Specific Language), customers can easily read them and understand how the system responds to the commands.
4.2 The team onboarding
Another benefit of using tests as documentation follows from the above: new team members can quickly understand system rules and how our software is structured.
The onboarding process is simplified and the developers can experiment with changes without impacting the production environment. They can change part of the code and immediately validate the results in the test suite.
The test framework is our best tool to teach new members and also to do ‘spikes’ about new developments.
4.3 The sensitive code detection
Using automated tests and persisting the results of the test execution runs, you discover some interesting new insights about your code.
Comparing the historical data, we can identify ‘sensitive code’ that breaks again and again. This is a good indication that this part of the code needs refactoring or that its dependencies must be rethought.
It is another level of quality. You are not only worried about the software delivery but you can care about the development process with metrics. You can prove the need for refactoring using the cost of running failed tests.
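Mining persisted test results for code that breaks repeatedly can be sketched as follows. The record layout (`test`, `module`, `passed`) and the `min_failures` threshold are assumptions for the example, not the format our framework stores.

```python
from collections import Counter


def failure_hotspots(run_history, min_failures=2):
    """Rank modules by how many failed test results they accumulated across runs."""
    failures = Counter()
    for run in run_history:          # one entry per test suite execution
        for result in run:           # one entry per test in that execution
            if not result["passed"]:
                failures[result["module"]] += 1
    return [(module, count) for module, count in failures.most_common()
            if count >= min_failures]


# Hypothetical history of three suite executions.
HISTORY = [
    [{"test": "t1", "module": "billing", "passed": False},
     {"test": "t2", "module": "sms", "passed": True}],
    [{"test": "t1", "module": "billing", "passed": False},
     {"test": "t2", "module": "sms", "passed": False}],
    [{"test": "t1", "module": "billing", "passed": False},
     {"test": "t2", "module": "sms", "passed": True}],
]
```

With this history, `failure_hotspots(HISTORY)` singles out the billing module, which failed in all three runs, as the first candidate for refactoring.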
4.4 The software language independence
Another interesting part of our journey is about software language. We started using Smalltalk in 1995, and after 10 years, we needed to change to another language, Java.
We created a tool to automatically translate Smalltalk code to Java code and... it worked! Without adjustments after just clicking one button!
Maybe it is a good story to tell in another experience report.
5. "HOW TO" IN SMALL STEPS
Helping new teams to scale the Agile Test culture in their environments is tough; here are some tips that we believe will lead to great success:
- Automate all the tests following these 3 simple rules:
  - Only work on a bug fix after creating an automated test to replicate it
  - All new features must be covered by automated tests
  - Legacy code must be covered by tests whenever the two rules above require you to change it
- Unit tests are important and easy to build, but only functional integrated tests bring customer value
- Don’t spend much time making a GUI test. Try to refactor your system to run functional tests without screen dependency
- QA is a role, not a function. Make your QAs work together with Developers and ask for results from the whole team (QAs + Developers).
- Use mocks to emulate systems, modules, classes or methods. Making one part testable, the rest will be simpler.
- Think about tests from product inception. Your process must be oriented to tests.
- Don’t give up when you find obstacles. Maybe it is not so easy, but it is the primary practice of great and sustainable software. There is no ‘Agile’ or ‘DevOps’ without the culture of Agile Testing.
- And finally: It is not negotiable! The continuous, fully covered and automated tests are the responsibility of developers. It is unethical to deliver software without tests!
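Two of the tips above, using mocks to emulate an external system and writing the test that replicates a bug before fixing it, can be sketched with Python's standard `unittest.mock`. The `charge_subscription` function and the payment gateway are hypothetical examples, not part of our product.

```python
from unittest.mock import Mock


def charge_subscription(gateway, subscription_id, amount):
    """Code under test: charges a subscription through an external gateway."""
    if amount <= 0:
        # Guard added for a (hypothetical) bug report, after the test below existed.
        raise ValueError("amount must be positive")
    receipt = gateway.charge(subscription_id, amount)
    return receipt["status"] == "approved"


def test_charge_is_approved():
    gateway = Mock()  # emulates the external system; no network involved
    gateway.charge.return_value = {"status": "approved"}
    assert charge_subscription(gateway, 42, 10.0) is True
    gateway.charge.assert_called_once_with(42, 10.0)


def test_rejects_non_positive_amount():
    """Regression test written first to replicate the bug, then kept forever."""
    gateway = Mock()
    try:
        charge_subscription(gateway, 42, 0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for a zero amount")
    gateway.charge.assert_not_called()  # the gateway must never see a bad charge
```

Because the gateway is mocked, both tests run in milliseconds and can be part of the suite executed on every commit.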
Figure 3: Our story, always ahead of the market!
Although the use of automated tests in software is seen today as cool and trendy, that was not the reason why we started doing it more than 20 years ago. For us it was a matter of survival. We had to find a way to reduce costs and deliver fast and reliably. It was a pragmatic decision made towards achieving our business goals in an uncertain, volatile and emerging market in Brazil. For us, developing software with test automation is no longer a matter of choice. It is our natural way of developing software.
We want to express our gratitude to all who were part of this story, with an honorable mention to Klaus Wuestefeld and Paulo Peccin, whose disruptive mindset and pinch of genius started this path in our company.
After all, our report is just a compilation of thoughts and experiences coming from a great team. The whole team was a combination of restless and innovative spirits.
We have achieved success through your hard work and your constant, provocative and questioning mindset.
Thank you all for the opportunity to be part of this trajectory!
And last but not least, thank you, Curtis Michelson, our shepherd in writing this paper. Your support and comments were undoubtedly crucial to improving the quality of this report.