The cost of test automation

Over the past few posts I have written a lot about test automation; one very important subject, however, I have left out thus far: what is the actual cost of test automation? How do you calculate that cost, and how do you compare it to the overall cost of testing? In other words, how do you get to the return on investment people are looking for? The first thing you need if you want to know and understand the costs of test automation is a CLEAR understanding of the goal: cost, quality, or time to market? Typically there are three possible goals:

  1. reduce the cost of testing
  2. reduce the time spent on testing
  3. improve the quality of the software

These three have a direct relation with each other, and thus each of them also has a direct impact on the other two depending on which one you take as your main focus. In the next few paragraphs I will discuss the impact of picking one of the three as a goal.

Reduce the cost of testing

When you put the focus of your test automation on reducing the overall cost of testing, you set yourself up for a long road ahead. There is generally an initial investment needed before test automation becomes cost reducing. Put simply, you need to go through the initial implementation of a tool or framework, get to know the tool well, and ensure that the testers who need to work with it all know and understand how to work with it as well. If you go for a big commercial tool there is of course the investment in purchasing the license, which often ranges between 5,000 and 10,000 euros per seat (floating or dedicated). Assuming more than one test engineer will need to use the tool concurrently, you will always need more than one license. This license cost needs to be earned back by an overall reduction in testing costs (since that is your goal). An average investment graph will look something like this when working with a commercial tool:

(figure: Investment of test automation with a commercial tool)

The initial investment is high; this is the cost of the licenses. At that point there is nothing yet: no tests have been automated and you have already spent a small fortune. The costs will not drop immediately, since after the purchase the tool needs to be installed, people need to be trained, and so on. Meanwhile, no tests have been automated, so the cost is high but the return is zero. Once the training process has been finalised and the implementation of the automated tests has started, the cost line will slowly drop to a stable flatline. The flatline is the running cost of test automation, which includes the cost of maintaining the tool and the test scripts and, of course, the cost of running the tests and reviewing and interpreting the reports.
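To make the earn-back idea concrete, here is a minimal back-of-the-envelope calculation. Every figure in it (license price, implementation effort, running and saved costs) is an invented assumption for illustration, not a number from any real project:

```python
# Hypothetical break-even estimate for a commercial test tool.
# Every figure below is an illustrative assumption.

license_cost = 3 * 7500            # three seats at ~7,500 euros each
setup_cost = 20 * 8 * 75           # 20 days of setup/training at 75 euros per hour
monthly_running_cost = 2000        # tool and script maintenance, reporting
monthly_manual_cost_saved = 7000   # manual regression effort no longer needed

monthly_saving = monthly_manual_cost_saved - monthly_running_cost
payback_months = (license_cost + setup_cost) / monthly_saving
print(payback_months)  # 6.9 months before the investment is earned back
```

The point of the sketch is the shape of the calculation, not the numbers: the higher the running cost of maintaining scripts, the longer the flatline takes to pay for the initial spike.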

In a LinkedIn group discussion with the ominous title “Who’s afraid of test automation”, one of the more disturbing responses was the following (quoted from the group literally, with only the name changed):

Q: Who’s afraid of test automation?

A: Anyone with headcount. What would it look like if all of the testing is done by machine and there is only one person left in the organization?
Respectfully, Louise, PhD

The idea that test automation will take away testers’ work completely, and as such will drastically reduce the running costs of a test team, is a common misconception that would take more time and effort to address than I have right now. In short: test automation may reduce some of the workload, but it will not reduce your cost of testing by removing all of the testers, nor should that ever be your objective.

In my next blog post I will continue this story, focusing on “Reduce the time spent on testing”.

FitNesse – Test automation in a wiki

the assignment
When I started working at my current assignment I was told that the tools for automation had already been chosen, one of them being FitNesse. The way the tools were chosen is not my preferred way of picking a tool, and the fact that the assignment was to “implement test automation across the organization through the use of the chosen tools” made me slightly worried about whether FitNesse and the rest would indeed turn out to be the right choice.

Prior to this assignment I had heard a lot about FitNesse but had never had any hands-on experience with it, nor did I know anyone with hands-on experience with it.
Having worked with FitNesse for a few months now, I feel the time has come to share my thoughts on it: what I like, what I believe is up for improvement, how it is working for me so far, and so on.

learning curve
Getting started with FitNesse was not all that intuitive. Getting it running is easy enough, but once it is running it is not clear where to start and where to go from the FrontPage. Since we were not planning to use the standard fixtures but instead to create our own, we started on the fixture side rather than with FitNesse itself. Once we had created a generic login functionality in the fixture, translating actions back into FitNesse became a lot more intuitive.

The base fixtures such as the DoFixture, WebFixture, etc. are very powerful in themselves; I feel, however, that they somewhat miss the point of automating in clear text: the tests are not easy to read, logical to follow, or intuitive to write. We chose to work with SLIM rather than FIT, since FIT gives too much flexibility in the use of (almost) pseudo-code. The examples used in the acceptance tests in FitNesse are not clear enough for our purpose at this client. The test team is, to say the least, not very technically inclined, and examples such as the one below do not really help them very much:

This is still somewhat readable

!|Response Examiner.|
|type  |pattern|matches?|contents?|
|contents|Location: LinkingPage\?properties|true||

A while loop implemented in FitNesse, however, quickly turns into black magic in the hands of the technically less inclined:

|put book in the cart|i|
|confirm selection|i|

With our custom implementation we now have test cases that can be read, and quite well understood, by most people within the organization, for example the scenario below for transferring money from one account to another:

|Scenario|Transfer Money|amount|From Account|accountFrom|To Account|accountTo|With Description|desc|
|Start               |TransferMoney |
|Go To|
|Select Van Rekening |@accountFrom |From Dropdown|
|Select Naar Rekening|@accountTo|From Dropdown|
|Enter Bedrag        |@amount|In Textbox|
|Enter Omschrijving  |@desc|In Textbox|
|Click On Verwerken|
|Select Content Frame|
|Is Text             |Het effect van deze overboeking op uw vrije bestedingsruimte is|Present|
|Click On Verwerken|
|Start               |CurrentAccount|
|go to|
|check               |Get data from column|Bedrag|3|
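For readers unfamiliar with how such a table is executed: SLIM maps each table row to a method call on a fixture class. The sketch below is only a conceptual illustration of that row-to-method mapping, written in Python with made-up names; our real fixtures are classes registered with FitNesse, not this standalone snippet.

```python
# Conceptual sketch of how SLIM-style table rows map to fixture methods.
# Class and method names are hypothetical, not our actual fixture code.

class TransferMoneyFixture:
    def __init__(self):
        self.calls = []  # records actions; a real fixture would drive the UI

    def select_from_dropdown(self, field, value):
        # |Select <field>|<value>|From Dropdown|
        self.calls.append(("select", field, value))

    def enter_in_textbox(self, field, value):
        # |Enter <field>|<value>|In Textbox|
        self.calls.append(("enter", field, value))

    def click_on(self, button):
        # |Click On <button>|
        self.calls.append(("click", button))

fixture = TransferMoneyFixture()
fixture.select_from_dropdown("Van Rekening", "savings")
fixture.enter_in_textbox("Bedrag", "100.00")
fixture.click_on("Verwerken")
print(len(fixture.calls))  # 3
```

Because the table cells and the method names line up, a tester who can read the table can follow the fixture, and vice versa; that is what keeps the scenario readable for the non-technical part of the team.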

Having started with Selenium as the driver beneath FitNesse, we were able to quickly build out quite a lot of test cases for the web applications. Part of the beauty of FitNesse, in my opinion, is that it is driver agnostic. In other words, it doesn’t really care what the system under test is, be it a website, a Java applet, a database, or a desktop application. We are currently starting work on TIBCO interfaces and will soon have to move on to Delphi and C# desktop applications. With many traditional test automation frameworks this would force us to start working either with a different tool or at least in quite a different way. The great thing about FitNesse is that it is so flexible that we can not only test desktop applications, we can also test across several parts of the platform: for example, execute some functions on the web application, verify these actions directly in the database, then start a management desktop application and approve the actions initiated from the web application, all within one test case. A test case that big would be fragile, but the great thing is that it is possible if you really wanted it.

Quite a few of the tests currently in FitNesse have been built up based on a functional mapping we initially made of the system, rather than on the flows through the application. This is not ideal when running the tests, let alone when trying to sort through them and build a suite for a particular type of flow or functionality.
Refactoring is probably the part of FitNesse where I believe a lot of improvement can be made. The current functionality, which is based on regular-expression search, is fairly crude.
FitNesse, being a wiki, does have a wonderful perk when you need to execute some bigger refactoring or move test cases around: all tests are text files within directories on the filesystem of your PC. In other words, if the built-in refactor function is too crude, a tool like Notepad++ or TextPad can be of immense value. These can search through files across directory structures and do a big part of the refactoring work for you. If you need to move whole folder structures, again you can copy them around directly on the file system.
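Since FitNesse stores each wiki page as a directory with a content.txt file inside it, a bulk rename can also be scripted. A rough sketch of that idea (the root path and the keyword being renamed are assumptions for illustration):

```python
# Sketch: bulk-rename a keyword across FitNesse pages on disk.
# FitNesse keeps each page as <root>/<PageName>/content.txt;
# the keyword being renamed here is purely illustrative.
import os

def rename_keyword(root, old, new):
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "content.txt" not in filenames:
            continue
        path = os.path.join(dirpath, "content.txt")
        with open(path, encoding="utf-8") as f:
            text = f.read()
        if old in text:
            with open(path, "w", encoding="utf-8") as f:
                f.write(text.replace(old, new))
            changed.append(path)
    return changed  # list of pages that were rewritten
```

A plain string replace is every bit as crude as the built-in refactoring, so running something like this against a copy of FitNesseRoot first is advisable.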

My feeling about FitNesse so far is that it is a great tool which seems to be underestimated out there in the world of test automation. Even when working directly with the standard fixtures, FitNesse makes for easy-to-use, simple, and quick-to-implement test automation. The main hurdles for a newcomer are the initial learning curve, getting started, and making the right choice between Fit and Slim.

SSL and other frustrations in test automation lead me to new insights

The last two days we have been preparing some automated tests which will be used in production during maintenance.

Effectively these tests will verify whether all the web and application servers are back up and connected to all the services in the backend systems. During this maintenance, servers will be pulled out of the overall production pool and, one by one or in batches, upgraded to a newer version of something or other; what exactly is not relevant in the context of our tests.

Our client, being a bank, has all parts of its sites reachable via HTTPS and blocked by firewalls. In order to reach all the individual web servers directly we have to circumvent the load balancers.

In our test automation setup we had made the assumption that all URLs are served over SSL. In preparing these production tests we found out that this is not always true, for example when bypassing the load balancers and firewalls and running the automated tests directly against the individual web servers. So we had to come up with a new feature in our automation suite: toggling the use of SSL. This is all very nice, but it does create a bit of an issue while testing the new suite. Normally, when I am creating something to be run on production, I try to wait as long as possible before actually running it there, i.e. I debug my tests on a test environment rather than on production. In this case, however, I have had to debug the tests on production, behind firewalls and SSL layers, since the test environments all require SSL in order to reach them.
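The toggle itself boils down to building the suite's base URL from a scheme flag instead of hardcoding https. A minimal sketch (the hostnames and port are invented examples, not our client's):

```python
# Minimal sketch of an SSL toggle for a test suite's base URL.
# Hostnames and ports here are invented examples.

def base_url(host, use_ssl=True, port=None):
    scheme = "https" if use_ssl else "http"
    default_port = 443 if use_ssl else 80
    if port is None or port == default_port:
        return f"{scheme}://{host}"
    return f"{scheme}://{host}:{port}"

# Through the load balancer, as an end user would:
print(base_url("www.example-bank.nl"))                  # https://www.example-bank.nl
# Directly against one web server during maintenance:
print(base_url("webserver03.internal", use_ssl=False))  # http://webserver03.internal
```

With every test deriving its URLs from one function like this, switching between the load-balanced HTTPS entry point and a bare internal web server becomes a single configuration change rather than an edit in every test.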

Another typical headache I ran into is that some people deem it necessary to use hard links rather than relative links. The objective of one of the tests was to log in to a certain account and navigate through the site to a different part of it, like a user would. The actions would guide us from a new part to an older part of the site, in other words going from aspx to asp. While trying this we found out that the new aspx parts contain hard links rather than relative links, thus kicking us out from between firewall and web server and into the load balancer, killing our session and breaking the tests.

We were asked to make the tests specifically so that the navigation flow moves from the new code into the old code, since end users seem to have issues with this transition every now and again, and there were hopes that this test could surface the intermittent issues users see in production, since it has not been possible to reproduce them in the test environments.

All of this made me think more about the different strategies one needs when testing the full chain of a product moving from a development environment to a test environment and then on into production. How valid are the automated tests we have run in test? Can we compare them to results coming out of production? Can we run the same tests in production as we do in test? Should the test environment be updated to represent production even more closely? Should the binaries deployed in test be 100% identical to the ones in production?

I would like to think that the tests we have done in the test environment are indeed valid, and judging by the fact that most of these tests run successfully on production outside the firewall, i.e. the way our end users reach all functionality, I do believe they are. It does show clearly, however, that while testing we do miss some things, such as hard links rather than relative links, which may for example result in the loss of a session. The question is how we can catch these links earlier on, rather than in production.
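One way to catch such links before production would be a simple scan of the rendered pages: any href that names a host is a hard link, anything else is relative. A sketch using only the Python standard library (the sample HTML is made up):

```python
# Sketch: flag absolute ("hard") links in a page so they can be
# caught in the test environment instead of in production.
from html.parser import HTMLParser
from urllib.parse import urlparse

class HardLinkFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.hard_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and urlparse(value).netloc:
                # A non-empty netloc means the href names a host: a hard link.
                self.hard_links.append(value)

page = '<a href="/overview.aspx">fine</a><a href="https://www.example-bank.nl/old.asp">hard</a>'
finder = HardLinkFinder()
finder.feed(page)
print(finder.hard_links)  # ['https://www.example-bank.nl/old.asp']
```

A check like this could run over every page the automated tests visit, so a hardcoded URL fails a test run instead of silently killing a session behind the load balancer.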

There is still quite some work to be done before we can safely say we have it all covered. Building this fairly simple test suite in FitNesse and WebDriver has shown me some new things to consider while testing applications: where is the SSL layer placed? How can I verify whether a URL is relative or hardcoded, and should we always want to verify this? How do I play around with hostnames and what can that bring me? In this case it brought me the exciting world of hardcoded URLs to parts of the site, and the insight that the binaries used in the test environment are built differently from the ones going into production.

All in all it was just another day in a tester’s paradise!



What did I get out of today’s testing dojo?

It’s funny to see how difficult it is to get a group of people who work with one another daily to talk freely and share their ideas, even when their manager is not present and they are amongst their peers.

During today’s testing dojo, which again was supposed to last an entire day focusing fully on working with FitNesse, we started off with a talk about what we aim to achieve with test automation at our customer. I tried to enthuse the group by pushing them to think about the possible difference between “test automation” and “computer aided testing”: if there is a difference, what does one mean and what does the other mean? From there I hoped to gain insight into what they think we should aim to achieve, and of course whether or not their ideas make sense to us, as the leads on implementing test automation.

Unfortunately a real discussion on this never took off; moreover, the two people we have been working with most closely on the implementation remained the most silent of all. I am still not sure what caused their silence: natural shyness, cultural pressure, or something else. Instead I ended up pulling some keywords out of the group and discussing my thoughts on them. Not too bad either, but I do not believe I should have been the one doing this much of the talking on the subject.

The second part where I hoped to create a bit of discussion was on what the group believes to be good practices in test automation. This also took some effort on my side, along with some poking, probing, and planting of the occasional seed, but some discussion did arise. After a while one of them remarked that, in the end, everything that can be considered a good or best practice in test automation also applies to manual functional testing.

This insight led me nicely back to clarifying the first point: what are we aiming to do, trying to remove manual testing altogether, or trying to create more free space and time to enable them to do more and different manual testing? I do believe I got the picture across that we are not trying to take away manual testing, but rather trying to help them remove repetitive work, since repetitive testing of the same items and the same or similar functionality is quite likely to create a form of feature blindness.

The term feature blindness seemed to be a new concept for a big part of the group; however, I managed to explain the concept fairly easily by example.

In the end the morning session was not exactly what I hoped it would be, but it clearly did get the points I wanted to make across, which were: think about what you want to test; try to describe for yourself why you want to automate something, and then read that back to figure out whether it still makes sense to automate it. Try to keep your tests small, self-contained, and reusable. Refactor your FitNesse tests into reusable scenarios, but also keep an eye out for over-complicating things by making everything a scenario, i.e. do not make a scenario for the sake of making it; only create one if you indeed have several identical tests which need different input data. And, most important of all as far as I am concerned in functional test automation: Keep It Simple, Stupid. Even fancy stuff should be kept simple, readable, and brief. If you fail at a first attempt, don’t worry; move on and come back at a later stage to refactor your test.

One not-so-nice thing about today’s dojo was that, for the second time in a row, the second part of the day was rudely disturbed by some very unexpected downtime of our test environments. We were told in advance that one of the environments would be taken down for urgent maintenance and patching; unfortunately both environments went down during this change, which resulted in us sending the group off earlier than anticipated.

Main takeaway for me: I really enjoy doing these knowledge sharing and coaching sessions. I like it a lot and see it as a great bonus to my work as a consultant, especially since it makes me (and hopefully my colleagues) think about why I am doing things the way I am doing them.

Can we convince the team there is no silver bullet?

So how do you go about explaining the difference between full-scale test automation (as in the silver bullet concept) and useful, sensible computer aided testing?

Over the past few months I have been trying to get the message across to the entire test team at my customer that test automation is not a goal; rather, it is a tool which can help the team spend less time on regression testing and thus more time focusing on testing new functionality, and new combinations which were not covered before.

In our next testing dojo we will attempt to reach a breakthrough in that. The question, however, is whether the method with which I hope to achieve this breakthrough will actually work.

I hope to make the team see the light by opening up a discussion; I have everything prepared for a discussion on the subject of test automation best practices and why we do test automation. Now that the actual day is getting closer, though, I am starting to fear that the group may not be strong enough to actually go into a discussion about the subject. Which means I need a backup plan, which at the moment I do not have.

The reason I am hoping for a discussion is that, thus far, the team has shown that they learn best by experience and example. Whenever in the past I ran into people who learn best by experience, I would let them make their mistake and then go over it with them to see what went wrong and why. In this case, however, taking that approach might be a costly mistake.

Wanting to minimize the time spent on regression testing is a good goal, I believe, especially as a starting point for test automation. The one problem I see is that the testers here seem to want to go overboard in their coverage. Through “youthful enthusiasm” on the team’s side, we have already spent quite some time explaining why certain things should not be covered in an automation suite, for example the export-to-Excel functionality or the print functionality.

What if conversation simply cannot convince the team?

Is there a way to make them experience the senselessness of automating some things without it costing a lot of time? Is there a silver bullet which will help prove that test automation is not a silver bullet?

I still have until tomorrow morning to come up with the backup plan; otherwise it will be a lot of improvising during the dojo.