Test automation metrics – mashing up non-test data

In my previous post I wrote about reporting on test automation, and one of the main things I said there was: use the vast amount of data you have. In this post I will elaborate a bit more on that; in fact, I am going to go one step further and say: don’t just use your own data! Generally there is more data at hand in an organisation than just the data you create with test automation. Some very powerful pieces of data are release boards and bug metrics.

Bug metrics

The bug metrics are quite often a logical item to look at. However, keep in mind that generally speaking, test automation will not have a very high bug detection percentage. The bug detection percentage (BDP), curiously enough, is something you do not often read about, yet it is actively measured all over the place: it is the number of bugs found before production, set off as a percentage of the total number of bugs found, both before and in production.

[Figure: Bug Detection Percentage calculation]

[Figure: Bug detection percentage graph – an example showing pre-production bugs, bugs found in production and the grand total of bugs found per run; sample data, not from an actual project.]

The problem I see in using the BDP to show the effectiveness of test automation is that the objective of test automation is generally not to find new bugs, but to ensure that all existing functionality on a new release works as expected, and thus as it did on previous builds or releases. So in short, test automation is excellent at running a regression test, but not at finding new problems. Of course it is possible to define a bug detection percentage for test automation specifically, e.g. base the numbers solely on test automation. That, however, will not really give any insight into the effectiveness of test automation, considering the low number of bugs found by test automation in general.

So what can we do with bug data and test automation? Frankly, I am convinced that generally speaking test automation will not benefit from metrics based on bugs. Use other data instead, such as new features introduced, lines of code changed, or the number of tasks or user stories completed, and set that off against the growth of your automated test set.
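As a quick illustration of the calculation, here is a minimal Python sketch; the function name and the numbers are made up for the example, just like the sample data in the graph.

```python
def bug_detection_percentage(pre_production_bugs, production_bugs):
    """Bugs found before production as a percentage of all bugs found."""
    total_bugs = pre_production_bugs + production_bugs
    if total_bugs == 0:
        return 0.0
    return 100.0 * pre_production_bugs / total_bugs


# Made-up example: 45 bugs caught before release, 5 found in production.
print(bug_detection_percentage(45, 5))  # 90.0
```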

Test automation and User Stories

In agile environments it is generally considered (and thankfully so) a good thing to automate the crap out of everything being built. Generally there is quite some data available on the number of stories or tasks coming out of a sprint, or on the story points committed to per sprint. A graph setting off the story points (or complexity points if you will) against the number of tests that have been automated can give a nice insight into the “coverage” of your automation. Add to this the number of bugs appearing in production and you have a sweet showcase of how your agile effort is performing overall.

[Figure: Story points vs automated tests vs production issues]

A graph like this contains a lot of information, so it will need a bit of explanation when shown to the team and especially to managers. On a per-sprint basis this graph shows:

  • Blue bars: the number of story points delivered per sprint (the actual delivery, not the committed amount!), set off against the left vertical axis
  • White bars: the cumulative number of automated tests run on these stories and run as regression, set off against the right vertical axis
  • Black line: a trend line of the overall number of automated tests
  • Green line: the actual number of production issue occurrences after releasing a sprint’s worth of code to production

A graph like this is relatively easy to set up and maintain, and it paints a very strong picture. It clearly shows the progress of the team effort (a growing number of story points taken up, fewer production issues over time) and at the same time gives a very clear insight into the added value of test automation within the team, by setting the rising coverage against the declining production issues.
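To give an idea of how such a chart could be put together, here is a minimal matplotlib sketch; the per-sprint numbers, labels and colours below are illustrative assumptions, not the data behind the figure above.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative per-sprint data (not real project numbers).
sprints = np.arange(1, 9)
story_points = [13, 18, 21, 20, 26, 24, 30, 29]          # delivered, not committed
automated_tests = [10, 25, 45, 70, 100, 135, 175, 220]   # cumulative
production_issues = [6, 5, 5, 4, 3, 3, 2, 1]

fig, ax_left = plt.subplots()
ax_right = ax_left.twinx()

# Blue bars: story points delivered (left axis).
ax_left.bar(sprints - 0.2, story_points, width=0.4, color="tab:blue",
            label="Story points delivered")
# White bars: cumulative automated tests (right axis).
ax_right.bar(sprints + 0.2, automated_tests, width=0.4, color="white",
             edgecolor="black", label="Automated tests (cumulative)")
# Black line: trend of the overall number of automated tests (right axis).
trend = np.poly1d(np.polyfit(sprints, automated_tests, 1))(sprints)
ax_right.plot(sprints, trend, color="black", label="Automation trend")
# Green line: production issues per sprint release (left axis).
ax_left.plot(sprints, production_issues, color="green", marker="o",
             label="Production issues")

ax_left.set_xlabel("Sprint")
ax_left.set_ylabel("Story points / production issues")
ax_right.set_ylabel("Automated tests")
fig.legend(loc="upper left")
plt.show()
```

The data could just as easily come from your sprint board export; the point is simply that two axes are enough to show delivery, automation growth and production issues in one picture.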

Why test “by the book”

The other day I was reading a blog post on agile and why agile will fail in many instances. One of the comments in particular caught my attention; it stated the following:

“a process with little Agility due to the remains of the ‘old process’. This is why by-the-book scrum is so powerful. Too many agile consultants try to fit agile into the existing org structure and processes, thereby allowing existing dysfunction to remain, or worse, covering it up. They try to modify everything right out of the gate, instead of just choosing scrum.”

This got me thinking about why so many methodologies seem to attract followers who treat “their” methodology as a religion. I have been pondering the different development and testing methodologies: XP, Scrum and Lean for development life cycles, and ISTQB, TMap and the like for testing in particular.

In religions it is generally considered bad to be extreme in following the rules, hence the term extremist: whether Orthodox, Catholic, Jewish or Islamic, extremists are always considered dangerous to society. Isn’t it the same in software development and testing? Aren’t the people who go to extremes to follow the rules as dreamed up by some author also extremists who seem to lose sight of the context?

Thus far the best implementation of any methodology I have seen is some form of hybrid, or as the Dutch would call it a “polder model”, where you make a compromise between “the book” and “what actually works for us as a team or organization”.

Are methodologies best practices then? Aren’t methodologies meant to give people a frame of reference which they fill in for themselves: thinking about that frame of reference, criticizing it and adjusting it to their needs, shaping the method in such a way that it works optimally for you, in this situation, in this particular context? When moving on to a new task or assignment, you take these learnings with you, see what works within the new context and adjust whatever doesn’t.

Maybe it is time for a new methodology, which ties in well with a solid development method by Zed A. Shaw. A possible working name could be “Testing, fuckwit!”.

Is testing the dumping grounds of IT?

The other day I was talking to a few developers I was on an assignment with about getting testers added to their scrum team, and the response I got from them disturbed me. They told me that in their experience most testers do not work together within the team; they work against development, trying to get everything fully tested despite knowing this is not feasible, and with that they delay projects. On top of that, they told me, most of the testers they have worked with are part of the dumping grounds of the IT industry. By that they meant that in their view most testers are not good enough to be developers, so they decided to become testers instead (<sarcasm>’cause, come on, testing is not that difficult, anyone can do that!</sarcasm>).

I was shocked to hear there are still a lot of developers out there who believe that testers are the dumping grounds of the IT industry, but I was even more shocked by their experiences with testers working in an “us versus them” modus operandi instead of as part of a team, in a joint effort with a shared focus and goal.

What is it that still makes testers often work against developers instead of with them?

Most testers I have worked with over the past years agree that working side by side with development is the most effective and efficient way of working: that way you both keep track of your joint goal, which is to get the software out on time, on budget and according to what your customer (or end user, for that matter) wants and needs. Together you try to add value to the software.

So is it indeed true that there are still a lot of testers out in the field who do not see the big picture and try to prove their worth by working against dev, looking for bugs that are not relevant, i.e. looking for bugs for the sake of finding one, no matter what the value of that bug is to the end user or customer, just so they can triumphantly point out to a developer: “See! There are bugs in your code, you did it wrong!”? Unfortunately I fear there are still too many testers out there who think and work this way, not to mention all the developers who seem not to understand the added value of a good tester to the team and to the developer’s work.

Fortunately there is a wonderful contrast out there as well, in the form of this blog post by Nathan Lusher, who shows that there are indeed good testers out there who weigh in on a project, prove the value of testing and with that show that testers are not (or at least not everywhere) the dumping grounds of IT.

In my experience there are a lot of very good, inspired and knowledgeable testers out there who see the added value of working together, in a team with a shared goal, a shared approach and shared respect. If testers want to earn the respect of developers, I believe it is up to quite a lot of testers to start by showing respect to the developers and, where needed, increasing their technical knowledge in order to be able to counterbalance a developer’s viewpoint. You get what you give!

All my automated tests are passing, what’s wrong?

The other day I had an interesting discussion with some developers within a scrum team I am part of. We were discussing the use of test automation within the team and how to deal with changing requirements, code and environments, all of which would lead to failing tests.

They gave the clear impression they were very worried about tests failing. I asked them what would be wrong with tests failing in a regression set during sprints, which led them to look at me with a question on their faces: why would a tester want tests to fail?
If anything, I would expect automated tests to fail, at least partially.

While automating in sprint, I am assuming things to be in a certain state. For example, I am assuming that when I hit the search button nothing happens; in the next sprint I really hope this test will break, meaning the search button now leads me to some form of a results page. That way all tests are, just like the rest of the code, continuously in flux and constantly being updated and refactored.
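As a minimal sketch of what I mean, assuming a hypothetical page with a search button that does nothing yet (the URL, element id and pytest fixture are illustrative, not from a real project):

```python
# Minimal pytest + Selenium sketch; the URL and element id are hypothetical.
import pytest
from selenium import webdriver
from selenium.webdriver.common.by import By


@pytest.fixture
def driver():
    d = webdriver.Chrome()
    yield d
    d.quit()


def test_search_button_does_nothing_yet(driver):
    driver.get("https://example.test/home")  # hypothetical application URL
    url_before = driver.current_url
    driver.find_element(By.ID, "search-button").click()  # hypothetical id
    # Current sprint: clicking search is expected to do nothing at all.
    # Next sprint, when search leads to a results page, this assertion
    # should break and force the test to be updated alongside the code.
    assert driver.current_url == url_before
```

The assertion itself is not the valuable part; the failing run next sprint is, because it signals that the test has to move along with the code.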

This of course fully applies to automating tests within a sprint. When automating for regression or end-to-end testing, however, I would rather expect my tests to pass, or at least the large majority of regression tests should keep passing consistently.