Tracking exploratory testing work is difficult for test managers. We don’t want to micro-manage testers, we want them to explore to their hearts content when they test, but we won’t know how much progress there is if we don’t track the work. We also won’t know what sort of problems testers encounter during testing, unless they have the nerve to tell us immediately. Jonathan Bach’s “Session-Based Test Management” article has one suggestion: use sessions, uninterrupted blocks of reviewable and chartered test effort.
Here are a few favorite notes from the article:
Unlike traditional scripted testing, exploratory testing is an ad hoc process. Everything we do is optimized to find bugs fast, so we continually adjust our plans to re-focus on the most promising risk areas; we follow hunches; we minimize the time spent on documentation. That leaves us with some problems. For one thing, keeping track of each tester’s progress can be like herding snakes into a burlap bag.
The first thing we realized in our effort to reinvent exploratory test management was that testers do a lot of things during the day that aren’t testing. If we wanted to track testing, we needed a way to distinguish testing from everything else. Thus, “sessions” were born. In our practice of exploratory testing, a session, not a test case or bug report, is the basic testing work unit . What we call a session is an uninterrupted block of reviewable, chartered test effort. By “chartered,” we mean that each session is associated with a mission—what we are testing or what problems we are looking for. By “uninterrupted,” we mean no significant interruptions, no email, meetings, chatting or telephone calls. By “reviewable,” we mean a report, called a session sheet, is produced that can be examined by a third-party, such as the test manager, that provides information about what happened.
From a distance, exploratory testing can look like one big amorphous task. But it’s actually an aggregate of sub-tasks that appear and disappear like bubbles in a Jacuzzi. We’d like to know what tasks happen during a test session, but we don’t want the reporting to be too much of a burden. Collecting data about testing takes energy away from doing testing.
We separate test sessions into three kinds of tasks: test design and execution, bug investigation and reporting, and session setup. We call these the “TBS” metrics. We then ask the testers to estimate the relative proportion of time they spent on each kind of task. Test design and execution means scanning the product and looking for problems. Bug investigation and reporting is what happens once the tester stumbles into behavior that looks like it might be a problem. Session setup is anything else testers do that makes the first two tasks possible, including tasks such as configuring equipment, locating materials, reading manuals, or writing a session report
We also ask testers to report the portion of their time they spend “on charter” versus “on opportunity”. Opportunity testing is any testing that doesn’t fit the charter of the session. Since we’re in doing exploratory testing, we remind and encourage testers that it’s okay to divert from their charter if they stumble into an off-charter problem that looks important.
Although these metrics can provide better visibility and insight about what we’re doing in our test process, it’s important to realize that the session-based testing process and associated metrics could easily be distorted by a confused or biased test manager. A silver-tongued tester could bias the sheets and manipulate the debriefing in such a way as to fool the test manager about the work being done. Even if everyone is completely sober and honest, the numbers may be distorted by confusion over the reporting protocol, or the fact that some testers may be far more productive than other testers. Effective use of the session sheets and metrics requires continual awareness about the potential for these problems.
- One colleague of mine, upon hearing me talk about this approach, expressed the concern that senior testers would balk at all the paperwork associated with the session sheets. All that structure, she felt, would just get in the way of what senior testers already know how to do. Although my first instinct was to argue with her, on second thought, she was giving me an important reality check. This approach does impose a structure that is not strictly necessary in order to achieve the mission of good testing. Segmenting complex and interwoven test tasks into distinct little sessions is not always easy or natural. Session-based test management is simply one way to bring more accountability to exploratory testing, for those situations where accountability is especially important.
Automated checking is not a new concept. Gojko Adzic, however, provides us a way to make better integration of it in our software development processes. In his book titled “Specification by Example”, he talks about executable specifications that double as a living documentation. These are examples which continuously exercise business rules, they help teams collaborate, and, along with software code, they’re supposed to be the source of truth for understanding how our applications work. He builds a strong case about the benefits of writing specifications by example by presenting case studies and testimonials of teams who have actually used it in their projects, and I think that it is a great way of moving forward, of baking quality in.
Some favorite takeaways from the book:
- Tests are specifications; specifications are tests.
- “If I cannot have the documentation in an automated fashion, I don’t trust it. It’s not exercised.” -Tim Andersen
- Beginners think that there is no documentation in agile, which is not true. It’s about choosing the types of documentation that are useful. There is still documentation in an agile process, and that’s not a two-feet-high pile of paper, but something lighter, bound to the real code. When you ask, “does your system have this feature?” you don’t have a Word document that claims that something is done; you have something executable that proves that the system really does what you want. That’s real documentation.
- Fred Brooks quote: In The Mythical Man-Month 4 he wrote, “The hardest single part of building a software system is deciding precisely what to build.” Albert Einstein himself said that “the formulation of a problem is often more essential than its solution.”
- We don’t really want to bother with estimating stories. If you start estimating stories, with Fibonacci numbers for example, you soon realize that anything eight or higher is too big to deliver in an iteration, so we’ll make it one, two, three, and five. Then you go to the next level and say five is really big. Now that everything is one, two, and three, they’re now really the same thing. We can just break that down into stories of that size and forget about that part of estimating, and then just measure the cycle time to when it is actually delivered.
- Sometimes people still struggle with explaining what the value of a given feature would be (even when asking them for an example). As a further step, I ask them to give an example and say what they would need to do differently (work around) if the system would not provide this feature. Usually this helps them then to express the value of a given feature.
- QA doesn’t write [acceptance] tests for developers; they work together. The QA person owns the specification, which is expressed through the test plan, and continues to own that until we ship the feature. Developers write the feature files [specifications] with the QA involved to advise what should be covered. QA finds the holes in the feature files, points out things that are not covered, and also produces test scripts for manual testing.
- If we don’t have enough information to design good test cases, we definitely don’t have enough information to build the system.
- Postponing automation is just a local optimization. You might get through the stories quicker from the initial development perspective, but they’ll come back for fixing down the road. David Evans often illustrates this with an analogy of a city bus: A bus can go a lot faster if it doesn’t have to stop to pick up passengers, but it isn’t really doing its job then.
- Workflow and session rules can often be checked only against the user interface layer. But that doesn’t mean that the only option to automate those checks is to launch a browser. Instead of automating the specifications through a browser, several teams developing web applications saved a lot of time and effort going right below the skin of the application—to the HTTP layer.
- Automating executable specifications forces developers to experience what it’s like to use their own system, because they have to use the interfaces designed for clients. If executable specifications are hard to automate, this means that the client APIs aren’t easy to use, which means it’s time to start simplifying the APIs.
- Automation itself isn’t a goal. It’s a tool to exercise the business processes.
- Effective delivery with short iterations or in constant flow requires removing as many expected obstacles as possible so that unexpected issues can be addressed. Adam Geras puts this more eloquently: “Quality is about being prepared for the usual so you have time to tackle the unusual.” Living documentation simply makes common problems go away.
- Find the most annoying thing and fix it, then something else will pop up, and after that something else will pop up. Eventually, if you keep doing this, you will create a stable system that will be really useful.
I was never fond of history subjects back in formal school. I don’t know why exactly, but there’s something about history that felt uninteresting to me – maybe I’m just not keen on memorizing facts and figures, maybe it’s because what’s being discussed are stuff that happened decades in the past and those are often terribly difficult to relate to the context of things that are happening now, or maybe I just can’t imagine history questions as puzzles the way I can do it with math, science, english, and the other subjects. Of course I studied for the quizzes and the exams, and I read the books as much as I could, but I don’t remember myself enjoying the process of learning history in school. I wonder if reading those books again today will change something.
On the contrary, I like to stay updated about the history of the applications that I test. The product history is akin to a colleague who helps me find out what important changes have happened during a particular point in time in an application’s active life. He takes notes of how systems evolved from one day to another and tells me why some applications are behaving differently compared to what’s expected. He shows me how software can get sick some of the time, points me to the root causes of problems, and aids programmers in building fixes. He shares stories of brilliant creative work of people who care, which often do not get celebrated, as well as stories of mistakes and challenges that those same people have faced. Most of all, he displays the truth about learning, teamwork, growth, and how the road looks like from starting to continuously delivering.
Looking at a sprint document I designed a few years back (and updated by my scrum master colleagues), what I see are all sorts of data: planned total story points, actual finished story points, planned total task hours, actual finished task hours (total, per developer, team per day), burndown chart, attendance records, retrospective notes, tickets notes, returned tasks, unplanned tasks, underestimated tasks. So many information being gathered and compiled every sprint iteration; it’s not really too difficult to document but it does kind of takes away a good chunk of time to manage. Back then when we were new to scrum I thought that crunching numbers will provide insights as to how we can improve estimates and productivity, I thought that maybe we’d become faster and better at what we do after performing some mathematics.
It turns out that it doesn’t work that way. It turns out that estimates and productive behavior is a lot more complicated than it seems. It turns out that personalities and preferences and work environment and team dynamics and other small stuff we don’t pay attention much to play big roles in being good at software development as a team. Data about these things are not shown in the sprint document, and I don’t have the slightest idea how to measure them. So, looking back at all these stuff we’re trying to analyze, I’m haunted by questions: Which data are important enough to keep taking notes of, sprint after sprint, to keep reviewing? Why are they important? How do these data help me define how my team improves in succeeding iterations? Are they just pointing out problems but no solutions in sight? What if I stop documenting these data, would that help me focus my energy towards other possibly more important to-dos?