Notes from Jonathan Bach’s “Session-Based Test Management”

Tracking exploratory testing work is difficult for test managers. We don’t want to micro-manage testers; we want them to explore to their hearts’ content when they test. But we won’t know how much progress there is if we don’t track the work, and we won’t know what sort of problems testers encounter during testing unless they have the nerve to tell us immediately. Jonathan Bach’s “Session-Based Test Management” article has one suggestion: use sessions, uninterrupted blocks of reviewable and chartered test effort.

Here are a few favorite notes from the article:

  • Unlike traditional scripted testing, exploratory testing is an ad hoc process. Everything we do is optimized to find bugs fast, so we continually adjust our plans to re-focus on the most promising risk areas; we follow hunches; we minimize the time spent on documentation. That leaves us with some problems. For one thing, keeping track of each tester’s progress can be like herding snakes into a burlap bag.
  • The first thing we realized in our effort to reinvent exploratory test management was that testers do a lot of things during the day that aren’t testing. If we wanted to track testing, we needed a way to distinguish testing from everything else. Thus, “sessions” were born. In our practice of exploratory testing, a session, not a test case or bug report, is the basic testing work unit. What we call a session is an uninterrupted block of reviewable, chartered test effort. By “chartered,” we mean that each session is associated with a mission—what we are testing or what problems we are looking for. By “uninterrupted,” we mean no significant interruptions, no email, meetings, chatting or telephone calls. By “reviewable,” we mean a report, called a session sheet, is produced that can be examined by a third party, such as the test manager, that provides information about what happened.
  • From a distance, exploratory testing can look like one big amorphous task. But it’s actually an aggregate of sub-tasks that appear and disappear like bubbles in a Jacuzzi. We’d like to know what tasks happen during a test session, but we don’t want the reporting to be too much of a burden. Collecting data about testing takes energy away from doing testing.
  • We separate test sessions into three kinds of tasks: test design and execution, bug investigation and reporting, and session setup. We call these the “TBS” metrics. We then ask the testers to estimate the relative proportion of time they spent on each kind of task. Test design and execution means scanning the product and looking for problems. Bug investigation and reporting is what happens once the tester stumbles into behavior that looks like it might be a problem. Session setup is anything else testers do that makes the first two tasks possible, including tasks such as configuring equipment, locating materials, reading manuals, or writing a session report.
  • We also ask testers to report the portion of their time they spend “on charter” versus “on opportunity”. Opportunity testing is any testing that doesn’t fit the charter of the session. Since we’re doing exploratory testing, we remind and encourage testers that it’s okay to divert from their charter if they stumble into an off-charter problem that looks important. (A toy sketch of tallying these proportions follows this list.)
  • Although these metrics can provide better visibility and insight about what we’re doing in our test process, it’s important to realize that the session-based testing process and associated metrics could easily be distorted by a confused or biased test manager. A silver-tongued tester could bias the sheets and manipulate the debriefing in such a way as to fool the test manager about the work being done. Even if everyone is completely sober and honest, the numbers may be distorted by confusion over the reporting protocol, or the fact that some testers may be far more productive than other testers. Effective use of the session sheets and metrics requires continual awareness about the potential for these problems.
  • One colleague of mine, upon hearing me talk about this approach, expressed the concern that senior testers would balk at all the paperwork associated with the session sheets. All that structure, she felt, would just get in the way of what senior testers already know how to do. Although my first instinct was to argue with her, on second thought, she was giving me an important reality check. This approach does impose a structure that is not strictly necessary in order to achieve the mission of good testing. Segmenting complex and interwoven test tasks into distinct little sessions is not always easy or natural. Session-based test management is simply one way to bring more accountability to exploratory testing, for those situations where accountability is especially important.
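
As an aside, and purely as my own toy illustration (nothing like this appears in Bach’s article), the TBS and charter proportions are simple enough to tally across session sheets with a few lines of Python:

```python
from dataclasses import dataclass

@dataclass
class SessionSheet:
    """One session's task breakdown, as estimated by the tester (percentages)."""
    test_design_and_execution: int  # T: scanning the product, looking for problems
    bug_investigation: int          # B: investigating and reporting suspect behavior
    setup: int                      # S: anything that makes the first two possible
    on_charter: int                 # portion spent on charter (the rest is opportunity)

def average_proportions(sheets: list[SessionSheet]) -> dict[str, float]:
    """Average the T, B, S, and on-charter percentages across session sheets."""
    n = len(sheets)
    return {
        "test_design_and_execution": sum(s.test_design_and_execution for s in sheets) / n,
        "bug_investigation": sum(s.bug_investigation for s in sheets) / n,
        "setup": sum(s.setup for s in sheets) / n,
        "on_charter": sum(s.on_charter for s in sheets) / n,
    }

# Two hypothetical sessions: mostly testing, some bug investigation, a bit of setup
sheets = [
    SessionSheet(70, 20, 10, on_charter=80),
    SessionSheet(50, 40, 10, on_charter=90),
]
print(average_proportions(sheets))
# {'test_design_and_execution': 60.0, 'bug_investigation': 30.0, 'setup': 10.0, 'on_charter': 85.0}
```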

Web Accessibility Visualization Tool: tota11y

Testing web sites and apps comes in many forms. Testers try their best to test everything, but obviously there’s only so much they can do within a schedule. Some forms of testing get prioritized over others, and that’s not inherently bad; a solo tester on a team usually starts by testing in a way that covers the most bases.

Web accessibility testing is one of those forms of testing that often takes a backseat, sometimes even forgotten. Web accessibility gives people with disabilities a better browsing experience. Although websites are typically not built with it in mind, it matters.

And tota11y is a tool from Khan Academy that we can leverage for accessibility testing. It’s available as an easy-to-use bookmarklet: for whatever page we want to test, we just go there and click the bookmarklet, and the tool appears in the bottom left corner of the page. Clicking the tool reveals options, and using each one helps us spot common accessibility violations.
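
Besides the bookmarklet, the tota11y docs also describe dropping the script straight into a page under development, which can be handy when testing locally. Something like this before the closing body tag (the path to the built file depends on your setup):

```html
<!-- load tota11y; its toolbar then shows up in the corner of the page -->
<script src="tota11y.min.js"></script>
```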

Here are some screenshots of using it on a page I test at work, checking headings, contrast, and link text:

  • Nonconsecutive heading level use
  • Multiple insufficient contrast ratio violations
  • Unclear link texts

Looks like there’s room for improvement, although these violations are not necessarily errors.

Automating the Windows 10 Desktop Calculator using PyAutoGui

Last year, I tried automating the Windows 7 desktop calculator using a Java library called Winium. A year later, someone in the comments told me that the example didn’t work on Windows 10. I ignored the comment for a while because I didn’t have a Windows 10 machine to test on back then, but now that I’ve recently upgraded my home PC, I decided to try it out.

What I found:

Running a Windows 7 desktop calculator automation example (using Winium) on a Windows 10 machine

The Windows 10 calculator opened up, but the tests didn’t run properly. The error log told me that the program was unable to find the calculator elements it was supposed to click. Bummer. Maybe the names of the calculator elements were different on Windows 10 versus Windows 7? Nope; the element names were still the same according to UI Spy. Perhaps something in Winium could point me to a clue? Oh, it seems the library hasn’t been updated in recent years.

If I can’t use Winium, how then can I automate the Windows 10 calculator? A Google search pointed me to PyAutoGui. Instead of Java, I’ll need Python (and pip) for this tool to work. And yes, being on Windows, I also need to properly set the environment variables so I can use the python and pip commands in a terminal.

Let’s install PyAutoGui:

Installing PyAutoGui
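
For the record, it’s a single pip command:

```
pip install pyautogui
```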

And after writing some code, let’s see if we can automate the Windows 10 calculator with it:

Automating the Windows 10 desktop calculator with PyAutoGui

It works!

But here are some catches:

  • I had to rely on PyAutoGui’s keyboard control functions to perform the calculator actions, instead of finding elements via the user interface. From what I’ve seen in the docs, the only way to locate a UI element is by matching screenshots; I tried that at first and it was very flaky, so I opted for the keyboard control functions instead (see the sketch after this list).
  • The code introduces a wait for the calculator to appear on screen, plus a fraction-of-a-second pause between keyboard actions so they don’t happen too fast for a person’s eye.
  • There are no assertions in the example code, because I couldn’t find any assertion functions in the PyAutoGui docs. It’s not a tool built for testing, only for automating desktop apps.
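
Putting those catches together, a minimal sketch of the approach looks something like this (the waits and the key sequence here are placeholders of mine; the actual script is in the repo linked below):

```python
import subprocess
import time

import pyautogui

# Pause between every PyAutoGui action, so the keystrokes
# don't happen too fast for a person's eye
pyautogui.PAUSE = 0.5

# Launch the Windows 10 calculator and wait for it to appear on screen
subprocess.Popen("calc.exe")
time.sleep(2)

# Keyboard control functions instead of locating UI elements:
# type a computation and press Enter
pyautogui.typewrite("9+3")
pyautogui.press("enter")

# Note: no assertions here; PyAutoGui has no assertion functions
```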

Source code for this experiment can be found at Win-Calculator-PyAutoGui.

Five People and Their Thoughts (Part 8)

Sharing a new batch of engaging videos from people I follow, which I hope you’ll come to like as much as I do:

  • Workarounds (by Alan Richardson, about workarounds, how you can use them in testing, using tools in custom ways, moving boilerplate to the appendix, bypassing processes that get in the way, understanding risks and value, technical skills, and taking control of your career)
  • 100% Coverage is Too High for Apps! (by Kent Dodds, on code coverage, what it tells you, as well as what it does not tell you)
  • Open Water Swimming (by Timothy Ferriss, about fears, habits, total immersion swimming, Terry Laughlin, compressing months of conventional training into just a few days, and the power of micro-successes)
  • How To Stop Hating Your Tests (by Justin Searls, representing Test Double, on doing only three things for tests, avoiding conditionals, consistency, apparent test purpose, redundant test coverage, optimizing feedback loops, false negatives, and building better workflows)
  • Meaning of Life (by Derek Sivers, about a classic unsolvable problem, using time wisely, making good choices, making memories, the growth mindset, inherent meaning, and a blank slate)

Set Cucumber to Retry Tests after a Failure

This is probably old news, but I only recently found out that there is a way for Cucumber tests running in Ruby to automatically retry when they fail.

Here’s a sample failing test:
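
Something along these lines, say (a made-up minimal example; the test in my screenshot was a different one):

```gherkin
# features/demo.feature
Feature: Retry demo
  Scenario: A step that always fails
    Given a failing step
```

```ruby
# features/step_definitions/demo_steps.rb
Given("a failing step") do
  raise "deliberate failure"
end
```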

Apparently we can just add a --retry option to make a failing test retry, and we can set how many times we want it to retry. Let’s give the failing test above two retries.
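
The option goes on the usual cucumber invocation (feature path taken from the made-up example above):

```
cucumber features/demo.feature --retry 2
```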

It did retry the test twice, after the initial failure. Neat! 🙂

But what happens if we run the test with the same retry option and it passes? Does the test run three times? Let’s see:

Good, the retry option gets ignored when the test passes.

Cypress and Mochawesome

A week ago I was working on a quick automation project that called for an HTML report of the test results to go along with the scripts. It was a straightforward request, but not something I usually generate. Personally I find test results displayed in a terminal to be enough, but for this task I needed a report generator. I had already decided to use Cypress for the job, so I needed something that plays well with it. In the Cypress docs I found a custom reporter called Mochawesome.

To install it, I updated the package.json file inside the project directory to include the reporter:
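
It ended up looking something like this (the version numbers are placeholders here, not the exact ones from my project):

```json
{
  "scripts": {
    "test": "cypress run --reporter mochawesome"
  },
  "devDependencies": {
    "cypress": "^3.1.0",
    "mochawesome": "^3.0.3"
  }
}
```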

The Cypress documentation on Reporters also said that for Mochawesome to work properly I should install mocha as a dev dependency, so that’s what I did.

And then run npm install in the terminal to actually install the reporter.

Before running tests, there’s a tiny change we need to make to the cypress.json file inside the project directory, which tells Cypress which reporter we want to use for generating test reports.
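
In this case it’s a single line:

```json
{
  "reporter": "mochawesome"
}
```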

And we’re all set after all that. 🙂

Run Cypress tests by running cypress run --reporter mochawesome. Or, if you specified a script in the package.json file like the snippet above, just run npm test.

After running tests, we’ll find that a mochawesome-report directory has been added to our project directory, housing both HTML and JSON reports of the tests.

A sample HTML test report looks something like this:

Looks nice and simple and ready for archiving.

Lessons from Kent Beck and Martin Fowler’s “Planning Extreme Programming”

The first edition of “Planning Extreme Programming” by Kent Beck and Martin Fowler was published about 18 years ago (that long!), and it still says so much about how planning for software projects can be done well. It talks about programmers and customers, their fears and frustrations, their rights and responsibilities, and putting all that knowledge into a planning style that can work for development teams. It doesn’t have to be labeled XP, but it does need to help people be focused, confident, and hopeful.

Here are some noteworthy lines from the book:

  • Planning is not about predicting the future. When you make a plan for developing a piece of software, development is not going to go like that. Not ever. Your customers wouldn’t even be happy if it did, because by the time the software gets there, the customers don’t want what was planned; they want something different.
  • We plan to ensure that we are always doing the most important thing left to do, to coordinate effectively with other people, and to respond quickly to unexpected events.
  • If you know you have a tight deadline, but you make a plan and the plan says you can make the deadline, then you’ll start on your first task with a sense of urgency but still working as well as possible. After all, you have enough time. This is exactly the behavior that is most likely to cause the plan to come true. Panic leads to fatigue, defects, and communication breakdowns.
  • Any software planning technique must try to create visibility, so everyone involved in the project can really see how far along a project is. This means that you need clear milestones, ones that cannot be fudged, and clearly represent progress. Milestones must also be things that everyone involved in the project, including the customer, can understand and learn to trust.
  • We need a planning style that
    • Preserves the programmer’s confidence that the plan is possible
    • Preserves the customer’s confidence that they are getting as much as they can
    • Costs as little to execute as possible (because we’ll be planning often, but nobody pays for plans; they pay for results)
  • If we are going to develop well, we must create a culture that makes it possible for programmers and customers to acknowledge their fears and accept their rights and responsibilities. Without such guarantees, we cannot be courageous. We huddle in fear behind fortress walls, building them ever stronger, adding ever more weight to the development processes we have adopted. We continually add cannonades and battlements, documents and reviews, procedures and sign-offs, moats with crocodiles, torture chambers, and huge pots of boiling oil. But when our fears are acknowledged and our rights are accepted, then we can be courageous. We can set goals that are hard to reach and collaborate to make those goals. We can tear down the structures that we built out of fear and that impeded us. We will have the courage to do only what is necessary and no more, to spend our time on what’s important rather than on protecting ourselves.
  • We use driving as a metaphor for developing software. Driving is not about pointing in one direction and holding to it; driving is about making lots of little course corrections. You don’t drive software development by getting your project pointed in the right direction (The Plan). You drive software development by seeing that you are drifting a little this way and steering a little that way. This way, that way, as long as you develop the software.
  • When you don’t have enough time you are out of luck. You can’t make more time. Not having enough time is a position of helplessness. And hopelessness breeds frustration, mistakes, burnout, and failure. Having too much to do, however, is a situation we all know. When you have too much to do you can prioritize and not do some things, reduce the size of some of the things you do, ask someone else to do some things. Having too much to do breeds hope. We may not like being there, but at least we know what to do.
  • Focusing on one or two iterations means that the programmers clearly need to know what stories are in the iteration they are currently working on. It’s also useful to know what’s in the next iteration. Beyond that the iteration allocation is not so useful. The real decider for how far in advance you should plan is the cost of keeping the plan up-to-date versus the benefit you get when you know that plans are inherently unstable. You have to honestly assess the value compared to the volatility of the plans.
  • Writing the stories is not the point. Communicating is the point. We’ve seen too many requirements documents that are written down but don’t involve communication.
  • We want to get a release to the customer as soon as possible. We want this release to be as valuable to the customer as possible. That way the customer will like us and keep feeding us cookies. So we give her the things she wants most. That way we can release quickly and the customer feels the benefit. Should everything go to pot at the end of the schedule, it’s okay, because the stories at risk are less important than the stories we have already completed. Even if we can’t release quickly, the customer will be happier if we do the most valuable things first. It shows we are listening, and really trying to solve her problems. It also may prompt the customer to go for an earlier release once she sees the value of what appears.
  • One of the worst things about software bugs is that they come with a strong element of blame (from the customer) and guilt (from the programmer). If only we’d tested more, if only you were competent programmers, there wouldn’t be these bugs. We’ve seen people screaming on news groups and managers banging on tables saying that no bugs are acceptable. All this emotion really screws up the process of dealing with bugs and hurts the key human relationships that are essential if software development is to work well.
  • We assume that the programmers are trying to do the most professional job they can. As part of this they will go to great lengths to eliminate bugs. But nobody can eliminate all of them. The customer has to trust that the programmers are working hard to reduce bugs, and can monitor the testing process to see that they are doing as much as they should.
  • For most software, however, we don’t actually want zero bugs. (Now there’s a statement that we guarantee will be used against us out of context.) Any defect, once it’s in there, takes time and effort to remove. That time and effort will take away from effort spent putting in features. So you have to decide which to do. Even when you know about a bug, someone has to decide whether you want to eliminate the bug or add another feature. Who decides? In our view it must be the customer. The customer has to make a business decision based on the cost of having the bug versus the value of having another feature – or the value of deploying now instead of waiting to reduce the bug count. (We would argue that this does not hold true for bugs that could be life-threatening. In that case we think the programmers have a duty to public safety that is far greater than their duty to the customer.) There are plenty of cases where the business decision is to have the feature instead.
  • All the planning techniques in the world can’t save you if you forget that software is built by human beings. In the end, keep the human beings focused, happy, and motivated and they will deliver.