In some recent automation work I’ve been asked to take part in, running a single test safely generally follows a three-step process, according to peers:
- Delete existing feature files:
rm -rf *.feature
- Create updated feature files:
PROJECT_DIRECTORY=../<dir> APP=<app> TEST=automated ruby create_tests.rb, where <app> is the name of the app you want to test
- Run the desired specific test:
cucumber --tags @<tag>
This means I have to re-run these three steps over and over for every single test I want to review. That quickly gets tiring, especially if I have to re-type them repeatedly, or even if I copy-paste. Using the arrow keys in the terminal helps, but sometimes commands get lost in the history and then it becomes a bit of a hassle to find them.
There should be a way for me to run an automated check I want to review, given just the app name and the test tag. It would be nice if I could run a specific test using a shorter command while still following the same process.
I decided to set aside a little bit of time to write a personal script, and used a Makefile because I’ve had experience with that before.
My personal short command for running a test:
make test app=<app> tag=<tag>
And here’s what the script looks like; it runs the known commands step by step and displays some information about what’s currently happening:
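A sketch of that Makefile (assuming, for illustration only, that the <dir> passed to PROJECT_DIRECTORY matches the app name):

```makefile
# Personal shortcut: make test app=<app> tag=<tag>
test:
	@echo "==> Deleting existing feature files"
	rm -rf *.feature
	@echo "==> Creating updated feature files for $(app)"
	PROJECT_DIRECTORY=../$(app) APP=$(app) TEST=automated ruby create_tests.rb
	@echo "==> Running scenarios tagged @$(tag)"
	cucumber --tags @$(tag)
```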
Now I won’t have to worry about remembering which commands to type in, or their sequence. I can focus more on actually reviewing the test itself, which is what matters most.
Tracking exploratory testing work is difficult for test managers. We don’t want to micro-manage testers; we want them to explore to their hearts’ content when they test, but we won’t know how much progress there is if we don’t track the work. We also won’t know what sort of problems testers encounter during testing, unless they have the nerve to tell us immediately. Jonathan Bach’s “Session-Based Test Management” article has one suggestion: use sessions, uninterrupted blocks of reviewable and chartered test effort.
Here are a few favorite notes from the article:
Unlike traditional scripted testing, exploratory testing is an ad hoc process. Everything we do is optimized to find bugs fast, so we continually adjust our plans to re-focus on the most promising risk areas; we follow hunches; we minimize the time spent on documentation. That leaves us with some problems. For one thing, keeping track of each tester’s progress can be like herding snakes into a burlap bag.
The first thing we realized in our effort to reinvent exploratory test management was that testers do a lot of things during the day that aren’t testing. If we wanted to track testing, we needed a way to distinguish testing from everything else. Thus, “sessions” were born. In our practice of exploratory testing, a session, not a test case or bug report, is the basic testing work unit. What we call a session is an uninterrupted block of reviewable, chartered test effort. By “chartered,” we mean that each session is associated with a mission—what we are testing or what problems we are looking for. By “uninterrupted,” we mean no significant interruptions, no email, meetings, chatting or telephone calls. By “reviewable,” we mean a report, called a session sheet, is produced that can be examined by a third party, such as the test manager, that provides information about what happened.
From a distance, exploratory testing can look like one big amorphous task. But it’s actually an aggregate of sub-tasks that appear and disappear like bubbles in a Jacuzzi. We’d like to know what tasks happen during a test session, but we don’t want the reporting to be too much of a burden. Collecting data about testing takes energy away from doing testing.
We separate test sessions into three kinds of tasks: test design and execution, bug investigation and reporting, and session setup. We call these the “TBS” metrics. We then ask the testers to estimate the relative proportion of time they spent on each kind of task. Test design and execution means scanning the product and looking for problems. Bug investigation and reporting is what happens once the tester stumbles into behavior that looks like it might be a problem. Session setup is anything else testers do that makes the first two tasks possible, including tasks such as configuring equipment, locating materials, reading manuals, or writing a session report.
We also ask testers to report the portion of their time they spend “on charter” versus “on opportunity”. Opportunity testing is any testing that doesn’t fit the charter of the session. Since we’re doing exploratory testing, we remind and encourage testers that it’s okay to divert from their charter if they stumble into an off-charter problem that looks important.
Although these metrics can provide better visibility and insight about what we’re doing in our test process, it’s important to realize that the session-based testing process and associated metrics could easily be distorted by a confused or biased test manager. A silver-tongued tester could bias the sheets and manipulate the debriefing in such a way as to fool the test manager about the work being done. Even if everyone is completely sober and honest, the numbers may be distorted by confusion over the reporting protocol, or the fact that some testers may be far more productive than other testers. Effective use of the session sheets and metrics requires continual awareness about the potential for these problems.
One colleague of mine, upon hearing me talk about this approach, expressed the concern that senior testers would balk at all the paperwork associated with the session sheets. All that structure, she felt, would just get in the way of what senior testers already know how to do. Although my first instinct was to argue with her, on second thought, she was giving me an important reality check. This approach does impose a structure that is not strictly necessary in order to achieve the mission of good testing. Segmenting complex and interwoven test tasks into distinct little sessions is not always easy or natural. Session-based test management is simply one way to bring more accountability to exploratory testing, for those situations where accountability is especially important.
The first edition of “Planning Extreme Programming” by Kent Beck and Martin Fowler was published about 18 years (that long) ago, and already it says so much about how planning for software projects can be done well. It talks about programmers and customers, their fears and frustrations, their rights and responsibilities, and putting all that knowledge into a planning style that can work for development teams. It doesn’t have to be labeled XP, but it does need to help people be focused, confident, and hopeful.
Here are some noteworthy lines from the book:
- Planning is not about predicting the future. When you make a plan for developing a piece of software, development is not going to go like that. Not ever. Your customers wouldn’t even be happy if it did, because by the time the software gets there, the customers don’t want what was planned; they want something different.
- We plan to ensure that we are always doing the most important thing left to do, to coordinate effectively with other people, and to respond quickly to unexpected events.
- If you know you have a tight deadline, but you make a plan and the plan says you can make the deadline, then you’ll start on your first task with a sense of urgency but still working as well as possible. After all, you have enough time. This is exactly the behavior that is most likely to cause the plan to come true. Panic leads to fatigue, defects, and communication breakdowns.
- Any software planning technique must try to create visibility, so everyone involved in the project can really see how far along a project is. This means that you need clear milestones, ones that cannot be fudged, and clearly represent progress. Milestones must also be things that everyone involved in the project, including the customer, can understand and learn to trust.
- We need a planning style that
- Preserves the programmer’s confidence that the plan is possible
- Preserves the customer’s confidence that they are getting as much as they can
- Costs as little to execute as possible (because we’ll be planning often, but nobody pays for plans; they pay for results)
- If we are going to develop well, we must create a culture that makes it possible for programmers and customers to acknowledge their fears and accept their rights and responsibilities. Without such guarantees, we cannot be courageous. We huddle in fear behind fortress walls, building them ever stronger, adding ever more weight to the development processes we have adopted. We continually add cannonades and battlements, documents and reviews, procedures and sign-offs, moats with crocodiles, torture chambers, and huge pots of boiling oil. But when our fears are acknowledged and our rights are accepted, then we can be courageous. We can set goals that are hard to reach and collaborate to make those goals. We can tear down the structures that we built out of fear and that impeded us. We will have the courage to do only what is necessary and no more, to spend our time on what’s important rather than on protecting ourselves.
- We use driving as a metaphor for developing software. Driving is not about pointing in one direction and holding to it; driving is about making lots of little course corrections. You don’t drive software development by getting your project pointed in the right direction (The Plan). You drive software development by seeing that you are drifting a little this way and steering a little that way. This way, that way, as long as you develop the software.
- When you don’t have enough time you are out of luck. You can’t make more time. Not having enough time is a position of helplessness. And hopelessness breeds frustration, mistakes, burnout, and failure. Having too much to do, however, is a situation we all know. When you have too much to do you can prioritize and not do some things, reduce the size of some of the things you do, ask someone else to do some things. Having too much to do breeds hope. We may not like being there, but at least we know what to do.
- Focusing on one or two iterations means that the programmers clearly need to know what stories are in the iteration they are currently working on. It’s also useful to know what’s in the next iteration. Beyond that the iteration allocation is not so useful. The real decider for how far in advance you should plan is the cost of keeping the plan up-to-date versus the benefit you get when you know that plans are inherently unstable. You have to honestly assess the value compared to the volatility of the plans.
- Writing the stories is not the point. Communicating is the point. We’ve seen too many requirements documents that are written down but don’t involve communication.
- We want to get a release to the customer as soon as possible. We want this release to be as valuable to the customer as possible. That way the customer will like us and keep feeding us cookies. So we give her the things she wants most. That way we can release quickly and the customer feels the benefit. Should everything go to pot at the end of the schedule, it’s okay, because the stories at risk are less important than the stories we have already completed. Even if we can’t release quickly, the customer will be happier if we do the most valuable things first. It shows we are listening, and really trying to solve her problems. It also may prompt the customer to go for an earlier release once she sees the value of what appears.
- One of the worst things about software bugs is that they come with a strong element of blame (from the customer) and guilt (from the programmer). If only we’d tested more, if only you were competent programmers, there wouldn’t be these bugs. We’ve seen people screaming on news groups and managers banging on tables saying that no bugs are acceptable. All this emotion really screws up the process of dealing with bugs and hurts the key human relationships that are essential if software development is to work well.
- We assume that the programmers are trying to do the most professional job they can. As part of this they will go to great lengths to eliminate bugs. But nobody can eliminate all of them. The customer has to trust that the programmers are working hard to reduce bugs, and can monitor the testing process to see that they are doing as much as they should.
- For most software, however, we don’t actually want zero bugs. (Now there’s a statement that we guarantee will be used against us out of context.) Any defect, once it’s in there, takes time and effort to remove. That time and effort will take away from effort spent putting in features. So you have to decide which to do. Even when you know about a bug, someone has to decide whether you want to eliminate the bug or add another feature. Who decides? In our view it must be the customer. The customer has to make a business decision based on the cost of having the bug versus the value of having another feature – or the value of deploying now instead of waiting to reduce the bug count. (We would argue that this does not hold true for bugs that could be life-threatening. In that case we think the programmers have a duty to public safety that is far greater than their duty to the customer.) There are plenty of cases where the business decision is to have the feature instead.
- All the planning techniques in the world can’t save you if you forget that software is built by human beings. In the end, keep the human beings focused, happy, and motivated and they will deliver.
Looking at the screenshot above, it’s ironic that there’s a mismatch between the commit message and the actual pipeline result from the tests after pushing that commit. 😛 What happened was that after making the unit tests pass on my local machine, I fetched the latest changes from the remote repository, made a commit of my changes, and then pushed the commit. Standard procedure, right? After receiving the failure email I realized that I had forgotten to re-run the tests locally to double-check, which I should have done since there were changes from remote. Those pulled changes broke some of the tests. Honest mistake, lesson learned.
I frequently make errors like this on all sorts of things, not just code, and I welcome them. They bewilder me in a good way, and remind me of how fallible I can be. They show me that it’s alright to make mistakes, and tell me that it’s all part of the journey to becoming better.
And yes, testers can also test on the unit level! 🙂
The experience of writing and running automated checks in recent years, as well as building some personal apps that run on the terminal, has given me a keen appreciation of how effective command-line applications can be as tools. I’ve grown fond of quick programming experiments (scraping data, playing with Excel files, among others), which are relatively easy to write, powerful, dependable, and maintainable. Tons of libraries online help interface well with the myriad programs on our desktops or out on the web.
Choosing to read “Build Awesome Command-Line Applications in Ruby 2” is choosing to go on an adventure about writing better CLI apps, finding out how options are designed and built, understanding how they are configured and distributed, and learning how to actually test them.
Some notes from the book:
- Graphical user interfaces (GUIs) are great for a lot of things; they are typically much kinder to newcomers than the stark glow of a cold, blinking cursor. This comes at a price: you can get only so proficient at a GUI before you have to learn its esoteric keyboard shortcuts. Even then, you will hit the limits of productivity and efficiency. GUIs are notoriously hard to script and automate, and when you can, your script tends not to be very portable.
- An awesome command-line app has the following characteristics:
- Easy to use. The command-line can be an unforgiving place to be, so the easier an app is to use, the better.
- Helpful. Being easy to use isn’t enough; the user will need clear direction on how to use an app and how to fix things they might’ve done wrong.
- Plays well with others. The more an app can interoperate with other apps and systems, the more useful it will be, and the fewer special customizations that will be needed.
- Has sensible defaults but is configurable. Users appreciate apps that have a clear goal and opinion on how to do something. Apps that try to be all things to all people are confusing and difficult to master. Awesome apps, however, allow advanced users to tinker under the hood and use the app in ways not imagined by the author. Striking this balance is important.
- Installs painlessly. Apps that can be installed with one command, on any environment, are more likely to be used.
- Fails gracefully. Users will misuse apps, trying to make them do things they weren’t designed to do, in environments where they were never designed to run. Awesome apps take this in stride and give useful error messages without being destructive. This is because they’re developed with a comprehensive test suite.
- Gets new features and bug fixes easily. Awesome command-line apps aren’t awesome just to use; they are awesome to hack on. An awesome app’s internal structure is geared around quickly fixing bugs and easily adding new features.
- Delights users. Not all command-line apps have to output monochrome text. Color, formatting, and interactive input all have their place and can greatly contribute to the user experience of an awesome command-line app.
- Three guiding principles for designing command-line applications:
- Make common tasks easy to accomplish
- Make uncommon tasks possible (but not easy)
- Make default behavior nondestructive
- Options come in two forms: short and long. Short-form options allow frequent users who use the app on the command line to quickly specify things without a lot of typing. Long-form options allow maintainers of systems that use our app to easily understand what the options do without having to go to the documentation. The existence of a short-form option signals to the user that that option is common and encouraged. The absence of a short-form option signals the opposite: that using it is unusual and possibly dangerous. You might think that unusual or dangerous options should simply be omitted, but we want our application to be as flexible as is reasonable. We want to guide our users to do things safely and correctly, but we also want to respect that they know what they’re doing if they want to do something unusual or dangerous.
- Thinking about which behavior of an app is destructive is a great way to differentiate the common things from the uncommon things and thus drive some of your design decisions. Any feature that does something destructive shouldn’t be a feature we make easy to use, but we should make it possible.
- The future of development won’t just be manipulating buttons and toolbars and dragging and dropping icons to create code; the efficiency and productivity inherent to a command-line interface will always have a place in a good developer’s tool chest.
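The short-form versus long-form guidance above can be sketched with Ruby’s standard OptionParser. The app and option names here are hypothetical: a common, safe option gets both forms, while a dangerous one deliberately gets only a long form.

```ruby
require 'optparse'

# Hypothetical app: -f/--format is common and encouraged, so it gets a
# short form; --force-delete is dangerous, so it gets a long form only.
def parse(argv)
  options = { format: 'text', force_delete: false }
  OptionParser.new do |opts|
    opts.banner = 'Usage: report [options]'
    # Common and encouraged: both short and long forms.
    opts.on('-f', '--format FORMAT', 'Output format (text or json)') do |f|
      options[:format] = f
    end
    # Unusual and dangerous: long form only, signalling caution.
    opts.on('--force-delete', 'Delete source files after reporting') do
      options[:force_delete] = true
    end
  end.parse!(argv)
  options
end
```

Typing `-f json` stays quick for frequent users, while the absence of a short `-d` nudges everyone to spell out `--force-delete` deliberately.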
Here’s a screenshot of something we’ve currently been working on:
It’s not available publicly yet, because it’s still in the early stages of development. The app, hosted locally on http://aspen.reservations.com/#/stay-dates, will redirect to a 404 page unless you have the app installed on your machine. As a programmer, the changes I’m making to the app are only accessible on my local computer. Other programmers will only be able to view the changes I’ve made if I push said changes and they download those updates on their machines.
Similarly, I’ll be redirected to an error page if I view the app on my phone:
What if a tester or a product owner requests an impromptu demo of the app? What if they want to test it as is? Early constructive feedback is always nice. We can set up the app on their machine, but what if they’re working remotely? I’m pretty sure it would be a pain for them to install the development environment on their own. And what if they want to test it on their mobile phone? Having the environment installed on their computer doesn’t mean they can test the app on an actual phone.
Apparently there’s Ngrok. It’s a quick solution to publicly sharing local apps over the internet.
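For instance, assuming the dev server listens on port 8080 (the port here is just an illustration), sharing it is something like:

```shell
# Expose the locally running app through a public tunnel URL
# that ngrok prints to the terminal.
ngrok http 8080
```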
Here’s a screenshot of me trying out the free version:
And now I can test the app on my phone!
No installations needed on the tester / product owner / client side, just a single bash command from the programmer instead. So cool!
Until recently, what I mostly knew about test-driven development was the concept of red-green-refactor, unit testing application code while building it from the bottom up. Likewise, I thought that mocking was all about mocks, fakes, stubs, and spies, tools that assist us in faking integrations so we can focus on what we really need to test. Apparently there is so much more to TDD and mocks that many programmers do not put into use.
Justin Searls calls it discovery testing. It’s a practice that attempts to help us build carefully designed features with focus, confidence, and mindfulness, in a top-down approach. It breaks feature stories down into smaller problems right from the very beginning and forces programmers to slow down a little, enough for them to be more thoughtful about the high-level design. It uses test doubles to guide the design, which shapes a source tree of collaborators and logical leaf nodes as development goes. It favors rewrites over refactors, and helps us become aware of nasty redundant coverage in our tests.
And it categorizes TDD as a soft skill, rather than a technical skill.
He talks about it in more depth in a series of videos, building Conway’s Game of Life (in Java) as a coding exercise. He’s also talked about how we are often using mocks the wrong way, and shared insight on how we can write programs better.
Discovery testing is fascinating, and has a lot of great points going for it. I feel that it takes the concept of red-green-refactor further and drives long-term maintainable software right from the start. I want to be good at it eventually, and I know I’d initially have to practice it long enough to see how much actual impact it has.
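As a rough illustration of the top-down flavor (class names are hypothetical, and in Ruby rather than the Java used in the videos): the top-level collaborator is written first, and a hand-rolled test double stands in for a dependency whose interface is still being discovered.

```ruby
# Hypothetical sketch of discovery testing's top-down style: writing the
# top-level Game first forces us to discover the interface of a Grid
# collaborator that doesn't exist yet.
class Game
  def initialize(grid)
    @grid = grid
  end

  def tick
    @grid.next_generation # interface discovered while test-driving Game
  end
end

# A hand-rolled double standing in for the future Grid class.
class FakeGrid
  attr_reader :ticked

  def next_generation
    @ticked = true
    :evolved
  end
end
```

The double lets the test for Game pass before any real Grid exists, which is what pushes the design conversation down the tree one node at a time.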
There were two wonderful things about last year’s Agile Testing Days conference talks: content focusing on other valuable stuff for testers and teams (not automation), and having as many women speakers as men. I hope they continue that trend.
Here’s a list of my favourite talks from said conference (enjoy!):
- How To Tell People They Failed and Make Them Feel Great (by Liz Keogh, about Cynefin, our innate dislike of uncertainty and love of making things predictable, putting safety nets and allowing for failure, learning reviews, letting people change themselves, building robust probes, and making it a habit to come from a place of care)
- Pivotal Moments (by Janet Gregory, on living on a dairy farm, volunteering, traveling, Toastmasters, Lisa Crispin, Mary Poppendieck and going on adventures, sharing failures and accepting help, and reflecting on pivotal moments)
- Owning Our Narrative (by Angie Jones, on the history of the music industry so far, the changes in environment, tools, and business models musicians have had to go through to survive, and embracing changes and finding ways to fulfil our roles as software testers)
- Learning Through Osmosis (by Maaret Pyhäjärvi, on mob programming and osmosis, creating safe spaces to facilitate learning, and the power of changing some of our beliefs and behaviour)
- There and Back Again: A Hobbit’s/Developer’s/Tester’s Journey (by Pete Walen, on how software was built in the old days, how testing and programming broke up into silos, and a challenge for both parties to go back to excelling at each other’s skills and teaming up)
- 10 Behaviours of Effective Agile Teams (by Rob Lambert, about shipping software and customer service, becoming a more effective employee, behaviours, and communicating well)
Yes, we need to write tests because it is something that we think will help us in the long term, even though it may be more work for us in the short run. If written with care and with the end in mind, tests serve as living documentation, living because they change as much as the application code changes, and they help us refer back to what a feature does and doesn’t do, in as much detail as we want. Tests let us know which areas of the application matter to us, and every time they run they remind us of where our bearings currently are.
Tests may be user journeys in the user interface, a simulation of requests and responses through the app’s API, or small tests within the application’s discrete units; most likely a combination of all these types of tests, perhaps more. What matters is that we find some value in whatever test we write, and that value merits its cost of writing and maintenance. What’s important is asking ourselves whether the test is actually significant enough to add to the test suite.
It is valuable to build a good enough suite of tests. It makes sense to add more tests as we find more key scenarios to exercise. It also makes sense to remove tests that were necessary in the past but aren’t anymore. However, I don’t think it is particularly helpful to advocate for 100% test coverage, because that turns the focus into a numbers game, similar to how measuring likes or stars isn’t really the point. I believe it is better when we discuss among ourselves, in the context we’re in, which tests are relevant and which are just diving into minutiae. If our test suite helps us deploy our apps with confidence, if our tests allow us to be effective in the performance of our testing, if we are continuously able to serve our customers as best as we can, then the numbers really don’t amount to much.
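As a tiny illustration of tests reading like living documentation (the discount rule below is entirely hypothetical):

```ruby
# Hypothetical feature: orders of 100 or more get 10% off.
def discount(total)
  total >= 100 ? total * 0.9 : total
end

# The checks below read like a specification of what the
# feature does and doesn't do.
raise 'orders under 100 are not discounted' unless discount(99) == 99
raise '10% off at 100 and above' unless discount(200) == 180.0
```

Anyone reading these two lines later learns the boundary of the rule without digging through the application code, and the checks stay honest because they run with every change.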
For the past few weeks a number of programmers and I have been tasked to build an initial prototype for a system rewrite project, handed to us by management. The merit of such a project is a matter of discussion for another day; for now it is enough to say that the team has been given a difficult challenge, but is at the same time excited about the lessons we knew we would gain from such an adventure.
There have been several takeaways already in terms of technology know-how – dockerized applications, front-end development with Vue, repositories as application vendor dependencies, microservices – just some of the things we’ve never done before.
But the greatest takeaway so far is the joy of literally working together, inside a room away from distractions, the team working on one task at a time, focused, taking turns writing application or test code on a single machine, continuously discussing options and experimenting until a problem is solved or until it is time to take a break. We instantly become aligned on what we want to achieve, we immediately help teammates move forward, we learn from each other’s skills and mistakes, we have fun. It’s a wonder why we’ve never done much of this before. Perhaps it’s because of working in cubicles. Perhaps it’s because there are hardly enough available rooms for such a software development practice. Perhaps it’s because we’d never heard anything about mob programming until recently.
I’m sure it won’t be every day, since we have remote work schedules, but I imagine the team spending more days working together like this from here on.