At most companies, there seems to be a never-ending supply of meetings to attend, and complaining about their sheer volume is popular water-cooler conversation. We may not be able to reduce the number of meetings, but can we make them more effective (and thus, shorter)? Certainly!
For example, many people still start and end their meetings on the hour. This ensures that the people in the meeting have to wait for everyone who had a previous meeting that ended on the hour to arrive. And if that meeting slopped over its end time by a few minutes, the delay is even more. If you start your meetings five minutes after the hour and end them five minutes before the hour, it gives people time to get from one conference room to another, and maybe even stop for a coffee or a bathroom break.
And about meetings that go over their scheduled time: When you do that, you are taking time that does not belong to you. It may be convenient for you to extend the meeting on-the-fly, but if the attendees have subsequent meetings that you are making them late for, you are wasting the time of the people in those meetings, which is terribly rude. On top of that, Agile development is all about delivering on time. If we can't even bring a meeting in on time, how can we hope to deliver software on time?
Another example: Many Scrum teams use what Mike Cohn calls commitment-driven iteration planning. (Agile Estimating and Planning, p158). The team first figures out the number of hours available in the sprint, based on the percentage of their time they expect to be able to spend on stories and any days off that team members have. Then they take stories one at a time, break them into tasks, estimate how long each task will take, and subtract the time from the available hours. They continue to do this until they run out of hours. At that point, their sprint is loaded.
I have been in sprint planning meetings where figuring out the days off can take 15 to 20 minutes. The team goes around the table, with each team member saying something like "OK, I was going to take a few days off, maybe this sprint, or maybe in July. Or maybe I could take them in April." Eventually, after a two- or three-minute monologue, the team member figures out how may days he or she is going to be out during the sprint. Then they move on to the next person.
It doesn't have to be this way. They all know that they are going to be asked about days off in the sprint; this happens every planning meeting. Couldn't they figure this out in advance, rather than making everyone sit around twiddling their thumbs while they muse about when it would be good to take vacation? In some cases, there are coverage issues. This can be handled by announcing in the daily standup that you would like to take some time off in the next sprint, and if there is a problem, working it out before the planning meeting.
And in spite of all that has been written in the literature about not keeping a room full of people waiting while a poor typist updates a story in the sprint-tracking tool, we are still doing it regularly. Write it down and do it later!
I have seen lots of stories in the Agile tracking tool that refer to issues in the problem tracking system, or to documents in documentation repositories. It is frequently possible to get a URL for these issues or documents, and this should be put in the story, rather than just saying "Issue 12345" or "the code spec is in the repository under this project". If you make people search for information that you must of had a link to at the time you were looking at it, you waste the time of everyone who has to look at your story.
Wasting people's time in meetings isn't just rude; it severely impacts the flow of the meeting. While you are searching through the document repository to try to find the referenced code specification, the rest of the team starts checking their phone for messages, or talking about last night's ballgame, or they decide this would be a good time to get a coffee or take a bathroom break. It may take you five or ten minutes to get the meeting back on track.
It is easy to gripe about meetings, but it is not much harder to make them a lot less wasteful, which will go a long way towards making them less irritating.
Chaos Monkey is tool developed by Netflix to test the resiliency of their servers on the Amazon cloud when faced with failures. It periodically a terminates a random virtual machine that is running their application. Their automated error recovery is supposed to spin up a new virtual machine to replace the one that failed, and do so in a manner that appears seamless to customers.
Rather than just implementing the error recovery code, testing it once, and assuming that it will do the job, they are constantly testing it, figuring that if it really can seamlessly recover from failures, there should be no problem with randomly blowing away virtual machines.
Realizing that there is the possibility that the recovery could fail, they run Chaos Monkey between 9 AM and 3 PM on weekdays, so if a problem does occur, there will be people present who can deal with it. They also have a way for applications that they know are not ready for this to opt out.
This got me thinking about testing the agility of our teams.
One of the big reasons we do Agile development is so that we can change direction at sprint boundaries, if the priorities for delivering particularly stories changes. By finishing all their work by the end of the sprint, the team is able to change direction immediately.
Some teams have trouble understanding this. They resist breaking large stories into sprint-sized pieces, because they say it will increase the overall elapsed time to implement the change. This overlooks several things:
- If it takes you six months to implement the change, the customer's needs may have changed by the time you finish.
- If you try to test six-months-worth of coding and it doesn't work, you have to wade through all that code to find the error. If you are implementing it in Sprint-sized stories, you only have one-Sprint's-worth of code to look through.
- Priorities may change. The Product Owner may need to have the team implement some feature ahead of time to keep from losing a customer. If you are two months into a six-month product and you have dozens of modules open, it is very difficult to change direction.
For teams that cling to elapsed time as the only viable metric, I would propose engaging the Product Owner in the following exercise:
If you have several epics that have similar priorities, mix them up each Sprint. If the first epic has stories A, B, and C, and the second epic has stories P, Q, and R, and the team is currently working on story A, they will expect that in the next sprint they will be working on story B, and in the next, story C. Instead, have them work on story A, then P, then B, then Q, etc.
This will make transparent how agile they are, and will help get them out of the habit of assuming that if they don't finish a story in a sprint, they can just roll it over to the next with no consequences.
Of course, just as Netflix runs Chaos Monkey only during weekdays when people are present, you would want to be careful about how you do this exercise:
- Don't do it during a deadline crunch.
- Let the team know a sprint or two in advance that you are going to be doing this, and that you expect them to be able to do this.
- Discuss in the Retrospective how well this worked, and what they can do to make it work better.
- A few stubborn teams may say that this is stupid and a waste of their time. You may have to remind them that the Product Owner is responsible for the priority of stories in the backlog, and the team is responsible for committing only to what they can accomplish a Sprint.
Even teams that already buy into the idea of being able to change direction at sprint boundaries may discover impediments to doing so that they didn't see before.
Just as Netflix exercises their recovery software so they know it will work when they get a real failure, teams should exercise their agility regularly, so that when a critical customer demand comes along, they are practiced and ready for it.
Poka-yoke (the final "e" is pronounced like "eh?" in English) is the Japanese term for "error proofing", formalized by industrial engineer Shiego Shingo as part of the Toyota Production System. (He is said to have picked the term "error proofing" rather than "fool proofing" [baka-yoke] to underscore that the problem was not foolish workers, but the fact that everyone makes mistakes from time to time when given a chance.)
In the Toyota Production System, poka-yoke deals with designing vehicles and their assembly processes in such a way that it is difficult to assemble them incorrectly.
Here is an example from the 1940s of the need for poka-yoke: My father worked at Kaiser-Frazer when they used to make a car called the Frazer. It had "FRAZER" spelled out in individual chrome letters on the front of the hood, like this.
Posts that stuck out the back of each letter went through holes in the hood to hold the letters in place. The positions of the posts were standardized, making all the letters interchangeable. Sometimes, when an assembly worker was having a bad day, he would end up making ZEFFERs instead of FRAZERs. Automakers subsequently changed the letters so they could not be interchanged, either by putting the posts in a different position on each letter, or by making the logo a single unit, like this.
A more modern example is the floppy disk drives that used to be in PCs. They used an unkeyed four-pin power connector that could be installed the right way, which would make the drive work, or the wrong way, which would make it go up in smoke. You had to remember that the red wire went away from the data connector, unless you were working with the odd brand where the red wire had to go toward the data connector. (I fried several floppy disk drives this way.) That was the last unkeyed connector I remember seeing on a PC, so the PC industry evidently adopted poka-yoke.
There are all kinds of modern examples, like polarized power plugs, and IKEA furniture, which is difficult to assemble wrong, and the same concept applies to software development.
For example, suppose you have a routine that performs several different functions. (That, in itself, is probably a violation of the Single Responsibility Principle, but suppose there is a good reason for it to be that way.) Some of the functions require few parameters, while others require many. Callers have to remember to pass the right number of parameters, including dummy parameters. So you get something like this:
If you miscount the commas, you have a problem. It would be much less error-prone to have multiple entry points, each with a fixed number of parameters.
The same concept applies to development tools. In one build system I worked with, you have to type in the names of all the modules that are part of your change. If the build fails because you left out one of the changed modules, you have to re-submit the build request, again typing in the names all the modules. If you have 30 or 40 modules in your change, you might mistype or leave out one of the names you got right in the first request, causing the build to fail again. If you could just call up the first request and say you wanted to add a module, there would be much less chance of error.
Another case is anywhere that you have to enter the same information into two different places. Eventually, you will forget to update one of them, or will mistype the information while entering it the second time. If the systems can be made to talk to each other, this greatly lessens the chance for error.
One oft-mentioned feature of Lean manufacturing is the andon light or the andon cord. The idea is that any employee on the assembly line who encounters a problem pulls the andon cord, the line is stopped, and the light comes on to indicate where the problem is.
By the way, andon is the Japanese word for paper lantern, which has apparently been generalized to mean any lantern. Here are a couple of good references about the history of andon lanterns:
Although andon lights are frequently mentioned in the Agile development literature, some of their most important points are sometimes glossed over.
In A Study of the Toyota Production System, Shigeo Shingo, an industrial engineer noted in connection with Toyota’s SMED (Single Minute Exchange of Dies) program that reduced setup times for punch presses from many hours to less than a minute, said:
The andon is a visual control that communicates important information and signals the need for immediate action by supervisor. There are some managers who believe that a variety of production problems can be overcome by implementing Toyota’s visual control system. At Toyota, however, the most important issue is not how quickly personnel are alerted to a problem, but what solutions are implemented. Makeshift or temporary measures, although they may restore the operation to normal most quickly, are not appropriate.
The key point is that each time the line is stopped because of the andon, the team strives to make sure that the same error does not happen again. Shingo states this more forcefully when he says “At Toyota, there is only one reason to stop the line—to ensure that it won’t have to stop again.”
The andon lights used by continuous integration teams come close to this philosophy. The light comes on when a build fails, or the automated tests that run as part of the build fail. Everyone on the team stops what they are doing and works on fixing the build.
What is missing sometimes is the idea of making sure that it doesn’t happen again, perhaps by implementing a Poka-yoke solution that will prevent the error from happening again.
When you read about the Toyota production system, what is striking is the commitment to doing whatever it takes to eliminate waste and errors, even at the expense of some short-team pain.
This is difficult to do in a mature company that is trying to adapt to Agile development, because it is a major change, and various departments are not used to working closely together. But it is essential to eliminate waste.
Online game companies are frequently on the forefront of technology, both the technology of the games, as well as how they are developed. For example, IMVU, a 3D online chat website, has been a leader in continuous deployment, deploying as many as 50 changes a day.
Another development leader was Cmune, a Chinese company that used to make the MMO (Massively Multiplayer Online) first-person shooter game, UberStrike. (UberStrike was "sunsetted" in June 2016.)
In case you are not familiar with first-person-shooter games, there are various levels in a game, each one more difficult than the previous ones. As a player proceeds through the game, gaining more points (however this is achieved in the mechanics of the particular game), the player ascends to harder and harder levels.
Each level has a map, which defines the terrain and buildings that the player has to negotiate while playing the level. Designing a map is a two-pronged affair. First the terrain and buildings have to be defined in such a way as to be fun to play. Next they have to be modeled and textured, so they look realistic.
Traditionally, both steps are completed before the level is made available to players. If the majority of players decide that the level is too easy or too difficult, then all of the effort in modeling and texturing it is wasted.
Cmune decoupled these two steps for UberStrike with their Bluebox Maps program. Proposed level maps were made available to interested customers. They were not textured (they had a uniform blue color, thus the name of the program), and high-quality modeling had not been completed. Also, game mechanics, such as shooting, were not implemented. Here is an example of a Bluebox map.
Participating customers could download a Bluebox map and try it out, in order to determine whether it would be fun to play. Based on the feedback Cmune receives on a map, they either continued with the high-quality modeling and texturing, or they discarded the map.
Developers outside of the game world can learn from the Bluebox program. When we think about getting feedback from customers, we usually think of showing them completed features. Since we break even large epics down into sprint-sized pieces, the feature we are demonstrating to the customer may be a small, incremental change, but it is generally complete.
In some cases, it may be beneficial to break changes into even smaller pieces, large enough that the customer can see if we are going in the right direction, but not polished enough to actually release.
This must be done with caution, particularly if we are using continuous integration, or other SCM methodologies where everything gets checked into the main branch. (For some hints, see my previous post, Small Stories, Legacy Code, and Scaffolding.) Perhaps a feature flag can be added, so when it is turned on, it lets the customer go as far as the part being demonstrated, and then stops.
One of the buzzwords of Agile development is failing fast. The sooner you can find out that what you are developing is not what the customer wants, the sooner you can change course, without a lot of wasted development time.
Page 4 of 5