Note: This is a cross-post of an article I wrote for Medium.
When creating a design, it can be difficult to ascertain how well you’re doing or where you might be having issues. Often, our goals in design rely on concepts that are notoriously hard to measure. How do you measure whether you’ve created an “enjoyable experience” or made a game feel “smoother”? Contrast this with something like sales, where you have several concrete, measurable factors. The number of sales made, profits, and expenditures are all easily measured, and together they tell you whether sales have been successful or unsuccessful.
The key difference between these two cases is the existence of reliable numbers. In the former, you must rely on identifying the qualities of something (e.g., “enjoyable” or “smooth”), whereas in the latter you derive meaning by looking at various quantities (e.g., dollars, distance, or time). In research, there are methodologies for distilling information through each of these lenses, known as qualitative methods and quantitative methods, respectively.
To gain an effective understanding of your design, you’ll often have to apply a mixture of both methods to get the best results. Regardless of the methods you use, measuring your design through research is critical to creating a successful game.
As I’ve been designing the gameplay for Diabolical!, I’ve used two primary methods to gather data to help me determine how well the design is working. Both methods are primarily qualitative in nature, though I do have some quantitative measurements thrown in for me to benchmark against. Using qualitative methods does not preclude you from gathering quantitative data as well. The two forms of research I use are:
- Direct observation of play testers
- Surveying play testers
Together, these create a potent (and cheap!) combo for me to understand both what players are doing in the game and how they perceive what they were doing. As a designer, it’s important for me to understand both to better inform my decision-making. In this post, I’ll cover how I gather information through observing a play test. In a future post, I’ll cover more about the survey I use after a test.
Direct observation through play testing
Play testing is the most crucial method for creating a successful game (unless you’re the maker of We Didn’t Playtest This At All). Observing a play test is a great way to understand how people less familiar with the rules will interpret them and make decisions.
As far as tools go, I typically only use a prototype of the game, a printed rulebook, and a notebook with a pen or pencil. In the notebook, I annotate where players struggle, what went well, or anything else I think is worth remembering later. After the test, I transfer all of my notes into Evernote, where they are consolidated and searchable, and where I can easily share them with James.
When running a play test, I try to say no more than what is necessary and leave as much of the talking as possible up to the players. I do this because it allows me to extract the maximum amount of information possible from a test. If a player asks me a question, I frequently ask them what they would do if I weren’t there or ask the other players what they think. Hearing the players confer over the meaning of a specific rule gives me valuable insight into how they are considering the given problem. Maybe the rule needs a slight verbiage tweak to be more clear, or perhaps it’s being significantly misinterpreted and needs to be rewritten entirely. In reality, players won’t have the designer of the game sitting next to them ready to answer whatever questions they have. In order to emulate this environment successfully, it’s important I avoid over-explaining things in the context of a test.
Along these same lines, I always have a printed rulebook for the players to learn the rules from and to reference while playing. By having a printed rulebook, there is a consistent, controlled way to explain the rules from test to test, which reduces variance. If I try to describe the rules to each testing group, I might forget something important or I might use an example that is less successful than an example I used in a different test. This variance makes it more challenging to assess what isn’t working well. Is it the rules? Or the way I described them? Eliminating unnecessary variance in a test will help keep the data cleaner and easier to interpret.
I also always time how long each test takes, which is a quantitative measure. Timing the play tests gives me an understanding of how long the game is taking for new players versus how long I want it to take. Ideally, Diabolical! will take between 60 and 90 minutes to play. When I first began testing, it was taking much longer, closer to 2 hours or more. By timing the tests, I knew I had to streamline and adjust the structure of the game to reduce its length.
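For anyone who wants to tally timings like these outside of a notebook, here is a minimal Python sketch. It assumes the only data recorded per test is its length in minutes; the function name and the sample durations are hypothetical and are not real Diabolical! results. Only the 60 to 90 minute target comes from the text above.

```python
from statistics import mean

# Target play time in minutes (the 60-90 minute goal mentioned above).
TARGET_MIN, TARGET_MAX = 60, 90


def summarize_play_times(minutes):
    """Summarize test lengths against the target window.

    `minutes` is a list of play test durations in minutes.
    """
    return {
        "tests": len(minutes),
        "average": mean(minutes),
        # Tests that landed inside the target window (inclusive).
        "in_target": sum(1 for m in minutes if TARGET_MIN <= m <= TARGET_MAX),
        # Tests that ran longer than the upper bound.
        "over_target": sum(1 for m in minutes if m > TARGET_MAX),
    }


# Hypothetical durations for illustration only.
example_durations = [125, 110, 95, 85]
print(summarize_play_times(example_durations))
```

Comparing the average and the over-target count from one batch of tests to the next is one simple way to see whether rule changes are actually shortening the game.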
Also while observing, I keep track of how many times players make errors and whether those errors are caught by other players. Keeping track of this helps me to identify areas of the game where the rules may need to be made clearer. For example, in early tests I found that when players were supposed to “remove a token” from a card, they often interpreted that to mean that they should “remove a token and keep it,” rather than “remove a token and discard it,” which was my intention. After this came up in my notes a few times, it was clear that I needed to describe my intent more clearly in the rules.
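The same kind of error log can be counted up in a few lines of Python. This is a sketch under stated assumptions: each note is reduced to a short rule label plus whether another player caught the mistake, and the `RuleError` class, the `tally_errors` function, and the sample entries are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class RuleError:
    rule: str                # short label for the rule that was misplayed
    caught_by_players: bool  # True if another player corrected the mistake


def tally_errors(errors):
    """Return (rule, total errors, uncaught errors) tuples, most frequent first."""
    total = Counter(e.rule for e in errors)
    uncaught = Counter(e.rule for e in errors if not e.caught_by_players)
    # Counter returns 0 for rules that were never left uncaught.
    return [(rule, count, uncaught[rule]) for rule, count in total.most_common()]


# Hypothetical notes for illustration only.
notes = [
    RuleError("remove a token", caught_by_players=False),
    RuleError("remove a token", caught_by_players=True),
    RuleError("turn order", caught_by_players=True),
]
for rule, count, missed in tally_errors(notes):
    print(f"{rule}: {count} errors, {missed} not caught by other players")
```

Rules that show up often, and especially those that other players rarely catch, are the likeliest candidates for a rewrite.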
By saying as little as possible, keeping printed rules on hand, timing tests, and keeping good notes, I ensure that I get as much as I can out of a play test. These methods help me stay on the right track as I develop the rules to Diabolical! In the future, I plan to start recording play tests. This will allow me to share them on the website as well as more closely observe where players are succeeding or having issues.
Thanks for reading! Keep an eye out for part 2, where I’ll discuss the survey I use as well as provide a copy of it for you to use in your testing!