Usability Testing at the University of Arizona Library: How to Let the Users in on the Design

Ruth Dickstein and Vicki Mills


While the business world has made great strides in focusing on customer service by studying customers' needs and behaviors, libraries have tended to structure their holdings and services around what they believed was good for their customers. It is the "librarians know best" syndrome, and it is pervasive throughout our profession. At the University of Arizona Library, we have been trying to change that model. All the staff are actively working to transform our institution into a user-focused library. We are doing this by asking our users what they need and by listening carefully to what they say. We are continuously monitoring what we are doing and asking what we could be doing better. One important change we are making in our behavior is to evaluate our activities through our customers' eyes and ask ourselves what effect our actions will have on customer service and satisfaction.

A combination of factors came together in 1997 to prompt us to work on redesigning SABIO, the library's information gateway (see figure 1). In addition to our library's increased emphasis on user needs assessment, we were concerned about the rapid expansion of Web-based indexes and full-text databases and the obvious frustration of our customers with using the current gateway. A project team, Access 2000, was created to redesign SABIO. The team was charged with including our customers in the redesign process in order to create an end product that would increase their satisfaction with, and success in, finding the information sources they needed via the gateway.


Figure 1. SABIO Gateway 1997


Two years later, we have a unique and highly successful information gateway for our library. We used our customers, primarily our students, to guide us in the design. We believe that we have created a user-centered site. In other words, it is a design that fits the user, rather than one that makes the user fit the design. This article will describe the road we took from the beginning of our project in 1997 to the present. It will outline the different usability evaluation methods we used, how we conducted usability tests, how we analyzed the data, and how we continually redesigned the Web site in response to our user input.

Access 2000: The Design Team and Beginnings

Access 2000 consisted of five librarians, one systems expert, and a graphic artist. None of the team members had any experience with or significant understanding of user-centered design and usability methodologies when we began. Of necessity, the first order of business was to read and educate ourselves. The writings of Jared Spool, Jakob Nielsen, and Jeffrey Rubin were what we found most helpful.1 As we read these works, we also began to gather information from our users. We did this in several ways: a user satisfaction survey, five focus groups (one each for faculty, graduate students, and library staff, and two for undergraduates), and an analysis of customer feedback to the library from a variety of sources.2 We also visited other Web sites, both library and commercial, collecting ideas that we thought we could use.

Based on our readings, customers' comments, and review of other library and commercial Web sites, Access 2000 developed a set of design guidelines (see appendix A).3 Our site design attempted to follow these guidelines. They helped ground us and give us direction. Whenever we ran into a problem, we would go back and review the guidelines.

As we look back at the design guidelines, we believe that they were sound and gave us good direction in our subsequent work. (In fact, we are impressed with how astute we were in developing those guidelines so early in our project.) During the usability testing, we discovered that when parts of the design were not working correctly, it was often because we had not followed one or more of our guidelines. For example, there was a guideline relating to consistency of language between pages. On occasion, we would use a term or phrase on a link to a page, but not repeat that word or phrase on the connecting page. This lack of continuity confused the test subjects and alerted us to a design problem and a violation of our guidelines.

Despite our best intentions, we found that several of the guidelines were in conflict. These guidelines were:

In an effort to observe the guidelines about making links predictable, we put lists of words under each button to cue users about where that link would lead. For example, on the first screen under one major link labeled E-Resources, we listed the following: "Indexes to articles by subject, electronic reference sources (e.g., encyclopedias), electronic journals and texts, subject guides and exhibits."

In trying to "make the link predictable", we had violated the guideline to "make screens . . . simple" and also to "make screens easy to scan". After analyzing the results of the usability tests for the second prototype, we completely changed our design for the third. In this design, we followed the two guidelines about simple, uncluttered screens and easy to scan screens, but broke the minimal graphic rule. In order to make the links predictable and to make the page simple, uncluttered, and readable, we used a few carefully chosen words along with meaningful graphics. We found there was no way to follow all four of these guidelines at the same time and meet the needs of our students. Moreover, we came to understand that we were wrong about our guideline on minimal graphics.

The creation of guidelines is a vital initial step in good Web design. The guidelines are instrumental in keeping designers on track. They are essential when conducting one of the earliest types of usability methods, heuristic evaluation.

Usability Methods

Usability evaluation is the mechanism that provides designers the crucial information central to the user-focused design process. Usability methodologies should be a regular part of site maintenance, but they are particularly useful at the early stages of site design. Ongoing user input encourages an iterative design process. As Garlock and Piontek observed, "With the iterative approach, a site prototype can be created, viewed, and evaluated by a small subset of users who identify problems and suggest changes. Once immediate problems are corrected and suggestions are implemented, the site is released to a larger audience, who become part of the design and quality control team by using and evaluating the resources and sharing their feedback with the developers."4 The iterative process is continuous and always includes user feedback and evaluation. While the most commonly known type of usability evaluation is the formal usability test -- the tester observes and records the user performing specific tasks -- there are other methods of measuring usability. The Access 2000 team used several of these other methods including a combination of heuristic evaluation, design walk-through, and card sorting.

Heuristic Evaluation and Design Walk-Through

Heuristic evaluation is a systematic inspection of a user interface to examine whether the design complies with recognized usability principles.5 Design "walk-throughs" are used to explore how a user might fare with a product by envisioning a user's route through an early concept or prototype of the product.6 A walk-through is intended to anticipate problems that might occur before formal testing with users begins.

Access 2000 used a combination of heuristic evaluation and design walk-through before testing any part of the new design with users. First, team members reviewed each page against the design guidelines ("heuristics"). Then we checked every link to make sure that the proper connections worked. Using real-life scenarios, we "walked through" the prototype following a range of choices we thought our users might take. We looked for gaps in logic, pages that were missing, inconsistencies in terminology, and use of library jargon. At all times we tried to ensure that pages were kept simple and uncomplicated and that links were obvious and not hidden in text.

Conducting heuristic and walk-through evaluations did uncover problems, inconsistencies, and violations of guidelines, information that we then used to improve our design before we tested it on users. However, no matter how thoroughly we, the designers, evaluated and examined our site, we were never able to anticipate all the problems that our users encountered when they actually used the site.

One other very important outcome of heuristic and walk-through evaluations is that they provided us a way to look at our work less subjectively. Success in usability testing requires Web designers to be willing to step back and view their work in an objective and critical manner. Heuristic evaluation and design walk-throughs are a first step in this process.

Card Sorting

Card sorting is a method for testing organization and menu structure. A typical application of this method is to ask users to sort cards with concept or menu terms into meaningful groups and then label or name each group.7

Access 2000 used card sorting to help develop terminology and hierarchy for the organization of the "Indexes to Articles" page. At the time Access 2000 began its work, the library was subscribing to over fifty Web-based indexes and about as many CD-ROM products. Because we were aware of the great difficulties customers had in locating indexes on the old indexes page (which was organized alphabetically by title), we thought that grouping the indexes together by broad subject would be easier for our users. We believed card sorting would guide us in how to group the indexes and what to call the subject groupings. We wanted to use language that students recognized and understood.

A set of index cards was prepared with the name of each database (Web or CD-ROM) on a card. If the title of the index did not readily indicate subject coverage, a parenthetical topic word was added, such as ERIC (education), or ABI/Inform (business and management). There were a total of eighty-two titles in the card set.

Test subjects were given a set of cards to sort, group, and then label. The contents of each group and group name were written on a form. It was our hope that the test subjects would identify ten or fewer clear categories to use for the index page and that they would give us meaningful terms for each of these categories. However, test subjects preferred more than ten categories. In fact, they created from thirteen to thirty-seven different categories, and there was agreement in terminology for only a few.
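Results recorded on forms like these can be tallied in a simple way: count how often each label is used and how often two titles are placed in the same group. The following Python sketch is illustrative only. The function name, the data layout, and the sample forms (including the placeholder titles "Index A" and "Index B") are hypothetical and are not taken from the Access 2000 test data.

```python
from collections import Counter
from itertools import combinations


def tally_card_sorts(sorts):
    """Count label usage and title co-occurrence across participants.

    `sorts` holds one dict per participant, mapping the label the
    participant wrote down to the list of titles placed in that group.
    """
    label_counts = Counter()
    pair_counts = Counter()
    for groups in sorts:
        for label, titles in groups.items():
            # Normalize labels so "Medicine" and "medicine " count together.
            label_counts[label.strip().lower()] += 1
            # Each pair of titles in one group counts as one co-occurrence.
            for pair in combinations(sorted(titles), 2):
                pair_counts[pair] += 1
    return label_counts, pair_counts


# Hypothetical forms from three participants (not real test data).
example_sorts = [
    {"Medicine": ["Index A", "Index B"], "Education": ["ERIC"]},
    {"Health": ["Index B", "Index A"], "Teaching": ["ERIC"]},
    {"Medicine": ["Index A"], "Nursing": ["Index B"], "Education": ["ERIC"]},
]

labels, pairs = tally_card_sorts(example_sorts)
print(labels.most_common())
print(pairs[("Index A", "Index B")])
```

A tally of this kind makes low agreement on labels visible at a glance, and it shows which titles participants consistently keep together even when they name the group differently.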

Initially we were disappointed with the results of the card sorting, and since we did not get what we expected, we ignored much of our users' advice. In hindsight, we realize they were giving us useful information. Our design team did not want to create a long list of subject categories, even though this was clearly what the students' feedback indicated.

This was an example of forgetting to put the user in the center and falling back on "librarian knows best" behavior. We went ahead and developed the page we wanted, an Indexes page with twelve broad categories, such as social science, humanities, life sciences, etc. -- words supplied by librarians rather than students. We made all the indexes fit into those categories, although some were not an easy fit. This was the indexes page that we brought up to the public.

After several months, we noticed that many library users were confused by the organization and terminology we had created. Users complained that they could not find indexes for some subjects, such as medicine, literature, or religion. This was because these subjects were all but hidden under the broad categories of Agriculture and Life Sciences for medicine and Humanities and Fine Arts for literature and religion. Our users' unsuccessful experiences with the indexes page we had designed forced us to reexamine our work and return to the results of the card sorting exercise. We needed to find a way to list each subject independently, rather than just group them under broad categories. The solution was to use a scroll box that allowed as many specific subjects to be listed as needed (see figure 2). The scroll box also enabled synonyms to be included for some subjects, such as health and medicine, both of which were used by students in the card sorting exercise. As a result of this revision, we now have an index page that is organized according to student suggestions with terms that they understand and use.

Figure 2. Subject Scroll Box
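The idea behind the scroll box, a flat alphabetical list in which synonyms lead to the same indexes, can be sketched in a few lines of Python. This is only an illustration of the concept, not the way SABIO is actually built. The function name, the data layout, and the placeholder index titles are assumptions.

```python
def build_scroll_box_entries(subject_indexes, synonyms):
    """Return an alphabetical list of (term, indexes) entries.

    `subject_indexes` maps a subject term to its list of index titles.
    `synonyms` maps an alternate term to the subject term it mirrors.
    """
    entries = dict(subject_indexes)
    for alternate, subject in synonyms.items():
        # A synonym points at the same indexes as its main subject term.
        entries[alternate] = subject_indexes[subject]
    return sorted(entries.items(), key=lambda item: item[0].lower())


# Hypothetical subjects and placeholder index titles.
subject_indexes = {
    "Medicine": ["Index A"],
    "Literature": ["Index B"],
    "Religion": ["Index C"],
}
synonyms = {"Health": "Medicine"}

scroll_box = build_scroll_box_entries(subject_indexes, synonyms)
for term, indexes in scroll_box:
    print(term, indexes)
```

Because each synonym is its own entry, a student who looks for "Health" and one who looks for "Medicine" both arrive at the same list without needing to know which broad category a librarian would have chosen.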

Formal Usability Testing

Formal usability testing is the observation and analysis of user behavior while users use a product or product prototype to achieve a goal. Usability testing has been a method for improving product usability and design in the computer industry for years. It has been slow to make inroads into the library world.

Libraries may have been reluctant to conduct formal usability studies in the past for several reasons. First, librarians have not traditionally viewed themselves as designers of systems. The Web is changing this, and we are being exhorted to become more involved in designing systems that are usable, that work well and easily for our customers, so easily that they require little instruction in their use.8

Another reason that libraries have been slow in adopting usability testing is that they may presume that these tests are difficult to conduct, costly, and time consuming.9 For many years, formal usability testing was the province of human factors experts who had at their disposal expensive video equipment and test rooms with hidden observation windows. However, at the University of Arizona, we have been able to conduct nine cycles of usability tests with a small group of people, on a limited budget, and in a relatively short period of time. As the team became more adept at conducting these tests, we were able to plan, conduct, and analyze a set of tests in several weeks. We have discovered that usability tests can be successfully conducted on a test group of between eight and twelve users and that enough information can be extrapolated from them to be representative of the general population.10 It was usually apparent by the fourth or fifth test that the tested feature was either a problem or a success. However, we almost always learned something valuable and unique from each of our test subjects.
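The observation that a problem or success is usually apparent after four or five tests fits a simple cumulative-detection model that is often used in the usability literature. The Python sketch below is not the method we used. It assumes that each participant independently reveals any given problem with the same probability, and the rate of 0.31 is an assumed value chosen only for illustration.

```python
def share_of_problems_found(participants, detection_rate):
    """Expected share of usability problems seen at least once.

    Assumes every problem is detected independently by each participant
    with the same probability `detection_rate` (a simplifying assumption).
    """
    return 1 - (1 - detection_rate) ** participants


# 0.31 is an assumed per-participant detection rate, not a measured one.
for n in (1, 4, 5, 8, 12):
    print(n, round(share_of_problems_found(n, 0.31), 2))
```

Under these assumptions, four participants would be expected to reveal about 77 percent of the problems and five about 84 percent, with diminishing returns after that. Real detection rates vary by site and by task, so the numbers should be read as a rough guide only.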

The key to successful usability testing is careful planning and preparation. The following are the steps we used for formal usability testing.

1. Decide what to test.

2. Write a set of scenarios that will require the users to perform the tasks you want to test.

3. Write a script for administering the test -- this is necessary for consistency. It is particularly important if more than one person will be conducting the test.

4. Test the test. Try it on some users -- find out if the scenarios are comprehensible. Ask yourself if the scenarios are having users perform the appropriate tasks and if you are actually testing the functions that you wanted to test.

5. Train the testers and recorders.

6. Gather volunteers to be tested.

7. Make sure you have a quiet place to do the testing. It's a good idea to conduct the practice test in the actual room that you will use for the real testing.

8. Conduct the tests.

9. Record the test results as soon as possible after the test is completed.

10. Analyze the results and determine how to correct the design problems. Redesign based upon usability evaluation results.

Recording the Test and the Path Form

In addition to administering the test, we developed a method for recording it. While practicing our skills at conducting the test, we realized that two people were needed for this process. One person, the questioner, conducted the test, and a second person, the recorder, wrote down what the subject said. We chose not to use a tape recorder for these sessions, but rather to have the recorder capture as much as possible of what each test subject said. The questioner had a Path Form, which contained the questions and the potential screen choices available. As a test subject clicked on the screens, the questioner noted the path the subject followed, placing a "1" next to the choice for the first click, a "2" next to the second choice, and so on. Subjects did not always follow a linear path, and this method allowed us to visualize the backward and forward steps a subject used. The Path Form also contained questions that provided demographic data, such as student status, major, previous library instruction experience, and Web experience.13
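Once the numbered choices from a Path Form are transcribed in click order, backward and forward steps can be summarized mechanically. The following Python sketch shows one way this might be done. The function name and the sample path are hypothetical, and treating any repeated screen choice as a backward step is a simplifying assumption.

```python
def summarize_path(clicks):
    """Summarize one task's click path from a Path Form.

    `clicks` lists the screen choices in the order the subject clicked
    them, i.e. the choices marked "1", "2", "3", ... on the form.
    """
    seen = set()
    revisits = []
    for choice in clicks:
        if choice in seen:
            # A repeated choice suggests the subject backed up and retried.
            revisits.append(choice)
        seen.add(choice)
    return {"steps": len(clicks), "distinct": len(seen), "revisits": revisits}


# Hypothetical path for one task (not real test data).
summary = summarize_path(["Home", "Catalogs", "Home", "Indexes"])
print(summary)
```

Comparing these summaries across subjects for the same task shows quickly which screens send people back to where they started.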

We found that asking subjects an open-ended question about what they liked and did not like about the site was a good way to end the test session. We didn't originally write this question into our script, but some of us just naturally began asking this, so it became formalized. We frequently gained additional insight about our design from these comments.

When the test was completed, subjects were paid and thanked for their assistance. After the subjects left, the two team members reviewed what had occurred. Problems the subjects had with the prototype were analyzed and summarized.

As soon as possible after the test was completed, the recorder filled in an online Path Form, which included both the paths followed for each task and the comments made. After all the tests had been recorded, each team member read and analyzed the results. Then we met as a group and shared our individual analysis of the test results. We developed a list of design successes, problems, and possible solutions to the problems. Based upon the result of each test cycle, we either redesigned or pronounced a portion of our design complete.

What We Learned from User-Centered Evaluation and How it Changed Our Design

Our initial attempts at screen design proved almost totally ineffective. Users were uniformly confused by our first two attempts. While this was very disheartening, it forced us to reconsider what we were doing, and helped us get beyond the traditional library thinking. We realized that we were still designing for librarians and not for the customers. The result was a wholesale rethinking of our work. We threw out all of our early work and started all over. This was a difficult but crucial decision.

Some of the problems that surfaced with the first two prototype designs were caused by librarians expecting users to understand how library information is organized and to know the meaning of standard library terminology. "Catalog", "index", "resources", "databases", and "reference" are meaningless to many students. We learned that if students have no idea why or when they should use an index, they will not choose a link labeled Index, no matter how well designed the Web page is. We decided to use a combination of graphics and a few carefully chosen terms to represent the six major links to library research information.14 For example, with the graphic for the catalog, we added the term "Catalogs", which the experienced user would recognize, and the words "Books & More" and also "What We Own" to teach inexperienced users that "Catalogs" is the place to search for books and other items the library owns. Likewise, the link to indexes pairs the term "indexes" with "articles" and includes an image of a magazine and newspaper in the graphic (see figure 3).




Figure 3. New SABIO Graphics

In addition to the use of graphics and simple terminology, Access 2000 also developed a series of help features for the user who had no idea where to begin.

Our first help feature was the idea of our systems specialist, Gene Spesard. He puzzled over how the system itself could help the person who was truly stymied. His answer was the "HOW TO FIND" pop-up box (see figure 4).


Figure 4. New Help Feature



This box lists many basic library materials or search requests and either links directly to the appropriate search page or teaches the user how to search for that item. For example, the "How to find MAGAZINES owned by the library" connects to the catalog with the search query set for a journal title search. The "How to find MAGAZINE articles" leads to a page with a brief explanation about the purpose of an index and then connects to the "Indexes to Articles" page.
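The behavior of the box can be thought of as a lookup table in which each entry either launches a preset search or shows a short explanation before linking onward. The Python sketch below models only the two entries described above. The field names and the `resolve` function are hypothetical and do not describe how SABIO is implemented.

```python
# Each entry either starts a preset search or explains, then links onward.
HOW_TO_FIND = {
    "MAGAZINES owned by the library": {
        "action": "search",
        "target": "catalog",
        "search_type": "journal title",
    },
    "MAGAZINE articles": {
        "action": "explain",
        "target": "Indexes to Articles",
    },
}


def resolve(entry):
    """Describe what happens when a user picks `entry` from the box."""
    item = HOW_TO_FIND[entry]
    if item["action"] == "search":
        return f"Open {item['target']} with a {item['search_type']} search"
    return f"Show explanation, then link to {item['target']}"


print(resolve("MAGAZINES owned by the library"))
print(resolve("MAGAZINE articles"))
```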

In addition to the HOW TO FIND box, we developed a number of tips pages located at the point of need. These pages are called "Tips" because the literature indicates that users are more likely to click on a link labeled "Tips" or "Hints" rather than "Help". According to the User Interface Engineering Web site: "The word 'Help' implies that the user must admit failure."15

On the other hand, tips should be short and to the point. This is what we tried to do when we created our tips pages, brief helping screens that we placed at the point of need. In creating these pages, we avoided using lengthy paragraphs of text, choosing to use bullets, charts, and as much white space as possible. We tried to make our tips pages attractive, inviting, and easily scanned.

The last type of help page we developed was a series of subject pathfinders listed under an icon labeled "Research by Subject".16 These pathfinders guide beginning researchers to basic sources, both print and electronic. Access 2000 developed the model and template and asked all subject specialist librarians to develop a pathfinder for their subject areas. These subject pathfinders have proven to be quite popular with both students and reference staff. Log statistics indicate that they are being used heavily.

At the same time that Access 2000 was working, another group at the University of Arizona Library was developing a series of instructional tutorials called RIO (Research Instruction Online). These award-winning instruction modules provided the final piece of the online help features.17 While RIO was designed to be used as a part of class instruction, it is also available to all users. Many of the Tips provide both quick suggestions and a link to RIO for the user who wishes to take the time to learn about parts of SABIO in more detail.

The Value of User-Centered Design Methods

Usability testing methods, by their very nature, keep the user in the forefront. User testing helped us, the designers, distance ourselves from the product. Rather than focusing on what we thought/liked/needed, we could focus on what the user thinks/likes/needs (see figure 5).



Figure 5. SABIO Gateway 2000 (www.library.arizona.edu)

This constant interaction between designers and users encouraged us to be open and to design the product for users rather than for ourselves. Our library users are more successful in their use of SABIO because they guided us throughout our design process. Personal observation, classroom encounters, and log analysis data support this.18
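The log measure behind this claim, hits on the home page relative to hits on secondary pages, is straightforward to compute from a list of requested paths. The Python sketch below uses invented sample paths rather than our log data. The function names and the assumption that the home page is requested as "/" are illustrative only.

```python
def home_to_secondary_ratio(paths, home_path="/"):
    """Return home-page hits divided by hits on all other pages."""
    home = sum(1 for path in paths if path == home_path)
    secondary = len(paths) - home
    if secondary == 0:
        raise ValueError("no secondary-page hits in this sample")
    return home / secondary


def percent_reduction(before, after):
    """Percent drop from the `before` value to the `after` value."""
    return (before - after) / before * 100


# Hypothetical request paths, not the library's real log data.
before_paths = ["/", "/indexes", "/", "/catalog", "/", "/tips"]
after_paths = ["/", "/indexes", "/catalog", "/tips", "/subjects", "/indexes"]

before = home_to_secondary_ratio(before_paths)  # 3 home hits, 3 secondary
after = home_to_secondary_ratio(after_paths)    # 1 home hit, 5 secondary
print(round(percent_reduction(before, after)))
```

A falling ratio suggests that users return to the home page less often between steps, which is how we interpret the change in our own logs.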

There are many reasons to include user-centered testing in all kinds of product designs:

Finally, the most important reason to utilize user testing methods is that testing works. It enriches the end product because input is received from diverse thinkers with a wide range of creative ideas. It ensures that the product works for those who will be its harshest critics -- the users. It turns the user into the designer. Indeed, your users are the experts; listen to them.


References and Notes

1. Jared M. Spool and others, Web Site Usability: A Designer's Guide (North Andover, Mass.: User Interface Engineering, 1997); Jakob Nielsen, "Finding Usability Problems through Heuristic Evaluation," in Human Factors in Computing Systems: CHI Conference 92 (New York: Association for Computing Machinery, 1992): 373-80; Jakob Nielsen, useit.com: Jakob Nielsen's Web Site. Online. Accessed 10 Nov. 1997, www.useit.com/; Jakob Nielsen and Darrell Sano, "SunWeb: User Interface Design for Sun Microsystem's Internal Web" (proceedings of the 2d World Wide Web Conference 94: Mosaic and the Web, Chicago, October 17-20, 1994), 547-57. Also available at www.ncsa.uiuc.edu/SDG/IT94/Proceedings/HCI/nielsen/sunweb.html. Accessed 10 Nov. 1997; Jeffrey Rubin, Handbook of Usability Testing: How to Plan, Design, and Conduct Effective Tests (New York: John Wiley & Sons, 1994).

2. The User Survey Questionnaire is available at http://dizzy.library.arizona.edu/lib_comp_survey/survey.htm.

3. Additional sources about Web site design and usability methods that would be useful to those beginning this process include Susan Feldman, "The Key to Online Catalogs that Work? Testing: One, Two, Three," Computers in Libraries 19 (May 1999): 16-20; Kristen L. Garlock and Sherry Piontek, Building the Service-Based Library Web Site: A Step-by-Step Guide to Design and Options (Chicago: American Library Association, 1996); and Kristen L. Garlock and Sherry Piontek, Designing Web Interfaces to Library Services and Resources (Chicago: American Library Association, 1999).

4. Garlock and Piontek, Designing Web Interfaces, 75.

5. Nielsen, "Finding Usability Problems," 373.

6. Rubin, 21.

7. For a more detailed explanation of card sorting, see "What Is Card Sorting?" Information and Design. Online. Accessed 13 July 2000, www.infodesign.com.au/usability/cardsorting.html.

8. Verlene J. Herrington, "Way Beyond BI: A Look to the Future," The Journal of Academic Librarianship 24 (Sept. 1998): 381-86.

9. Since we began our work in 1997, more libraries have begun to conduct usability tests for Web designs and other product testing. Some libraries that we know about include the Massachusetts Institute of Technology, University of Washington, University of Georgia, University of Minnesota, and University of Wisconsin. The last two institutions are using this method for the design of instructional tutorials.

10. Rubin, 93. According to Rubin, "for the purpose of conducting a less formal usability test, recent research has shown that four to five participants will expose 80 percent of the usability deficiencies of a product, and that this 80 percent will represent most of the major problems."

11. An example of the questions used during the second round of testing can be found at http://dizzy.library.arizona.edu/library/teams/access9798/usability_studies/questions.htm.

12. An example of the script can be found at http://dizzy.library.arizona.edu/library/teams/access9798/usability_studies/guide.htm.

13. An example of the Path Form can be found at http://dizzy.library.arizona.edu/library/teams/access9798/usability_studies/pathform.htm.

14. All of the icons and page design on the current SABIO site have been designed by Marty Taylor, the library's graphic artist and a member of Access 2000.

15. User Interface Engineering, "Making Tips Work." Online. Accessed 21 Apr. 1998, http://world.std.com/~uieweb/tips.htm.

16. The idea for these pages was copied from the Ohio State University Library's Subject Guides. Available at www.lib.ohio-state.edu/gateway/subjects.html.

17. RIO received the 1999 ACRL Innovation in Instruction Award.

18. With the new SABIO gateway, there has been a 59 percent reduction in the number of hits on our home page relative to hits on secondary pages. This indicates that users are being more efficient in the use of our site and do less "pogo sticking" (jumping back to the home page when their first choice didn't locate what they wanted). The data for this information can be found at http://dizzy.library.arizona.edu/library/teams/access9798/reports/graphs/beforeandafter.jpg and http://dizzy.library.arizona.edu/library/teams/access9798/asu/stats.htm.

19. Rubin, 12.


Appendix A

Guidelines for Web Design


Ruth Dickstein (dicksteinr@u.library.arizona.edu) is a Social Sciences Librarian and Vicki Mills (millsv@u.library.arizona.edu) is an Undergraduate Services Librarian at the University of Arizona Library, Tucson. Both were members of the Access 2000 Team. To see the library's Web site, go to www.library.arizona.edu.


| ITAL Vol. 19, No. 3|


http://www.lita.org/ital/1903_mills.html
Copyright 2000, American Library Association