If At First You Don’t Assess, Try, Try Again

ACRLog welcomes a guest post from Katelyn Tucker & Alyssa Archer, Instruction Librarians at Radford University.

Instruction librarians are always looking for new and flashy ways to engage our students in the classroom. New teaching methods are exciting, but how do we know if they’re working? Here at Radford University, we’ve been using flipped lessons and games for one-shot instruction sessions for a while, and our Assessment Librarian wasn’t going to accept anecdotal evidence of success any longer. We decided that the best way to see if our flipped and gamified lessons were accomplishing our goals was to evaluate the students’ completed assignments. We tried to think of every possible issue in designing the study. Our results, however, had issues that, in hindsight, could have been prevented. We want you to learn from our mistakes so you are not doomed to repeat them.

Our process

Identifying classes to include in this assessment of flipped versus gamified lessons was a no-brainer for us. A cohort of four sections of the same course that use identical assignment descriptions, assignment sheets, and grading rubrics meant that we had an optimal sample population. All students in the four sections created annotated bibliographies based on these same syllabi and assignment instructions. We randomly assigned two classes to receive flipped information literacy instruction and two to play a library game. After final grades had been submitted for the semester, the teaching faculty members of each section stripped identifying information from their students’ annotated bibliographies and sent them to us. We assigned each bibliography a number and then assigned two librarian coders to each paper. We felt confident that we had a failsafe study design.

Using a basic rubric (see image below), librarians coded each bibliography for three outcomes using a binary scale. Since our curriculum lists APA documentation style, scholarly source evaluation, and search strategy as outcomes for the program, we coded for competency in these three areas. Coding student work is time-consuming, and this process took about two months to complete.


The challenges

After two librarians independently coded each bibliography, our assessment librarian ran inter-rater reliability statistics, and… we failed. We had previously used rubrics to code annotated bibliographies for another assessment project, so we didn’t spend any time reviewing the process with our experienced coders. Since we only hit around 30% agreement between coders, it is obvious that we should have done a better job with training.
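For readers who want to run a similar check, here is a minimal sketch of two common inter-rater reliability measures for binary rubric codes: simple percent agreement and Cohen’s kappa, which adjusts for agreement expected by chance. We are not claiming these are the exact statistics our assessment librarian ran. The function names and the sample codes below are made up for illustration and are not data from our study.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items on which two raters gave the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    # Chance agreement: probability both raters pick the same label
    # if each assigned labels independently at their own observed rates.
    p_expected = sum((counts_a[l] / n) * (counts_b[l] / n) for l in labels)
    if p_expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical codes for one outcome (1 = competent, 0 = not competent).
coder_a = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
coder_b = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1]

print(f"Percent agreement: {percent_agreement(coder_a, coder_b):.2f}")  # 0.70
print(f"Cohen's kappa:     {cohens_kappa(coder_a, coder_b):.2f}")       # 0.40
```

In this invented example the coders agree on 70% of the bibliographies, but kappa is only 0.40, because two coders assigning binary codes will agree fairly often by chance alone. Running a check like this on a small shared batch during norming, before coding the full set, is one way to catch disagreement early.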

Because we had such low agreement between coders, we weren’t confident in our success with each outcome. When we compared the flipped sections to the gamified ones, we didn’t find any significant differences in any of our outcomes. Students who played the game did just as well as those who were part of the flipped sections. However, our low inter-rater reliability threw a wrench in those results.
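As an illustration of what comparing two groups on a binary outcome can look like, here is a small sketch of a two-sided Fisher’s exact test on a 2×2 table. This is an assumption on our part about one reasonable test for this kind of data, not a record of the analysis actually run for this project, and the counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. flipped, game); columns are outcome codes
    (e.g. competent, not competent). Returns the p-value.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x "competent" codes in row 1,
        # holding the row and column totals fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: 3 of 4 flipped-section papers coded competent,
# 1 of 4 game-section papers coded competent.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p_value:.3f}")
```

A test like this only tells you something if the codes going into the table are trustworthy, which is exactly why low agreement between coders undermines the comparison.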

What we’ve learned

We came to understand the importance of norming: discussing among coders what the rubric means, and building meaningful conversations about how to interpret assessment data into that process. Our inter-rater reliability issues could have been avoided with detailed training and discussion. We thought earlier coding projects had prepared us for this one, but the length of time between assessments created some large inconsistencies.

We haven’t given up on norming: including multiple coders may be time-intensive, but when done well, it gives our team confidence in the results. The same applies to qualitative methodologies. As a side part of this project, one librarian looked at research narratives written by some participants, and decided to bravely go it alone on coding the students’ text using Dedoose. While it was an interesting experiment, the key lesson was to bring in more coders! Qualitative software can help identify patterns, but it’s no substitute for a partner looking at the same data and discussing it as a team.

We also still believe in assessing output. As librarians, we don’t get too many opportunities to see how students use their information literacy skills in their written work. By assessing student output, we can actually track competency in our learning outcomes. We believe that students’ papers provide the best evidence of success or failure in the library classroom, and we feel lucky that our teaching faculty partners have given us access to graded work for our assessment projects.

One thought on “If At First You Don’t Assess, Try, Try Again”

  1. Thanks for sharing your experiences with assessment, Katelyn and Alyssa. It’s really helpful to read about the things that don’t go quite as planned, perhaps even more helpful than reading about perfectly successful projects. I’ve found coding to be especially tricky since training can be a time-consuming process. Thanks for the reminder that the time spent training is worth it!
