I’m a coder.

Well, learning to be, at least. On a whim, I took a Computer Science (CS) course during my first year at UVA and quickly found out that taking a CS class means having to learn an entirely new, and sometimes difficult, skill: coding. Coding is essentially writing instructions for computers to perform. Though it’s frustrating and challenging at times, I found myself really enjoying the problem-solving aspect of it, so much so that I decided to stick with it and major in CS. So, when I found out that one of my main responsibilities as a research assistant at the Motivate Lab would be coding, I thought it would be a cool opportunity to transfer my CS skills to the social sciences. It was tough for me to imagine though—why would you need to write software to study educational psychology?

Turns out coding isn’t just something you do for computers. Qualitative coding, which involves looking for common themes in non-numerical data such as student essays, audio interviews, or videos, is fairly different from the software coding I was just getting used to. This kind of coding is frequently used in social science research to gain in-depth insights into the population you’re working with. In my case, I’ve been coding essays written by community college students in developmental math classes, in which they try to connect the math they’re learning to their everyday lives. I was used to spending hours writing code and struggling to make it coherent enough for another person to decipher, not poring over someone else’s writing and having to pull meaning from it. I had never been exposed to qualitative work before, so I buckled my seatbelt and braced for the learning curve.

To search for and quantify common themes in non-numerical data, researchers come up with a coding scheme: a set of categories designed to capture every attribute worth noting for the purpose of the study. Let’s say you’re interested in kids’ favorite foods and want to study how many of these “foods” are really just sweets vs. actual sustenance you might have for dinner. You’ll want a coding scheme that accurately captures when it is appropriate to eat a certain kind of food. Lucky for us, there are already some categories that capture that: breakfast, lunch, dinner, and dessert. So, these four categories would be your “coding scheme,” meaning that every response would have at least one of these categories applied to it. Someone said pancakes? That gets a “breakfast” code. Another says cookies? Dessert. But can you see where this gets tricky? What if the food fits into multiple categories? Is a hot dog lunch or dinner? What about soup? Who’s to say a kid can’t have leftover pizza for breakfast? Should the time when someone eats a food change the category it falls under?

**Coding schemes have to be defined as precisely as possible in order for them to be objective and consistent**. This is achieved by setting precedents (code enough “My favorite food is ice cream for dinner” and eventually you and the other coders might give in and begrudgingly classify ice cream as a dinner food in these cases) and really thinking about what you want each category to say about its contents.
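
For readers who think in software terms, here is a minimal sketch of how the food example might be written down. Everything in it is hypothetical: the names `CODING_SCHEME`, `PRECEDENTS`, and `apply_codes` are made up for illustration, and the keyword matching is only a stand-in, since in real qualitative coding a person reads each response and decides which codes apply.

```python
# The four categories from the food example.
CODING_SCHEME = {"breakfast", "lunch", "dinner", "dessert"}

# Hypothetical precedents the coders have agreed on: phrase -> codes.
# A phrase may carry more than one code (is a hot dog lunch or dinner?).
PRECEDENTS = {
    "pancakes": {"breakfast"},
    "cookies": {"dessert"},
    "hot dog": {"lunch", "dinner"},
    "ice cream for dinner": {"dinner"},
}

# Every precedent must use only codes that exist in the scheme.
for phrase, codes in PRECEDENTS.items():
    assert codes <= CODING_SCHEME, f"unknown code in precedent for {phrase!r}"


def apply_codes(response):
    """Return the set of codes whose precedent phrase appears in the response.

    An empty set means no precedent covers the response yet, so the
    coders would need to discuss it and possibly add a new precedent.
    """
    text = response.lower()
    assigned = set()
    for phrase, codes in PRECEDENTS.items():
        if phrase in text:
            assigned |= codes
    return assigned
```

A response such as "soup" comes back with no codes at all, which is exactly the kind of case that sends the team back to refine the scheme.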

If this sounds challenging, that’s because it is, at least at first. Even with a well-defined coding scheme, coding can become too subjective if it is done carelessly. A collaborative and iterative coding process combats this. Once we have generated a coding scheme that we agree captures the information we’re looking for, we make sure that we code consistently with one another by testing interrater reliability, a measure of how much the coders agree, before anyone does their own individual coding. To measure it, multiple researchers thematically code the same student responses and we compare the results. Any disagreements about which codes should be assigned to which responses are resolved through discussion until the researchers reach a consensus. This process continues until all disagreements have been resolved and coders code with at least 80% consistency with one another. So the work doesn’t stop after you’ve created a coding scheme to your liking. We test and compare and learn and revise, then test and compare and learn and revise again, until we get to this point.

*Looks juuuust a little different than your typical C++ or Python, yeah?*
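
That said, the agreement check itself can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses simple percent agreement with one code per response, the function name `percent_agreement` and the sample data are invented, and a lab may well use a different statistic (Cohen's kappa is a common alternative) or allow several codes per response.

```python
def percent_agreement(codes_a, codes_b):
    """Fraction of responses on which two coders assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same set of responses")
    if not codes_a:
        raise ValueError("need at least one coded response")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Made-up codes from two coders for the same five responses.
coder_1 = ["breakfast", "dessert", "lunch", "dinner", "dessert"]
coder_2 = ["breakfast", "dessert", "dinner", "dinner", "dessert"]

agreement = percent_agreement(coder_1, coder_2)

# Indices of the responses the coders need to sit down and discuss.
disagreements = [
    i for i, (a, b) in enumerate(zip(coder_1, coder_2)) if a != b
]

THRESHOLD = 0.80  # the "at least 80% consistency" bar from the text
ready_for_individual_coding = agreement >= THRESHOLD
```

Here the two coders match on four of five responses, so they just reach the 80% bar, and the one mismatched response (the third) is what they would talk through.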

This is a long and, at times, exhausting process. You can spend weeks perfecting a coding scheme only to then spend months trying to achieve interrater reliability. I’m used to testing my software and knowing immediately what’s wrong with it. It has been frustrating for me that I, and not a computer, have to decide all the rules. But, on the flip side, I’m experiencing an entirely new world of social science research and helping students build their motivation and improve their academic outcomes. Through hard work, practice, and a lot of time, something that once was completely foreign to me now comes with much greater ease (growth mindset, anyone?).

It’s taken about a year, but I actually feel as comfortable qualitatively coding as I do coding for a Computer Science class, if not more so.

So yeah, I’m a coder—double the coder now actually. And I’m learning to be better every day.

Mira Lee is a 3rd year UVA student and undergraduate research assistant with Motivate Lab. When she’s not writing code or coding student essays, you can find her making an abundance of Spotify playlists, watching subpar romantic comedies, or scarfing down an impressive number of tacos.

Double the Coding, Double the Fun