: Step one of thematic analysis – and there is no way around this – is to know your data. There really is no substitute for knowing your data, permit me to use the word, “intimately.” If you are conducting a literature review, it means that you have read through the literature, that is, your raw material, and you have understood it.
: If you are doing qualitative research where you have gathered your data through interviews and any number of other methods, knowing your data means you read through the transcribed or summarised text, perhaps multiple times. A word to PhD students here, especially those researching in the qualitative tradition: at this level, I generally advise students not to outsource their data collection. Gathering your data yourself and understanding the context that comes with it is the first step to knowing your data inside and out in a way that nobody else can.
: As a qualitative researcher, your data even includes the pictures that you take while in the field – with the permission of research participants, of course.
: Those, and the notes you take of your observations in the field – those are the primary materials that you use to piece together a holistic narrative when the time comes for analysis, and you only get them by being in the field yourself. Another useful way to know your data, to familiarise yourself with it as a qualitative research student, is to transcribe relevant audio recordings of your interviews yourself, however tedious that might sound.
: Transcribing not only helps you recall the context in which something was said in the field; it also sometimes brings exchanges to life in a different way than you understood them at the time of the interview or the focus group discussion.
: This is an excerpt from an interview I conducted as part of the sample project I introduced in the previous unit. It will be the basis of the codes and themes I am going to highlight subsequently. But before I do that, a few more points pertaining to the need to know your data. One golden rule that works whatever kind of research you are doing, but which is perhaps most appreciated in qualitative research, is that everything is data in the context of your research.
: This is the case even when you are unable to find data on a particular topic. When you can’t find data on a topic, there is usually a reason for that. Maybe someone, or a group of people, refused to grant you an interview. The reason behind this prospective interviewee’s hesitation – that could signal the existence of a dynamic that provides valuable context for the remainder of your data set.
: For example, if the topic you are researching is a sensitive one, it may be that this person, or this group of people, you had hoped to interview was fearful that harm could come to them as a result of their participation in your research. So, in a sense, that is data for your research. A second point is that, even when you are dealing with more obvious forms of data, it is important to be systematic in the way that you curate and comb through those data.
: It is essential that you leave no stone unturned in collating the information you have gathered using various techniques of interviewing and observation. Your field notes, interview summaries, transcripts, observation notes – all these constitute textual data that lend themselves quite readily to the next step in the process of thematic analysis, which is generating codes.
: It always strikes me as interesting that what researchers think of as data is really just real life for the research participants involved. When I conduct an interview with somebody, they will have shared several different snippets and tidbits that I find interesting as a researcher, but which really just add up to this person’s lived experience.
: The way that I, as a researcher, then begin to elicit particular insights that are relevant to my research focus is by generating codes from the universe of data at my disposal. You can think of codes as the building blocks that you will subsequently use to construct, refine and substantiate your themes. Codes can be generated either deductively, from broad trends that you identify in the process of reading through the literature on your topic and developing your theoretical framework, or inductively, by letting grounded observations emanate from your data set as you trawl through it.
: In practice, qualitative researchers use a mix of the two: drawing out specific insights from primary data, but also guided by the particular lens or framing that has been more or less established in earlier phases of the work. Let’s look at my sample excerpt again. You will recall that my research is looking at the factors that energy policymakers in Nigeria prioritise in decision-making.
: These are some of the codes I generated inductively from the whole data set. The codes highlighted are those that relate to the two research questions – that is, questions one and four that I highlighted along with that interview excerpt earlier. Take the code “incentives for policy engagement,” for example.
: The data points that generated that code include this assertion by the interviewee in the sample excerpt: “Now, I don’t blame academia because the other side, the receiving side as well – that is, the policy-making side – I don’t think is interested in too much inquiry, because they want things to be as dark as possible so that they can do it anyhow.” In other words, the interviewee is saying policymakers in this context really don’t have any incentives to engage with researchers because they really don’t want anyone scrutinising what they are doing.
: So, what I have done with the code “incentives for policy engagement” is reduce the idea expressed in the interviewee’s statement into a phrase which captures a dynamic that links back to the questions around evidence-informed policymaking that I posed earlier.
: Let’s take another example: the code “problem definition/agenda setting.” Reading again from the excerpt, it says, “So, all the opportunity that you may have had you been there from the beginning to contribute significantly, all that is gone. Because the document is already real, we’re just being asked to talk about the implementation process.” End of quote.
: The interviewee is saying that even senior professionals like himself are often not invited to contribute to decision-making processes at the beginning stages when it matters the most. The trick to generating effective codes is that they need to be quite specific, yet sufficiently broad to capture the range of possible outcomes under the dynamic you’re trying to capture.
: So, if I had other interview excerpts which indicated that there are other stakeholder groups, say, multilateral organisations, or big business, that are involved in these discussions from the beginning stages, then I would also put those under the “problem definition/agenda setting” code.
: Note that it is not necessary to then have another code that says something like, “lack of engagement with problem definition/agenda setting.” The neutral phrase “problem definition/agenda setting” is broad enough to cover instances in which certain groups are excluded as well as instances in which other groups are included at this crucial stage.
: What this means is that, when you come back to synthesise the information you have tucked away under these discrete codes, you are able to see a rounded and nuanced picture under each code.
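: For those who like to see the idea laid out concretely, filing both kinds of instance under one neutral code can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a feature of any coding application: the names "codebook", "add_segment" and "summarise" are hypothetical, and the second segment is a made-up placeholder standing in for the kind of excerpt described above, not real interview data.

```python
from collections import defaultdict

# Illustrative codebook: code label -> list of coded segments.
# All names here (codebook, add_segment, summarise) are hypothetical.
codebook = defaultdict(list)


def add_segment(code, source, text, note):
    """File one excerpt under a code, with its source and an analyst note."""
    codebook[code].append({"source": source, "text": text, "note": note})


def summarise(code):
    """Return the analyst notes filed under one code, in the order added."""
    return [segment["note"] for segment in codebook[code]]


# Quoted from the sample interview excerpt.
add_segment(
    "problem definition/agenda setting",
    "sample interview",
    "So, all the opportunity that you may have had you been there from "
    "the beginning to contribute significantly, all that is gone.",
    "senior professional not invited at the beginning stages",
)

# HYPOTHETICAL placeholder, not real data: stands in for an excerpt showing
# another stakeholder group involved from the beginning stages.
add_segment(
    "problem definition/agenda setting",
    "hypothetical interview",
    "[placeholder for an excerpt on early involvement]",
    "multilateral organisations involved from the beginning stages",
)

# Reading one code back shows exclusion and inclusion side by side.
for note in summarise("problem definition/agenda setting"):
    print(note)
```

: Because the code label itself is neutral, the exclusion and the inclusion sit in the same list, and the analyst's note on each segment records which way it points.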
: A final note on generating codes: though this is not a primary focus of this module, it is important to mention that there are broadly two approaches to the process of code generation, or coding, as it is more commonly referred to. And these are: software-assisted and manual approaches. There are several software applications out there that are dedicated to qualitative coding. The more popular applications – NVivo, ATLAS.ti, MAXQDA – tend to require a subscription.
: However, there are others, like Taguette and QualCoder, that are open source. There are many easily accessible online tutorials that show how to use many of these applications. One of the main advantages of software-assisted coding is that the software helps you organise your data and can even help with high-level pattern identification.
: However, as with manual coding, where the most sophisticated application at your disposal may be a humble word processor, the task of knowing your data well enough to code meaningfully, and at a sufficiently granular level, still falls to the analyst.
: The important thing in any case is to do a thorough job of interpreting the data against the backdrop of the contextual realities within which they were elicited, so that the insights that emerge from the analysis are reliable and valid.