The transcription rollercoaster (1/2)

I am currently in the midst of an intense data collection phase and have been conducting 90-minute focus groups with nurses and assistant nurses (5-6 at a time) from different departments and with different specializations (i.e. ward and surgery nurses) at the Uppsala University Hospital. (I am still pinching myself about our having been able to gather such large groups of participants, since getting nursing staff to dedicate 90 minutes of their time to a study is very difficult.) Designing, planning and performing the focus groups has been a very exciting and positive experience so far. Right now, however, I am experiencing what may be the least fun side of data collection: transcription.

I had heard that transcribing was a very time-consuming activity, of course, and had made approximate calculations of the time I would need to transcribe each focus group. Unfortunately, my calculations turned out to be way, way off. (In case you are wondering: according to my latest estimates, I have needed about one hour per 5 minutes of recording!) It is rather amazing how much one has time to say in only a few seconds… It is actually quite funny (in retrospect; I did not find it funny while doing it, as you can imagine) – it feels like you have been transcribing forever, and then you look up and see that you have actually only gotten 2 minutes further since the last time you checked!

But transcription is not only time-consuming; it is also much more difficult than I expected. (Of course, I have been transcribing in Swedish, which I have only been speaking somewhat fluently for the past year and a half or so, but I do believe that some of the difficulties I have encountered are inherent to the task and not entirely dependent on how familiar the transcriber is with the language spoken.)

Here are some of the difficulties I have been experiencing:

  • Not understanding what is being said: it can be because several participants talk at the same time, talk too fast or simply mumble – in any case, the result is that no matter how many times I listen to a segment and how much I slow down the recording, I simply do not grasp what is being said. As a result, “data retention” is definitely not 100% (as one might have assumed, given the use of a recording device). Part of the data does get “lost” in the transcription process.
  • Not recognizing who is speaking: this is a problem I had not anticipated at all, but it definitely is a big one. Voices sound quite different on the recording (it gets even worse if you slow down the recording when transcribing), and when you have been talking to 5-6 people you had never met (or heard!) before, recognizing who is speaking when is simply impossible. I used volume as an indication – I knew that the voices I heard most loudly belonged to those sitting closest to me and the recording device – but that is of course not fully reliable. Fortunately, accurately recognizing who was speaking was not really needed for the analysis of the data.
  • Finding the appropriate level of detail: it is up to the transcriber to determine how faithful and detailed the transcript should be. If you are doing, say, a discourse analysis, I guess you need to have every word in precisely the order in which it was spoken (I was told that some researchers even count the number of seconds of pauses in the conversation). Luckily, this was not the case for me. Although I wrote everything down in detail at first – every hesitation, every start to a sentence, every nodding sound – I then realized that this level of detail was unnecessary for the kind of analysis I was planning to undertake. I thus started to focus much more on the content of what was being said, skipping hesitations and unfinished sentences (although we do not always notice it when listening to somebody speak, oral discourse is very fragmented and contains many aborted sentences) as well as filler words that only occur in speech (for example, “like” or “ah”). This made for a transcript that was much easier to read and better suited to my needs.

Do you recognize these difficulties? Have you experienced other difficulties that I have not mentioned here? Do you have any tips and tricks for transcribing from audio recordings?