As I have discussed in previous blog posts, non-experimental data, such as
transcribed oral texts, is an excellent source of syntactic data.
In this blog post, I outline some of the factors that affect the rate of
transcription of recorded oral texts when doing syntactic fieldwork. If anybody
knows of relevant literature on this topic, please let me know. A systematic
survey amongst fieldworkers would probably be useful in helping to understand
the process, and maybe to help make it more efficient.
Certainly, the advent of tools like digital recorders, Praat, ELAN and FLEx has increased
the speed at which it is possible to process oral texts. I can recall the days not
that long ago (the early nineties) when I would transcribe a text by pressing
the rewind button over and over. As far as I know, there were no digital
recorders at that point, or at least none that I had access to. And therefore,
I also did not have the benefit of using Praat to see the actual waveform while
transcribing.
How long does transcription take exactly, and what are the factors influencing how
long it takes? My answer is that in ideal circumstances, the rate of
transcription will be around 60 to 1 (60 minutes of transcription for one
minute of recording). In this blog post, I will explain my answer.
For concreteness, I am assuming that the linguist is working with ELAN, and is
transcribing the recording into either IPA or an orthography. I assume that the
linguist is also translating (but not glossing) the text. Glossing would add more time, and is more efficient in FLEx than in ELAN. In my calculations
of rate of transcription, I put aside the time it takes to set up the
project in ELAN. The linguist needs to have the video and audio files ready (or
to prepare them in Adobe Premiere Pro CC, or a similar program). They need to
load the files into ELAN and create the tier types and tiers. All this takes
time (e.g., 5-15 minutes, depending on what exactly needs to be done).
My process for transcription is as follows:
a. Segment the recording in ELAN.
On many occasions the boundaries have to be redone during transcription (adjusting
boundaries, merging segments, dividing segments).
b. Listen to an individual segment.
Usually, we listen to the segment two or three times just to get going. I also focus in
on any unclear parts of the segment and play them a few times.
c. Transcribe the segment into IPA.
If I have any problems with transcription, I ask the native speaker consultant for
help. They can help me identify a word, or repeat part or all of the segment.
d. Translate the segment into Setswana.
This is the job of the native speaker consultant, who speaks both Sasi and Setswana
fluently. The translation is then written down by the translator. I check to
see if there are any egregious errors in the Setswana translation. The most
frequently occurring problem is that the consultant will often not translate
the segment, but rather explain it in some way.
e. Translate the segment into English.
This is the job of the translator. I check to see if all three (the Sasi
transcription, the Setswana translation and the English translation) match.
Sometimes getting them to match takes a bit of adjustment.
f. Repeat b-e until finished with the recording.
As an example, I made a recording in Sasi of how to make local beer. The recording
lasted 1 minute and 35 seconds and took me about an hour and a half to
transcribe and translate (no glossing). So that is a ratio of approximately 60
to 1. The entire text has 32 sentences. The transcription and translation of
this particular text were relatively easy from my point of view. The speaker
spoke in a clear fashion on a clearly delimited topic (with known vocabulary).
Also, the speaker was present, helping me with the transcription when necessary.
A more difficult text with a different speaker could have easily taken much longer.
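The arithmetic behind that ratio can be sketched in a few lines of Python. This is only an illustration: the function name `transcription_ratio` is made up for this example, and the inputs are the figures quoted above for the beer-making text (95 seconds of recording, roughly 90 minutes of work, 32 sentences).

```python
def transcription_ratio(work_minutes, recording_seconds):
    """Minutes of work per minute of recording (hypothetical helper)."""
    if recording_seconds <= 0:
        raise ValueError("recording length must be positive")
    return work_minutes / (recording_seconds / 60)


# Figures from the beer-making text: 1 min 35 s of audio, about 90 minutes of work.
beer_ratio = transcription_ratio(work_minutes=90, recording_seconds=95)
minutes_per_sentence = 90 / 32  # the text has 32 sentences

print(f"ratio: about {beer_ratio:.1f} to 1")
print(f"work per sentence: about {minutes_per_sentence:.1f} minutes")
```

The exact figure comes out a little under 60 to 1, which is why I describe it as approximately 60 to 1. It also works out to a bit under three minutes of work per sentence.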
Not everybody will have the same set-up as me (e.g., translating into two
languages), but they may have their own complications of various sorts (e.g., a
language without a dictionary). I list below some of the factors that could
affect the rate of transcription, and comment on them with respect to my Sasi
project. Given these factors, it is difficult to estimate how long
transcription should take. But in ideal circumstances, I think a ratio of 60 to
1 is reasonable. Ideal circumstances include: the presence of an experienced
native speaker consultant on hand to help with the transcription, good sound
quality, translation into only one language, the existence of a good dictionary
and limiting the goal to broad phonetic transcription.
Some people with lots of experience in ideal circumstances may transcribe at a faster rate. I personally spend more time than 60 to 1 on oral texts. Usually there are rough spots that I review with the speaker, and I might even review them with other consultants.
So if you have a six-minute video, you can plan on taking the day to
transcribe it accurately.
What is the fastest possible rate? I would say that is around 20 to 1. Just the
physical acts of segmenting the recording, listening to the segments a few
times, transcribing the segments and writing the English translations would
take you very close to a ratio of 20 to 1.
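For planning purposes, these ratios can be turned into rough time estimates. The following is a minimal sketch, assuming the ratio is simply multiplied by the length of the recording; the function name `estimated_hours` is made up for this example, and real sessions will vary with the factors discussed in this post.

```python
def estimated_hours(recording_minutes, ratio):
    """Hours of work for a recording, given minutes of work per recorded minute."""
    if recording_minutes < 0 or ratio <= 0:
        raise ValueError("recording length must be non-negative and ratio positive")
    return recording_minutes * ratio / 60


# A six-minute video at the two ratios discussed in the text.
six_minute_fastest = estimated_hours(recording_minutes=6, ratio=20)
six_minute_ideal = estimated_hours(recording_minutes=6, ratio=60)

print(f"20 to 1: {six_minute_fastest:.1f} hours")
print(f"60 to 1: {six_minute_ideal:.1f} hours")
```

At 60 to 1, a six-minute video comes to about six hours of work, which is the sense in which it takes the day. At 20 to 1, the same video would still take about two hours.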
Here are some of the factors that affect the rate of transcription:
1. Is the transcriber a native speaker of the language?
Sasi Project: In my case, I am not a native speaker of Sasi. Furthermore, all
speakers of Sasi are elderly (over 60) and illiterate. There are no existing
native speaker transcribers. If the transcriber were a native speaker of the
language being investigated, it would certainly be a huge advantage to them in
transcribing oral texts. I don’t think it would resolve all the issues listed
below, but it would make the process faster.
2. Is the original speaker (of the oral text) present for the transcription? If they
are present, are they able to help with the transcription? (How old are they?
What is their hearing like? What is their health like? Are they inebriated or
otherwise incapacitated?)
Sasi Project: In my typical set-up, I try to make sure the speaker is present for
the transcription. The speaker can listen to the segment, and if needed, help me
to transcribe it. Some of my consultants have hearing problems making it
difficult for them to help me, even when they are the speaker of the oral text.
3. Is a native speaker consultant present to help with the transcription? And if so, are
they experienced? Do they have basic transcription skills?
Basic transcription skills can be learned and practiced. If you have an experienced
consultant, it can be very helpful in speeding the process along. Such skills
include: (a) repeating a segment exactly (as spoken in the recording), (b)
translating a segment of the recording, (c) repeating a single word from the
segment, (d) defining a single word from a segment, (e) saying whether a
segment has been transcribed correctly, (f) saying whether a segment has been
translated correctly.
Sasi Project: Because of age, some of my consultants are unable to comply with all the
task demands of helping to transcribe a text.
4. How good is the sound quality of the recording? Is there clipping, background
noise, wind, hissing, echo in the room, etc.? Did the speaker turn their head
away from the mic often? Any decrease in the quality of the recording can make
transcription more difficult.
Sasi Project: For example, if the waveform is clipped, a click can sound like a non-click
consonant, making it difficult to recognize the word being used. Usually, a
good consultant can recognize the word anyway (even with clipping), but not
always. In my experience, lavalier mics produce the best sound (less background
noise), but other recording methods can produce acceptable recordings as well.
5. How many languages are being translated into?
Sasi Project: Sasi is being transcribed, and translated into Setswana and then English. I find the Setswana and English translations help with the Sasi transcriptions. However, it is often the case that the translation from Setswana into English raises its own particular problems that take time to resolve. If a linguist only had to translate the transcription into one language (e.g., English) the process would be faster. The addition of a second language of translation (Setswana in my case) is one of the biggest time sinks in the process.
6. If you have a translator (between two translation languages), what is their
background? Do they have translation experience? What is their level of
English? Does the translator speak the local dialect (of the speakers)? Are
they familiar with the concepts that the speakers use in their oral texts
(e.g., the various plants and animals)? How well does the translator interact
with the speakers?
Sasi Project: Often my translator has to use a dictionary to translate from Setswana
to English. Each time they pick up the dictionary, it adds a minute to the
process. And the local Setswana (Sengwato, Central Region, Botswana) is not
identical to standard Setswana, raising further difficulties in translation.
7. How clear is the speech of the speaker? Do they speak quickly, running words
together? Do they speak really softly, sometimes almost inaudibly? Do they
speak with large bursts of intensity (clipping the sound file)? Do they often
turn their head from the mic? These are the kinds of issues that have a big effect
on the difficulty of transcription.
Sasi Project: I find speakers vary a lot on these dimensions. Some speakers have a
careful pronunciation of words, whereas some speak very quickly leaving out or
greatly reducing morphemes. As for people who speak softly, to some extent one
can increase the gain with software (e.g., Adobe Premiere Pro CC).
8. What is the subject matter of the text? Does it involve new vocabulary that needs to
be carefully transcribed and translated? Does it involve a complex topic
unfamiliar to the linguist?
Sasi Project: In the recording of making local beer discussed above, the topic of the text was clearly delimited, and only known vocabulary was used. For other more free flowing recordings (e.g., a life story, relatively unstructured interviews, unscripted conversations), new words will come up, and it takes time to give accurate transcriptions (and translations) of these words, which adds to the time of transcription.
9. How well is the language documented? Is there a good grammar and a good dictionary (both
available in an electronically searchable format)? Does the language have a
linguistic tradition of papers written on it? Many endangered or less studied
languages may lack any or all of these useful documents.
Sasi Project: Sasi is not very well documented. I have a short dictionary that I
wrote (in collaboration with Andy Chebanne), and there are sections about Sasi
in the =Hoan grammar (written with Jeff Gruber). There are no linguistic papers
uniquely on Sasi, but I have written a few papers on the closely related language
=Hoan. There is a spelling primer available online (written with Zach
Wellstood). Other than that, there are no other resources on Sasi.
10. How much practice has the linguist had in transcription? Certain transcription
issues occur over and over. Once these issues are resolved, the job of
transcribing new texts is easier.
Sasi Project: Since 1996, I have transcribed seventeen short oral texts combined for Sasi and
=Hoan. During this year (2019-2020), I plan to transcribe five hours of video.
In the past, I have only transcribed audio recordings, not video recordings. Even from this limited experience, it is clear that the more you transcribe, the faster it gets. It becomes easier to recognize certain repeated sequences of words, no matter how they are pronounced.
As an example of a recurring transcription issue, in Sasi ka ki “with” often comes out as ka
i, or even a more reduced form. But the tone is immediately recognizable
(high low). So it only takes a few encounters before “with” becomes easy to
transcribe consistently.
11. Is there more than one participant? If so, do their contributions continually overlap?
Sasi Project: I have both one-person and two-person recordings. The two-person
recordings are mostly interviews (one Sasi speaker interviewing another). In
the interviews, there is lots of overlap at the edges of the sentences, making
transcription more difficult. For most of these interviews, each person has
their own lavalier mic.
12. How much uncertainty are you willing to live with? Are you willing to simply delete
an obscure barely audible passage? Are you willing to have question marks in
your transcription? Are you willing to adopt the suggestions made by your
consultants, even when they clearly diverge from the recording?
Sasi Project: Many problems of transcription have to be puzzled over in order to try
to figure out exactly what was said. To address these sorts of problems, I find
that it helps to do a complete rough transcription of the whole text and then
to go over the problem areas at a later date in order to resolve them. But of
course, such a second pass adds more time to the transcription process.
13. What level of detail do you want in the transcription? Are you just aiming for a
broad phonetic transcription, or do you want to transcribe fine phonetic detail
of production? Are intonation, gesture and possibly other features being
transcribed?
Sasi Project: I usually aim for broad phonetic transcription. More fine-grained
transcriptions would take a longer time.