Tuesday, December 31, 2019

ELAN to FLEx to ELAN


Here is a step-by-step procedure for generating a FLEx file from an ELAN file, and then generating an ELAN file from the FLEx file.

ELAN is a useful program for doing transcription. You can play a segment of a sound file over and over in order to transcribe it (like in Praat). In addition, that sound file is accompanied by a video file (unlike in Praat). However, for glossing, you want to use FLEx, which is directly connected to your dictionary (and so makes glossing easier).

Once you do the glossing in FLEx, you might want to display the glosses with the original transcription back in ELAN (along with any other minor changes you have made to the transcription and translations while working in FLEx).

Warning: The ELAN-->FLEx-->ELAN process is very sensitive to the characters used in the transcription line of the original ELAN file. If you have * or any kind of punctuation, lines might be missing from the resulting ELAN file. Furthermore, some changes to the baseline in FLEx cause similar problems in the output ELAN file (e.g., breaking up or combining lines in the baseline of FLEx). 
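If you want to check your transcription tier for such characters before exporting, here is a minimal Python sketch. It relies on the fact that an .eaf file is XML, with TIER elements containing ANNOTATION_VALUE elements. The set of characters treated as risky (ASCII punctuation) is my assumption, and the sample text is made-up placeholder material, not real Sasi data.

```python
import string
import xml.etree.ElementTree as ET

# Assumption: every ASCII punctuation character (including *) is treated as risky.
# Remove characters from this set if your orthography needs them.
RISKY = set(string.punctuation)

def find_risky_annotations(eaf_xml, tier_id):
    """Return (annotation id, text, risky characters) for each flagged annotation."""
    root = ET.fromstring(eaf_xml)
    flagged = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_id:
            continue
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            text = ann.findtext("ANNOTATION_VALUE") or ""
            bad = sorted(set(text) & RISKY)
            if bad:
                flagged.append((ann.get("ANNOTATION_ID"), text, bad))
    return flagged

# Made-up miniature .eaf content, for illustration only.
SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIER TIER_ID="Sasi" LINGUISTIC_TYPE_REF="default-lt">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>aaa bbb ccc</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>*ddd eee, fff</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""

for ann_id, text, bad in find_risky_annotations(SAMPLE, "Sasi"):
    print(f"{ann_id}: {text!r} contains {' '.join(bad)}")
```

To run this on a real file, pass the file's contents instead of SAMPLE, for example Path("K_Bojalwa.eaf").read_bytes() with Path imported from pathlib, and use the name of your own transcription tier.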

ELAN to FLEx

It took me 16 iterations and roughly a whole morning to figure out the following procedure, since the instructions that ELAN gives are not obvious. Nor are the answers to your questions to be found in the ELAN help files or manuals.

In my set-up in ELAN, there is one participant. For that participant, there is a transcription tier for Sasi, and two translation tiers: one for Setswana and one for English. Other set-ups will not work exactly as below. Consult the following helpful link for further information:



1.
Open your ELAN file.

Comment: In my case, the ELAN file is called K_Bojalwa.eaf. K is the first initial of the name of the speaker, and Bojalwa means traditional beer in Setswana. The extension .eaf is for ELAN files. In this short video, K describes how to make bojalwa in the language Sasi.

In my ELAN file, I have one tier for transcription and two tiers for translation. I do not attempt to do the glossing in ELAN, since there is no support for glossing there (there is no built in dictionary, etc.). I named these tiers as follows:

Sasi [Note: This tier has no parent.]
Setswana [Note: Parent tier is Sasi.]
English [Note: Parent tier is Sasi.]
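If you want to confirm that your own file has this tier structure, the following Python sketch lists each tier with its parent. It assumes only that the .eaf file is XML with TIER elements carrying TIER_ID and (for child tiers) PARENT_REF attributes. The sample content and the linguistic type names in it are made up for illustration.

```python
import xml.etree.ElementTree as ET

def tier_parents(eaf_xml):
    """Map each TIER_ID to its PARENT_REF (None for a tier with no parent)."""
    root = ET.fromstring(eaf_xml)
    return {t.get("TIER_ID"): t.get("PARENT_REF") for t in root.iter("TIER")}

# Made-up miniature .eaf content mirroring the three tiers described above.
# The LINGUISTIC_TYPE_REF values are hypothetical.
SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIER TIER_ID="Sasi" LINGUISTIC_TYPE_REF="default-lt"/>
  <TIER TIER_ID="Setswana" PARENT_REF="Sasi" LINGUISTIC_TYPE_REF="translation"/>
  <TIER TIER_ID="English" PARENT_REF="Sasi" LINGUISTIC_TYPE_REF="translation"/>
</ANNOTATION_DOCUMENT>"""

for name, parent in tier_parents(SAMPLE).items():
    print(f"{name}: parent = {parent or 'none'}")
```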

2.
In ELAN, select: “File > Export As > FLEx File”
This will bring up a dialogue box called “Export as FLEx File”. This box has a sequence of four windows (I will call them 1/4, 2/4, 3/4 and 4/4). I will discuss what you need to do at each of the four windows below.

3.
Step 1/4: Element Mapping
Just select “Next” here (don’t change anything)

4.
Step 2/4: Element-Item Configuration
Just select “Next” here (don’t change anything)

5.
Step 3/4: Element-Item ‘type’ and ‘language’ attribute configuration
You should be presented with three tiers (in my case, Sasi, Setswana and English) that you need to set ‘type’ and ‘language’ values for.

Comment: This is the most important window. These values need to be exactly right in order to guarantee that the output from ELAN works in FLEx. In other words, the values for the ‘type’ and ‘language’ attributes are FLEx values that you are entering into ELAN so ELAN knows how to produce the correct output file (with extension .flextext) that can be used by FLEx.

6.
First set the types of the three tiers in window 3/4.

The type of the transcription tier should be txt (this should already be set for you). The types of the two translation tiers should be gls. For the types of the translation tiers, you select from a pull-down menu.

Comment: The types for the translation tiers seem counter-intuitive to me (since they are translations and not glosses), but other choices did not work.

7.
Next, set the languages of the three tiers.
Since the process is not immediately obvious, I will describe each of the tiers separately.

8.
The language of the transcription tier needs to match the vernacular language name in FLEx. If it does not match exactly, it will not work.

The FLEx internal language codes are not part of ELAN. So the language name will not show up in the pull-down menu in window 3/4. You need to go to the bottom of window 3/4, click “language” after “Add/remove values for”. In the field “Add custom value”, add the name for your transcription language (as given by FLEx).

To get the exact FLEx name, open FLEx (and the particular FLEx project you are working on) and select “Format > Set up writing system”. In my case, the FLEx internal name of Sasi is: huc-Latn-BW. This is a name that FLEx itself generated when I set up my project.

Once you have added the FLEx internal name for your transcription language to ELAN, you click “select a language” for the transcription tier and select the language name from the pull-down menu.

9.
The language names of the translation tiers need to match the internal FLEx names. Once again, go to “Format > Set up writing system” in FLEx to find those names. In my case, the internal FLEx name for Setswana is tn-Latn-BW. Once again, you need to go to the bottom of window 3/4 and click “language”. In the field “Add custom value”, you type in the language name.

The language of the English translation tier is en. This language name also needs to be added at the bottom of window 3/4.

10.
Once you have set the types and the languages of the three tiers, it should look something like this (it will look different for your project, since the names of the tiers in your ELAN project will be different, and the languages you use will be different too):

Sasi        txt    huc-Latn-BW
Setswana    gls    tn-Latn-BW
English     gls    en

11.
You are now done with window 3/4, so click Next.

12.
Step 4/4: Save as
Save the file in a convenient location. You will be prompted to add the extension .flextext to the file. In my case, the resulting file name is: K_Bojalwa.flextext.
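Before importing into FLEx, you can check that the saved file carries the type and language values you set in window 3/4. Here is a minimal Python sketch. It assumes that a .flextext file is XML in which each line of text is an item element with type and lang attributes; the exact nesting that ELAN writes may differ from my made-up sample, so the code only looks for item elements wherever they occur. The expected values are the ones from my project and will differ for yours.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def item_type_lang_counts(flextext_xml):
    """Count the (type, lang) pairs found on all item elements."""
    root = ET.fromstring(flextext_xml)
    return Counter((i.get("type"), i.get("lang")) for i in root.iter("item"))

# Made-up miniature .flextext content with placeholder text.
SAMPLE = """<document>
  <interlinear-text>
    <paragraphs><paragraph><phrases>
      <phrase>
        <item type="txt" lang="huc-Latn-BW">aaa bbb ccc</item>
        <item type="gls" lang="tn-Latn-BW">placeholder</item>
        <item type="gls" lang="en">placeholder</item>
      </phrase>
    </phrases></paragraph></paragraphs>
  </interlinear-text>
</document>"""

# The pairs set in window 3/4 (values from my project; replace with yours).
EXPECTED = {("txt", "huc-Latn-BW"), ("gls", "tn-Latn-BW"), ("gls", "en")}

counts = item_type_lang_counts(SAMPLE)
unexpected = set(counts) - EXPECTED
for pair, n in counts.items():
    print(pair, n)
print("unexpected pairs:", unexpected or "none")
```

Any pair reported as unexpected points to a tier whose type or language was not set as intended.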

13.
Open FLEx by clicking on the FLEx icon on your desktop.

14.
In FLEx, click: “Open a project”
And then open the project you want to upload the text to. In my case, the project name is “Sasi”.

15.
In FLEx, click on “Texts and Words” in the lower left-hand corner.

16.
Click on “File > Import > FLExText Interlinear”

17.
Browse to find the file you saved, and press OK.

18.
Go to “Title” in the “Text” window, and enter the name of the text, in my case K_Bojalwa.

19.
The transcription tier from ELAN should come up as the baseline in FLEx.

20.
Go to “Tools > Configure > Interlinear” and make sure that “Free Translation English” and “Free Translation Setswana” appear in the right-hand window. You can also use this opportunity to make sure the FLEx lines that you want are showing (I chose Word, Morphemes, Lex. Entries and Lex. Gloss).

21.
Click on Analyze.
In this window, you should now be able to complete the glossing of your text. You should also see translations for English (Free en) and Setswana (Free Set) under the glosses.


FLEx to ELAN

FLEx to ELAN is much simpler than ELAN to FLEx.

1.
In FLEx, go to “Texts and Words” and choose your text. Select “File > Export interlinear > ELAN, SayMore, FLEx”. Click “Export”.

2.
Save the file to the same folder as the .wav file.

3.
Open ELAN, and select “File > Import > FLEx File”

4.
Browse and select the .flextext file you created.
Browse and select the original .wav file
(that is, the .wav file that was used to create the original ELAN file).

5.
In the Import FLEx window:
Click on: Include "interlinear-text" element.
Choose "phrase" not "word" for the "Smallest time-alignable element".
If you choose "word" (and not "phrase"), the word tier will be divided according to time subdivision (instead of symbolic subdivision), which makes your file slow to load and slow to scroll through.
Click on: Create for all basic elements
Type in any number for the window “Duration per phrase element(ms)” (e.g., 2).


6.
Your ELAN display will have a large number of tiers, many of which you may not want to display. Just right click on one of the tier names, and select “Show/hide more”. Then you can select the tiers you want to display.

7.
Now you should have two ELAN .eaf files for the same oral text. In my case, they are called K_Bojalwa.eaf and K_bojalwa_2.eaf. The differences between them are as follows:

a.
K_Bojalwa.eaf: The original ELAN file. There are three tiers: a transcription tier and two translation tiers. In my case, the names of the tiers are Sasi, Setswana and English.

b.
K_bojalwa_2.eaf: The ELAN file that is the result of ELAN-->FLEx-->ELAN. There are four (or more) tiers, in my case: a transcription tier, a glossing tier and two translation tiers. The names of the tiers are completely changed. They now incorporate FLEx internal codes that ELAN uses to name the tiers.

I recommend that you keep both ELAN files, as they present the information in different ways and may be useful for different purposes.
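Since lines can go missing in the round trip (see the warning at the top), it may be worth comparing the number of annotations on the transcription tier in the two files. Here is a minimal Python sketch. It assumes that each annotation is an ANNOTATION element directly under its TIER element in the .eaf XML. The tier names in the round-trip sample are hypothetical placeholders; look up the actual names in your own file.

```python
import xml.etree.ElementTree as ET

def annotation_counts(eaf_xml):
    """Map each TIER_ID to the number of annotations on that tier."""
    root = ET.fromstring(eaf_xml)
    return {t.get("TIER_ID"): len(t.findall("ANNOTATION")) for t in root.iter("TIER")}

def missing_lines(original_xml, original_tier, roundtrip_xml, roundtrip_tier):
    """Number of annotations on the original tier that are absent after the round trip."""
    before = annotation_counts(original_xml)[original_tier]
    after = annotation_counts(roundtrip_xml)[roundtrip_tier]
    return before - after

# Made-up, heavily simplified .eaf content (empty annotations, no time slots).
ORIGINAL = """<ANNOTATION_DOCUMENT>
  <TIER TIER_ID="Sasi"><ANNOTATION/><ANNOTATION/><ANNOTATION/></TIER>
</ANNOTATION_DOCUMENT>"""

# "roundtrip-transcription" and "roundtrip-gloss" are hypothetical tier names.
ROUNDTRIP = """<ANNOTATION_DOCUMENT>
  <TIER TIER_ID="roundtrip-transcription"><ANNOTATION/><ANNOTATION/></TIER>
  <TIER TIER_ID="roundtrip-gloss" PARENT_REF="roundtrip-transcription"><ANNOTATION/></TIER>
</ANNOTATION_DOCUMENT>"""

lost = missing_lines(ORIGINAL, "Sasi", ROUNDTRIP, "roundtrip-transcription")
print(f"Lines missing after the round trip: {lost}")
```

A result above zero means some lines were lost. A negative result means the round-trip file has more lines than the original, for example because a line was broken up in the FLEx baseline.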
