Saturday, July 7, 2018

A Syntactician Goes to CoLang 2018

Chris Collins
July 2018

1.         Introduction
I attended CoLang at the University of Florida, Gainesville from Monday June 18 to Friday June 29, 2018. I had always wanted to attend. The main reason I have not attended in the past is that I usually do fieldwork in the summer, and the timing of that fieldwork coincides with CoLang.
In this post, I outline how I ended up attending CoLang 2018, how the school was structured and what I learned from it.

2.         Background
CoLang was originally called InField and was first held at the University of California, Santa Barbara in 2008, which is relatively recent. The list of locations is found on a Wikipedia page:
The list of topics at CoLang usually includes things like FLEx, ELAN, video recording, audio recording, lexicography, grant writing, orthography, transcription, survey methods, pedagogical grammars, map making and other practical topics. These topics are all of great interest to me.
In many ways, I felt more comfortable in the company of my fellow CoLangers than I do in the company of other colleagues, who are increasingly focused on computational linguistics, neurolinguistics and psycholinguistics. While the areas of computational linguistics, neurolinguistics and psycholinguistics are of great importance, the main priority of linguistics should be to document and analyze endangered and understudied languages. If our goal is to understand the human language faculty, then one of the best paths to follow is through sophisticated fieldwork on a wide variety of languages.

3.         Why I Attended CoLang
            Even though I have wanted to attend CoLang for a very long time (since it was first offered as InField at Oregon), I have never had the time to attend. So why did I attend now?
            In September 2017, I applied for an NSF DEL grant to do work on the Khoisan languages of Botswana. In particular, my idea for the grant was to bring undergraduate and graduate students to Botswana to learn how to do fieldwork on the Khoisan languages. I felt that I had gotten a lot out of studying the Khoisan languages, and now I wanted to make sure there were others to carry on.
            I was very confident of my qualifications to carry out such an educational project. I have written two published grammars of Khoisan languages. Plus I have an unpublished one, and a short dictionary of Sasi. These efforts involved extensive text collection and extensive audio recording (but no video). In addition to these documents, I have numerous academic publications on the Khoisan languages. I have been working on the Khoisan languages for over 20 years, involving both summer field trips and three year-long field trips. My most recent trip was to Botswana as a Guggenheim scholar, where I studied Kua and Sasi in collaboration with Andy Chebanne of the University of Botswana. I felt that I had knowledge about how to do fieldwork on these languages and that students might be interested in learning what I knew. This includes not only elicitation and working with oral texts, but also knowledge of the structures of the languages, the relations between the languages, the locations of the speakers and knowledge of the country and the problems that come up in trying to do fieldwork on the Khoisan languages.
            The NSF reviewers were definitely not impressed. I received 29 pages of reviews, which were very uneven, some very positive and some very negative. For weaknesses, the panel said: “There are areas of worry. The first relates to the fact that the PI lacks previous extensive experience in documentary linguistics.” And they continued: “The panel wonders whether it might be possible for the PI as well as the RA to participate in Co-Lang. This might help the PI become more comfortable with the demands of documentation and the archiving of data.”
            So because of these recommendations, I was given a choice of no funding or attending CoLang. Since I had always wanted to attend, I was happy to agree to attend. The only drawback of my attendance is that I was not able to do fieldwork this summer (summer 2018), since CoLang occurs in the middle of the summer. This has made me very anxious given the age of my consultants. However, as will be seen below, I learned a lot from CoLang that is directly useful to me, so I do not regret attending.
            A side benefit is that NSF DEL also paid for Zach Wellstood, the RA on my Khoisan grant, to attend CoLang. Zach is much more technically proficient than I am, so we did not end up taking any of the same courses. But this means that we could exchange information with one another on the courses we did take.
            The abstract of my NSF DEL grant can be found here:
4.         Arrival and Settling in
I arrived in Gainesville on Saturday June 16, 2018 just to give myself a few days to get settled in. I am glad I did so. Florida is so hot that, paradoxically, you need to carry a sweater: the moment you enter any building, the air conditioning is so cold that you have to put it on, while outside the temperature is always in the 90s and humid. Even in the morning, when one might expect a cool breeze, it is just a bit less hot. You have to sleep with the air conditioner on, since it is too hot otherwise. Also, if you dare to turn the A/C off, the moisture in the room will accumulate, and there will be water on the floor and walls in the morning. I know because this happened to me, and I had to change rooms.
One of my very first experiences was meeting a couple from Alaska on the Sunday before classes started. They were heading out to Wal-Mart to pick up supplies, and I joined them. On the bus we had a long talk about the Khoisan populations, and parallels between the Khoisan populations and Native American populations. I expressed the desire to help my community with more than just language. I wanted to help them materially as well. The Alaskans had lots of ideas about this, and shared them freely. They suggested that I establish a non-profit 501(c)(3), to raise funds. I remained friends with this Alaskan group throughout CoLang, often eating breakfast with them. Later on in CoLang, I became friends with a group of Navajo linguists. I shocked them over dinner with the horrors of bed bugs in NYC. And then we were friends for the rest of the school. In fact, I would say that one of the most valuable aspects of attending CoLang for me was being able to talk to Native American community members to see what language meant to them and to hear about the activities that they were involved in.
During the first day in Florida, I also found a bike shop and rented a bike to use in the mornings before class. This kept me sane, and I got to see a lot of the University of Florida campus in the morning.

5.         My Classes
            I wanted to learn technical skills during CoLang. I have always been intimidated by FLEx and ELAN and I wanted to learn those. I have used FLEx and Toolbox to process my texts in the past, but always with the help of a student.
            I took the FLEx course for two consecutive weeks. The first week we focused on lexicon, and the second week we focused on texts. The first week was taught by Juliet Morgan and was organized around four specific data sets in Chickasaw. For example, in data set three we uploaded images and sound files to the lexicon. For all the FLEx and ELAN courses, I appreciated the hands-on, problem-solving organization of the classes. One non-trivial thing I learned in FLEx 1 was how to upload the sound files of example sentences into a lexicon. To do so, it is necessary to use audio as a writing system!
            The second week, taught by Carolyn O’Meara, discussed how to create glossed texts. We also discussed various ways to output the data. But what I owe to Carolyn the most is her encouraging me to use FLEx in my Field Methods course at NYU using Language Depot. I had all sorts of qualms about this, but she answered my questions. I now am thinking seriously of organizing all or part of the course around FLEx. It would be a great opportunity for students to get hands-on training in a widely used tool. It will also force us to take glossing and information sharing very seriously, since all of our information will be synced in FLEx.
            Because of FLEx I/II I have decided to redo my whole Sasi lexicon. I was about to publish it. In fact, I had already sent it to a publisher. But now, I feel that it should be online (e.g., using Webonary) and in App form (e.g., using Mother Tongues Dictionaries or SIL Dictionary App Builder), so that it is maximally accessible to the community. They will be the ones who will benefit from it the most. I fear that publishing it with an academic publisher will preclude me from making it widely accessible. Since I do not need the publication of the dictionary for tenure or promotion or raises, and it is unlikely my department would give me much credit for writing the lexicon anyway (in spite of the bone-crushing effort it took to write it), my best option would be to simply make it available on the web and as an App to download. I have also learned about the existence of paper self-publishing options at CoLang (e.g., Lulu) that I intend to pursue for the Sasi lexicon so that the elders can have a paper copy.
            ELAN was much rockier than FLEx. It is much harder and less intuitive to use than FLEx. And there is a paucity of pedagogical materials on the web (unlike for FLEx). In the first week, taught by Andrea Berez-Kroeker, we learned how to input data into ELAN. In the second week, taught by Chris Cox, we learned how to output the data. Chris taught us some cool applications for packaging videos for public consumption. He also taught us how to add subtitles to a video posted on YouTube or Facebook. So all of this was very useful, and I anticipate using it in the future when I get more video of Sasi and other languages next summer.
I have been feeling guilty for years now that I have not obtained video of my consultants speaking, especially when they tell me their stories. They often get quite animated, and make interesting gestures, which I have regretted not being able to capture on video. For example, in telling the story of the woman who married a blue wildebeest, a N|uu consultant put her hands up like horns and snorted. It was hilarious, but I do not have it captured on film. From long experience, I know that audio recordings are tremendously useful (for transcription, glossing, prosody, etc.). I feel that the visual information provided by video has the chance to be equally useful to me. I would like to investigate how this video can be used in syntactic and semantic research.
So in addition to FLEx and ELAN, I took anything I could take having to do with film. The two courses were “Getting the Full Picture of Language Use and the Importance of Video” (weeks 1 and 2) taught by Mandana Seyfeddinipur and “Challenges and Solutions in Advanced Audio Visual Documentation” (week 2) taught by Ben Levine. Both of them gave plenary talks as well. For the second course, none of us had much background, so it turned into a kind of beginner's course.
“Getting” focused the first week on the importance of shooting video. The main argument was that gestures are systematic, and worthwhile documenting. An example of this argument comes from the work done by James Essegbey on Ewe. Because of cultural norms amongst the Ewe (and in West Africa in general), the use of the left hand is highly restricted. For example, one could never hand something to somebody with the left hand. As Essegbey showed, this restriction extends to giving directions, where the use of the left hand is avoided and when used, it is restricted to a small spatial area. We saw videos of people giving directions in Ewe, and it was interesting to see how the avoidance of the left hand worked.
Even though I am interested in these issues, I had already convinced myself that I needed to shoot video before CoLang began, so I was more interested in the second week of “Getting” than the first. In the second week, we shot film on campus and then Mandana gave us feedback. We were asked to film: (a) an interview, (b) somebody giving directions and (c) a conversation. We had to film people who were not part of the CoLang community. We had to take metadata, and to rotate the jobs. We were all given video equipment and we had to put it together and make it work.
From these sessions, and the filming I did in Ben Levine’s class, I learned lots of specific things about video that will be useful to me. For example, I learned that one has to spend time thinking about light and sound ahead of time, and to experiment with what works (e.g., make a site visit beforehand if you are recording in an unfamiliar place). Bad lighting can be a disaster, but it is fairly easy to avoid big mistakes if one plans ahead. Similarly, for sound and the use of mics.
I have been feeling crushed under the sheer amount of materials that I have gathered in the last 20 years (mostly notebooks and sound files, but other file types as well), which I have not yet archived in a permanent place. So I took the course “How to Organize Your Materials and Data for a Language Archive”, taught by Susan Kung (of AILLA), Alicia Niwagaba and Vera Ferreira. One complaint I have about this course is that it was only one week long. The course presented a large amount of material, and because it was only a week long we did not have much chance for in-class exercises. So if the CoLang organizers are reading this, please consider making it two weeks long. The course was helpful in many ways. For example, we discussed issues having to do with archiving and community rights. But I was a bit disappointed to not find much in the way of archiving for African languages (as compared to other areas of the world).
            I also wanted to take a lexicography course, which to my great disappointment was not offered this year. I have first hand experience writing a dictionary. I struggled through all of 2015-2016 putting together a Sasi dictionary, so I have lots of questions about the process. Hopefully in future years, CoLang will teach lexicography again.
            There were many other courses offered that I wanted to take. I was very interested in the course called “Pedagogical Grammar” but unfortunately it overlapped with another course I was taking. Similarly, I was interested in taking “Survey Methods”. Hopefully, I will be able to return to CoLang at another time and take these classes.

6.         Evening and Plenary Sessions
            Almost every day there was both a 1:00pm plenary session (immediately after lunch) and an evening session immediately after dinner (at 7:00). Two of these sessions were about funding (a plenary session by the director of the ELDP, Mandana Seyfeddinipur, who also taught my video class, and an evening session by Colleen Fitzgerald, the director of the NSF DEL, which awarded my Khoisan grant). I found these sessions quite informative, especially the ELDP one.
I have for a long time viewed the ELDP as a kind of impenetrable fortress, impossible to break into (in spite of the fact that I have reviewed grant applications for ELDP several times). I have had two students who have been rejected. But through the presentation, I am now less intimidated. I know what steps I need to follow to increase my chances. I have also learned that preference is given, where possible, to linguists from the communities being documented, a policy that I agree with. Because of my experience at CoLang, I am now planning to try to apply for funding at ELDP for the academic year 2019-2020 and to encourage my students to do so too. Furthermore, I met people at CoLang who have received ELDP grants who will be able to give me feedback on the application.
            Three of the sessions were on work being done at specific archives (e.g., PARADISEC), which emphasized for me once again the prominent place of archiving in fieldwork. Three of the sessions were films about various revitalization efforts in Native American communities. For example, we saw the film "We Still Live Here" on the evening of the first Tuesday. It is a film about the Wampanoag people of Massachusetts and how they brought their language back from over 100 years of dormancy. Lots of fascinating issues for a linguist. How do you know how to pronounce a word in a language that nobody still speaks? It was really inspiring to see linguistics in action. Another inspiring aspect of this film, from my perspective, was the role played by MIT linguists. It showed clearly how there is no conflict between being a dedicated theoretical linguist, and a serious fieldworker producing materials of great use to the local community. This is the Ken Hale tradition which I see my work as part of. The film also underscored the issue of linguist/community relations. The relationship between Ken Hale and the Wampanoag had a rocky start (for reasons discussed in the film). But in the end things worked out well, and they developed a good working relationship.
            There were other interesting sessions: Brent Henderson and Peter Rohloff gave a talk on language and health in Guatemala and Felix Ameka talked about how to go beyond what he called the Boasian trilogy (grammar, dictionary, texts).
Mostly because of the evening and plenary sessions, I was often on the point of physical exhaustion. I did not want to miss a single session, but we had four classes per day, a plenary talk just after lunch and an evening session almost every day. I felt I was not getting enough sleep.

7.         Lessons Learned
When I was in graduate school (1988-1993) there was nothing like CoLang or InField. There was really no community spirit amongst fieldworkers, at least that I had any contact with. I did have Ken Hale as a great example of a theoretical linguist who did a lot of fieldwork, and I did attend the courses he taught on field methods. But I never had any course or any instruction in practical issues such as sound recording.
In fact, there were not even digital recorders when I first graduated. We just used cassettes. I can remember starting to use digital recorders later, in the late 1990s and early 2000s. Digital recordings made recording, transcription and storage of recordings much more manageable. By 2003 I was recording everything.
Everything I learned from 1988 (when I entered graduate school) to my attendance at CoLang I learned through trial and error and hard work. These things include: (a) the importance of high quality sound recording, (b) how to create high quality sound recordings in a way that fits my workflow, (c) the importance of oral texts, (d) how to transcribe and translate oral texts, (e) the importance of writing grammars, (f) how to write a grammar, (g) the importance of writing a dictionary, (h) how to write a dictionary, (i) how to create an orthography, (j) how to write a spelling primer, (k) how to write a grant application (airplane tickets to Africa are costly!), (l) how to power equipment using solar power in the field (and many other gritty nuts and bolts issues of many different kinds that arise in the field). And there are still many important things I am learning about (e.g., archiving, putting a dictionary online, use of video, the use of photography, etc.).
When I say “the importance of” in these points, I mean the importance for my theoretical work. It took me a long time to learn these points in relation to doing fieldwork with theoretical goals in mind. For example, when working on an understudied language it is crucial to collect oral texts. Constructions occur in these oral texts that are difficult to elicit without already knowing that they are in the language. Oral texts are a gold mine of grammatical information that can be explored indefinitely. Even when studying the syntax of a language like English, the most well studied language on earth, one can learn a lot of things by looking at texts (using Google to find example sentences).
 These are all topics that are taught in some form or the other at CoLang. If I had access to CoLang early in my career, I could have avoided many missteps. For example, I still regret to this day the paucity of recorded material that I have for my first fieldwork with Khoisan languages. Also, had I attended a school like CoLang, it probably would not have taken me 20 years to start writing grammars. I probably would have started drafting grammars immediately.
What was missing at CoLang was a discussion of the relation between theory and data, and this is the main focus of Field Methods classes that I teach at NYU. In such a class, students work from the ground up, analyzing data. They establish the sound system, and then work on morphological and syntactic generalizations. Usually, they are required to write a final paper, which is the focus of the class. The students are expected to collect data, formulate and test hypotheses and argue for some specific analysis. So CoLang complements what is taught in a traditional Field Methods course. But CoLang has also given me a fresh perspective on the Field Methods course, and I hope I can incorporate more practical documentation skills into it.
            In part inspired by CoLang, I am investigating setting up a Fieldwork Discussion Group at NYU. Here is a preliminary description: “This discussion group will be about fieldwork in linguistics, and in particular about the connection between fieldwork and linguistic theory. The group will meet on average once every two weeks. Some topics include the following: (a) Discussions of recent and classical papers about fieldwork or involving a substantial fieldwork component. (b) Student and faculty presentations of ongoing work in the field. (c) Presentations by invited speakers on their own fieldwork experiences. (d) Discussions and workshops on software, equipment, workflow and tools. (e) Methodologies in the various domains (syntax, semantics, phonetics, etc.). (f) Archiving: selection of archive, access, restrictions, formats, costs, etc. (g) Ethics: consent forms, IRB approval, copyrights, community rights, etc. (h) Working with communities (establishing trust, teaching skills, revitalization, etc.) (i) Publishing fieldwork: traditional papers, grammars, primers, dictionaries, etc. (j) Funding sources and grant writing workshops.”
One thing that has changed in the field, I think, is the priority given to language archiving. Certainly this was not a priority in the 90s, and even in the 2000s it was not so acute. But now, to get an NSF DEL grant you need to have a DMP (“Data Management Plan”) and you need to have a letter of agreement with an archive (e.g., ELAR). And it seems that these archives are communicating with one another and adopting similar standards. There is even an umbrella organization, DELAMAN, of approved archives. These archives need to be distinguished from other kinds of repositories of data and academic papers. The big difference is that repositories (such as Zenodo) do not guarantee a permanent place for your materials. So this is something that is moving really fast and has become a big priority. I support this development. I don’t think it is a good idea for a fieldworker to sit on data in their computer or external hard drives in perpetuity. I think depositing in an open access archive not only guarantees the safety of the data, but also helps to create professional transparency (e.g., checking research results against data).
Archives charge a fee for their services. For example, the fee that I am required to pay by ELAR for archiving the materials from my NSF DEL grant is 8% (not including salaries). Apparently that is a standard fee used by several archives (e.g., AILLA) that has been negotiated in working with NSF DEL. If you have a 100,000 dollar grant, then 8,000 dollars will go toward archiving. In my CoLang class on archiving, it was explained that there is a monthly fee for server space associated with archiving, in addition to the salary of the archivist and occasional payments to programmers for changes in software. So the 8% fee is justified by the cost of doing the archiving. This kind of fee is a bit hard for me to accept, but realistically it is not out of sync with other internet prices. For example, I have one terabyte of space on Dropbox for 99 dollars a year. So for 10 years that is roughly 1,000 dollars. However, my strong preference would be to get away from the big archive model, and have lots more regional alternatives (e.g., hundreds of regional archives, instead of a dozen DELAMAN archives). And one could imagine other kinds of alternatives, e.g., based on the kinds of scientific goals of the fieldwork project (e.g., an archive of materials related to studies of tone, or an archive of ultrasound studies, etc.).
            DELAMAN is part of a network of organizations supporting fieldwork. Others include NSF DEL, ELDP/ELAR, the Max Planck Institute, CoLang, the Language Archive and SIL. There is a shared sense of purpose between these organizations. We as theoretical/generative/formal linguists have to make sure that we participate in this growing trend, take advantage of the resources being developed to accomplish our goals and help in the development of resources. Theoretical linguists are natural candidates to develop structured questionnaires for various domains (e.g., focus or relative clauses), standard elicitation materials and searchable databases, all of which are important for fieldwork.

8.         Conclusion
One worry I had going into CoLang is that there might be an anti-generative/anti-formal bias. Why I felt this is not entirely clear to me. In this particular case, the organizers (linguists at the University of Florida) are for the most part generative/formal linguists. However, I did not get any such feeling when I was at the school. I heard one comment that might have been interpreted as anti-generative, but that comment was certainly a one-off remark, and did not characterize CoLang at all.
In fact, there was no real discussion of analysis or theoretical framework (at least in the courses I took). In the ideal world, I would like to see discussions of the interrelation of theory and data at CoLang, in the same spirit as the ACAL 45, 2014 in Kansas, titled “Africa’s Endangered Languages: Documentary & Theoretical Approaches”. However, I also understand that the CoLang organizers are trying to emphasize certain practical topics related to data collection and data management and they are trying to appeal to a wide variety of people, including linguists and language activists interested in revitalization. So lack of emphasis on the relation between theory and data is not a huge drawback. The topic may be more appropriate for conferences like ACAL 45.
There is a kind of worldview about fieldwork that has taken shape recently, and is shared by major organizations (listed above). NSF DEL and ELDP also hold the purse strings, so it is important to familiarize yourself with this worldview if you eventually want to get funding for fieldwork. The following videos on applying for an NSF DEL grant are a good place to start. Some of the presenters were also teachers and speakers at CoLang 2018:
            In sum, I highly recommend that theoretical/formal/generative linguists attend CoLang, especially at the beginning of their career, but also later in their careers like me. You will learn a lot. It is a new world of technology and one needs a place where one can keep up with the changes. Also, one needs a place to share common fieldwork goals, to experience camaraderie, to brainstorm and to network. A trip to Wal-Mart can yield surprising outcomes.
Fieldwork can be mentally tough. Being in the field alone or even in a small group can be mentally and physically trying. It is nice to know that one has a community to rely on for resources and advice.
