Abstract: This is an interview that I gave on fieldwork in my career. The interviewer is Essivi Collins, my daughter. The venue was a stroll on the busy streets of New York City. We walked from Washington Square Village to the Hudson, and then we turned right and walked up the Hudson.
Ordinary Working Grammarian
A blog about natural language syntax, fieldwork and life.
Saturday, April 4, 2026
Wednesday, April 1, 2026
Guidelines on Grading Undergraduate Syntax Assignments
Grading assignments is an essential part of the teaching process. But professors and TAs are expected to figure out how to do it on their own. This post outlines some guidelines that I have found to be useful when grading undergraduate syntax assignments.
The First Field Trip: 10 Pieces of Advice
You are a scholar at the beginning of your career, planning to do linguistic fieldwork in a distant location on a little-known language. By some miracle, you have obtained adequate funding, and you are now getting ready to go. Fieldwork is all about meeting your research objectives in less-than-ideal circumstances. This blog post outlines some essential advice for you on the eve of your first major expedition.
Tuesday, March 24, 2026
First Steps: Writing the Kpelegbe (Ewe) Dictionary
Kpelegbe is a dialect of Ewe spoken in Togo on the road from Kpalime to Atakpame. I started learning Ewe in Togo in the Peace Corps (1985-1987). Then, I wrote my thesis on the syntax of Kpelegbe (MIT, 1993):
Tuesday, March 17, 2026
Book Notice: The Laws of Thought (2026)
Griffiths, Tom. 2026. The Laws of Thought. Henry Holt and Company.
Griffiths 2026 traces three distinct traditions in cognitive science, and tries to show how they can be woven together in a single theory. It characterizes the contributions in terms of Marr's famous theory of levels of analysis: computational, algorithmic, and implementation. Logic (symbolic rule systems) and Bayes' rule constitute the computational level. Neural networks constitute the implementation level. As an overview, I enjoyed the book. I think it did a fair job of presenting early generative syntax (before Aspects). I also appreciated the historical overview of artificial neural networks, starting from the work of McClelland and Rumelhart all the way through modern LLMs. It did a good job of presenting Bayes' rule for beginners, and of explaining why it might be useful for cognitive scientists, with some pointers to recent work in this area.
Reconstruction and the New Minimalism
Abstract: In this brief note, I discuss a problem that arises within the theoretical framework for natural language syntax based on Hopf algebra (Marcolli et al. 2024). The framework does not capture reconstruction effects that are the main empirical motivation for the copy theory of movement, and a cornerstone of minimalist syntax.
Keywords: reconstruction, internal Merge, copies, No Tampering Condition
Tuesday, March 10, 2026
Comparison of COCA and Google for Syntactic Research
Inversion Seminar
March 10, 2026
In this blog post, I give a brief comparison of searching for syntactic data on COCA (Corpus of Contemporary American English) and searching for syntactic data on the internet using Google (for a detailed discussion of the latter, see the appendix of Collins 2024).
Size. Internet: vast. COCA: 1 billion words.
Note: The difference in size means that many more kinds of interesting examples are accessible on the internet than on COCA.
Punctuation. Internet: not sensitive. COCA: very sensitive.
Note: Google ignores all punctuation when searching. COCA searches strings including punctuation (such as the period and quotation marks). For certain syntactic topics, e.g., quotative inversion, this is a very useful feature.
Statistics. Internet: very rough. COCA: precise.
Note: If I search for two variants of a construction (e.g., inversion versus no inversion in a quotative construction), I might want to compare the frequency of the two variants. COCA allows very precise comparison of counts over the corpus. But for Google, the best one can do is zero versus few versus many. The exact numbers seem to be less meaningful for Google.
Tagged data. Internet: no. COCA: yes.
Note: The COCA corpus has a rich system of part-of-speech tagging, so these categories can be used in syntactic searches. The internet is not tagged for part of speech, so no such categories can be used.
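To make the punctuation and statistics points concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the sample sentences are invented (they are not drawn from COCA or from the web), and the regular expressions and function names are hypothetical. It does not reproduce COCA's query syntax or Google's search behavior. It shows how a punctuation-sensitive pattern separates inverted from uninverted quotatives, and how a punctuation-blind search picks up an irrelevant indirect-speech example.

```python
import re

# Invented sample sentences, for illustration only.
SAMPLE = [
    '"I am leaving," said Mary.',
    '"I am leaving," Mary said.',
    '"We should go," said the teacher.',
    'Mary said that she was leaving.',
    '"Stop," John said.',
]

# Punctuation-sensitive patterns (assumed for this sketch): both require
# a comma plus closing quotation mark right before the quotative clause.
INVERTED = re.compile(r',"\s+said\s+\w+')      # ," said Mary
UNINVERTED = re.compile(r',"\s+\w+\s+said\b')  # ," Mary said

# Punctuation-blind pattern: "said" followed by any word.
BLIND = re.compile(r'\bsaid\s+\w+')


def count_variants(sentences):
    """Count sentences matching the inverted and uninverted patterns."""
    inverted = sum(1 for s in sentences if INVERTED.search(s))
    uninverted = sum(1 for s in sentences if UNINVERTED.search(s))
    return inverted, uninverted


def strip_punctuation(sentence):
    """Remove everything except word characters and whitespace."""
    return re.sub(r"[^\w\s]", "", sentence)


def count_blind(sentences):
    """Count matches of 'said + word' once punctuation is ignored."""
    return sum(1 for s in sentences if BLIND.search(strip_punctuation(s)))


if __name__ == "__main__":
    inverted, uninverted = count_variants(SAMPLE)
    total = inverted + uninverted
    print("inverted:", inverted, "uninverted:", uninverted)
    print("share inverted:", inverted / total)
    print("punctuation-blind hits for 'said + word':", count_blind(SAMPLE))
```

On this toy sample, the punctuation-blind count includes "Mary said that she was leaving", which is not a quotative construction at all. The punctuation-sensitive patterns exclude it, and they give exact counts for the two variants that can be compared directly.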