Tuesday, March 17, 2026

Book Notice: The Laws of Thought (2026)

Griffiths, Tom. 2026. The Laws of Thought. Henry Holt and Company.

Griffiths 2026 traces three distinct traditions in cognitive science, and tries to show how they can be woven together in a single theory. It characterizes the contributions in terms of Marr's famous theory of levels of analysis: computational, algorithmic, and implementation. Logic (symbolic rule systems) and Bayes' rule constitute the computational level. Neural networks constitute the implementation level. As an overview, I enjoyed the book. I think it did a fair job of presenting early generative syntax (before Aspects). I also appreciated the historical overview of artificial neural networks, starting from the work of McClelland and Rumelhart all the way through modern LLMs. It did a good job of presenting Bayes' rule for beginners, and why it might be useful for cognitive scientists, with some pointers to recent work in this area.

Reconstruction and the New Minimalism

Abstract: In this brief note, I discuss a problem that arises within the theoretical framework for natural language syntax based on Hopf algebra (Marcolli et al. 2024). The framework does not capture reconstruction effects that are the main empirical motivation for the copy theory of movement, and a cornerstone of minimalist syntax.

Keywords: reconstruction, internal Merge, copies, No Tampering Condition


Tuesday, March 10, 2026

Comparison of COCA and Google for Syntactic Research

Inversion Seminar

March 10, 2026

In this blog post, I give a brief comparison of searching for syntactic data on COCA (Corpus of Contemporary American English) and searching for syntactic data on the internet using Google (for a detailed discussion of the latter, see the appendix of Collins 2024).

Property: Internet / COCA

Size: vast (Internet) / 1 billion words (COCA)

Note: The difference in size means that there are many more kinds of interesting examples that are accessible on the internet than on COCA.

Punctuation: not sensitive (Internet) / very sensitive (COCA)

Note: Google ignores all punctuation in doing searches. COCA does searches of strings including punctuation (including the period and quotation marks). For certain syntactic topics, e.g., quotative inversion, this is a very useful feature.
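To make the point about punctuation concrete, here is a minimal Python sketch. It does not use COCA or Google at all. The three sentences are invented for illustration (they are not corpus data), and the helper strip_punctuation is a hypothetical stand-in for a search that ignores punctuation. The sketch only shows why keeping the quotation marks and commas makes it possible to separate quotative inversion from other uses of the same words.

```python
import re

# Invented example sentences (NOT corpus data), used only for illustration.
SENTENCES = [
    '"I am leaving," said Mary.',       # quotative inversion
    '"I am leaving," Mary said.',       # quotative, no inversion
    'Mary said that she was leaving.',  # indirect speech, no quote
]

# Punctuation-sensitive patterns: the comma and closing quotation mark
# anchor the search to the end of a direct quote.
INVERTED = re.compile(r'," said [A-Z]\w+')      # quote + verb + subject
UNINVERTED = re.compile(r'," [A-Z]\w+ said\b')  # quote + subject + verb


def strip_punctuation(text):
    """Hypothetical stand-in for a punctuation-insensitive search:
    delete everything that is not a word character or whitespace."""
    return re.sub(r"[^\w\s]", "", text)


inverted_hits = [s for s in SENTENCES if INVERTED.search(s)]
uninverted_hits = [s for s in SENTENCES if UNINVERTED.search(s)]

# Once punctuation is gone, the string "Mary said" matches both the
# uninverted quotative and the indirect speech example.
loose_hits = [
    s for s in SENTENCES
    if re.search(r"\bMary said\b", strip_punctuation(s))
]

print("inverted:", inverted_hits)
print("uninverted:", uninverted_hits)
print("punctuation ignored:", loose_hits)
```

In this toy setting, the punctuation-sensitive patterns pick out exactly one sentence each, while the punctuation-insensitive search lumps the quotative together with indirect speech.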

Statistics: very rough (Internet) / precise (COCA)

Note: If I search for two variants of a construction (e.g., inversion versus no inversion in a quotative construction), I might want to compare the frequency of the two variants. COCA allows very precise comparison of numbers over the corpus. But for Google, the best one can do is zero versus few versus many. The exact numbers seem to be less meaningful for Google.
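The kind of comparison described in this note can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the counts 300 and 700 are hypothetical placeholders (not results from COCA), the corpus size is the 1 billion words given above, and the cutoff few_max used for the Google-style buckets is an arbitrary choice of mine.

```python
CORPUS_SIZE = 1_000_000_000  # COCA size in words, as given in the comparison


def per_million(count, corpus_size=CORPUS_SIZE):
    """Normalized frequency: occurrences per million words."""
    return count * 1_000_000 / corpus_size


def share(count_a, count_b):
    """Proportion of variant A among all tokens of A and B."""
    total = count_a + count_b
    if total == 0:
        raise ValueError("both counts are zero, so no share can be computed")
    return count_a / total


def coarse_bucket(estimated_hits, few_max=20):
    """Reduce an unreliable hit estimate to zero / few / many.
    The cutoff few_max is an arbitrary, hypothetical threshold."""
    if estimated_hits == 0:
        return "zero"
    return "few" if estimated_hits <= few_max else "many"


# Hypothetical placeholder counts, NOT actual corpus results.
inverted_count = 300
uninverted_count = 700

print("inverted per million words:", per_million(inverted_count))
print("share of inverted variant:", share(inverted_count, uninverted_count))
print("search engine bucket:", coarse_bucket(12))
```

With exact corpus counts, the first two functions give numbers that can be compared across variants. With search engine estimates, only something like the coarse bucket seems defensible.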

Tagged Data: no (Internet) / yes (COCA)

Note: The COCA corpus has a rich system of tagging for part of speech, so these categories can be used in syntactic searches. The internet is not tagged for part of speech, so no such categories can be used.
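As a rough illustration of what tagging buys, here is a minimal Python sketch of a search over part-of-speech tags rather than over words. Everything in it is an assumption made for the example: the tag labels (QUOTE, VERB, PROPN, and so on) are simplified labels I made up and are not COCA's actual tagset, the tagged sentence is invented, and find_tag_sequence is a hypothetical helper, not part of any corpus interface.

```python
def find_tag_sequence(tagged, pattern):
    """Return the word sequences whose tags match `pattern` exactly,
    in order, at some position in the tagged sentence."""
    n = len(pattern)
    hits = []
    for i in range(len(tagged) - n + 1):
        window = tagged[i:i + n]
        if all(tag == wanted for (_, tag), wanted in zip(window, pattern)):
            hits.append([word for word, _ in window])
    return hits


# Invented sentence with simplified, made-up tags (NOT COCA's tagset).
TAGGED = [
    ('"', "QUOTE"), ("Go", "VERB"), ("away", "ADV"), (",", "PUNCT"),
    ('"', "QUOTE"), ("shouted", "VERB"), ("Mary", "PROPN"), (".", "PUNCT"),
]

# Closing quote + any verb + proper noun: a quotative inversion frame.
inverted = find_tag_sequence(TAGGED, ["QUOTE", "VERB", "PROPN"])
# Closing quote + proper noun + any verb: the uninverted frame.
uninverted = find_tag_sequence(TAGGED, ["QUOTE", "PROPN", "VERB"])

print("inverted frames:", inverted)
print("uninverted frames:", uninverted)
```

The search finds the inverted frame without naming any particular verb, which is exactly what a plain string search on untagged text cannot do.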

Class Exercise on Using Online Tools in Syntactic Research

Class Exercise

Inversion Seminar

March 10, 2026

In the first half of the class, I will present some data that I have gathered from COCA (Corpus of Contemporary American English) on quotative inversion. In the second half of the class (starting from around 10:30am), we will do the following exercise.

Objective: The internet, online corpora and chatbots have become important tools in syntactic research. The exercise below explores various aspects of these methodologies, using the topics discussed in the seminar as a test case.

Instructions: Choose some property of quotative inversion (or another inversion construction in English) that you would like to investigate. For example, you could study the transitivity constraint, or the distribution of particles, or any other property that interests you. Then choose one of the projects below (or design your own project). Take about 20-30 minutes to carry out the project in class. You can work in small groups, if you prefer. When you finish, we will have mini-presentations (around 5 minutes each) of the results. After class, please send me a very short summary of what you discovered, including the data that you obtained. I will put these summaries in a Google Drive folder. 

Possible Projects:

1. Use COCA (https://www.english-corpora.org/coca/, or one of the other English corpora) to investigate some property of quotative inversion (or another inversion construction). Can the corpus be used to find interesting new data? Can it be used to give accurate statistical information? What challenges does the corpus pose? What benefits does it bring? What prompts did you use?

2. Use Google (or another search engine) to investigate some property of quotative inversion (or another inversion construction). Can the search engine be used to find interesting new data? Can it be used to give accurate statistical information? What challenges does the search engine pose? What benefits does it bring? What prompts did you use?

3. Use a chatbot (such as ChatGPT, Claude or Gemini) to investigate some property of quotative inversion (or another inversion construction). Can the chatbot be used to find interesting new data? Can it be used to give accurate statistical information? What challenges does the chatbot pose? What benefits does it bring? What prompts did you use?

4. Compare two of the corpora on english-corpora.org for their use in syntactic research. What are the pros and cons of each corpus for syntactic research?

5. Compare two different search engines (e.g., Google versus Bing or others) for their use in syntactic research. What are the pros and cons of each search engine for syntactic research?

6. Compare two different chatbots for their use in syntactic research. What are the pros and cons of each chatbot for syntactic research?

7. Compare two of the three methodologies above (corpora, search engines, chatbots). What are the pros and cons of each method for syntactic research?

Friday, March 6, 2026

400,000 Visits for Ordinary Working Grammarian

As of today, March 6, 2026, my blog Ordinary Working Grammarian has reached a total of 400,000 visits. The blog began on March 14, 2017, and it reached the 200,000 mark on October 19, 2024 (over seven years). The second 200,000 took about a year and a half. Currently the blog is visited by around 10,000 people per month. I feel that is a very large number for a blog whose content concerns natural language syntax and linguistic fieldwork.

In celebration of this milestone, I am posting a list of my most popular blog posts over the last year (in order of popularity). I have a broad readership throughout the world, so if you want to post as a guest, please let me know! I welcome different points of view, even those very different from my own, as long as the subject matter is syntax or fieldwork.

1. Writing a Statement of Purpose for Linguistics Graduate School

2. Togo Diary (June-July 2025)

3. On Foundational Work in Syntactic Theory

4. Undergraduate Introduction to Syntax (Lectures, Spring 2026)

5. Statement of Purpose Examples

6. Some Scribblings on Nasal Gobbling

7. Giving a Talk – Some Practical Advice

8. Statement of Objectives

9. Reading Group: Foundations of Minimalist Syntax (Spring 2026) (near final draft)

10. An Interview with Paul Postal