Tuesday, March 10, 2026

Comparison of COCA and Google for Syntactic Research

Inversion Seminar

March 10, 2026

In this blog post, I give a brief comparison of searching for syntactic data on COCA (Corpus of Contemporary American English) and searching for syntactic data on the internet using Google (for a detailed discussion of the latter, see the appendix of Collins 2024).

Feature comparison (Internet versus COCA)

Size. Internet: vast. COCA: 1 billion words.

Note: The difference in size means that there are many more kinds of interesting examples that are accessible on the internet than on COCA.

Punctuation. Internet: not sensitive. COCA: very sensitive.

Note: Google ignores all punctuation in doing searches. COCA does searches of strings including punctuation (including the period and quotation marks). For certain syntactic topics, e.g., quotative inversion, this is a very useful feature.
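To make the point about punctuation concrete, here is a minimal Python sketch of a punctuation-sensitive string search. It is not COCA's query syntax. The sample sentences, the small verb list, the regular expressions and the function name `classify_quotatives` are all assumptions invented for illustration.

```python
import re

# Hypothetical sample text (invented for illustration, not corpus data).
SAMPLE = (
    '"I am tired," said John. '
    '"Let us go," Mary said. '
    'John said that he was tired. '
    '"Where are you?" asked the teacher.'
)

# A closing quotation mark preceded by , ? or ! marks the end of a quote.
# Inverted order: quote + verb + subject.
# Non-inverted order: quote + subject + verb.
VERBS = r"(?:said|asked|replied)"          # assumed toy list of quotative verbs
SUBJECT = r"(?:the\s+\w+|[A-Z]\w+)"        # "the N" or a capitalized name
INVERTED = re.compile(rf'[,?!]"\s+{VERBS}\s+{SUBJECT}')
NON_INVERTED = re.compile(rf'[,?!]"\s+{SUBJECT}\s+{VERBS}')

def classify_quotatives(text):
    """Return (inverted matches, non-inverted matches) found in text."""
    return INVERTED.findall(text), NON_INVERTED.findall(text)

inverted, non_inverted = classify_quotatives(SAMPLE)
print(inverted)
print(non_inverted)
```

In this sketch, the indirect report "John said that he was tired" is skipped because it contains no closing quotation mark. A search that ignores punctuation could not separate it from the true quotative examples.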

Statistics. Internet: very rough. COCA: precise.

Note: If I search for two variants of a construction (e.g., inversion versus no inversion in a quotative construction), I might want to compare the frequency of the two variants. COCA allows very precise comparison of numbers over the corpus. But for Google, the best one can do is zero versus few versus many. The exact numbers seem to be less meaningful for Google.
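As a rough illustration of the kind of comparison that exact corpus counts make possible, here is a minimal Python sketch. The counts are hypothetical numbers invented for this example, not real COCA results, and the function names `per_million` and `variant_share` are my own.

```python
def per_million(count, corpus_size):
    """Normalize a raw count to occurrences per million words."""
    return count * 1_000_000 / corpus_size

def variant_share(count_a, count_b):
    """Proportion of variant A among all tokens of the two variants."""
    total = count_a + count_b
    if total == 0:
        raise ValueError("no tokens of either variant")
    return count_a / total

# Hypothetical counts, invented for illustration only (not real COCA results).
CORPUS_SIZE = 1_000_000_000   # roughly the corpus size cited above for COCA
inverted_count = 300          # e.g. hits for the inverted variant
non_inverted_count = 100      # e.g. hits for the non-inverted variant

print(per_million(inverted_count, CORPUS_SIZE))
print(variant_share(inverted_count, non_inverted_count))
```

With these made-up counts, the inverted variant accounts for 0.75 of the tokens. With Google hit estimates, neither the counts nor the size of the underlying text collection is reliable, so a calculation of this kind would not be meaningful.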

Tagged Data. Internet: no. COCA: yes.

Note: The COCA corpus has a rich system of tagging for part of speech, so these categories can be used in syntactic searches. The internet is not tagged for part of speech, so no such categories can be used.
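To show what a search over part-of-speech categories amounts to, here is a minimal Python sketch that looks for a sequence of tags in a hand-tagged sentence. The tag names (QUOTE, VERB, NAME and so on) are simplified placeholders, not the tagset that COCA actually uses, and the function name `find_tag_sequence` is hypothetical.

```python
# Hypothetical hand-tagged sentence: "Go away," said Mary.
# The tags are simplified placeholders, not COCA's real tagset.
TAGGED = [
    ('"', "QUOTE"), ("Go", "VERB"), ("away", "ADV"), (",", "PUNCT"),
    ('"', "QUOTE"), ("said", "VERB"), ("Mary", "NAME"), (".", "PUNCT"),
]

def find_tag_sequence(tagged, pattern):
    """Return every run of words whose tags match pattern exactly."""
    n = len(pattern)
    hits = []
    for i in range(len(tagged) - n + 1):
        window = tagged[i:i + n]
        # Compare the tags in this window with the requested tag sequence.
        if [tag for _, tag in window] == list(pattern):
            hits.append([word for word, _ in window])
    return hits

# Closing quote + verb + proper name: a candidate for quotative inversion.
print(find_tag_sequence(TAGGED, ["QUOTE", "VERB", "NAME"]))
```

The search is stated over categories, so it would find the same pattern with any verb and any name. On untagged internet text, each verb and each name would have to be spelled out in the search string.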

Class Exercise on Using Online Tools in Syntactic Research

Class Exercise

Inversion Seminar

March 10, 2026

In the first half of the class, I will present some data that I have gathered from COCA (Corpus of Contemporary American English) on quotative inversion. In the second half of the class (starting from around 10:30am), we will do the following exercise.

Objective: Internet searches, online corpora and chatbots have become important tools in syntactic research. The exercise below explores various aspects of these methodologies, using the topics discussed in the seminar as a test case.

Instructions: Choose some property of quotative inversion (or another inversion construction in English) that you would like to investigate. For example, you could study the transitivity constraint, or the distribution of particles, or any other property that interests you. Then choose one of the projects below (or design your own project). Take about 20-30 minutes to carry out the project in class. You can work in small groups, if you prefer. When you finish, we will have mini-presentations (around 5 minutes each) of the results. After class, please send me a very short summary of what you discovered, including the data that you obtained. I will put these summaries in a Google Drive folder. 

Possible Projects:

1. Use COCA (https://www.english-corpora.org/coca/, or one of the other English corpora) to investigate some property of quotative inversion (or another inversion construction). Can the corpus be used to find interesting new data? Can it be used to give accurate statistical information? What challenges does the corpus pose? What benefits does it bring? What prompts did you use?

2. Use Google (or another search engine) to investigate some property of quotative inversion (or another inversion construction). Can the search engine be used to find interesting new data? Can it be used to give accurate statistical information? What challenges does the search engine pose? What benefits does it bring? What prompts did you use?

3. Use a chatbot (such as ChatGPT, Claude or Gemini) to investigate some property of quotative inversion (or another inversion construction). Can the chatbot be used to find interesting new data? Can it be used to give accurate statistical information? What challenges does the chatbot pose? What benefits does it bring? What prompts did you use?

4. Compare two of the corpora on english-corpora.org for their use in syntactic research. What are the pros and cons of each corpus for syntactic research?

5. Compare two different search engines (e.g., Google versus Bing or others) for their use in syntactic research. What are the pros and cons of each search engine for syntactic research?

6. Compare two different chatbots for their use in syntactic research. What are the pros and cons of each chatbot for syntactic research?

7. Compare two of the three methodologies above (corpora, search engines, chatbots). What are the pros and cons of each method for syntactic research?

Friday, March 6, 2026

400,000 Visits for Ordinary Working Grammarian

As of today, March 6, 2026, my blog Ordinary Working Grammarian has reached a total of 400,000 visits. The blog began on March 14, 2017, and it reached the 200,000 mark on October 19, 2024 (over seven years). The second 200,000 took about a year and a half. Currently the blog is visited by around 10,000 people per month. I feel that is a very large number for a blog whose content concerns natural language syntax and linguistic fieldwork.

In celebration of this milestone, I am posting a list of my most popular blog posts over the last year (in order of popularity). I have a broad readership throughout the world, so if you want to post as a guest, please let me know! I welcome different points of view, even those very different from my own, as long as the subject matter is syntax or fieldwork.

1. Writing a Statement of Purpose for Linguistics Graduate School

2. Togo Diary (June-July 2025)

3. On Foundational Work in Syntactic Theory

4. Undergraduate Introduction to Syntax (Lectures, Spring 2026)

5. Statement of Purpose Examples

6. Some Scribblings on Nasal Gobbling

7. Giving a Talk – Some Practical Advice

8. Statement of Objectives

9. Reading Group: Foundations of Minimalist Syntax (Spring 2026) (near final draft)

10. An Interview with Paul Postal


Sunday, February 22, 2026

Top Twenty Field Defining Papers for Generative Syntax

Which works define the field of generative syntax?

The following list is the set of the top twenty books, theses, book chapters and journal articles in generative syntax ordered in terms of total citations from Google Scholar. I have left out the following:

a. Works by Noam Chomsky, which have much higher citation rates than the works below.

b. Textbooks.

c. Edited books containing different authors.


1. Citations: 12,844

Fillmore, Charles J. 1968. The Case for Case. In Emmon Bach and Robert T. Harms (eds.), Universals in Linguistic Theory, 1-88. Holt, Rinehart and Winston, New York.

2. Citations: 10,627

Kayne, Richard. 1994. The Antisymmetry of Syntax. MIT Press, Cambridge.

3. Citations: 10,299

Jackendoff, Ray. 1972. Semantic Interpretation in Generative Grammar. MIT Press, Cambridge.

4. Citations: 10,257

Rizzi, Luigi. 1997. The Fine Structure of the Left Periphery. In Liliane Haegeman (ed.), Elements of Grammar: Handbook of Generative Syntax. Springer.

5. Citations: 10,202

Levin, Beth. 1993. English Verb Classes and Alternations: A Preliminary Investigation. University of Chicago Press, Chicago.

6. Citations: 9,250 

Ross, John Robert. 1967. Constraints on Variables in Syntax. Doctoral dissertation, MIT. Cambridge.

7. Citations: 9,244

Baker, Mark. 1988. Incorporation: A Theory of Grammatical Function Changing. University of Chicago Press, Chicago.

8. Citations: 8,019

Abney, Steven. 1987. The English Noun Phrase in its Sentential Aspect. Doctoral dissertation, MIT. Cambridge.

9. Citations: 7,912

Cinque, Guglielmo. 1999. Adverbs and Functional Heads: A Cross-Linguistic Perspective. Oxford University Press, Oxford.

10. Citations: 7,204

Pollock, Jean-Yves. 1989. Verb Movement, Universal Grammar and the Structure of IP. Linguistic Inquiry 20, 365-424.

11. Citations: 6,962

Grimshaw, Jane. 1990. Argument Structure. MIT Press, Cambridge.

12. Citations: 6,404

Rizzi, Luigi. 1990. Relativized Minimality. MIT Press, Cambridge.

13. Citations: 5,821

Burzio, Luigi. 1986. Italian Syntax: A Government-Binding Approach. Springer.

14. Citations: 5,792

Huang, James. 1998. Logical Relations in Chinese and the Theory of Grammar. Garland Publishing, New York.

15. Citations: 5,677

Larson, Richard K. 1988. On the Double Object Construction. Linguistic Inquiry 19, 335-391.

16. Citations: 4,556

Stowell, Tim. 1981. Origins of Phrase Structure. Doctoral dissertation, MIT. Cambridge.

17. Citations: 4,537

Levin, Beth and Malka Rappaport Hovav. 1995. Unaccusativity: At the Syntax-Lexical Semantics Interface. MIT Press, Cambridge.

18. Citations: 4,523

Jackendoff, Ray. 1977. X’-Syntax: A Study of Phrase Structure. MIT Press, Cambridge.

19. Citations: 4,221

Kratzer, Angelika. 1996. Severing the external argument from its verb. In Johann Rooryck and Laurie Zaring (eds.), Phrase Structure and the Lexicon (pp. 109–137). Kluwer Academic Publishers.

20. Citations: 3,902

Perlmutter, David. 1978. Impersonal Passives and the Unaccusative Hypothesis. Proceedings of the Annual Meeting of the Berkeley Linguistics Society 4, 157-189.

Thursday, February 19, 2026

Top Ten Field Defining Papers for Generative Syntax

Which works define the field of generative syntax?

The following list is the set of the top ten books, theses, book chapters and journal articles in generative syntax ordered in terms of total citations from Google Scholar. I have left out the following:

a. Works by Noam Chomsky, which have much higher citation rates than the works below.

b. Textbooks.

c. Edited books containing different authors.

d. Works in adjacent subdisciplines, such as semantics or phonology.

Summary:

I would roughly divide these into three classes:

A. Classic

Ross 1967

B. Argument Structure

Fillmore 1968, Levin 1993, Baker 1988, Grimshaw 1990

C. Cartography

Kayne 1994, Rizzi 1997, Abney 1987, Cinque 1999, Pollock 1989


1. Citations: 12,843

Fillmore, Charles J. 1968. The Case for Case. In Emmon Bach and Robert T. Harms (eds.), Universals in Linguistic Theory, 1-88. Holt, Rinehart and Winston, New York.

2. Citations: 10,610

Kayne, Richard. 1994. The Antisymmetry of Syntax. MIT Press, Cambridge.

3. Citations: 10,253

Rizzi, Luigi. 1997. The Fine Structure of the Left Periphery. In Liliane Haegeman (ed.), Elements of Grammar: Handbook of Generative Syntax. Springer.

4. Citations: 10,199

Levin, Beth. 1993. English Verb Classes and Alternations: A Preliminary Investigation. University of Chicago Press, Chicago.

5. Citations: 9,244

Ross, John Robert. 1967. Constraints on Variables in Syntax. Doctoral dissertation, MIT. Cambridge.

6. Citations: 9,242

Baker, Mark. 1988. Incorporation: A Theory of Grammatical Function Changing. University of Chicago Press, Chicago.

7. Citations: 8,018

Abney, Steven. 1987. The English Noun Phrase in its Sentential Aspect. Doctoral dissertation, MIT. Cambridge.

8. Citations: 7,910

Cinque, Guglielmo. 1999. Adverbs and Functional Heads: A Cross-Linguistic Perspective. Oxford University Press, Oxford.

9. Citations: 7,204

Pollock, Jean-Yves. 1989. Verb Movement, Universal Grammar and the Structure of IP. Linguistic Inquiry 20, 365-424.

10. Citations: 6,962

Grimshaw, Jane. 1990. Argument Structure. MIT Press, Cambridge.

New York Stories: Space Market

Space Market was buzzing. Students circled the buffet scooping food into their dishes. It was 12:30pm, time for lunch. There were at least 10 customers just at the buffet alone.

The little old guy was rooting in the vegetables with his bare fingers, scraping to get one or two more pieces into his hand. He clenched his hand into a fist to hide them, but I caught the white edge of a piece of cauliflower peeking out between his thumb and forefinger. He held his fist down at his side, and when he thought nobody was looking, popped a vegetable into his mouth, chewing it with as little motion as possible. Then he shuffled down a few feet and started the same process with another one of the food trays that had been displayed in the buffet.

The man was short with a bushy grey beard framing his pink lips. He was obviously homeless, judging by the clothing he was wearing. But I have seen much worse, and by New York standards, he was actually in pretty good shape. Most likely he was one of the many homeless hanging out in Washington Square Park, just across the street, and had made his way over for his lunchtime routine. As he was rooting around, he had a blank expression on his face, which I now believe was part of his way of hiding his activities. The other odd detail was that he did not carry a food tray in his circuit around the buffet. So if you were willing to put two and two together, it was pretty obvious what he was up to.

At the time I was trying to fill my disposable food dish, carefully selecting my proteins, fats, fibers and carbohydrates into a well-balanced meal, which I would take to the front of the store and pay 20 dollars for. When I realized what he was up to, a great wave of nausea passed through me. I felt like I was going to throw up. But I had already filled half of the food dish, so I could not really abandon my task. Instead, I started staring at the old guy trying to get his attention.

When he looked up, our eyes locked. He did not look away, but rather stared back at me, once again with no expression on his face. I was hoping that just by staring at him, he would become dissuaded and leave, but no such luck. He seemed to be saying to me “What are you going to do about it?” The nausea just got worse and worse.

I packed my little meal, putting on a plastic lid, and bending the tinfoil to clamp down on the lid. Then I went to the front of the store to pay. As I left, I said to the cashier, with whom I had a friendly relationship, “There is a little old guy digging into the food with his fingers and eating it.” She looked exasperated and glanced to the back of the store, but did not make any move to call the manager or any other worker at the store. Because I left as quickly as possible, I am not sure how the situation played out. The typical New York response would definitely be to ignore it, hoping it would go away.

I am pretty sure that is the last time I will eat at Space Market.

Sunday, February 15, 2026

Implicit Arguments versus Implicit Predicates

Abstract: Much has been written in the syntax literature about implicit arguments. In this squib, I introduce the term ‘implicit predicate’. I define implicit predicates and argue that they are syntactically projected, exactly as Collins 2024 argues for implicit arguments.

Keywords: implicit arguments, implicit predicates, serial verb constructions

Implicit Arguments versus Implicit Predicates