Wednesday, May 6, 2020

Grammaticality Judgments versus Acceptability Judgments

In this post, I discuss the use of the phrases “grammaticality judgment” and “acceptability judgment”. I end the post by suggesting that the phrase “grammaticality judgment” should be used for the kind of data that syntacticians gather in a judgment task.

Myers (2009) comments on the use of the phrases “grammaticality judgment” and “acceptability judgment”:

(1) “Unfortunately, many linguists, even those who should know better, persist in using the misleading phrase ‘grammaticality judgment’ (e.g. Schütze 1996; Penke and Rosenbach 2004; Featherston 2007). As Newmeyer (2007) points out, a grammaticality judgment, strictly speaking, is a theoretical claim, not evidence at all… Fortunately, it seems that ‘acceptability judgment’ is slowly becoming more common in the literature, as reflected in Figure 1.”

I have heard such opinions voiced widely amongst syntacticians, especially those who do quantitative/lab work. There are two basic sets of claims that people seem to be making:

(2a) Grammaticality is a theoretical property. It is the property of being generated or allowed by a particular grammar. We are unable to measure this directly, since speakers are unable to access information directly about what is allowed by a particular grammar.

(2b) Acceptability, on the other hand, can be influenced by a wide variety of factors, such as sentence length, processing difficulty, prescriptive rules, register clashes, semantic anomaly, and lexical choice, among many others. We are able to measure acceptability directly. The consultant accepts a sentence (to various degrees) or does not.

I am willing to accept the claims made in (2a) and (2b) for the sake of argument.

The point I want to make is that when we as syntacticians obtain judgments, we are trying to obtain information on grammaticality, not acceptability.

Acceptability covers a vast terrain. I enumerate a few different ways in which a sentence may be unacceptable.

For example, if a sentence is semantically strange, then it may be judged as unacceptable:

(3) The penguin flew high over the rooftops to find his flock.

The above sentence is not only false, but strange, because everybody knows that penguins do not fly over rooftops, and so there is no situation (in the real world) in which the sentence would be true. The most famous example of a semantically anomalous sentence is “Colorless green ideas sleep furiously.”

Similarly, if a sentence contains a register clash, it may also be judged as unacceptable:

(4) Would your Majesty like her fucking tea on the veranda?

In this example, the use of ‘your Majesty’ (addressing royalty) indicates a high register (reinforced by the third person singular pronoun ‘her’). But the use of the swear word ‘fucking’ indicates a low register. So, this is a register clash.

A violation of prescriptive rules might also be judged as unacceptable by some people:

(5) Who did you give the money to?

I have had many students in undergraduate Introduction to Syntax claim that sentences such as (5) are unacceptable, because they involve a stranded preposition and the use of ‘who’ rather than ‘whom’ (see Blanchette 2015 on negative concord in English for a particularly interesting case study of a construction that violates prescriptive rules).

Even an odd word choice could lead to unacceptability:

(6) My female sibling has two children.

In colloquial American English, we use the term ‘sister’ to refer to female siblings, so (6) is odd out of the blue. In fact, even thinking of a context that would render (6) fully natural is a challenge.

Difficult-to-process sentences might be judged as unacceptable, such as the following famous garden path example:

(7) The horse raced past the barn fell.

A reader who parses ‘raced’ as the past tense form, and not the passive participle, will judge (7) as unacceptable. From experience, I know that it takes students a bit of thought and time to be able to recover from the garden path illustrated in (7).

When we as syntacticians construct our test materials, we do everything possible to avoid having our data be influenced by the factors above. We design our test sentences to avoid semantic anomalies, register clashes, prescriptive rule violations, word choice problems and processing difficulties. So, in syntax, we are definitely not interested in the vast and nebulous notion of acceptability. We are only interested in acceptability as it bears on the grammatical system that a particular person uses.

But it could be argued that the participant does not have the same perspective as the syntactician. From their perspective, they are just saying how acceptable the sentence is. I think that this position is incorrect. When administering a judgment task with naïve participants, typically instructions are given about what the researcher is looking for. Then, crucially, practice sentences are given that illustrate to the participant what kinds of responses they should be giving. Those practice sentences usually illustrate clear cases where a sentence is unacceptable because of grammatical properties. Lastly, the participants can get a feel directly from the testing materials what is being investigated.

Of course, it could be argued that no matter what the syntactician does, in terms of designing test materials and designing the experimental task, they may fail to obtain information on grammaticality. Even given careful instructions and practice sentences, the participant may not be judging the sentences in the way expected by the syntactician. There may be very subtle confounds that the syntactician did not control for, so that the judgments given do not reflect grammaticality, but reflect something else (e.g., processing difficulty or prescriptive bias).

But in this case, I would say that the experiment has failed to yield information about grammaticality. In other words, the task was designed to yield information about grammaticality, but it failed in the purpose it was designed for because of the confounds. Note that this is quite different from saying that the experiment failed to yield interesting and useful information. The natural way to continue after such a failure would be to design new materials that avoid the confounds, or to try to explain the results by introducing some additional factor (not grammaticality).

For these reasons, a perfectly accurate term for the syntactician’s data might be “grammatical acceptability judgment”, meaning the following: an acceptability judgment of a sentence that is meant to probe whether or not the sentence conforms to the grammatical system of the participant. Such a task would be in contrast to a task eliciting a “semantic acceptability judgment”, a “register acceptability judgment” or a “processing acceptability judgment”.

But nobody ever uses such cumbersome terminology, so the choice seems to be between “grammaticality judgment” and “acceptability judgment”. The latter suffers from the problem that it does not tell us what the purpose of the study was, and it makes it seem like we are interested in some general notion of acceptability. Therefore, I suggest that we use “grammaticality judgment” to mean: an acceptability judgment of a sentence that is meant to probe whether or not the sentence conforms to the grammatical system of the consultant.

In order to illustrate my claims, I would like to respond to some remarks posted by Alan Munn on Facebook concerning an earlier draft of this post (on Wednesday, May 6, 2020). He began by outlining his objections to the use of the phrase “grammaticality judgment”:

“I seem to agree with you on most things Chris, but here I really disagree. Of course, as syntacticians we are trying to understand grammaticality, and in some contexts an acceptability judgement will indeed show us something about it, but I do think that making the distinction is important, especially so outside the context of fieldwork. As a fieldworker, you have a vastly enriched set of information you can get about the source of people's judgements, and so you have perhaps more reasons to be fairly certain that you are, in fact, testing grammaticality. But in the context of judgement studies with large(ish) numbers of random subjects be they undergraduates or Mechanical Turkers, there can be no such assurance. It's up to those who do such experiments to make the argument that the results of their acceptability judgement task are best explained by the grammaticality claim they are making. But I strongly think that the terms should not be conflated.”

Then to illustrate the kind of difficulty that can arise, he gave the following very striking example:

“To give you a concrete example of what we're talking about, I was once working with a native speaker consultant asking pretty simple judgements about sentences verbs involving psychological predicates like ‘scare’ in their language. A pretty standard example would be something like “The ghost scared John”. This speaker, who was an astrophysicist by training, would judge sentences with ‘ghost’ in them as unacceptable, even though if I changed the sentence to "The dog scared John" or “The train scared John”, they would judge them to be fine. Now I have no doubts that the 'ghost' sentences were also completely grammatical in terms of their syntax, but this speaker really didn't like sentences involving ghosts. I didn't press them on the reason, since it didn't matter for me. But this underlines the fact that people can judge even the simplest sentences as unacceptable for reasons that have no relation to the properties that we as linguists think we are testing. This is why it's so important not to conflate the two terms.”

From my perspective (as outlined in this blog post), the experiment was a grammaticality judgment task because the participant was asked to give an acceptability judgment of a sentence which was meant to probe whether or not the sentence conforms to the grammatical system of the participant. They were presented with the sentence “The ghost scared John”, which they judged as unacceptable because of the word choice ‘ghost’. So, it is clear that the participant was not judging this sentence on the basis of their own grammatical system, which would clearly allow “The ghost scared John”. In other words, I would say that the experiment was a grammaticality judgment task, but the participant failed to provide the information that they were intended to provide.

Acknowledgments: I thank Frances Blanchette for comments on several early versions of this post.

Blanchette, Frances. 2015. English Negative Concord, Negative Polarity, and Double Negation. Doctoral dissertation, CUNY Graduate Center.

Myers, James. 2009. Syntactic Judgment Experiments. Language and Linguistics Compass 3/1, 406-423.

