This blog post is a prelude to a future blog post, Why Video? In that post I will
tackle the question of why a generative syntactician, like me, should care
about video. But before I get to that point, I need to tackle some background
issues concerning the source of data in syntactic fieldwork.
Disclaimer: I am in Botswana this year (2019-2020), and I have none of my precious books by my side. So I will state positions in this post with no references. This kind of writing frees me from the chains of what people have actually said, but it also means that you should be careful not to attribute what I say to any specific person or specific framework until you check the references.
Before
discussing the main issue, let me get one non-issue out of the way. In
generative grammar, the goal is to understand the human language faculty (UG)
which is argued to be innate. The syntactician looks at particular I-languages,
and tries to draw conclusions about UG (e.g., by poverty of stimulus arguments,
or by comparisons to other languages). Other researchers may not share this
framework. They may take a more functionalist perspective, and not assume the
existence of UG. What I have to say in the following blog post is largely independent
of these framework issues. Whatever perspective one takes on syntactic theory
(Minimalism, Principles and Parameters, Arc-Pair Grammar, HPSG, LFG, DM,
functionalism, etc.), it is important to establish basic empirical generalizations
about the language one is working with, the kinds of empirical generalizations
that one finds in a good descriptive grammar. The following blog posts
address the issue of how to get the kind of data that allows one to establish
those empirical generalizations.
In
classical generative grammar, in the so-called Chomskyan tradition, the primary
data is the grammaticality judgment. The following paradigm illustrates the
relevant concepts:
(1)
a. Mary wrote up the paper.
b. Mary wrote the paper up.
(2)
a. Mary jogged up the hill.
b. *Mary jogged the hill up.
The
data is organized into minimal pairs (e.g., (2a) vs. (2b)), and paradigms
(e.g., (1) vs. (2)). Minimal pairs differ by at most one simple property. For
example, the only difference between (2a) and (2b) is the position of the word up. The * in (2b) means that (2b) is
ungrammatical.
I
draw no distinction in this post between a grammaticality judgment and an
acceptability judgment. Syntacticians largely use these two terms
interchangeably. The assumption is that speakers of English, in hearing or
reading a sentence such as (2b), know intuitively that something is wrong. They have
a sense/perception/feeling that the sentence is ill-formed in some way. They
are also able to evaluate the degree of ungrammaticality. The grammaticality
levels are traditionally written with diacritics such as: ?, ??, ?*, *, ** (in
that order).
From
such tightly controlled data, one can draw solid conclusions. For example, from
the contrast between (2a) and (2b), the conclusion is that locative
prepositions cannot appear following the noun phrase. From the contrast between
(1b) and (2b), the conclusion is that up
has two uses: as a particle in (1) and as a locative preposition in (2).
Creating
and evaluating such sentences constitutes an experiment. The syntactician constructs a sentence, or a minimal
pair, and then evaluates the sentences for grammaticality. Hundreds of
sentences like (1) and (2) can be generated and tested in a short time, leading
to the possibility of developing sophisticated analyses of the data in a
relatively short time (e.g., hours or days, as opposed to months or years).
Generating sentences, testing them, and building analyses based on the results
is the bread-and-butter of modern syntax. But it takes training to learn how to
do this. There is a lot more to say about grammaticality judgments. But I will
leave all these issues aside in order to focus on the main topic of the essay.
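To make the generate-and-test workflow concrete, here is a minimal sketch (in Python) of how candidate pairs of the kind in (1) and (2) can be produced systematically. The particular verbs and objects are my own illustrative choices, and the judgments themselves of course come from the speaker, not from the script.

```python
# A minimal sketch of systematically generating candidate minimal pairs of
# the kind in (1) and (2). The verbs, "up"-uses, and objects below are my
# own illustrative choices, not a real test battery.

items = [
    ("wrote", "up", "the paper"),   # particle use of "up", cf. (1)
    ("jogged", "up", "the hill"),   # prepositional use of "up", cf. (2)
]

def minimal_pair(verb, up, obj):
    """Return the two orders to be judged: V up NP vs. V NP up."""
    return (f"Mary {verb} {up} {obj}.", f"Mary {verb} {obj} {up}.")

for verb, up, obj in items:
    a, b = minimal_pair(verb, up, obj)
    # Each pair is then presented for a grammaticality judgment; the
    # judgment itself (ok, ?, *, etc.) is the datum, not the sentence.
    print(f"a. {a}\nb. {b}\n")
```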
The
grammaticality judgments described above contrast with another, messier sort of
data. People use language all the time: greetings, conversations, talking to
themselves, buying and selling, school lectures, singing, television, movies, church
services, the Bible, newspaper articles, advertisements, recipes, warning signs,
instructions on packages, Google searches, e-mail messages,
Facebook posts, texting, etc. Our lives are literally saturated with language.
All
of this data is non-experimental, since it is not carefully designed and tested
for grammaticality by the syntactician. Rather, it simply occurs. Sometimes it
is recorded (e.g., written material accessible on the internet, or in recorded political
debates, etc.), but mostly not. Most of it (99.99…%) is never evaluated
explicitly for grammaticality. For lack of a better term, I will call such data
natural, in order to contrast it with the carefully controlled grammaticality
judgments described above.
So there are two types of
syntactic data: experimental and natural.
As a
side note, my usage of the term experimental
here is somewhat non-standard. Nowadays the term seems to have been
appropriated for syntactic data obtained in the lab, or obtained using a
questionnaire administered to a small group of subjects, or obtained with
Mechanical Turk. Such methods usually involve the use of statistics in
evaluating the data. This kind of data is an extremely important development in
modern syntax, and worth a lot of attention. But to call these methods of
collecting data experimental, to the exclusion of the traditional grammaticality
judgment task, is a huge error. The grammaticality judgment task is the ultimate
experiment: clearly defined, replicable, and tightly controlled, with a specific
range of outcomes.
From
the point of view of the grammaticality judgment task outlined above, natural data
found in the wild suffers from two main problems. First, such data is all
positive, in the sense that people do not normally produce ungrammatical
sentences in their speech. If they do produce such sentences, it is because
they are non-native speakers, or they have made some kind of error in speech
production (e.g., starting to say one sentence, but finishing with another). Since
the data is all positive, one cannot make any claim about what sentences cannot
be produced by a particular person speaking a particular language. But knowing
which sentences are ungrammatical is a powerful research tool. This is a
serious deficiency in the use of natural data.
The
second main problem is that such natural data is completely uncontrolled. It is
as if you looked out the window at the leaves blowing around, and tried to come
up with some kind of theory of leaf motion. But leaf motion is complex. You
need to factor in gravity, wind, the size and shape of the leaf, the dryness of
the leaf, and maybe many other factors. There are way too many factors to get a
clear grasp of what is going on. And even describing what is going on is
difficult. What leaf blowing patterns are out there, and which ones should you
attempt to explain? This is why it was so important that the data in (1) and
(2) was organized into minimal pairs and paradigms. Irrelevant factors are
cleared away, and one focuses on some very particular issue (e.g., the position
of up in (1b) and (2b)). Constructing
sentences and evaluating them allows one to focus narrowly on particular parts
of the sentence, and gives one confidence in ascribing the reason for ungrammaticality
to some particular property. This narrow focus in turn allows one to quickly
test various hypotheses, and to either reject or accept them.
Or to
put it another way, if one relied solely on natural data, it is unclear that
the relevant minimal pair in (2) would ever occur. And (2) is a relatively
simple paradigm. When looking at more complex things (such as the uses of
logophoric pronouns or the distribution of indefinite noun phrases), one needs
quite precise data to be able to figure out what is going on. And it is unclear
whether such precise data is forthcoming from a natural language corpus,
especially in the fieldwork scenario, where the corpus is often constructed
from transcribing a few hours of recorded oral texts.
To
clarify the issue, suppose you wanted to work out the pronominal system of a
language having a complex series of contrasts such as person, number, gender,
inclusive/exclusive, etc. One needs to get a paradigm of such pronouns in
maximally similar contexts (to see how the surrounding environment influences
the form and use of the pronouns). First, getting all the relevant pronouns
from oral texts might not even be possible for large pronominal inventories.
Second, getting them all in a similar context will definitely be impossible
even for a vast corpus (such as Google searches in English). But such
pronominal paradigms are a core part of the description of any language.
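To make the coverage problem concrete, here is a minimal sketch (assuming a hypothetical system with three persons, three numbers, and an inclusive/exclusive split in the first person non-singular) that enumerates the paradigm cells and checks how many of them happen to be attested in a small, invented stand-in for a collection of oral texts.

```python
from itertools import product

# A hypothetical pronominal system: 3 persons x 3 numbers, with an
# inclusive/exclusive split in the 1st person dual and plural.
persons = ["1", "2", "3"]
numbers = ["sg", "du", "pl"]

cells = []
for p, n in product(persons, numbers):
    if p == "1" and n != "sg":
        cells += [(p, n, "incl"), (p, n, "excl")]
    else:
        cells.append((p, n, None))

print(f"{len(cells)} paradigm cells to document")  # 11 cells in this toy system

# Invented stand-in for the cells actually attested in a few transcribed
# oral texts; in real fieldwork this set would come from the corpus.
attested = {("1", "sg", None), ("3", "sg", None), ("3", "pl", None),
            ("1", "pl", "incl")}

missing = [c for c in cells if c not in attested]
print(f"{len(missing)} cells never occur in the texts and must be elicited")
```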
The
conclusions can be summarized in tabular form:
(3)
                     Controlled/Uncontrolled    Positive/Negative
Gram. Judgments      Controlled                 Both
Natural Data         Uncontrolled               Positive
But
what about grammaticality judgments? Are there any drawbacks to gathering data purely
based on them? There are at least three main drawbacks, which I will outline
here. Before going into the problems, let me outline what I consider to be a
few non-problems.
A
possible objection to grammaticality judgments is that they are artificial, used
by syntacticians, but with no real connection to language use in the real world.
One reason to doubt such a claim is that people do in fact have systematic and
rich grammaticality judgments. It is hard to see how this could be so, if
grammaticality judgments were not somehow connected to the language abilities
of the speaker, and hence to their use of language in the real world. Second,
in real life there are areas where people do in fact employ grammaticality
judgments. If you hear a non-native speaker speaking English, and they make
mistakes (e.g., in the use of the indefinite article), you recognize it
immediately. You recognize the sentences that they use as ill-formed, and sense
there is something wrong. This recognition is a form of grammaticality
judgment. So this means that grammaticality judgments are not just an artifact
of the syntactician’s experiment, but something that everybody in every
language does all the time. In fact, grammaticality judgments serve a social role.
They allow you to pick out speakers of other dialects (e.g., the Pittsburgh
passive, or positive anymore, or
copula drop), and also to pick out people who have learned your language as a
second language. I am not trying to explain the existence of grammaticality
judgments in terms of their social role. That is, I am definitely not giving a functionalist
explanation of grammaticality judgments. I am merely pointing out that
grammaticality judgments are far from an esoteric task created by generative
syntacticians to amuse themselves.
To
deny the validity of grammaticality judgments is essentially a form of
disrespect toward the speakers. They have knowledge of the language. Many speakers
(but not all, see below) are easily able to judge sentences as grammatical or
not. Why not listen to what they have to say about their own language, at least
as one source of data?
Another
possible objection that could be raised against grammaticality judgments is
that they are dissociated from any context, so they are disembodied and
abstract in a way that day-to-day language is not. While it is true that
linguists often present sentences to their colleagues and students out of the
blue, with no context, there is nothing inherent in the grammaticality judgment
task that demands this form of testing. In fact, in Field Methods classes, I
stress the importance of building contexts, even for evaluating the
grammaticality of simple sentences. I also encourage students to write down the
context, so that the context as well as the sentence can be replicated. The use
of carefully constructed contexts is well-known in the formal semantics
fieldwork literature. I see no reason why such contexts should not also be used
for grammaticality judgments.
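For what it is worth, here is a minimal sketch of the kind of record I have in mind, with the context stored alongside the judged sentence so that both can be replicated later; every field value below is invented for illustration.

```python
# A minimal sketch of recording a judgment together with its context.
# All field values here are invented for illustration.

judgment_record = {
    "context": "Mary has been drafting a paper all week and finally finishes it.",
    "sentence": "Mary wrote the paper up.",
    "judgment": "ok",              # e.g., ok, ?, ??, ?*, *, **
    "consultant": "speaker_01",    # hypothetical identifier
    "date": "2019-10-15",          # hypothetical session date
    "notes": "presented orally, after two warm-up items",
}

print(judgment_record["context"])
print(judgment_record["sentence"], "->", judgment_record["judgment"])
```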
Now
onto some real problems in the use of grammaticality judgments as the sole
source of data.
First,
in studying a language that one is unfamiliar with, it is impossible to predict
what kinds of data will exist in that language. Suppose that your task is to
investigate the syntax and semantics of =Hoan, a Khoisan language spoken in
Botswana. English does not have pluractionality marking on verbs, so from the
point of view of English there is no way to know that pluractionality marking exists
in =Hoan. And even if the syntactician
somehow stumbles on the existence of pluractionality marking, by accidentally eliciting
some sentence containing a pluractionality marker, they may not be aware of the
scope of the phenomenon, and so might not know what kinds of
sentences to construct and test.
So
the approach to linguistic data based solely on grammaticality judgments runs
into the problems of existence and scope. Does a phenomenon exist in a
language, and if so, what is its empirical scope? These are severe problems. In
fact, they are problems often encountered in ex-situ Field Methods courses
where the emphasis is on grammaticality judgment tasks and translation tasks.
The
problems of existence and scope can be partially solved by collecting oral
texts. I have often been surprised at the kinds of interesting constructions
that pop up in oral texts. There are grammatical constructions one finds in
texts that would have been difficult to anticipate ahead of time. Once one
finds an interesting construction in an oral text, one can begin to explore it
using grammaticality judgments or translation tasks (or other tasks, like truth
condition tasks). But without knowing the construction exists, it is impossible
to explore it. Of course, collecting oral texts is not the final solution to
the issues of existence and scope. There may be some constructions that simply
do not appear very often, and hence may not show up in oral texts, especially if
the collection of oral texts is limited.
The
availability of oral texts also helps to study syntactic properties that are
discourse based, for example, properties having to do with specificity, ellipsis,
deixis, focus and information structure. Of course, all of these properties can
also be studied with controlled experiments (precisely varying context), but
having converging evidence from texts can be helpful, sometimes even providing
hints to the syntactician for designing controlled experiments. Similarly, the
nature of controlled grammaticality judgment experiments probably favors
careful speech, since one is carefully constructing a sentence and evaluating it.
But there may be many interesting fast speech or casual speech syntactic
phenomena, including contraction and ellipsis.
English
is far better studied than =Hoan, or any African language for that matter, and
that is why for many topics one can get away with only looking at
grammaticality judgments (and not looking at natural data). A lot of basic
observations about English grammar have already been made by traditional
grammarians, such as Curme and Jespersen, and many important details are
documented in descriptive grammars. The best ESL grammars are rich sources of
information on English grammar. Also, there is a long tradition of generative
studies on English grammar that have uncovered piecemeal many of its
interesting properties, and these have often been compiled into sources such as
Huddleston and Pullum’s CGEL. For less studied languages like =Hoan, one cannot
rely on this vast background knowledge.
By
the way, lest the reader think that the problems of existence and scope only
plague less studied languages such as =Hoan, it is also true that in English
there are still large swaths of unexplored territory. This is a fact not often
recognized in the generative literature, and in fact, sometimes one hears
statements to the contrary (e.g., in pronouncements about “The End of Syntax”).
To take an example from my own work, the study of imposters (noun phrases such as
yours truly, the undersigned, etc.) had received no description or analysis
until very recently. And overcoming the problems of existence and scope for English
crucially required using vast corpora (e.g., Google searches), as illustrated
in Collins and Postal 2012.
See my separate blog post on searching for linguistic data using Google.
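As a stand-in for that kind of search, here is a minimal sketch that runs over a local plain-text corpus file. The file name and the patterns are my own illustrative choices; the actual imposter work relied on web searches rather than a local file.

```python
import re
from pathlib import Path

# A minimal sketch of searching a local plain-text corpus for imposter
# noun phrases such as "yours truly" or "the undersigned". The file name
# and patterns are illustrative, not taken from any actual study.

PATTERNS = [
    r"\byours truly\b",
    r"\bthe undersigned\b",
]

text = Path("corpus.txt").read_text(encoding="utf-8")  # hypothetical corpus file

for pattern in PATTERNS:
    hits = list(re.finditer(pattern, text, flags=re.IGNORECASE))
    print(f"{pattern}: {len(hits)} hits")
    # Each hit would then be inspected in context, e.g. to see whether the
    # imposter controls singular or plural agreement in that example.
```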
Another
way to overcome the problems of existence and scope, complementing the use of
oral texts, is the growing tradition of using linguistic questionnaires, often
associated with computer databases. There are lots of interesting and powerful
questionnaires being developed by syntacticians/typologists on various topics
(e.g., word order, anaphora, focus, tense/aspect). Going through these
questionnaires carefully can help to uncover the existence and scope of
particular constructions in a language.
A
second problem of relying solely on grammaticality judgments, especially in a fieldwork
scenario, is that many consultants simply cannot give them. In other words,
there are many consultants who, when presented with a sentence and asked to give a
grammaticality judgment (suitably explained to them in terms of naturalness,
etc.), either cannot do so or give uniformly positive responses. These kinds of
consultants are usually elderly and illiterate. The fact that they are
illiterate, in particular, means that they have never been obliged to develop
the kind of meta-linguistic awareness about linguistic form that even very
young literate speakers possess. And the unfortunate fact is that in work on
endangered languages, the remaining speakers are almost always elderly and
illiterate.
How
can one study a language when there are no ungrammaticality judgments
available? In part, one has to rely on oral texts. In part, there are various
translation tasks that are easier for consultants to accomplish and that are quite
rich. For example, you can ask the consultant to translate sentence X from the
language of communication L1 (which may be English) into the target language
L2. This kind of task is quite easy for almost everybody, and yields rich
results. And there is also the back-translation task, which is to translate a sentence
from L2 back into L1 (this test is often a good control on the translation
task). But the translation task, like the grammaticality judgment task, is also
an experiment. It is carefully controlled. The exact construction of X is
important, and usually focuses on some very narrow issue. So some of the same
issues that come up with grammaticality judgment tasks also face translation
tasks. For example, the same issues of existence and scope arise for the
translation task.
The
third problem with constructing a theory based solely on grammaticality judgments is
the so-called observer’s paradox. The syntactician constructs a sentence and
then evaluates it. In doing so, they set up all kinds of implicit biases. Do
they want the sentence to come out grammatical or not? Why are they looking at
these sentences in the first place? Why did they choose those particular words,
instead of some other words? Especially for non-linguist consultants, in
fieldwork scenarios, it is difficult to control for these effects. And so,
there is always the question of whether the consultant is giving the syntactician
what they think the syntactician wants. Of course, these effects can be mitigated
in various ways. One can discuss the nature of the task with the consultants,
and explain to them exactly what is wanted (and repeat these instructions on
several occasions). One can give them clear examples of grammatical and
ungrammatical sentences to prep them. One can ask different consultants or
teams of consultants to see if there is uniformity across speakers.
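For the last of these checks, here is a minimal sketch of one way to quantify uniformity across consultants, with invented sentences and judgment codes standing in for a real elicitation record.

```python
# A minimal sketch of checking uniformity of judgments across consultants.
# The sentences and judgment codes below are invented for illustration.

judgments = {
    "Mary wrote the paper up.": {"A": "ok", "B": "ok", "C": "ok"},
    "Mary jogged the hill up.": {"A": "*",  "B": "*",  "C": "ok"},
}

for sentence, by_speaker in judgments.items():
    values = list(by_speaker.values())
    agreement = max(values.count(v) for v in set(values)) / len(values)
    print(f"{sentence}  majority agreement: {agreement:.0%}")
    # Items with low agreement are candidates for re-testing, richer
    # contexts, or further discussion with the consultants.
```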
A
related issue is the construction of natural sentences. Sometimes the
syntactician in the field will construct a perfectly straightforward and simple
sentence, which for some reason (unrelated to grammaticality or to the issues
being investigated) sounds unnatural to the speakers. Maybe it is word choice.
Maybe it is register. Maybe it is the subject matter discussed in the sentence.
The factors influencing how a sentence is perceived can be quite subtle and
difficult to isolate. In working with our own native language, English in my
case, we avoid this issue since we can easily construct natural sounding
sentences in our own language. But in fieldwork on less studied languages, it
is a real issue.
At
the extreme end of the set of worries associated with the observer’s paradox is
the issue of whether the syntactician is merely fabricating sentences that
don’t really exist in the language, even though the speakers judge them as grammatical.
Just as fake news plagues our modern political landscape, fake sentences might
be plaguing our modern linguistic landscape. Perhaps in diving deeper and
deeper into some theoretical issue, the syntactician constructs sentences that
no speaker would ever use in any context, but are still judged as grammatical.
Although I note this as a facet of the observer’s paradox, I think it is less
important than the others, because it is probably difficult to find some
sentence X such that (a) no person would use it in any context and (b) it is
judged as completely grammatical.
Oral
texts can provide additional data to help avoid the various facets of the observer’s
paradox. If one can find instances of constructions that one has been studying
(e.g., involving serial verb constructions) in natural speech, and those
instances have the properties that one has established by looking at
grammaticality judgments (or the results of a translation task), then that is
converging evidence. Using sentences from oral texts as the starting point of
an investigation based on more controlled experiments can also help solve the
issue of unnaturally constructed sentences.
A
problem with using oral texts in this way is that in the fieldwork scenario
those oral texts also usually suffer from their own observer’s paradox. The
texts are often elicited by the syntactician. For example, the consultant can
be asked to tell a folktale in their language, which is then recorded using a
video camera and a mic. The video camera, the mic, and the syntactician are all
clues to the consultant that something special is going
on. Whether or not this actually affects their speech is unclear to me. But it
should be kept in mind that such speech is not completely free of the
observer’s paradox, since the observer is so strongly present.
As a
side note, not all natural data needs to be recorded. A particularly rich set
of data in the field is just talking with the consultants either during breaks
or during off hours. Just spending time with your consultants, trying to engage
them in conversation and listening to what they say to each other can turn up
all kinds of interesting syntactic data. In fact, trying to learn a language,
and practicing with your consultants in a casual setting can be an important
source of natural data. The syntactician is still present, but their role as an
observer is replaced by their role as language learner.
I
summarize all these points in the following table:
(4)
Gram. Judgments
   Pros: controlled, both positive and negative data
   Cons: issues of existence and scope, not all speakers, observer’s paradox
Translation Task
   Pros: controlled, all speakers
   Cons: only positive data, issues of existence and scope, observer’s paradox
Natural Data
   Pros: helps address issues of existence and scope, helps address issues of observer’s paradox
   Cons: uncontrolled, only positive data
My conclusion is that both kinds of data, experimental and natural, are crucial in syntactic fieldwork.
And in fact, I take the stronger
position that both kinds of data are crucial to all syntactic research (even on
well-studied languages like English). They complement each other, and partially
help to resolve each other’s shortcomings. Take the specific task of writing a
grammar or grammatical sketch of a less-studied language. Trying to write a
grammar based purely on grammaticality judgments risks producing a grammar that
is heavily skewed to what the researcher already knows (about other languages,
such as their native language). And so, it risks missing out on interesting
aspects of the language being studied (that could be quite important for
syntactic theory), and not reflecting the real richness of the language. But
trying to write a grammar based purely on a corpus or on recorded natural speech
would be just as catastrophic, missing out on interesting generalizations about
the structure of the language that could easily be uncovered by more controlled
experiments, such as grammaticality judgments or the translation task.
Of course, that conclusion now
raises the question of how much. What is the right balance of controlled
experimentation and natural speech to use in writing a grammar? Should 50% of
the data (in terms of number of sentences) be from controlled experiments and
50% from audio/video recordings of natural speech? There is no way to answer
this question a priori. There are no guidelines to follow in terms of
percentages.
But one thing is clear: the presence
of huge corpora that can be easily searched is of great importance for
syntactic research. Such corpora help to overcome the problems outlined above,
especially the problems of existence and scope, problems which plague the
description and analysis of even better studied languages like English.
But from the other angle, if one
does find interesting data in a corpus, it is impossible to understand it
without a healthy dose of controlled experimentation. We need to be able to
play with language and manipulate it, and to run through the permutations and
possibilities, in order to be able to discover what its properties are. We
cannot learn about a language simply by keeping our eyes and ears open, and
jotting down what is being said.
Furthermore, the use of powerful
technologies such as audio and video recording equipment, and powerful software
packages, such as FLEx and ELAN, should not distract us from the conclusion
that it is impossible to do syntactic description and analysis solely on the
basis of corpus data. In making this comment, I am in no way criticizing the
use of these technologies in syntactic fieldwork. On the contrary, I feel that
they are extremely important, game-changing technologies. Rather, the point I
am trying to make is that one cannot see syntactic fieldwork merely as the
application of these technologies to natural data. They need to be
supplemented by classical syntactic argumentation supported by classical
experimental methodologies, such as grammaticality judgments and translation
tasks.
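As one small illustration of how the two sides can feed each other, here is a minimal sketch of pulling transcription lines out of an ELAN file so that they can be searched for constructions worth following up with controlled elicitation. The file name and tier name are hypothetical, and the sketch assumes the standard EAF XML layout in which each TIER element contains ANNOTATION elements whose ANNOTATION_VALUE descendants hold the transcribed text.

```python
import xml.etree.ElementTree as ET

# A minimal sketch of extracting transcription lines from an ELAN (.eaf)
# file. The file name and tier name are hypothetical; the sketch assumes
# the standard EAF XML layout (TIER > ANNOTATION > ... > ANNOTATION_VALUE).

tree = ET.parse("folktale_session1.eaf")   # hypothetical ELAN file
root = tree.getroot()

lines = []
for tier in root.iter("TIER"):
    if tier.get("TIER_ID") == "transcription":   # hypothetical tier name
        for value in tier.iter("ANNOTATION_VALUE"):
            if value.text:
                lines.append(value.text.strip())

# The extracted lines can then be searched for constructions of interest
# (serial verbs, pluractional marking, etc.), which in turn feed controlled
# follow-up elicitation with judgments or translation tasks.
hits = [line for line in lines if "up" in line.split()]
print(f"{len(lines)} transcription lines, {len(hits)} containing 'up'")
```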