Thursday, October 24, 2019

Thought as Syntax


Abstract: In this paper, I outline an approach to the study of thought from a syntactic point of view. I propose that sentences (in the sense of generative syntax) are thoughts. Under that assumption, I use natural language syntax as a probe into the structure of our thoughts, and show how such a probe sheds light on how we make deductions and our capacity for imagination.

Keywords: thought, syntax, logic, semantics, imagination


1.         Language and Thought
            What is the relation between language and thought? Certainly, our language tracks our thoughts very closely. As we think about something, we often speak about it at the same time, either to ourselves or to others. And the sentences we speak reflect the steps in our thought process. For example, suppose I am working at my office near the end of the day. I say aloud to myself:

(1)       a.         If I leave now, I will be on time.
            b.         If I leave in 10 minutes, I might or might not be on time.
            c.         I really need to be on time.
            d.         Therefore, I will leave now.

            (1a-d) are the sentences that I speak out loud, basically tracking an argument that I am making about my departure time. (1a-c) are premises, and (1d) is the conclusion. One way to think about (1) is that my spoken language is shining a flashlight on my thoughts as they occur, so that my thoughts and my language are independent of each other. In this paper, I propose an alternative:

(2)       Each sentence is a thought.

            Perhaps a more precise characterization would be: syntactic structures are either thoughts or parts of thoughts. I will adopt (2) for simplicity.
            I am assuming in (2) that sentences are mental objects. That is, a sentence is an object that occurs in the mind of a human (realized in some mysterious way by the neurons and physical mechanisms of the brain). Furthermore, I am assuming that thoughts are also mental objects. It is important to note that I did not say that each sentence maps to or corresponds to a particular thought. Rather, the sentence is just exactly the same thing as a particular thought. There is no difference between them (in terms of structure, properties, place in the brain, role in thinking, etc.).
            (2) carefully avoids claiming that all thought is language. The words thought and thinking are informal terms, and so there is no reason to think that they define a unique kind of mental object. I elaborate on this point in section 2. But there is a subset of the things that we could characterize as thoughts that are purely linguistic.
            So the first step is to give a plausibility argument for (2). I will do this in section 3. In section 4, I will discuss some possible objections to (2). Once one adopts (2), then natural language syntax provides a probe into the structure of thought (since natural language sentences are thoughts). Syntacticians have long had the tools to probe deeply into syntactic structure. In sections 5 and 6, I will sketch what I take syntax and semantics to be. Then in sections 7 through 9, I delve into particular topics concerning the relation between thought and language. For example, in section 8, I make the following point: Since the syntax of natural language is very abstract and complex, each sentence contains lots of relevant information about our current situations. On the basis of this information it is easy to make straightforward syntactic deductions, and these syntactic deductions are a kind of thinking.
A note on methodology. I am not a psychologist, and have no training in experimental psychology. However, I do have access to my own mental activity, more so than anyone else in the world. So I am treating my personal thoughts as data in much the same way that in generative syntax we treat personal grammaticality judgments as data. Whether anything that I say could be turned into a question that can be probed with experimental methods is a separate matter, and one that I do not pursue here.

2.         Different Types of Thought
            We use the informal terms thought and thinking for all kinds of mental processes. For example, I can visualize my childhood neighborhood (which I have not entered for around 50 years). Starting in the front of my house, I can turn left to see my best friend’s house and its driveway. And I can turn right to see another neighbor’s house and driveway. I can also turn around 180 degrees to look at the house on the other side of the street. Then I can turn to the right of that, and see a hill that we liked to play on. Facing my house again, I can turn 90 degrees to the right, and walk to the end of the street, which was a dead end. And beyond the dead end, there was a hill that I could walk up. I can even wander around some of the houses and look at the backyards, and recognize trees. Of course, I am missing lots of details. Things look a bit blurry. And I am sure my brain is making lots of stuff up. And my information basically stops at the end of the neighborhood, where one would expect a six-year-old to travel to on bicycle. But I have no doubt that I remember the broad outlines of the neighborhood.
            The point of this example is that the information that is being processed when I take a mental tour through the neighborhood does not seem very similar to natural language syntax. It is visual, and involves images of objects and their spatial relations. Of course, visual thought might also have a kind of syntax (defined in a combinatorial sense). But it seems different from the combinatorial system of natural language syntax.
            The visual information that I have of my childhood neighborhood can interact with verbal information. I can describe all the things (and the relations between them) that I see in my mind’s eye. I can sometimes talk about the people who lived in the various houses, and episodes that occurred at the various places. And also, based on the visual information and verbal information, I can make deductions of various kinds.

3.         A Plausibility Argument
            In this section, I will give a preliminary plausibility argument for my central hypothesis in (2).
            Contrary to (2), let us adopt the following hypothesis (which I will reject):

(3)       Thoughts are distinct from sentences.

            On this view, both thoughts and sentences are mental objects, but they are distinct. Since thoughts are distinct mental objects from sentences, there must be a mapping from thoughts to sentences, because we are able to express our thoughts in language. Since thoughts are mapped to sentences, there must be some kind of systematic correspondence between them. In fact, the correspondence must be fairly close, because of the following argument: There are an unlimited number of thoughts and for each of those thoughts, there is a sentence expressing it. In other words, there must be a kind of algorithmic procedure that takes any particular thought and yields a sentence based on it.
            So now consider the sentences below:

(4)       a.         John left.
            b.         Bill stayed.
            c.         John left and Bill stayed.

The syntactic relation between (4a,b) and (4c) is that (4c) involves clausal coordination (of the two sentences). Certainly, these three sentences express different thoughts. However, it is intuitively plausible to assume that the thought Tc corresponding to (4c) is somehow a combination of the thoughts Ta and Tb corresponding to (4a,b). But if that assumption is correct, then thoughts have a syntax, which corresponds to the syntax of English language sentences. Just like sentences can be coordinated, the corresponding thoughts can be combined.
            We can run through this argument for all kinds of syntactic constructions (conditionals, clausal embedding, possession, relative clauses, etc.), concluding that the various ways of putting together sentences and phrases correspond to different ways of putting together thoughts and parts of thoughts. So adopting (3) seems to imply that there will be two completely separate systems, with largely parallel rules in our minds: one system for combining thoughts and one for sentences. But this seems to be redundant. Certainly, one would want evidence for two different mental systems with exactly (or mostly) the same operations.
            But the problem runs deeper than simple redundancy. If the operations forming thoughts correspond closely to the operations forming sentences, then the structure of particular thoughts corresponds closely to the structure of particular sentences (since the thoughts and sentences were built by corresponding operations). But then whatever role a thought has in the mind (e.g., giving rise to inferences, see section 8), a sentence should be able to play a similar role (by virtue of having a closely corresponding structure).

4.         Some Objections
            In general, (2) implies that the mapping between thoughts and sentences is one-to-one (it is the identity function). So any kind of evidence that the mapping is not one-to-one would be evidence against (2).
One objection to (2) is that we sometimes think that two different sentences (with different syntactic structures) express the same thought. How can that be, if sentences are thoughts? The solution to this problem should probably be broken into two parts. There are sentences that are very closely related, and only different in what syntacticians would call the PF-representation. Take for example:

(5)       I think (that) John is gone.

            In this sentence, the complementizer that seems to be optional. The sentence is acceptable whether or not that is present. One way to think about this is that the complementizer is always present, but sometimes it is spelled-out as zero and sometimes it is spelled out as that at the PF-interface. So in fact, there is a single underlying syntactic structure, and hence a single underlying thought. Certainly, both versions of (5) would play the same role in the deductive system described in section 8 below. In other words, the two versions of (5) (with and without the overt complementizer) do not correspond to two different thoughts.
A different kind of case comes from sentences that express the same thought but differ in syntax. For example:

(6)       a.         The students read the book.
            b.         The book was read by the students.

            (6a) is the active and (6b) is the passive, and they seem to express the same thought, but differ in their syntactic structures.
I suggest that when we say that two sentences express the same thought in this case, what we really mean is that the sentences are truth conditionally equivalent (or some such semantic characterization). And this is quite different from saying that two different sentences correspond to the same thought (a mental object). In other words, I am proposing that (6a) and (6b) really do correspond to two different thoughts.

5.         Syntax
In the above sections, I have argued for (2), but I have said nothing about what syntax is. Syntax is concerned with the combinatorial abilities that humans have as part of their language faculty. How are two different elements X and Y combined to form a third element Z? By element, I mean mental object represented in the mind of the speaker. What are the properties of the elements X and Y that are combined? What principles constrain the combination of X and Y? What are the properties of the resulting object Z? How are the resulting elements Z used by other faculties of the mind (e.g., How is Z spelled-out?)?
In minimalist syntax, X and Y are combined by Merge. Merge takes two arguments and produces a binary branching structure. As a result, even relatively short sentences have multiple levels of embedding. There are two subcases of Merge: Internal Merge (movement) and External Merge. These are not two different operations, but rather two subcases of the operation Merge. If a phrase undergoes movement, it occupies two (or more) different syntactic positions, but is only spelled out in one of them. In many sentences more than one constituent undergoes movement. And a single constituent can undergo movement several times (successive cyclic movement). And movement operations can interact with other movement operations (e.g., remnant movement and smuggling). For precise definitions of Merge and other related notions, see Collins and Stabler 2016.
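The description of Merge above can be made concrete with a small sketch. It is only an illustration under simplifying assumptions, not the formalization in Collins and Stabler 2016: lexical items are plain strings, a syntactic object is an unordered pair built with Python's frozenset, and the helper names (merge, contains, merge_kind, depth) are hypothetical.

```python
def merge(x, y):
    """Combine two syntactic objects into the binary set {x, y}."""
    if x == y:
        raise ValueError("Merge needs two distinct arguments")
    return frozenset({x, y})

def contains(z, x):
    """True if x is z itself or occurs somewhere inside z."""
    if z == x:
        return True
    if isinstance(z, frozenset):
        return any(contains(part, x) for part in z)
    return False

def merge_kind(x, y):
    """Internal Merge if one argument is already inside the other,
    External Merge otherwise. Both are subcases of the same operation."""
    if x != y and (contains(x, y) or contains(y, x)):
        return "internal"
    return "external"

def depth(z):
    """Number of levels of embedding in a syntactic object."""
    if isinstance(z, frozenset):
        return 1 + max(depth(part) for part in z)
    return 0

# A toy derivation with heavily simplified structure (assumption).
vp = merge("read", "what")   # External Merge
tp = merge("John", vp)       # External Merge
cp = merge("what", tp)       # Internal Merge: "what" is already inside tp
```

After the last step, "what" occupies two positions in cp (one inside vp, one at the edge), which is the sense in which a moved phrase occupies more than one syntactic position while being spelled out in only one of them.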
Syntactic structures involve many phonologically empty but syntactically present elements. Well-known cases of such empty elements include null pronouns such as pro and PRO, deleted copies (traces), null scope occurrences of quantificational DPs, null functional heads, empty operators (e.g., in comparative constructions and purpose clauses), elided constituents in VP-deletion and sluicing and relative clause deletion. But in addition to these there is a vast range of other kinds of empty elements that have just begun to be investigated in syntactic theory: implicit arguments (e.g., the external argument in the short passive), null locative and directional prepositions, null negations, null lexical items and phrases of various sorts (see Kayne 2005, 2010 for a survey).
Justification for syntactic structure is through standard syntactic argumentation, involving distributional restrictions, constituent structure tests, etc. The hypothesis in (2) says that thoughts are characterized by the same structure.

6.         Semantics
I assume that there are no word-object relations (unlike the standard conception of a model in logic). However, our language works as if there were word-object relations. In other words, we refer to London as if it actually existed in the real world even though it is not possible for something with the properties that we attribute to London to exist in the real world (see Chomsky 2000, and Gondry 2013 for a popular exposition of Chomsky’s ideas on this topic). These objects are created by the human mind, but we assume that they exist in the real world. Formal semantics includes the study of these objects. What kind of properties do these objects have, and how is a syntactic system related to them compositionally?
            A natural language ontology is a classification of objects that are assumed to exist in the real world by users of natural language, by virtue of their use of the language. Some examples are events, states, individuals, locations, degrees, times, possible worlds, and sets or sums of these objects. I assume that the syntactic structures discussed in section 5 contain DPs (determiner phrases) that refer to or quantify over such objects.
Many semantic or pragmatic properties should be analyzed as syntactic properties (which then in turn lead to interpretational differences between sentences) (see for example Collins and Postal 2012 on the syntax of imposters and how it relates to interpretation). The more that semantic and pragmatic properties are directly represented in syntactic structure, the more it will be possible to make deductions on a purely syntactic basis (see section 8).
I assume that rules of truth-conditional semantic interpretation are transparent, in the sense that they are simple and operate directly on syntactic structures (there is no type shifting or existential closure). There are no semantically understood elements that are not present in the syntax (e.g., the implicit argument in the passive, see Collins 2005). Semantic relations are represented directly in syntactic structures (e.g., UTAH). Semantic values of individual morphemes are simple (e.g., semantic functions should involve at most one or two arguments, reflecting the binary nature of syntactic structure). There is no semantics without syntax, in the sense that all semantic rules of interpretation are defined directly and transparently in terms of syntactic structures.

7.         Syntax versus Semantics in Logic
            Before talking about deduction, I will give a summary of how inference works in logical systems. In logic, it is customary to define the syntax of a language (e.g., propositional logic). Then based on the syntax of the language, the compositional semantics is defined. In order to characterize inferences, two relations are usually defined: logical consequence, and deduction.

(7)       a.         S1 ⊨ S2           (logical consequence)
            b.         S1 ⊢ S2           (deduction)

            (7a) is a semantic relation: S2 is a logical consequence of S1 iff whenever an assignment of values to the propositional variables makes S1 true, that same assignment will make S2 true. (7b) is a syntactic relation: S2 can be deduced by proof from S1 and the axioms and syntactic rules of inference (e.g., modus ponens).
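The contrast between the two relations can be sketched for propositional logic. This is a minimal illustration under stated assumptions: formulas are encoded as strings (atoms) or tuples such as ("if", "p", "q"), and the function names are hypothetical. The semantic check enumerates all assignments; the syntactic check manipulates formulas only, using modus ponens as its single rule.

```python
from itertools import product

def atoms(f):
    """Collect the propositional variables occurring in formula f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(sub) for sub in f[1:]))

def evaluate(f, v):
    """Truth value of f under the assignment v (a dict from atoms to bools)."""
    if isinstance(f, str):
        return v[f]
    op = f[0]
    if op == "not":
        return not evaluate(f[1], v)
    if op == "and":
        return evaluate(f[1], v) and evaluate(f[2], v)
    if op == "or":
        return evaluate(f[1], v) or evaluate(f[2], v)
    if op == "if":
        return (not evaluate(f[1], v)) or evaluate(f[2], v)
    raise ValueError(f"unknown connective: {op}")

def is_consequence(premises, conclusion):
    """Semantic relation: every assignment that makes all premises true
    also makes the conclusion true."""
    names = sorted(set().union(atoms(conclusion), *(atoms(p) for p in premises)))
    for values in product([True, False], repeat=len(names)):
        v = dict(zip(names, values))
        if all(evaluate(p, v) for p in premises) and not evaluate(conclusion, v):
            return False
    return True

def deduce(premises):
    """Syntactic relation: close the premises under modus ponens.
    No truth values are consulted at any point."""
    derived = set(premises)
    changed = True
    while changed:
        changed = False
        for f in list(derived):
            if (isinstance(f, tuple) and f[0] == "if"
                    and f[1] in derived and f[2] not in derived):
                derived.add(f[2])
                changed = True
    return derived

premises = [("if", "p", "q"), "p"]
```

Here "q" is both a logical consequence of the premises and deducible from them. With modus ponens as the only rule, this toy proof system is not complete: "p" is a logical consequence of ("and", "p", "q") but cannot be deduced from it without a conjunction rule.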
            There is a special case when S1 = {} (the empty set), written as follows:

(8)       a.         ⊨ S2                 (S2 is valid)
            b.         ⊢ S2                 (S2 is a theorem)
           
            It is a surprising fact about many logical systems (although not all) that the two relations in (8a,b) coincide. For example, for propositional logic, the following properties hold:

(9)       a.         Soundness:                  If S is a theorem, then S is valid.
            b.         Completeness:             If S is valid, then S is a theorem.

            The important point from the perspective of this paper is that modern formal semantics in the field of linguistics is completely defined in terms of semantics and logical consequence. Deduction plays no role at all in standard expositions despite the importance of deduction in all standard theories of logic. The question of the soundness and completeness of natural language is never raised in linguistics papers or basic semantic textbooks.
            In this paper, I propose that syntactic deduction plays an important role in natural language understanding. But that does not mean that I abandon truth-conditional (model-theoretic) semantics. Rather it raises the question of what the balance between syntactic deduction and semantic inference (see (7)) should be in human thought.

8.         Inferences in Natural Language
            Given the abstractness and complexity of syntactic representations, as outlined in sections 5 and 6, every sentence is extremely rich in the kind of information that can be syntactically deduced from it. For example, suppose somebody utters the following sentence:

(10)     The cow and the dog jumped over the moon.

            From (10) one can easily deduce at least the following sentences (as well as many others, depending on the richness of the syntactic representation):

(11)     a.         There is a unique cow.
            b.         There is a unique dog.
            c.         There is a unique moon.
            d.         Something happened in the past.
            e.         Something did something in the past.
            f.          Something jumped in the past.
            g.         Something jumped over the moon in the past.
            h.         The cow did something in the past.
            i.          The cow jumped in the past.
            j.          The cow jumped over something in the past.
            k.         The cow jumped over the moon in the past.
            l.          The dog did something in the past.
            m.        The dog jumped in the past.
            n.         The dog jumped over something in the past.
            o.         The dog jumped over the moon in the past.
            p.         The cow was over (above) the moon in the past.
            q.         The dog was over (above) the moon in the past.

            These are all automatic syntactic deductions based on the syntax of the sentence in (10). If one has confidence that (10) is true, then one can immediately assume all of the sentences in (11) are true without calculating reference or truth conditions or checking anything against external reality (e.g., looking into the sky, looking around at the cows and the dogs). Therefore, uttering an apparently simple sentence like (10) fills out our beliefs about the world to a considerable extent in an automatic and rapid fashion with only reference to the syntax of (10).
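To illustrate how sentences like those in (11) can be read off the form of (10) alone, here is a toy sketch. It assumes a flat record (coordinated subjects, verb, preposition, object) as a crude stand-in for the real syntactic structure, and the function name deductions is hypothetical. Nothing in it consults a model or the external world; each step only splits a coordination, drops a constituent, or replaces one with "something".

```python
def deductions(subjects, verb, prep, obj):
    """Read sentences like (11a-q) off a flat record standing in for (10)."""
    def past(subject, predicate):
        return f"{subject.capitalize()} {predicate} in the past."

    out = []
    # (11a-c): every definite DP "the N" yields "There is a unique N."
    for dp in subjects + [obj]:
        if dp.startswith("the "):
            out.append(f"There is a unique {dp[4:]}.")
    # (11d): the past-tense clause itself.
    out.append("Something happened in the past.")
    # (11e-g): weaken the whole coordinated subject to "something".
    out.append(past("something", "did something"))
    out.append(past("something", verb))
    out.append(past("something", f"{verb} {prep} {obj}"))
    # (11h-o): split the coordination, then weaken or drop the PP.
    for subj in subjects:
        out.append(past(subj, "did something"))
        out.append(past(subj, verb))
        out.append(past(subj, f"{verb} {prep} something"))
        out.append(past(subj, f"{verb} {prep} {obj}"))
    # (11p-q): the preposition alone locates each subject.
    for subj in subjects:
        out.append(past(subj, f"was {prep} {obj}"))
    return out

sentences = deductions(["the cow", "the dog"], "jumped", "over", "the moon")
```

The list sentences contains seventeen strings in the same order as (11a-q). A serious version would operate on full syntactic structures rather than on a flat record.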
            Of course, there may be other non-syntactic inferences that one can make on the basis of (10). For example, if (10) is true, one can infer that some dogs have abilities way beyond any human (since humans cannot jump over the moon). But the presence of these non-syntactic inferences should not obscure the fact that quite a few inferences can be made on a purely syntactic basis.
            Generating deductions such as the ones in (11) requires some rules of deduction. Formulating these rules of deduction and developing a theory of them will not be trivial. For the most part, studies in linguistic semantics have focused on formulating rules of interpretation relating sentences to the external world. There has been very little work done on syntactic rules of inference. I will not attempt to formulate such a theory here, but I will give a few cases just so the reader has an idea of what I have in mind. I will assume a tableaux system for natural language deduction. Consider first the rule for conjunction:

(12)                             P and Q
                                         P
                                         Q

            This rule can be translated into English as follows: if the sentence [P and Q] is given, one can conclude P and one can conclude Q as well. So for example, in English:

(13)                             John left and Mary arrived
                                    John left
                                    Mary arrived

            Here is the rule for disjunction:

(14)                             P or Q
                                    P  |   Q

            This rule translates into English as follows: If [P or Q] is given, then one can conclude either P or Q (that is what the vertical line means). This rule is meant to be understood in a purely syntactic fashion, with no reference to the external world.
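Rules (12) and (14) can be sketched as operations on tableau branches. The encoding is an assumption made for illustration: a simple sentence is a string, a complex sentence is a tuple ("and", P, Q) or ("or", P, Q), a branch is a list of sentences, and the function name expand is hypothetical. For readability the sketch removes a complex sentence from the branch once it has been decomposed.

```python
def expand(branch):
    """Apply the conjunction and disjunction rules until only simple
    sentences remain. Returns the list of resulting branches."""
    for i, sentence in enumerate(branch):
        if isinstance(sentence, tuple):
            rest = branch[:i] + branch[i + 1:]
            op, p, q = sentence
            if op == "and":
                # Rule (12): P and Q both go on the same branch.
                return expand(rest + [p, q])
            if op == "or":
                # Rule (14): the branch splits, P on one side, Q on the other.
                return expand(rest + [p]) + expand(rest + [q])
            raise ValueError(f"no rule for connective: {op}")
    return [branch]

example_13 = expand([("and", "John left", "Mary arrived")])
example_14 = expand([("or", "P", "Q")])
```

The first call yields a single branch containing both conjuncts, as in (13). The second yields two branches, one per disjunct, which corresponds to the vertical line in (14). The function inspects only the shape of the sentence, never a truth value.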
            Lastly, I give a preliminary version of the rule for restricted quantification in English:

(15)                            (Every P)x Q
                             a is not a P | Q[a/x]            (for any a)

            Q[a/x] means all occurrences of the variable x in Q have been replaced by a (an arbitrary name). This rule translates into English as follows: If [[Every P] Q] is given, then for any a one can conclude either [a is not a P] or Q[a/x]. As with (12) and (14), (15) is a purely syntactic rule.
            I define the rule of deduction for the existential quantifier below:

(16)                             (Some P)x Q
                                       a is a P                     (for some new a)
                                       Q[a/x]
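The quantifier rules in (15) and (16) can be sketched in the same style. The encoding is again an assumption: Q is a tuple of tokens in which the variable appears as its own token, P is a noun given as a string, and the function names and the fresh-name scheme (a1, a2, ...) are hypothetical.

```python
from itertools import count

def substitute(q, var, name):
    """Q[a/x]: replace every occurrence of the variable in q by the name."""
    return tuple(name if token == var else token for token in q)

def every_rule(p, var, q, name):
    """Rule (15): for any name a, split into [a is not a P] | Q[a/x]."""
    return [[(name, "is", "not", "a", p)], [substitute(q, var, name)]]

def some_rule(p, var, q, used_names):
    """Rule (16): pick a name not used so far and add both
    [a is a P] and Q[a/x] to a single branch."""
    fresh = next(f"a{i}" for i in count(1) if f"a{i}" not in used_names)
    return fresh, [(fresh, "is", "a", p), substitute(q, var, fresh)]

# [Every student]x [x left], instantiated with the name John.
every_branches = every_rule("student", "x", ("x", "left"), "John")

# [Some student]x [x left], where the name a1 is already in use.
new_name, some_branch = some_rule("student", "x", ("x", "left"), {"a1"})
```

The universal rule may be applied with any name and produces two branches. The existential rule must introduce a name that has not been used before and produces one branch. In both cases the added sentence relating the name to P is built by the rule and is not a substructure of the original sentence.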

What is the status of rules (12-16)? I assume these are syntactic rules (in the sense that they only make reference to syntactic properties) and that they are part of UG. They are unlike Merge, which forms syntactic structures from lexical items and smaller syntactic structures. The rules of deduction rather make reference to a syntactic structure and determine what kinds of deductions one can draw from it. I propose that the set of these rules forms the core of the LF-Interface (CI-Interface).
A general research program would be to explore which deductive rules humans actually use. A desideratum of this general approach might be that the conclusions in the syntactic rules of deduction above are transparently related (identical) to substructures of the original sentence. Then the rules of deduction could apply without building any additional syntactic structure. For example, in (12), P is a substructure of [P and Q]. This desideratum is not met in (15) and (16), since [a is not a P] and [a is a P] are not part of the original sentence. Carrying out this program is beyond the current paper.
The purpose of this section has simply been to show how syntactic representations could figure directly into the kinds of deductions that people would normally label as thinking (e.g., deductions in (11)). I am not claiming that it is possible to eliminate truth conditional semantics (especially a version consistent with the caveat in section 6). But a legitimate question, from the point of view of this paper, is whether any particular inference that a human makes should be considered purely syntactic or should be considered to be semantic (see (7)). As far as I know, this question has never been raised before.

9.         Imagination
            We have an unlimited capacity to imagine different situations that do not obtain in the actual world at the current time. My claim is that this capacity is in part a syntactic capacity. Each person’s I-language (generative syntactic system) can generate an unlimited number of syntactic structures. Once one of these structures is generated, our mind must deal with it as a thought.
For example, suppose I read the following sentence:

(17)     The cow jumped over the moon.

On the basis of this sentence, my mind generates an image, and can even think about the situation described by the sentence. And, as I noted above, such a sentence generates many inferences that can play a role in further inferences.
But now suppose that I do not read (17) or hear somebody speak it, but I simply form the sentence on my own. My I-language contains all the operations needed to form (17). So generating (17) is purely syntactic. Since (17) does not correspond to anything in the real world, (17) is counter-factual. It is a sentence that describes a situation that has never taken place and probably will never take place. But since I generated (17), I am able to understand it in the usual ways: I can generate deductions from (17), I can generate an image based on (17), I can check to see if (17) obtains in the real world.
The power of syntax to freely generate (17) (not constrained by any facts holding at the current time and location) is exactly what allows us to entertain (17) on a hypothetical basis, and hence is the basis for our imagination.
I suggest that our generative capacity to form sentences is at the root of imagination:

(18)     To syntactically generate a sentence S is to imagine that S.

            Therefore, I would like to propose that the capacity of the faculty of natural language to generate an unlimited number of sentences is closely related to the capacity of humans to use imagination and to think hypothetically.
            Although I do not talk about visual thought and imagination in this paper, I would assume that a similar combinatorial generative capacity underlies visual imagination (see section 2).

10.       Conclusion
            I have argued that natural language syntactic structures are thoughts or parts of thoughts. There is no need for an independent language of thought or mentalese, other than natural language syntax. Natural language syntactic structures perform the roles traditionally ascribed to thoughts. First, sentences play a role in deduction. Second, sentences play a role in imagination.

References:
Chomsky, Noam. 2000. New Horizons in the Study of Language and Mind. Cambridge University Press.

Collins, Chris. 2005. A Smuggling Approach to the Passive in English. Syntax 8, 81-120.

Collins, Chris and Edward Stabler. 2016. A Formalization of Minimalist Syntax. Syntax 19, 43-78.

Collins, Chris and Paul Postal. 2012. Imposters. MIT Press, Cambridge.

Kayne, Richard. 2005. Movement and Silence. Oxford University Press, Oxford.

Kayne, Richard. 2010. Comparisons and Contrasts. Oxford University Press, Oxford.

Gondry, Michel. 2013. Is the Man Who Is Tall Happy? IFC Films.






