#4
by don't get me started » Aug 05, 2015 4:18 pm
I'm with Spearthrower on this.
The first thing I noticed in the article was this:
"For most of the languages, the researchers used written prose from newspapers, novels, and blogs, although for ancient Greek and Latin they relied on poetry."
Written language is a very different thing from spoken language. Traditionally, linguists have dismissed the spoken form of the language as 'degenerate' and 'corrupt' (these are Chomsky's words, from 1965). As such, it was not deemed a worthwhile thing to study.
Fortunately, with the advent of reliable, portable, unobtrusive audio and video recording equipment, and good computers to process millions upon millions of words of transcribed spoken text in corpora, we are now on happier ground. The fact is that the written form of the language is a narrow and brittle form of language, and the insights gained from studying it are by no means always applicable to the spoken form, which is the basic form of human language.
So, in particular, what can be said about dependency length minimization?
Well, first off, the basic unit of language in speaking is not the sentence, so beloved of the generative grammarians, but the turn. And it turns out that turns at talk are usually quite short. Now, one reason why turns are quite short might be that we don't want to place too much strain on the cognitive processes of our interlocutor, but the main reason is that we are observing interactional, not syntactic, rules. That is, we limit the amount of floor-holding we do during interaction so that it achieves a balance of speakership. (Research has shown that speakers do a remarkable job of aligning with each other to produce turns of similar length.)
Short turns which refer back to the previous turn before making small incremental additions to the ongoing discourse are the norm in spoken interaction. Long, multi-clause sentences with layer upon layer of relativization are not, and for social rather than cognitive reasons.
The participants in spoken interaction co-create meaning through complex processes of repair (self- and other-initiated), backchanneling, discourse marking, repetition, reformulation, restarts and so on. The cognitive load of maintaining interaction (for example, achieving precision timing at turn transition points, recognizing when other speakers are about to finish a turn, or signalling by linguistic and paralinguistic means that your own turn is, or is not, coming to an end) is the main cognitive business of spoken interaction, not parsing sentences.
Also, in spoken interaction, people talk within a context, so they achieve their communicative goals by embedding their turns in the broad stream of ongoing, unfolding talk, not by springing epistemic surprises on their unsuspecting listeners. In much of talk, the interlocutors already have a pretty good idea of what is going to come next, so it doesn't take much cognitive work to figure out the precise meaning of a particular grammatical construction. People recognize a schema and skim along, rather than struggling from scratch every turn.
Here's a complex sentence I came up with.
The people who were in the building that was hit by a plane which had been hijacked by terrorists who hate America all died.
I'm guessing that most people reading this will not have struggled too much with comprehension, despite the multiple relative clauses, due to familiarity with the content of the statement. ('Died' comes 21 words after 'people', but it's no big deal once you get to 'hit by a plane'. Tell me if I'm wrong, of course.)
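(For anyone who wants to check the count, here's a quick and dirty Python sketch of what I mean by that distance. The tokenization is just whitespace splitting and the sentence is my made-up example above, so treat it as an illustration, not anything like the parsing the researchers actually did.)

[code]
# Crude illustration: linear distance between 'people' and its verb 'died',
# using naive whitespace tokenization (not a real dependency parse).
sentence = ("The people who were in the building that was hit by a plane "
            "which had been hijacked by terrorists who hate America all died.")

tokens = [w.strip(".").lower() for w in sentence.split()]

subject = tokens.index("people")
verb = tokens.index("died")

print(verb - subject - 1)  # number of intervening words -> 21
[/code]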
Lastly, corpus studies have revealed the ubiquity of chunking in spoken language. Fixed expressions such as 'You know', 'You know what I mean', 'At the end of the day' and 'The thing is' are extremely common in spoken English. Because of their familiarity and fixedness, it has been proposed that the mind perceives them as 'chunks', that is, single items that fill only one 'slot' in the ongoing processing of utterances, thus allowing the brain to take shortcuts when processing turns at talk. Multiple chunks would allow a seemingly great distance between dependent words that was not actually, cognitively, all that great. This would be missed by studying only the written form of the language.
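To make the chunking point concrete, here's another rough sketch. The turn and the chunk list are invented for the purpose (in real corpus work the chunk inventory would come from frequency data, not a hard-coded list), but it shows how counting a fixed expression as a single slot shrinks the apparent distance between dependent words.

[code]
# Toy illustration of chunking: treat fixed expressions as single 'slots'
# and compare the apparent distance between two dependent words.
# The chunk list and the example turn are made up for illustration.
chunks = ["you know what i mean", "at the end of the day", "the thing is"]

def tokenize_with_chunks(utterance, chunks):
    text = utterance.lower()
    for chunk in chunks:
        # Join each multi-word chunk into one underscore-linked token.
        text = text.replace(chunk, chunk.replace(" ", "_"))
    return text.split()

turn = "the thing is the plan at the end of the day worked"

plain = turn.split()
chunked = tokenize_with_chunks(turn, chunks)

print(plain.index("worked") - plain.index("plan"))      # -> 7 (word by word)
print(chunked.index("worked") - chunked.index("plan"))  # -> 2 (chunks as single slots)
[/code]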
So, whilst it might be interesting to look at SVO and SOV languages and find differences in how they create meaning and then hypothesize about linguistic and cognitive universals, it all takes a back seat, in my mind at least, to the social universals that underpin all natural language use.
Oh, and to finish...
Mark Twain quipped that Schiller's history of the Thirty Years' War was entirely contained between the two parts of a German phrasal verb...