Posted: Apr 02, 2019 4:23 am
by don't get me started
Nice video. Thanks for posting it Piper. :thumbup:

In the past, scholars and maniacs had to actually sit down with a printed text and go through word by word and keep count and cross reference. Not surprisingly, this only got done for 'big' books like the Bible or the complete works of Shakespeare.
With the advent of computer technology, the possibilities have become enormous.
I often use the BNC and COCA (the British National Corpus and the Corpus of Contemporary American English) for checking frequencies and collocations.

https://www.english-corpora.org/bnc/

I'll give an example here.
In Japanese, there is a distinction made between the feeling that one needs to go to sleep (眠い, nemui) and the feeling that one has expended a good deal of physical and/or mental energy (疲れた, tsukareta). In English, you can express both feelings by using the word 'tired'.

'I'm tired. I'm off to bed'
'I've been working at the computer all day. I'm tired.'

My students often use the word 'sleepy', which is a direct translation of nemui.
My intuition got me thinking... is 'sleepy' a common word? How do English speakers use it?

So, off I went to the corpus and checked.
Lo and behold... 'sleepy' comes in at 412 occurrences in the BNC. 'Tired' comes in at 3821 occurrences.
So, pretty clear data to support the claim that 'tired' is more frequent than 'sleepy': roughly nine times more frequent.
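
For anyone who wants to play with the numbers, here is a rough Python sketch of the comparison. The two counts are the ones quoted above. The corpus size of roughly 100 million words is my own assumption about the BNC, and the helper name per_million is just something made up for illustration, so treat the normalised figures as approximate.

```python
# Raw counts quoted in the post (BNC).
counts = {"sleepy": 412, "tired": 3821}

# Assumption: the BNC holds roughly 100 million words.
CORPUS_SIZE = 100_000_000


def per_million(count, corpus_size=CORPUS_SIZE):
    """Normalise a raw count to occurrences per million words."""
    return count / corpus_size * 1_000_000


# How many times more often 'tired' turns up than 'sleepy'.
ratio = counts["tired"] / counts["sleepy"]
print(f"'tired' is about {ratio:.1f} times as frequent as 'sleepy'")

for word, n in counts.items():
    print(f"{word}: {per_million(n):.2f} per million words")
```

Normalising to 'per million words' is handy if you want to compare against a corpus of a different size, such as COCA, where raw counts on their own would be misleading.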

When I started looking at individual occurrences another pattern emerged.
Both words can have extensions of meaning. E.g. 'I'm (sick and) tired of it' doesn't mean that I'm tired and need to have a sit down.
But 'tired' gets used a lot in its core meaning (needing rest or sleep) and also gets collocated with body parts, such as 'tired legs', 'tired shoulders' etc.
On the other hand, 'sleepy' had a lot more usage in its more metaphorical meaning, e.g. 'a sleepy backwater', 'a sleepy village' etc.
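
The corpus website does the collocation search for you, but the basic idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: the sample sentences are made up by me (they are not BNC data), and right_collocates is a hypothetical helper that simply counts whichever word comes straight after the target word.

```python
import re
from collections import Counter

# Made-up sample text for illustration only; NOT real corpus data.
SAMPLE = (
    "A sleepy village lay by a sleepy backwater. "
    "My tired legs ached and my tired shoulders drooped. "
    "I'm tired of it."
)


def right_collocates(text, target):
    """Count the words that immediately follow `target` in `text`."""
    # Crude tokeniser: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    following = Counter()
    for current, nxt in zip(tokens, tokens[1:]):
        if current == target:
            following[nxt] += 1
    return following


for word in ("sleepy", "tired"):
    print(word, dict(right_collocates(SAMPLE, word)))
```

A real concordancer looks at a wider window on both sides of the word and works on tagged text, so this is only the bare-bones version of the idea.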

So, it seems that my intuitions were on the right track. Japanese speakers making a consistent distinction between 'tired' and 'sleepy' is a carry-over from their own language, and it is at some variance with the usage of native/proficient English speakers.
Not that it is a fatal error or anything...just a point of interest and also an illustration of the fact that the ultimate corpus concordancer is not on a hard drive somewhere, but on the 'soft drive' of the human brain. It seems we are all pretty good at keeping track of frequencies and collocations.