How to build a *.txt database of word pronunciations?

#1  Postby Scott H » Apr 30, 2010 8:15 pm

I would like to write a program that allows us to scan the English language for properties that words have in common with each other, such as various phonemes and spelling patterns. I believe that this will aid our understanding of not only English as a language but also human nature. It could be, for instance, that many patterns in the English language evolved around Christian notions and are used to manipulate others. 'Good' and 'God,' for example, or 'Heaven' and 'Hell': who can honestly deny a connection in these words?

In order to study the language, I will therefore need a complete text database of English words and their various pronunciations, but so far I have had little success. (I did manage to find one website called the 'English Lexicon Project,' but they do not provide the various alternative pronunciations of individual words.)
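For illustration, here is a minimal sketch of what such a text database could look like and how it might be scanned. It assumes a plain-text layout with one entry per line: the word, then its phonemes separated by spaces, with alternative pronunciations marked by a numbered suffix such as `(2)`. That layout resembles the one used by the CMU Pronouncing Dictionary, but the sample entries and the names `load_pronunciations` and `words_with_phoneme` are assumptions made up for this example.

```python
import re
from collections import defaultdict

# Hypothetical sample data; a real file would hold one entry per line like this.
SAMPLE = """\
;;; comment lines start with three semicolons
GOD  G AA1 D
GOOD  G UH1 D
HEAVEN  HH EH1 V AH0 N
HELL  HH EH1 L
EITHER  IY1 DH ER0
EITHER(2)  AY1 DH ER0
"""

# A word, an optional "(n)" marker for an alternative pronunciation, then phonemes.
ENTRY = re.compile(r"^(?P<word>[^\s(]+)(?:\(\d+\))?\s+(?P<phones>.+)$")

def load_pronunciations(lines):
    """Map each word to a list of pronunciations (each a list of phonemes)."""
    table = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blanks and comments
        match = ENTRY.match(line)
        if match:
            phones = match.group("phones").split()
            table[match.group("word").lower()].append(phones)
    return dict(table)

def words_with_phoneme(table, phoneme):
    """Sorted words that have at least one pronunciation containing the phoneme."""
    return sorted(
        word for word, prons in table.items()
        if any(phoneme in pron for pron in prons)
    )

table = load_pronunciations(SAMPLE.splitlines())
print(table["either"])                # both pronunciations of "either"
print(words_with_phoneme(table, "HH"))  # words containing the HH phoneme
```

Keeping every pronunciation in a list per word is what preserves the alternatives; a plain word-to-pronunciation mapping would silently overwrite them.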
http://www.hoge-essays.com/cdl.html

I will not judge you by the color of your skin. But if I have to, I will judge you by the volume of your subwoofer.
Scott H
THREAD STARTER
 
Name: Scott Hoge
Posts: 242
Age: 37
Male

Country: United States
United States (us)

Re: How to build a *.txt database of word pronunciations?

#2  Postby katja z » May 03, 2010 10:30 am

Sounds like you need to delve into the etymology of English words, not write up a new programme. You'll find for example that "hell" is a pre-Christian word and that the word "god" has nothing to do with "good" (you can even check this on Wikipedia).

I agree that language history is an extremely interesting subject, but scanning current pronunciations of words will tell you precious little about it. I also honestly don't see how phonemes and spelling patterns of one language will enhance your understanding of human nature in general.
katja z
RS Donator
 
Posts: 5353
Age: 40

European Union (eur)

Re: How to build a *.txt database of word pronunciations?

#3  Postby natselrox » May 03, 2010 10:34 am

I think it works the way katja said.
When in perplexity, read on.

"A system that values obedience over curiosity isn’t education and it definitely isn’t science"
natselrox
 
Posts: 10037
Age: 109
Male

India (in)

Re: How to build a *.txt database of word pronunciations?

#4  Postby Tursas » May 03, 2010 1:14 pm

We tried to tell him that at RDF (page 2 and onwards) but he didn't believe us.
Tursas
 
Posts: 365

Jolly Roger (arr)

Re: How to build a *.txt database of word pronunciations?

#5  Postby katja z » May 03, 2010 1:17 pm

... oh. I see :scratch: Thanks for sharing, Tursas :thumbup: It does seem there's not much to be done here.

Re: How to build a *.txt database of word pronunciations?

#6  Postby Scott H » May 04, 2010 11:52 am

Yes, nice ad populum argument.

You see, there are already patterns in the English language that clue us in to human nature. More common words, for instance, tend to be shorter. And as every elementary schooler knows, there are prefixes and suffixes that clue us in to a word's function in a sentence.
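The claim that more common words tend to be shorter is one that can be checked against any text. The sketch below counts word frequencies and compares the mean length of the most frequent words with the mean length of the rest. The sample sentence and the function names are assumptions for illustration only; a toy sample like this shows the method, not evidence.

```python
import re
from collections import Counter

# Hypothetical sample text; any larger corpus could be substituted.
SAMPLE_TEXT = (
    "the cat and the dog and the bird saw "
    "the elephant and the crocodile"
)

def mean_length(words):
    """Average number of letters per word (0.0 for an empty list)."""
    return sum(len(w) for w in words) / len(words) if words else 0.0

def mean_length_by_frequency(text, top_n):
    """Compare the mean length of the top_n most frequent words with the rest."""
    words = re.findall(r"[a-z']+", text.lower())
    # Distinct words ordered from most to least frequent.
    ranked = [word for word, _ in Counter(words).most_common()]
    return mean_length(ranked[:top_n]), mean_length(ranked[top_n:])

common, rare = mean_length_by_frequency(SAMPLE_TEXT, top_n=2)
print(f"most frequent: {common:.1f} letters, others: {rare:.1f} letters")
```

Note that the comparison is over distinct words, not over every occurrence, so a very frequent word counts once in its group.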

The question is: how far does this go? Perhaps there is an explanation for why the word 'God' looks like 'good,' or why 'Heaven' and 'Hell' both begin with 'H.' Even a stupid person can realize that patterns emerge in nature.

At any rate, I've found my dictionary file, so little more assistance is needed. :mrgreen:

Re: How to build a *.txt database of word pronunciations?

#7  Postby Darkchilde » May 26, 2010 3:16 pm

I wrote a similar program in C for my Master's degree. Its aim was to take abstracts, count how many times certain word roots appear, and report them. It was named Automatic Keyword Generation.

I had a few papers that helped me with the algorithm:

Porter, M.F. An algorithm for suffix stripping. Program, Vol. 14, No. 3, 1980, pages 130-137.
Salton, G., and Buckley, C. Parallel text search methods. Communications of the ACM, Vol. 31, No. 2, February 1988, pages 202-215.

Also look for books by G. Salton. He has written a number of books about information retrieval, and they were very helpful to me.
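The general idea of counting word roots in an abstract can be sketched in a few lines. This is not the original C program, and the suffix stripper below is a deliberately crude stand-in rather than the Porter algorithm from the paper above; the suffix list, the stopword list, the sample abstract, and the function names are all assumptions for illustration.

```python
import re
from collections import Counter

# Assumed, very incomplete lists; longer suffixes are tried first.
SUFFIXES = ("ations", "ation", "ings", "ing", "ers", "er", "ed", "es", "s")
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "for", "from"}

def crude_stem(word):
    """Strip the first matching suffix, keeping at least three letters of root."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def keyword_counts(abstract, top_n=3):
    """Return the top_n most frequent roots in the abstract, ignoring stopwords."""
    words = re.findall(r"[a-z]+", abstract.lower())
    stems = [crude_stem(w) for w in words if w not in STOPWORDS]
    return Counter(stems).most_common(top_n)

# Hypothetical abstract used only to exercise the functions.
ABSTRACT = (
    "Indexing methods for retrieval: indexed documents are retrieved "
    "from an index, and indexers rank the retrieved documents."
)
print(keyword_counts(ABSTRACT))
```

A crude stripper like this leaves "retrieval" and "retrieved" with different roots, which is exactly the kind of case a proper stemming algorithm is designed to handle better.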
Darkchilde
RS Donator
 
Posts: 9015
Age: 51
Female

Country: United Kingdom
United Kingdom (uk)

Re: How to build a *.txt database of word pronunciations?

#8  Postby tactik » Jul 11, 2010 1:36 pm

I like that you think widely and differently, Scott... Please continue ignoring the naysayers.
tactik
 
Posts: 221


Re: How to build a *.txt database of word pronunciations?

#9  Postby katja z » Jul 11, 2010 2:49 pm

What Scott H is saying might make some sense IF languages were designed ... which they're not, they are products of cultural evolution. Plus, there are thousands of languages in the world, but somehow Scott is convinced that English on its own will give him clues into "human nature". Riight. This is akin to looking for a coded message by aliens in the genome of, say, Lactuca sativa, which just happens to grow in my garden.

Scott, if you sincerely wish to know about language, I can point you to some basic sources for linguistics. Have a look at Saussure's Course in General Linguistics, read some sociolinguistics (for example Louis-Jean Calvet's writings on the ecology of languages), acquaint yourself with historical linguistics and read a work or two on the history of the English language. Or simply begin by buying an etymological dictionary; they make for fascinating reading. They're the result of about two centuries of painstaking cumulative research by many scholars who have been working on the history of languages (especially the Indo-European family, to which English belongs) by studying preserved written sources, and who have used the observed regularities to build ingenious models of language evolution that have even allowed them to tentatively reconstruct ancestral languages such as Proto-Indo-European. One thing you'll learn from historical linguistics is that languages change radically and fairly quickly, and what is especially quick to change is the pronunciation you set such great store by. You do know, don't you, that Old English was a very different language from the one you speak? The historical phonetics of English alone is a long and complicated story, and much more instructive than speculating on the phoneme [h] in "hell" while gazing at the ceiling.

As it is, Scott H's project has nothing to do with linguistics and I'm honestly surprised that this thread has not yet been moved to debunking, where it properly belongs. Moderators?

Re: How to build a *.txt database of word pronunciations?

#10  Postby tactik » Jul 11, 2010 4:35 pm

Your assumption on the validity of Scott's proposal has been noted already, Katja. Now be quiet.

(wailing to the mods is optional)

Re: How to build a *.txt database of word pronunciations?

#11  Postby katja z » Jul 11, 2010 4:50 pm

tactik wrote: Your assumption on the validity of Scott's proposal has been noted already, Katja. Now be quiet.

Huh ... thanks for your advice? :scratch:

Or is it only my sarcasm sensor playing up? *taps on side of box*

Re: How to build a *.txt database of word pronunciations?

#12  Postby natselrox » Jul 11, 2010 4:58 pm

Just a casual question, katja. Are you a linguist by any chance? I desperately need to know one.

Re: How to build a *.txt database of word pronunciations?

#13  Postby katja z » Jul 11, 2010 5:18 pm

natselrox wrote: Just a casual question, katja. Are you a linguist by any chance? I desperately need to know one.

Let's say I'm half of one ;) I studied Comparative Literature and two languages at the Uni, and in each course I kept stumbling over various areas of linguistics, so I'd say I have the basics pretty much covered. Plus, I'm a translator with some theoretical involvement in translatology too (which, unsurprisingly, has a lot to do with linguistics). Depending on which topics you are interested in, I may be able to help, just don't expect me to know everything about everything! :angel:
PM me or start a new thread, and I'll see if I can be of any use! :cheers:

Re: How to build a *.txt database of word pronunciations?

#14  Postby natselrox » Jul 11, 2010 5:22 pm

katja z wrote:
natselrox wrote: Just a casual question, katja. Are you a linguist by any chance? I desperately need to know one.

Let's say I'm half of one ;) I studied Comparative Literature and two languages at the Uni, and in each course I kept stumbling over various areas of linguistics, so I'd say I have the basics pretty much covered. Plus, I'm a translator with some theoretical involvement in translatology too (which, unsurprisingly, has a lot to do with linguistics). Depending on which topics you are interested in, I may be able to help, just don't expect me to know everything about everything! :angel:
PM me or start a new thread, and I'll see if I can be of any use! :cheers:


I have a few questions. Let me sort them out properly and I'll start a thread. Mostly on syntax of different languages.

Thank you, dignified lady. :smile:

Re: How to build a *.txt database of word pronunciations?

#15  Postby Scott H » Jul 11, 2010 6:33 pm

Unfortunately, Katja Z's reply is another overconfident ad hominem attack that fails to take into account the possibility that while certain aspects of language do not have any lexical or grammatical definition, they may nevertheless reflect some aspect of human nature. Like I said, words that are shorter are more likely to refer to activities commonly engaged in by Homo sapiens, not to mention that certain words might have been given similar spellings to aid in cognition by allowing the brain to form more efficient synaptic connections (e.g. by close association, as with Pavlov's dogs).

I'm really having trouble understanding how anyone could fail to realize this. Do you just get cocky whenever you see a post that expresses a new idea? Or is it something about my name, or avatar, or style of writing that leads you into these senseless attacks? Really, this is baby stuff. Patterns emerge in nature as a result of the fact that the laws of the universe are mathematical. Maybe some of these patterns exist not only in English but in languages around the world, and reflect the behaviors of different societies.

Just some advice: you can't win an argument by punching the arguer.

Re: How to build a *.txt database of word pronunciations?

#16  Postby katja z » Jul 11, 2010 6:55 pm

:roll:
Do you even know what an "ad hominem" is? And "senseless attacks"? Don't play the poor victim, Scott. Just where have I punched you (as opposed to your arguments, which, as we know, are fair game)?

If you think I'm wrong, please present some counterarguments which will demonstrate your grasp of some of the knowledge on how language functions amassed by linguistics. Failing that, your stance is not so different from that of an ID'er who insists: "but it's obvious we couldn't have evolved from monkeys, I don't know how it is possible for anyone to fail to realize that."

Re spelling. Words were not "given" spelling "to aid cognition". Given by whom, by the way? Who is this mysterious authority who is supposed to help our cognition? English spelling reflects the history of language in various ways, for example how a word used to be pronounced in the past (written language is more conservative than spoken, so with time a gap tends to form between the two), or sometimes real or imagined etymology, etc. In fact, English orthography is a real mess - complicated and inconsistent. Some other languages (for example Portuguese) have a much closer fit between pronunciation and writing. This has to do partly with incidents of history, and partly with deliberate language policies (standardization).

Re: How to build a *.txt database of word pronunciations?

#17  Postby Scott H » Jul 11, 2010 7:07 pm

I have a lot to study, so I don't have time for this. Rest assured that you know nothing about my suffering or my status as a victim.

Re: How to build a *.txt database of word pronunciations?

#18  Postby katja z » Jul 11, 2010 8:06 pm

Scott H wrote: I have a lot to study, so I don't have time for this.

Evading the debate? As you wish. I too have a lot to do.

Scott H wrote: Rest assured that you know nothing about my suffering

If you suffer, I'm sorry for you (and I don't say this ironically). But this doesn't make your ideas one jot better. Sorry.

Scott H wrote: or my status as a victim.

This referred to your reaction to my criticism of your ideas:

Scott H wrote: Do you just get cocky whenever you see a post that expresses a new idea? Or is it something about my name, or avatar, or style of writing that leads you into these senseless attacks?
(...)
Just some advice: you can't win an argument by punching the arguer.

Here, you were playing the victim of the evil, violent katja z in order to avoid the actual issues. I repeat the question: just where have I senselessly attacked you, or even "punched" you?

Re: How to build a *.txt database of word pronunciations?

#19  Postby katja z » Jul 11, 2010 8:12 pm

natselrox wrote: I have a few questions. Let me sort them out properly and I'll start a thread. Mostly on syntax of different languages.

:thumbup: Do drop me a pm when you've set it up.

natselrox wrote: Thank you, dignified lady. :smile:

You're welcome, teasing sir. :smile:
:cheers:

Re: How to build a *.txt database of word pronunciations?

#20  Postby Scott H » Jul 17, 2010 9:02 pm

katja z wrote: If you think I'm wrong, please present some counterarguments which will demonstrate your grasp of some of the knowledge on how language functions amassed by linguistics.

On how "language functions"? Grammatically, or cognitively? What I'm saying is that certain features of language (perhaps related to onomatopoeia, among other things) might be used to aid cognition, a statement you explicitly deny below:

katja z wrote: Re spelling. Words were not "given" spelling "to aid cognition".

What proof do you have?

katja z wrote: Given by whom, by the way? Who is this mysterious authority who is supposed to help our cognition?

It's not the Christian God, in case you were wondering. There is, nevertheless, some order in the universe that arises from the laws of nature and may perhaps manifest itself in certain properties of language and spelling.

katja z wrote: English spelling reflects the history of language in various ways, for example how a word used to be pronounced in the past (written language is more conservative than spoken, so with time a gap tends to form between the two), or sometimes real or imagined etymology, etc.

You are correct: it does reflect on the history of language, but it may not only reflect on that history. Consider, for example, the contemporaneous relationships between words (which we have been discussing), in addition to their historical and evolutionary relationships. 'God' is spelled like 'good' and 'splash' like 'special' -- could there be a significance in this? How do you know there isn't?
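How closely two spellings resemble each other can at least be put into numbers. One common measure is the Levenshtein (edit) distance: the minimum number of single-letter insertions, deletions, and substitutions needed to turn one word into the other. The sketch below is a standard dynamic-programming version; the function name and the choice of word pairs are assumptions for illustration, and the measure describes similarity only, not whether that similarity means anything.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a matching letter
            ))
        previous = current
    return previous[-1]

for pair in [("god", "good"), ("heaven", "hell"), ("splash", "special")]:
    print(pair, edit_distance(*pair))
```

Run over a full word list, the same function would show how many other words sit just as close to 'god' as 'good' does, which gives a baseline for judging any single pair.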

