Posted: Mar 02, 2010 3:05 am
by Calilasseia
Frequently Occurring Fallacies No. 1: The Fallacy of 'One True Sequence'

A number of fallacies are in circulation amongst the enthusiasts for reality denial, and one that I wish to highlight here is known in scientific circles as "The Error of the One True Sequence". This fallacy asserts that one, and ONLY one, DNA sequence can code for a protein that performs a specific task. It is usually erected alongside assorted bogus "probability" calculations that purport to demonstrate that evolutionary processes cannot achieve what they plainly do achieve in the real world, but those other probability fallacies will be the subject of other posts. Here I want to destroy the myth that one, and ONLY one, sequence can ever work in a given situation.

Insulin provides an excellent example for my purposes, because insulin is critical to the health and well-being of just about every vertebrate organism on the planet. When a vertebrate organism is unable to produce insulin (the well-known condition of diabetes mellitus), its ability to regulate blood sugar is seriously disrupted. In the case of Type 1 diabetes mellitus, in which the beta cells of the islets of Langerhans in the pancreas are destroyed by an autoimmune reaction, the result is likely to be fatal in the medium to long term due to diabetic nephropathy resulting in renal failure.

Consequently, the insulin molecule is critical to healthy functioning of vertebrate animals. The gene that codes for insulin is well known, and has been mapped in a multiplicity of organisms, including organisms whose entire genomes have been sequenced, ranging from the pufferfish Tetraodon nigroviridis through to Homo sapiens. There is demonstrable variability in insulin molecules (and the genes coding for them) across the entire panoply of vertebrate taxa. Bovine insulin, for example, is not identical to human insulin. I refer everyone to the following gene sequences, all of which have been obtained from publicly searchable online gene databases:

[1] Human insulin gene on Chromosome 11, which is as follows:

atg gcc ctg tgg atg cgc ctc ctg ccc ctg ctg gcg ctg ctg gcc ctc tgg gga cct gac
cca gcc gca gcc ttt gtg aac caa cac ctg tgc ggc tca cac ctg gtg gaa gct ctc tac
cta gtg tgc ggg gaa cga ggc ttc ttc tac aca ccc aag acc cgc cgg gag gca gag gac
ctg cag gtg ggg cag gtg gag ctg ggc ggg ggc cct ggt gca ggc agc ctg cag ccc ttg
gcc ctg gag ggg tcc ctg cag aag cgt ggc att gtg gaa caa tgc tgt acc agc atc tgc
tcc ctc tac cag ctg gag aac tac tgc aac tag

which codes for the following protein sequence (using the standard single-letter codes for individual amino acids; the three lines below correspond to the signal peptide plus B chain, the C-peptide with its flanking cleavage residues, and the A chain of the insulin synthesis pathway in humans):

MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKT
RREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKR
GIVEQCCTSICSLYQLENYCN
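
For anyone who wants to check the translation above for themselves, here is a minimal sketch in Python. It is not the tool used to produce the sequences in this post, and the names `CODON_TABLE`, `translate` and `HUMAN_INS` are purely illustrative. It assumes the standard genetic code, built from the conventional table in which bases are enumerated in the order t, c, a, g.

```python
from itertools import product

# Standard genetic code; codons are enumerated in t, c, a, g order,
# with the third base varying fastest ('*' marks a stop codon).
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Human insulin coding sequence exactly as quoted in this post.
HUMAN_INS = """
atg gcc ctg tgg atg cgc ctc ctg ccc ctg ctg gcg ctg ctg gcc ctc tgg gga cct gac
cca gcc gca gcc ttt gtg aac caa cac ctg tgc ggc tca cac ctg gtg gaa gct ctc tac
cta gtg tgc ggg gaa cga ggc ttc ttc tac aca ccc aag acc cgc cgg gag gca gag gac
ctg cag gtg ggg cag gtg gag ctg ggc ggg ggc cct ggt gca ggc agc ctg cag ccc ttg
gcc ctg gag ggg tcc ctg cag aag cgt ggc att gtg gaa caa tgc tgt acc agc atc tgc
tcc ctc tac cag ctg gag aac tac tgc aac tag
"""

def translate(dna):
    """Translate a coding sequence, stopping at the first stop codon."""
    seq = "".join(dna.split()).lower()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

human_protein = translate(HUMAN_INS)
print(human_protein)
print(len(human_protein), "amino acids")
```

The same `translate` function can be pointed at any of the other coding sequences quoted in this post by pasting them in place of `HUMAN_INS`.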

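Now, I refer everyone to this data, which is the coding sequence for insulin in the Lowland Gorilla (it differs from the human sequence at four codons: gca becomes gcg and tca becomes tcc in the second line, att becomes atc and caa becomes cag in the fifth):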

atg gcc ctg tgg atg cgc ctc ctg ccc ctg ctg gcg ctg ctg gcc ctc tgg gga cct gac
cca gcc gcg gcc ttt gtg aac caa cac ctg tgc ggc tcc cac ctg gtg gaa gct ctc tac
cta gtg tgc ggg gaa cga ggc ttc ttc tac aca ccc aag acc cgc cgg gag gca gag gac
ctg cag gtg ggg cag gtg gag ctg ggc ggg ggc cct ggt gca ggc agc ctg cag ccc ttg
gcc ctg gag ggg tcc ctg cag aag cgt ggc atc gtg gaa cag tgc tgt acc agc atc tgc
tcc ctc tac cag ctg gag aac tac tgc aac tag

this codes for the protein sequence:

MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKT
RREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKR
GIVEQCCTSICSLYQLENYCN

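which so happens to be the same precursor protein, because every one of the differences merely swaps a codon for a synonymous one that specifies the same amino acid. So here we already have two different DNA sequences coding for one and the same protein. However, gorillas are closely related to humans. Let's move a little further away, to the domestic cow, Bos taurus (whose sequence is found here):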

atg gcc ctg tgg aca cgc ctg cgg ccc ctg ctg gcc ctg ctg gcg ctc tgg ccc ccc ccc
ccg gcc cgc gcc ttc gtc aac cag cat ctg tgt ggc tcc cac ctg gtg gag gcg ctg tac
ctg gtg tgc gga gag cgc ggc ttc ttc tac acg ccc aag gcc cgc cgg gag gtg gag ggc
ccg cag gtg ggg gcg ctg gag ctg gcc gga ggc ccg ggc gcg ggc ggc ctg gag ggg ccc
ccg cag aag cgt ggc atc gtg gag cag tgc tgt gcc agc gtc tgc tcg ctc tac cag ctg
gag aac tac tgt aac tag

Already this is a shorter sequence - 318 nucleotides (106 codons) instead of 333 nucleotides (111 codons) - so we KNOW we're going to get a different insulin molecule with this species ... which is as follows:

MALWTRLRPLLALLALWPPPPARAFVNQHLCGSHLVEALYLVCGERGFFYTPK
ARREVEGPQVGALELAGGPGAGGLEGPPQKRGIVE
QCCASVCSLYQLENYCN

clearly a different protein, but one which still functions as an insulin precursor and results in a mature insulin molecule in cows, one which differs in exact sequence from that in humans. Indeed, prior to the advent of transgenic bacteria, into which human insulin genes had been transplanted for the purpose of harnessing those bacteria to produce human insulin for medical use, bovine insulin harvested from the pancreases of slaughtered beef cows was used to treat diabetes mellitus in humans. Now, of course, with the advent of transgenically manufactured true human insulin, from a sterile source, bovine insulin is no longer needed, much to the relief of those who are aware of the risk from BSE.
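
To see exactly where the mature bovine molecule departs from the human one, the two precursor sequences quoted above can be compared directly. The following is a rough sketch only: the helper names `mature_chains` and `differences` are made up for illustration, and the slicing assumes a 24-residue signal peptide, a 30-residue B chain and a 21-residue A chain, which holds for the two precursors shown here but is hard-coded rather than detected.

```python
# Precursor sequences as quoted in this post (single-letter amino acid codes).
HUMAN = ("MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKT"
         "RREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKR"
         "GIVEQCCTSICSLYQLENYCN")
BOVINE = ("MALWTRLRPLLALLALWPPPPARAFVNQHLCGSHLVEALYLVCGERGFFYTPK"
          "ARREVEGPQVGALELAGGPGAGGLEGPPQKRGIVE"
          "QCCASVCSLYQLENYCN")

def mature_chains(precursor, signal_len=24, b_len=30, a_len=21):
    """Slice out the B and A chains, ASSUMING fixed chain lengths."""
    b_chain = precursor[signal_len:signal_len + b_len]
    a_chain = precursor[-a_len:]
    return b_chain, a_chain

def differences(seq1, seq2):
    """Return (position, residue1, residue2) for each mismatch, 1-based."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(seq1, seq2), start=1) if x != y]

human_b, human_a = mature_chains(HUMAN)
bovine_b, bovine_a = mature_chains(BOVINE)

# Coding length in nucleotides: three per amino acid, plus the stop codon.
print("nucleotides:", (len(HUMAN) + 1) * 3, "vs", (len(BOVINE) + 1) * 3)
print("B chain differences:", differences(human_b, bovine_b))
print("A chain differences:", differences(human_a, bovine_a))
```

Under those assumptions the mature chains differ at just three positions (one in the B chain, two in the A chain), while the rest of the difference between the two precursors lies in the signal peptide and the excised C-peptide region.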

Moving on again, we have a different coding sequence from the tropical Zebrafish, Danio rerio (sequence to be found here), which is as follows:

atg gca gtg tgg ctt cag gct ggt gct ctg ttg gtc ctg ttg gtc gtg tcc agt gta agc
act aac cca ggc aca ccg cag cac ctg tgt gga tct cat ctg gtc gat gcc ctt tat ctg
gtc tgt ggc cca aca ggc ttc ttc tac aac ccc aag aga gac gtt gag ccc ctt ctg ggt
ttc ctt cct cct aaa tct gcc cag gaa act gag gtg gct gac ttt gca ttt aaa gat cat
gcc gag ctg ata agg aag aga ggc att gta gag cag tgc tgc cac aaa ccc tgc agc atc
ttt gag ctg cag aac tac tgt aac tga

And this sequence codes for the following protein:

MAVWLQAGALLVLLVVSSVSTNPGTPQHLCGSHLVDALYLVCGPTGFFYNP
KRDVEPLLGFLPPKSAQETEVADFAFKDHAELIRK
RGIVEQCCHKPCSIFELQNYCN

so again we have a different insulin precursor protein that is ultimately converted into a different insulin molecule within the Zebrafish.
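
A quick way to see how much of the molecule is shared across all three species is to line up the A chains column by column. This is an illustrative sketch only: the A chains are taken as the last 21 residues of each precursor quoted in this post, and `conservation_line` is a made-up helper name, not part of any library.

```python
# A chains: the last 21 residues of each precursor quoted in this post.
A_CHAINS = {
    "human":     "GIVEQCCTSICSLYQLENYCN",
    "cow":       "GIVEQCCASVCSLYQLENYCN",
    "zebrafish": "GIVEQCCHKPCSIFELQNYCN",
}

def conservation_line(seqs):
    """Mark each column '*' if every sequence agrees there, '.' otherwise."""
    return "".join("*" if len(set(col)) == 1 else "." for col in zip(*seqs))

marks = conservation_line(A_CHAINS.values())
for name, seq in A_CHAINS.items():
    print(f"{name:<10}{seq}")
print(f"{'':<10}{marks}")

conserved = marks.count("*")
print(f"{conserved} of {len(marks)} positions identical in all three")
```

The columns that agree include all four cysteines of the A chain, while a third of the positions vary between species that all regulate blood sugar perfectly well.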

I could go on and extract more sequences, but I think the point has already been established, namely that a multiplicity of working insulin molecules exists, and consequently, the idea that there can only be ONE sequence for a functional protein, even one as critically important to life as insulin, is DEAD FLAT WRONG. Now, if this is true for a protein as crucial to the functioning of vertebrate life as insulin, you can be sure that the same applies to other proteins, including various enzymes. Therefore, whenever the "One True Sequence" fallacy rears its ugly head in various places, the above provides the refutation thereof.
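
The point can also be checked at the DNA level. The sketch below compares the human and gorilla coding sequences quoted earlier codon by codon, and then counts how many distinct coding sequences would translate to exactly the same human precursor under the standard genetic code. The function names `codon_differences` and `synonymous_encodings` are illustrative inventions, and the count covers exact synonyms only: it says nothing about the further range of DIFFERENT proteins that also work, which is what the cow and zebrafish sequences demonstrate.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in t, c, a, g order ('*' = stop).
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
# How many codons specify each amino acid (the degeneracy of the code).
DEGENERACY = Counter(CODON_TABLE.values())

HUMAN = """
atg gcc ctg tgg atg cgc ctc ctg ccc ctg ctg gcg ctg ctg gcc ctc tgg gga cct gac
cca gcc gca gcc ttt gtg aac caa cac ctg tgc ggc tca cac ctg gtg gaa gct ctc tac
cta gtg tgc ggg gaa cga ggc ttc ttc tac aca ccc aag acc cgc cgg gag gca gag gac
ctg cag gtg ggg cag gtg gag ctg ggc ggg ggc cct ggt gca ggc agc ctg cag ccc ttg
gcc ctg gag ggg tcc ctg cag aag cgt ggc att gtg gaa caa tgc tgt acc agc atc tgc
tcc ctc tac cag ctg gag aac tac tgc aac tag
"""
GORILLA = """
atg gcc ctg tgg atg cgc ctc ctg ccc ctg ctg gcg ctg ctg gcc ctc tgg gga cct gac
cca gcc gcg gcc ttt gtg aac caa cac ctg tgc ggc tcc cac ctg gtg gaa gct ctc tac
cta gtg tgc ggg gaa cga ggc ttc ttc tac aca ccc aag acc cgc cgg gag gca gag gac
ctg cag gtg ggg cag gtg gag ctg ggc ggg ggc cct ggt gca ggc agc ctg cag ccc ttg
gcc ctg gag ggg tcc ctg cag aag cgt ggc atc gtg gaa cag tgc tgt acc agc atc tgc
tcc ctc tac cag ctg gag aac tac tgc aac tag
"""

def codon_differences(dna1, dna2):
    """List (codon_number, codon1, codon2, aa1, aa2) wherever the codons differ."""
    pairs = zip(dna1.split(), dna2.split())
    return [(i, a, b, CODON_TABLE[a], CODON_TABLE[b])
            for i, (a, b) in enumerate(pairs, start=1) if a != b]

def synonymous_encodings(protein):
    """Count the distinct coding sequences that translate to `protein`."""
    return prod(DEGENERACY[aa] for aa in protein)

diffs = codon_differences(HUMAN, GORILLA)
for number, c1, c2, aa1, aa2 in diffs:
    kind = "synonymous" if aa1 == aa2 else "non-synonymous"
    print(f"codon {number}: {c1} ({aa1}) vs {c2} ({aa2}) - {kind}")

# Translate the human sequence, dropping the final stop codon.
human_protein = "".join(CODON_TABLE[c] for c in HUMAN.split()[:-1])
print("DNA sequences coding for the identical human precursor:",
      synonymous_encodings(human_protein))
```

Every human/gorilla difference comes out as synonymous, and the final count is an enormous number rather than 1, before a single amino acid has been changed.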