De-Novo Gene Origination from protogenes.

Incl. intelligent design, belief in divine creation

Moderators: Calilasseia, DarthHelmet86, Onyx8

Re: De-Novo Gene Origination from protogenes.

#41  Postby Rumraket » Jun 01, 2014 10:17 am

DavidMcC wrote:I want to come back to Rumraket's point about fish with large genomes. Firstly, I suspect that that is more due to genome and/or chromosome duplication than to massive quantities of extra, non-genic DNA just appearing, due to failure to suppress it.

That hypothesis has been tested. While polyploidy definitely takes place, it is much less frequent in animals than in plants. Researchers know how to look for evidence of polyploidy, and they have rejected it as an explanation for the extremely large range in genome sizes between closely related species.

In contrast, we have an empirically observed process of transposition of selfish genetic elements evolving at a neutral rate.

DavidMcC wrote:Secondly, the point that single-celled eukaryotes still exist says nothing about whether wrapped chromosomes help multicellularity, but more about the continued existence of an ecological niche for them.

I'm not sure what your point is; I don't see how it relates to any point I've raised. All organisms in existence have a method of protecting their chromosomes against damage, by wrapping or coiling them up tightly into a stronger structure that also takes up less volume. An often taught and simple example from microbiology and practical molecular biology is the supercoiling of bacterial plasmids.

http://en.wikipedia.org/wiki/DNA_supercoil
Half-Life 3 - I want to believe
Rumraket
 
Posts: 13215
Age: 40


Re: De-Novo Gene Origination from protogenes.

#42  Postby DavidMcC » Jun 01, 2014 3:07 pm

Rumraket wrote:...
DavidMcC wrote:Secondly, the point that single-celled eukaryotes still exist says nothing about whether wrapped chromosomes help multicellularity, but more about the continued existence of an ecological niche for them.

I'm not sure what your point is, I don't see how it relates to a point I've raised.
...

The point concerning the fact that there are single-celled eukaryotes with their wrapped chromosomes. I thought that would be obvious.
May The Voice be with you!
DavidMcC
 
Name: David McCulloch
Posts: 14913
Age: 67
Male

Country: United Kingdom

Re: De-Novo Gene Origination from protogenes.

#43  Postby BooBoo » Jun 05, 2014 11:20 pm

GenesForLife wrote:
Because it is likely to be more read by people here, firstly, and secondly it debunks creationist canards regarding gain of information and associated bollocks quite nicely.


It is important to actually read the paper, which is also outdated:

it is hard to reconcile this proposed mechanism with expectations that non-genic sequences should lack translational activity and, even if translated, should encode insignificant polypeptides


Such translation events would not systematically lead to de novo gene birth, as the corresponding polypeptides would not necessarily have specific biological function


So one of the principal arguments against this alleged mechanism of gene creation is that a random ORF in non-coding DNA would be extremely unlikely to encode anything functional, not least because every protein needs at the very least a valid target signal and fold as well as some key binding sites. This requires a precise arrangement of amino acids encoded in DNA.

However, it is now known that there are many ORFs in intergenic sequences that have escaped detection until now:

http://www.sciencedaily.com/releases/20 ... 092933.htm

Researchers have come full circle and predicted that some long non-coding RNAs can give rise to small proteins that have biological functions. A recent study describes how researchers have used ribosome profiling to identify several hundred long non-coding RNAs that may give rise to small peptides.


So the "proto-genes" identified in this paper are probably just "genes", some of which are both RNA and also protein-coding. Of the 1,900 genes that are regarded as possible "proto-genes", only 19 are believed to be under purifying selection. Also, in prokaryotes, there is insufficient ncDNA for this de novo origination to be viable. Only gene duplication makes sense.
BooBoo
Banned Troll
 
Name: Rowena
Posts: 361


Re: De-Novo Gene Origination from protogenes.

#44  Postby Rumraket » Jun 06, 2014 7:43 am

BooBoo wrote:So one of the principal arguments against this alleged mechanism of gene creation is that a random ORF in non-coding DNA would be extremely unlikely to encode anything functional, not least because every protein needs at the very least a valid target signal

What do you mean by a "valid target signal" ? Are you talking about intracellular transport signal peptides?

BooBoo wrote:and fold

It just needs to fold into something "sufficiently stable" to go on to acquire a biologically relevant function. In a pool of highly randomized amino acid chains, some of these will fold stably. So we would expect stably folding proteins in large randomized pools simply through probability.

The Szostak Lab proved this with multiple experiments in the late 1990s and early 2000s.

BooBoo wrote:as well as some key binding sites.

It is almost impossible to imagine a biological polymer with zero binding activity.

In the very same experiments with randomized polymers I mentioned above, the Szostak lab could isolate stably folding peptides and RNA sequences that bound ATP. Subjecting these to multiple rounds of mutagenesis and selection would of course significantly improve the binding affinity and selectivity. Nevertheless, even in the starting population of randomized sequence polymers, some would fold into stable structures and bind biologically important substrates like ATP.

BooBoo wrote:This requires a precise arrangement of amino acids encoded in DNA.

Everything requires a precise arrangement of bases encoded in DNA. Even a noisily transcribed and random stretch of DNA will be "precisely arranged" in its sequence, so the statement would seem somewhat redundant.

BooBoo wrote:However, it is now known that there are many ORFs in intergenic sequences that have escaped detection until now:

http://www.sciencedaily.com/releases/20 ... 092933.htm

Researchers have come full circle and predicted that some long non-coding RNAs can give rise to small proteins that have biological functions. A recent study describes how researchers have used ribosome profiling to identify several hundred long non-coding RNAs that may give rise to small peptides.

So the "proto-genes" identified in this paper are probably just "genes"

Isn't that a bit of a semantic issue? A proto-gene is also a gene, that's why they include the word 'gene' :grin:
I think in the view of the authors, what distinguishes a proto-gene from just a plain old mature gene is that the proto-gene is thought to have arisen relatively recently from junk or other non-coding regions, and is still technically an ORFan gene because it has no known homologue in closely related species.

BooBoo wrote:Also, in prokaryotes, there is insufficient ncDNA for this de novo origination to be viable. Only gene duplication makes sense.

I would modify the statement slightly to say that prokaryotes have so much less ncDNA that if and when de novo origination takes place, this phenomenon is so rare it is virtually insignificant compared to classic duplication scenarios. I do wonder though, how many genes have been identified as possible ORFans in prokaryotes? I wouldn't expect the number to be zero.

Re: De-Novo Gene Origination from protogenes.

#45  Postby BooBoo » Jun 07, 2014 1:26 am

Rumraket wrote:
What do you mean by a "valid target signal" ? Are you talking about intracellular transport signal peptides?


Yes. The signal peptide at the N-terminus is necessary to determine where the protein is used following its synthesis at the ribosome. If the cell doesn't recognize where the protein is supposed to go, where would it transport it?

It just needs to fold into something "sufficiently stable" to go on to acquire a biologically relevant function. In a pool of highly randomized amino acid chains, some of these will fold stably. So we would expect stably folding proteins in large randomized pools simply through probability.


The probabilities involved in generating a "sufficiently stable" protein are very small indeed given that functional proteins are themselves only marginally stable and prone to aggregation and misfolding:
http://deepblue.lib.umich.edu/bitstream ... sequence=1

The Szostak Lab proved this with multiple experiments in the late 1990s and early 2000s.


Yes, Szostak did investigate this phenomenon in the paper whose link is provided below:

http://molbio.mgh.harvard.edu/szostakwe ... ure_01.pdf

But this is an exercise in directed evolution by artificial selection rather than a truly random search.

It is almost impossible to imagine a biological polymer with zero binding activity.


But, again, folding is important because the key binding sites have to be located at particular points within the 3D structure.

In the very same experiments with randomized polymers I mentioned above, the Szostak lab could isolate stably folding peptides and RNA sequences that bound ATP. Subjecting these to multiple rounds of mutagenesis and selection would of course significantly improve the binding affinity and selectivity.


I didn't read that: it seems that limited ATP-binding was achieved through selection.

Nevertheless, even in the starting population of randomized sequence polymers, some would fold into stable structures and bind biologically important substrates like ATP.


Again, limited ATP-binding appears to have emerged through successive selection in vitro. However, the authors note:

We therefore estimate that roughly 1 in 10^11 of all random-sequence proteins have ATP-binding activity comparable to the proteins isolated in this study.


Even if true in the sense you imply, that means that 1 in 100 billion random sequence proteins can bind with ATP...it also isn't known if they would actually perform a useful biological function by binding with ATP (there is only the potential that they could).
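To put the quoted 1 in 10^11 figure in perspective, here is a rough sketch of how many binders one would expect in a pool of a given size. The function names and the pool sizes are illustrative assumptions, not values taken from the paper, and the model treats every sequence as an independent draw with the same hit rate.

```python
import math

def expected_binders(pool_size: float, hit_rate: float = 1e-11) -> float:
    """Expected number of ATP binders in a pool of random sequences.

    hit_rate defaults to the 1 in 10^11 estimate quoted above.
    pool_size is whatever number of distinct sequences you want to consider
    (an assumption supplied by the caller).
    """
    return pool_size * hit_rate

def prob_at_least_one(pool_size: float, hit_rate: float = 1e-11) -> float:
    """Poisson approximation to the chance the pool holds at least one binder."""
    return -math.expm1(-pool_size * hit_rate)

if __name__ == "__main__":
    # Hypothetical pool sizes, chosen only to show how the numbers scale.
    for n in (1e9, 1e11, 1e13):
        print(f"pool={n:.0e}  expected={expected_binders(n):.3g}  "
              f"P(>=1)={prob_at_least_one(n):.3g}")
```

The point of the arithmetic is only that a small per-sequence rate and a large number of trials have to be considered together. It says nothing about whether a given binder would be biologically useful.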

Everything requires a precise arrangement of bases encoded in DNA. Even a noisily transcribed and random stretch of DNA will be "precisely arranged" in its sequence, so the statement would seem somewhat redundant.


A random string of bases, by definition, would not be precisely arranged.

Isn't that a bit of a semantic issue? A proto-gene is also a gene, that's why they include the word 'gene' :grin:
I think in the view of the authors, what distinguishes a proto-gene from just a plain old mature gene is that the proto-gene is thought to have arisen relatively recently from junk or other non-coding regions, and is still technically an ORFan gene because it has no known homologue in closely related species.


Well, that is my point. Many of these "proto-genes" that exist in intergenic sequences are actually lncRNA genes that also encode ORFs that are translated into small proteins. So they have not been "generated" de novo. They always doubled up.

I would modify the statement slightly to say that prokaryotes have so much less ncDNA that if and when de novo origination takes place, this phenomenon is so rare it is virtually insignificant compared to classic duplication scenarios. I do wonder though, how many genes have been identified as possible ORFans in prokaryotes? I wouldn't expect the number to be zero.


No gene has been identified in prokaryotes that potentially could have been generated de novo. The ncDNA sequences are just too small. Most of them are essential regulatory sequences.

Re: De-Novo Gene Origination from protogenes.

#46  Postby Rumraket » Jun 07, 2014 8:44 am

BooBoo wrote:
Rumraket wrote:
What do you mean by a "valid target signal" ? Are you talking about intracellular transport signal peptides?


Yes. The signal peptide at the N-terminus is necessary to determine where the protein is used following its synthesis at the ribosome. If the cell doesn't recognize where the protein is supposed to go, where would it transport it?

Not all proteins need to be transported, some arrive at location through simple diffusion. This is comparatively rare though.

BooBoo wrote:
Rumraket wrote:It just needs to fold into something "sufficiently stable" to go on to acquire a biologically relevant function. In a pool of highly randomized amino acid chains, some of these will fold stably. So we would expect stably folding proteins in large randomized pools simply through probability.

The probabilities involved in generating a "sufficiently stable" protein are very small indeed given that functional proteins are themselves only marginally stable and prone to aggregation and misfolding:
http://deepblue.lib.umich.edu/bitstream ... sequence=1

Absolutely irrelevant. It doesn't matter that the probability is low, the phenomenon takes place. Randomized sequence inevitably produces proteins with stable folds.

BooBoo wrote:
Rumraket wrote:The Szostak Lab proved this with multiple experiments in the late 1990s and early 2000s.

Yes, Szostak did investigate this phenomenon in the paper whose link is provided below:

http://molbio.mgh.harvard.edu/szostakwe ... ure_01.pdf

But this is an exercise in directed evolution by artificial selection rather than a truly random search.

It is both. They start by randomizing a large pool of polymers. You can't do selection on a function that isn't there. To improve upon it through multiple rounds of selection, it obviously has to be present to begin with.

BooBoo wrote:
Rumraket wrote:It is almost impossible to imagine a biological polymer with zero binding activity.

But, again, folding is important because the key binding sites have to be located at particular points within the 3D structure.

And yet the Szostak lab found correctly folding proteins in their randomized pools, that correctly bound ATP at measurable levels within a few rounds of selection.

BooBoo wrote:
Rumraket wrote:In the very same experiments with randomized polymers I mentioned above, the Szostak lab could isolate stably folding peptides and RNA sequences that bound ATP. Subjecting these to multiple rounds of mutagenesis and selection would of course significantly improve the binding affinity and selectivity.

I didn't read that: it seems that limited ATP-binding was achieved through selection.

Then let me find it for you:
"Successive rounds of in vitro selection and amplification were performed starting with this random-sequence library. In each round the mRNA-displayed proteins were incubated with immobilized ATP, washed and eluted with free ATP. The eluted fractions were collected and amplified by polymerase chain reaction (PCR); this DNA was then used to generate a new library of mRNA-displayed proteins, enriched in sequences that bind ATP, for input into the next round of selection (Fig. 1). After eight rounds, the fraction of mRNA-displayed proteins eluting with ATP had risen from 0.1 to 6.2% (Fig. 2). We cloned and sequenced 24 individual library members, which showed that the population was now dominated by 4 families of ATP-binding proteins (Fig. 3a). These families show no sequence relationship to each other or to any known biological protein. The members of each family are closely related, indicating that each family is descended from a single ancestral molecule, which was one of the original random sequences."

0.1% is not zero. There was something to select already to begin with. The implication is that already the starting pool contained four distinct proteins capable of binding ATP in solution.
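For anyone who wants the numbers from that passage spelled out, a minimal sketch of the arithmetic follows. It uses only the 0.1% and 6.2% figures and the eight rounds quoted above. The assumption that enrichment was uniform from round to round is mine, made purely for illustration; the excerpt does not say that, and the function names are hypothetical.

```python
def fold_enrichment(start_fraction: float, end_fraction: float) -> float:
    """Overall ratio between the final and initial eluted fractions."""
    return end_fraction / start_fraction

def mean_per_round_factor(start_fraction: float,
                          end_fraction: float,
                          rounds: int) -> float:
    """Geometric-mean enrichment per round, assuming (for illustration only)
    that every round enriched by the same factor."""
    return fold_enrichment(start_fraction, end_fraction) ** (1 / rounds)

if __name__ == "__main__":
    start, end, rounds = 0.001, 0.062, 8  # 0.1% -> 6.2% over eight rounds
    print(f"overall fold enrichment: {fold_enrichment(start, end):.1f}")
    print(f"mean factor per round:   {mean_per_round_factor(start, end, rounds):.2f}")
```

This only describes how the eluted fraction changed. It does not by itself tell you how much of the initial 0.1% was specific binding.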

BooBoo wrote:
Rumraket wrote:Nevertheless, even in the starting population of randomized sequence polymers, some would fold into stable structures and bind biologically important substrates like ATP.

Again, limited ATP-binding appears to have emerged through successive selection in vitro.

Although weak and somewhat nonspecific, it was there to begin with and was substantially improved in binding affinity and selectivity through multiple rounds of in vitro selection.

BooBoo wrote:However, the authors note:

BooBoo wrote:We therefore estimate that roughly 1 in 10^11 of all random-sequence proteins have ATP-binding activity comparable to the proteins isolated in this study.


Even if true in the sense you imply, that means that 1 in 100 billion random sequence proteins can bind with ATP

...it also isn't known if they would actually perform a useful biological function by binding with ATP (there is only the potential that they could).

Well, the same goes for basically any randomly mutating sequence. Which is why they don't think ORFan genes arise and stick around in the millions. Estimates for humans are in the 10-20 range; for insects they might be higher than 100 ORFans. This would seem to fit expectations from population size and generation time. Enormous insect populations that reproduce much more often would produce many more "events" where something potentially functional could arise from the junk regions.

BooBoo wrote:
Rumraket wrote:Everything requires a precise arrangement of bases encoded in DNA. Even a noisily transcribed and random stretch of DNA will be "precisely arranged" in its sequence, so the statement would seem somewhat redundant.

A random string of bases, by definition, would not be precisely arranged.

If you randomly generate a sequence, that sequence will have a precise arrangement. If you change it, it's no longer that sequence. It is therefore precisely that sequence and not another.

Alright, I don't see why we have to go down a silly semantic digression like this. :lol:

BooBoo wrote:
Rumraket wrote:Isn't that a bit of a semantic issue? A proto-gene is also a gene, that's why they include the word 'gene' :grin:
I think in the view of the authors, what distinguishes a proto-gene from just a plain old mature gene is that the proto-gene is thought to have arisen relatively recently from junk or other non-coding regions, and is still technically an ORFan gene because it has no known homologue in closely related species.


Well, that is my point. Many of these "proto-genes" that exist in intergenic sequences are actually lncRNA genes that also encode ORFs that are translated into small proteins. So they have not been "generated" de novo. They always doubled up.

But the point is that those translated proteins might not always have had any functions, that's what would make them into genuine ORFan genes, not just transcripts that were accidentally translated too.

Also, if the originally transcribed region isn't under purifying selection, odds are it's not an lncRNA gene. If such a stretch of DNA has recently acquired function, I don't see how you can avoid the implication that we're dealing with "de novo" gene origination from junk DNA.

BooBoo wrote:
Rumraket wrote:I would modify the statement slightly to say that prokaryotes have so much less ncDNA that if and when de novo origination takes place, this phenomenon is so rare it is virtually insignificant compared to classic duplication scenarios. I do wonder though, how many genes have been identified as possible ORFans in prokaryotes? I wouldn't expect the number to be zero.

No gene has been identified in prokaryotes that potentially could have been generated de novo. The ncDNA sequences are just too small. Most of them are essential regulatory sequences.

I guess it would be very difficult to determine, given all the other possible explanations for finding a gene with no known sequence homologues. HGT from uncharacterized species, gene loss in relatives and the like.

Re: De-Novo Gene Origination from protogenes.

#47  Postby BooBoo » Jun 08, 2014 6:22 am

Rumraket wrote:
Not all proteins need to be transported, some arrive at location through simple diffusion. This is comparatively rare though.


Diffusion and Brownian motion are the means by which all proteins move. But where they move to is determined by the cell.

Absolutely irrelevant. It doesn't matter that the probability is low, the phenomenon takes place. Randomized sequence inevitably produces proteins with stable folds.


The probabilities do indeed matter. And, no, a randomized sequence will not inevitably produce a stable fold especially when proteins with known biological function are only marginally stable.

It is both. They start by randomizing a large pool of polymers. You can't do selection on a function that isn't there. To improve upon it through multiple rounds of selection, it obviously has to be present to begin with.


Actually, with artificial selection you can "evolve" a function that isn't there to begin with. That appears to be what they did. In fact, all they state in the first paragraph is that they avoided mRNA sequences with stop codons.

And yet the Szostak lab found correctly folding proteins in their randomized pools, that correctly bound ATP at measurable levels within a few rounds of selection.


They don't actually state anything about correct folding. In fact, they say the exact opposite:

The low level of ATP-binding is conformational heterogeneity, possibly reflecting inefficient folding of these primordial protein sequences.


0.1% is not zero. There was something to select already to begin with. The implication is that already the starting pool contained four distinct proteins capable of binding ATP in solution.


The trouble is I don't know where they get the figure of 0.1%, especially since they later claim that the chances of a randomized sequence binding with ATP are 100 billion to one. They seem to go straight into selection without describing the initial properties of the randomized sequences, which would be of relevance to de novo origination.

Although weak and somewhat nonspecific, it was there to begin with and was substantially improved in binding affinity and selectivity through multiple rounds of in vitro selection.


Like I say, they don't provide enough information on this initial functionality. They go straight into in vitro selection from the third paragraph of the letter.

Well, the same goes for basically any randomly mutating sequence. Which is why they don't think ORFan genes arise and stick around in the millions. Estimates for humans are in the 10-20 range; for insects they might be higher than 100 ORFans. This would seem to fit expectations from population size and generation time. Enormous insect populations that reproduce much more often would produce many more "events" where something potentially functional could arise from the junk regions.


3 billion bases in the human genome. How many ORFs over 240 nucleotides long exist do you think? How many of these are potentially functional and also useful?
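As a back-of-envelope take on the first of those questions, here is a rough sketch of how many open reading frames of at least 240 nucleotides (80 codons) one would expect in purely random sequence. Everything in it is an assumption for illustration: uniform base composition, independent positions, ATG as the only start codon, and a function name I made up. Real genomes violate all of these, and the count is of qualifying start codons rather than distinct ORFs, so nested in-frame ATGs are counted more than once.

```python
def expected_random_orfs(genome_length: float,
                         min_codons: int = 80,
                         both_strands: bool = True) -> float:
    """Expected number of ATG starts followed by enough stop-free codons.

    Toy model: each base is A, C, G or T with probability 1/4, independently.
    min_codons counts the ATG itself, so 80 codons corresponds to 240 nt.
    """
    p_start = 1 / 64          # chance a given triplet is ATG
    p_sense = 61 / 64         # chance a given triplet is not TAA, TAG or TGA
    # ATG, then (min_codons - 1) further codons without a stop.
    p_open = p_start * p_sense ** (min_codons - 1)
    positions = genome_length * (2 if both_strands else 1)
    return positions * p_open

if __name__ == "__main__":
    # 3 billion bases, as in the question above.
    print(f"{expected_random_orfs(3e9):.3g} qualifying starts under the toy model")
```

The second question, how many of these would be functional and useful, is not something this kind of counting can answer.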

If you randomly generate a sequence, that sequence will have a precise arrangement. If you change it, it's no longer that sequence. It is therefore precisely that sequence and not another.


It will have a particular sequence rather than a precise one where the specific arrangement matters.

But the point is that those translated proteins might not always have had any functions, that's what would make them into genuine ORFan genes, not just transcripts that were accidentally translated too.


If there are ORFs that have previously been undetected, and these exist within lncRNA intergenic sequences, then this changes everything. It shows that many stretches of hitherto non-coding DNA are, in fact, coding.

Also, if the originally transcribed region isn't under purifying selection, odds are it's not an lncRNA gene. If such a stretch of DNA has recently acquired function, I don't see how you can avoid the implication that we're dealing with "de novo" gene origination from junk DNA.


lncRNA genes are not very well conserved in sequence as compared to micro-RNAs, which are highly conserved, and neither of them is "junk". If they double up as proteins, de novo generation is a redundant explanation for the origins of genes.

I guess it would be very difficult to determine, given all the other possible explanations for finding a gene with no known sequence homologues. HGT from uncharacterized species, gene loss in relatives and the like.


Bacteria don't have introns and long intergenic sequences. Many gene sequences overlap as they do in mtDNA.

Re: De-Novo Gene Origination from protogenes.

#48  Postby BooBoo » Dec 06, 2014 4:44 pm

Rumraket wrote:
I guess it would be very difficult to determine, given all the other possible explanations for finding a gene with no known sequence homologues. HGT from uncharacterized species, gene loss in relatives and the like.


Here is a paper which provides more evidence that de novo protein-coding genes originate from functional non-coding RNA genes: Hominoid-Specific De Novo Protein-Coding Genes Originating from Long Non-Coding RNAs
http://www.plosgenetics.org/article/inf ... en.1002942

The ancestral form (of the protein-coding gene) was a functional non-coding RNA, given its regulated rather than promiscuous transcription.

Re: De-Novo Gene Origination from protogenes.

#49  Postby Rumraket » Dec 06, 2014 6:56 pm

BooBoo wrote:
Rumraket wrote:
I guess it would be very difficult to determine, given all the other possible explanations for finding a gene with no known sequence homologues. HGT from uncharacterized species, gene loss in relatives and the like.


Here is a paper which provides more evidence that de novo protein-coding genes originate from functional non-coding RNA genes: Hominoid-Specific De Novo Protein-Coding Genes Originating from Long Non-Coding RNAs
http://www.plosgenetics.org/article/inf ... en.1002942

The ancestral form (of the protein-coding gene) was a functional non-coding RNA, given its regulated rather than promiscuous transcription.

Yes, this supports what I've been saying all throughout this thread, and contradicts your claim that it would be extremely improbable for a functional protein-coding gene to originate from RNA transcripts that don't happen to code for a valid N-terminus target signal. I guess we must conclude that by chance, the RNA transcripts here either just happened to contain such a stretch of code, or that such a target signal isn't always strictly necessary.

Re: De-Novo Gene Origination from protogenes.

#50  Postby BooBoo » Dec 06, 2014 8:24 pm

Rumraket wrote:
Yes, this supports what I've been saying all throughout this thread, and contradicts your claim that it would be extremely improbable for a functional protein-coding gene to originate from RNA transcripts that don't happen to code for a valid N-terminus target signal. I guess we must conclude that by chance, the RNA transcripts here either just happened to contain such a stretch of code, or that such a target signal isn't always strictly necessary.


We can't say that because we don't know if the translated sequence is actually functional or not. Also, the authors appear to suggest that the protein-coding gene derived from the non-coding RNA gene is expressed and used just like the latter is. So, in that case, the target signal might well not be important if the cell treats it as it would its non-coding parent/sister.

Anyway, the major point here is that de novo protein-coding genes don't just pop into existence from non-functional junk DNA as some would like to think can happen.

Re: De-Novo Gene Origination from protogenes.

#51  Postby Rumraket » Dec 06, 2014 9:57 pm

BooBoo wrote:
Rumraket wrote:
Yes, this supports what I've been saying all throughout this thread, and contradicts your claim that it would be extremely improbable for a functional protein-coding gene to originate from RNA transcripts that don't happen to code for a valid N-terminus target signal. I guess we must conclude that by chance, the RNA transcripts here either just happened to contain such a stretch of code, or that such a target signal isn't always strictly necessary.

We can't say that because we don't know if the translated sequence is actually functional or not.

Wait, so you link a study that says a protein-coding gene originates de-novo from a lncRNA transcript, in order to argue that they don't originate from junk. You say it is unlikely to originate from junk, because the junk transcript has to contain an N-terminus target signal. But now you're arguing it's irrelevant, because we don't know whether the protein-coding gene is actually functional or not.

So what statement are you trying to support? If you're trying to argue that the RNA transcript from which de-novo genes originate, already has to be functional in order for a protein-coding gene to arise from it, then this study is irrelevant because as you say, we don't know whether the protein gene is even functional. You've shot yourself in the foot with this one.

BooBoo wrote:Also, the authors appear to suggest that the protein-coding gene derived from the non-coding RNA gene is expressed and used just like the latter is. So, in that case, the target signal might well not be important if the cell treats it as it would its non-coding parent/sister.

So now it's functional again, in the same post? What statement are you trying to support? Your argument is all over the place.

Also, isn't it extremely unlikely for a de-novo protein translated from lncRNA (a rather confused title, since if the RNA is translated, it's not actually noncoding, but I digress) to just so happen to take up the same function as the lncRNA from which it is translated? That would seem extraordinarily unlikely. Is that really what you want to argue took place? That seems to absurdly dwarf the odds of a mere junk region producing a functional, translatable transcript.

BooBoo wrote:Anyway, the major point here is that de novo protein-coding genes don't just pop into existence from non-functional junk DNA as some would like to think can happen.

Technically your paper does not even support that statement; all we can now conclude is that de novo protein-coding genes can ALSO arise from non-junk regions. It does not show, or even imply, that junk regions cannot also produce translatable transcripts with potential functions. The paper merely adds to the number of hypothetical mechanisms responsible for de-novo protein-coding genes; it does not overturn other mechanisms.

A relevant question here, with regard to the whole N-terminus peptide address signal: I'm wondering why you think this makes it more unlikely for a de-novo originating protein-coding region to be functional? I mean, whether the terminal stretch of amino acids on a newly arisen protein-coding gene is part of a region that gives the protein an address code, or whether it is another stretch of amino acids that contributes to the structure and function of the requisite protein, why would one be more unlikely than the other? Are you saying that a random stretch of amino acids is more likely to happen upon a biologically relevant function than to contain a valid N-terminus signal? If so, I'd like to see some calculation in support of that statement, because that would imply you know the total landscape of biologically relevant functions. I'm pretty sure you don't.

Re: De-Novo Gene Origination from protogenes.

#52  Postby BooBoo » Dec 08, 2014 12:27 am

Rumraket wrote:
Wait, so you link a study that says a protein-coding gene originates de-novo from a lncRNA transcript, in order to argue that they don't originate from junk. You say it is unlikely to originate from junk, because the junk transcript has to contain an N-terminus target signal. But now you're arguing it's irrelevant, because we don't know whether the protein-coding gene is actually functional or not.


We don't have all the information available to be able to say for sure.

So what statement are you trying to support? If you're trying to argue that the RNA transcript from which de-novo genes originate, already has to be functional in order for a protein-coding gene to arise from it, then this study is irrelevant because as you say, we don't know whether the protein gene is even functional. You've shot yourself in the foot with this one.


What we know is that the protein-coding gene originated from a functional RNA transcript. Whether the gene itself is functional or not is unclear. But if it is, then its functionality is derived from the RNA transcript that gave birth to it.

So now it's functional again, in the same post? What statement are you trying to support? Your argument is all over the place.


It may be functional or it may not be. But if it is functional, then it could be that the cell uses the translated peptide as it would the non-coding RNA transcript, which may explain why a target signal is not so important in this case.

Also, isn't it extremely unlikely for a de-novo protein translated from lncRNA (a rather confused title, since if the RNA is translated, it's not actually noncoding, but I digress) to just so happen to take up the same function as the lncRNA from which it is translated? That would seem extraordinarily unlikely. Is that really what you want to argue took place? That seems to absurdly dwarf the odds of a mere junk region producing a functional, translatable transcript.


That's a good point, and it is not clear how these long RNA transcripts contain viable ORFs. However, there is growing evidence that lncRNAs can also double up as enhancer sequences containing binding sites for transcription factors. So maybe they have a very flexible role we don't fully understand. http://www.ncbi.nlm.nih.gov/pubmed/20887892

Technically your paper does not even support that statement; all we can now conclude is that de novo protein-coding genes can ALSO arise from non-junk regions. It does not imply, let alone show, that junk regions cannot also produce translatable transcripts with potential functions. The paper merely adds to the number of hypothetical mechanisms responsible for de novo protein-coding genes; it does not overturn other mechanisms.


The paper provides evidence that the sequences that give rise to de novo protein-coding genes are often functional.

A relevant question here, with regard to the whole N-terminus peptide address signal. I'm wondering why you think this makes it more unlikely for a de-novo originating protein coding region to be functional? I mean, whether the last stretch of amino acids on a newly arisen protein coding gene is part of a region that gives the protein an address code, or whether it is another stretch of amino acids that contribute to the structure and function of the requisite protein, why would one be more unlikely than the other? Are you saying that a random stretch of amino acids is more likely to happen upon a biologically relevant function than it is to contain a valid N-terminus signal? If so, I'd like to see some calculation in support of that statement, because that would imply you know the total landscape of biologically relevant functions. I'm pretty sure you don't.


I'm saying that even if a de novo gene product did have a function, and could fold properly, the lack of a valid signal peptide/address code means that the cell doesn't know where to put it. Also, the upstream sequences that regulate the gene are equally important. You don't want a gene that codes for a digestive enzyme to be expressed in the brain!

Re: De-Novo Gene Origination from protogenes.

#53  Postby Wortfish » Feb 27, 2017 7:14 pm

Interestingly, the NCBI is reporting that two of the three human protein-coding genes identified in the Knowles and McLysaght study (2009) do not code for proteins after all and so do not differ from orthologous sequences in other primates.

1. CLLU1: https://www.ncbi.nlm.nih.gov/gene/574028

Expression of this gene has been shown to be upregulated in some individuals with chronic lymphocytic leukemia (CLL), and has been used for prognostic and diagnostic purposes. This gene was originally identified as a human-specific putative protein-coding gene due to the presence of a peptide (PAp00140670, HIIYSTFLSK) that could have supported translation at this locus. This peptide is not present in more recent builds of PeptideAtlas, and the presence of a protein product at this locus has not been independently verified. For this reason, this gene is being represented as non-coding. Sequence comparisons to other primates indicates that no other primate is predicted to contain an open reading frame.


2. C22orf45: https://www.ncbi.nlm.nih.gov/gene/?term=C22orf45

This locus has been reported as a novel protein-coding gene (Knowles and McLysaght, PMID: 19726446). NCBI has noted that there is uncertainty about the correct definition of the full-length transcript and which of the predicted short open reading frames is translated. Therefore, we have elected to represent the locus as non-coding until additional data supporting a specific protein product becomes available.


So the creationists might have been right all along.

Re: De-Novo Gene Origination from protogenes.

#54  Postby Rumraket » Feb 28, 2017 8:52 am

Wortfish wrote:

So the creationists might have been right all along.

.. about what specifically?

Re: De-Novo Gene Origination from protogenes.

#55  Postby Wortfish » Feb 28, 2017 12:57 pm

Rumraket wrote:
Wortfish wrote:

So the creationists might have been right all along.

.. about what specifically?


On two counts. The Knowles & McLysaght paper was heralded by the New Scientist in 2009 with the following headline:
"Three human genes evolved from junk": https://www.newscientist.com/article/mg ... from-junk/

But it now turns out that the orthologous sequences are not "junk", they are actually functional ncRNA genes and are present not just in humans but in other primates. Secondly, the paper was considered proof that "new information" could arise spontaneously in the genome, something the creationists insisted could not happen. That too has been refuted by the scientists at the NCBI.


Re: De-Novo Gene Origination from protogenes.

#56  Postby Rumraket » Feb 28, 2017 1:15 pm

Wortfish wrote:
Rumraket wrote:
Wortfish wrote:

So the creationists might have been right all along.

.. about what specifically?


On two counts. The Knowles & McLysaght paper was heralded by the New Scientist in 2009 with the following headline:
"Three human genes evolved from junk": https://www.newscientist.com/article/mg ... from-junk/

But it now turns out that the orthologous sequences are not "junk", they are actually functional ncRNA genes and are present not just in humans but in other primates. Secondly, the paper was considered proof that "new information" could arise spontaneously in the genome, something the creationists insisted could not happen. That too has been refuted by the scientists at the NCBI.

Ehh no, wrong on both counts. If it evolved from junk, that just means it used to be junk but isn't any longer. So finding it to be functional now in humans doesn't mean it didn't evolve from junk. Second, the fact that it exists in other species doesn't say it isn't junk either. It could simply be junk in both species. In fact, that's one way to find junk: if it exists in different species but shows a lack of conservation due to accumulation of mutations, chances are it's not under purifying selection and thus probably junk. It's not completely guaranteed to be junk just because it's not conserved, but it's a strong indication that mutations are accumulating at a near-neutral rate. And if mutations accumulate at a near-neutral rate, then chances are the sequence is irrelevant. And if the sequence is irrelevant, it's probably junk. It could be spacer DNA of some sort, but then it would probably show size conservation instead.

A large fraction of the human genome, to pick an example, is broken copies of retroviral reverse-transcriptase genes that have proliferated in primate genomes over tens of millions of years. Most of these reverse-transcriptase genes have been inactivated due to accumulation of deleterious mutations over this time. They no longer function as protein-coding genes; they can no longer be transcribed and translated into the reverse transcriptase enzyme.

But all that DNA is still there, slowly degrading over time. It's there in chimps, and it's there in us, and a lot of our primate cousins.

Eventually, as mutations accumulate in these genes, by chance some regions of the gene will have the same sequence as a transcription factor binding site, which means they can and will recruit transcription factors, which in turn recruit RNA-polymerase and produce an RNA of some sort (usually noncoding). By chance, such a chance transcript might be functional.

In such a situation, it will effectively constitute an example of junk-DNA evolving into a functional gene (though in this particular case, not a protein-coding gene, hence the lack of an open reading frame, though that can happen too). And therefore it will also constitute an example of new information being produced by the evolutionary process: the chance accumulation of mutations results in a functional gene.
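
To put a rough number on the "by chance" step above, here is a minimal back-of-the-envelope sketch. It assumes a uniformly random sequence with equal base frequencies, independent positions, exact matching and a single strand. None of those assumptions comes from the posts, and real binding sites are degenerate, so treat the figure as illustrative only. The function name and the example numbers are hypothetical.

```python
def expected_motif_hits(seq_len: int, motif_len: int) -> float:
    """Expected number of exact matches to one fixed motif on one strand
    of a uniformly random DNA sequence (each base with probability 1/4)."""
    if motif_len <= 0 or seq_len < motif_len:
        return 0.0
    # Number of windows in which the motif could start.
    positions = seq_len - motif_len + 1
    # Each window matches with probability (1/4) ** motif_len.
    return positions / 4 ** motif_len


# Hypothetical example: one specific 8 bp motif in 1 Mb of random sequence.
print(expected_motif_hits(1_000_000, 8))
```

Under these assumptions the expectation grows linearly with the amount of sequence, which is why long stretches of degrading DNA would be expected to contain some motif-like sites by chance alone.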

Re: De-Novo Gene Origination from protogenes.

#57  Postby Wortfish » Feb 28, 2017 3:08 pm

Rumraket wrote:
Ehh no, wrong on both counts. If it evolved from junk, that just means it used to be junk but isn't any longer. So finding it to be functional now in humans doesn't mean it didn't evolve from junk. Second, the fact that it exists in other species doesn't say it isn't junk either. It could simply be junk in both species. In fact, that's one way to find junk: if it exists in different species but shows a lack of conservation due to accumulation of mutations, chances are it's not under purifying selection and thus probably junk. It's not completely guaranteed to be junk just because it's not conserved, but it's a strong indication that mutations are accumulating at a near-neutral rate. And if mutations accumulate at a near-neutral rate, then chances are the sequence is irrelevant. And if the sequence is irrelevant, it's probably junk. It could be spacer DNA of some sort, but then it would probably show size conservation instead.


No. It was never "junk". The sequences in question are non-coding, true, but they are apparently functional ncRNAs in humans and other primates. Knowles and McLysaght mistakenly thought that the sequences in humans coded for proteins but there is no evidence for this. To claim that a protein-coding gene is lineage-specific, you need to confirm that a valid peptide is produced in one lineage to the exclusion of all others.
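
For readers unfamiliar with what "contains an open reading frame" means in practice, the following is a minimal sketch of an ORF scan. It assumes the standard stop codons, an ATG start, and the forward strand only. The function name is hypothetical, and this is not the method used by NCBI or by the paper. Finding an ORF is only a prediction; it does not show that a peptide is actually produced.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(dna: str) -> str:
    """Return the longest ATG...stop stretch (stop codon included) found in
    any of the three forward reading frames, or '' if there is none."""
    dna = dna.upper()
    best = ""
    for frame in range(3):
        start = None  # index of the first ATG since the last stop in this frame
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = dna[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None  # look for the next ORF in this frame
    return best


# Toy sequence, not a real locus.
print(longest_orf("CCATGAAATAGCC"))
```

Running the same scan on orthologous sequences from other primates would show whether a comparable frame is intact there, but confirming translation still needs peptide evidence.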

Re: De-Novo Gene Origination from protogenes.

#58  Postby Calilasseia » Feb 28, 2017 4:14 pm

What I'd like to know, is why a gene purportedly not associated with a peptide product, in the case of CLLU1, is being used as a diagnostic marker for chronic lymphocytic leukaemia? Without a gene product to test for, it's difficult to imagine how this gene can be detectably up-regulated in individuals with the disease. There are at least eight scientific papers in the medical literature devoted to using up-regulation of CLLU1 as a tool for monitoring the disease and its progress during therapeutic management. This paper is but one of them.

Re: De-Novo Gene Origination from protogenes.

#59  Postby Wortfish » Feb 28, 2017 5:03 pm

Calilasseia wrote:What I'd like to know, is why a gene purportedly not associated with a peptide product, in the case of CLLU1, is being used as a diagnostic marker for chronic lymphocytic leukaemia? Without a gene product to test for, it's difficult to imagine how this gene can be detectably up-regulated in individuals with the disease. There are at least eight scientific papers in the medical literature devoted to using up-regulation of CLLU1 as a tool for monitoring the disease and its progress during therapeutic management. This paper is but one of them.


Good point. The paper you refer to, however, refers only to RQ-PCR analysis to detect gene expression, not a peptide product. The gene's RNA transcript is obviously doing something, but it isn't translated into a protein.

Re: De-Novo Gene Origination from protogenes.

#60  Postby Rumraket » Feb 28, 2017 5:24 pm

Wortfish wrote:
Rumraket wrote:
Ehh no, wrong on both counts. If it evolved from junk, that just means it used to be junk but isn't any longer. So finding it to be functional now in humans doesn't mean it didn't evolve from junk. Second, the fact that it exists in other species doesn't say it isn't junk either. It could simply be junk in both species. In fact, that's one way to find junk: if it exists in different species but shows a lack of conservation due to accumulation of mutations, chances are it's not under purifying selection and thus probably junk. It's not completely guaranteed to be junk just because it's not conserved, but it's a strong indication that mutations are accumulating at a near-neutral rate. And if mutations accumulate at a near-neutral rate, then chances are the sequence is irrelevant. And if the sequence is irrelevant, it's probably junk. It could be spacer DNA of some sort, but then it would probably show size conservation instead.


No. It was never "junk".

Why do you say this? Nothing you go on to speak about has any bearing on whether this particular locus was ever thought to be junk or not.

Just to get this out of the way, nobody who knew what they were talking about, ever thought that junk-DNA was synonymous with non-coding DNA.

So you say it was never "junk"? Okay, that might in fact be true, but the truth of that statement has no relation to whether it codes for protein or not, or whether it produces an RNA transcript.

So please make it clear to me why you say it isn't junk?

Wortfish wrote:The sequences in question are non-coding, true, but they are apparently functional ncRNAs in humans and other primates.

They are? They might be functional in humans (but even that isn't actually in evidence)*, but where is the evidence they are functional in other primates? I presume you have sequence-conservation studies from which you extract this conclusion (I presume this because in the absence of direct functional genomics, sequence conservation would be the only other way of telling whether a particular locus is possibly functional).

* Just because something causes disease when upregulated doesn't mean it's functional, that just means it's active. A very important distinction. It might be disturbing normal cellular processes, that doesn't mean it has a selected or adaptive organismal function.

Did you know that even random DNA, deliberately constructed to be nonfunctional, will be transcribed by cellular regulatory elements as if it was functional chromosomal DNA? You can't derive function from activity. Hence I keep asking about sequence conservation comparisons (or better yet, direct biochemical experiments such as gene-knockouts, that demonstrate how it functions, if at all).
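
As a minimal sketch of the kind of sequence-conservation comparison being asked for, the snippet below computes percent identity between two sequences that are assumed to be already aligned, with "-" marking gaps. The function name and the toy sequences are hypothetical. Identity on its own does not establish purifying selection; it would have to be compared against what is expected for neutrally evolving sequence over the same divergence time.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns where both sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned sequences, not real orthologs.
print(percent_identity("ACGT", "ACGA"))
```

A locus whose identity between species is no higher than that of surrounding presumed-neutral sequence gives no sign of constraint, which is the point being made about needing conservation data rather than activity alone.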

Wortfish wrote:Knowles and McLysaght mistakenly thought that the sequences in humans coded for proteins but there is no evidence for this.

Which is irrelevant to what I am addressing: whether the sequence used to be junk-DNA in our primate ancestors (and possibly still is in our primate cousins).

To claim that a protein-coding gene is lineage-specific, you need to confirm that a valid peptide is produced in one lineage to the exclusion of all others.

I completely agree, but it is completely irrelevant. A lineage-specific protein coding gene would be what is also called an ORFan gene, but that doesn't in itself tell us whether that gene is actually a case of junk-DNA (because it might produce an actual protein, yet still be junk), or a case of functional protein coding gene that has evolved from junk-DNA.

I know this is confusing, because even protein coding genes can be junk-DNA. Most putative ORFan genes turn out to be junk, upon closer inspection, even when it is demonstrated they produce actual protein products.
