because I suck at math
Rumraket wrote:No, I think you're definitely on the right track there. I just need some confirmation and it's a go.
Edit: I think we still need to factor in that there are 8 possible kinds of mutations at each of those 3.2 billion sites.
zoon wrote:Each one of those 8 possible mutations still counts as a single mutation, with a probability of (1 divided by 3.2 billion).
Rumraket wrote:Hmmm yeah, 1 of 8 will happen, but there are 8 possibilities, so the probability of a specific mutation at any given site would be 1/8th of 1 in 3.2 billion?
I guess that would make the total probability (1/8th divided by 3.2 billion) to the 100th power?
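As a rough sanity check on that guess, here is a minimal Python sketch. It assumes every one of the 8 x 3.2 billion (site, change) pairs is equally likely and that the 100 mutations are independent, which is the simplified model from the thread and not real biochemistry. The result is far too small for a normal float, so the sketch works in log10. The variable names are only illustrative.

```python
import math

sites = 3.2e9        # base-pairs in the genome (figure used in the thread)
kinds = 8            # 1 deletion + 4 insertions + 3 substitutions per site
mutations = 100      # number of specific mutations

# Assumption: each (site, kind) pair is equally likely, so one specific
# mutation has probability 1 / (sites * kinds).
log10_single = -math.log10(sites * kinds)

# Assumption: the mutations are independent, so the logs simply add up.
log10_total = mutations * log10_single

print(f"one specific mutation: about 10^{log10_single:.2f}")
print(f"all {mutations} in a fixed order: about 10^{log10_total:.1f}")
```

Under those assumptions the total comes out at roughly 10^-1040, for the 100 mutations happening in one particular order.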
lucek wrote:Not to throw a monkey wrench into the works here, but this is far more complicated. The 8 alternatives do not all have the same odds, because they require different chemical reactions to happen. Even substitutions have slight biases.
zoon wrote:I have a suspicion that's the probability of getting those 100 mutations in a particular order.
zoon wrote:Was there any particular reason for the query?
Rumraket wrote:So, I'm trying to work out how to calculate the probability of some event, but I don't know how.
It's actually a pretty simple thing (I think?): Take the human genome (~3.2 billion base-pairs), randomly insert 100 mutations, then calculate the probability of those 100 specific mutations happening before they did. Every base-pair can mutate of course, and there are multiple mutations possible at every site.
To keep it simple, I just want these mutations included:
Insertion of any base-pair anywhere (that means 4 different kinds of insertions possible at 3.2 billion sites)
Deletion of any base-pair anywhere (delete one base-pair at any of 3.2 billion sites)
Any substitution anywhere (change any base-pair at any of 3.2 billion sites into one of the 3 others).
I think you can simplify this to say there are 8 possible changes at any site (1 deletion, 4 insertions, 3 substitutions).
100 of those in total will happen in a human genome. How does one make a formula to calculate the probability?
Rumraket wrote:I know the real events are much, much more complicated. Transversion vs transition bias, deletion vs insertion (all these have different probability distributions and even then, their distributions are different in some areas of the genome compared to others), duplication can almost universally only happen in areas with high numbers of repeats and prone to unequal crossover etc. etc.
The point is not to get an accurate representation of the real biochemistry of mutation (one could probably write a fucking dissertation on that). I have even excluded some mutations from my example.
What I want to learn is how to do the math of these kinds of problems of mutations in general. If there are N sites, X number of mutations happen, and there are Y number of possible mutations at each site, how do I calculate the probability? If N is 3x10^9, X is 100 and Y is 8. How's the formula look?
That way I can just plug in the numbers, so to speak, as I get across different situations.
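One way to get a plug-in formula, as a sketch under the simplified model only: if each mutation is an independent pick from N*Y equally likely (site, kind) outcomes, then a specific sequence of X mutations has probability (1/(N*Y))^X, and if the order does not matter (and the X mutations are all distinct) that is multiplied by X!. The function name `log10_prob` and its parameters are made up for illustration. It returns log10 of the probability, because the raw value underflows a float.

```python
import math

def log10_prob(n_sites, n_mutations, kinds_per_site, ordered=True):
    """log10 of the probability of one specific set of mutations.

    Assumes a uniform, independent model: every mutation event picks one of
    n_sites * kinds_per_site equally likely (site, kind) outcomes.
    """
    outcomes = n_sites * kinds_per_site
    # Ordered case: (1 / outcomes) ** n_mutations, computed in logs.
    result = -n_mutations * math.log10(outcomes)
    if not ordered:
        # Any of the n_mutations! orderings of distinct mutations counts,
        # so multiply by n_mutations! (lgamma(k + 1) == ln(k!)).
        result += math.lgamma(n_mutations + 1) / math.log(10)
    return result

print(log10_prob(3e9, 100, 8))                  # specific order
print(log10_prob(3e9, 100, 8, ordered=False))   # order ignored
```

With N = 3x10^9, X = 100 and Y = 8 this gives roughly 10^-1038 for a fixed order and roughly 10^-880 when order is ignored. It also treats positions as fixed, so it ignores the fact that insertions and deletions shift the sites after them.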