##### CoHSI IV: Unifying Horizontal and Vertical Gene Transfer - is Mechanism Irrelevant?
In previous papers we have described, with strong experimental support, the organising role that CoHSI (Conservation of Hartley-Shannon Information) plays in determining important global properties of all known proteins, from the length distribution to the natural emergence of very long proteins and their relationship to evolutionary time. Here we consider the insight that CoHSI might bring to a different problem: the distribution of identical proteins across species. Horizontal and Vertical Gene Transfer (HGT/VGT) both lead to the replication of protein sequences across species through a diversity of mechanisms, some of which remain unknown. In contrast, CoHSI predicts from fundamental theory that such systems will demonstrate power law behavior independently of any mechanism. Using the UniProt database, we show that the global pattern of protein re-use is emphatically linear on a log-log plot (adj. $R^{2} = 0.99, p < 2.2 \times 10^{-16}$ over 4 decades); i.e. it is extremely close to the predicted power law. Specifically, we show that over 6.9 million proteins in TrEMBL 18-02 are re-used, i.e. their sequence appears identically in between 2 and 9,812 species, with re-used proteins varying in length from 7 to 14,596 amino acids. Using (DL+V) to denote the three domains of life plus viruses, 21,676 proteins are shared between two (DL+V), 22 between three (DL+V), and 5 across all four (DL+V). Although the majority of protein re-use occurs between bacterial species, the proteins most frequently re-used occur disproportionately in viruses, which play a fundamental role in this distribution. These results suggest that diverse mechanisms of gene transfer (including traditional inheritance) are irrelevant in determining the global distribution of protein re-use.
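The abstract's central test is whether protein re-use is linear on a log-log plot, since a power law $n(k) = C k^{-\alpha}$ becomes $\log n = \log C - \alpha \log k$. The sketch below illustrates that kind of check; it is not the authors' pipeline. The function names, the input layout (a mapping from sequence to the set of species containing it), and the choice of a plain frequency histogram over the number of species $k$ are all assumptions made for illustration. The paper may use a different plot form, such as a rank-ordered or cumulative one.

```python
import math
from collections import Counter


def reuse_histogram(species_by_sequence):
    """Map k -> number of distinct sequences found in exactly k species.

    Only re-used sequences (k >= 2) are kept. The input layout is a
    hypothetical one chosen for this sketch, not a UniProt format.
    """
    counts = Counter(len(species) for species in species_by_sequence.values())
    return {k: n for k, n in counts.items() if k >= 2}


def loglog_fit(histogram):
    """Least-squares fit of log10(n) against log10(k).

    Returns (slope, intercept, adjusted R^2) for a single predictor.
    """
    xs = [math.log10(k) for k in histogram]
    ys = [math.log10(n) for n in histogram.values()]
    m = len(xs)
    if m < 3:
        raise ValueError("need at least 3 points for adjusted R^2")
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot
    # One predictor, so the adjustment factor is (m - 1) / (m - 2).
    adj_r2 = 1.0 - (1.0 - r2) * (m - 1) / (m - 2)
    return slope, intercept, adj_r2


# Toy input: four made-up sequences and the species they occur in.
toy = {
    "MKV": {"sp_a", "sp_b"},
    "MAA": {"sp_a"},
    "MTT": {"sp_a", "sp_b", "sp_c"},
    "MGG": {"sp_x", "sp_y"},
}
toy_hist = reuse_histogram(toy)

# Synthetic histogram that follows an exact power law with exponent -2.
synthetic = {k: 1000.0 * k ** -2.0 for k in range(2, 11)}
slope, intercept, adj_r2 = loglog_fit(synthetic)
```

On real data the fitted slope estimates the power-law exponent and the adjusted $R^{2}$ measures how closely the points follow a straight line. Ordinary least squares on binned log-log data is a simple diagnostic, and the sparse tail of such histograms can bias the estimate, so treat the result as indicative only.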
###### Other
Sample Sizes (N=): None
Inserted: 11/06/18 06:03PM
Words Total: 7,738
Words Unique: 2,492