Power-law Behavior of the Alternative Splicing of Exons in Human Transcriptome

Vasily V. Grinev
Petr V. Nazarov
Eugene A. Klimov

Abstract

Aims: To establish the common rules of exon combinatorics during RNA splicing.

Study Design: Inference of a plausible statistical model of exon combinatorics during RNA splicing from annotated models of human genes.

Place and Duration of Study: Department of Genetics (Belarusian State University), Proteome and Genome Research Unit (Luxembourg Institute of Health), Department of Genetics (Lomonosov Moscow State University) and Moscow Center of Experimental Embryology and Reproductive Biotechnologies, between January 2017 and July 2019.

Methodology: We used human mRNA and EST sequences from GenBank (1,093,522 unique records in total) and linear models of human genes from Ensembl (58,051 genes), AceView (72,384 genes), ECgene (57,172 genes), NCBI RefSeq (54,262 genes), UCSC Genome Browser (58,037 genes) and VEGA (54,950 genes) to calculate a combinatorial index of human exons. We inferred the most plausible statistical model describing the distribution of the combinatorial index of human exons using Clauset’s mathematical formalism. Predictors of the combinatorial index values and functional outcomes of the predefined behavior of exons during splicing were also determined.
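
The core of the power-law fitting step named above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a continuous power law, uses synthetic data in place of real combinatorial index values, and all function names (fit_alpha, ks_distance, fit_power_law, sample_power_law) are hypothetical. It shows only the maximum-likelihood estimate of the exponent and the choice of the lower cutoff xmin by minimizing the Kolmogorov-Smirnov distance. The bootstrap goodness-of-fit test and the likelihood-ratio comparison with alternative distributions are omitted.

```python
import math
import random


def fit_alpha(tail, xmin):
    """Maximum-likelihood exponent for a continuous power-law tail (all x >= xmin > 0)."""
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum == 0.0:
        raise ValueError("all tail values equal xmin; the exponent is undefined")
    return 1.0 + len(tail) / log_sum


def ks_distance(sorted_tail, xmin, alpha):
    """Largest gap between the empirical CDF and the fitted CDF of the tail."""
    n = len(sorted_tail)
    worst = 0.0
    for i, x in enumerate(sorted_tail):
        model_cdf = 1.0 - (x / xmin) ** (1.0 - alpha)
        # Check the empirical CDF just below and just above each observation.
        worst = max(worst, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return worst


def fit_power_law(values, min_tail=50):
    """Scan candidate xmin values and keep the fit with the smallest KS distance.

    Returns None when fewer than min_tail observations are available.
    """
    data = sorted(values)
    best = None
    for start in range(len(data) - min_tail + 1):
        if start > 0 and data[start] == data[start - 1]:
            continue  # same candidate xmin as the previous position
        xmin, tail = data[start], data[start:]
        try:
            alpha = fit_alpha(tail, xmin)
        except ValueError:
            continue
        ks = ks_distance(tail, xmin, alpha)
        if best is None or ks < best["ks"]:
            best = {"xmin": xmin, "alpha": alpha, "n_tail": len(tail), "ks": ks}
    return best


def sample_power_law(n, alpha, xmin, rng):
    """Draw n synthetic values from a continuous power law by inverse transform."""
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


# Synthetic stand-in for real data (assumption: alpha = 2.5, xmin = 1.0).
rng = random.Random(0)
synthetic = sample_power_law(1000, alpha=2.5, xmin=1.0, rng=rng)
print(fit_power_law(synthetic))
```

If the combinatorial index is integer-valued, the discrete form of the estimator would be the more appropriate choice. The continuous form is used here only to keep the sketch short.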

Results: A power law is the most plausible statistical model describing the combinatorics of exons during RNA splicing. The combinatorial index of human exons is more than 90% determined by 138 features of differing importance. The most important of these features are the abundance of the exon in transcripts, the strength of its splice sites, the rank of the exon in transcripts and the exon type. Analysis of marginal effects shows that different values of the same feature influence the combinatorial index of human exons unequally. The power-law behavior of exons during RNA splicing predetermines the structural diversity of transcripts, the low sensitivity of the splicing process to random perturbations and its high vulnerability to manipulation of highly combinative exons.

Conclusion: Exons widely involved in alternative splicing are part of a common power-law phenomenon in human cells. The power-law behavior of exons during RNA splicing confers unique characteristics on human genes.

Keywords:
Human exons, RNA splicing, combinatorics, statistical modeling, power-law, predictors, functional outcomes.

How to Cite
Grinev, V., Nazarov, P., & Klimov, E. (2019). Power-law Behavior of the Alternative Splicing of Exons in Human Transcriptome. Annual Research & Review in Biology, 32(6), 1-13. https://doi.org/10.9734/arrb/2019/v32i630105
Section
Original Research Article

References

Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF, Schroth GP, Burge CB. Alternative isoform regulation in human tissue transcriptomes. Nature. 2008;456(7221):470-476.
DOI: 10.1038/nature07509

Gurskaya NG, Staroverov DB, Zhang L, Fradkov AF, Markina NM, Pereverzev AP, Lukyanov KA. Analysis of alternative splicing of cassette exons at single-cell level using two fluorescent proteins. Nucleic Acids Research. 2012;40(8):e57.
DOI: 10.1093/nar/gkr1314

Marinov GK, Williams BA, McCue K, Schroth GP, Gertz J, Myers RM, Wold BJ. From single-cell to cell-pool transcriptomes: Stochasticity in gene expression and RNA splicing. Genome Research. 2014;24(3):496-510.
DOI: 10.1101/gr.161034.113

Castle JC, Zhang C, Shah JK, Kulkarni AV, Kalsotra A, Cooper TA, Johnson JM. Expression of 24,426 human alternative splicing events and predicted cis regulation in 48 tissues and cell lines. Nature Genetics. 2008;40(12):1416-1425.
DOI: 10.1038/ng.264

Gamazon ER, Stranger BE. Genomics of alternative splicing: Evolution, development and pathophysiology. Human Genetics. 2014;133(6):679-687.
DOI: 10.1007/s00439-013-1411-3

Mazin P, Xiong J, Liu X, Yan Z, Zhang X, Li M, He L, Somel M, Yuan Y, Phoebe Chen YP, Li N, Hu Y, Fu N, Ning Z, Zeng R, Yang H, Chen W, Gelfand M, Khaitovich P. Widespread splicing changes in human brain development and aging. Molecular Systems Biology. 2013;9:633.
DOI: 10.1038/msb.2012.67

Nilsen TW, Graveley BR. Expansion of the eukaryotic proteome by alternative splicing. Nature. 2010;463(7280):457-463.
DOI: 10.1038/nature08909

Rosa A, Brivanlou AH. Regulatory non-coding RNAs in pluripotent stem cells. International Journal of Molecular Sciences. 2013;14(7):14346-14373.
DOI: 10.3390/ijms140714346

Biamonti G, Caceres JF. Cellular stress and RNA splicing. Trends in Biochemical Sciences. 2009;34(3):146-153.
DOI: 10.1016/j.tibs.2008.11.004

Chandler DS, Singh RK, Caldwell LC, Bitler JL, Lozano G. Genotoxic stress induces coordinately regulated alternative splicing of the p53 modulators MDM2 and MDM4. Cancer Research. 2006;66(19):9502-9508.
DOI: 10.1158/0008-5472.CAN-05-4271

Barash Y, Calarco JA, Gao W, Pan Q, Wang X, Shai O, Blencowe BJ, Frey BJ. Deciphering the splicing code. Nature. 2010;465(7294):53-59.
DOI: 10.1038/nature09000

Matlin AJ, Clark F, Smith CW. Understanding alternative splicing: Towards a cellular code. Nature Reviews Molecular Cell Biology. 2005;6(5):386-398.
DOI: 10.1038/nrm1645

Karolchik D, Barber GP, Casper J, Clawson H, Cline MS, Diekhans M, Dreszer TR, Fujita PA, Guruvadoo L, Haeussler M, Harte RA, Heitner S, Hinrichs AS, Learned K, Lee BT, Li CH, Raney BJ, Rhead B, Rosenbloom KR, Sloan CA, Speir ML, Zweig AS, Haussler D, Kuhn RM, Kent WJ. The UCSC Genome Browser database: 2014 update. Nucleic Acids Research. 2014;42(Database issue):D764-770.
DOI: 10.1093/nar/gkt1168

Benson DA, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Research. 2014;42(Database issue):D32-37.
DOI: 10.1093/nar/gkt1030

Flicek P, Amode MR, Barrell D, Beal K, Billis K, Brent S, Carvalho-Silva D, Clapham P, Coates G, Fitzgerald S, Gil L, Giron CG, Gordon L, Hourlier T, Hunt S, Johnson N, Juettemann T, Kahari AK, Keenan S, Kulesha E, Martin FJ, Maurel T, McLaren WM, Murphy DN, Nag R, Overduin B, Pignatelli M, Pritchard B, Pritchard E, Riat HS, Ruffier M, Sheppard D, Taylor K, Thormann A, Trevanion SJ, Vullo A, Wilder SP, Wilson M, Zadissa A, Aken BL, Birney E, Cunningham F, Harrow J, Herrero J, Hubbard TJ, Kinsella R, Muffato M, Parker A, Spudich G, Yates A, Zerbino DR, Searle SM. Ensembl 2014. Nucleic Acids Research. 2014;42(Database issue):D749-755.
DOI: 10.1093/nar/gkt1196

Thierry-Mieg D, Thierry-Mieg J. AceView: A comprehensive cDNA-supported gene and transcripts annotation. Genome Biology. 2006;7(Suppl 1):11-14.
DOI: 10.1186/gb-2006-7-s1-s12

Kim P, Kim N, Lee Y, Kim B, Shin Y, Lee S. ECgene: Genome annotation for alternative splicing. Nucleic Acids Research. 2005;33(Database issue):D75-79.
DOI: 10.1093/nar/gki118

Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O'Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, DiCuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM. RefSeq: An update on mammalian reference sequences. Nucleic Acids Research. 2014;42(Database issue):D756-763.
DOI: 10.1093/nar/gkt1114

Harrow JL, Steward CA, Frankish A, Gilbert JG, Gonzalez JM, Loveland JE, Mudge J, Sheppard D, Thomas M, Trevanion S, Wilming LG. The vertebrate genome annotation browser 10 years on. Nucleic Acids Research. 2014;42(Database issue):D771-779.
DOI: 10.1093/nar/gkt1241

Heber S, Alekseyev M, Sze SH, Tang H, Pevzner PA. Splicing graphs and EST assembly problem. Bioinformatics. 2002;18(Suppl 1):S181-188.
DOI: 10.1093/bioinformatics/18.suppl_1.s181

Newman MEJ. Power laws, Pareto distributions and Zipf's law. Contemporary Physics. 2005;46(5):323-351.
DOI: 10.1080/00107510500052444

Clauset A, Shalizi CR, Newman MEJ. Power-law distributions in empirical data. SIAM Review. 2009;51(4):661-703.
DOI: 10.1137/070710111

Virkar Y, Clauset A. Power-law distributions in binned empirical data. The Annals of Applied Statistics. 2014;8(1):89-119.
DOI: 10.1214/13-aoas710

Klaus A, Yu S, Plenz D. Statistical analyses support power law distributions found in neuronal avalanches. PLoS One. 2011;6(5):e19779.
DOI: 10.1371/journal.pone.0019779

Vuong QH. Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica. 1989;57(2):307.
DOI: 10.2307/1912557

Albert R, Barabási AL. Statistical mechanics of complex networks. Reviews of Modern Physics. 2002;74(1):47-97.
DOI: 10.1103/RevModPhys.74.47

Grinev VV, Ramanouskaya TV, Gloushen SV. Multidimensional control of cell structural robustness. Cell Biology International. 2013;37(10):1023-1037.
DOI: 10.1002/cbin.10128

Stelling J, Sauer U, Szallasi Z, Doyle FJ, 3rd, Doyle J. Robustness of cellular functions. Cell. 2004;118(6):675-685.
DOI: 10.1016/j.cell.2004.09.008

Koonin EV, Wolf YI, Karev GP. Power laws, scale-free networks and genome biology. Molecular biology intelligence unit. Landes Bioscience/Eurekah.com; Springer Science+Business Media, Georgetown, Tex. New York, N.Y; 2006.

Kaneko K, Furusawa C. Consistency principle in biological dynamical systems. Theory in Biosciences. 2008;127(2):195-204.
DOI: 10.1007/s12064-008-0034-z

Marquet PA, Quinones RA, Abades S, Labra F, Tognelli M, Arim M, Rivadeneira M. Scaling and power-laws in ecological systems. The Journal of Experimental Biology. 2005;208(Pt 9):1749-1769.
DOI: 10.1242/jeb.01588

Stauffer D. Phase transitions on fractals and networks. 2009;6783-6789.
DOI: 10.1007/978-0-387-30440-3_406

Nykter M, Price ND, Larjo A, Aho T, Kauffman SA, Yli-Harja O, Shmulevich I. Critical networks exhibit maximal information diversity in structure-dynamics relationships. Physical Review Letters. 2008;100(5):058702.
DOI: 10.1103/PhysRevLett.100.058702

Carlson JM, Doyle J. Complexity and robustness. Proceedings of the National Academy of Sciences of the United States of America. 2002;99(Suppl 1):2538-2545.
DOI: 10.1073/pnas.012582499