Mining the Biomedical Literature (Computational Molecular Biology)

By Hagit Shatkay and Mark Craven

The introduction of high-throughput methods has transformed biology into a data-rich science. Knowledge about biological entities and processes has traditionally been acquired by thousands of scientists through decades of experimentation and analysis. The current abundance of biomedical data is accompanied by the creation and rapid dissemination of new information. Much of this information and knowledge, however, is represented only in text form: in the biomedical literature, lab notebooks, web pages, and other sources. Researchers' need to find relevant information in the vast amounts of text has created a surge of interest in automated text analysis.

In this book, Hagit Shatkay and Mark Craven offer a concise and accessible introduction to key ideas in biomedical text mining. The chapters cover such topics as the relevant sources of biomedical text; text-analysis methods in natural language processing; the tasks of information extraction, information retrieval, and text categorization; and methods for empirically assessing text-mining systems. Finally, the authors describe several applications that recognize entities in text and link them to other entities and data resources, support the curation of structured databases, and make use of text to enable further prediction and discovery.


