{"633633":{"#nid":"633633","#data":{"type":"news","title":"Machine Learning Tool May Help Us Better Understand RNA Viruses","body":[{"value":"\u003Cp\u003E\u003Ca href=\u0022https:\/\/github.com\/ml4bio\/e2efold\u0022\u003EE2Efold\u003C\/a\u003E\u0026nbsp;is an end-to-end deep learning model developed at Georgia Tech that can predict RNA secondary structures, an important task in virus analysis, drug design, and other public health applications.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EAlthough the model has yet to be used in real-life applications, in research testing it has shown at least a 10 percent improvement\u0026nbsp;in structure prediction accuracy compared to previous\u0026nbsp;state-of-the-art methods,\u0026nbsp;according to\u0026nbsp;\u003Cstrong\u003EXinshi Chen\u003C\/strong\u003E, a Georgia Tech Ph.D. student specializing in machine learning and co-developer of the new tool.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026ldquo;The model uses an unrolled algorithm for solving a constrained optimization as a component in the neural network architecture, so that it can directly incorporate a solution constraint, or prior knowledge, to predict the RNA base-pairing matrix,\u0026rdquo; said Chen.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EE2Efold is not only more accurate, it is also considerably faster than current techniques.\u003C\/p\u003E\r\n\r\n\u003Cp\u003ECurrent methods are based on dynamic programming, which is a much slower approach for predicting longer RNA sequences, such as the genomic RNA of viruses. E2Efold overcomes this drawback by using a gradient-based unrolled algorithm. It also\u0026nbsp;takes advantage of graphics processing units to accelerate its computing process and is now the fastest method available.\u003C\/p\u003E\r\n\r\n\u003Cp\u003ERNA, or ribonucleic acid, is an essential building block that governs gene expression and is particularly important for RNA viruses, which consist only of RNA and the enwrapping virion proteins. 
These viruses cause a wide array of infectious diseases, including SARS,\u0026nbsp;dengue fever, the common cold, and others.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026ldquo;Unlike most organisms, the genetic information of an RNA virus is RNA. As a result, almost every stage in the RNA virus life cycle relies on RNA heavily,\u0026rdquo; said\u0026nbsp;\u003Cstrong\u003EYu Li\u003C\/strong\u003E, a computational bioscience researcher from King Abdullah University of Science and Technology and co-investigator.\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026ldquo;Take SARS, as an example. It belongs to an RNA virus. If we can predict its secondary and 3D structure\u0026nbsp;accurately, based on its sequence information, we can potentially design drugs to bind to its local binding pocket and block the RNA from functioning. In other words,\u0026nbsp;researchers\u0026nbsp;might be able to develop treatments for the virus based on the specific local structure of the target RNA using this method as a starting point,\u0026rdquo; said Li.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EOne additional noteworthy feature of E2Efold is its ability to predict pseudoknots. Pseudoknots are biologically important RNA secondary structures that are present in roughly 40 percent of RNAs and assist with folding into 3D structures.\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026ldquo;Most previous models were restricted to only predict one type of RNA structure called nested structures. This excluded pseudoknots altogether because they were computationally expensive,\u0026rdquo; said Chen. 
\u0026ldquo;In this paper, we predict RNA structures with pseudoknots by adopting a feed-forward model with a 25 percent greater accuracy than previous versions.\u0026rdquo;\u003C\/p\u003E\r\n\r\n\u003Cp\u003ELed by Georgia Tech\u0026nbsp;\u003Ca href=\u0022https:\/\/cse.gatech.edu\/\u0022\u003ESchool of Computational Science and Engineering\u003C\/a\u003E\u0026nbsp;(CSE) Associate Professor\u0026nbsp;\u003Cstrong\u003E\u003Ca href=\u0022https:\/\/www.cc.gatech.edu\/~lsong\/\u0022\u003ELe Song\u003C\/a\u003E\u0026nbsp;\u003C\/strong\u003Eand\u0026nbsp;KAUST Associate Professor\u0026nbsp;\u003Cstrong\u003EXin Gao\u003C\/strong\u003E, the team of researchers who created the model will present\u0026nbsp;the\u0026nbsp;\u003Ca href=\u0022https:\/\/openreview.net\/forum?id=S1eALyrYDH\u0022\u003Epaper outlining their findings\u003C\/a\u003E\u0026nbsp;at the\u0026nbsp;\u003Ca href=\u0022https:\/\/iclr.cc\/\u0022\u003EInternational Conference on Learning Representations\u003C\/a\u003E\u0026nbsp;(ICLR) 2020.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EAlthough the focus of the paper is on RNA secondary structure prediction, E2Efold\u0026rsquo;s end-to-end deep learning approach is generic enough to also be applied to other problems such as protein folding and natural language understanding.\u003C\/p\u003E\r\n","summary":null,"format":"limited_html"}],"field_subtitle":"","field_summary":"","field_summary_sentence":[{"value":"Georgia Tech researchers have created an end-to-end deep learning algorithm that is able to more accurately and quickly predict RNA secondary structures."}],"uid":"34540","created_gmt":"2020-03-17 17:42:48","changed_gmt":"2020-03-17 17:42:48","author":"Kristen Perez","boilerplate_text":"","field_publication":"","field_article_url":"","dateline":{"date":"2020-03-17T00:00:00-04:00","iso_date":"2020-03-17T00:00:00-04:00","tz":"America\/New_York"},"extras":[],"hg_media":{"633626":{"id":"633626","type":"image","title":"RNA Secondary 
Structure","body":null,"created":"1584459203","gmt_created":"2020-03-17 15:33:23","changed":"1584459203","gmt_changed":"2020-03-17 15:33:23","alt":"RNA secondary structure diagram","file":{"fid":"241102","name":"RNA_Secondary_Structure.png","image_path":"\/sites\/default\/files\/images\/RNA_Secondary_Structure.png","image_full_path":"http:\/\/tlwarc.hg.gatech.edu\/\/sites\/default\/files\/images\/RNA_Secondary_Structure.png","mime":"image\/png","size":68237,"path_740":"http:\/\/tlwarc.hg.gatech.edu\/sites\/default\/files\/styles\/740xx_scale\/public\/images\/RNA_Secondary_Structure.png?itok=xeZwLsw3"}}},"media_ids":["633626"],"groups":[{"id":"47223","name":"College of Computing"},{"id":"50877","name":"School of Computational Science and Engineering"}],"categories":[],"keywords":[{"id":"179628","name":"RNA sequencing"},{"id":"2546","name":"bioinformatics"},{"id":"127171","name":"Le Song"}],"core_research_areas":[{"id":"39441","name":"Bioengineering and Bioscience"},{"id":"39431","name":"Data Engineering and Science"}],"news_room_topics":[],"event_categories":[],"invited_audience":[],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[{"value":"\u003Cp\u003EKristen Perez\u003C\/p\u003E\r\n\r\n\u003Cp\u003ECommunications Officer\u003C\/p\u003E\r\n","format":"limited_html"}],"email":["kristen.perez@cc.gatech.edu"],"slides":[],"orientation":[],"userdata":""}}}