
    Learning Expressive Computational Models of Gene Regulatory Sequences and Responses

    File(s)
    TR1615.pdf (7.269Mb)
    Date
    2007
    Author
    Noto, Keith
    Publisher
    University of Wisconsin-Madison Department of Computer Sciences
    Abstract
    The regulation and responses of genes involve complex systems of relationships between genes, proteins, DNA, and a host of other molecules that are involved in every aspect of cellular activity. I present algorithms that learn expressive computational models of cis-regulatory modules (CRMs) and gene-regulatory networks. These models are expressive because they are able to represent key aspects of interest to biologists, often involving unobserved underlying phenomena. The algorithms presented in this thesis are designed specifically to learn in these expressive model spaces.
    I have developed a learning approach based on models of CRMs that represent not only the standard set of transcription factor binding sites, but also logical and spatial relationships between them. I show that my expressive models learn more accurate representations of CRMs in genomic data sets than current state-of-the-art learners and several less expressive baseline models. I have developed a probabilistic version of these CRM models which is closely related to hidden Markov models. I show how these models can perform inference and learn parameters efficiently when processing long promoter sequences, and that these expressive probabilistic models are also more accurate than several baselines.
    Another contribution presented in this thesis is the development of a general-purpose regression learner for sequential data. This approach is used to discover mappings from sequence features in DNA (e.g. transcription or sigma factor binding sites) to real-valued responses (e.g. transcription rates). The key contribution of this approach is its ability to use the real values directly to discover the relevant sequence features, as opposed to choosing the features beforehand or learning them from sequence alone, and without losing information in a discretization process.
    Finally, I present and evaluate a gene-regulatory network that learns the hidden underlying state of regulators from expression data and a set of cellular conditions under which expression is measured. I show that using sequence data to estimate the role of regulators (activator or repressor) increases the accuracy of the learned models.
    Permanent Link
    http://digital.library.wisc.edu/1793/60594
    Citation
    TR1615
    Part of
    • CS Technical Reports
