Category Archives: Publications
A Validated Regulatory Network for Th17 Cell Specification
Abstract Th17 cells have critical roles in mucosal defense and are major contributors to inflammatory disease. Their differentiation requires the nuclear hormone receptor RORγt working with multiple other essential transcription factors (TFs). We have used an iterative systems approach, combining … Continue reading
A Type Theory for Probability Density Functions
Abstract There has been great interest in creating probabilistic programming languages to simplify the coding of statistical tasks; however, there still does not exist a formal language that simultaneously provides (1) continuous probability distributions, (2) the ability to naturally express … Continue reading
The CRIT framework for identifying cross patterns in systems biology and application to chemogenomics
Abstract Biological data is often tabular but finding statistically valid connections between entities in a sequence of tables can be problematic – for example, connecting particular entities in a drug property table to gene properties in a second table, using … Continue reading
Prediction and characterization of noncoding RNAs in C. elegans by integrating conservation, secondary structure, and high-throughput sequencing and array data
Abstract We present an integrative machine learning method, incRNA, for whole-genome identification of noncoding RNAs (ncRNAs). It combines a large amount of expression data, RNA secondary-structure stability, and evolutionary conservation at the protein and nucleic-acid level. Using the incRNA model … Continue reading
Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project
Abstract We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor–binding sites, and maps of chromatin … Continue reading
RSEQtools: A modular framework to analyze RNA-Seq data using compact, anonymized data summaries
Abstract Summary: The advent of next-generation sequencing for functional genomics has given rise to quantities of sequence information that are often so large that they are difficult to handle. Moreover, sequence reads from a specific individual can contain sufficient information … Continue reading
Nettle Tech Reports
Andi Voellmy and the Nettle Team have released two tech reports describing our work so far: "Don’t Configure the Network, Program It! Domain-Specific Programming Languages for Network Systems" and "Nettle: Functional Reactive Programming for OpenFlow Networks".
Comparison and calibration of transcriptome data from RNA-Seq and tiling arrays
Abstract Background: Tiling arrays have been the tool of choice for probing an organism’s transcriptome without prior assumptions about the transcribed regions, but RNA-Seq is becoming a viable alternative as the costs of sequencing continue to decrease. Understanding the relative … Continue reading
Toward Interactive Statistical Modeling
Abstract When solving machine learning problems, there is currently little automated support for easily experimenting with alternative statistical models or solution strategies. This is because this activity often requires expertise from several different fields (e.g., statistics, optimization, linear algebra), and … Continue reading
Genome-Wide Identification of Binding Sites Defines Distinct Functions for Caenorhabditis elegans PHA-4/FOXA in Development and Environmental Response
Abstract Transcription factors are key components of regulatory networks that control development, as well as the response to environmental stimuli. We have established an experimental pipeline in Caenorhabditis elegans that permits global identification of the binding sites for transcription factors … Continue reading