Data Science Research @ Brown

Automated Leaf Classification

Understanding the extremely variable, complex shape and venation characters of angiosperm leaves is one of the most challenging problems in botany.

Automating Pathology with Deep Learning

Cancer is the second leading cause of death in the United States. Diagnosis and prognosis are typically determined by a pathologist's histological analysis of tissue samples, a process that is time-consuming, costly, and prone to diagnostic inconsistency.

BIGDATA: Analytical Approaches to Massive Data Analysis

The goal of this project is to design and test mathematically well-founded algorithmic and statistical techniques for analyzing large-scale, heterogeneous and noisy data.

Brain Networks: Inferring Cortical Microcircuitry with Electrophysiology

Existing technology cannot directly measure the synaptic connectivity between individual brain cells in awake, behaving mammals.

Computational Psychiatry: Combining Theory-Driven and Data-Driven Approaches to Understand Impulsivity

Impulsivity is a substantial risk factor for aberrant behaviors. This project seeks to understand the fundamental cognitive neuroscience mechanisms underlying distinct forms of impulsivity, using a combination of theory-driven and data-driven approaches.

Homomorphic Encryption

An encryption scheme is a method for efficiently computing an encrypted form e(X) of a given input X. It should be invertible, but computing the inverse must require a secret key.
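To make the definition concrete, here is a minimal sketch using textbook RSA with tiny, insecure parameters. It is an illustration only, not the scheme studied in this project. The names `encrypt` and `decrypt` and the constants `N`, `E`, and `D` are hypothetical, chosen for this example. Here e(X) can be computed by anyone, while inverting it requires the secret exponent `D`. Textbook RSA is also multiplicatively homomorphic: multiplying two ciphertexts yields a ciphertext of the product of the plaintexts.

```python
# Toy textbook RSA: tiny, insecure parameters chosen for illustration only.
P, Q = 61, 53              # secret primes
N = P * Q                  # public modulus (3233)
PHI = (P - 1) * (Q - 1)    # 3120, used only to derive the secret key
E = 17                     # public exponent
D = 2753                   # secret key: (E * D) % PHI == 1


def encrypt(x: int) -> int:
    """Compute e(X) from public values only. Requires 0 <= x < N."""
    return pow(x, E, N)


def decrypt(c: int) -> int:
    """Invert e(X). This step needs the secret exponent D."""
    return pow(c, D, N)


# Sanity check on the key pair.
assert (E * D) % PHI == 1

# Invertibility: decrypting e(X) recovers X.
assert decrypt(encrypt(42)) == 42

# Multiplicative homomorphism: the product of two ciphertexts
# decrypts to the product of the two plaintexts (mod N).
product_cipher = (encrypt(7) * encrypt(6)) % N
assert decrypt(product_cipher) == 42
```

The last check is the point of the example: the product 7 × 6 is computed on encrypted values without ever decrypting the inputs. Practical schemes use far larger parameters and randomized encryption, which this sketch omits.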

Identification of Phytotherapies

This project is establishing a first-of-its-kind computerized platform to identify and catalog therapeutic uses of plants.

Large-Scale Network Modeling for Brain Dynamics: Statistical Learning and Optimization

This project aims to develop novel statistical machine learning methods for big neuroimaging data.

Learning to Discover Particles in the Early Universe

The Large Hadron Collider (LHC), the world's largest particle accelerator, located at the CERN lab in Geneva, Switzerland, collides particles at a rate of 40 MHz.

Linking Multiple Datasets Without Unique Identifiers

Analysis of datasets created by linking two or more separate data sources is increasingly important as researchers and policy analysts seek to integrate administrative and clinical datasets while adapting to privacy regulations that limit access to unique identifiers.

Measuring Group Differences in High-Dimensional Choices: Method and Application to Congressional Speech

We study the problem of measuring group differences in choices when the dimensionality of the choice set is large.

Network Reconstruction

Network reconstruction is a useful tool in a number of areas ranging from medical imaging to oil exploration.

Predictive Healthcare Analytics

This project develops and applies computational and information science approaches for integrating biological, clinical, and public health data to model complex health phenomena, with particular emphasis on pediatrics, psychiatry, emergency medicine, and critical care.

Random Graphs

There is a large body of research in random graphs and networks, and many models of random graphs, some more and some less relevant to real-world networks.

Rapid Analysis and Visualization of Output from Topic Models

A series of methods in genomics use multilocus genotype data to assign individuals membership in latent clusters that often correspond to geographic regions or methods of subsistence. These methods belong to a broad class of topic models that includes latent Dirichlet allocation, which is used to analyze text corpora.

Social and Family History - Extraction, Representation, and Evaluation

This project leverages advanced computational methods to transform social, behavioral, and familial factors from electronic health records into a rich longitudinal resource for generating knowledge regarding various determinants of health including their temporal progression, severity, and relationship to health conditions.

The Murchison Widefield Array (MWA)

The international team of scientists on the MWA is pursuing a number of projects, including studies of the Milky Way and other galaxies, searches for pulsing and exploding stellar objects, and the study of space weather.

Using Clustering Algorithms to Extract Differential Genetic Architectures Between Immunological and Metabolic Phenotypes From Gene-Level Association Statistics

Existing and emerging genome-wide association (GWA) datasets, merged with medical record or survey data, enable testing for associations across dozens of phenotypes, yet methods for characterizing the shared genetic architecture of multiple traits are still not well established.

Data Science Institute
