Data Science Institute

DATA Courses

There are many courses at Brown that include data analysis and data science. Searching Courses @ Brown with the keyword "data" will bring up most of these. The courses offered by the DSI are open to all Brown students (some have prerequisites).

Offered fall semester. A course on the social, political, and philosophical issues raised by the theory and practice of data science. Explores how data science is transforming not only our sense of science and scientific knowledge, but our sense of ourselves and our communities and our commitments concerning human affairs and institutions generally. Students will examine the field of data science in light of perspectives provided by the philosophy of science and technology, the sociology of knowledge, and science studies, and explore the consequences of data science for life in the first half of the 21st century. This course is limited to undergraduates and fulfills a requirement for Certificate in Data Fluency.

Instructor: Deborah Hurley

Offered spring semester. As data science becomes more visible, are you curious about its unique amalgamation of computer programming, statistics, and visualizing or storytelling? Are you wondering how these areas fit together and what a data scientist does? This course offers all students, regardless of background, the opportunity for hands-on data science experience, following a data science process from an initial research question, through data analysis, to the storytelling of the data. Along the way, you will learn about the ethical considerations of working with data and become more aware of the societal impacts of data science. This course does not count toward CS concentration requirements.

Prerequisites: CSCI 0111, 0150, 0170, 0190, CLPS 0950 or 1292.

Instructor: Linda Clark

Offered fall semester. Data science is growing fast, with tools, approaches, and results evolving rapidly. This course is for students with some familiarity with data science tools and skills who want to apply these skills and teach others how to implement and interpret data science. Working with a faculty sponsor, students learn communication skills, how to determine the needs (requirements) for a project, and how to teach data science to peers. These agile skills will be a valuable advantage in your professional development. Interested students must submit an application form. Override requests are granted only with instructor approval.

Instructor: Linda Clark

Offered fall semester. Develops all aspects of the machine learning pipeline in the context of real-world datasets: data acquisition and cleaning, handling missing data, exploratory data analysis, visualization, feature engineering, modeling, interpretation, and presentation. Fundamental considerations for data analysis are emphasized (the bias-variance tradeoff, training, validation, testing). Classical models and techniques for classification and regression are included (linear and logistic regression with regularization, support vector machines, decision trees, random forests, XGBoost). Uses the Python data science ecosystem (e.g., sklearn, pandas, matplotlib).

Prerequisites: A course equivalent to CSCI 0050, 0150 or 0170.

Instructor: Andras Zsom

Offered fall semester. This course covers the storage, retrieval, and management of various types of data, along with the computing infrastructure (such as various types of databases and data structures), algorithmic techniques (such as searching and sorting algorithms), and query languages (such as SQL) for interacting with data, in the context of both transaction processing (OLTP) and analytical processing (OLAP). Students will be introduced to measures for evaluating the efficacy of different techniques for interacting with data (such as the ‘Big-Oh’ measure of complexity and the number of I/O operations) and to various types of indexes for the efficient retrieval of data. The course will also cover several components of the Hadoop ecosystem for the processing of "big data." Additional topics include cloud computing, NoSQL databases, and modern data architectures. The course also introduces some of the computer science concepts and techniques essential for data science.

Prerequisites: CSCI 0150 or 0170 or equivalent programming experience.

Instructor: Shekhar Pradhan

Offered spring semester. A modern introduction to inferential methods for regression analysis and statistical learning, with an emphasis on application in practical settings in the context of learning relationships from observed data. Topics will include basics of linear regression, variable selection and dimension reduction, and approaches to nonlinear regression. Extensions to other data structures such as longitudinal data and the fundamentals of causal inference will also be introduced. 

Prerequisite: APMA 1690 or equivalent.

Instructor: Roberta DeVito

Offered spring semester. A course on the social, political, and philosophical issues raised by the theory and practice of data science. Explores how data science is transforming not only our sense of science and scientific knowledge, but our sense of ourselves and our communities and our commitments concerning human affairs and institutions generally. Students will examine the field of data science in light of perspectives provided by the philosophy of science and technology, the sociology of knowledge, and science studies, and explore the consequences of data science for life in the first half of the 21st century.

Instructor: Deborah Hurley

This course will explore recent work that leverages machine learning (ML) as a tool for tackling climate change, with a focus on climate science and climate adaptation. We will discuss how modern machine learning can be used to assess, understand and respond to projected climate extremes, natural disasters, and environmental change. The target audience for this course is advanced undergraduate students or graduate students who are interested in using ML and AI to address high-impact global issues. 

Instructor: Karianne Bergen

We know we want to build more equitable technology, but how? In this course we’ll review the latest developments in how to build more equitable algorithms, including definitions of (un)fairness, the challenges of explaining how ML works, ways to ensure accountability, and much more.

Prerequisites: CSCI 1420, 1950F, DATA 0200, or equivalent.

Instructor: Suresh Venkatasubramanian

Data science techniques and tools are all around us. Machine learning is a term used across many different disciplines, and often people use machine learning tools without a thorough understanding of how and why the tools work. This course will provide students with a foundation of machine learning grounded in the mathematical models behind the techniques. The course will cover the theory, computational methods, and visualization inherent in the application of machine learning models. In this course, you will learn the statistical learning framework, common assumptions in the data generation process, the mathematics behind machine learning models, including supervised and unsupervised techniques, as well as how to implement machine learning models in Python from scratch.

Prerequisites: DATA 1030 and/or APMA 1690 (equivalencies will be considered by the instructor).

Instructor: Andras Zsom

Courses Fall 2023