Course Offerings (DATA & DSIO)
There are many courses at Brown that include data analysis and data science. Searching Courses @ Brown with the keyword "data" will bring up most of these.
DATA Courses
Offered fall semester. A course on the social, political, and philosophical issues raised by the theory and practice of data science. Explores how data science is transforming not only our sense of science and scientific knowledge, but our sense of ourselves and our communities and our commitments concerning human affairs and institutions generally. Students will examine the field of data science in light of perspectives provided by the philosophy of science and technology, the sociology of knowledge, and science studies, and explore the consequences of data science for life in the first half of the 21st century. This course is limited to undergraduates and fulfills a requirement for the Certificate in Data Fluency.
Instructor: Deborah Hurley
Offered spring semester. As data science becomes more visible, are you curious about its unique amalgamation of computer programming, statistics, and visualization or storytelling? Are you wondering how these areas fit together and what a data scientist does? This course offers all students, regardless of background, the opportunity for hands-on data science experience, following a data science process from an initial research question, through data analysis, to the storytelling of the data. Along the way, you will learn about the ethical considerations of working with data and become more aware of the societal impacts of data science. This course does not count toward CS concentration requirements.
Prerequisites: CSCI 0111, 0150, 0170, 0190, CLPS 0950 or 1292.
Instructor: Linda Clark
Offered fall semester. Data science is growing fast, with tools, approaches, and results evolving rapidly. This course is for students with some familiarity with data science tools and skills who want to apply these skills and teach others how to implement and interpret data science. Working in conjunction with a faculty sponsor, students learn communication skills, how to determine the needs (requirements) for a project, and how to teach data science to peers. These skills are valuable for professional development. Interested students must submit an application form. Override requests will be granted only with instructor approval.
Instructor: Linda Clark
Offered fall semester. Develops all aspects of the machine learning pipeline in the context of real-world datasets: data acquisition and cleaning, handling missing data, exploratory data analysis, visualization, feature engineering, modeling, interpretation, and presentation. Fundamental considerations for data analysis are emphasized (the bias-variance tradeoff, training, validation, testing). Classical models and techniques for classification and regression are included (linear and logistic regression with regularization, support vector machines, decision trees, random forests, XGBoost). Uses the Python data science ecosystem (e.g., sklearn, pandas, matplotlib).
Prerequisites: A course equivalent to CSCI 0050, 0150 or 0170.
Instructor: Andras Zsom
Offered fall semester. This course covers the storage, retrieval, and management of various types of data, together with the computing infrastructure (such as various types of databases and data structures), algorithmic techniques (such as searching and sorting algorithms), and query languages (such as SQL) for interacting with data, in the context of both transaction processing (OLTP) and analytical processing (OLAP). Students will be introduced to measures for evaluating the efficiency of different techniques for interacting with data (such as Big-O complexity and the number of I/O operations) and to various types of indexes for the efficient retrieval of data. The course will also cover several components of the Hadoop ecosystem for the processing of "big data." Additional topics include cloud computing, NoSQL databases, and modern data architectures. The course also introduces some of the computer science concepts and techniques essential for data science.
Prerequisites: CSCI 0150 or 0170 or equivalent programming experience.
Instructor: Shekhar Pradhan
Offered spring semester. A modern introduction to inferential methods for regression analysis and statistical learning, with an emphasis on application in practical settings in the context of learning relationships from observed data. Topics will include basics of linear regression, variable selection and dimension reduction, and approaches to nonlinear regression. Extensions to other data structures such as longitudinal data and the fundamentals of causal inference will also be introduced.
Prerequisite: APMA 1690 or equivalent.
Instructor: Roberta DeVito
Offered spring semester. A course on the social, political, and philosophical issues raised by the theory and practice of data science. Explores how data science is transforming not only our sense of science and scientific knowledge, but our sense of ourselves and our communities and our commitments concerning human affairs and institutions generally. Students will examine the field of data science in light of perspectives provided by the philosophy of science and technology, the sociology of knowledge, and science studies, and explore the consequences of data science for life in the first half of the 21st century.
Instructor: Deborah Hurley
This course will explore recent work that leverages machine learning (ML) as a tool for tackling climate change, with a focus on climate science and climate adaptation. We will discuss how modern machine learning can be used to assess, understand and respond to projected climate extremes, natural disasters, and environmental change. The target audience for this course is advanced undergraduate students or graduate students who are interested in using ML and AI to address high-impact global issues.
Instructor: Karianne Bergen
We know we want to build more equitable technology, but how? In this course we’ll review the latest developments in how to build more equitable algorithms, including definitions of (un)fairness, the challenges of explaining how ML works, making sure we can get accountability, and much more.
Prerequisites: CSCI 1420, 1950F, DATA 0200, or equivalent.
Instructor: Suresh Venkatasubramanian
Data science techniques and tools are all around us. Machine learning is a term used across many different disciplines, and often people use machine learning tools without a thorough understanding of how and why the tools work. This course will provide students with a foundation of machine learning grounded in the mathematical models behind the techniques. The course will cover the theory, computational methods, and visualization inherent in the application of machine learning models. In this course, you will learn the statistical learning framework, common assumptions in the data generation process, the mathematics behind machine learning models, including supervised and unsupervised techniques, as well as how to implement machine learning models in Python from scratch.
Prerequisites: DATA 1030 and/or APMA 1690 (equivalencies will be considered by the instructor).
Instructor: Andras Zsom
The course will trace the origins and trajectory of ideas about artificial intelligence, starting with the “active intellect” of Aristotle, early analog computers and automata, Ada Lovelace’s “calculus of the nervous system,” through the “general intellect” and “machine capital” of Karl Marx, Karel Čapek’s “Universal Robots,” the “Turing test,” “cyberpositivity” and Donna Haraway’s “Cyborg Manifesto,” and concepts like “swarm intelligence,” “singularity,” and the “black box” problem. Sources will encompass a range of disciplinary approaches (philosophy, sociology, computer science, etc.), formats (text, film, graphic novel), and genres, including Japanese anime and Afrofuturism.
Instructor: Holly Case
DSIO Courses
Focused on practical applications, this course covers key concepts from linear algebra, calculus, probability, and statistics that are most relevant to data science workflows. This course closely integrates theory with practice to equip students with the specific techniques required for real-world data analysis. Learners will have the opportunity to explore more in-depth topics to discover how foundational topics like derivatives, matrices, and distributions are applied in machine learning and data modeling. Examples are drawn from advanced courses to highlight how these tools can be applied to the theories and practice examined in this course.
This course provides an introduction to the ethical and policy considerations surrounding artificial intelligence (AI) in today's society. Students will explore key ethical concerns, such as data privacy, bias, and accountability, as well as the societal and historical contexts that have shaped current AI governance. In addition, the course will offer a high-level overview of how AI systems are developed, including the basics of data collection, usage, and the training process for machine learning models. By examining these topics, students will gain a foundational understanding of the complex interactions between AI technologies and societal impacts, preparing them for deeper discussions in future courses on AI governance and responsible innovation.
This course introduces students to the core principles of data engineering, emphasizing the often-hidden ethical choices that shape how massive datasets are managed. Students will learn about the fundamentals of data architecture, storage, and processing, while exploring critical value-related issues such as privacy, bias, data provenance, ownership, and copyright. This course interweaves the ethical considerations with the technical mechanics of data engineering, illustrating the real-world choices data engineers make as well as their broader societal implications. By the end of the course, students will understand not just the technical foundations of data engineering, but also the value-laden decisions involved in handling large-scale data.
This course explores the role of artificial intelligence (AI) and machine learning (ML) in shaping evidence-driven policy decisions across various sectors. Using case studies from various AI/ML sectors, students will critically examine how AI/ML tools influence policy outcomes. Rather than delving into the technical intricacies of ML, this course emphasizes a "black box" approach, where data inputs lead to predictions. Students will learn to distinguish between prediction and intervention, recognize the limitations of AI/ML, and develop a transparent, precise language for discussing these technologies. By fostering healthy skepticism, this course equips students to make informed decisions about AI's role in evidence-based policy making.
This course offers a comprehensive introduction to machine learning (ML), deep learning (DL), and generative AI, preparing students to become informed users of these powerful technologies. The first half of the course focuses on classical ML techniques, while the second half is split between deep learning applications and the emerging field of generative AI, particularly large language models (LLMs). Students will explore key concepts like backpropagation to understand how models are updated with new data, and the differences between pretraining, fine-tuning, and alignment strategies, including Direct Preference Optimization (DPO) and Reinforcement Learning from Human Feedback (RLHF).
This course investigates the pursuit of building equitable technology by addressing fairness and bias in algorithmic systems. Students will review the latest advancements in creating more equitable algorithms, exploring definitions and types of (un)fairness. The course covers the challenges of explaining machine learning processes, ensuring accountability in algorithmic decisions, and addressing systemic biases. Through a combination of theoretical insights and practical approaches, students will gain a comprehensive understanding of how to design and implement fair and accountable AI systems.
Drawing from the lessons of earlier courses, this course provides a thorough exploration of data governance within the context of artificial intelligence (AI) and machine learning (ML). Students will start by defining AI ethics and core principles guiding AI and ML development, expanding on key issues such as bias, fairness, and transparency. This course will then explore the current landscape of AI regulation and legislation, examining the roles of governments and international organizations in shaping and enforcing these regulations. Students will discuss the challenges and opportunities associated with AI governance, gaining insights into how regulatory frameworks can both address ethical concerns and foster innovation.
In this capstone course, students will apply their comprehensive knowledge of data science to a significant project centered on policy and governance issues and their societal impacts. This course integrates students' understanding of policy, governance, machine learning, and data science into a practical project, which may include case studies, practicums, projects related to current employment, or research papers. Students will engage in hands-on work with data, reflecting key aspects of the data science pipeline. Additionally, the course offers career-oriented skills development to enhance students' professional readiness. Through this culminating project, students will demonstrate their ability to synthesize and apply their learning in real-world scenarios.