The Data Science Fellows program, offered in collaboration between the Data Science Institute and the Sheridan Center for Teaching and Learning, provides a unique opportunity for undergraduate students to collaborate with faculty members from across the university to infuse data science tools and practice in existing undergraduate courses.
The Data Science Fellows course (DATA 1150) has been offered each fall since 2022 and has had over 50 undergraduate students from a diverse set of concentrations participate in this unique program. Professor Linda Clark, Associate Teaching Professor of Data Science and Academic Director of the DSI Online Master's Program created and runs the program. The Data Fellows course prepares the Data Science Fellows to serve as consultants for faculty wishing to enhance data science curricula at Brown and teaches a core set of data science practices, active learning pedagogies, and collaborative communication skills. This program not only helps students develop their data science and collaborative project management skills, but also provides faculty across the university with an invaluable resource for creating innovative curricula at Brown.
This week, the twelve students in the 2025 cohort presented on their semester-long projects where they worked with faculty to incorporate data science tools into courses at Brown. This year’s projects ranged from building a French-language learning website to TA-like AI agents for a statistics course to an interactive visualization platform for English courses.
Check out each 2025 Data Fellows project below!
2025 Data Fellows Projects

Data Fellow: Joseph Oduro
Faculty Collaborators: Elizabeth Chen and Neil Sakar (Medical Science & Health Services, Policy and Practice)
This project aimed to enhance student preparedness for the AI in Health course by developing clear, accessible, and technically accurate learning resources. Over the semester, I created onboarding documentation, platform setup guides, and a series of foundational quizzes covering EHR concepts, Python programming, and UNIX commands. These materials were designed to help new students build confidence with core tools and concepts before beginning more advanced coursework. I also updated the CODIAC for Health website with missing EHR content to improve the accessibility and continuity of the curriculum. Together, these efforts support a more streamlined and effective learning experience for future cohorts as the course moves toward online delivery.

Data Fellow: Michael Wang
Faculty Collaborators: Nik Marda (Data Science Institute)
This project focuses on developing interactive tools for students in DSIO 2120 to visualize fairness and bias concepts in AI. Specific tasks include identifying datasets containing proxy variables, injecting synthetic protected variables stochastically, training a multi-layer perceptron model on potentially biased attributes, and implementing feature reweighting techniques to toggle effects of such attributes. The goal is to support an intuitive understanding of contemporary fairness metrics interactively, so there is also development of frontend components. Final Website.
Data Fellow: Stiles Begnaud
Faculty Collaborators: Stéphanie Gaillard (French and Francophone Studies)
This project builds on the work of two previous fellows: Nuj Naguleswaran (2022) and Diana Nazari (2022). The objective is to create a website that combines documentation of American and French proficiency standards for language learning (ACTFL and CEFRL) and an assessment tool that gauges the user's knowledge level based on those standards. The finished site will be able to assess users and provide them with resources to further their learning based on where they stand. Language Learning Portal.
Data Fellows: Julia Belle Reyfaman and Jessie Chen
Faculty Collaborators: Andras Zsom and Linda Clark (Data Science Institute)
Together we collaborated with Professor Clark and Professor Zsom to redesign the current DATA 1030 course into a data science course for the Executive MBA program. We utilized prior DSIO course materials as well as our own knowledge to develop a data science course for professionals. We were responsible for creating the weekly course modules which included slideshow presentations, active learning activities and we also had weekly meetings with faculty. This experience was very valuable to both of us as we were able to reflect critically on our data science knowledge, pushed us to learn continuously, and allowed us to directly apply concepts from our fellows coursework to a real project.
Data Fellow: Daniel Ma
Faculty Collaborator: Jacy Weems (Health Services, Policy & Practice)
This semester, I collaborated with Jacy Weems (with guidance from advisor Dr. Alice Paul) to develop the foundation for a proposed DSI-funded qualitative content analysis course. This work involved drafting a full curriculum outline, organizing proposed modules, writing learning outcomes, and designing Quarto- and Jupyter notebook-based instructional materials modeled after existing open-source data science courses. I completed an outline which, should the grant application make further progress, be able to be implemented as a stand-alone course. As the project shifted focus, I began testing and refining our large-scale content-analysis pipeline on Brown’s OSCAR supercomputing cluster, reviewing the R and SLURM scripts, validating parallel job execution, and documenting how researchers can run these analyses themselves/providing feedback on existing scripts. I also started developing a GitHub site to host the workflow, curriculum materials, and lessons learned, to demystify the pipeline for future learners.
Data Fellow: Anika Mahns
Faculty Collaborator: Peter Lipman (Biostatistics)
Working with Professor Lipman, I designed and implemented Isabelle Curve, a conversational AI system that supports students learning biostatistics by walking them through statistical concepts and empirical research article assessments. I built a FastAPI backend that processes PDFs, evaluates article validity, and generates multi-step feedback cycles using OpenAI models. I also developed a frontend interface where students can upload articles, respond to AI-generated prompts, and receive reflection and clarification feedback meant to strengthen statistical communication skills. The project integrates principles from teaching and learning by encouraging iterative reasoning, promoting clarity of explanation, and grounding responses in course materials. Overall, the system aims to serve as an accessible study tool that supplements traditional instruction and gives students structured practice analyzing real research.
Data Fellow: Jacqui Culver
Faculty Collaborator: Karianne Bergen (Earth, Environmental, and Planetary Sciences & Data Science Institute)
This semester, I helped Professor Bergen revamp her freshman seminar, "DATA0150: Data Detectives". She originally developed the seminar with another fellow in Fall 2023; however, this semester’s work involved refreshing aspects of the course based on student feedback and her teaching experience. Professor Bergen and I decided to redesign the course’s final project and develop new in-class activities that would encourage active learning and cross-disciplinary engagement with data science. The final project is now a partner-based blog-like assignment, consisting of 6 different data science-related assignments. Each task requires 900-1500 words and instructs students to create, interact with, and critically analyse data.
Data Fellow: Sean Kim
Faculty Collaborator: Jenna Morton-Aiken (English)
This semester, I collaborated with Dr. Jenna Morton-Aiken to build an automated folksonomy network visualization system for ENGL1190M course. The project aimed to create a collaborative knowledge-mapping tool where students tag weekly readings with hashtags, and the system automatically generates an interactive network showing connections between texts and emerging themes. I developed a fully automated pipeline using Google Forms for data collection, Google Colab for processing, and GitHub Pages for hosting, eliminating all manual file management through PyGithub API integration. The visualization, built with Sigma.js, features interactive hover highlighting, real-time search filtering, and responsive pan-and-zoom controls that work seamlessly on both desktop and mobile devices. The system processes cumulative data from all submissions throughout the semester, creating timestamped version control for pedagogical research. This approach allows students to discover unexpected connections between readings based on their own interpretations, rather than following only the professor's planned course structure. The one-click update workflow enables Dr. Morton-Aiken to refresh the visualization weekly in under 60 seconds, making it sustainable for semester-long use and potentially shareable with other courses exploring collaborative knowledge construction.
Data Fellow: Sima Raha
Faculty Collaborator: Pierre de Galbert (Education)
As part of my Data Science Fellowship, I am collaborating with Professor Galbert to develop an interactive online textbook for Brown’s Education Department. The project involves building a Quarto-based website hosted on GitHub that integrates Stata code for applied statistics and quantitative research methods courses (Education 1230 and 2320). As a Developer and Quant Research Assistant, my work focuses on the technical infrastructure, including GitHub workflow, Python integration, and documentation of the development process to ensure accessibility and reproducibility. This digital resource aims to data literacy through an engaging, hands-on learning experience for both undergraduate and graduate students in statistics.
Data Fellow: Suhaila Hashimi
Faculty Collaborator: Latha Ganthi (Medical Science)
Collaborating with Professor Ganti on developing the “Get Published Now” course, which is designed to teach students how to publish in health research literature. My role focuses on helping students understand and present data clearly and accurately. I clean and enhance lecture materials, create interactive visualizations like dynamic maps that update with student-provided data, and develop guided notes that walk students through using clinical datasets with tools such as Python, R, Tableau, and JMP.
I also design hands-on in-class activities and homework assignments that help students analyze data distributions, calculate key statistics, and visualize their findings effectively. Part of my work involves ensuring that all tables and visualizations adhere to AMA-style presentation standards. This project allows me to combine my interests in data science, education, and health research while creating resources that actively support student learning and engagement.
Data Fellow: Maryam Khademi
Faculty Collaborator: Carlin Corrigan and Shelly Strunk (Sheridan Center for Teaching and Learning)
This semester, I worked with Carlin and Shelly from the Sheridan Center to analyze student engagement with video content across the Graduate Public Health Program. I built a Python-based analytics pipeline that processes Panopto viewership data (over 1,275 videos across 33 course sections) to answer key questions about which media types perform best, how courses compare, and what video lengths maximize student engagement. The analysis presents actionable insights, including that self-recorded content drives significantly higher engagement than other formats. I also created an interactive HTML dashboard with filtering by year, semester, and course, designed to be maintainable by team members with limited programming experience. Together, these tools give the Sheridan Center a sustainable, replicable process for tracking content performance and making evidence-based decisions about educational content.