Data Science Institute

Tongtong Zhao

How did you first become interested in data science?

My interest in data science stemmed from my love for pure mathematics, particularly mathematical logic. However, I discovered the practical application of my mathematical knowledge when I took a data science course during my undergraduate studies. It was fascinating to see how mathematical models could be used to help people in the real world. I was drawn to the idea of applying theoretical concepts to real-life problems, as opposed to solely focusing on abstract ideas. I realized that mathematics didn't have to be esoteric and inaccessible to people.

Data science was a way to bridge the gap between mathematical theory and practical application, making it more approachable and accessible to a wider audience. Ultimately, my interest in data science is rooted in the desire to use my mathematical background to solve real-world problems and to show people that math doesn't have to be intimidating or unapproachable.

What was your practicum project and what did you learn from it?

For my practicum project, I worked with Professor Alice Paul on a clinical risk prediction project. Our goal was to build an optimized risk prediction model for children's annual check-ups using clinical records. We preprocessed the clinical records and used logistic regression and a coordinate descent method to build the new model in Python and R. The model achieved an accuracy of over 90%. We also built an end-to-end pipeline for the project.
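
As an illustration of the technique named above, here is a minimal Python sketch of logistic regression fit by cyclic coordinate descent. It is not the project's model: the function names (`fit_logistic_cd`, `log_loss`), the synthetic data, and the curvature-bound step size are all assumptions made for this example, and the clinical pipeline may well differ (for instance, by adding a penalty term).

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + np.exp(-z))


def log_loss(X, y, w, b):
    """Mean negative log-likelihood of labels y under weights w and intercept b."""
    p = sigmoid(X @ w + b)
    eps = 1e-12  # guards against log(0)
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))


def fit_logistic_cd(X, y, n_sweeps=200):
    """Fit unpenalized logistic regression by cyclic coordinate descent.

    Each coordinate is updated with a one-dimensional step that divides the
    partial derivative by an upper bound on the curvature (p * (1 - p) <= 1/4),
    so the summed loss never increases from one update to the next.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    z = np.zeros(n)  # running linear predictor X @ w + b
    curvature = 0.25 * (X ** 2).sum(axis=0)  # per-feature curvature bound

    for _ in range(n_sweeps):
        # Update the intercept (its "feature" is a column of ones).
        step = np.sum(sigmoid(z) - y) / (0.25 * n)
        b -= step
        z -= step

        # Update one weight at a time, keeping z in sync.
        for j in range(d):
            if curvature[j] == 0.0:
                continue  # constant-zero feature carries no information
            grad_j = X[:, j] @ (sigmoid(z) - y)
            step = grad_j / curvature[j]
            w[j] -= step
            z -= step * X[:, j]
    return w, b


if __name__ == "__main__":
    # Synthetic stand-in data; real clinical records are not used here.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 3))
    true_w = np.array([2.0, -1.0, 0.5])
    y = (rng.random(500) < sigmoid(X @ true_w + 0.3)).astype(float)

    w, b = fit_logistic_cd(X, y)
    accuracy = np.mean((sigmoid(X @ w + b) >= 0.5) == y)
    print("weights:", w, "intercept:", b)
    print("training accuracy:", accuracy)
```

Updating a single coefficient per step keeps each update cheap and makes it easy to inspect or constrain individual features, which is one reason coordinate descent is a common choice for regression models.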

One of the key things I learned from this project was the importance of understanding the mathematical logic and equations behind the model. By understanding the underlying mathematical concepts, we were able to build a more accurate and optimized model for clinical risk prediction. I also learned how to interpret the results of the model. Furthermore, understanding the mathematical logic behind the model allowed me to customize and optimize the model to suit our specific project needs. I was able to experiment with different parameters and features to improve the accuracy of the model. Currently, we are preparing a publication on our findings and working on creating an R package for the model.

This project provided me with invaluable hands-on experience in data science and reinforced my passion for using data to solve real-world problems.

What are some of the most important skills you learned from the program?

Throughout the program, I developed a variety of technical skills such as data manipulation, statistical analysis, and machine learning. Beyond the technical skills, I also learned valuable lessons on how to be a successful and ethical data scientist. The program emphasized the importance of data privacy, bias mitigation, and transparency in the data science process. Understanding the ethical implications of data science and being able to make ethical decisions is a critical skill for any data scientist.

Is there anything else you'd like to share about your experience in the program?

I had a wonderful experience in the program, thanks to the amazing professors and staff who were always available to help and provide guidance. The program had a great atmosphere for learning, which was further enhanced by my supportive classmates. We had a great dynamic and helped each other learn and grow. We shared our thoughts, ideas, and feedback, which helped us improve both individually and as a group.