For this project, I analyzed the Hospital Readmissions Dataset sourced from the UCI Machine Learning Repository. The dataset includes over 100,000 inpatient records from diabetic patients across 130 U.S. hospitals between 1999 and 2008. The primary focus was to examine readmission patterns in relation to demographics, medication use, and hospital stay characteristics.
Using Python and Jupyter Notebook, I began by cleaning and preparing the dataset: handling missing values, standardizing column names, and translating coded fields into human-readable labels. I then created visualizations to uncover trends in hospital readmissions by age, race, and diabetes medication status.
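The cleaning steps described above could look roughly like the following in pandas. This is a minimal sketch, not the notebook's actual code: the column names (`race`, `gender`, `age`, `diabetesMed`, `readmitted`), the `?` and `Unknown/Invalid` placeholder codes, and the readable labels are assumptions about how the raw file is encoded, and the three rows are made-up stand-ins rather than real records.

```python
import pandas as pd

# Tiny stand-in for the raw file. Values are illustrative only, not real records.
raw = pd.DataFrame({
    "race": ["Caucasian", "?", "AfricanAmerican"],
    "gender": ["Female", "Male", "Unknown/Invalid"],
    "age": ["[60-70)", "[70-80)", "[40-50)"],
    "diabetesMed": ["Yes", "No", "Yes"],
    "readmitted": ["<30", "NO", ">30"],
})

# Assumed placeholder codes that should count as missing values.
MISSING_CODES = ["?", "Unknown/Invalid"]

# Assumed mapping from coded readmission values to readable labels.
READMIT_LABELS = {
    "<30": "Within 30 days",
    ">30": "After 30 days",
    "NO": "Not readmitted",
}


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy; the input frame is left untouched."""
    out = df.copy()

    # Standardize column names: snake_case, lower case, no stray spaces.
    out = out.rename(columns={"diabetesMed": "diabetes_med"})
    out.columns = out.columns.str.strip().str.lower()

    # Turn placeholder codes into proper missing values.
    out = out.mask(out.isin(MISSING_CODES))

    # Translate coded fields into human-readable labels.
    out["readmitted"] = out["readmitted"].map(READMIT_LABELS)

    # Age arrives as an interval string such as "[60-70)"; drop the brackets.
    out["age"] = out["age"].str.strip("[)")

    return out


cleaned = clean(raw)
print(cleaned)
```

Keeping the missing-value codes and label mappings in named constants makes it easy to audit exactly which values were treated as missing, which matters for demographic fields.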
This analysis not only improved my skills in Python data manipulation and Matplotlib plotting, but also gave me valuable insight into patient outcomes and risk patterns that are essential for evidence-based healthcare decisions.
Through this exploratory analysis, I found that older adults (ages 60–80) are the most frequently readmitted age group, and that patients prescribed diabetes medication tend to have more contact with hospital services overall. The findings also highlight the importance of clean data, particularly for demographic fields like race and gender, in performing accurate and ethical healthcare analysis.
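A comparison of readmissions across age groups can be sketched with a pandas group-by. The example below is hypothetical: the column names, the labels, and the six rows are made up for illustration and do not reproduce the project's results. It reports both the raw count and the rate per age band, since a band can have the most readmissions simply because it has the most encounters.

```python
import pandas as pd

# Made-up sample in the cleaned format (illustrative only, not real results).
sample = pd.DataFrame({
    "age": ["60-70", "60-70", "70-80", "40-50", "70-80", "40-50"],
    "readmitted": [
        "Within 30 days", "Not readmitted", "After 30 days",
        "Not readmitted", "Within 30 days", "Not readmitted",
    ],
})


def readmission_by_age(df: pd.DataFrame) -> pd.DataFrame:
    """Summarize encounters, readmission counts, and rates per age band."""
    # Flag any readmission, whether within or after 30 days.
    flagged = df.assign(was_readmitted=df["readmitted"] != "Not readmitted")

    summary = flagged.groupby("age")["was_readmitted"].agg(
        encounters="count",   # rows in the age band
        readmitted="sum",     # True counts as 1
        rate="mean",          # share of encounters that were readmitted
    )
    return summary.sort_index()


summary = readmission_by_age(sample)
print(summary)
```

With Matplotlib installed, a call such as `summary["rate"].plot(kind="bar")` would turn the same table into a bar chart.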
This project demonstrates my ability to:
Clean and transform real-world clinical data
Visualize meaningful patterns using Python
Translate analytical findings into actionable healthcare insights
Moving forward, I aim to expand this project by applying machine learning models to predict readmission likelihood and evaluating the impact of specific medications on 30-day readmissions.