We Don't Crunch Numbers, We Create Clarity

Aspiring data scientist passionate about machine learning, predictive analytics, and data visualization. Building skills through hands-on projects and academic exploration to solve real-world problems using data-driven techniques.

Machine Learning, Data Visualization, Statistical Analysis, Python, Data Preprocessing

Featured Projects

A showcase of my data science work

ClimateBench2.0: Probabilistic Climate Model Scoring

Google Cloud Platform, React/Node.js, Data Engineering, ML Model Evaluation, FastAPI/Flask/Django

Received a $4,500 research scholarship under PhD scientist Duncan Watson-Parris. Processed 240+ climate datasets (e.g., CMIP6) into ndpyramid format on Google Cloud to support ML model validation. Developed a pipeline with 20+ custom metrics to benchmark 50+ models and exposed black-box limitations. Presented findings via a React/Node.js web app for stakeholder engagement.

Predictive Modeling of Heating and Cooling Loads

Scikit-learn, Multiple Linear Regression, K-means Clustering, Statistical Analysis

Predicted heating and cooling loads with multiple linear regression, reaching 91% accuracy as measured by R-squared, to refine energy models. Applied k-means clustering to detect shifts in consumption patterns. Conducted EDA to uncover key trends for targeted energy optimization.

Simulating Black Hole Evolution: Comparative Analysis of Light and Heavy Seeds

Computational Astrophysics, Numerical Modeling, Astropy, Report Writing & Documentation

Simulated black hole growth using Eddington and super-Eddington accretion, analyzing astrophysical datasets. Researched the impact of seed mass on growth trajectories, visualizing key differences. Conducted 12 weeks of research under PhD guidance, advancing computational astrophysics.
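
As a rough illustration of the Eddington-limited growth mentioned above (a minimal sketch, not the project's actual code), a seed accreting at a fixed fraction of the Eddington rate grows exponentially in time. The function name `grow_seed`, the seed masses, the radiative efficiency of 0.1, and the use of `f_edd > 1` as a crude stand-in for super-Eddington accretion are all assumptions made for this example.

```python
import math

# Approximate Eddington timescale sigma_T * c / (4 * pi * G * m_p), in Myr.
T_EDD_MYR = 450.0


def grow_seed(seed_mass_msun, time_myr, f_edd=1.0, efficiency=0.1):
    """Mass (in solar masses) of a seed after accreting for time_myr.

    Assumes a constant Eddington ratio f_edd and a constant radiative
    efficiency, which gives M(t) = M0 * exp(t / t_efold).
    """
    # e-folding time: shorter for higher f_edd or lower efficiency.
    t_efold_myr = T_EDD_MYR * efficiency / ((1.0 - efficiency) * f_edd)
    return seed_mass_msun * math.exp(time_myr / t_efold_myr)


# Illustrative seed masses (assumed): a light seed and a heavy seed.
for label, seed_mass in [("light", 1e2), ("heavy", 1e5)]:
    for f_edd in (1.0, 2.0):  # Eddington-limited vs. a toy super-Eddington case
        final_mass = grow_seed(seed_mass, time_myr=500.0, f_edd=f_edd)
        print(f"{label} seed, f_edd={f_edd}: {final_mass:.3e} Msun after 500 Myr")
```

Under these assumptions the ratio between heavy and light seeds stays fixed at a given Eddington ratio, so differences in growth trajectories come from the seed mass, the accretion rate, and how long accretion is sustained.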

Compost Analytics & Consulting Project

Correlational Analysis, Data Cleaning, Data Validation, Statistical Analysis

Led analysis of 2,000+ composting data records, increasing waste diversion rates by 15%. Enhanced data accuracy by 25% through Python-driven cleaning and transformation. Enabled scalable, reliable environmental data analysis with structured workflows and agile project management.

Rebooting the Sitcom "Friends" with Data-Driven Strategies for Enhanced Engagement

Bayesian Inference, Data Preprocessing, Hypothesis Testing, Correlation Analysis, N-grams, Pandas

Used statistical testing to analyze episode ratings and viewership trends, guiding reboot decisions. Applied sentiment analysis and Bayesian modeling for character development. Optimized episode count with linear regression to sustain audience engagement.

TSwift Tunes: Data-Driven Insights and Recommender System

Exploratory Data Analysis (EDA), TF-IDF Keyword Search, Text Mining, Sentiment Analysis

Conducted EDA to uncover trends and improve song categorization. Built a recommender system using audio features for personalized track suggestions. Developed a TF-IDF lyric search tool for faster, more accurate theme identification.
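
To give a sense of how a TF-IDF lyric search like the one described above can work, here is a minimal sketch in plain Python. It is not the project's code: the helper names (`tokenize`, `build_index`, `search`), the smoothed IDF formula, and the placeholder texts standing in for lyrics are all assumptions made for illustration.

```python
import math
from collections import Counter

PUNCTUATION = ".,!?;:'\"()"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]


def build_index(docs):
    """Return (idf, vectors): per-term IDF and per-document TF-IDF weights."""
    tokenized = {title: tokenize(text) for title, text in docs.items()}
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))
    n_docs = len(tokenized)
    # Rare terms get a higher weight; the +1 keeps common terms above zero.
    idf = {t: math.log(n_docs / df) + 1.0 for t, df in doc_freq.items()}
    vectors = {}
    for title, tokens in tokenized.items():
        counts = Counter(tokens)
        vectors[title] = {t: (c / len(tokens)) * idf[t] for t, c in counts.items()}
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, idf, vectors, top_k=3):
    """Rank documents by cosine similarity to the query; drop zero scores."""
    counts = Counter(t for t in tokenize(query) if t in idf)
    query_vec = {t: c * idf[t] for t, c in counts.items()}
    scored = sorted(
        ((cosine(query_vec, vec), title) for title, vec in vectors.items()),
        reverse=True,
    )
    return [(title, score) for score, title in scored[:top_k] if score > 0]


# Placeholder texts (invented for this example, not real lyrics).
docs = {
    "track_a": "rain on the window and a long drive home",
    "track_b": "dancing in the kitchen under neon lights",
    "track_c": "rain rain falling on the empty street",
}
idf, vectors = build_index(docs)
results = search("rain street", idf, vectors)
for title, score in results:
    print(f"{title}: {score:.3f}")
```

Weighting by inverse document frequency means a query word that appears in only one track counts for more than a word shared by many, which is what makes theme lookups more selective than a plain keyword match.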

About Me

Hi, I'm Pratham Aggarwal

I'm a Data Science major at the Halıcıoğlu Data Science Institute, University of California San Diego, driven by curiosity and a passion for uncovering patterns—whether in data, the night sky, or human behavior. My work spans diverse domains, from analyzing audience engagement in entertainment to optimizing energy consumption and researching black hole growth in astrophysics.

Beyond data, karate has been a lifelong discipline that instilled perseverance and focus—qualities that shape my problem-solving mindset. I also have a deep love for astronomy, not just as a science but as a way to spark imagination and wonder. I see stories in the stars, much like I find insights in data.

Technical Skills

Python
SQL
Google Cloud Platform
Pandas
MATLAB
Java
Scikit-learn
Git & GitHub
Tableau
Data Preprocessing

Get In Touch

Let's Connect

I'm currently open to new opportunities and collaborations. Feel free to reach out if you'd like to discuss a project or position.