ClimateBench 2.0
A probabilistic benchmarking framework for climate ML models, developed on a research scholarship under Prof. Duncan Watson-Parris.
Funding: $4,500 research scholarship — Halıcıoğlu Data Science Institute, UC San Diego
Supervisor: Prof. Duncan Watson-Parris (Scripps Institution of Oceanography / HDSI)
Repository: climate-analytics-lab/ClimateBench_app
ClimateBench 2.0 is a scalable benchmarking platform for evaluating machine learning models on climate prediction tasks. The project extends the original ClimateBench dataset and evaluation pipeline with probabilistic scoring, broader model coverage, and a cloud-native data infrastructure.
My contributions:
- Processed 240+ CMIP6 climate simulation datasets into ndpyramid format on Google Cloud Storage, enabling efficient multiscale access for large model ensembles.
- Built an evaluation pipeline implementing 20+ probabilistic and deterministic metrics (continuous ranked probability score (CRPS), skill scores, spatial correlation, bias) to benchmark climate emulators.
- Applied the pipeline to score 50+ ML models, spanning neural emulators, statistical baselines, and ensemble methods, across temperature and precipitation prediction tasks.
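To make the probabilistic scoring concrete, here is a minimal sketch of one of the metrics named above, the ensemble CRPS, combined with a cosine-latitude area average. It is an illustration under stated assumptions, not the code in the repository: the function names `crps_ensemble` and `area_weighted_mean`, the (member, lat, lon) array layout, and the synthetic values are all hypothetical.

```python
import numpy as np


def crps_ensemble(members, obs):
    """Ensemble CRPS estimate: E|X - y| - 0.5 * E|X - X'|.

    members: array of shape (n_members, ...) holding ensemble predictions.
    obs:     array of shape (...) holding the reference values.
    Returns an array of shape (...); lower is better.
    """
    members = np.asarray(members, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Mean absolute error of each member against the reference.
    mae = np.abs(members - obs).mean(axis=0)
    # Mean absolute difference over all member pairs (ensemble spread).
    spread = np.abs(members[:, None] - members[None, :]).mean(axis=(0, 1))
    return mae - 0.5 * spread


def area_weighted_mean(field, lat_deg):
    """Average a (..., lat, lon) field using cos(latitude) weights."""
    weights = np.cos(np.deg2rad(np.asarray(lat_deg, dtype=float)))
    zonal_mean = np.asarray(field, dtype=float).mean(axis=-1)  # average over lon
    return np.average(zonal_mean, axis=-1, weights=weights)    # weight over lat


# Synthetic example: 3 members on a 2 x 4 (lat, lon) grid.
lat = [-45.0, 45.0]
members = np.stack([np.full((2, 4), v) for v in (1.0, 2.0, 3.0)])
obs = np.full((2, 4), 2.0)

crps_map = crps_ensemble(members, obs)      # per-grid-cell CRPS
score = area_weighted_mean(crps_map, lat)   # single global score
print(round(float(score), 4))               # 0.2222
```

With a single ensemble member the spread term vanishes and the score reduces to the absolute error, which is what allows deterministic and probabilistic models to be compared on one scale.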
The work helps the climate ML community compare models rigorously on physically meaningful benchmarks.