
DESI Research and Beyond
Galaxy evolution, quasar physics, and ML-driven spectral analysis on the largest spectroscopic survey to date
Research Focus
Our research uses large public spectroscopic survey data, primarily DESI Data Release 1, the largest spectroscopic survey to date. We investigate fundamental questions about galaxy evolution and quasar physics. We build Analysis-Ready Datasets (ARDs) that transform raw survey data into enriched, science-ready products, then apply those ARDs to targeted research questions.
Active Research Areas
Environmental Quenching in Cosmic Voids
Cosmic voids are vast underdense regions, the "bubbles" between filaments of the cosmic web. Galaxies in voids experience minimal environmental interactions, which makes them ideal laboratories for studying intrinsic evolution. We compare void galaxies to wall galaxies to disentangle "nature" (mass-driven) from "nurture" (environment-driven) quenching mechanisms.
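As a rough illustration of the void-versus-wall comparison, the sketch below computes quenched fractions in bins of stellar mass so that the two environments are compared at fixed mass. The tuple layout, the bin edges, and the quenching threshold of log10(sSFR / yr^-1) < -11 are assumptions chosen for the example, not values taken from our pipeline, and the rows are toy data.

```python
from collections import defaultdict

def quenched_fraction_by_mass(galaxies, bin_edges, ssfr_cut=-11.0):
    """Return {(environment, bin_index): quenched fraction}.

    galaxies: iterable of (environment, log_mstar, log_ssfr) tuples.
    A galaxy counts as quenched when log_ssfr < ssfr_cut (assumed threshold).
    Galaxies outside every mass bin are ignored.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [n_quenched, n_total]
    for env, log_mstar, log_ssfr in galaxies:
        for i in range(len(bin_edges) - 1):
            if bin_edges[i] <= log_mstar < bin_edges[i + 1]:
                counts[(env, i)][1] += 1
                if log_ssfr < ssfr_cut:
                    counts[(env, i)][0] += 1
                break
    return {key: q / n for key, (q, n) in counts.items()}

# Toy rows for illustration only, not survey measurements.
sample = [
    ("void", 9.5, -10.0),
    ("void", 9.7, -11.5),
    ("wall", 9.6, -11.8),
    ("wall", 9.8, -11.2),
    ("wall", 10.4, -12.0),
    ("void", 10.6, -10.5),
]
fractions = quenched_fraction_by_mass(sample, bin_edges=[9.0, 10.0, 11.0])
```

Binning by mass first is the key design choice: a difference in quenched fraction between void and wall galaxies within the same mass bin points to environment rather than mass.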
AGN Feedback and Outflow Energetics
Quasar-driven outflows may regulate galaxy growth through AGN feedback. We use semi-automated spectral fitting and Cloudy photoionization modeling to measure outflow properties at scale, producing distances, mass outflow rates, and kinetic luminosities. The goal is the first comprehensive catalog of quasar outflow energetics.
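The three quantities named above can be sketched with relations that are common in the quasar outflow literature; this page does not state which formulas our pipeline uses, so treat the following as an assumption. The distance follows from the definition of the ionization parameter, U_H = Q_H / (4π R² n_H c), and the mass outflow rate uses a thin, partially filled shell, Mdot = 4π Ω R N_H μ m_p v, with kinetic luminosity Mdot v² / 2. The covering fraction Ω = 0.2, mean atomic mass μ = 1.4, and all input values are illustrative.

```python
import math

M_P = 1.6726e-24   # proton mass [g]
C = 2.9979e10      # speed of light [cm/s]

def outflow_distance(q_h, n_h, u_h):
    """Distance R [cm] from U_H = Q_H / (4 pi R^2 n_H c)."""
    return math.sqrt(q_h / (4.0 * math.pi * C * n_h * u_h))

def mass_outflow_rate(r_cm, n_col, v_cms, omega=0.2, mu=1.4):
    """Thin-shell mass outflow rate [g/s]: 4 pi Omega R N_H mu m_p v."""
    return 4.0 * math.pi * omega * r_cm * n_col * mu * M_P * v_cms

def kinetic_luminosity(mdot, v_cms):
    """Kinetic luminosity [erg/s]: 0.5 * Mdot * v^2."""
    return 0.5 * mdot * v_cms ** 2

# Illustrative inputs (assumed, not measured values):
Q_H = 1e57      # hydrogen-ionizing photon rate [1/s]
N_DENS = 1e4    # hydrogen number density [cm^-3]
U_H = 1e-2      # ionization parameter from photoionization modeling
N_COL = 1e21    # total hydrogen column density [cm^-2]
V = 5.0e8       # outflow velocity [cm/s], i.e. 5000 km/s

r_cm = outflow_distance(Q_H, N_DENS, U_H)
mdot = mass_outflow_rate(r_cm, N_COL, V)
e_kin = kinetic_luminosity(mdot, V)
```

In this picture U_H and N_H come from the photoionization modeling and n_H from a density diagnostic, so uncertainties in those inputs propagate directly into R and therefore into both energetics estimates.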
ML-Driven Anomaly Detection
Across 1.6 million quasar spectra, systematic outlier detection can surface rare objects that manual inspection would miss. We use autoencoder architectures to flag statistical anomalies that may represent unusual accretion physics, rare evolutionary phases, or new source types.
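The idea behind reconstruction-based anomaly scoring can be shown with a deliberately simplified stand-in. The sketch below uses a linear autoencoder, which is equivalent to PCA, in place of a neural network: spectra are compressed to a few latent coefficients, reconstructed, and scored by mean squared reconstruction error. The synthetic spectra, the function names, and the latent size are all assumptions for illustration and do not describe our production models.

```python
import numpy as np

def fit_linear_autoencoder(spectra, n_latent):
    """Fit a linear encoder/decoder via SVD (PCA) on mean-subtracted spectra."""
    mean = spectra.mean(axis=0)
    _, _, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    return mean, vt[:n_latent]

def anomaly_scores(spectra, mean, components):
    """Mean squared reconstruction error per spectrum."""
    centered = spectra - mean
    latent = centered @ components.T   # encode to n_latent coefficients
    recon = latent @ components        # decode back to pixel space
    return np.mean((centered - recon) ** 2, axis=1)

# Synthetic toy spectra: a scaled continuum plus a slope, with small noise.
rng = np.random.default_rng(0)
wave = np.linspace(0.0, 1.0, 200)
amp = rng.uniform(0.5, 1.5, size=300)
slope = rng.uniform(-1.0, 1.0, size=300)
normal = (amp[:, None] * np.ones_like(wave)
          + slope[:, None] * (wave - 0.5)
          + rng.normal(0.0, 0.01, size=(300, 200)))

# One odd spectrum with a strong narrow feature the others lack.
bump = np.exp(-0.5 * ((wave - 0.5) / 0.02) ** 2)
odd = 1.0 + bump + rng.normal(0.0, 0.01, size=200)

stacked = np.vstack([normal, odd])
mean, components = fit_linear_autoencoder(stacked, n_latent=2)
scores = anomaly_scores(stacked, mean, components)
most_anomalous = int(np.argmax(scores))
```

Because the two latent directions are spent on the variation shared by most spectra, the narrow feature cannot be reconstructed and the odd spectrum receives the largest score. A nonlinear autoencoder follows the same encode, decode, and score pattern with a learned network in place of the SVD.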
Value-Added Catalogs
Nine DESI DR1 Value-Added Catalogs (VACs) are integrated into our ARD:
| Category | VAC | Purpose |
|---|---|---|
| Galaxy | FastSpecFit | Stellar continuum + emission lines |
| Galaxy | PROVABGS | Bayesian SED fitting with posteriors |
| Galaxy | DESIVAST | Cosmic void classifications (4 algorithms) |
| Galaxy | Gfinder | Halo-based group catalog |
| QSO | AGN/QSO | Spectral and IR classification |
| QSO | CIV Absorber | Intervening CIV systems |
| QSO | MgII Absorber | Intervening MgII systems |
| QSO | QMassIron | Black hole masses |
| QSO | Stellar Mass/EmLine | CIGALE masses and emission line properties |
Methodology
We follow a three-layer enrichment model:
1. Foundation Layer: unified catalog with cross-match linkage, environmental classifications, and quality flags
2. Physics Layer: derived physical quantities, including Lick indices, pPXF kinematics, SED posteriors, and outflow energetics
3. AI / Embeddings Layer: neural spectral embeddings, similarity metrics, and anomaly scores
PostgreSQL serves as the materialization engine where VAC joins and derived computations occur. Final ARD products are exported to Parquet for distribution and analysis. The pipeline currently manages approximately 32 GB of catalog data in PostgreSQL and 108 GB of spectral tiles in Parquet.
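The join step can be pictured with a small self-contained sketch. Here an in-memory SQLite database stands in for PostgreSQL so the example runs without a server, and the table names, column names, and the shared `targetid` join key are hypothetical placeholders rather than our actual schema.

```python
import sqlite3

# In-memory SQLite as a stand-in for the PostgreSQL materialization engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foundation  (targetid INTEGER PRIMARY KEY, z REAL);
    CREATE TABLE fastspecfit (targetid INTEGER PRIMARY KEY, halpha_flux REAL);
    CREATE TABLE desivast    (targetid INTEGER PRIMARY KEY, in_void INTEGER);
    INSERT INTO foundation  VALUES (1, 0.10), (2, 0.12), (3, 0.30);
    INSERT INTO fastspecfit VALUES (1, 12.5), (2, 3.1);
    INSERT INTO desivast    VALUES (1, 1), (3, 0);
""")

# LEFT JOINs keep every foundation row; a VAC with no match yields NULL.
rows = conn.execute("""
    SELECT f.targetid, f.z, s.halpha_flux, v.in_void
    FROM foundation AS f
    LEFT JOIN fastspecfit AS s ON s.targetid = f.targetid
    LEFT JOIN desivast    AS v ON v.targetid = f.targetid
    ORDER BY f.targetid
""").fetchall()
conn.close()
```

Using left joins from the foundation table means an object missing from one VAC is retained with empty columns instead of being dropped, which keeps the enriched product aligned row for row with the unified catalog before export to Parquet.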
Upcoming Work
As DESI research matures, two newer initiatives are taking shape:
- A Fink community alert broker science module for DESI-contextualized anomaly detection on the Rubin/LSST alert stream.
- Systematic anomaly detection on COSMOS-Web DR1 imaging, exploiting tension between independent measurements to surface candidates for follow-up.