NASA Unveils Galileo: The Groundbreaking Open-Source Multimodal Model Transforming Earth Observation
NASA has released Galileo, an open-source multimodal AI model that advances Earth observation by integrating diverse remote sensing data for applications like agriculture and disaster management.
Introducing Galileo: A Multimodal Foundation Model for Earth Observation
Galileo is an innovative open-source multimodal foundation model designed to analyze and interpret diverse Earth observation (EO) data streams at scale. Developed collaboratively by researchers from institutions including NASA, McGill University, and Arizona State University, Galileo processes optical, radar, elevation, climate, and auxiliary map data. This unified approach supports crucial applications such as agricultural land mapping, disaster response, and environmental monitoring.
Advanced Architecture and Multimodal Processing
At its core, Galileo utilizes a Vision Transformer (ViT) architecture tailored to handle various data types:
- Multispectral optical imagery (e.g., Sentinel-2)
- Synthetic Aperture Radar (SAR) imagery (e.g., Sentinel-1)
- Elevation and slope data (e.g., NASA SRTM)
- Weather and climate data (e.g., ERA5 precipitation and temperature)
- Land cover maps, population density, and night-light data
Galileo's tokenization pipeline segments inputs into spatial patches, time steps, and channel groups, enabling the model to ingest images, time series, and tabular data seamlessly within a single architecture.
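As a rough illustration of the tokenization described above, the sketch below enumerates the tokens produced when an input is split along space, time, and channel groups. It is not Galileo's actual code: the `Token` class, the `tokenize_grid` function, the patch size, and the particular band grouping are all assumptions made for this example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Token:
    """One token = one spatial patch, at one time step, for one channel group."""
    patch_row: int
    patch_col: int
    time_step: int
    channel_group: str


def tokenize_grid(height, width, num_time_steps, channel_groups, patch_size):
    """Enumerate the tokens for an input of the given size (hypothetical helper)."""
    if height % patch_size or width % patch_size:
        raise ValueError("height and width must be divisible by patch_size")
    rows = height // patch_size
    cols = width // patch_size
    return [
        Token(r, c, t, g)
        for r, c, t, g in product(
            range(rows), range(cols), range(num_time_steps), channel_groups
        )
    ]


# Hypothetical grouping of Sentinel-2 and Sentinel-1 channels (an assumption,
# not the grouping used by the model).
CHANNEL_GROUPS = {
    "s2_rgb": ["B2", "B3", "B4"],
    "s2_nir": ["B8"],
    "s1": ["VV", "VH"],
}

tokens = tokenize_grid(
    height=8,
    width=8,
    num_time_steps=3,
    channel_groups=list(CHANNEL_GROUPS),
    patch_size=4,
)
print(len(tokens))  # 2 patch rows * 2 patch cols * 3 time steps * 3 groups = 36
```

Under this scheme, a single-date image is simply the case with one time step, and a pixel time series is the case with one spatial patch, which is one way a single architecture can accept both.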
Dual-Objective Self-Supervised Pretraining
A key innovation is Galileo's dual-objective self-supervised pretraining that simultaneously learns local and global features:
- Global losses focus on capturing broad spatial and temporal patterns, ideal for detecting large-scale phenomena like glaciers and deforestation.
- Local losses enhance the model’s sensitivity to small, rapidly changing objects such as fishing boats or debris.
These objectives differ in prediction depth and masking strategies, creating a robust multi-scale feature representation that generalizes well across various EO tasks, even with limited labeled data.
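To make the idea of differing masking strategies concrete, the sketch below contrasts two generic schemes: unstructured masking of individual tokens and structured masking of whole time steps. The article does not say which strategy is paired with which objective, so the code makes no such claim; the function names, ratios, and token ordering are assumptions for illustration only.

```python
import random


def random_mask(num_tokens, mask_ratio, rng):
    """Unstructured masking: hide individual tokens chosen uniformly at random."""
    num_masked = int(num_tokens * mask_ratio)
    masked = set(rng.sample(range(num_tokens), num_masked))
    return [i in masked for i in range(num_tokens)]


def time_step_mask(num_patches, num_time_steps, num_masked_steps, rng):
    """Structured masking: hide every token that belongs to the chosen time steps.

    Tokens are assumed to be ordered patch-major, i.e.
    index = patch * num_time_steps + time_step.
    """
    hidden_steps = set(rng.sample(range(num_time_steps), num_masked_steps))
    return [
        t in hidden_steps
        for _ in range(num_patches)
        for t in range(num_time_steps)
    ]


rng = random.Random(0)
unstructured = random_mask(num_tokens=24, mask_ratio=0.5, rng=rng)
structured = time_step_mask(
    num_patches=4, num_time_steps=6, num_masked_steps=2, rng=rng
)
print(sum(unstructured), sum(structured))  # number of hidden tokens per scheme
```

With structured masking the model must reconstruct entire missing acquisitions from the remaining ones, whereas unstructured masking leaves nearby context available for every hidden token.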
Comprehensive Pretraining Dataset and Strategy
Galileo's training dataset spans the entire globe, with locations selected through clustering to maximize land cover diversity and geographic representation. It includes over 127,000 samples aligned spatiotemporally across multiple data types. Training runs for 500 epochs with a batch size of 512, uses data augmentations such as flipping and rotation, and is optimized with AdamW under scheduled learning-rate and weight-decay adjustments.
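The flipping and rotation augmentations mentioned above can be sketched as follows. This is a minimal illustration on nested lists, assuming that the same transform is applied to every time step of a sample so that the frames stay spatially aligned; the function names and the 50% flip probability are assumptions, not details from the Galileo training code.

```python
import random


def hflip(grid):
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in grid]


def rot90(grid):
    """Rotate a 2D grid 90 degrees clockwise."""
    return [list(col) for col in zip(*grid[::-1])]


def augment(time_series, rng):
    """Apply one randomly chosen flip/rotation to every time step of a sample."""
    flip = rng.random() < 0.5       # assumed flip probability
    quarter_turns = rng.randrange(4)  # 0, 90, 180 or 270 degrees
    augmented = []
    for grid in time_series:
        out = [list(row) for row in grid]
        if flip:
            out = hflip(out)
        for _ in range(quarter_turns):
            out = rot90(out)
        augmented.append(out)
    return augmented


frame = [[1, 2], [3, 4]]
sample = [frame, frame]  # two time steps of the same 2x2 tile
result = augment(sample, random.Random(0))
print(result)
```

Because both time steps receive the same transform, a pixel keeps the same position across the series after augmentation.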
Benchmark-Breaking Performance
Galileo excels across 11 datasets and 15 downstream tasks, including image classification, pixel-wise time series classification, and segmentation. Highlights include:
- EuroSAT classification accuracy of 97.7%, surpassing specialized models like CROMA and SatMAE.
- CropHarvest pixel time series classification accuracy of 84.2%, outperforming Presto and AnySat.
- Segmentation mIoUs of 67.6% on MADOS and 79.4% on PASTIS datasets.
Even smaller Galileo variants (ViT-Nano, ViT-Tiny) achieve competitive results, making the model suitable for resource-limited environments.
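For readers unfamiliar with the mIoU metric quoted in the segmentation results, the sketch below shows one common way to compute it: the intersection over union is calculated per class and then averaged. Benchmarks differ in details such as how classes absent from both prediction and ground truth are handled; skipping them, as done here, is an assumption of this example and not necessarily the protocol used for MADOS or PASTIS.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection-over-union over flattened per-pixel class labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        if union:  # skip classes that appear in neither array (assumption)
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class present in either input")
    return sum(ious) / len(ious)


# Toy example with three classes and six pixels.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(y_true, y_pred, num_classes=3)
print(round(score, 3))  # per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5
```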
Importance of Multimodality
Ablation studies show that removing any single data modality degrades performance, underscoring the value of Galileo’s full multimodal input approach. For example, excluding VIIRS night lights reduces segmentation accuracy significantly.
Open Source and Real-World Impact
All code, model weights, and pretraining data are publicly accessible on GitHub, promoting transparency and community adoption. Galileo supports critical NASA Harvest missions, including global crop mapping, disaster assessment, and marine pollution detection, proving especially valuable where labeled data is scarce.
With its open-source availability, advanced architecture, and superior performance, Galileo sets a new standard in remote sensing AI, empowering global efforts in environmental monitoring and climate resilience.