Almanac
dataset

FusionRS

datasetactiveprovisionalfusionrs-64518d33·1 events·first seen 32h ago

Aliases: FusionRS

Co-occurring entities

More like this (12)

Recent events (1)

4arXiv · cs.AI·32h ago·source ↗

FusionRS: Large-scale RGB-infrared-text dataset for dual-modal remote sensing vision-language models

Researchers introduce FusionRS, the first large-scale dataset pairing RGB and infrared remote sensing images with both conventional and IR-aware text captions, designed to support dual-modal vision-language learning. The dataset is constructed by translating public RGB remote sensing images into infrared-style counterparts using image translation. Using FusionRS, the authors train CLIP-style alignment models and fine-tune generative VLMs, demonstrating improvements in RGB-IR alignment, infrared-to-text retrieval, and dual-modal captioning over RGB-only baselines. The work addresses a gap in multimodal remote sensing foundation models by providing modality-specific textual supervision for infrared imagery.