Abstract: A hyperspectral image (HSI) records rich spectral-spatial information about land cover, while light detection and ranging (LiDAR) data capture elevation and structural elements. HSI and ...
Abstract: This paper proposes a novel framework utilizing multimodal large language models (MLLMs) for referring video object segmentation (RefVOS). Previous MLLM-based methods commonly struggle with ...