In remote sensing and materials science, identifying the true chemical or magnetic state of a surface is often hindered by optically thick atmospheric layers or bulk substrate interference. Traditional physical models struggle to decouple these multi-layered scattering effects without immense computational cost. By treating hyperspectral datacubes as hierarchical structures, in which local pixel spectra act as words and regional atmospheric patterns as paragraphs, we can use multi-scale neural architectures to filter out global interference and isolate hidden surface states.
Approach
We propose a Hierarchical Spectral-Spatial Network (HSSN) that adapts pyramid salient-aware architectures to disentangle surface signals from atmospheric haze. Similar to how the [Self-Adaptive Hierarchical Sentence Model](/paper/art_22b2a809e9b1458193736104683a6a96) constructs a multi-scale pyramid to gate relevant text features, HSSN first encodes local spectral absorption bands using 1D convolutions, then aggregates the resulting features into regional patches and passes them through a transformer layer to model global atmospheric scattering. By applying a global attention mechanism akin to [Towards Causal Explanation Detection with Pyramid Salient-Aware Network](/paper/art_4fa83d5f524b4e88a21da5f1964ea807), the model dynamically weights the low-frequency atmospheric context and subtracts it from the local signal. This allows the network to isolate intrinsic surface reflectance without relying on computationally heavy radiative transfer modeling.
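The three stages described above (per-pixel spectral encoding, patch-level attention, gated subtraction of the global context) can be illustrated with a small forward-pass sketch in NumPy. This is not the proposed implementation: the function names, the parameter dictionary, the patch size, the sigmoid gate, and the random untrained weights are all assumptions made for illustration, and a single self-attention step stands in for the full transformer layer.

```python
import numpy as np

def conv1d_spectral(cube, kernels):
    """Slide each kernel along the band axis of every pixel (cross-correlation).

    cube: (H, W, B) datacube; kernels: (F, K) filters.
    Returns (H, W, F) features after ReLU and mean pooling over band positions.
    """
    k = kernels.shape[1]
    # Windows over the band axis: shape (H, W, B - K + 1, K)
    windows = np.lib.stride_tricks.sliding_window_view(cube, k, axis=2)
    response = np.einsum("hwnk,fk->hwnf", windows, kernels)
    return np.maximum(response, 0.0).mean(axis=2)

def patch_tokens(feats, patch):
    """Average features over non-overlapping patch x patch regions -> (P, F)."""
    h, w, f = feats.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    pooled = feats.reshape(h // patch, patch, w // patch, patch, f).mean(axis=(1, 3))
    return pooled.reshape(-1, f)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product attention over the patch tokens."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return attn @ v, attn

def hssn_forward(cube, params, patch=4):
    """Hypothetical HSSN forward pass: local features minus gated global context."""
    local = conv1d_spectral(cube, params["kernels"])            # (H, W, F)
    tokens = patch_tokens(local, patch)                         # (P, F)
    context, attn = self_attention(tokens, params["wq"], params["wk"], params["wv"])
    h, w, f = local.shape
    # Broadcast each patch's context vector back to the pixels it covers.
    ctx = context.reshape(h // patch, w // patch, f)
    ctx = np.repeat(np.repeat(ctx, patch, axis=0), patch, axis=1)
    # Sigmoid gate decides, per pixel and feature, how much context to remove.
    gate_in = np.concatenate([local, ctx], axis=-1) @ params["wg"]
    gate = 1.0 / (1.0 + np.exp(-gate_in))
    return local - gate * ctx, attn

# Toy usage with random data and untrained weights (assumed sizes).
rng = np.random.default_rng(0)
H, W, B, F, K = 8, 8, 32, 6, 5
cube = rng.random((H, W, B))
params = {
    "kernels": rng.standard_normal((F, K)) * 0.1,
    "wq": rng.standard_normal((F, F)) * 0.1,
    "wk": rng.standard_normal((F, F)) * 0.1,
    "wv": rng.standard_normal((F, F)) * 0.1,
    "wg": rng.standard_normal((2 * F, F)) * 0.1,
}
surface, attn = hssn_forward(cube, params)
print(surface.shape, attn.shape)
```

In this sketch the gate is what makes the subtraction adaptive: pixels whose local features already differ strongly from the regional context can keep more of their own signal. A trained model would learn all weights end to end, which the sketch does not attempt.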
Experimental Plan
We will evaluate HSSN on the Cassini VIMS (Visual and Infrared Mapping Spectrometer) dataset of Titan, specifically targeting mid-latitude regions where surface albedo is heavily obscured by organic haze and methane absorption in the nitrogen-methane atmosphere. Our primary hypothesis is that HSSN will recover surface features with higher structural similarity than flat sequence models, while running orders of magnitude faster than traditional radiative transfer codes such as those used in [Titan’s mid-latitude surface regions with Cassini VIMS and RADAR](/paper/art_55995e309f674695aa94cf72fad33c71). Baselines will include standard 3D-CNNs, flat Vision Transformers, and a traditional correlated-k radiative transfer model. Performance will be measured using Mean Squared Error (MSE) against atmospherically corrected reference spectra and the Structural Similarity Index (SSIM) of the reconstructed geomorphological units.
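As a rough illustration of the two metrics named above, the following sketch computes MSE and a simplified SSIM in NumPy. It assumes reflectance values scaled to [0, 1] and uses whole-image statistics rather than the sliding local windows of the standard SSIM definition, so it is a simplification and not the evaluation code; the function names and the synthetic test data are hypothetical.

```python
import numpy as np

def mse(pred, ref):
    """Mean squared error between recovered and reference arrays."""
    return float(np.mean((pred - ref) ** 2))

def global_ssim(x, y, data_range=1.0):
    """SSIM computed from whole-image statistics (single window).

    Uses the usual stabilizing constants c1 = (0.01 L)^2 and c2 = (0.03 L)^2,
    where L is the data range. Standard SSIM averages this over local windows.
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return float(num / den)

# Synthetic example: a reference band and a noisy "recovered" version of it.
rng = np.random.default_rng(0)
reference = rng.random((16, 16))
recovered = np.clip(reference + 0.05 * rng.standard_normal((16, 16)), 0.0, 1.0)
print(mse(recovered, reference), global_ssim(recovered, reference))
```

For a hyperspectral cube, one option is to apply `global_ssim` band by band and average the results; whether to do that or to score only selected atmospheric-window bands is a design choice the plan leaves open.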