Beyond Tokenization: Direct Timestep Embedding and Contrastive Alignment for Time-Series Question Answering
Recent events (1)
CADE framework proposes direct timestep embedding and contrastive alignment for time-series question answering
A new arXiv preprint introduces CADE (Contrastive Alignment with Direct Embedding), a framework for time-series question answering (TSQA). CADE bypasses the tokenization bottleneck of standard LLMs by mapping each timestep directly into the LLM embedding space through a point-wise linear encoder followed by an MLP projector. It also introduces a one-directional supervised contrastive loss that aligns time-series embeddings with frozen class-name text anchors, bridging the semantic gap between numerical and language representations. Evaluated on the Time-MQA benchmark across six TSQA tasks, CADE outperforms both open-source and proprietary LLM baselines. The work addresses a concrete limitation of patch-based encoders (fixed granularity and poor cross-dataset transfer) with a cleaner architectural alternative.
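The two mechanisms in the summary can be illustrated with a minimal NumPy sketch. This is not the preprint's implementation. The layer sizes, the ReLU activation, the mean-pooling over timesteps, the temperature, and the reading of "one-directional" as a series-to-anchor cross-entropy with fixed anchors are all assumptions made for illustration, and the function names are hypothetical. Random vectors stand in for the frozen class-name text embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

def pointwise_embed(series, w_enc, b_enc, w1, b1, w2, b2):
    """Map every timestep (one scalar) to its own vector in the target space."""
    x = series[:, None]                   # (T, 1): no patching, one row per timestep
    h = x @ w_enc + b_enc                 # point-wise linear encoder -> (T, d_hidden)
    h = np.maximum(h @ w1 + b1, 0.0)      # MLP projector hidden layer (ReLU is assumed)
    return h @ w2 + b2                    # (T, d_llm)

def one_directional_contrastive_loss(ts_emb, anchors, labels, temperature=0.1):
    """Cross-entropy from series embeddings to fixed class anchors (series -> text only)."""
    ts = ts_emb / np.linalg.norm(ts_emb, axis=1, keepdims=True)
    an = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    logits = ts @ an.T / temperature      # (B, C) scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# Hypothetical sizes, chosen only to keep the example small.
T, d_hidden, d_llm, n_classes = 16, 8, 32, 3
params = (
    rng.normal(size=(1, d_hidden)), np.zeros(d_hidden),
    rng.normal(size=(d_hidden, d_hidden)), np.zeros(d_hidden),
    rng.normal(size=(d_hidden, d_llm)), np.zeros(d_llm),
)
batch = rng.normal(size=(4, T))                  # four toy series of length T
labels = np.array([0, 1, 2, 0])
anchors = rng.normal(size=(n_classes, d_llm))    # stand-in for frozen text anchors

step_emb = np.stack([pointwise_embed(s, *params) for s in batch])  # (4, T, d_llm)
pooled = step_emb.mean(axis=1)                   # assumed pooling to one vector per series
loss = one_directional_contrastive_loss(pooled, anchors, labels)
```

In this sketch the anchors are never updated, so any gradient would flow only into the encoder and projector. The per-timestep output `step_emb` is the sequence that would be handed to the LLM in place of token embeddings.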