3 · arXiv · cs.LG (Machine Learning) · 11d ago

Zero Touch Predictive Orchestration: Automated Time-Series Forecasting for Cloud-Edge Continuum Cold Start

A preprint proposes a fully automated time-series prediction architecture for Cloud-Edge Continuum (CEC) orchestration. It targets the cold-start problem, in which newly discovered edge nodes lack the historical data needed to train localized models. The system pairs a lightweight Resource Exposer for telemetry collection with a novel data-mixing methodology that merges sparse local samples with TimeTrack, a publicly released high-resolution dataset, then feeds the result through a Neural Architecture Search engine to auto-generate baseline models. In the reported experiments, the approach improves MSE, MAE, and MAPE and converges faster than training on local data alone or on generic datasets.
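To make the data-mixing idea and the three reported metrics concrete, here is a minimal Python sketch. The summary does not describe how the preprint actually mixes samples, so `mix_samples`, its `public_ratio` parameter, and the sampling strategy are hypothetical illustrations; only the metric definitions (MSE, MAE, MAPE) are standard.

```python
import random

def mix_samples(local, public, public_ratio=0.8, seed=0):
    """Blend sparse local windows with windows sampled from a larger public pool.

    Hypothetical scheme: keep every local sample and add enough public samples
    that they make up roughly `public_ratio` of the mixed training set.
    """
    if not 0.0 <= public_ratio < 1.0:
        raise ValueError("public_ratio must be in [0, 1)")
    rng = random.Random(seed)
    n_public = round(len(local) * public_ratio / (1.0 - public_ratio))
    n_public = min(n_public, len(public))  # cannot draw more than the pool holds
    mixed = list(local) + rng.sample(list(public), n_public)
    rng.shuffle(mixed)
    return mixed

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error, skipping zero targets to avoid division by zero."""
    terms = [abs((t - p) / t) for t, p in zip(y_true, y_pred) if t != 0]
    return 100.0 * sum(terms) / len(terms)

y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]
print(mse(y_true, y_pred), mae(y_true, y_pred), mape(y_true, y_pred))
```

All three metrics are errors, so "improves" here means lower values. MAPE is scale-free, which makes it the easiest of the three to compare across nodes with different load levels.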

Training Infrastructure · Zero Touch Predictive Orchestration: Automating Time-Series Models for the Cloud-Edge Continuum · Neural Architecture Search · TimeTrack

Related guides (1)

Training Infrastructure · Topic guide

Training Infrastructure: The Compute Arms Race Powering Modern AI

Related events (8)

4 · arXiv · cs.LG · 10d ago

COGENT: Continuous graph emulator with Neural ODEs for long-term physical forecasting on irregular meshes

COGENT is a new architecture combining graph neural networks with Neural Ordinary Differential Equations for continuous-time physical forecasting on irregular geospatial meshes. The model encodes historical system states and forcings into latent dynamics that can be queried at arbitrary future times, avoiding the error accumulation of autoregressive rollout. Evaluated on ice-sheet simulations from the Ice-sheet and Sea-level System Model, COGENT shows improved long-range stability over autoregressive graph baselines. The work introduces training stabilization strategies including rollout-horizon sampling and progressive scheduling.
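The idea of querying latent dynamics at arbitrary future times can be illustrated with a small Python sketch. This is not COGENT's implementation: `latent_dynamics` is a fixed exponential decay standing in for a learned vector field, and the RK4 integrator and `max_step` parameter are assumptions chosen for brevity.

```python
import math

def latent_dynamics(z, t):
    """Stand-in for a learned vector field dz/dt = f(z, t); here plain exponential decay."""
    return [-0.5 * zi for zi in z]

def rk4_step(f, z, t, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(z, t)
    k2 = f([zi + 0.5 * h * k for zi, k in zip(z, k1)], t + 0.5 * h)
    k3 = f([zi + 0.5 * h * k for zi, k in zip(z, k2)], t + 0.5 * h)
    k4 = f([zi + h * k for zi, k in zip(z, k3)], t + h)
    return [
        zi + (h / 6.0) * (a + 2.0 * b + 2.0 * c + d)
        for zi, a, b, c, d in zip(z, k1, k2, k3, k4)
    ]

def query(z0, t_query, max_step=0.05):
    """Integrate the latent state from t=0 to any t_query >= 0, not only grid times."""
    n_steps = max(1, math.ceil(t_query / max_step))
    h = t_query / n_steps
    z, t = list(z0), 0.0
    for _ in range(n_steps):
        z = rk4_step(latent_dynamics, z, t, h)
        t += h
    return z

# The query time does not have to align with any fixed output interval.
print(query([1.0, 2.0], 1.37))
```

An autoregressive model, by contrast, would feed each predicted state back in as the next input, which is where the error accumulation mentioned above comes from.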

Neural Ordinary Differential Equations · Ice-sheet and Sea-level System Model · COGENT

4 · arXiv · cs.AI · 5d ago

Benchmark of deep learning architectures for multi-horizon behavioural forecasting in mobile health

A new arXiv preprint benchmarks six deep learning architectures, two zero-shot foundation models, and statistical baselines on multi-horizon behavioural forecasting from wearable and smartphone data across 800+ participants. Key findings include: no single architecture dominates (PatchTST leads among trained models), TimesFM matches or exceeds trained models zero-shot especially in low-data regimes, and participant-level fine-tuning reduces per-feature RMSE by 16–60%. The study is the first to jointly evaluate modern deep learning, foundation models, and personalisation for this domain.

Evaluation and Benchmarking · A Comparative Study of Deep Learning Architectures for Multi-Horizon Behavioural Forecasting for Mobile Health · TimesFM · TCN

6 · arXiv · cs.LG · 17d ago

q0: Hyper-Epoch Pretraining turns multi-epoch budgets into diverse model populations for better generalization

A new arXiv preprint introduces hyper-epoch pretraining (q0), a framework that reframes multi-epoch training as exploration of a model population rather than refinement of a single model. The approach uses three primitives—cyclic schedules with anti-correlated learning rate and weight decay, chain distillation, and a learned prior for inference-time weighting—to achieve lower validation loss than single-model training. On a 1.8B-parameter model trained on FineWeb, q0 matches a 256-epoch ensemble baseline using only ~56 epochs (~4.6× fewer), with cumulative ~12.9× data efficiency under the Slowrun setting. The work directly addresses the emerging regime where compute scales faster than high-quality data supply.
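As a rough illustration of a cyclic schedule with anti-correlated learning rate and weight decay, the Python sketch below drives both values from one cosine wave so that one peaks when the other bottoms out. The cosine shape, the function name `cyclic_schedule`, and all numeric ranges are assumptions; the summary does not state the schedule q0 actually uses. The last lines check the quoted epoch ratio.

```python
import math

def cyclic_schedule(step, cycle_len, lr_min=1e-4, lr_max=1e-3,
                    wd_min=0.01, wd_max=0.1):
    """Return (learning_rate, weight_decay) for a training step.

    Hypothetical form: a cosine wave restarts every `cycle_len` steps.
    The learning rate follows the wave; weight decay follows its mirror image.
    """
    phase = (step % cycle_len) / cycle_len
    wave = 0.5 * (1.0 + math.cos(2.0 * math.pi * phase))  # 1 at cycle start, 0 mid-cycle
    lr = lr_min + (lr_max - lr_min) * wave
    wd = wd_max - (wd_max - wd_min) * wave
    return lr, wd

for step in (0, 25, 50, 75, 100):
    lr, wd = cyclic_schedule(step, cycle_len=100)
    print(step, round(lr, 6), round(wd, 4))

# Epoch ratio quoted in the summary: a 256-epoch baseline versus ~56 epochs.
print(round(256 / 56, 1))
```

Because both quantities depend on the same wave with opposite signs, they are perfectly negatively correlated within a cycle, which is one simple way to realize "anti-correlated".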

Training Infrastructure · Open Weights Progress · FineWeb · q0: Primitives for Hyper-Epoch Pretraining

5 · arXiv · cs.AI · 4d ago

HAMON: Passive diffractive optical system for long-horizon time-series forecasting

HAMON is a proposed forecasting architecture that replaces learned digital sequence-mixing layers with a passive diffractive optical core: historical values are encoded onto an optical aperture, and cascaded trainable phase masks combined with free-space diffraction produce the forecast in the output field. At inference, prediction requires only a single passive optical propagation pass with no digital temporal mixing. The system improves MSE by up to 14% over strong digital baselines on the ETTm2 and ETTh2 benchmarks, though it trails on high-channel-count datasets such as Traffic and Electricity. The work raises a substrate-level question about whether forecasting operators need to be implemented digitally at all, and defines a concrete target for optical computing hardware.

Evaluation and Benchmarking · Inference Economics · HAMON · ETTh2 · TorchOptics

4 · Hugging Face Blog · 1mo ago

Probabilistic Time Series Forecasting with Transformers

This Hugging Face blog post introduces probabilistic time series forecasting using Transformer-based models available in the Hugging Face ecosystem. It covers the application of attention-based architectures to sequential prediction tasks with uncertainty quantification. The post serves as a tutorial and capability demonstration for time series modeling within the Transformers library.

Agent and Tool Ecosystem · Probabilistic Time Series Forecasting · Hugging Face Transformers · Hugging Face

7 · arXiv · cs.AI · 1mo ago

Toto 2.0: Open-Weights Time Series Foundation Models Demonstrate Scaling Laws from 4M to 2.5B Parameters

Datadog releases Toto 2.0, a family of five open-weights time series forecasting models ranging from 4M to 2.5B parameters, demonstrating consistent forecast quality improvements with scale. The models achieve state-of-the-art results on three benchmarks: BOOM (observability), GIFT-Eval (general-purpose), and TIME (contamination-resistant). The release includes architectural details, a u-muP hyperparameter transfer pipeline, and all base checkpoints under the Apache 2.0 license.

Training Infrastructure · Frontier Model Releases · Toto 2.0 · GIFT-Eval · TIME

5 · arXiv · cs.AI · 18d ago

LLM Agent Framework for Last-Mile Time Series Forecasting Revision

This paper introduces a 'last-mile forecasting' framework where an LLM agent sits atop a statistical forecasting backbone to incorporate weakly structured business context—holidays, campaigns, expert feedback, external events—into decision-ready forecasts. The system uses tool-invocation for contextual retrieval, converts reasoning into explicit revision actions under safety constraints, and supports long-horizon forecasting via map-reduce decomposition with a memory bank for post-hoc reflection. The authors validate the approach through real-world case studies, positioning it as a bridge between statistical prediction and operationally usable forecasts.

Enterprise Deployment Patterns · Agent and Tool Ecosystem · Map-Reduce Decomposition · Last-Mile Forecasting Framework · Time Series Foundation Models

5 · Hugging Face Blog · 1mo ago

Back to The Future: Evaluating AI Agents on Predicting Future Events

This Hugging Face blog post introduces FutureBench, a benchmark designed to evaluate AI agents on their ability to predict future events, addressing the challenge of data contamination in standard benchmarks by using temporally forward-looking tasks. The approach tests whether agents can reason about and forecast outcomes beyond their training data cutoff. This framing positions future-event prediction as a rigorous, contamination-resistant evaluation methodology for frontier models and agents.

Evaluation and Benchmarking · Agent and Tool Ecosystem · FutureBench · Hugging Face