Multivariate Probabilistic Time Series Forecasting with Informer
A Hugging Face blog post introduces the Informer model for multivariate probabilistic time series forecasting. The post covers the architecture and usage of Informer, which uses a sparse attention mechanism (ProbSparse) to handle long sequences more efficiently than standard Transformers. It demonstrates how to use the model via the Hugging Face Transformers library for forecasting tasks.
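As a rough illustration of the ProbSparse idea mentioned above, the sketch below scores each query by how far its largest scaled dot product sits above its average one, then keeps only the top `u` queries for full attention. It is a simplified, pure-Python sketch, not the Transformers implementation: the function names `sparsity_scores` and `top_u_queries` are hypothetical, and it scores every query against all keys, whereas Informer samples a subset of keys to keep the cost low.

```python
import math

def sparsity_scores(queries, keys):
    """Score each query as max minus mean of its scaled dot products with the keys."""
    d = len(queries[0])
    scores = []
    for q in queries:
        dots = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        scores.append(max(dots) - sum(dots) / len(dots))
    return scores

def top_u_queries(queries, keys, u):
    """Return the indices (in ascending order) of the u highest-scoring queries."""
    scores = sparsity_scores(queries, keys)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:u])

# Hypothetical toy data: three 2-dimensional queries and three keys.
queries = [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]]
keys = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
active = top_u_queries(queries, keys, u=2)
```

Queries with a flat attention distribution (score near zero) contribute little beyond an average of the values, which is why a sparse scheme can skip computing full attention for them.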
Related events (8)
Probabilistic Time Series Forecasting with Transformers
This Hugging Face blog post introduces probabilistic time series forecasting using Transformer-based models available in the Hugging Face ecosystem. It covers the application of attention-based architectures to sequential prediction tasks with uncertainty quantification. The post serves as a tutorial and capability demonstration for time series modeling within the Transformers library.
Yes, Transformers are Effective for Time Series Forecasting (+ Autoformer)
A Hugging Face blog post examines the effectiveness of Transformer architectures for time series forecasting, with a focus on the Autoformer model. The post addresses ongoing debate about whether Transformers are suitable for time series tasks, countering claims that simpler linear models outperform them. It covers the Autoformer architecture's decomposition-based approach and its integration into the Hugging Face ecosystem.
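The decomposition-based approach can be sketched as splitting a series into a smooth trend (a moving average) and a seasonal remainder. The following is a minimal pure-Python sketch under stated assumptions, not Autoformer's actual code: the function name `decompose` is hypothetical, the kernel size is assumed to be odd, and the series ends are padded by repeating the first and last values so the trend keeps the original length.

```python
def decompose(series, kernel_size):
    """Split a series into (trend, seasonal) using a centered moving average."""
    if not series:
        raise ValueError("series must not be empty")
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    half = kernel_size // 2
    # Repeat the edge values so every position has a full window.
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    trend = [
        sum(padded[i:i + kernel_size]) / kernel_size
        for i in range(len(series))
    ]
    seasonal = [x - t for x, t in zip(series, trend)]
    return trend, seasonal

# Hypothetical toy series: a linear ramp has no seasonal part away from the edges.
trend, seasonal = decompose([1.0, 2.0, 3.0, 4.0, 5.0], kernel_size=3)
```

By construction, `trend[i] + seasonal[i]` recovers the original value at every position, so the two components can be modeled separately and recombined.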
PatchTSMixer in HuggingFace
Hugging Face introduces PatchTSMixer, a lightweight MLP-Mixer-based model for multivariate time-series forecasting, now available in the Transformers library. The model is designed for efficient patch-based mixing of temporal and channel information. This integration expands Hugging Face's time-series modeling capabilities alongside the previously added PatchTST model.
Accelerating Hugging Face Transformers with AWS Inferentia2
Hugging Face published a blog post detailing how to accelerate Transformer model inference using AWS Inferentia2, Amazon's second-generation ML inference chip. The post covers integration patterns between the Hugging Face ecosystem and the Neuron SDK for deploying models on Inferentia2 hardware. It serves as a practical guide for enterprise and cloud-based inference deployment on dedicated AI accelerators.
An Overview of Inference Solutions on Hugging Face
Hugging Face published a blog post surveying its inference product offerings as of late 2022. The post covers the range of hosted and API-based inference solutions available on the platform, aimed at helping developers choose appropriate deployment paths. This serves as a reference overview of Hugging Face's inference infrastructure ecosystem at that time.
Accelerate BERT inference with Hugging Face Transformers and AWS Inferentia
This Hugging Face blog post describes how to deploy BERT models on AWS Inferentia chips using the Hugging Face Transformers library and Amazon SageMaker. It covers the workflow for compiling models with AWS Neuron SDK and running optimized inference on Inferentia hardware. The post targets practitioners looking to reduce inference costs and latency for transformer-based NLP workloads.
Patch Time Series Transformer in Hugging Face
Hugging Face has integrated PatchTST, a patch-based Transformer architecture for time series forecasting, into its ecosystem. PatchTST applies the patching concept from vision transformers to time series data, dividing sequences into subseries-level patches as input tokens. The blog post covers usage, fine-tuning, and zero-shot transfer capabilities of the model within the Hugging Face Transformers library.
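The patching step described above can be sketched as a sliding window over a single channel of the series, where each window becomes one input token. This is a minimal pure-Python sketch, not the library's implementation: the function name `make_patches` is hypothetical, and the parameter names `patch_length` and `stride` are assumptions chosen for illustration.

```python
def make_patches(series, patch_length, stride):
    """Divide a 1-D series into fixed-length patches taken every `stride` steps."""
    if patch_length <= 0 or stride <= 0:
        raise ValueError("patch_length and stride must be positive")
    last_start = len(series) - patch_length
    # Windows that would run past the end of the series are dropped.
    return [
        series[start:start + patch_length]
        for start in range(0, last_start + 1, stride)
    ]

# Hypothetical toy series of 10 time steps, patch length 4, stride 2.
patches = make_patches(list(range(10)), patch_length=4, stride=2)
```

With a stride smaller than the patch length the patches overlap, and the number of tokens the model attends over drops from the series length to roughly the series length divided by the stride.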
Optimization story: Bloom inference
This Hugging Face blog post documents practical inference optimization techniques applied to the BLOOM large language model. It covers strategies for reducing latency and memory footprint during deployment, likely including quantization, tensor parallelism, and batching approaches. The post serves as a technical case study for serving very large open-weights models efficiently.

