3·Hugging Face Blog·1mo ago

Multivariate Probabilistic Time Series Forecasting with Informer

A Hugging Face blog post introduces the Informer model for multivariate probabilistic time series forecasting. The post covers the architecture and usage of Informer, whose sparse attention mechanism (ProbSparse) computes full attention only for the most informative queries, so it handles long sequences more efficiently than a standard Transformer. It also demonstrates how to use the model for forecasting tasks through the Hugging Face Transformers library.
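To make the ProbSparse idea concrete, here is a minimal single-head sketch in plain Python. It is not the code from the blog post or from the Transformers library. The function names and the parameter `u` (the number of queries that receive full attention) are illustrative assumptions. For clarity it scores every query against every key, so it shows only the query-selection logic and not the efficiency gain, which in the actual method depends on estimating the sparsity measure from a sample of keys.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def probsparse_attention(Q, K, V, u):
    """Simplified ProbSparse attention (one head, no masking, no key sampling).

    Q, K, V are lists of equal-width float vectors; u is the hypothetical
    number of "active" queries that get full softmax attention.
    Returns (outputs, sorted indices of the active queries).
    """
    scale = 1.0 / math.sqrt(len(Q[0]))
    # Scaled dot-product scores: one row per query, one column per key.
    scores = [[scale * sum(a * b for a, b in zip(q, k)) for k in K] for q in Q]
    # Sparsity measure per query: max score minus mean score.
    sparsity = [max(row) - sum(row) / len(row) for row in scores]
    # Keep the u queries whose score distribution is least uniform.
    ranked = sorted(range(len(Q)), key=lambda i: sparsity[i], reverse=True)
    active = set(ranked[:u])
    # The remaining "lazy" queries fall back to the mean of the values.
    mean_v = [sum(col) / len(V) for col in zip(*V)]
    outputs = []
    for i, row in enumerate(scores):
        if i in active:
            weights = softmax(row)
            outputs.append([sum(w * v for w, v in zip(weights, col)) for col in zip(*V)])
        else:
            outputs.append(list(mean_v))
    return outputs, sorted(active)
```

A query whose scores are nearly uniform across keys contributes little beyond an average of the values, which is why the sketch replaces its output with the mean of `V`.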

Related events (8)

4·Hugging Face Blog·1mo ago

Probabilistic Time Series Forecasting with Transformers

This Hugging Face blog post introduces probabilistic time series forecasting using Transformer-based models available in the Hugging Face ecosystem. It covers the application of attention-based architectures to sequential prediction tasks with uncertainty quantification. The post serves as a tutorial and capability demonstration for time series modeling within the Transformers library.

3·Hugging Face Blog·1mo ago

Yes, Transformers are Effective for Time Series Forecasting (+ Autoformer)

A Hugging Face blog post examines the effectiveness of Transformer architectures for time series forecasting, with a focus on the Autoformer model. The post addresses ongoing debate about whether Transformers are suitable for time series tasks, countering claims that simpler linear models outperform them. It covers the Autoformer architecture's decomposition-based approach and its integration into the Hugging Face ecosystem.
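As a rough illustration of a decomposition-based approach, the sketch below splits a series into a trend part (a moving average) and a seasonal part (the remainder). The function name `decompose`, the `kernel_size` parameter, and the choice to pad the ends by repeating the boundary values are assumptions made for this example, not code taken from the post.

```python
def decompose(series, kernel_size=3):
    """Split a series into (trend, seasonal) using a centered moving average.

    The ends are padded by repeating the first and last values so that the
    trend has the same length as the input. kernel_size must be odd.
    """
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    if not series:
        raise ValueError("series must not be empty")
    pad = (kernel_size - 1) // 2
    padded = [series[0]] * pad + list(series) + [series[-1]] * pad
    # Trend: average of each window of kernel_size consecutive padded values.
    trend = [sum(padded[i:i + kernel_size]) / kernel_size for i in range(len(series))]
    # Seasonal: whatever the moving average does not explain.
    seasonal = [x - t for x, t in zip(series, trend)]
    return trend, seasonal
```

By construction the two parts add back up to the original series, so a model can process them separately and recombine the results.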

4·Hugging Face Blog·1mo ago

PatchTSMixer in HuggingFace

Hugging Face introduces PatchTSMixer, a lightweight MLP-Mixer-based model for multivariate time-series forecasting, now available in the Transformers library. The model is designed for efficient patch-based mixing of temporal and channel information. This integration expands Hugging Face's time-series modeling capabilities alongside the previously added PatchTST model.

4·Hugging Face Blog·1mo ago

Accelerating Hugging Face Transformers with AWS Inferentia2

Hugging Face published a blog post detailing how to accelerate Transformer model inference using AWS Inferentia2, Amazon's second-generation ML inference chip. The post covers integration patterns between the Hugging Face ecosystem and the Neuron SDK for deploying models on Inferentia2 hardware. It is a practical guide to deploying inference on dedicated AI accelerators in enterprise and cloud settings.

3·Hugging Face Blog·1mo ago

An Overview of Inference Solutions on Hugging Face

Hugging Face published a blog post surveying its inference product offerings as of late 2022. The post covers the hosted and API-based inference solutions available on the platform and aims to help developers choose an appropriate deployment path.

3·Hugging Face Blog·1mo ago

Accelerate BERT inference with Hugging Face Transformers and AWS Inferentia

This Hugging Face blog post describes how to deploy BERT models on AWS Inferentia chips using the Hugging Face Transformers library and Amazon SageMaker. It covers the workflow for compiling models with AWS Neuron SDK and running optimized inference on Inferentia hardware. The post targets practitioners looking to reduce inference costs and latency for transformer-based NLP workloads.

4·Hugging Face Blog·1mo ago

Patch Time Series Transformer in Hugging Face

Hugging Face has integrated PatchTST, a patch-based Transformer architecture for time series forecasting, into its ecosystem. PatchTST applies the patching concept from vision transformers to time series data, dividing sequences into subseries-level patches as input tokens. The blog post covers usage, fine-tuning, and zero-shot transfer capabilities of the model within the Hugging Face Transformers library.
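The patching step can be sketched in a few lines of plain Python. This is a simplified illustration for a single univariate series, not the library's implementation. The function name `make_patches` and the parameters `patch_length` and `stride` are assumptions chosen for this example, and trailing values that do not fill a complete patch are dropped here rather than padded.

```python
def make_patches(series, patch_length, stride):
    """Cut a series into (possibly overlapping) fixed-length patches.

    Each patch plays the role of one input token. Values at the end that
    do not fill a whole patch are dropped in this simplified version.
    """
    if patch_length < 1 or stride < 1:
        raise ValueError("patch_length and stride must be positive")
    if len(series) < patch_length:
        raise ValueError("series is shorter than one patch")
    last_start = len(series) - patch_length
    return [
        list(series[start:start + patch_length])
        for start in range(0, last_start + 1, stride)
    ]
```

With a stride smaller than the patch length the patches overlap. With a stride equal to the patch length they tile the series, and the number of tokens falls by roughly a factor of the patch length compared with one token per time step.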

4·Hugging Face Blog·1mo ago

Optimization story: Bloom inference

This Hugging Face blog post documents practical inference optimization techniques applied to the BLOOM large language model. It covers strategies for reducing latency and memory footprint during deployment, likely including quantization, tensor parallelism, and batching approaches. The post serves as a technical case study for serving very large open-weights models efficiently.