Entity · product

PyTorch

product · active · pytorch-c54a4cdc · 17 events · first seen May 18, 2026

Aliases: PyTorch

Co-occurring entities

More like this (12)

PyTorch Foundation · TensorFlow · PyTorch/XLA · PyTorch DDP · PyTorch FSDP · Python · NVIDIA · Google TPU · scikit-learn · Blender · CUDA · PyQu

Recent events (17)

5 · arXiv · cs.LG · 4d ago

Bloomberg releases Causal-TS: open-source Python library for causal discovery in time series

Bloomberg has open-sourced Causal-TS, a Python library for causal discovery in high-dimensional and nonstationary multivariate time series. The library implements four specialized algorithms (CDNOTS, CDNOTS+, CEDAR, GRACE) plus wrappers for established methods, and its unified conditional-independence test layer is GPU-accelerated via PyTorch. It also includes a regime discovery pipeline for structural breaks, synthetic data generators, CLI tooling, and optional DoWhy integration for end-to-end causal effect estimation.

Agent and Tool Ecosystem · Causal-TS · Bloomberg · PyTorch

4 · Hugging Face Blog · Jul 10, 2026

Hugging Face blog: profiling attention mechanisms in PyTorch (Part 3)

A Hugging Face blog post (Part 3 of a series) covers profiling attention operations in PyTorch, targeting practitioners who want to understand and optimize attention layer performance. The post is focused on inference/training infrastructure tooling relevant to LLM workloads. It is a practical technical guide rather than a novel research finding.

Training Infrastructure · Inference Economics · Hugging Face · PyTorch

5 · arXiv · cs.AI · Jun 18, 2026

NeSyCat Torch: Differentiable tensor framework unifying neurosymbolic semantics via monadic abstraction

NeSyCat Torch extends the NeSyCat/ULLER neurosymbolic framework with neural network support for predicates and functions, implemented via probabilistic programming and tensor backends (HaskTorch, JAX, PyTorch). The key technical contribution is a lazy log-tensor monad over the log-semiring enabling numerically stable, differentiable training, alongside a batch monad for efficient batched inference. On MNIST addition benchmarks, the implementations outperform LTN and DeepProbLog in speed and accuracy while remaining within a uniform categorical framework that generalizes across first-order neurosymbolic approaches. The work positions itself as a unifying foundation for classical, fuzzy, probabilistic, and neural truth semantics.

Evaluation and Benchmarking · ULLER · NeSyCat Torch · DeepProbLog
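
The log-semiring mentioned in the summary above can be illustrated with a minimal scalar sketch: semiring "addition" is a stable log-sum-exp and semiring "multiplication" is ordinary addition of log-values. This is only an illustration of the general idea under stated assumptions; the helper names `log_add`, `log_mul`, and `log_sum` are hypothetical, and the paper's lazy tensor monad is not reproduced here.

```python
import math

NEG_INF = float("-inf")  # log(0): the additive identity of the log-semiring


def log_add(a: float, b: float) -> float:
    """Semiring 'addition': log(exp(a) + exp(b)), computed stably."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    hi, lo = (a, b) if a >= b else (b, a)
    # Factor out the larger term so exp() is only applied to a value <= 0.
    return hi + math.log1p(math.exp(lo - hi))


def log_mul(a: float, b: float) -> float:
    """Semiring 'multiplication': log(exp(a) * exp(b)) = a + b."""
    return a + b


def log_sum(values) -> float:
    """Fold log_add over an iterable of log-probabilities."""
    total = NEG_INF
    for v in values:
        total = log_add(total, v)
    return total
```

Working in log-space this way keeps very small probabilities (for example, log-values near -1000) representable, where exponentiating them directly would underflow to zero.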

4 · Hugging Face Blog · Jun 11, 2026

Hugging Face blog: Profiling PyTorch nn.Linear toward a fused MLP implementation

A Hugging Face blog post (Part 2 of a profiling series) walks through optimizing PyTorch's nn.Linear layers toward a fused MLP kernel. The post covers profiling methodology and kernel fusion techniques relevant to inference and training efficiency. This is a practical deep-dive into low-level PyTorch optimization for ML practitioners.

Training Infrastructure · Inference Economics · Hugging Face · PyTorch

6 · arXiv · cs.AI · May 21, 2026

torchtune: PyTorch Native Post-Training Library for LLMs

Meta's PyTorch team introduces torchtune, a PyTorch-native library for post-training LLMs that emphasizes modularity, hackability, and direct access to underlying PyTorch components. The library supports fine-tuning, experimentation, and deployment-oriented workflows across distributed training settings. Benchmarked against popular frameworks Axolotl and Unsloth, torchtune demonstrates competitive performance and memory efficiency while maintaining flexibility for research iteration. The paper presents design principles, model builders, training recipes, and distributed training stack details.

Training Infrastructure · Open Weights Progress · Unsloth · Axolotl · torchtune

5 · OpenAI Blog · May 20, 2026

OpenAI standardizes on PyTorch

OpenAI announced in January 2020 that it was standardizing its deep learning work on PyTorch. The move consolidated the organization's tooling on a single, widely adopted open-source framework in place of other frameworks. It signaled organizational alignment on tooling infrastructure for future research and development.

Training Infrastructure · Agent and Tool Ecosystem · PyTorch · OpenAI

4 · Hugging Face Blog · May 19, 2026

Block Sparse Matrices for Smaller and Faster Language Models

This Hugging Face blog post introduces block sparse matrix techniques as a way to reduce the size and improve the inference speed of language models. Block sparsity enforces structured zero patterns in weight matrices: whole tiles are zero, which makes sparse operations more hardware-friendly than unstructured sparsity does. The post likely covers implementation details and benchmarks showing efficiency gains for transformer-based models.

Training Infrastructure · Inference Economics · block sparse matrices · Hugging Face · PyTorch
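
To make the idea of block sparsity concrete, here is a minimal pure-Python sketch of a matrix-vector product that stores and multiplies only the nonzero blocks. It is not the implementation from the blog post; the dictionary layout and the name `block_sparse_matvec` are assumptions chosen for readability, and real kernels would run on dense tiles in optimized GPU or CPU code.

```python
from typing import Dict, List, Tuple

Block = List[List[float]]  # one dense block_size x block_size tile


def block_sparse_matvec(
    blocks: Dict[Tuple[int, int], Block],
    x: List[float],
    n_rows: int,
    block_size: int,
) -> List[float]:
    """Compute y = W @ x where W stores only its nonzero blocks.

    `blocks` maps (block_row, block_col) to a dense tile; absent keys are
    all-zero tiles and cost neither memory nor compute.
    """
    y = [0.0] * n_rows
    for (bi, bj), tile in blocks.items():
        row0, col0 = bi * block_size, bj * block_size
        for r in range(block_size):
            acc = 0.0
            for c in range(block_size):
                acc += tile[r][c] * x[col0 + c]
            y[row0 + r] += acc
    return y


# A 4x4 matrix with 2x2 blocks; only 2 of the 4 blocks are stored.
W_blocks = {
    (0, 0): [[1.0, 2.0], [3.0, 4.0]],
    (1, 1): [[5.0, 6.0], [7.0, 8.0]],
}
y = block_sparse_matvec(W_blocks, [1.0, 1.0, 1.0, 1.0], n_rows=4, block_size=2)
```

Because every stored tile is dense, the inner loops access memory contiguously, which is the property that makes block patterns easier to accelerate than scattered individual zeros.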

5 · Hugging Face Blog · May 19, 2026

Introducing 🤗 Accelerate

Hugging Face introduced Accelerate, a library designed to simplify distributed training of PyTorch models across multiple GPUs and TPUs with minimal code changes. The library abstracts away the complexity of multi-device training setups, allowing researchers to scale training with a few lines of code. This was a notable contribution to the ML training infrastructure ecosystem at the time of release.

Training Infrastructure · Agent and Tool Ecosystem · Accelerate · Hugging Face · PyTorch

4 · Hugging Face Blog · May 19, 2026

Accelerate Large Model Training using PyTorch Fully Sharded Data Parallel

This Hugging Face blog post explains how to use PyTorch's Fully Sharded Data Parallel (FSDP) to train large models that exceed single-GPU memory limits. It covers the integration of FSDP with the Hugging Face Accelerate library, enabling distributed sharding of model parameters, gradients, and optimizer states across multiple GPUs. The post provides practical guidance on configuration and usage for scaling large model training.

Training Infrastructure · PyTorch FSDP · Hugging Face · Hugging Face Accelerate
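
As a rough illustration of why sharding parameters, gradients, and optimizer states helps, the sketch below estimates per-GPU memory for that state only. The byte counts (fp32 weights and gradients, Adam with two fp32 moments) and the function name `per_gpu_state_bytes` are assumptions for illustration, not figures from the blog post. Activations, buffers, and the temporarily gathered parameters that FSDP needs during forward and backward passes are ignored.

```python
def per_gpu_state_bytes(
    n_params: int,
    n_gpus: int,
    bytes_per_param: int = 4,        # assumption: fp32 weights
    bytes_per_grad: int = 4,         # assumption: fp32 gradients
    bytes_per_optim_state: int = 8,  # assumption: Adam, two fp32 moments
    sharded: bool = True,
) -> float:
    """Rough per-GPU bytes for parameters + gradients + optimizer state."""
    total = n_params * (bytes_per_param + bytes_per_grad + bytes_per_optim_state)
    # Full sharding splits all three kinds of state evenly across GPUs;
    # plain data parallelism keeps a complete copy on every GPU.
    return total / n_gpus if sharded else float(total)


GIB = 1024 ** 3
# Hypothetical example: a 7-billion-parameter model on 8 GPUs.
replicated = per_gpu_state_bytes(7_000_000_000, 8, sharded=False) / GIB
fully_sharded = per_gpu_state_bytes(7_000_000_000, 8, sharded=True) / GIB
```

Under these assumptions the replicated setup needs roughly 104 GiB of state per GPU, while the fully sharded setup needs roughly 13 GiB, which is the gap that lets models larger than a single GPU's memory be trained at all.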

5 · Hugging Face Blog · May 19, 2026

How Hugging Face Accelerate Runs Very Large Models Thanks to PyTorch

This Hugging Face blog post explains the technical mechanisms behind the Accelerate library for running large models that exceed single-GPU memory, leveraging PyTorch features such as device maps, CPU/disk offloading, and sharded checkpoints. It describes how models can be distributed across multiple GPUs, CPU RAM, and disk storage transparently. The post serves as both documentation and a technical explainer for practitioners working with large-scale inference and deployment.

Training Infrastructure · Inference Economics · Hugging Face · Hugging Face Accelerate · PyTorch

4 · Hugging Face Blog · May 19, 2026

Accelerating PyTorch Transformers with Intel Sapphire Rapids - Part 1

This Hugging Face blog post covers hardware-level inference acceleration for PyTorch Transformer models using Intel's Sapphire Rapids Xeon processors. It likely details how AVX-512 and the new AMX (Advanced Matrix Extensions) instructions in Sapphire Rapids can speed up transformer workloads without requiring GPU hardware. The post is part one of a series, suggesting a practical, tutorial-oriented treatment of CPU-based inference optimization.

Inference Economics · Enterprise Deployment Patterns · Advanced Matrix Extensions (AMX) · Intel Sapphire Rapids · Hugging Face

4 · Hugging Face Blog · May 19, 2026

Accelerating PyTorch Transformers with Intel Sapphire Rapids - Part 2

This Hugging Face blog post covers inference optimization techniques for PyTorch Transformer models on Intel Sapphire Rapids (4th Gen Xeon) CPUs. It likely demonstrates performance gains using hardware-specific features such as AMX (Advanced Matrix Extensions) and BF16 support. The post is part of a series focused on making transformer inference more efficient on Intel server hardware without requiring GPU acceleration.

Inference Economics · Enterprise Deployment Patterns · Advanced Matrix Extensions (AMX) · Intel Sapphire Rapids · Hugging Face

5 · Hugging Face Blog · May 19, 2026

Quanto: a PyTorch quantization backend for Optimum

Hugging Face introduced Quanto, a new PyTorch-based quantization backend integrated into the Optimum library. Quanto supports multiple quantization schemes and data types, targeting efficient inference for large language models and other neural networks. The tool is designed to work across hardware backends and integrates with the Hugging Face ecosystem.

Inference Economics · Agent and Tool Ecosystem · Optimum · Quanto · Hugging Face

5 · Hugging Face Blog · May 19, 2026

Accelerate 1.0.0 Released

Hugging Face has released Accelerate 1.0.0, the library's first stable major version. Accelerate is a widely used PyTorch training library that abstracts distributed training across hardware configurations, including multi-GPU, TPU, and mixed-precision setups. The 1.0.0 milestone signals API stability and production readiness for the training infrastructure ecosystem.

Training Infrastructure · Open Weights Progress · Accelerate · Hugging Face · PyTorch

4 · Hugging Face Blog · May 19, 2026

Visualize and Understand GPU Memory in PyTorch

A Hugging Face blog post explains how to visualize and analyze GPU memory usage during PyTorch model training. The post covers tools and techniques for understanding memory allocation patterns, helping practitioners diagnose and reduce memory bottlenecks. This is practical infrastructure knowledge relevant to training large models efficiently.

Training Infrastructure · Inference Economics · Hugging Face · GPU memory visualization · PyTorch

4 · Hugging Face Blog · May 19, 2026

nanoVLM: Minimal Pure-PyTorch Repository for Training Vision-Language Models

Hugging Face published nanoVLM, a minimal open-source repository designed to make training vision-language models (VLMs) as simple as possible using pure PyTorch. The project aims to lower the barrier to entry for VLM research and experimentation by providing a clean, readable codebase without heavy abstractions. It follows in the tradition of educational ML repositories like nanoGPT, targeting researchers and practitioners who want to understand or customize VLM training from scratch.

Open Weights Progress · Agent and Tool Ecosystem · nanoGPT · nanoVLM · Hugging Face

5 · Hugging Face Blog · May 18, 2026

Safetensors is Joining the PyTorch Foundation

The safetensors format, developed by Hugging Face as a secure and fast alternative to pickle-based model serialization, is being adopted under the PyTorch Foundation. This move formalizes safetensors as part of the broader PyTorch ecosystem, signaling growing standardization around safe model weight storage. The transition reflects increasing industry concern about supply-chain security in ML model distribution.

Training Infrastructure · Open Weights Progress · PyTorch Foundation · safetensors · Hugging Face
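
To show why a safetensors-style file avoids the risks of pickle, the toy sketch below builds and parses the basic layout using only the standard library: an 8-byte little-endian header length, a JSON header describing each tensor, then raw tensor bytes. Loading is pure parsing and slicing, so no code from the file is executed. The helper names `pack_toy_file`, `read_header`, and `read_tensor_bytes` are hypothetical, and the sketch omits the validation and metadata handling of the real `safetensors` library, which is what should be used in practice.

```python
import json
import struct


def pack_toy_file(tensors: dict) -> bytes:
    """Build a safetensors-style byte string from {name: (dtype, shape, raw_bytes)}."""
    header, payload = {}, b""
    for name, (dtype, shape, raw) in tensors.items():
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            # Offsets are relative to the start of the data section.
            "data_offsets": [len(payload), len(payload) + len(raw)],
        }
        payload += raw
    header_bytes = json.dumps(header).encode("utf-8")
    # 8-byte little-endian unsigned header length, then JSON, then raw data.
    return struct.pack("<Q", len(header_bytes)) + header_bytes + payload


def read_header(blob: bytes) -> dict:
    """Parse only the JSON header; nothing is executed, unlike unpickling."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8 : 8 + header_len].decode("utf-8"))


def read_tensor_bytes(blob: bytes, name: str) -> bytes:
    """Slice one tensor's raw bytes out of the data section."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    begin, end = read_header(blob)[name]["data_offsets"]
    data_start = 8 + header_len
    return blob[data_start + begin : data_start + end]


raw = struct.pack("<2f", 1.0, 2.0)  # two little-endian float32 values
blob = pack_toy_file({"w": ("F32", [2], raw)})
```

Because the header records each tensor's byte range, a reader can also fetch a single tensor without touching the rest of the file, which is part of what makes the format fast as well as safe.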