Amazon SageMaker

product · active · amazon-sagemaker-1cdb0f70 · 10 events · first seen 28d ago

Aliases: Amazon SageMaker

Recent events (10)

5 · Hugging Face Blog · 28d ago

Introducing the Hugging Face LLM Inference Container for Amazon SageMaker

Hugging Face and Amazon Web Services have launched a dedicated LLM inference container for Amazon SageMaker, enabling optimized deployment of large language models on managed cloud infrastructure. The container is built on Hugging Face's Text Generation Inference (TGI) toolkit, which supports features like continuous batching, tensor parallelism, and quantization. This integration lowers the barrier for enterprise teams to deploy open-weight LLMs at scale on AWS without managing custom serving infrastructure.

5 · Hugging Face Blog · 28d ago

The Partnership: Amazon SageMaker and Hugging Face

Hugging Face and Amazon announced a partnership integrating Hugging Face models and tools natively into Amazon SageMaker. This collaboration enables developers to train and deploy Hugging Face Transformers models directly within SageMaker's managed ML infrastructure. The partnership represents an early major cloud-provider integration for Hugging Face, expanding enterprise access to open-source NLP models.

4 · Hugging Face Blog · 28d ago

Introducing the Hugging Face Embedding Container for Amazon SageMaker

Hugging Face has launched a dedicated embedding container for Amazon SageMaker, enabling streamlined deployment of text embedding models on AWS infrastructure. The container is designed to simplify production deployment of embedding models for use cases like semantic search and retrieval-augmented generation. This represents a deeper integration between Hugging Face's model ecosystem and AWS's managed ML platform.

4 · Hugging Face Blog · 28d ago

Llama 2 on Amazon SageMaker: A Benchmark

This Hugging Face blog post benchmarks Llama 2 inference on Amazon SageMaker, comparing performance and cost across instance types and configurations. The analysis offers practical guidance for deploying open-weight LLMs on cloud infrastructure, covering the throughput, latency, and cost trade-offs relevant to enterprise deployment decisions.

4 · Hugging Face Blog · 28d ago

Deploy Hugging Face Models Easily with Amazon SageMaker

Hugging Face and Amazon SageMaker announced an integration enabling streamlined deployment of Hugging Face models via SageMaker's managed infrastructure. The partnership provides dedicated Hugging Face Deep Learning Containers on AWS, simplifying the path from model hub to production inference. This represents an early milestone in the enterprise deployment pattern of hosted model hubs integrating with cloud ML platforms.

3 · Hugging Face Blog · 28d ago

Accelerate BERT inference with Hugging Face Transformers and AWS Inferentia

This Hugging Face blog post describes how to deploy BERT models on AWS Inferentia chips using the Hugging Face Transformers library and Amazon SageMaker. It covers the workflow for compiling models with the AWS Neuron SDK and running optimized inference on Inferentia hardware. The post targets practitioners looking to reduce inference cost and latency for transformer-based NLP workloads.

3 · Hugging Face Blog · 28d ago

Deploy GPT-J 6B for Inference Using Hugging Face Transformers and Amazon SageMaker

This Hugging Face blog post provides a tutorial for deploying the GPT-J 6B open-weight language model on Amazon SageMaker using the Hugging Face Transformers library. It covers the infrastructure and tooling steps needed to serve a large language model in a managed cloud environment. The post reflects early-2022 patterns for productionizing open-weight models via cloud ML platforms.

8 · Mistral AI News · 15d ago

Mistral AI Releases Magistral: First Reasoning Model in Open and Enterprise Variants

Mistral AI has announced Magistral, its first reasoning model, in two variants: Magistral Small (24B parameters, open-weight, Apache 2.0) and Magistral Medium (enterprise, closed). On AIME2024, Magistral Medium scores 73.6% (90% with majority voting @64) and Magistral Small scores 70.7% (83.3% with majority voting @64). Key differentiators include native multilingual chain-of-thought reasoning across eight major languages, transparent and traceable reasoning steps, and up to 10x faster token throughput in Le Chat via Flash Answers. The release is accompanied by a research paper covering the training infrastructure, the reinforcement learning algorithm, and novel observations for training reasoning models.
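
For readers unfamiliar with the "majority voting @64" figures, the sketch below shows the usual meaning of the term: sample several answers to the same problem and keep the most frequent one. This is an illustration of the general technique only; the summary does not describe Mistral's exact evaluation protocol, and the function name `majority_vote` and the sample answers are hypothetical.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer among the sampled answers.

    Assumption: answers are already normalized to comparable strings.
    Ties are broken in favor of the answer that was seen first.
    """
    if not answers:
        raise ValueError("at least one sampled answer is required")
    counts = Counter(answers)
    # most_common(1) yields a one-element list of (answer, count) pairs.
    return counts.most_common(1)[0][0]


# Hypothetical samples for one problem (a real run would use 64 of them).
samples = ["204", "204", "113", "204", "98"]
print(majority_vote(samples))
```

A problem counts as solved under this scheme when the voted answer matches the reference, which is why the voted score can exceed the single-sample score.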

7 · Mistral AI News · 15d ago

Mistral Medium 3: Frontier-Class Performance at 8x Lower Cost

Mistral AI has released Mistral Medium 3, a new enterprise-focused language model priced at $0.40 per million input tokens and $2 per million output tokens. Mistral claims the model reaches at least 90% of Claude Sonnet 3.7's benchmark performance while undercutting cost leaders like DeepSeek v3 and outperforming open models including Llama 4 Maverick. It supports hybrid, on-premises, and in-VPC deployment on as few as four GPUs, and is available immediately on Mistral La Plateforme and Amazon SageMaker, with additional cloud platforms coming soon. The announcement also teases an upcoming large open-weight model release.
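
As a rough illustration of the quoted per-token pricing, the sketch below estimates the bill for a given token volume. The two prices come from the summary above; the function name `estimate_cost_usd` and the example volumes are hypothetical, and actual billing on La Plateforme or SageMaker may differ.

```python
from decimal import Decimal

# List prices quoted in the announcement summary (USD per million tokens).
INPUT_PRICE_PER_MILLION = Decimal("0.40")
OUTPUT_PRICE_PER_MILLION = Decimal("2.00")


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate cost in USD for the given input and output token counts."""
    million = Decimal(1_000_000)
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_MILLION
    )
    return total / million


# Hypothetical workload: 3M input tokens and 500k output tokens.
print(estimate_cost_usd(3_000_000, 500_000))
```

Decimal is used instead of float so that small per-token prices do not accumulate rounding error over large volumes.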

5 · Anthropic News · 13d ago

Anthropic, AWS, and Accenture form enterprise AI collaboration targeting regulated sectors

Anthropic, Amazon Web Services, and Accenture announced a three-way collaboration to accelerate enterprise generative AI adoption, with particular focus on regulated industries requiring accuracy, reliability, and data security. Over 1,400 Accenture engineers will be trained as specialists in Anthropic's models on AWS, supporting customers through fine-tuning, prompt engineering, and deployment via Amazon Bedrock and SageMaker. An early production deployment is already live: a Claude-powered bilingual chatbot called Knowledge Assist, built with the DC Department of Health. The partnership combines Anthropic's model expertise, AWS infrastructure, and Accenture's industry consulting reach.