Entity · company

TNG Technology Consulting

company · active · tng-technology-consulting-ea950960 · 4 events · first seen May 19, 2026

Aliases: TNG Technology Consulting

Co-occurring entities

Hugging Face · Prefill/Decode Disaggregation · olmOCR

More like this (12)

Normal Tech · Technology Innovation Institute · Dell Technologies · τ³-Telecom · NTIA · TCN · Nanyang Technological University · Normal Tech (newsletter) · normaltech.ai · Normal Technology Framework · National Telecommunications and Information Administration · Accenture

Recent events (4)

4 · Hugging Face Blog · May 19, 2026

Efficient Request Queueing – Optimizing LLM Performance

This TNG Technology Consulting post on the Hugging Face blog examines request queueing strategies for improving LLM inference throughput and latency. It addresses how queueing policies and batching decisions affect performance under varying load conditions. The piece is aimed at practitioners deploying LLM inference infrastructure at scale.

Inference Economics · Hugging Face · TNG Technology Consulting

4 · Hugging Face Blog · May 19, 2026

Prefill and Decode for Concurrent Requests - Optimizing LLM Performance

This Hugging Face blog post from TNG Technology Consulting examines how prefill and decode phases interact under concurrent request loads in LLM serving systems. It analyzes performance bottlenecks that arise when multiple requests share GPU resources, covering throughput-latency tradeoffs and optimization strategies. The piece targets practitioners deploying LLMs at scale who need to understand scheduling and batching behavior.

Training Infrastructure · Inference Economics · Prefill/Decode Disaggregation · Hugging Face · TNG Technology Consulting

4 · Hugging Face Blog · May 19, 2026

Finetuning olmOCR to be a faithful OCR-Engine

TNG Technology Consulting describes a fine-tuning approach applied to olmOCR, a vision-language model designed for document OCR tasks, to improve its faithfulness and reduce hallucinations. The post covers dataset construction, training methodology, and evaluation results showing improved accuracy on document extraction benchmarks. This represents a practical community contribution to the open-weights document-understanding ecosystem.

Open Weights Progress · Agent and Tool Ecosystem · Hugging Face · olmOCR · TNG Technology Consulting · +1 more

4 · Hugging Face Blog · May 19, 2026

How Long Prompts Block Other Requests - Optimizing LLM Performance

This Hugging Face blog post from TNG Technology Consulting examines how long prompts create head-of-line blocking in LLM serving systems, degrading latency for concurrent requests. The post analyzes the mechanics of prompt processing in inference pipelines and discusses optimization strategies to mitigate throughput bottlenecks caused by lengthy context inputs. It is framed as a practical guide for teams deploying LLMs in production environments where mixed prompt-length workloads are common.

Long Context Evolution · Inference Economics · Hugging Face · TNG Technology Consulting · +1 more
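
Illustration: the head-of-line blocking described in the last summary can be sketched with a toy queue. This is not taken from the TNG post. It assumes a single server that prefills one request at a time, all requests arriving together, and a made-up cost of 1 ms per prompt token. The function name `wait_times_ms` and the workload are hypothetical.

```python
from itertools import accumulate

def wait_times_ms(prompt_lengths, ms_per_token=1):
    """Time each request waits before its prefill starts, in milliseconds.

    Toy model (an assumption, not a real serving stack): requests are
    processed strictly one after another, in list order.
    """
    service = [n * ms_per_token for n in prompt_lengths]
    # Request i waits for the total service time of everything ahead of it.
    return list(accumulate(service, initial=0))[:-1]

# Hypothetical workload: one 8000-token prompt queued ahead of three short ones.
workload = [8000, 100, 100, 100]

fifo = wait_times_ms(workload)                   # arrival order
short_first = wait_times_ms(sorted(workload))    # shortest prompt first, for contrast

print("FIFO waits (ms):       ", fifo)
print("Short-first waits (ms):", short_first)
```

In this toy setup the three short requests wait 8000 to 8200 ms behind the long prompt in arrival order, but only 0 to 200 ms when served first, while the long request is delayed by 300 ms. Shortest-first ordering is used here only to make the contrast visible; the mitigation strategies discussed in the post itself may differ.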