Entity · benchmark

CruxEval

benchmark · active · cruxeval-a8b0ffc5 · 1 event · first seen Jun 1, 2026

Aliases: CruxEval

Co-occurring entities

Mistral AI · LlamaIndex · GPT-4 Turbo · Continue.dev · JetBrains · La Plateforme · Mistral AI Non-Production License · Spider · LangChain · deepseek-coder · Codestral · RepoBench · HumanEval · MBPP · HuggingFace · Tabnine

More like this (12)

CRUX · ValueEval · SummEval · SciKnowEval · HypoEval · DeepEval · Claw-Eval · Codex HumanEval · ParaEval · L-Eval · ARC Evals · UniEval

Recent events (1)

7 · Mistral AI News · Jun 1, 2026

Mistral AI Releases Codestral: 22B Open-Weight Code Generation Model

Mistral AI has released Codestral, a 22B open-weight model designed specifically for code generation, supporting 80+ programming languages with a 32k context window. The model is available on HuggingFace under a non-production license, with commercial licenses available on request, and is accessible via a dedicated API endpoint (codestral.mistral.ai) that is free during an 8-week beta. Mistral AI claims state-of-the-art performance for Codestral on RepoBench, HumanEval, and fill-in-the-middle benchmarks, saying it outperforms DeepSeek Coder 33B and matches or exceeds GPT-4-Turbo on some language-specific evals. Integrations are available with LlamaIndex, LangChain, Continue.dev, and Tabnine for IDE-based developer workflows.

Frontier Model Releases · Evaluation and Benchmarking · Mistral AI · LlamaIndex · GPT-4 Turbo