Almanac

Vision-Language-Action models

model · active · provisional · vision-language-action-models-d6ef443b · 3 events · first seen 19d ago

Aliases: Vision-Language-Action models

Recent events (3)

5 · arXiv · cs.AI · 19d ago

RoboWits: Benchmark for Robotic Creative Problem Solving Under Unexpected Conditions

RoboWits is a new bi-manual robotic benchmark designed to evaluate cognitive reasoning, creative tool use, and robustness to unexpected conditions in robotics. The authors introduce an automated multi-agent task generation pipeline that produces 30 seed tasks and 208 mutated tasks spanning geometry, material, and assembly-based reasoning. Benchmarking results show that pre-trained Vision-Language-Action models (VLAs) achieve limited success on seed tasks after fine-tuning but fail on mutated variants, exposing brittleness in reasoning and strategy adaptation. The benchmark highlights a significant gap between skill-level execution and genuine cognitive reasoning in current robotic systems.

6 · arXiv · cs.LG · 19d ago

DynaFLIP: Dynamics-Aware Multimodal Pre-Training for Robot Manipulation Perception

DynaFLIP is a pre-training framework that injects motion understanding into visual encoders for robot manipulation by constructing image-language-3D flow triplets from human and robot videos. The method encourages tri-modal alignment via simplex-volume minimization in a shared hyperspherical space, combined with cosine regularization and contrastive objectives. The resulting dynamics-aware visual backbone consistently outperforms baselines across diverse downstream policies including VLAs, with gains up to +22.5% in out-of-distribution scenarios. The work argues that robot generalization requires encoding how the world changes under action, not just static scene content.

7 · Mistral AI News · 15d ago

Mistral AI Announces Strategic Partnerships with SAP and Helsing for German/European AI Sovereignty

Mistral AI has announced a multiyear partnership with SAP to deliver a sovereign AI stack for Germany and Europe, integrating Mistral models into SAP's AI Foundation and co-developing industry-specific solutions. Separately, Mistral is partnering with defense-AI firm Helsing to develop vision-language-action models for defense and security applications. The company is also expanding its physical presence in Germany with a new office and increased local headcount, framing these moves as part of a broader commitment to European AI autonomy.