ACE-Ego-0: Unifying Egocentric Human and Robotic Data for VLA Pretraining Paper • 2606.17200 • Published 3 days ago • 40
LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling Paper • 2606.18023 • Published 2 days ago • 113
Hy-Embodied-0.5-VLA: From Vision-Language-Action Models to a Real-World Robot Learning Stack Paper • 2606.14409 • Published 6 days ago • 11
WBench Collection • WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation • 4 items • Updated 16 days ago • 4
YoCausal: How Far is Video Generation from World Model? A Causality Perspective Paper • 2605.30346 • Published 21 days ago • 54
VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions Paper • 2605.27141 • Published 23 days ago • 19
EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling Paper • 2310.04691 • Published Oct 7, 2023 • 3
LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding Paper • 2605.27365 • Published 23 days ago • 141
WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation Paper • 2605.25874 • Published 24 days ago • 102
MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction Paper • 2604.27393 • Published Apr 30 • 79
HumanNet: Scaling Human-centric Video Learning to One Million Hours Paper • 2605.06747 • Published May 7 • 52
A Benchmark for Interactive World Models with a Unified Action Generation Framework Paper • 2605.03941 • Published May 5 • 5
WorldMark: A Unified Benchmark Suite for Interactive Video World Models Paper • 2604.21686 • Published Apr 23 • 36
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published Apr 15 • 165
Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory Paper • 2604.08995 • Published Apr 10 • 51
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper • 2604.04921 • Published Apr 6 • 115