
Former Horizon Robotics Executive Launches Startup, Wheeled Robots Tackle Warehouse Pick-and-Place

💡 Embodied AI startup ZhiWang Future focuses on warehouse logistics, using a wheeled chassis with dual arms to tackle the pick-and-place operations that account for 60% of warehouse labor costs. The company uses Human-in-the-Loop online reinforcement learning and targets shipments of 100 units this year.

Introduction: Differentiated Breakthrough in the Embodied AI Space

While much of the embodied AI industry is fixated on bipedal humanoid robots and simulation-based training, a Nanjing-based startup has chosen a markedly different path: it focuses on warehouse logistics and takes a pragmatic "wheeled chassis plus dual arms" approach to the pick-and-place operations that account for 60% of warehouse labor costs.

The company, ZhiWang Future, was incubated by the CAS Nanjing Institute of Software Technology. Founder Sun Junkai previously served as General Manager of the smart cockpit product line at Horizon Robotics, where he led the mass-production rollout of millions of terminal units and gained extensive experience in zero-to-one product design and manufacturing at scale. The team spent two years as an "embodied AI research group" within the Chinese Academy of Sciences ecosystem before officially registering as an independent entity in late 2025 to turn its accumulated technology into commercial products.

Core Approach: Human-in-the-Loop Bridges the Sim2Real Gap

The core challenge for embodied AI in real physical environments is the large Sim2Real (simulation-to-reality) gap. Conventional approaches built on offline reinforcement learning depend heavily on simulated data, and success rates often drop significantly once the policy is deployed in the real world. Online reinforcement learning offers higher precision but requires lengthy learning cycles, which makes it impractical for e-commerce warehouses with millions of SKUs.

ZhiWang Future has introduced a Human-in-the-Loop online reinforcement learning method that integrates real-time human corrections into a unified reinforcement learning objective, creating a pathway from imitation learning to autonomous exploration. With this method, the system needs only a small amount of demonstration data and a brief period of online learning to significantly improve task success rates. According to the team, sample efficiency has improved by orders of magnitude compared to traditional paradigms.

In simple terms, the core logic of this approach is: rather than trying to make robots "learn everything" in simulation, let them learn on the job in real-world settings while human operators provide real-time corrective guidance at critical junctures, dramatically shortening learning cycles and improving generalization capabilities.
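
To make the idea concrete, here is a minimal Python sketch of a human-in-the-loop online learning loop. It is an illustration only, not ZhiWang Future's implementation, whose details are not public. Everything in it is an assumption: the toy task (a robot stepping along a line toward a goal), the tabular Q-learning update, the scripted `human_correction` function standing in for a human operator, and all parameter names and values.

```python
import random

N_STATES = 8            # hypothetical toy task: positions 0..7 on a line
GOAL = N_STATES - 1     # the episode ends when the robot reaches position 7
ACTIONS = (-1, +1)      # step left or step right


def step(state, action):
    """Toy environment: move along the line, reward 1.0 on reaching the goal."""
    next_state = min(max(state + action, 0), GOAL)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def human_correction(proposed_action):
    """Scripted stand-in for an operator: override only moves away from the goal."""
    return +1 if proposed_action == -1 else None


def train(episodes=30, human_episodes=10, max_steps=20,
          alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Online Q-learning; a human may override actions in the first episodes."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    successes = interventions = 0
    for episode in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Epsilon-greedy proposal from the current policy (random tie-break).
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])
            # During the supervised phase the human can replace the proposal.
            if episode < human_episodes:
                override = human_correction(action)
                if override is not None:
                    action, interventions = override, interventions + 1
            next_state, reward, done = step(state, action)
            # The update uses the action actually executed, human or robot.
            future = 0.0 if done else gamma * max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + future - q[(state, action)])
            state = next_state
            if done:
                successes += 1
                break
    return q, successes, interventions


def greedy_rollout(q, max_steps=20):
    """Run the learned policy with no exploration and no human; True on success."""
    state = 0
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _reward, done = step(state, action)
        if done:
            return True
    return False


if __name__ == "__main__":
    q, successes, interventions = train()
    print("successful episodes:", successes)
    print("human interventions:", interventions)
    print("autonomous rollout reaches goal:", greedy_rollout(q))
```

The sketch relies on Q-learning being off-policy: a transition is useful training data whether the executed action came from the policy or from the operator, so corrections and autonomous exploration feed the same objective. Setting `human_episodes=0` gives a plain online learner for comparison.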