On June 17, U.S. local time, NVIDIA's Generalist Embodied Agent Research lab (GEAR Lab) announced a new approach to robot self-improvement: a "coaching team" of AI programming agents designs the training process for a robot arm with almost no human intervention, so that the robot learns to cut plastic cable ties, organize small parts, and even seat a GPU accurately in a motherboard expansion slot.

The approach is built on an "agent harness" called ENPIRE, a software shell wrapped around a large model. It lets AI programming agents call tools and gives them memory, context management, constraint control, and feedback loops, so that they can automatically plan, execute, evaluate, and iterate on robot training tasks. NVIDIA said the framework was developed by the GEAR Lab team together with researchers at Carnegie Mellon University and the University of California, Berkeley.

Jim Fan, NVIDIA's director of AI, wrote on social media that part of the lab can now "self-improve" overnight, and that researchers need only check the training report in the morning to see how the robots progressed the night before. He half-joked that in an ideal world "everyone would go on vacation and Jensen Huang would not find out," and said the team plans to open-source the results so that anyone can build their own "self-running robot laboratory" at home.

The ENPIRE framework currently includes four core modules. The first provides automatic reset and result verification for robot tasks. The second automatically optimizes robot control policies. The third evaluates different policies in parallel on multiple physical robots. The fourth handles failure cases in training by analyzing logs, reading papers, and improving the training infrastructure and algorithm code. The research team published a technical paper on June 16 describing the system's implementation and experimental results.

In the experiments, the researchers used three mainstream AI programming agents: OpenAI's Codex with GPT-5.5, Anthropic's Claude Code with Opus 4.7, and Moonshot AI's Kimi Code with K2.6. Working as a team, the agents independently propose different algorithmic improvements, run training experiments on real robots, keep the changes that raise the overall success rate, and continue iterating.
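The propose, test, and keep-if-better loop described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not ENPIRE's actual procedure or API, which this article does not detail: the function names `evaluate` and `improve`, the single `policy_quality` number standing in for a control policy, and the simulated success trials are all hypothetical.

```python
import random


def evaluate(policy_quality, trials, rng):
    """Hypothetical stand-in for robot trials: returns the observed success rate."""
    successes = sum(rng.random() < policy_quality for _ in range(trials))
    return successes / trials


def improve(initial_quality, rounds, proposals_per_round, trials=200, seed=0):
    """Keep a proposed change only if its measured success rate beats the best so far."""
    rng = random.Random(seed)
    best_quality = initial_quality
    best_rate = evaluate(best_quality, trials, rng)
    history = [best_rate]
    for _ in range(rounds):
        # Each simulated agent proposes a small change that may help or hurt.
        candidates = [
            min(1.0, max(0.0, best_quality + rng.uniform(-0.1, 0.1)))
            for _ in range(proposals_per_round)
        ]
        # Score every proposal, then take the one with the highest measured rate.
        rate, quality = max((evaluate(q, trials, rng), q) for q in candidates)
        if rate > best_rate:
            best_rate, best_quality = rate, quality
        history.append(best_rate)
    return history


if __name__ == "__main__":
    for round_index, rate in enumerate(improve(0.5, rounds=10, proposals_per_round=4)):
        print(f"round {round_index}: best measured success rate {rate:.2f}")
```

Because only improvements are retained, the recorded best rate never decreases. In a real setting the measurements are noisy, so a retained change is not guaranteed to be a true improvement, which is one reason result verification matters.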

The results show that, under ENPIRE's scheduling, AI programming agents can automatically design effective self-improvement strategies for a variety of robot arm manipulation tasks. In the standard Push-T tabletop task, the robot must push a T-shaped block precisely into a target area. Other tasks require the robot to organize small pins in a pin box, fasten and cut plastic cable ties, or insert a GPU into a motherboard slot and pull it out again to reset after each test round. On multiple tasks the system ultimately reached a 99% success rate, and on the pin insertion and sorting tasks the AI-driven training process reached a nearly 100% success rate faster than a state-of-the-art human-in-the-loop approach.

The experiments also show that adding agents can significantly speed up learning: on the Push-T task, a team of 8 agents pushed the success rate to 99% in just 2 hours of research time, while a team of 4 needed 3 hours and a single agent nearly 5 hours to reach the same level. However, the researchers also noticed that multi-agent collaboration does not scale linearly. As the number of agents increases, more time is spent summarizing and communicating with one another rather than actually scheduling robots to train.
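The non-linear scaling can be made concrete with a little arithmetic on the reported Push-T times. The sketch below computes the speedup over a single agent and the per-agent efficiency (speedup divided by team size). It treats "nearly 5 hours" as 5.0, which is a rounding assumption, and `scaling_table` is an illustrative name.

```python
# Approximate research hours to reach 99% success on Push-T, as reported above.
# Assumption: the single-agent figure ("nearly 5 hours") is rounded to 5.0.
HOURS_TO_99 = {1: 5.0, 4: 3.0, 8: 2.0}


def scaling_table(hours_by_team_size):
    """Return (team size, speedup vs. one agent, per-agent efficiency) rows."""
    baseline = hours_by_team_size[1]
    rows = []
    for n in sorted(hours_by_team_size):
        speedup = baseline / hours_by_team_size[n]
        efficiency = speedup / n  # 1.0 would mean perfectly linear scaling
        rows.append((n, speedup, efficiency))
    return rows


if __name__ == "__main__":
    for n, speedup, efficiency in scaling_table(HOURS_TO_99):
        print(f"{n} agent(s): speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

Under that rounding, 4 agents give roughly a 1.67x speedup (about 42% efficiency) and 8 agents roughly 2.5x (about 31%), well short of the 4x and 8x that linear scaling would imply.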

The research team also pointed out several limitations of the current system. For long stretches the robot sits idle on the bench, waiting for the AI programming agents to read logs, write and debug code, or get a response from the underlying language model. In parallel training, the agents sometimes fail to make full use of the available compute, leaving experimental throughput below its theoretical upper limit. On cost, more agents and more frequent training also mean significantly higher token consumption, a concern that grows as many AI service providers consider wider use of token-based billing.

Despite these shortcomings, NVIDIA is clearly raising its ambitions for what it calls "physical AI". With the abundant cash flow generated by the AI boom, the company continues to invest in multiple robotics projects. At the end of May this year, NVIDIA announced that it would work with Chinese robotics company Unitree to provide research institutions with a "universal humanoid robot reference platform" for developing general-purpose AI robots. In early June, Jensen Huang made a packed visit to South Korea and met with Hyundai Motor Group Executive Chairman Chung Eui-sun to discuss how to scale up the manufacturing of AI robots. Hyundai previously acquired Boston Dynamics, the American company famous for its four-legged "robot dog" Spot, and is working to commercialize the bipedal humanoid robot Atlas.

Along this path, ENPIRE and the team of AI programming agents behind it are seen as key components of the "self-driven robot laboratory". They aim to hand much of the trial and error, parameter tuning, and literature reading done by human experts over to AI, leaving researchers to play more of a "morning daily review" role. Once the code and framework are open-sourced, whether similar autonomous training systems catch on among universities, companies, and even individual hobbyists will be an important indicator of how quickly "physical AI" reaches real-world deployment.