AI Tutor Is Real, And It’s Already Here

TRANSIC enables sim-to-real transfer via human-in-the-loop corrections. Humans assist robots online, residual policies learn from interventions.

:::info
Authors:

(1) Yunfan Jiang, Department of Computer Science;

(2) Chen Wang, Department of Computer Science;

(3) Ruohan Zhang, Department of Computer Science and Institute for Human-Centered AI (HAI);

(4) Jiajun Wu, Department of Computer Science and Institute for Human-Centered AI (HAI);

(5) Li Fei-Fei, Department of Computer Science and Institute for Human-Centered AI (HAI).

:::

Abstract and 1 Introduction

2 Preliminaries

3 TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction and 3.1 Learning Base Policies in Simulation with RL

3.2 Learning Residual Policies from Online Correction

3.3 An Integrated Deployment Framework and 3.4 Implementation Details

4 Experiments

4.1 Experiment Settings

4.2 Quantitative Comparison on Four Assembly Tasks

4.3 Effectiveness in Addressing Different Sim-to-Real Gaps (Q4)

4.4 Scalability with Human Effort (Q5) and 4.5 Intriguing Properties and Emergent Behaviors (Q6)

5 Related Work

6 Conclusion and Limitations, Acknowledgments, and References

A. Simulation Training Details

B. Real-World Learning Details

C. Experiment Settings and Evaluation Details

D. Additional Experiment Results

\
Abstract: Learning in simulation and transferring the learned policy to the real world has the potential to enable generalist robots. The key challenge of this approach is to address simulation-to-reality (sim-to-real) gaps. Previous methods often require domain-specific knowledge a priori. We argue that a straightforward way to obtain such knowledge is by asking humans to observe and assist robot policy execution in the real world. The robots can then learn from humans to close various sim-to-real gaps. We propose TRANSIC, a data-driven approach to enable successful sim-to-real transfer based on a human-in-the-loop framework. TRANSIC allows humans to augment simulation policies to overcome various unmodeled sim-to-real gaps holistically through intervention and online correction. Residual policies can be learned from human corrections and integrated with simulation policies for autonomous execution. We show that our approach can achieve successful sim-to-real transfer in complex and contact-rich manipulation tasks such as furniture assembly. Through synergistic integration of policies learned in simulation and from humans, TRANSIC is effective as a holistic approach to addressing various, often coexisting sim-to-real gaps. It displays attractive properties such as scaling with human effort. Videos and code are available at transic-robot.github.io.

1 Introduction

Learning in simulation is a potential approach to the realization of generalist robots capable of solving sophisticated decision-making tasks [1, 2]. Learning to solve these tasks requires a large amount of training data [3–5]. Providing unlimited training supervision [6] through state-of-the-art simulation [7–11] could alleviate the burden of collecting data in the real world with physical robots [12, 13]. Therefore, it is crucial to seamlessly transfer and deploy robot control policies acquired in simulation, usually through reinforcement learning (RL), to real-world hardware. Successful demonstrations of this simulation-to-reality (sim-to-real) approach have been shown in dexterous in-hand manipulation [14–18], quadruped locomotion [19–22], biped locomotion [23–28], and quadrotor flight [29, 30].

\
Nevertheless, replicating similar success in manipulation tasks with robotic arms remains surprisingly challenging, with only a few cases in simple non-prehensile manipulation (such as pulling, pushing, and pivoting objects) [31–34], industry assembly under restricted settings [35–39], drawer opening [40], and peg swinging [40]. The difficulty mainly stems from the unavoidable sim-to-real gaps [11, 41], including but not limited to perception gap [19, 42–44], embodiment mismatch [19, 45, 46], controller inaccuracy [47–49], and dynamics realism [50]. Traditionally, researchers tackle these sim-to-real gaps and the transferring problem through system identification [19, 31, 51, 52], domain randomization [14, 53–55], real-world adaptation [56, 57], and simulator augmentation [58–60]. Many of these approaches require explicit, domain-specific knowledge and expertise on tasks or simulators. Although for a particular simulation-reality pair, there may

\
Figure 1: TRANSIC for sim-to-real transfer in contact-rich robotic manipulation tasks. a) and b) Na¨ıvely deploying policies trained in simulation usually fails due to various sim-to-real gaps. Here, the robot attempts to first align the light bulb with the base and then insert and screw the light bulb into the base. c) A human operator monitors robot behaviors, intervenes, and provides online correction through teleoperation when necessary. Human data are collected to train a residual policy to tackle various sim-to-real gaps in a holistic manner. d) The simulation policy and residual policy are integrated together during test time to achieve successful sim-to-real transfer for contact-rich tasks, such as screwing a light bulb into the base.

\
exist specific inductive biases that can be hand-crafted post hoc to close the sim-to-real gap [19], such knowledge is often not available a priori. Identifying its effects on task completion is also intractable.

\
We argue that a straightforward and feasible way for humans to obtain such knowledge is to observe and assist policy execution in the real world. If humans can assist the robot to successfully accomplish the tasks in the real world, sim-to-real gaps are effectively addressed. This naturally leads to a generally applicable paradigm that can cover different priors across simulations and realities — human-in-the-loop learning [61–63] and shared-autonomy [64, 65].

\
Our key insight is that the human-in-the-loop framework is promising for addressing the sim-to-real gaps as a whole, in which humans directly assist the physical robots during policy execution by providing online correction signals. The knowledge required to close sim-to-real gaps can be learned from human signals. To this end, we present TRANSIC (transferring policies sim-to-real by learning from online correction, Fig. 1), a data-driven approach to enable successful transferring of robot manipulation policies trained with RL in simulation to the real world. In TRANSIC, once the base robot policies are acquired from simulation training, they are deployed onto real robots where human operators monitor the execution. When the robot makes mistakes or gets stuck, humans interrupt and assist robot policies through teleoperation. Such human intervention data are collected to train a residual policy, after which the base policy and the residual policy are combined to solve contact-rich manipulation tasks, such as furniture assembly. With the synergetic integration with previous approaches, since humans can successfully assist the robot trained in silico to complete real-world tasks, sim-to-real gaps are implicitly handled and addressed by humans in a domain-agnostic manner. Additionally, human supervision naturally guarantees safe deployment.

\
To summarize, the key contribution of our work is a novel, holistic human-in-the-loop method called TRANSIC to tackle sim-to-real transfer of policies for manipulation tasks. Through extensive evaluation, we show that our method leads to more effective sim-to-real transfer compared to traditional methods [51, 53] and requires less real-robot data compared to the prevalent imitation learning and offline RL algorithms [66–69]. We demonstrate that successful sim-to-real transfer of short-horizon skills can solve long-horizon, contact-rich manipulation tasks in our daily activities, such as furniture assembly. Videos and code are available at transic-robot.github.io.

\

:::info
This paper is available on arxiv under CC BY 4.0 DEED license.

:::

\


Print Share Comment Cite Upload Translate Updates
APA

Learning Rate | Sciencx (2025-06-03T14:06:09+00:00) AI Tutor Is Real, And It’s Already Here. Retrieved from https://www.scien.cx/2025/06/03/ai-tutor-is-real-and-its-already-here/

MLA
" » AI Tutor Is Real, And It’s Already Here." Learning Rate | Sciencx - Tuesday June 3, 2025, https://www.scien.cx/2025/06/03/ai-tutor-is-real-and-its-already-here/
HARVARD
Learning Rate | Sciencx Tuesday June 3, 2025 » AI Tutor Is Real, And It’s Already Here., viewed ,<https://www.scien.cx/2025/06/03/ai-tutor-is-real-and-its-already-here/>
VANCOUVER
Learning Rate | Sciencx - » AI Tutor Is Real, And It’s Already Here. [Internet]. [Accessed ]. Available from: https://www.scien.cx/2025/06/03/ai-tutor-is-real-and-its-already-here/
CHICAGO
" » AI Tutor Is Real, And It’s Already Here." Learning Rate | Sciencx - Accessed . https://www.scien.cx/2025/06/03/ai-tutor-is-real-and-its-already-here/
IEEE
" » AI Tutor Is Real, And It’s Already Here." Learning Rate | Sciencx [Online]. Available: https://www.scien.cx/2025/06/03/ai-tutor-is-real-and-its-already-here/. [Accessed: ]
rf:citation
» AI Tutor Is Real, And It’s Already Here | Learning Rate | Sciencx | https://www.scien.cx/2025/06/03/ai-tutor-is-real-and-its-already-here/ |

Please log in to upload a file.




There are no updates yet.
Click the Upload button above to add an update.

You must be logged in to translate posts. Please log in or register.