I am a fifth-year Ph.D. student at UC Berkeley advised by Prof. Masayoshi Tomizuka. My research lies at the intersection of robotics, optimization, reinforcement learning, and control theory, with applications to contact-rich manipulation, dexterous manipulation, and motion planning.
Here is my Curriculum Vitae.
- Sep 2023: Paper: Efficient Sim-to-real Transfer of Contact-Rich Manipulation Skills with Online Admittance Residual Learning accepted by CoRL 2023
- June 2023: Workshop Paper: A coarse-to-fine framework for dual-arm manipulation of deformable linear objects with whole-body obstacle avoidance (paper link, video) won the Best Paper Award at the ICRA 2023 Workshop on Representing and Manipulating Deformable Objects
- Jan 2023: Paper: A coarse-to-fine framework for dual-arm manipulation of deformable linear objects with whole-body obstacle avoidance accepted by ICRA 2023
- Jan 2023: Paper: Zero-Shot Policy Transfer with Disentangled Task Representation of Meta-Reinforcement Learning accepted by ICRA 2023
- May 2022: I began my residency (internship) at (Google) X, the Moonshot Factory, on the Everyday Robots project.
- May 2022: Paper: Safe Online Gain Optimization for Cartesian Space Variable Impedance Control accepted by CASE 2022
- Feb 2022: Paper: Offline-Online Learning of Deformation Model for Cable Manipulation With Graph Neural Networks accepted by IEEE Robotics and Automation Letters (RA-L).
- Feb 2022: Paper: Robotic cable routing with spatial representation accepted by IEEE Robotics and Automation Letters (RA-L).
- Jan 2022: Paper: Learning Insertion Primitives with Discrete-Continuous Hybrid Action Space for Robotic Assembly Tasks accepted by ICRA 2022
- Jan 2022: Paper: BPOMP: A Bilevel Path Optimization Formulation for Motion Planning accepted by ACC 2022
- June 2021: Paper: Trajectory Splitting: A Distributed Formulation for Collision Avoiding Trajectory Optimization accepted by IROS 2021.
- June 2021: Paper: Online Learning of Unknown Dynamics for Model-Based Controllers in Legged Locomotion accepted by IEEE Robotics and Automation Letters (RA-L).
- Jan 2021: Paper: Contact Pose Identification for Peg-in-hole Assembly under Uncertainties accepted by ACC 2020.
- May 2020: I began my robotics research internship at (Google) X, the Moonshot Factory.
- June 2019: Paper: Robust Deformation Model Approximation for Robotic Cable Manipulation accepted by IROS 2019.
- June 2019: I began my robotics research internship at FANUC Advanced Research Laboratory.
- July 2018: Paper: A Framework for Manipulating Deformable Linear Objects by Coherent Point Drift accepted by IEEE Robotics and Automation Letters (RA-L).
Ph.D. Mechanical Engineering, UC Berkeley (Berkeley, CA), 2018 - 2023 (Expected)
- Major: Controls; Minors: Optimization, Robotics
- GPA: 4.0/4.0; Advisor: Prof. Masayoshi Tomizuka
- Research Interests: Robotic Manipulation, Trajectory Optimization, and Reinforcement Learning.
B.S. Mechanical Engineering, Shanghai Jiao Tong University (Shanghai, China), 2014 - 2018
- Research Intern, Honda Research Institute, (San Jose, CA), May 2023 - August 2023
- Resident, (Google) X, the Moonshot Factory (Mountain View, CA), May 2022 - August 2022
- Robotics Research Intern, (Google) X, the Moonshot Factory (Remote), May 2020 - August 2020
- Robotics Research Intern, FANUC Advanced Research Laboratory (Union City, CA), June 2019 - August 2019
Bridging Sim-to-real Gap for Dexterous Manipulation Skills with Tactile Sensing [Details Coming Soon]
Learning dexterous in-hand manipulation skills for a multi-finger hand is challenging, and deploying skills learned in simulation to the real world is even more challenging. In this project, we proposed a framework that transfers learned manipulation skills to hardware. In simulation, we proposed a teacher-student model for learning from robust tactile signals: we first trained a teacher with accurate contact information to obtain a robust manipulation policy, then trained a student with only binarized tactile information to mimic the teacher. We demonstrated the effectiveness of the approach on an Xela Allegro Hand with Uskin tactile sensors. Furthermore, to bridge the sim-to-real gap, we proposed an online policy residual learning framework that updates the policy in real time to account for the domain gap. With this approach, we are able to robustly manipulate objects on the real Allegro hardware.
Efficient Sim-to-real Transfer of Contact-Rich Manipulation Skills with Online Admittance Residual Learning [Website]
Learning contact-rich manipulation skills is essential to robotic applications. Such skills require the robots to interact with the environment with feasible manipulation trajectories and suitable compliance control parameters to enable safe and stable contact. However, learning these skills is challenging due to data inefficiency in the real world and the sim-to-real gap in simulation. In this paper, we introduce a hybrid offline-online framework to learn robust manipulation skills. We employ model-free reinforcement learning for the offline phase to obtain the robot motion and compliance control parameters in simulation. Subsequently, in the online phase, we learn the residual of the compliance control parameters to maximize robot performance-related criteria with force sensor measurements in real time. To demonstrate the effectiveness and robustness of our approach, we provide comparative results against existing methods for assembly and pivoting tasks.
Offline-Online Learning of Deformation Model for Cable Manipulation with Graph Neural Networks [Website]
Robotic manipulation of deformable objects has a wide range of applications, e.g., manufacturing and medical surgery. To complete such tasks, an accurate dynamics model for predicting the deformation is critical for robust control. In this work, we deal with this challenge by proposing a hybrid offline-online method to learn the dynamics of deformable objects in a data-efficient manner. In the offline phase, we adopt a Graph Neural Network (GNN) to learn the deformation dynamics purely from simulation data. Then an online local residual model is learned to resolve the sim-to-real gap in order to achieve better accuracy and generalizability. The learned model is then utilized as the dynamics constraint of a Model Predictive Controller (MPC) to calculate the optimal robot movements. The online learning and MPC run in a closed-loop manner to robustly accomplish the task. Comparative results with existing methods are provided to show the effectiveness and robustness quantitatively.
Safe Online Gain Optimization for Variable Impedance Control [Website]
Smooth behaviors are preferable for many contact-rich manipulation tasks. Impedance control arises as an effective way to regulate robot movements by mimicking a mass-spring-damper system. Consequently, the robot behavior can be determined by the impedance gains. However, tuning the impedance gains for different tasks is tricky, especially in unstructured environments. Moreover, adapting the gains online to meet a time-varying performance index is even more challenging. In this paper, we present Safe Online Gain Optimization for Variable Impedance Control (Safe OnGO-VIC). By reformulating the dynamics of impedance control as a control-affine system, in which the impedance gains are the inputs, we provide a novel perspective to understand variable impedance control. Additionally, we formulate an optimization problem that uses force information collected online to obtain the optimal impedance gains in real time. Safety constraints are also embedded in the proposed framework to avoid unwanted collisions.
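As a rough illustration of the control-affine view, consider a one-degree-of-freedom impedance model $m\ddot{e} + d\dot{e} + k e = f_{ext}$. With state $x = (e, \dot{e})$ and input $u = (k, d)$, the dynamics can be written as $\dot{x} = f(x) + g(x)u$. The 1-DoF model, the function names, and the numbers below are assumptions made for illustration and are not taken from the paper.

```python
def affine_terms(e, e_dot, f_ext, m):
    """Drift f(x) and input matrix g(x) for state x = (e, e_dot), input u = (k, d).

    Hypothetical 1-DoF model: m * e_ddot + d * e_dot + k * e = f_ext.
    """
    f = (e_dot, f_ext / m)
    g = ((0.0, 0.0),            # the gains do not enter the position row
         (-e / m, -e_dot / m))  # the gains enter the acceleration row linearly
    return f, g


def x_dot_affine(e, e_dot, f_ext, m, k, d):
    """State derivative computed as f(x) + g(x) u."""
    f, g = affine_terms(e, e_dot, f_ext, m)
    return tuple(f[i] + g[i][0] * k + g[i][1] * d for i in range(2))


def x_dot_direct(e, e_dot, f_ext, m, k, d):
    """State derivative computed directly from the mass-spring-damper equation."""
    return (e_dot, (f_ext - d * e_dot - k * e) / m)


# Illustrative values only: both forms give the same derivative.
affine = x_dot_affine(0.1, -0.2, 3.0, 2.0, 50.0, 5.0)
direct = x_dot_direct(0.1, -0.2, 3.0, 2.0, 50.0, 5.0)
```

Because the gains appear linearly, an optimizer can treat them as ordinary control inputs at each time step.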
Trajectory Splitting: A Distributed Formulation for Collision Avoiding Trajectory Optimization
Efficient trajectory optimization is essential for avoiding collisions in unstructured environments, but it remains challenging to achieve both speed and quality in the solutions. One reason is that second-order optimality requires calculating Hessian matrices whose size grows as $O(N^2)$ with the number of waypoints $N$. Reducing the number of waypoints can therefore decrease computation time quadratically. Unfortunately, fewer waypoints result in lower-quality trajectories that may not avoid collisions. To have both dense waypoints and reduced computation time, we take inspiration from recent studies on consensus optimization and propose a distributed formulation of collocated trajectory optimization. It breaks a long trajectory into several segments, where each segment becomes a subproblem with a few waypoints. These subproblems are solved classically, but in parallel, and the solutions are fused into a single trajectory through a consensus update that enforces continuity between the segments. With this scheme, the quadratic complexity is distributed across the segments, which enables solving for higher-quality trajectories with denser waypoints. Furthermore, the proposed formulation is amenable to using any existing trajectory optimizer for solving the subproblems.
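The quadratic argument above can be checked with simple arithmetic. The sketch below counts dense Hessian entries for one long trajectory and for the same trajectory split into equal segments. The function names, the 7-DoF arm, and the equal, non-overlapping split are illustrative assumptions, and the count ignores the cost of the consensus update.

```python
def dense_hessian_entries(num_waypoints, dof):
    """Entries in a dense Hessian over all waypoint coordinates."""
    n = num_waypoints * dof
    return n * n


def split_hessian_entries(num_waypoints, dof, num_segments):
    """Total Hessian entries when the trajectory is split into equal segments."""
    if num_waypoints % num_segments != 0:
        raise ValueError("num_waypoints must be divisible by num_segments")
    per_segment = num_waypoints // num_segments
    return num_segments * dense_hessian_entries(per_segment, dof)


# Illustrative numbers: 100 waypoints for a 7-DoF arm, split into 4 segments.
full = dense_hessian_entries(100, 7)        # 700 * 700 = 490000
split = split_hessian_entries(100, 7, 4)    # 4 * (175 * 175) = 122500
```

In this example the total entry count drops by a factor of 4, and each subproblem solved in parallel handles a Hessian 16 times smaller than the original.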
Online Learning of Unknown Dynamics for Model-Based Controllers in Legged Locomotion
The performance of a model-based controller can severely suffer when its model inaccurately represents the real-world dynamics. We propose to learn a time-varying, locally linear residual model along the robot’s current trajectory, to compensate for the prediction errors of the controller’s model. Supervised learning is performed online, as the robot is running in the unknown environment, using data collected from its immediate past. We theoretically investigate our method in its general formulation, then apply it to a bipedal controller derived from the full-order dynamics of virtual constraints, and a quadrupedal controller derived from a simplified model of contact forces. For a biped in simulation, our method consistently outperforms the baseline and a recent learning-based method. We also experiment with a 12 kg quadruped in simulation and in the real world, where the baseline fails to walk with 10 kg of payload but our method succeeds.
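A minimal sketch of the idea, assuming a scalar state and input: the residual between the observed next state and the nominal model's prediction is fitted as a linear function of the state and input by least squares over a sliding window of recent samples. The class name, window size, and toy dynamics are hypothetical and much simpler than the legged-robot models in the paper.

```python
import math
from collections import deque


class ResidualModel:
    """Fits residual ~ da * x + db * u from a sliding window of recent samples."""

    def __init__(self, window=20):
        self.samples = deque(maxlen=window)  # keeps only the immediate past
        self.da = 0.0
        self.db = 0.0

    def update(self, x, u, x_next, nominal_next):
        """Store the newest prediction error and refit by least squares."""
        self.samples.append((x, u, x_next - nominal_next))
        sxx = sum(s[0] * s[0] for s in self.samples)
        sxu = sum(s[0] * s[1] for s in self.samples)
        suu = sum(s[1] * s[1] for s in self.samples)
        sxr = sum(s[0] * s[2] for s in self.samples)
        sur = sum(s[1] * s[2] for s in self.samples)
        det = sxx * suu - sxu * sxu
        if abs(det) > 1e-12:  # skip the refit while the data are degenerate
            self.da = (sxr * suu - sxu * sur) / det
            self.db = (sxx * sur - sxu * sxr) / det

    def correct(self, x, u, nominal_next):
        """Nominal prediction plus the learned residual."""
        return nominal_next + self.da * x + self.db * u


# Toy example: the nominal model (0.9, 0.5) mismatches the true dynamics (1.2, 0.4).
model = ResidualModel(window=20)
for i in range(30):
    x, u = math.sin(i), math.cos(2 * i)
    model.update(x, u, 1.2 * x + 0.4 * u, 0.9 * x + 0.5 * u)
```

After the loop, `model.da` and `model.db` approximate the mismatch between the true and nominal coefficients (0.3 and -0.1 in this toy setup), so `correct` recovers the true next state.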
BPOMP: A Bilevel Path Optimization Formulation for Motion Planning [Website]
Balancing computation efficiency and success rate is challenging for path optimization. To obtain a collision-free path, each waypoint along the path should be collision-free, and the waypoints should be dense. As a consequence, the solutions are computationally expensive to obtain. This paper introduces a bilevel path optimization formulation for motion planning (BPOMP). Unlike standard formulations that only consider collisions at each waypoint, BPOMP additionally constrains the closest position to the obstacle along the continuous path. Intuitively, if the closest position is out of collision, the entire path should be collision-free. The problem is formulated as a bilevel optimization and then relaxed to canonical nonlinear programming (NLP), which can be solved classically.
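The intuition can be illustrated with a toy 2D case, assuming a straight-line segment between two waypoints and a circular obstacle (both assumptions for illustration; the paper's lower-level problem is not reproduced here). For this case the closest position along the segment has a closed form, and checking it catches a collision that the waypoint-only check misses.

```python
import math


def closest_point_on_segment(p, q, obstacle):
    """Return (t, point) on p + t * (q - p), t in [0, 1], nearest to the obstacle center."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return 0.0, p
    t = ((obstacle[0] - p[0]) * dx + (obstacle[1] - p[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp to stay on the segment
    return t, (p[0] + t * dx, p[1] + t * dy)


def clearance(point, obstacle, radius):
    """Signed distance from a point to a circular obstacle (negative means collision)."""
    return math.hypot(point[0] - obstacle[0], point[1] - obstacle[1]) - radius


# Hypothetical scene: two collision-free waypoints with an obstacle between them.
p, q = (0.0, 0.0), (2.0, 0.0)
obstacle, radius = (1.0, 0.3), 0.5

waypoints_free = clearance(p, obstacle, radius) > 0 and clearance(q, obstacle, radius) > 0
t_star, closest = closest_point_on_segment(p, q, obstacle)
path_free = clearance(closest, obstacle, radius) > 0
```

Here both waypoints are collision-free, yet the closest position at `t_star = 0.5` lies inside the obstacle, so constraining that position is what reveals the collision.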
Robotic Bottle Flipping and Landing with TRPO and Adaptive MPC
Robotic bottle flipping is a challenging task: the dynamics of the bottle are hard to model in the presence of noise and uncertainties, and the flipping skill is hard to learn. We proposed a framework that combines Trust Region Policy Optimization (TRPO), which lets the robot learn the bottle-flipping skill, with an adaptive MPC controller that stabilizes the bottle. The trajectory of the bottle is predicted by a three-layer LSTM network. Simulation results show that this framework enables the robot to complete the task robustly.
Robotic Deformable Object Manipulation
Manipulation of deformable objects is a challenging task for robots. These objects have an infinite-dimensional configuration space and are computationally expensive to model, which makes real-time tracking, planning, and control difficult. To deal with these challenges, we proposed two different model-free methods: