Learning Speed-Adaptive Walking Agent Using Imitation Learning With Physics-Informed Simulation

Yi-Hung Chiu*1,   Ung Hee Lee*2,   Changseob Song1,   Manaen Hu1,   Inseung Kang1

*Equal Contribution         1Carnegie Mellon University         2University of Michigan, Ann Arbor

Abstract

Virtual models of human gait, or digital twins, offer a promising solution for studying mobility without the need for labor-intensive data collection. However, challenges such as the sim-to-real gap and limited adaptability to diverse walking conditions persist. To address these challenges, we developed and validated a framework to create a skeletal humanoid agent capable of adapting to varying walking speeds while maintaining biomechanically realistic motions. The framework combines a synthetic data generator, which produces biomechanically plausible gait kinematics from open-source biomechanics data, with a training system that uses adversarial imitation learning to train the agent’s walking policy. We conducted comprehensive analyses comparing the agent’s kinematics, the synthetic data, and the original biomechanics dataset. The agent achieved a root mean square error of 5.24±0.09 degrees across varying speeds relative to ground-truth kinematics, demonstrating its adaptability. This work represents a significant step toward developing a digital twin of human locomotion, with potential applications in biomechanics research, exoskeleton design, and rehabilitation.

Overview

The framework includes two main components: expert demonstration generation and policy optimization. Expert demonstrations derived from motion data are used to train a discriminator that distinguishes expert-generated from policy-generated actions at different speeds. The reward function combines the discriminator’s output with a speed reward that penalizes the difference between the agent’s target and actual center-of-mass (COM) speed. These rewards are used to optimize the walking policy with trust region policy optimization (TRPO). The target speed is part of the agent’s observation space, and a progressive curriculum exposes the agent to a widening range of speeds during training. Evaluation tests the agent’s ability to track the commanded speed under varying conditions.
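As a rough illustration of how the two reward terms described above might be combined, the sketch below uses hypothetical names and weights (discriminator logit, w_style, w_speed, speed bounds); the exact functional forms and values used in the paper may differ.

import numpy as np

def combined_reward(disc_logit, target_speed, com_speed,
                    w_style=0.7, w_speed=0.3):
    """Illustrative reward mixing an adversarial style term with a speed term.

    disc_logit   : discriminator output for the policy-generated transition
                   (higher = more expert-like); interface is an assumption
    target_speed : commanded walking speed from the observation (m/s)
    com_speed    : measured center-of-mass speed of the agent (m/s)
    w_style, w_speed : assumed mixing weights (not taken from the paper)
    """
    # Adversarial "style" reward: squash the discriminator logit to (0, 1).
    style_reward = 1.0 / (1.0 + np.exp(-disc_logit))

    # Speed reward: penalize the gap between target and actual COM speed,
    # mapped to (0, 1] with an exponential kernel (one common choice).
    speed_reward = np.exp(-np.abs(target_speed - com_speed))

    return w_style * style_reward + w_speed * speed_reward

def sample_target_speed(progress, v_min=0.8, v_max=1.8, rng=np.random):
    """Progressive curriculum sketch: widen the sampled speed range as
    training progresses (progress in [0, 1]). Speed bounds are placeholders."""
    half_span = 0.5 * (v_max - v_min) * progress
    center = 0.5 * (v_min + v_max)
    return rng.uniform(center - half_span, center + half_span)

In such a sketch, the sampled target speed would be appended to the observation vector and the mixed reward fed to the policy-gradient update (TRPO in this work).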

Walking agent performance with optimal and baseline settings. (A) Target speed tracking performance across varying conditions. Tracking error for (B) joint angles and (C) COM speed relative to ground-truth data. (D) Walking agent’s adaptability to dynamically changing walking speeds. Error bars and shaded regions indicate ±1 standard deviation, and asterisks indicate statistical significance (p < 0.05).
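For reference, the joint-angle tracking error in (B) is a root mean square error against ground-truth kinematics; a minimal sketch of computing such an RMSE over a time-normalized gait cycle is shown below, with all array names and shapes assumed rather than taken from the paper.

import numpy as np

def joint_angle_rmse(agent_angles, reference_angles):
    """RMSE (degrees) between agent and reference joint-angle trajectories.

    Both inputs are arrays of shape (timesteps, n_joints), resampled onto the
    same gait-cycle grid; names and shapes here are assumptions.
    """
    err = np.asarray(agent_angles) - np.asarray(reference_angles)
    return float(np.sqrt(np.mean(err ** 2)))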

Gait biomechanics comparison between the open-source dataset (left) and our optimal walking agent (right) across varying speeds. Each curve from the open-source dataset represents the average gait across all subjects in the dataset, while each curve from the optimal walking agent represents the average gait across 50 episodes of evaluation. Torque and power values were normalized by body mass.
Representative walking agent with optimal settings.
Representative walking agent with suboptimal settings. The suboptimal agent (e.g., with a higher speed-reward weighting) exhibited abnormal gait patterns, such as irregular limb coordination, exaggerated dorsiflexion, and an asymmetric range of motion.
Dynamically changing command speed.

Citation

@misc{chiu2024learningspeedadaptivewalkingagent,
  title={Learning Speed-Adaptive Walking Agent Using Imitation Learning with Physics-Informed Simulation},
  author={Yi-Hung Chiu and Ung Hee Lee and Changseob Song and Manaen Hu and Inseung Kang},
  year={2024},
  eprint={2412.03949},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2412.03949},
}