OpenDriveLab

FreeTacMan

Robot-free Visuo-Tactile Data Collection System
for Contact-rich Manipulation

Longyan Wu1,4*  Checheng Yu1,5*  Jieji Ren3*  Li Chen2
Ran Huang4  Guoying Gu3  Hongyang Li2,1
1SII
2HKU
3SJTU
4Fudan
5NJU
arXiv 2025
Temporary page. Permanent address: http://opendrivelab.com/blog/freetacman

FreeTacMan is a robot-free, human-centric visuo-tactile data collection system, featuring low-cost, high-resolution tactile sensors and a portable, cross-embodiment modular design. FreeTacMan transfers human visual perception, tactile sensing, and motion control skills to robots efficiently by integrating visual and tactile data.

Enabling robots with contact-rich manipulation remains a pivotal challenge in robot learning, one substantially hindered by a data collection gap: inefficient collection pipelines and limited sensor setups.

Motivated by the dexterity and force feedback of human motion, we introduce FreeTacMan, a robot-free and human-centric visuo-tactile data collection system to acquire robot manipulation data accurately and efficiently. Our main contributions are:

1. A portable, high-resolution, low-cost visuo-tactile sensor designed for rapid adaptation across multiple robotic end-effectors.

2. An in-situ, robot-free, real-time tactile data-collection system that leverages a handheld end effector and the proposed sensor to excel at diverse contact-rich tasks efficiently.

3. Experimental validation showing that imitation policies trained with our visuo-tactile data achieve an average 50% higher success rate than vision-only approaches across a wide spectrum of contact-rich manipulation tasks.

Interactive Model Viewer


FreeTacMan on PIPER
FreeTacMan on FRANKA

FreeTacMan features a universal gripper interface with quick-swap mounts compatible with various robots, such as Piper and Franka, with support for more platforms coming soon. It also includes a camera scaffold designed for precise alignment with the wrist-mounted camera, ensuring a consistent perspective. These components demonstrate the plug-and-play modularity of FreeTacMan, enabling seamless integration across diverse robotic platforms without requiring hardware-specific modifications.

We evaluate the effectiveness of the FreeTacMan system and the quality of the collected visuo-tactile demonstrations through a diverse set of contact-rich manipulation tasks.

Fragile Cup Manipulation
USB Plugging
Texture Classification
Stamp Pressing
Calligraphy Writing
Toothpaste Extrusion
Tissue Grasping
Potato Chip Grasping

We integrate tactile feedback to assess its impact on policy performance, observing a substantial improvement that highlights the value of tactile sensing in contact-rich tasks. Temporal-aware pretraining further boosts performance by aligning visual and tactile embeddings while capturing temporal dynamics. Across five evaluated tasks, imitation policies trained with our visuo-tactile data achieve an average success rate that is 50% higher than vision-only counterparts.
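The page does not spell out the exact pretraining objective, but aligning visual and tactile embeddings is commonly done with a symmetric InfoNCE-style contrastive loss, where the two modalities' embeddings from the same timestep form a positive pair. The sketch below (NumPy, hypothetical function name and temperature value) illustrates the idea only; it is not the paper's implementation.

```python
import numpy as np

def info_nce_loss(vis_emb, tac_emb, temperature=0.1):
    """Symmetric InfoNCE loss aligning visual and tactile embeddings.

    vis_emb, tac_emb: (N, D) arrays; row i of each modality comes from
    the same timestep and forms a positive pair. (Illustrative sketch,
    not the FreeTacMan training code.)
    """
    # L2-normalize so the dot product is cosine similarity
    v = vis_emb / np.linalg.norm(vis_emb, axis=1, keepdims=True)
    t = tac_emb / np.linalg.norm(tac_emb, axis=1, keepdims=True)
    logits = v @ t.T / temperature          # (N, N) similarity matrix
    labels = np.arange(len(v))              # diagonal entries are positives

    def xent(lg):
        # cross-entropy of each row against its matching index
        lg = lg - lg.max(axis=1, keepdims=True)  # numerical stability
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # average over both directions: vision->touch and touch->vision
    return 0.5 * (xent(logits) + xent(logits.T))
```

With this loss, perfectly matched embeddings score near zero while mismatched pairings are penalized, which drives the two modalities into a shared space before imitation training.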

[Chart: policy success rate (%) — ACT (vision-only) vs. Ours-α (+ tactile) vs. Ours-β (+ tactile pretrained)]

The robot grasps a plastic cup and places it stably on a tray without causing damage.

The videos are played at normal speed.


We evaluate the usability of FreeTacMan through a user study with 12 human participants of varying experience levels, each collecting demonstrations across 8 tasks. Compared with previous setups such as ALOHA and UMI, FreeTacMan consistently achieves the highest completion rates and efficiency, and is perceived as the most user-friendly and reliable data collection system.

[Charts: CPUT score for ALOHA, UMI, and Ours on each task — Fragile Cup, USB Plugging, Texture Classification, Stamp Pressing, Calligraphy, Toothpaste Extrusion, Tissue Grasping, and Chip Grasping]

P.S.: Completion per Unit Time (CPUT) is defined as completion_rate × efficiency.
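The page gives CPUT as completion_rate × efficiency without defining efficiency; a natural reading is completions per unit time, i.e. efficiency = 1 / task_time. The helper below is a minimal sketch under that assumption (function name and normalization are ours, not from the paper).

```python
def cput(completion_rate, task_time_s):
    """Completion per Unit Time (CPUT) = completion_rate x efficiency.

    Assumption: efficiency is the inverse of average task time in
    seconds, so CPUT has units of completions per second. The exact
    normalization used by the authors may differ.
    """
    if task_time_s <= 0:
        raise ValueError("task time must be positive")
    return completion_rate * (1.0 / task_time_s)

# Example: an 80% completion rate at 10 s per attempt -> CPUT of 0.08
score = cput(0.8, 10.0)
```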