A-SLIP: Acoustic Sensing for Continuous In-hand Slip Estimation

Uksang Yoo*, Yuemin Mao*, Jean Oh, Jeffrey Ichnowski

Carnegie Mellon University

* Equal contribution

System Overview

Figure: A-SLIP system overview.

A-SLIP is a complete system for real-time slip estimation in robotic grasping. The system integrates piezoelectric microphones into a parallel-jaw gripper and uses a convolutional neural network to process synchronized multi-channel audio spectrograms, jointly estimating slip presence, direction, and magnitude in the grasp plane.
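
To ground the audio front end, here is a minimal sketch of turning synchronized multi-channel waveforms into stacked log-mel spectrograms with torchaudio. The sample rate, FFT size, hop length, and mel-bin count are illustrative assumptions, not the values used by A-SLIP.

import torch
import torchaudio

SAMPLE_RATE = 48_000   # assumed piezo-microphone sampling rate
N_MICS = 4             # four-microphone configuration from the results

mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=SAMPLE_RATE,
    n_fft=1024,
    hop_length=256,
    n_mels=64,
)
to_db = torchaudio.transforms.AmplitudeToDB()

def logmel_features(audio: torch.Tensor) -> torch.Tensor:
    """(n_mics, n_samples) synchronized waveforms ->
    (n_mics, n_mels, n_frames) log-mel spectrograms."""
    return to_db(mel(audio))

# Example: one 0.5 s window of 4-channel audio.
window = torch.randn(N_MICS, SAMPLE_RATE // 2)
feats = logmel_features(window)  # shape: (4, 64, ~94)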

Abstract

Reliable in-hand manipulation requires accurate real-time estimation of slip between a gripper and a grasped object. A-SLIP introduces a multi-channel acoustic sensing system integrated into a parallel-jaw gripper for continuous slip estimation in the grasp plane. The sensor uses piezoelectric microphones behind a textured silicone contact pad to capture structured contact-induced vibrations. A lightweight convolutional model processes synchronized multi-channel log-mel spectrograms and jointly predicts slip presence, direction, and magnitude. Across robot-induced and externally induced slip experiments, A-SLIP achieves strong directional accuracy, improves slip detection over prior baselines, and demonstrates robust real-time performance in closed-loop reactive control.

Sensor Design

Figure: A-SLIP sensor design and hardware setup.

The A-SLIP sensor consists of piezoelectric microphones embedded behind a textured silicone contact pad in a parallel-jaw gripper. The textured surface promotes structured vibrations during slip, while the piezoelectric microphones capture broadband acoustic signals with minimal footprint.
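
As a hedged illustration of how such signals might be acquired, the sketch below captures synchronized multi-channel audio with the sounddevice library, assuming the piezoelectric microphones are exposed as one multi-channel input device (e.g., through a USB audio interface). The device configuration, sample rate, and block size are assumptions.

import queue
import numpy as np
import sounddevice as sd

SAMPLE_RATE = 48_000
N_MICS = 4
BLOCK = 1024  # samples delivered per callback

audio_blocks: "queue.Queue[np.ndarray]" = queue.Queue()

def on_audio(indata, frames, time, status):
    if status:
        print(status)                  # report over/underruns, if any
    audio_blocks.put(indata.copy().T)  # (frames, channels) -> (n_mics, frames)

with sd.InputStream(samplerate=SAMPLE_RATE, channels=N_MICS,
                    blocksize=BLOCK, callback=on_audio):
    # Accumulate roughly 0.5 s of synchronized audio for one inference window.
    chunks = [audio_blocks.get() for _ in range(SAMPLE_RATE // 2 // BLOCK)]
    window = np.concatenate(chunks, axis=1)  # (4, ~24000)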

Design Variations

Figure: design variations in microphone placement and contact surface texture.

Contributions

  • A low-profile, low-cost acoustic gripper sensor with textured silicone contact surfaces and embedded piezoelectric microphones.
  • A unified learning objective that jointly estimates slip presence, magnitude, and direction from synchronized multi-channel audio (one possible form is sketched after this list).
  • A two-stage pretraining and finetuning pipeline for robust transfer to real robot slip estimation and reactive control settings.
  • Extensive real-world experiments on continuous in-hand slip estimation and closed-loop manipulation tasks.
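
One plausible form of the unified objective named above combines binary cross-entropy on slip presence, a cosine loss on direction parameterized as a unit (sin, cos) vector, and an L1 loss on magnitude, with the direction and magnitude terms masked to slipping samples. The loss forms, the masking, and the weights w_dir and w_mag below are assumptions, not the paper's exact recipe.

import torch
import torch.nn.functional as F

def slip_loss(presence_logit, dir_vec, magnitude,
              slip_label, dir_label, mag_label,
              w_dir=1.0, w_mag=1.0):
    # presence_logit: (B,); dir_vec: (B, 2) unnormalized (sin, cos);
    # magnitude: (B,); slip_label: (B,) floats in {0, 1}; dir_label in radians.
    loss_p = F.binary_cross_entropy_with_logits(presence_logit, slip_label)
    dir_unit = F.normalize(dir_vec, dim=-1)
    target = torch.stack([torch.sin(dir_label), torch.cos(dir_label)], dim=-1)
    loss_d = 1.0 - (dir_unit * target).sum(dim=-1)  # per-sample cosine loss
    loss_m = F.l1_loss(magnitude, mag_label, reduction="none")
    mask = slip_label                 # supervise direction/magnitude only when slipping
    denom = mask.sum().clamp(min=1.0)
    return (loss_p
            + w_dir * (mask * loss_d).sum() / denom
            + w_mag * (mask * loss_m).sum() / denom)

Parameterizing direction as a unit (sin, cos) vector sidesteps the 0/2π discontinuity that regressing a raw angle would introduce.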

Method

Figure: A-SLIP model architecture.

A-SLIP uses a lightweight convolutional network that processes synchronized multi-channel log-mel spectrograms from the piezoelectric microphones. The model employs channel and temporal attention mechanisms to jointly predict slip presence, direction, and magnitude through a unified multi-objective learning framework.
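
The sketch below shows one compact realization of such a network in PyTorch: a small CNN over per-microphone log-mel spectrograms stacked as input channels, squeeze-excitation-style channel attention, softmax temporal-attention pooling, and three output heads. Layer sizes and the specific attention forms are illustrative assumptions, not the paper's exact architecture.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):          # squeeze-and-excitation style
    def __init__(self, ch, r=4):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(ch, ch // r), nn.ReLU(),
                                nn.Linear(ch // r, ch), nn.Sigmoid())
    def forward(self, x):                   # x: (B, C, F, T)
        w = self.fc(x.mean(dim=(2, 3)))     # global pool -> channel weights
        return x * w[:, :, None, None]

class ASlipNet(nn.Module):
    def __init__(self, n_mics=4, n_mels=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(n_mics, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 1)),
        )
        self.chan_attn = ChannelAttention(64)
        self.temp_attn = nn.Linear(64, 1)        # scores for temporal pooling
        self.head_presence = nn.Linear(64, 1)    # slip presence logit
        self.head_direction = nn.Linear(64, 2)   # direction as (sin, cos)
        self.head_magnitude = nn.Linear(64, 1)   # slip magnitude

    def forward(self, x):                   # x: (B, n_mics, n_mels, T)
        h = self.chan_attn(self.conv(x))    # (B, 64, F', T')
        h = h.mean(dim=2).transpose(1, 2)   # pool frequency -> (B, T', 64)
        a = torch.softmax(self.temp_attn(h), dim=1)  # temporal attention
        z = (a * h).sum(dim=1)              # attention-weighted summary, (B, 64)
        return (self.head_presence(z).squeeze(-1),
                self.head_direction(z),
                self.head_magnitude(z).squeeze(-1))

# Example forward pass on a batch of spectrogram windows.
logit, dvec, mag = ASlipNet()(torch.randn(8, 4, 64, 94))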

Training Strategy

Figure: pretraining data, finetuning data, and training data distribution.

Evaluation & Results

Figure: evaluation tasks and prediction results.

Closed-Loop Control Tasks

Task 1: Slip Detection & Stop
Task 2: Slip Tracking Control
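
As a rough illustration of the Task 1 behavior, the loop below halts arm motion and tightens the grasp whenever the predicted slip probability crosses a threshold. The gripper and robot interfaces (get_width, set_width, stop_motion), the predict wrapper around the trained model, and all thresholds are hypothetical placeholders, not A-SLIP's actual controller.

import time

SLIP_THRESHOLD = 0.5     # assumed probability threshold
SQUEEZE_STEP_M = 0.001   # assumed grip tightening per detection (1 mm)

def reactive_stop_loop(gripper, robot, predict, get_window, rate_hz=50):
    period = 1.0 / rate_hz
    while True:
        p_slip, direction, magnitude = predict(get_window())
        if p_slip > SLIP_THRESHOLD:
            robot.stop_motion()  # halt the arm while the grasp is secured
            gripper.set_width(gripper.get_width() - SQUEEZE_STEP_M)
        time.sleep(period)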

Figure: prediction examples.

Key Results

  • Finetuned 4-microphone A-SLIP reaches a mean absolute directional error of 14.1° (see the wraparound-aware metric sketch after this list).
  • Improves slip detection accuracy by up to 12% relative to baselines.
  • Reduces directional error by 32% over baseline methods.
  • Compared to single-microphone designs, multi-channel sensing reduces directional error by 64% and magnitude error by 68%.
  • Demonstrates reliable behavior in closed-loop reactive control tasks under real robot operating conditions.
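
Directional error on a circle has to be computed with wraparound so that, for example, predictions of 359° and 1° differ by 2° rather than 358°. The function below sketches the standard form of such a mean absolute directional error; it is not necessarily the paper's evaluation code.

import numpy as np

def mean_abs_directional_error_deg(pred_rad, true_rad):
    # np.angle(exp(i*d)) wraps the difference d into (-pi, pi].
    diff = np.angle(np.exp(1j * (np.asarray(pred_rad) - np.asarray(true_rad))))
    return np.degrees(np.abs(diff)).mean()

# Example: predictions about 2 degrees off on average, despite wraparound.
print(mean_abs_directional_error_deg([0.035, 6.248], [0.0, 0.0]))  # ~2.0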

Citation

@inproceedings{a_slip_2026,
  title={A-SLIP: Acoustic Sensing for Continuous In-hand Slip Estimation},
  author={Yoo, Uksang and Mao, Yuemin and Oh, Jean and Ichnowski, Jeffrey},
  year={2026},
  note={Under review. * indicates equal contribution}
}