dancing with my shadow: April 2013

Tuesday, 23 April 2013

mmread.m

mmread.m is a MATLAB function for reading a variety of media files on different platforms (Linux, Windows, and Mac). It is used in the sample code of the 2011/2012 One-shot-learning gesture recognition challenge.

When I first ran example.m from the sample code, it asked me to install libavbin.so, and I clicked yes. However, I got a segmentation fault after the installation. I also tried installing libavbin.so from the version available at https://code.google.com/p/avbin/, but this caused an error in FFGrab.mexa64 (undefined symbol: avbin_open_filename_with_format). The author noted that he had fixed this bug, so I downloaded mmread from the MATLAB File Exchange and replaced the copy in the sample code. Now mmread works.

Tuesday, 16 April 2013

A Unified Framework for Concurrent Usage of Hand Gesture, Shape and Pose

Natural interaction with hands

hand trajectory

HMM based continuous recognition
not really natural

hand shape

Use depth image directly / label / recognition

articulated hand pose

Hand skeleton extraction


One-shot gesture recognition

Data
User dependent: the gestures in one batch are performed by a single user. Each batch contains a single labeled training example of each gesture in the vocabulary. The goal of the challenge is, for each batch, to train a system on the training examples and to predict the labels of the test examples. The test labels in the validation batches are withheld. Additional batches finalXX will be provided for final testing.
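To make the per-batch protocol concrete, here is a minimal sketch of one-shot classification: one stored example per gesture, and each test example gets the label of its nearest training example. The feature vectors, the Euclidean distance, and the names `predict_batch`, `batch` and `tests` are all my own assumptions for illustration. It also assumes each test example holds a single gesture that is already reduced to a fixed-length vector. This is not the challenge's sample code.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_batch(train_examples, test_examples):
    """Label each test example with its nearest training example.

    train_examples: dict mapping gesture label -> feature vector
                    (exactly one example per gesture, i.e. one-shot).
    test_examples:  list of feature vectors.
    """
    predictions = []
    for features in test_examples:
        best_label = min(
            train_examples,
            key=lambda label: euclidean(train_examples[label], features),
        )
        predictions.append(best_label)
    return predictions


# Hypothetical batch: two gestures, one training example each.
batch = {"wave": [0.0, 1.0], "point": [1.0, 0.0]}
tests = [[0.1, 0.9], [0.8, 0.2]]
print(predict_batch(batch, tests))  # ['wave', 'point']
```

A system like this is retrained from scratch for every batch, since each batch has its own user and its own training examples.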

Overall analysis
https://docs.google.com/file/d/0B08QS7nJpK7mX3pFalVzckxoU2M/edit

Winners' methods
1st place
https://docs.google.com/file/d/0B4jW8HPqnNiuU2RiQWl6TnpfQzQ/edit

Initial image preprocessing

Used depth information only
Identify outliers (pixels returned as 0 by the Kinect) and remove them (simple, fast wavelet reconstruction)
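A rough sketch of the outlier step, for my own reference. Note that this swaps in a simple neighbour-median fill instead of the wavelet reconstruction the winner describes, so it only illustrates the idea of detecting zero-valued pixels and replacing them. The function name `fill_missing_depth` and the list-of-rows image layout are assumptions.

```python
from statistics import median


def fill_missing_depth(depth):
    """Return a copy of a 2-D depth image (list of rows) in which pixels
    equal to 0 are replaced by the median of their valid 8-connected
    neighbours. This is a stand-in for wavelet reconstruction."""
    rows, cols = len(depth), len(depth[0])
    filled = [row[:] for row in depth]
    for r in range(rows):
        for c in range(cols):
            if depth[r][c] != 0:
                continue  # valid measurement, keep it
            neighbours = [
                depth[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if depth[rr][cc] != 0
            ]
            if neighbours:  # stays 0 when no valid neighbour exists
                filled[r][c] = median(neighbours)
    return filled


# Hypothetical 3x3 depth patch with one missing reading in the centre.
patch = [[5, 5, 5],
         [5, 0, 5],
         [5, 5, 5]]
cleaned = fill_missing_depth(patch)
```

Neighbours are always read from the original image, so a filled value never feeds into another fill; large holes would need several passes or a proper reconstruction method.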

Representation of visual features

Mimic behavioral and neural mechanisms underlying visual processing
Select features of interest (emphasizing gestures that move close to the camera)
Feature/background separation
Encode the features' time-varying shape and trajectory
Similarity measure (robust to variability in feature selection or location)

Gesture recognition algorithm

General Bayesian network model, similar to those in the speech recognition literature
Can perform simultaneous recognition and segmentation
Compute similarities between each input video frame and the sample gesture video frames
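To see how frame-to-frame similarities can give recognition and segmentation at the same time, here is a small Viterbi-style dynamic program of the kind common in speech recognition. It is my own simplification, not the winner's Bayesian network: the negative squared distance as log-similarity, the stay/advance/jump transitions, and the names `similarity` and `recognize` are all assumptions. It expects at least as many input frames as the shortest template has.

```python
def similarity(a, b):
    """Log-similarity of two frame feature vectors (higher is more similar)."""
    return -sum((x - y) ** 2 for x, y in zip(a, b))


def recognize(templates, frames):
    """Decode the gesture sequence in `frames`.

    templates: dict label -> list of frame feature vectors (one sample video).
    frames:    list of frame feature vectors from the input video.
    A state is (label, template frame index). From each state the path may
    stay, advance one template frame, or, from a template's last frame,
    jump to the first frame of any template (a segment boundary).
    """
    states = [(g, i) for g, t in templates.items() for i in range(len(t))]
    ends = [(g, len(t) - 1) for g, t in templates.items()]
    score = {s: float("-inf") for s in states}
    for g in templates:  # the video must start at the start of a gesture
        score[(g, 0)] = similarity(frames[0], templates[g][0])
    back = []
    for frame in frames[1:]:
        best_end = max(ends, key=lambda s: score[s])
        new_score, pointers = {}, {}
        for g, i in states:
            candidates = [(g, i), (g, i - 1) if i > 0 else best_end]
            prev = max(candidates, key=lambda s: score[s])
            new_score[(g, i)] = score[prev] + similarity(frame, templates[g][i])
            pointers[(g, i)] = prev
        back.append(pointers)
        score = new_score
    # The video must finish at the end of a gesture; trace the path back.
    state = max(ends, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    labels = [path[0][0]]
    for prev, cur in zip(path, path[1:]):
        if cur[1] == 0 and prev != cur:  # entered a new gesture
            labels.append(cur[0])
    return labels


# Hypothetical 1-D features: gesture A rises, gesture B falls.
templates = {"A": [[0], [1], [2]], "B": [[5], [4], [3]]}
print(recognize(templates, [[0], [1], [2], [5], [4], [3]]))  # ['A', 'B']
```

The places where the decoded path jumps to a template's first frame are the segment boundaries, so the labels and the segmentation come out of the same pass.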

2nd place
