Thursday, 23 May 2013

Thursday, 16 May 2013

Namespaces in C# Visual Studio development

For several namespaces used in C# development with Visual Studio, it is not obvious which assembly you need to reference.

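The types in System.Windows are split across the WindowsBase, PresentationCore and PresentationFramework assemblies. System.Windows.Media is mostly in the PresentationCore assembly, while System.Windows.Controls and System.Windows.Shapes are mainly in the PresentationFramework assembly.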

Wednesday, 15 May 2013

Vim on Windows

On Linux, all Vim plugin and autoload files go in the ~/.vim folder. On Windows, they should go in the ~/vimfiles folder instead. For example, to use pathogen, put the pathogen.vim file in ~/vimfiles/autoload; otherwise Vim will not be able to find it.

Tuesday, 14 May 2013

Running STIP feature detection on Ubuntu 12.04

Ivan Laptev's code for space-time interest point (STIP) detection can be downloaded here. To run it on Ubuntu 12.04, you need to install several dependencies.

First, install OpenCV. The STIP code is compiled against an older version of OpenCV, and newer versions use different library names, so you need to create symbolic links that point from the old library names to the new ones. The stackoverflow post lists all the symbolic links you need to make.
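The linking step can be scripted. The following is a minimal sketch, not the exact procedure from the post: the library names in the demo are hypothetical placeholders, and the real old/new name pairs have to be taken from the stackoverflow post. The helper `link_old_names` is my own name for illustration.

```python
import os
import tempfile


def link_old_names(lib_dir, name_map):
    """Create old_name -> new_name symlinks inside lib_dir.

    name_map maps the old library file name (expected by the binary)
    to the new file name (actually installed). Returns the links created.
    """
    created = []
    for old_name, new_name in name_map.items():
        target = os.path.join(lib_dir, new_name)
        link = os.path.join(lib_dir, old_name)
        if not os.path.exists(target):
            continue  # the new library is not installed here, skip it
        if os.path.lexists(link):
            continue  # never overwrite an existing file or link
        # Relative link: both names live in the same directory.
        os.symlink(new_name, link)
        created.append(link)
    return created


if __name__ == "__main__":
    # Hypothetical names, used only to demonstrate the helper in a temp dir.
    with tempfile.TemporaryDirectory() as demo_dir:
        open(os.path.join(demo_dir, "libexample_new.so.2"), "w").close()
        print(link_old_names(demo_dir, {"libexample_old.so.1": "libexample_new.so.2"}))
```

On a real system the library directory is usually owned by root, so the script would need to run with sufficient privileges.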

To make video reading work with OpenCV, you need to compile OpenCV with libgtk2.0-dev and pkg-config installed. You also need libgstreamer0.10-dev and libgstreamer0.10-vaapi-dev. The libgstreamer0.10-vaapi-dev package provides headers such as gst/video/video.h, which are required to compile OpenCV.

Tuesday, 23 April 2013

mmread.m

mmread.m is a Matlab function for reading different media files on different platforms (Linux, Windows and Mac). It is used in the sample code of the 2011/2012 One-shot-learning gesture recognition challenge.

When I first ran example.m in the sample code, it asked me to install libavbin.so and I clicked yes. However, I got a segmentation fault after the installation. I also tried installing libavbin.so from the version downloaded from https://code.google.com/p/avbin/, but this caused an error in FFGrab.mexa64 (undefined symbol: avbin_open_filename_with_format). The author noted that he had fixed this bug, so I downloaded mmread from the Matlab file exchange and used it to replace the copy in the sample code. Now mmread works.


Tuesday, 16 April 2013

A Unified Framework for Concurrent Usage of Hand Gesture, Shape and Pose

Natural interaction with hands

  • hand trajectory
    • HMM based continuous recognition
    • not really natural
  • hand shape
    • Use depth image directly / label / recognition
  • articulated hand pose
    • Hand skeleton extraction



    One-shot gesture recognition

    Data
    User dependent: the gestures in one batch are performed by a single user. There is a single labeled training example of each gesture of the vocabulary in a given batch. The goal of the challenge is, for each batch, to train a system on the training examples and to predict the labels of the test examples. The test labels in the validation batches are withheld. Additional batches finalXX will be provided for final testing.

    Overall analysis
    https://docs.google.com/file/d/0B08QS7nJpK7mX3pFalVzckxoU2M/edit

    Winners' methods
    1st place
    https://docs.google.com/file/d/0B4jW8HPqnNiuU2RiQWl6TnpfQzQ/edit

    Initial image preprocessing

    • used depth information only
    • Identify outliers (pixels that the Kinect returns as 0) and remove them (simple/fast wavelet reconstruction)
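To make the outlier step concrete, here is a minimal sketch of filling zero-valued depth pixels. It does not use the wavelet reconstruction mentioned above; as a simpler stand-in, it replaces each zero pixel with the median of its non-zero 8-neighbours. The function name `fill_zero_depth` and the list-of-lists depth map are assumptions made for illustration.

```python
from statistics import median


def fill_zero_depth(depth):
    """Return a copy of a 2D depth map with 0-valued pixels filled in.

    Each zero pixel is replaced by the median of its non-zero neighbours
    in the 3x3 window around it. Pixels with no valid neighbour stay 0.
    """
    rows, cols = len(depth), len(depth[0])
    filled = [row[:] for row in depth]
    for r in range(rows):
        for c in range(cols):
            if depth[r][c] != 0:
                continue  # valid measurement, keep as-is
            # Read from the original map so filled values do not propagate.
            neighbours = [
                depth[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if depth[rr][cc] != 0
            ]
            if neighbours:
                filled[r][c] = median(neighbours)
    return filled
```

A single pass like this only repairs isolated holes; larger missing regions would need repeated passes or a proper reconstruction method.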

    Representation of visual features

    • Mimic behavioral and neural mechanisms underlying visual processing
    • Select features of interest (emphasize gestures that move close to the camera)
    • Feature/background separation
    • Encode the features' time-varying shape and trajectory
    • Similarity measure (robust to variability in features selection or location)

    Gesture recognition algorithm

    • General Bayesian network model similar to those in the speech recognition literature
    • Can perform simultaneous recognition and segmentation
    • Compute similarities between each input video frame and the sample gesture video frames
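The frame-to-frame similarity step can be sketched as follows. This is only an illustration under stated assumptions: each frame is represented by a plain feature vector, cosine similarity stands in for the winner's similarity measure (which these notes do not specify), and the Bayesian network that does the actual recognition and segmentation is not modeled. All function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def frame_similarities(input_frames, sample_frames):
    """Similarity matrix: one row per input frame, one column per sample frame."""
    return [[cosine(f, s) for s in sample_frames] for f in input_frames]


def best_label_per_frame(input_frames, samples):
    """For each input frame, pick the gesture whose sample frames match best.

    samples maps a gesture label to the list of frame feature vectors
    of its single training example.
    """
    labels = []
    for frame in input_frames:
        best = max(
            samples,
            key=lambda label: max(cosine(frame, s) for s in samples[label]),
        )
        labels.append(best)
    return labels
```

Runs of identical per-frame labels give a crude segmentation of the input video; the model described above would instead infer labels and boundaries jointly.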