XMM - Probabilistic Models for Motion Recognition and Mapping

C++ Library

The source code is available on GitHub: https://github.com/Ircam-RnD/xmm

Max/MuBu Implementation

The models are integrated into the MuBu environment for Cycling '74 Max, which provides a consistent framework for motion/sound feature extraction and pre-processing; interactive recording, editing, and annotation of training sets; and interactive sound synthesis. MuBu is freely available on Ircam's Forumnet.

Max is a visual programming environment dedicated to music and interactive media. We provide an implementation of our library as a set of Max externals and abstractions built around the MuBu collection of objects developed at Ircam. Training sets are built using MuBu, a generic container designed to store and process multimodal data such as audio, motion-tracking data, sound descriptors, and markers. Each training phrase is stored in a buffer of the container, and movement and sound parameters are recorded into separate tracks of each buffer. Markers can be used to specify regions of interest within the recorded examples. Phrases are labeled either through the markers or through an attribute of the buffer. This structure allows users to quickly record, modify, and annotate the training examples. Training sets are thus self-contained and can be used to train several models.
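
For readers who prefer a textual view, the sketch below mirrors this organization in plain C++: each phrase carries a label plus synchronized movement and sound frames, and a training set is simply a collection of such phrases. The struct and field names are purely illustrative and do not correspond to the MuBu or XMM APIs.

```cpp
// Purely illustrative sketch (not the MuBu or XMM API): a plain-C++ view of
// how a training set of labeled, multimodal phrases is organized.
#include <string>
#include <vector>

// One recorded example: synchronized movement and sound frames plus a label.
struct Phrase {
    std::string label;                         // from a marker or a buffer attribute
    std::vector<std::vector<float>> movement;  // one movement feature vector per frame
    std::vector<std::vector<float>> sound;     // sound parameters at the same frame rate
};

// A training set is a collection of labeled phrases; the same set can be used
// to train several models.
struct TrainingSet {
    std::vector<Phrase> phrases;
};

int main() {
    TrainingSet set;
    Phrase example;
    example.label = "gesture_A";
    example.movement.push_back({0.1f, 0.2f, 0.3f});  // e.g. a 3-D accelerometer frame
    example.sound.push_back({440.0f, 0.8f});         // e.g. pitch and loudness
    set.phrases.push_back(example);
    return 0;
}
```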

Each model can be instantiated as a Max object referring to a MuBu container that defines its training set. For training, the object connects to the container and transfers the training examples to its internal representation of phrases. The parameters of the model can be set manually as attributes of the object, such as the number of Gaussian components for a GMM or the number of states for an HMM. Training is performed in the background.
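
For reference, the same configure-and-train steps can be written directly against the C++ library. The sketch below follows the usage shown in the repository's documentation, but the header path, the attribute accessors (dimension, gaussians), and the method names are assumptions that may differ between library versions.

```cpp
// Hedged sketch against the C++ library: names and signatures are assumptions
// based on the library's documented example and may differ between versions.
#include <xmm/xmm.h>

int main() {
    // Build a small training set: one phrase of 3-D movement features.
    xmm::TrainingSet ts;
    ts.dimension.set(3);
    ts.addPhrase(0, "gesture_A");
    for (unsigned int t = 0; t < 100; ++t)
        ts.getPhrase(0)->record({0.01f * t, 0.f, 0.f});

    // Configure and train a GMM; the number of Gaussian components plays the
    // same role as the corresponding attribute of the Max object.
    xmm::GMM gmm;
    gmm.configuration.gaussians.set(10);
    gmm.train(&ts);
    return 0;
}
```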

During performance, each object processes an input stream of movement features and updates its results at the same rate. Movement models output the list of class likelihoods, complemented with the parameters estimated for each class, such as the time progression for a temporal model or the weight of each Gaussian component for a GMM. Multimodal models additionally output the most probable sound parameters estimated by the model, which can be used directly to drive sound synthesis.
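
As a rough picture of this performance loop expressed with the C++ library, each incoming frame is filtered and the per-class likelihoods are read back. The filter() call and the results fields below are assumptions based on the library's documented usage and may not match every version.

```cpp
// Hedged sketch of the run-time loop: method and field names are assumptions.
#include <iostream>
#include <vector>
#include <xmm/xmm.h>

// Process one frame of movement features and print the per-class likelihoods,
// updated at the rate of the incoming stream.
void processFrame(xmm::GMM& model, const std::vector<float>& frame) {
    model.filter(frame);
    for (double likelihood : model.results.instant_likelihoods)
        std::cout << likelihood << ' ';
    std::cout << '\n';
    // A multimodal (bimodal) model would additionally expose the estimated
    // sound parameters in its results, ready to drive synthesis.
}
```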

OpenFrameworks Addon

Coming Soon...