Chapter 11: Max MSP Jitter, Kinect & Bodysynth

Artists have been defining, hacking, and creating interfaces since the beginning of musical instrument design, and there is a strong international movement in acoustic and experimental instrument design. As with acoustic instruments, artists also experiment with high-tech materials and processes, helping to define and redefine the interface. The artist TradeMark Gunderson is known for hacking Wiimote controllers with his Thimbletron.

TradeMark Gunderson, a musician and founder of the band The Evolution Control Committee, is a culture jammer, equipment designer, and software designer known for his copyright-challenging stance, using found sounds to construct sonic and culture-jamming mashups.

TradeMark Gunderson with the Thimbletron system for composing live digital sound
Later in this chapter, TradeMark shares some Max/MSP code for making a Kinect-based theremin.

The artist and musician Pamela Z has performed using a BodySynth™, a MIDI controller that uses electrode sensors to translate her physical movements into sound and image installations. Pamela Z is a composer/performer and audio artist who works with voice, electronic processing, and sampled sound. Her performances are layered with sound that is at times operatic bel canto, mixed dynamically with experimental vocals.



Pamela Z by Lori Eanes, 1998.


She uses digital delay devices with found percussion sounds, along with digitally sampled sounds triggered by the BodySynth, which takes analog signals from electrode sensors worn on her skin and converts them into MIDI messages that trigger the sounds.


Pamela Z. Photograph by Donald Swearingen, 2003.


The BodySynth was designed by Ed Severinghaus and allows gesture and performative action to trigger and combine Pamela Z's sound fusions, resulting in spiritually charged and organically activated performance installations.

Kinect with the Macintosh

There are also wonderful ways to work with Max/MSP to create generative sound. One of the masters in this regard is Karlheinz Essl, a composer and artist of real-time composition who has created a significant software library for algorithmic composition.

Professor Essl presenting his ideas and software to a class.

His patches and externals for Max/MSP offer the possibility to experiment with serial procedures, permutations, and controlled randomness. Most of his patches focus on straightforward processing of data, and using these objects makes this kind of composition more fun. Professor Essl has provided many functions for algorithmic composition, so artists, composers, and sound hackers can concentrate on play in the making of a composition rather than on more complex programming tasks.
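The flavor of these serial procedures can be sketched outside Max/MSP as well. The following Python sketch is illustrative only (it is not part of Essl's library): it shows an endlessly re-shuffled pitch series, a classic serial procedure, and a bounded random walk as one form of controlled randomness.

```python
import random

def shuffled_series(pitches, rng=random.Random(0)):
    """Serial procedure: yield endless cycles of a pitch series, each
    cycle playing every pitch exactly once in a freshly shuffled order."""
    while True:
        cycle = list(pitches)
        rng.shuffle(cycle)
        yield from cycle

def brownian(start, step, low, high, rng=random.Random(0)):
    """Controlled randomness: a random walk whose jumps are bounded
    by `step` and whose values are clamped to [low, high]."""
    value = start
    while True:
        value = min(high, max(low, value + rng.uniform(-step, step)))
        yield value
```

Feeding generators like these into MIDI or a synthesis engine gives the "play over programming" feel described above: the rules are fixed, but every run sounds different.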


TradeMark Gunderson recently completed a Kinect turntable hack, documented on video. Below is the work at the Extract Art & Technology Exhibition, 2013.

Dean Shanda at OSU plays with Trademark Gunderson's Object Turntable at its premiere. Photo Amy Youngs

Below is a hacked software instrument, the Kinect Theremin, created by TradeMark Gunderson using the Xbox Kinect sensor.

Russian inventor Léon Theremin created one of the first electronic musical instruments in 1919; the theremin is played by hovering one's hands near two antennas without touching them.

TradeMark's Kinect Theremin modernizes the concept using the Microsoft Xbox Kinect sensor, a video game controller which can track a user's movements in a three-dimensional space.


Like the original theremin, the Kinect Theremin is played by hovering one's hands to make music, and both the Kinect sensor and the original theremin offer control and accuracy. With the Kinect Theremin, each hand controls a different pitch. Raising a hand higher or lower makes the sound's volume higher or lower. Waving a hand left or right from the body changes the sound's pitch/frequency. Reaching a hand forward or back also adds an effect to the sound.

Finally, the Kinect allows the "power switch" that starts and stops everything to be placed on one's elbow -- touch an elbow and the sound stops; touch it again and the sound restarts.

The Kinect Theremin operates via joint tracking, where the various joints of the body are followed through three-dimensional space. Locations for left and right hands, elbows, feet, and so on are detected by the Kinect sensor and reported back as X, Y, and Z coordinates. By assigning each axis (X, Y, or Z) to control a different aspect of the sound (volume, pitch, or effect), a simple but expressive virtual musical instrument is created.
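That axis-to-parameter mapping can be sketched as follows. This is illustrative Python, not the actual Max/MSP patch, and the coordinate ranges (hand position in metres relative to the torso) are assumptions.

```python
def scale(value, in_low, in_high, out_low, out_high):
    """Linearly rescale `value` from [in_low, in_high] to
    [out_low, out_high], clamping to the output range."""
    t = (value - in_low) / (in_high - in_low)
    t = min(1.0, max(0.0, t))
    return out_low + t * (out_high - out_low)

def hand_to_sound(x, y, z):
    """Map one hand's position (assumed ranges: x, y in [-0.8, 0.8] m,
    z in [-0.6, 0.6] m, relative to the torso) to sound parameters."""
    pitch_hz = scale(x, -0.8, 0.8, 110.0, 880.0)  # left/right -> pitch
    volume   = scale(y, -0.8, 0.8, 0.0, 1.0)      # lower/higher -> volume
    effect   = scale(z, -0.6, 0.6, 0.0, 1.0)      # back/forward -> effect
    return pitch_hz, volume, effect
```

In the real patch the same kind of scaling is done per frame on the joint data reported by jit.openni, with the torso position subtracted first so the center of the body is the zero point.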

At the time of development, Kinect hacking was in its infancy, so the project began with a joint-tracking system called Synapse but was eventually upgraded to jit.openni, a solution native to the Max/MSP programming environment. Synapse has the advantages of easy installation and flexibility of use beyond Max/MSP; jit.openni is a Max/MSP external, meaning it can be used only within the Max/MSP environment, but it ultimately proved more reliable than Synapse.

jit.openni also requires the user to correctly install backend software first.

As of February 2013, install the following software in this order:

1) OpenNI framework

2) NITE middleware

3) SensorKinect drivers



The patch begins by initializing jit.openni with various messages ("skeleton_format 1", etc.).  

If it finds the Kinect and all is well, a few seconds after the patch loads you should see images from the Kinect appear in the three windows just beneath the jit.openni object.

Once a human figure ("user") is identified, they become one of the indicated "active users". Normally only one user makes sound at a time unless the "click for all users" switch is active.

Joint data is routed for both hands (l_hand and r_hand), the torso, and both elbows (l_elbow and r_elbow).

Indicator windows for the left and right hands show the hand's position in reference to the person's torso or body, i.e. the center of the body is the "zero point" where there is no sound.

The data is then scaled appropriately to control pitch, volume, and modulation settings, which are finally synthesized into sound.

The entire sound stops and starts when an elbow is touched -- or, more specifically, when the calculated distance between one elbow and the opposite hand is small enough that they must have touched.
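That elbow "power switch" logic can be sketched as follows. This is illustrative Python, not the original patch, and the 8 cm touch threshold is an assumption.

```python
import math

def touched(joint_a, joint_b, threshold=0.08):
    """True when the 3-D distance between two joints (in metres) is
    below `threshold` -- close enough that they must have touched."""
    return math.dist(joint_a, joint_b) < threshold

class PowerSwitch:
    """Toggle the sound on/off each time hand and elbow come together,
    firing only once per touch (rising-edge detection)."""
    def __init__(self):
        self.on = True
        self._touching = False

    def update(self, hand, elbow):
        now = touched(hand, elbow)
        if now and not self._touching:  # a new touch, not a held one
            self.on = not self.on
        self._touching = now
        return self.on
```

The edge detection matters: without it, holding a hand against the elbow for a few frames would rapidly toggle the sound on and off.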