Image2MIDI is a MIDI instrument and algorithmic composition tool that uses visual forms to generate musical scores.


This project is one of my first efforts to cross the visual/audible divide. My intention with this software is not to create abstract, noisy sounds from visual data, but to find a way to transform geometries, patterns, and visible structures into useful, musical output. Applications that take in an audio signal and generate a visual output are ubiquitous, thanks in part to the ease of mapping sound onto a Cartesian plane with a fast Fourier transform. And yet there are surprisingly few applications that take in visual data and produce a sonic, let alone musical, response.

Image2MIDI draws inspiration from the stochastic synthesis techniques of Iannis Xenakis’ UPIC system, the possibility of musicalizing visual data, and the Aphex Twin face, which was probably the first attempt to embed visual data into popular music. Currently there are a handful of programs that try to bridge the visual-audio divide: AudioSculpt, HighC, and Hyperscore, to name a few. However, these programs generally have an abstract, representational interface that lets you draw a score rather than generate one. Image2MIDI departs from these methods and seeks a closer connection between the actual visible world (as opposed to an interface) and music. That said, I intend to include the ability to draw scores in future versions of Image2MIDI.

How Image2MIDI works

Image2MIDI is a software MIDI instrument and composition tool that processes and sonifies digital images. To use it, you import a digital image and apply processing to generate a binary black-and-white image that can be transcoded into a MIDI score. Image2MIDI uses Jean-Marc Pelletier’s cv.jit computer vision library to detect edges in imagery and generate outlines. The result is a matrix of true/false pixel values that can be scaled to span every note on a keyboard. By default, images are scaled to 320 x 240 pixels, but the user has final control over image dimensions and MIDI range. The image is scanned one horizontal row at a time, from top to bottom, over a duration set by the user. The rules are simple: if a pixel in the scanned row is black, the corresponding note on the virtual keyboard is played; if the pixel is white, the corresponding note is not played. It is also possible to play the pixels of each row as a random sequence or arpeggio, rather than all at once, resulting in a “MIDI image” of single notes rather than chords. The note-on and note-off data can be sent in real time to a MIDI instrument, or to a DAW such as Ableton Live or Logic to record a MIDI score. The application is rather basic in principle, but highly useful as a means of producing visual music. Once an image is exported to MIDI, it can be carefully shaped into an algorithmic composition.
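The scanning rules above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Max patch itself: the function names, the left-to-right mapping of columns onto low-to-high notes, and the default note range of 21 to 108 (an 88-key piano) are all my assumptions for the example.

```python
import random
from typing import List, Optional, Tuple

# A binary image as rows of booleans; True means a black pixel (note on).
Image = List[List[bool]]


def column_to_note(x: int, width: int, low: int = 21, high: int = 108) -> int:
    """Map pixel column x (0..width-1) linearly onto MIDI notes low..high.

    Assumption: the leftmost column is the lowest note.
    """
    if width <= 1:
        return low
    return low + round(x * (high - low) / (width - 1))


def scan_image(image: Image, duration_s: float,
               low: int = 21, high: int = 108) -> List[Tuple[float, List[int]]]:
    """Scan rows top to bottom; return one (start_time, chord) pair per row."""
    rows = len(image)
    step = duration_s / rows if rows else 0.0
    events = []
    for y, row in enumerate(image):
        # Several columns can land on the same note when the image is wider
        # than the note range, so collect notes in a set.
        notes = sorted({column_to_note(x, len(row), low, high)
                        for x, black in enumerate(row) if black})
        events.append((y * step, notes))
    return events


def arpeggiate(events: List[Tuple[float, List[int]]], row_duration_s: float,
               rng: Optional[random.Random] = None) -> List[Tuple[float, int]]:
    """Spread each row's chord into single notes across the row's time slot.

    Pass a random.Random instance to play each row in random order.
    """
    single_notes = []
    for start, notes in events:
        order = list(notes)
        if rng is not None:
            rng.shuffle(order)
        for i, note in enumerate(order):
            single_notes.append((start + i * row_duration_s / len(order), note))
    return single_notes
```

Calling scan_image on a 320 x 240 matrix with a 60-second duration would give 240 chords a quarter of a second apart; passing that list to arpeggiate with row_duration_s=0.25 turns each chord into a run of single notes.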

The process of scanning pure geometric shapes and complex forms can result in surprisingly “organized” chords and musical sequences that reveal the structure of imagery. The resulting MIDI data is highly malleable; it may be quantized or attenuated to produce harmony or discord. The sonifications produced by Image2MIDI can resemble the impossible player piano pieces of Conlon Nancarrow or the dense “Black MIDI” renditions of Japanese video game soundtracks, which use as many notes as possible.

Depending on image processing parameters, Image2MIDI can generate some awesome sonic bursts, evolving textures, or beautifully rich chords, all for the same image. When scanning images of architecture or geometric shapes, I’m always impressed by the form of the chords and sequences produced.

One current drawback of Image2MIDI is that it can only read binary, black-and-white images. In future versions of this application, I would like to incorporate each pixel’s RGBA value (color) to create more complex MIDI notes. For example, a note from a purely red pixel might have a quicker attack, while a note from a blue pixel might have a longer decay.
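As a rough sketch of what that color mapping could look like (purely hypothetical, since this feature does not exist yet): the function name, the millisecond ranges, and the choice to ignore the green and alpha channels are all placeholders I made up for the example.

```python
from typing import Tuple


def envelope_from_rgba(rgba: Tuple[int, int, int, int],
                       attack_range_ms: Tuple[float, float] = (5.0, 500.0),
                       decay_range_ms: Tuple[float, float] = (50.0, 2000.0)
                       ) -> Tuple[float, float]:
    """Return (attack_ms, decay_ms) for a pixel with 0-255 channel values.

    Hypothetical mapping: more red gives a quicker attack, more blue gives a
    longer decay. Green and alpha are ignored in this sketch.
    """
    r, _g, b, _a = rgba
    fastest, slowest = attack_range_ms
    shortest, longest = decay_range_ms
    attack_ms = slowest - (r / 255) * (slowest - fastest)
    decay_ms = shortest + (b / 255) * (longest - shortest)
    return attack_ms, decay_ms
```

With these placeholder ranges, a pure red pixel gets the quickest attack and the shortest decay, while a pure blue pixel gets the slowest attack and the longest decay.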

If you want, you can listen to My Face.mid

Download Image2MIDI:

(OS X 10.4 or higher)

Max patch: Image2MIDI.maxpat
Requires Max/MSP/Jitter