Python module for converting MIDI to tabular chord data. Developed for use with interval-based music visualization.

khmccurdy/midi_df

Midi DF

Midi DF is a Python module that converts MIDI data into pandas DataFrames describing each chord in a MIDI file. DataFrames can be exported as JSON for use in external applications. Chords are stored as lists of MIDI pitches. Further reading on "dyads" and "interval classes", as well as the visualization technique that inspired this project, can be found here.

Examples using exported JSON data: After Effects (using AE's Expressions system, based on JavaScript), Processing (p5.js) (realtime and interactive)

Required libraries are pandas (0.20.3), numpy (1.13.3), and midi (0.2.3).

Demo

In short: midi.read_midifile() --> [tempo_df() -->] midi_to_df() --> condense_df() --> [merge_tracks() -->] output_json().

Dependencies

import json
import pandas as pd
import numpy as np
import midi
import copy

import midi_df as mdf

Load a midi file using the "midi" library

kiev = midi.read_midifile("kiev.mid")

Print the summary table

kiev_s = mdf.track_summary(kiev)
kiev_s
Track Name Channel Note Events Tempo Track Instrument ID
0 The Great Gate of Kiev 0 tempo track {}
1 Staff 0 2790 {0}
2 Staff-1 1 2032 {0}

We can see that Track 0 is the tempo track, and that Tracks 1 and 2 carry note data on channels 0 and 1 respectively, both with instrument 0 (Acoustic Grand Piano in the General MIDI standard).

Get the tempo map

Since Track 0 is the default value of the ch parameter, we don't need to pass it to tempo_df().

kiev_tempo = mdf.tempo_df(kiev)
kiev_tempo.head(10)
    Tick    uSec/tick  Time (uSec)  Time (sec)
0      0  2403.843750            0    0.000000
1  16128  2604.166667     38769192   38.769192
2  22272  2403.843750     54769192   54.769192
3  35328  2314.812500     86153776   86.153776
4  46950  2332.088542    113056527  113.056527
5  47052  2349.619792    113294400  113.294400
6  47155  2367.421875    113536411  113.536411
7  47257  2385.494792    113777888  113.777888
8  47360  2403.843750    114023594  114.023594
9  47462  2422.479167    114268786  114.268786
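The Time columns appear to come from accumulating tick deltas multiplied by the uSec/tick rate of each tempo segment. For example, 16128 ticks at 2403.84375 uSec/tick gives the 38769192 uSec shown in row 1. The sketch below reproduces that arithmetic in plain Python. It is an illustration only: TEMPO_ROWS and tick_to_seconds are hypothetical names, not part of midi_df, and tempo_df() itself may compute its values differently.

```python
# Tempo map rows copied from the table above: (tick, microseconds per tick).
TEMPO_ROWS = [
    (0, 2403.843750),
    (16128, 2604.166667),
    (22272, 2403.843750),
    (35328, 2314.812500),
]


def tick_to_seconds(tick, tempo_rows=TEMPO_ROWS):
    """Convert an absolute tick to seconds by summing over tempo segments.

    Hypothetical helper for illustration; not a midi_df function.
    """
    elapsed_usec = 0.0
    for i, (start, usec_per_tick) in enumerate(tempo_rows):
        if tick <= start:
            break
        # A segment ends where the next tempo change begins; the last one is open-ended.
        end = tempo_rows[i + 1][0] if i + 1 < len(tempo_rows) else float("inf")
        elapsed_usec += (min(tick, end) - start) * usec_per_tick
    return elapsed_usec / 1e6


for tick in (720, 16128, 22272, 35328):
    print(tick, round(tick_to_seconds(tick), 6))
```

The values for ticks 16128, 22272 and 35328 should agree with the Time (sec) column above to within rounding.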

Create the first midi DataFrame (Track 1)

kiev_df1 = mdf.midi_to_df(kiev,1,kiev_tempo)
kiev_df1.head(15)
    dTicks  Tick  Pitch  On/Off           Playing  Time (s)  Beat
 0       0     0    NaN    None                []  0.000000  0.00
 1       0     0    NaN    None                []  0.000000  0.00
 2       0     0    NaN    None                []  0.000000  0.00
 3       0     0    NaN    None                []  0.000000  0.00
 4       0     0    NaN    None                []  0.000000  0.00
 5       0     0   63.0      on              [63]  0.000000  0.00
 6       0     0   67.0      on          [63, 67]  0.000000  0.00
 7       0     0   70.0      on      [63, 67, 70]  0.000000  0.00
 8       0     0   75.0      on  [63, 67, 70, 75]  0.000000  0.00
 9     720   720   63.0     off      [67, 70, 75]  1.730767  3.75
10       0   720   67.0     off          [70, 75]  1.730767  3.75
11       0   720   70.0     off              [75]  1.730767  3.75
12       0   720   75.0     off                []  1.730767  3.75
13      48   768   65.0      on              [65]  1.846152  4.00
14       0   768   70.0      on          [65, 70]  1.846152  4.00
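The Playing column in this table behaves like a running list of sounding pitches: an "on" event appends a pitch and an "off" event removes it. A minimal sketch of that bookkeeping follows, using the note events from rows 5 to 14. The function name track_playing and the tuple layout are assumptions made for this example; midi_to_df() may be implemented differently.

```python
def track_playing(events):
    """Return (tick, pitch, state, playing) rows for (tick, pitch, state) events.

    Hypothetical helper for illustration; not a midi_df function.
    """
    playing = []
    rows = []
    for tick, pitch, state in events:
        if state == "on" and pitch not in playing:
            playing.append(pitch)
        elif state == "off" and pitch in playing:
            playing.remove(pitch)
        # Store a copy so later changes do not alter earlier rows.
        rows.append((tick, pitch, state, list(playing)))
    return rows


# Note events copied from rows 5-14 of the table above.
EVENTS = [
    (0, 63, "on"), (0, 67, "on"), (0, 70, "on"), (0, 75, "on"),
    (720, 63, "off"), (720, 67, "off"), (720, 70, "off"), (720, 75, "off"),
    (768, 65, "on"), (768, 70, "on"),
]

rows = track_playing(EVENTS)
for row in rows:
    print(row)
```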

Condense the DataFrame (Track 1)

kiev_c1 = mdf.condense_df(kiev_df1)
kiev_c1.head()
    Beat  Time (s)           Playing
 8  0.00  0.000000  [63, 67, 70, 75]
12  3.75  1.730767                []
16  4.00  1.846152  [65, 70, 74, 77]
20  7.75  3.576919                []
24  8.00  3.692304  [67, 70, 75, 79]
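Judging by the surviving index labels (8, 12, 16, ...), condensing appears to keep only the last row at each tick, which is the point where the chord is complete. The sketch below shows that idea in plain Python. The name condense is hypothetical, and the default of 192 ticks per beat is inferred from the earlier table (tick 768 is beat 4.00) rather than taken from the module.

```python
# (tick, playing) pairs copied from rows 5-14 of the Track 1 table.
ROWS = [
    (0, [63]), (0, [63, 67]), (0, [63, 67, 70]), (0, [63, 67, 70, 75]),
    (720, [67, 70, 75]), (720, [70, 75]), (720, [75]), (720, []),
    (768, [65]), (768, [65, 70]),
]


def condense(rows, ticks_per_beat=192):
    """Keep the final 'playing' state at each tick and express ticks as beats.

    Hypothetical helper for illustration; not a midi_df function.
    """
    last_by_tick = {}
    for tick, playing in rows:
        last_by_tick[tick] = playing  # later rows at the same tick overwrite earlier ones
    return [(tick / ticks_per_beat, playing)
            for tick, playing in sorted(last_by_tick.items())]


condensed = condense(ROWS)
for beat, playing in condensed:
    print(beat, playing)
```

Because the sample stops at row 14, the chord at beat 4.00 comes out as [65, 70] here, whereas the full file yields [65, 70, 74, 77].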

Create and condense the DataFrame for Track 2 in one line

kiev_c2 = mdf.condense_df(mdf.midi_to_df(kiev,2,kiev_tempo))
kiev_c2.head()
    Beat  Time (s)           Playing
 8  0.00  0.000000  [43, 46, 51, 55]
12  3.75  1.730767                []
16  4.00  1.846152  [41, 46, 50, 53]
20  7.75  3.576919                []
24  8.00  3.692304  [39, 43, 46, 51]

Merge the DataFrames for Track 1 and 2

kiev_m = mdf.merge_tracks([kiev_c1, kiev_c2],[1,2])
kiev_m.head()
   Beat  Time (s)         Playing 1         Playing 2                           Playing
0  0.00  0.000000  [63, 67, 70, 75]  [43, 46, 51, 55]  [67, 70, 43, 75, 46, 51, 55, 63]
1  3.75  1.730767                []                []                                []
2  4.00  1.846152  [65, 70, 74, 77]  [41, 46, 50, 53]  [65, 70, 41, 74, 77, 46, 50, 53]
3  7.75  3.576919                []                []                                []
4  8.00  3.692304  [67, 70, 75, 79]  [39, 43, 46, 51]  [67, 70, 39, 75, 43, 46, 79, 51]
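In the merged table, the final Playing column holds the union of the per-track chords at each beat. A rough sketch of such a merge is given below. It assumes that when one track has no row at a given beat, its previous chord is carried forward; that behaviour, along with the name merge_condensed, is an assumption for illustration and may not match merge_tracks().

```python
def merge_condensed(track_a, track_b):
    """Merge two condensed tracks given as lists of (beat, playing) pairs.

    Hypothetical helper for illustration; not a midi_df function.
    """
    state_a, state_b = dict(track_a), dict(track_b)
    beats = sorted(set(state_a) | set(state_b))
    cur_a, cur_b = [], []
    merged = []
    for beat in beats:
        # Carry the previous chord forward when a track has no change at this beat.
        cur_a = state_a.get(beat, cur_a)
        cur_b = state_b.get(beat, cur_b)
        combined = sorted(set(cur_a) | set(cur_b))
        merged.append((beat, cur_a, cur_b, combined))
    return merged


# First three rows of each condensed table.
TRACK_1 = [(0.0, [63, 67, 70, 75]), (3.75, []), (4.0, [65, 70, 74, 77])]
TRACK_2 = [(0.0, [43, 46, 51, 55]), (3.75, []), (4.0, [41, 46, 50, 53])]

merged = merge_condensed(TRACK_1, TRACK_2)
for row in merged:
    print(row)
```

The combined chord is sorted here for readability; the module's own output lists the same pitches in a different order.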

All main DataFrame steps in one line

mdf.merge_tracks([mdf.condense_df(mdf.midi_to_df(kiev, t, kiev_tempo)) for t in [1,2]]).head()
   Beat  Time (s)         Playing 1         Playing 2                           Playing
0  0.00  0.000000  [63, 67, 70, 75]  [43, 46, 51, 55]  [67, 70, 43, 75, 46, 51, 55, 63]
1  3.75  1.730767                []                []                                []
2  4.00  1.846152  [65, 70, 74, 77]  [41, 46, 50, 53]  [65, 70, 41, 74, 77, 46, 50, 53]
3  7.75  3.576919                []                []                                []
4  8.00  3.692304  [67, 70, 75, 79]  [39, 43, 46, 51]  [67, 70, 39, 75, 43, 46, 79, 51]

Export to JSON

mdf.output_json(kiev_m, "kiev.json")

The resulting JSON file can be found here.

Calculate the maximum chord size

print("Maximum chord size:",mdf.max_notes(kiev_m))
print("Dyads in maximum chord:", mdf.max_dyads(kiev_m))
Maximum chord size: 11
Dyads in maximum chord: 55.0

Display the maximum count of each interval

mdf.max_dyad_counts(kiev_m)
0      14
1       9
2      12
3      15
4      16
5      15
6       4
sum    55
dtype: int64

Note that "sum" refers to the maximum of the sums of dyads (not the sum of the maxima), equal to the output of max_dyads().
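To make the numbers concrete: a chord of n notes contains n(n-1)/2 dyads, so the 11-note maximum gives 55, while the per-interval maxima listed above add up to 85 because they come from different chords. The sketch below counts dyads by interval class for the first merged chord, using the standard definition of interval class (pitch difference modulo 12, folded into the range 0 to 6). The function names are hypothetical, and the module may compute its counts differently.

```python
from collections import Counter
from itertools import combinations


def interval_class(p, q):
    """Interval class of two MIDI pitches: difference mod 12, folded to 0..6."""
    d = abs(p - q) % 12
    return min(d, 12 - d)


def dyad_counts(chord):
    """Count the dyads (note pairs) in a chord by interval class.

    Hypothetical helper for illustration; not a midi_df function.
    """
    return Counter(interval_class(p, q) for p, q in combinations(chord, 2))


# First chord of the merged DataFrame above (8 notes, so 8 * 7 / 2 = 28 dyads).
chord = [43, 46, 51, 55, 63, 67, 70, 75]
counts = dyad_counts(chord)
total = sum(counts.values())

for ic in range(7):
    print(ic, counts[ic])
print("sum", total)
```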

Sample of Track Summary with more instruments

mdf.track_summary(fake_sample_file, recommend=True)
Recommended tracks: [1,2,3,4,6]
Track Name Channel Note Events Tempo Track Instrument ID
0 A Song 0 tempo track {}
1 Guitar 1 0 3234 {25}
2 Guitar 2 1 1847 {26, 29}
3 Keyboard - Square Wave 2 376 {80}
4 Organ 3 1429 {18}
5 Drum Kit 9 2308 {0}
6 Flute Solo 10 145 {73}

We can see that the author of this MIDI file has named the tracks by instrument. From here, we can decide which tracks to use in our DataFrame. For example, to output only the guitars we would use Tracks 1 and 2; to output the whole band we would copy the "Recommended tracks" list. Track 5 is a percussion track, set to MIDI channel 9 (channel 10 when counting from 1, which General MIDI reserves for percussion), so we can assume it does not carry useful melodic information.
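One plausible rule that reproduces the "Recommended tracks" list is to keep every track that has note events and is not on the percussion channel. The sketch below applies that rule to the summary table. The rule, the dictionary layout and the name recommend_tracks are all assumptions for illustration; the criteria actually used by track_summary(recommend=True) may differ.

```python
PERCUSSION_CHANNEL = 9  # General MIDI percussion channel, counting channels from 0

# Rows transcribed from the summary table above.
# Assumption: the tempo track (Track 0) has no channel and no note events.
TRACKS = [
    {"name": "A Song", "channel": None, "note_events": 0},
    {"name": "Guitar 1", "channel": 0, "note_events": 3234},
    {"name": "Guitar 2", "channel": 1, "note_events": 1847},
    {"name": "Keyboard - Square Wave", "channel": 2, "note_events": 376},
    {"name": "Organ", "channel": 3, "note_events": 1429},
    {"name": "Drum Kit", "channel": 9, "note_events": 2308},
    {"name": "Flute Solo", "channel": 10, "note_events": 145},
]


def recommend_tracks(tracks):
    """Return indices of tracks with notes that are not on the percussion channel.

    Hypothetical helper for illustration; not a midi_df function.
    """
    return [
        index
        for index, track in enumerate(tracks)
        if track["note_events"] > 0 and track["channel"] != PERCUSSION_CHANNEL
    ]


recommended = recommend_tracks(TRACKS)
print("Recommended tracks:", recommended)
```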
