Improv RNN – Magenta model (Google AI, 2016)
Improv RNN is a model that uses a recurrent neural network (more specifically, an LSTM – long short-term memory network) to generate melodies over a chord progression.
Data it is trained on
The model trains on lead sheets in MusicXML format.
Lead sheet: a musical representation containing chords and a melody (and sometimes lyrics, which the model ignores).
You can find lead sheets in various places on the web, such as MuseScore. Magenta is currently only able to read lead sheets in MusicXML format; MuseScore provides MusicXML download links, e.g.
I think the model looks at both the chord progression and the melody in the dataset, as well as the relationship between them.
For the chords: one-hot encoding
One-hot encoding is often used for representing categorical data. It transforms our categorical labels into vectors of 0s and 1s.
The length of these vectors will be the number of categories we have; in other words, the length equals the number of output categories.
Each category corresponds to one element of the vector, and that particular element will be 1 while the rest of the elements stay 0 (that is the reason why it is called one-hot).
For instance, if we have three different categories of data, A, B, and C,
they will be encoded as [1,0,0], [0,1,0], [0,0,1].
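The idea above can be sketched in a few lines of Python (a generic illustration, not Magenta code):

```python
def one_hot(category, categories):
    """Return a one-hot vector: 1 at the category's index, 0 elsewhere."""
    vec = [0] * len(categories)
    vec[categories.index(category)] = 1
    return vec

categories = ["A", "B", "C"]
one_hot("B", categories)  # [0, 1, 0]
```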
In terms of this specific model, each chord will be encoded as a one-hot vector over 48 triads (major/minor/augmented/diminished for each of the 12 root pitch classes).
for instance: D major would be encoded as a 48-element vector that is all 0s except for a single 1 at the index corresponding to the D major triad.
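A toy sketch of the 48-triad encoding in Python (the quality ordering and index layout here are assumptions for illustration; Magenta's actual index mapping may differ):

```python
# 48 triads = 12 root pitch classes x 4 qualities.
PITCH_CLASSES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']
QUALITIES = ['major', 'minor', 'augmented', 'diminished']  # ordering assumed

def triad_one_hot(root, quality):
    """One-hot vector of length 48 with a 1 at the triad's (assumed) index."""
    vec = [0] * (len(PITCH_CLASSES) * len(QUALITIES))
    index = PITCH_CLASSES.index(root) * len(QUALITIES) + QUALITIES.index(quality)
    vec[index] = 1
    return vec

d_major = triad_one_hot('D', 'major')  # 48 elements, single 1, rest 0
```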
Also, there are three different configurations:
basic improv rnn / attention improv rnn / chord pitches improv rnn
Q: what are the differences between the three configurations? (Are they used at the same time? I think they are used separately; you pick one configuration when training or generating.)
Q: how are the melodies encoded?
To use the model (after installing Magenta)
- Convert a collection of MusicXML lead sheets into NoteSequences.
- Create SequenceExamples: SequenceExamples are fed into the model during training and evaluation. Each SequenceExample will contain a sequence of inputs and a sequence of labels that represent a lead sheet.
- Train and evaluate the model: run the command below to start a training job using the attention configuration.
--run_dir is the directory where checkpoints and TensorBoard data for this run will be stored.
--sequence_example_file is the TFRecord file of SequenceExamples that will be fed to the model.
My biggest concern/problem is that I cannot really envision what the dataset looks like.
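To make the shape of the data more concrete, here is a simplified, hypothetical sketch of how one lead sheet could become (input, label) pairs; the real SequenceExamples hold encoded feature vectors (e.g. the chord one-hots), not raw symbols:

```python
# Simplified sketch: at each step, the input combines the current backing
# chord with the previous melody event, and the label is the next melody
# event the model should predict.
melody = [60, -2, 62, -2]        # melodies_lib-style melody events
chords = ['C', 'C', 'G', 'G']    # one chord symbol per step

examples = []
for step in range(1, len(melody)):
    inputs = (chords[step], melody[step - 1])
    label = melody[step]
    examples.append((inputs, label))

# examples[0] == (('C', 60), -2)
```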
Generate Melody Over Chords
To generate your own melody you will also need a “primer_melody”.
At least one note needs to be fed to the model before it can start generating consecutive notes. We can use --primer_melody to specify a priming melody using a string representation of a Python list. The values in the list should be ints that follow the melodies_lib.Melody format (-2 = no event, -1 = note-off event, values 0 through 127 = note-on event for that MIDI pitch). For example, --primer_melody="[60, -2, 60, -2, 67, -2, 67, -2]" would prime the model with the first four notes of Twinkle Twinkle Little Star. Instead of using --primer_melody, we can use --primer_midi to prime our model with a melody stored in a MIDI file.
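The event convention can be illustrated with a small Python helper (the helper itself is hypothetical; only the -2/-1/0-127 convention comes from Magenta's melodies_lib.Melody format):

```python
# melodies_lib.Melody event convention:
#   -2 = no event (sustain/rest), -1 = note-off, 0..127 = note-on at that MIDI pitch
NO_EVENT, NOTE_OFF = -2, -1

def describe_events(events):
    """Turn a list of melody events into human-readable descriptions."""
    out = []
    for e in events:
        if e == NO_EVENT:
            out.append('no event (sustain)')
        elif e == NOTE_OFF:
            out.append('note off')
        else:
            out.append(f'note on: MIDI pitch {e}')
    return out

primer = [60, -2, 60, -2, 67, -2, 67, -2]  # first four notes of Twinkle Twinkle
describe_events(primer)[:2]  # ['note on: MIDI pitch 60', 'no event (sustain)']
```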
In addition, the backing chord progression must be provided using --backing_chords, a string representation of the backing chords separated by spaces. For example, --backing_chords="Am Dm G C F Bdim E E" uses the chords from I Will Survive. By default, each chord will last 16 steps (a single measure), but --steps_per_chord can also be set to a different value.
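A small sketch of how the backing chords expand into per-step chord labels (illustrative only, mirroring the default of 16 steps per chord):

```python
def expand_chords(backing_chords, steps_per_chord=16):
    """Expand a space-separated chord string into one chord label per step."""
    return [chord
            for chord in backing_chords.split()
            for _ in range(steps_per_chord)]

steps = expand_chords('Am Dm G C F Bdim E E')
# 8 chords x 16 steps = 128 step labels; steps[0] is 'Am', steps[16] is 'Dm'
```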