For my final performance, I wanted to explore different RNN models further and try to generate rap beats with some of the RNNs we have looked at.

I wanted to look more closely at ImprovRNN, PerformanceRNN, MelodyRNN, and DrumsRNN.

I collected MIDI files (Eminem and Jay Chou, a Chinese pop star) as well as .xml files, because training ImprovRNN requires MusicXML files.

I was actually most interested in the ImprovRNN because I thought the music pieces it generated were by far the most listenable ones.

However, I ran into a lot of problems with the xml files.

There were two ways that I got my musicXML files:

  1. convert MIDI files to XML files using MuseScore
  2. directly download XML files from websites like MuseScore

However, neither way worked for me. I did not have much trouble converting my MusicXML files into NoteSequences, but when I tried to convert the NoteSequences into TFRecords, I failed.
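One possible cause worth ruling out, and this is only a guess rather than something I confirmed, is that the files contain no chord symbols. ImprovRNN is trained on lead sheets (a melody plus chords), and a MIDI file converted in MuseScore usually carries notes only. The sketch below counts `<note>` and `<harmony>` (chord symbol) elements in uncompressed MusicXML text. The function name `count_notes_and_chords` and the sample score are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written score used only for illustration.
SAMPLE = """
<score-partwise version="3.1">
  <part id="P1">
    <measure number="1">
      <harmony>
        <root><root-step>C</root-step></root>
        <kind>major</kind>
      </harmony>
      <note>
        <pitch><step>E</step><octave>4</octave></pitch>
        <duration>4</duration>
      </note>
    </measure>
  </part>
</score-partwise>
"""


def count_notes_and_chords(musicxml_text):
    """Count notes and chord symbols in uncompressed MusicXML text."""
    root = ET.fromstring(musicxml_text.strip())
    return {
        "notes": sum(1 for _ in root.iter("note")),
        # <harmony> is the MusicXML element that holds a chord symbol.
        "chords": sum(1 for _ in root.iter("harmony")),
    }


print(count_notes_and_chords(SAMPLE))
```

For a file on disk, the text can be read with `open(path, encoding="utf-8").read()`. Compressed `.mxl` files are zip archives and would need to be unpacked first. A result with zero chords would at least be consistent with the dataset step producing nothing usable.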

After trying and failing many times, I decided to set ImprovRNN aside for now and spend more time on MelodyRNN and PerformanceRNN.

I trained both a MelodyRNN model and a PerformanceRNN model on my Eminem MIDI dataset.

I trained both models for 10,000 steps, and the loss was small (I guess that is because my dataset is small too, only 15-20 MIDI files?).


I tried different primer melodies for MelodyRNN:

melody_rnn_generate \
--config=lookback_rnn \
--run_dir=/tmp/melody_rnn/logdir/run1 \
--output_dir=./generated/melody_rnn/eminem2/ \
--num_outputs=10 \
--num_steps=1000 \
--hparams="batch_size=64,rnn_layer_sizes=[64,64]" \
--primer_melody="[60, -2, 52, -2, 60, -2, 52, -2]"
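To make the primer format easier to read: in Magenta's melody encoding, values 0 to 127 start a note at that MIDI pitch, -1 ends the current note, and -2 means "no event" (the previous note keeps sounding). The sketch below turns a primer string into one label per step. The helper names are my own, the note naming assumes the common convention that MIDI 60 is C4, and the per-step length assumes the default of four steps per quarter note.

```python
import ast

NO_EVENT = -2   # nothing changes on this step; a sounding note is held
NOTE_OFF = -1   # end the currently sounding note
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_name(midi_pitch):
    """Name a MIDI pitch, using the convention that 60 is C4."""
    return f"{NOTE_NAMES[midi_pitch % 12]}{midi_pitch // 12 - 1}"


def describe_primer(primer):
    """Return one human-readable label per step of a primer melody string."""
    labels = []
    for event in ast.literal_eval(primer):
        if event == NO_EVENT:
            labels.append("hold")
        elif event == NOTE_OFF:
            labels.append("off")
        elif 0 <= event <= 127:
            labels.append(pitch_name(event))
        else:
            raise ValueError(f"invalid melody event: {event}")
    return labels


print(describe_primer("[60, -2, 52, -2, 60, -2, 52, -2]"))
# ['C4', 'hold', 'E3', 'hold', 'C4', 'hold', 'E3', 'hold']
```

So this primer alternates between C4 and E3, with each note lasting two steps.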

melody_rnn_generate \
--config=lookback_rnn \
--run_dir=/tmp/melody_rnn/logdir/run1 \
--output_dir=./generated/melody_rnn/eminem3/ \
--num_outputs=3 \
--num_steps=1000 \
--hparams="batch_size=64,rnn_layer_sizes=[64,64]"



performance_rnn_generate \
--config=performance_with_dynamics \
--run_dir=/tmp/performance_rnn/logdir/run2 \
--output_dir=./generated/ \
--num_outputs=10 \
--num_steps=1000 \
--hparams="batch_size=64,rnn_layer_sizes=[64,64]"
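One thing to keep in mind is that `--num_steps=1000` does not mean the same length of music for the two models. As far as I understand, MelodyRNN steps are fractions of a beat (four per quarter note, 120 qpm by default), while PerformanceRNN steps are fixed 10 ms slices of time (100 per second). The defaults in the sketch below are those assumptions, not values I read back from my runs.

```python
def melody_seconds(num_steps, qpm=120.0, steps_per_quarter=4):
    """Length of a beat-quantized melody: each step is a fraction of a beat."""
    return num_steps * 60.0 / (qpm * steps_per_quarter)


def performance_seconds(num_steps, steps_per_second=100):
    """Length of a performance whose steps are fixed slices of clock time."""
    return num_steps / steps_per_second


print(melody_seconds(1000))        # 125.0
print(performance_seconds(1000))   # 10.0
```

Under these assumptions, the MelodyRNN outputs run for about two minutes each, while the PerformanceRNN outputs are only about ten seconds long.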


Some results:


In the end I used the pre-trained DrumsRNN because I did not have enough time to train a third model. I curated some MelodyRNN pieces that I think sound good, combined them with the results I got from DrumsRNN, and made a one-minute beat.


Overall, even though the outcome does not sound like a rap beat at all, I feel the two models have captured the very repetitive nature of rap beats (the same bars repeat many times).

For the future, I definitely want to spend some more time on the MusicXML files and hopefully I can succeed in training my own ImprovRNN model.

I also think the dataset a model is trained on is very important, but I have not really had a chance to work on mine. So I will probably spend more time researching how to create a good dataset.

I also want to look into MusicVAE more.

