Bach 2.0
This is part of my research in the field of Deep Learning. We employ a 3-layer GRU recurrent neural network and, working on the assumption that musical units are equivalent to words in Natural Language Processing, train it on a manually transcribed database of works by Johann Sebastian Bach. For the internal representation we used NoteWorthy Composer’s NWCTXT format. The visualization below was generated with Stephen Malinowski’s excellent application, Music Animation Machine.
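The project’s exact network is not reproduced on this page, but the recurrent unit it is built from can be sketched in a few lines. Below is a minimal numpy implementation of a single GRU step; the input and hidden dimensions and the random weights are illustrative assumptions, not values from the project.

```python
import numpy as np

# Minimal sketch of one GRU cell update, the building block of the
# 3-layer recurrent network described above. Dimensions are illustrative.
def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU update: x is the current input, h the previous hidden state."""
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = sigmoid(Wz @ x + Uz @ h)               # update gate
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
    return (1 - z) * h + z * h_tilde           # interpolated new state

# Tiny usage example with random weights.
rng = np.random.default_rng(0)
d_in, d_h = 4, 3
weights = [rng.standard_normal(s) for s in [(d_h, d_in), (d_h, d_h)] * 3]
h = np.zeros(d_h)
for _ in range(5):                             # run five time steps
    h = gru_step(rng.standard_normal(d_in), h, *weights)
```

Because the new state is an interpolation between the old state and a tanh-bounded candidate, the hidden state stays in (-1, 1), which keeps long sequences numerically stable.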
The full project (including the data set in NWCTXT file format) is available on GitHub.
Below is a detailed listing of the data used to train the network:
Suite II for solo cello in D minor BWV 1008: Prelude, Allemande, Courante, Sarabande, Menuett, Gigue
Partita in A minor for solo flute BWV 1013: Courante
Suite in E minor for lute BWV 996: Courante, Sarabande
Suite in A minor for lute BWV 995: Gigue
Sonatas and partitas for solo violin:
Sonata I in G minor: Siciliano, Presto
Partita I in B minor: Corrente, Double, Sarabande
Sonata II in A minor: Andante
Partita II in D minor: Courante, Sarabande, Gigue, Chaconne
All compositions have been transposed to G minor.
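Transposing every piece to a single key lets the network learn pitch patterns independently of the original key. On MIDI pitch numbers this is just a constant semitone shift; the following tiny illustration (not the project’s actual preprocessing code) moves a D minor scale up a fourth to G minor:

```python
# Key transposition as a constant semitone shift on MIDI pitch numbers.
# Moving D minor up to G minor is a +5 semitone shift (D4=62 -> G4=67).
def transpose(midi_pitches, semitones):
    return [p + semitones for p in midi_pitches]

d_minor_scale = [62, 64, 65, 67, 69, 70, 72]   # D natural minor, one octave
g_minor_scale = transpose(d_minor_scale, 5)    # G natural minor, starts on 67
```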
Bach 2.7
This page is supplementary material to our paper “A Musical Similarity Metric based on Symbolic Aggregate Approximation”, submitted for review to the 28th International Conference on Software, Telecommunications and Computer Networks (SoftCOM 2020). The project is a direct successor to our previous work, “Bach 2.0 – Generating Classical Music using Recurrent Neural Networks”, presented at the 23rd International Conference on Knowledge-Based and Intelligent Information & Engineering Systems (KES 2019) and available as an open-access article in Procedia Computer Science.
We chose the topic of synthetic classical music generation mainly because it seemed intuitive to treat music, and classical music in particular, as a form of natural language. The existence of a standard musical notation system also works in our favor. Through statistical analysis we found an even stronger resemblance to natural language, namely the existence of sentences and phrases. We established this by finding recurring patterns, or motifs, in the composer’s corpus of works. A listing of motifs and their frequency of occurrence can be found here (note that motifs are expressed as the relative pitch of consecutive notes). Furthermore, we chose J. S. Bach for our experiments because the famous composer is very well documented, many of his works are freely available, and, last but not least, he wrote a considerable number of solo compositions. At the time of writing, we have not focused on polyphonic music.
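Because motifs are defined over the relative pitch of consecutive notes, the same figure is counted regardless of the key it appears in. A minimal sketch of this motif counting (the melody, motif length, and interval encoding are illustrative assumptions):

```python
from collections import Counter

# Motifs as sequences of pitch intervals between consecutive notes:
# encoding intervals rather than absolute pitches makes a motif
# transposition-invariant.
def motif_counts(pitches, length=2):
    """Count recurring windows of consecutive pitch intervals."""
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    windows = [tuple(intervals[i:i + length])
               for i in range(len(intervals) - length + 1)]
    return Counter(windows)

# The same figure stated on D and then on G yields the same motifs.
melody = [62, 64, 65, 62, 67, 69, 70, 67]
counts = motif_counts(melody, length=2)        # (2, 1) and (1, -3) occur twice
```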
We have improved upon the 3-layer GRU recurrent neural network by replacing the middle layer with a bi-directional LSTM (BD-LSTM) layer, aiming to take into account not only the preceding sequence of musical notes but also future ones. We obtain significantly higher validation accuracy (categorical prediction accuracy), raising the bar to a little over 90%, compared to the 75% previously reported. We employ four such networks, each responsible for a single attribute: type (note, chord, rest, or bar), duration, position, and optional modifiers. This greatly reduces the vocabulary size for each RNN and brings the advantage that the final composition is grammatically sound from NoteWorthy’s perspective. An instance of such a network looks similar to:

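The intuition behind the bi-directional layer can be shown in a toy form: the sequence is processed left-to-right and right-to-left, and the two hidden states are concatenated at every time step, so each position sees both past and future context. In the sketch below a simple tanh recurrence stands in for the LSTM cell, and all dimensions and weights are illustrative assumptions:

```python
import numpy as np

# Toy illustration of bi-directional processing: run a recurrence forward
# and backward over the sequence, then concatenate the states per step.
def run(seq, W, U):
    h, out = np.zeros(U.shape[0]), []
    for x in seq:
        h = np.tanh(W @ x + U @ h)
        out.append(h)
    return out

rng = np.random.default_rng(1)
seq = list(rng.standard_normal((6, 4)))          # six time steps, 4-dim inputs
W, U = rng.standard_normal((3, 4)), rng.standard_normal((3, 3))

fwd = run(seq, W, U)                             # context from past notes
bwd = run(seq[::-1], W, U)[::-1]                 # context from future notes
bidi = [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]
```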
Next we wanted a measure of how similar such a composition is to J. S. Bach’s style. We augmented the database of partita and sonata excerpts, which now contains 61 works; the entire database, in NWCTXT format, is available here. To compute this similarity metric, we treat the sequence of notes as a signal of discrete values and employ piecewise aggregate approximation (PAA) and symbolic aggregate approximation (SAX). This reduces dimensionality while preserving signal features. Finally, a synthetic piece is compared against the database and the longest common substring is computed. This in turn paves the way for a generative adversarial approach.
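The PAA and SAX steps can be sketched compactly: the z-normalised pitch signal is averaged over equal-width frames (PAA), and each average is mapped to a letter using breakpoints chosen so the letters are equiprobable under a standard Gaussian (SAX). The pitch sequence, frame count, and 3-letter alphabet below are illustrative assumptions, not the paper’s parameters:

```python
import numpy as np

# PAA: average the signal over equal-width frames (length must divide evenly).
def paa(signal, frames):
    return np.mean(np.reshape(signal, (frames, -1)), axis=1)

# SAX: z-normalise, apply PAA, then bin each frame average into a letter.
# Breakpoints +/-0.43 split a standard Gaussian into three equal areas.
def sax(signal, frames, breakpoints=(-0.43, 0.43)):
    z = (signal - signal.mean()) / signal.std()
    return "".join("abc"[np.searchsorted(breakpoints, v)]
                   for v in paa(z, frames))

pitches = np.array([62., 62., 64., 65., 69., 70., 72., 71.])
word = sax(pitches, frames=4)                    # -> "aacc"
```

Once every piece is reduced to such a word, comparing a synthetic piece against the database becomes a string problem, which is where the longest-common-substring computation comes in.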