Inference#
In this notebook we will demonstrate how to use two pretrained models to generate multitrack mixes of drum recordings. We provide models trained on the ENST-drums dataset, which features a few hundred drum multitracks along with mixes of those multitracks made by professional audio engineers. The models use two different multitrack mixing architectures: the Differentiable Mixing Console (DMC) and the MixWaveUNet. First we will download the model checkpoints and some test audio, then load the models and the audio tracks and generate a mix that we can listen to.
Note: This notebook assumes that you have already installed the automix package. If you have not done so, you can run the following:
!pip install git+https://github.com/csteinmetz1/automix-toolkit
import os
import glob
import torch
import torchaudio
import numpy as np
import IPython
import IPython.display as ipd
import matplotlib.pyplot as plt
import librosa.display
%matplotlib inline
%load_ext autoreload
%autoreload 2
from automix.system import System
Download the pretrained models and multitracks#
First we will download the pretrained model checkpoints. Then we will download two .zip files containing a drum multitrack and the demo multitrack, both of which were unseen during training.
# download the pretrained models for DMC and MixWaveUNet trained on ENST-drums dataset
os.makedirs("checkpoints", exist_ok=True)
!wget https://huggingface.co/csteinmetz1/automix-toolkit/resolve/main/enst-drums-dmc.ckpt
!wget https://huggingface.co/csteinmetz1/automix-toolkit/resolve/main/enst-drums-mixwaveunet.ckpt
!wget https://huggingface.co/csteinmetz1/automix-toolkit/resolve/main/medleydb-16-dmc.ckpt
!mv enst-drums-dmc.ckpt checkpoints/enst-drums-dmc.ckpt
!mv enst-drums-mixwaveunet.ckpt checkpoints/enst-drums-mixwaveunet.ckpt
!mv medleydb-16-dmc.ckpt checkpoints/medleydb-16-dmc.ckpt
# then download and extract a drum multitrack from the test set
!wget https://huggingface.co/csteinmetz1/automix-toolkit/resolve/main/drums-test-rock.zip
!unzip -o drums-test-rock.zip
!wget https://huggingface.co/csteinmetz1/automix-toolkit/resolve/main/flare-dry-stems.zip
!unzip -o flare-dry-stems.zip -d flare-dry-stems
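The shell commands above can also be expressed in plain Python, which avoids the separate `mv` step and skips files that are already on disk. The following is a minimal sketch, assuming the same `resolve/main` URL pattern used in the `wget` commands; the helper names `resolve_url` and `download_if_missing` are hypothetical and are not part of the automix package.

```python
import os
import urllib.request

# Base URL copied from the wget commands above.
BASE_URL = "https://huggingface.co/csteinmetz1/automix-toolkit/resolve/main"


def resolve_url(filename: str) -> str:
    """Build the download URL for a file hosted in the repository."""
    return f"{BASE_URL}/{filename}"


def download_if_missing(filename: str, out_dir: str = ".") -> str:
    """Download `filename` into `out_dir` unless it already exists there."""
    os.makedirs(out_dir, exist_ok=True)
    out_path = os.path.join(out_dir, filename)
    if not os.path.exists(out_path):
        # Only hit the network when the file is not already present.
        urllib.request.urlretrieve(resolve_url(filename), out_path)
    return out_path
```

For example, `download_if_missing("enst-drums-dmc.ckpt", "checkpoints")` would place the checkpoint directly in the `checkpoints` directory.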
Archive: drums-test-rock.zip
creating: drums-test-rock/
inflating: __MACOSX/._drums-test-rock
inflating: drums-test-rock/.DS_Store
inflating: __MACOSX/drums-test-rock/._.DS_Store
creating: drums-test-rock/tracks/
creating: drums-test-rock/mix/
inflating: drums-test-rock/tracks/04_overhead_L_066_phrase_rock_complex_fast_sticks.wav
inflating: drums-test-rock/tracks/01_kick_066_phrase_rock_complex_fast_sticks.wav
inflating: drums-test-rock/tracks/03_hi-hat_066_phrase_rock_complex_fast_sticks.wav
inflating: drums-test-rock/tracks/02_snare_066_phrase_rock_complex_fast_sticks.wav
inflating: drums-test-rock/tracks/07_tom_2_066_phrase_rock_complex_fast_sticks.wav
inflating: drums-test-rock/tracks/06_tom_1_066_phrase_rock_complex_fast_sticks.wav
inflating: drums-test-rock/tracks/05_overhead_R_066_phrase_rock_complex_fast_sticks.wav
inflating: drums-test-rock/tracks/08_tom_3_066_phrase_rock_complex_fast_sticks.wav
inflating: drums-test-rock/mix/dry_mix_066_phrase_rock_complex_fast_sticks.wav
inflating: drums-test-rock/mix/dry_mix_066_phrase_rock_complex_fast_sticks_DMC.wav
inflating: drums-test-rock/mix/dry_mix_066_phrase_rock_complex_fast_sticks_MixWaveUNet.wav
Archive: flare-dry-stems.zip
Written using ZipTricks 5.6.0
extracting: flare-dry-stems/Flare Bass Stem Dry.wav
extracting: flare-dry-stems/Flare Drum Stem Dry.wav
extracting: flare-dry-stems/Flare Instrument Stem Dry.wav
extracting: flare-dry-stems/Flare Vocal Stem Dry.wav
!ls
checkpoints drums-test-rock.zip flare-dry-stems.zip sample_data
drums-test-rock flare-dry-stems __MACOSX
Set configuration#
We can choose between two checkpoints. If we select enst-drums-dmc.ckpt, we use the pretrained Differentiable Mixing Console model, which directly predicts gain and panning parameters for each track. If we instead select enst-drums-mixwaveunet.ckpt, we use a multi-input WaveUNet that creates a mix of the tracks. To make computation faster, we can restrict the maximum number of samples to process with max_samples. Using the default max_samples = 262144 will mix about the first 6 seconds of the track. You can try increasing this value to see how the results change.
Note: In the case of MixWaveUNet, max_samples must be a power of 2.
track_dir = "./drums-test-rock/tracks"
track_ext = "wav"
dmc_ckpt_path = "checkpoints/enst-drums-dmc.ckpt"
mwun_ckpt_path = "checkpoints/enst-drums-mixwaveunet.ckpt"
max_samples = 262144
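To get a feel for how max_samples relates to audio duration and to the power-of-two requirement, the sketch below converts a sample count to seconds and checks the constraint. It assumes a 44.1 kHz sample rate, which is consistent with the "about 6 seconds" figure above but is not stated explicitly in this notebook. The helper names are hypothetical.

```python
def duration_seconds(num_samples: int, sample_rate: int = 44100) -> float:
    """Length of `num_samples` of audio in seconds (sample rate is an assumption)."""
    return num_samples / sample_rate


def is_power_of_two(n: int) -> bool:
    """True if n is a positive power of two (exactly one bit set)."""
    return n > 0 and (n & (n - 1)) == 0


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


max_samples = 262144
print(is_power_of_two(max_samples))             # True, since 262144 == 2**18
print(round(duration_seconds(max_samples), 2))  # 5.94
print(next_power_of_two(300000))                # 524288
```

When increasing max_samples for MixWaveUNet, rounding the desired length up with a helper like `next_power_of_two` is one way to keep the value valid.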
Load pretrained model#
# load pretrained model
dmc_system = System.load_from_checkpoint(dmc_ckpt_path, pretrained_encoder=False, map_location="cpu").eval()
mwun_system = System.load_from_checkpoint(mwun_ckpt_path, map_location="cpu").eval()
/usr/local/lib/python3.8/dist-packages/torchaudio/functional/functional.py:539: UserWarning: At least one mel filterbank has all zero values. The value for `n_mels` (128) may be set too high. Or, the value for `n_freqs` (257) may be set too low.
warnings.warn(
Load multitrack#
Now we will read the tracks from disk and create a tensor with all the tracks. We first peak normalize each track to -12 dB, which is what the models expect. In the case of MixWaveUNet, we add an extra track of silence if fewer than 8 tracks are provided. The DMC model, however, can accept any number of tracks, whether more or fewer than it was trained with.
We can also create a simple mono mixture of these tracks to hear what the multitrack sounds like before we do any mixing.
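To make the peak normalization step concrete, here is a minimal sketch of the arithmetic using plain Python lists in place of tensors. The function name peak_normalize is hypothetical; the notebook itself performs the same two steps in place on torch tensors.

```python
import math


def peak_normalize(samples, target_db=-12.0, eps=1e-8):
    """Scale non-empty samples so the largest absolute value sits at target_db."""
    peak = max(max(abs(s) for s in samples), eps)  # eps guards against silence
    target_lin = 10 ** (target_db / 20.0)          # -12 dB is about 0.251 linear
    return [s / peak * target_lin for s in samples]


track = [0.0, 0.5, -0.8, 0.2]  # toy signal standing in for an audio track
normalized = peak_normalize(track)
new_peak = max(abs(s) for s in normalized)

print(round(new_peak, 4))                   # linear peak after scaling
print(round(20 * math.log10(new_peak), 2))  # the same peak expressed in dB
```

The lower bound on the peak plays the same role as clamp(1e-8) in the notebook code: an all-zero track stays silent instead of causing a division by zero.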
# load the input tracks
track_filepaths = glob.glob(os.path.join(track_dir, f"*.{track_ext}"))
track_filepaths = sorted(track_filepaths)
tracks = []
for idx, track_filepath in enumerate(track_filepaths):
    x, sr = torchaudio.load(track_filepath)
    x = x[:, :max_samples]
    x /= x.abs().max().clamp(1e-8) # peak normalize
    x *= 10 ** (-12/20.0) # set peak to -12 dB
    tracks.append(x)
    plt.figure(figsize=(10, 2))
    librosa.display.waveshow(x.view(-1).numpy(), sr=sr, zorder=3)
    plt.title(f"{idx+1} {os.path.basename(track_filepath)}")
    plt.ylim([-1,1])
    plt.grid(c="lightgray")
    plt.show()
    IPython.display.display(ipd.Audio(x.view(-1).numpy(), rate=sr, normalize=True))
# add dummy tracks of silence if needed
if len(tracks) < 8:
    tracks.append(torch.zeros(x.shape))
# stack tracks into a tensor
tracks = torch.stack(tracks, dim=0)
tracks = tracks.permute(1, 0, 2)
# tracks have shape (1, num_tracks, seq_len)
# listen to the input (mono) before mixing
input_mix = tracks.sum(dim=1, keepdim=True)
print(input_mix.shape)
plt.figure(figsize=(10, 2))
plt.title("Mono Mix")
librosa.display.waveshow(input_mix.view(-1).numpy(), sr=sr, zorder=3, color="tab:orange")
plt.ylim([-1,1])
plt.grid(c="lightgray")
plt.show()
IPython.display.display(ipd.Audio(input_mix.view(-1).numpy(), rate=sr, normalize=False))
torch.Size([1, 1, 262144])
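The cell above appends a single silent track, which is enough here because the test multitrack has seven tracks. For a multitrack with fewer tracks, a more general version would keep padding until the expected count is reached. The sketch below illustrates this with plain lists standing in for tensors; pad_with_silence is a hypothetical helper, and the default of 8 tracks is taken from the description above rather than from the model code.

```python
def pad_with_silence(tracks, num_tracks=8):
    """Return a new list with all-zero tracks appended until it holds num_tracks."""
    if not tracks:
        raise ValueError("need at least one track to infer the track length")
    if len(tracks) > num_tracks:
        raise ValueError(f"got {len(tracks)} tracks, expected at most {num_tracks}")
    seq_len = len(tracks[0])
    padded = list(tracks)  # copy so the caller's list is left untouched
    while len(padded) < num_tracks:
        padded.append([0.0] * seq_len)
    return padded


toy_tracks = [[0.1, -0.2, 0.3] for _ in range(5)]  # five short toy tracks
padded = pad_with_silence(toy_tracks)
print(len(padded), padded[-1])
```

If more than one silent track is added this way, the same number of trailing tracks would need to be dropped again before passing the input to the DMC model.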
Generate the DMC mix#
Now we can generate the predicted mix and listen to it. Because the Differentiable Mixing Console predicts explicit mixing parameters, we can also print the gain (in dB) and pan parameter for each track.
# pass tracks to the model and create a mix
with torch.no_grad(): # no need to compute gradients
    # drop the last track (the silent padding), which only MixWaveUNet needs
    mix, params = dmc_system(tracks[:,:-1,:])
print(mix.shape, params.shape)
# view the mix
mix /= mix.abs().max()
plt.figure(figsize=(10, 2))
plt.title("Differentiable Mixing Console")
librosa.display.waveshow(mix.view(2,-1).numpy(), sr=sr, zorder=3)
plt.ylim([-1,1])
plt.grid(c="lightgray")
plt.show()
IPython.display.display(ipd.Audio(mix.view(2,-1).numpy(), rate=sr, normalize=True))
for track_fp, param in zip(track_filepaths, params.squeeze()):
    print(os.path.basename(track_fp), param)
torch.Size([1, 2, 262144]) torch.Size([1, 7, 2])
/usr/local/lib/python3.8/dist-packages/librosa/util/utils.py:198: UserWarning: librosa.util.frame called with axis=-1 on a non-contiguous input. This will result in a copy.
warnings.warn(
01_kick_066_phrase_rock_complex_fast_sticks.wav tensor([12.3843, 0.5003])
02_snare_066_phrase_rock_complex_fast_sticks.wav tensor([13.0229, 0.5067])
03_hi-hat_066_phrase_rock_complex_fast_sticks.wav tensor([5.0208, 0.5011])
04_overhead_L_066_phrase_rock_complex_fast_sticks.wav tensor([6.4820e+00, 1.4221e-03])
05_overhead_R_066_phrase_rock_complex_fast_sticks.wav tensor([7.4902, 0.9986])
06_tom_1_066_phrase_rock_complex_fast_sticks.wav tensor([-4.6055, 0.7456])
07_tom_2_066_phrase_rock_complex_fast_sticks.wav tensor([1.5387, 0.3615])
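To help read the printed parameters, the sketch below converts a gain in dB and a pan value into linear left and right channel gains. It assumes constant-power (sine/cosine) panning with pan = 0 as hard left and pan = 1 as hard right. This is a common convention and is consistent with the overhead_L and overhead_R values above, but the exact pan law used inside the toolkit is an assumption here, and channel_gains is a hypothetical helper.

```python
import math


def channel_gains(gain_db, pan):
    """Convert a gain in dB and a pan in [0, 1] to (left, right) linear gains.

    Assumes constant-power panning: pan=0 is hard left, pan=1 is hard right.
    """
    gain_lin = 10 ** (gain_db / 20.0)  # dB to linear amplitude
    theta = pan * math.pi / 2          # map the pan range to [0, pi/2]
    return gain_lin * math.cos(theta), gain_lin * math.sin(theta)


# (name, gain in dB, pan) copied from the printed parameters above
predicted = [
    ("kick", 12.3843, 0.5003),
    ("overhead_L", 6.4820, 1.4221e-03),
    ("overhead_R", 7.4902, 0.9986),
]
for name, gain_db, pan in predicted:
    left, right = channel_gains(gain_db, pan)
    print(f"{name}: left={left:.3f} right={right:.3f}")
```

Under this assumption a pan near 0.5 places a track in the center with equal gain in both channels, which matches the usual placement of kick and snare.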
Generate the Mix-Wave-U-Net Mix#
If we use the MixWaveUNet there are no parameters to show, since this model transforms the input tracks directly into a mix without predicting intermediate mixing parameters.
with torch.no_grad(): # no need to compute gradients
    mwun_mix, params = mwun_system(tracks)
print(mwun_mix.shape, params.shape)
# view the mix
mwun_mix /= mwun_mix.abs().max()
plt.figure(figsize=(10, 2))
plt.title("Mix-Wave-U-Net")
librosa.display.waveshow(mwun_mix.view(2,-1).numpy(), sr=sr, zorder=3)
plt.ylim([-1,1])
plt.grid(c="lightgray")
plt.show()
IPython.display.display(ipd.Audio(mwun_mix.view(2,-1).numpy(), rate=sr, normalize=True))
torch.Size([1, 2, 262144]) torch.Size([1])
MedleyDB#
Now we will run a DMC model that was trained on MedleyDB, which includes many types of instruments. This model was trained on all songs with 16 or fewer tracks.
dmc_ckpt_path = "checkpoints/medleydb-16-dmc.ckpt"
# load pretrained model
medley_dmc_system = System.load_from_checkpoint(dmc_ckpt_path, pretrained_encoder=False, map_location="cpu").eval()
Load tracks#
We will use the stems from the song that Gary mixed in the first part of the tutorial.
track_dir = "./flare-dry-stems"
track_ext = "wav"
start_sample = int(32 * 44100)
end_sample = start_sample + int(40 * 44100)
# load the input tracks
track_filepaths = glob.glob(os.path.join(track_dir, f"*.{track_ext}"))
track_filepaths = sorted(track_filepaths)
tracks = []
track_names = []
for idx, track_filepath in enumerate(track_filepaths):
    x, sr = torchaudio.load(track_filepath)
    if "Vocal" in track_filepath or "Bass" in track_filepath:
        # keep only the left channel of the vocal and bass stems
        x_L = x[0:1, start_sample:end_sample]
        #x_L /= x_L.abs().max().clamp(1e-8) # peak normalize
        #x_L *= 10 ** (-12/20.0) # set peak to -12 dB
        tracks.append(x_L)
        track_names.append(os.path.basename(track_filepath))
    else:
        # split the stereo stem into two mono tracks
        x_L = x[0:1, start_sample:end_sample]
        x_R = x[1:2, start_sample:end_sample]
        #x_L /= x_L.abs().max().clamp(1e-8) # peak normalize
        #x_L *= 10 ** (-12/20.0) # set peak to -12 dB
        #x_R /= x_R.abs().max().clamp(1e-8) # peak normalize
        #x_R *= 10 ** (-12/20.0) # set peak to -12 dB
        tracks.append(x_L)
        tracks.append(x_R)
        track_names.append(os.path.basename(track_filepath) + "-L")
        track_names.append(os.path.basename(track_filepath) + "-R")
    plt.figure(figsize=(10, 2))
    librosa.display.waveshow(x_L.view(-1).numpy(), sr=sr, zorder=3)
    plt.title(f"{idx+1} {os.path.basename(track_filepath)}")
    plt.ylim([-1,1])
    plt.grid(c="lightgray")
    plt.show()
    IPython.display.display(ipd.Audio(x_L.view(-1).numpy(), rate=sr, normalize=True))
# stack tracks into a tensor
tracks = torch.stack(tracks, dim=0)
tracks = tracks.permute(1, 0, 2)
# tracks have shape (1, num_tracks, seq_len)
# listen to the input (mono) before mixing
input_mix = tracks.sum(dim=1, keepdim=True).clamp(-1, 1)
plt.figure(figsize=(10, 2))
plt.title("Mono Mix")
librosa.display.waveshow(input_mix.view(-1).numpy(), sr=sr, zorder=3, color="tab:orange")
plt.ylim([-1,1])
plt.grid(c="lightgray")
plt.show()
IPython.display.display(ipd.Audio(input_mix.view(-1).numpy(), rate=sr, normalize=False))
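The excerpt used above starts 32 seconds into the song and lasts 40 seconds, with the sample rate hard-coded as 44100. If the stems had a different sample rate, the same excerpt could be computed from a rate read from the file, such as the sr value returned by torchaudio.load. The sketch below shows the arithmetic; excerpt_bounds is a hypothetical helper, not part of the toolkit.

```python
def excerpt_bounds(start_s, duration_s, sample_rate):
    """Return (start_sample, end_sample) for an excerpt given in seconds."""
    start = int(start_s * sample_rate)
    end = start + int(duration_s * sample_rate)
    return start, end


# same excerpt as in the notebook: 40 seconds starting at 32 seconds
start_sample, end_sample = excerpt_bounds(32, 40, 44100)
print(start_sample, end_sample)

# the same excerpt for audio sampled at 48 kHz
print(excerpt_bounds(32, 40, 48000))
```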