References
- A+15
Martín Abadi and others. TensorFlow: large-scale machine learning on heterogeneous systems. 2015. Software available from tensorflow.org.
- BKK18
Shaojie Bai, J Zico Kolter, and Vladlen Koltun. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint arXiv:1803.01271, 2018.
- BZSH21a
Dan Barry, Qijian Zhang, Pheobe Wenyi Sun, and Andrew Hines. Go listen: an end-to-end online listening test platform. Journal of Open Research Software, 2021. URL: http://doi.org/10.5334/jors.361.
- BR17
Adán L Benito and Joshua D Reiss. Intelligent multitrack reverberation based on hinge-loss Markov random fields. In Audio Engineering Society Conference: 2017 AES International Conference on Semantic Audio. Audio Engineering Society, 2017.
- Bil09
Stefan Bilbao. Numerical sound synthesis: finite difference schemes and simulation in musical acoustics. John Wiley and Sons, 2009.
- BFH+18
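James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, and others. JAX: composable transformations of Python+NumPy programs. 2018. URL: http://github.com/google/jax.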
- BHP17
Jean-Pierre Briot, Gaëtan Hadjeres, and François-David Pachet. Deep learning techniques for music generation–a survey. arXiv:1709.01620, 2017.
- BMBF18
Gary Bromham, Dave Moffat, Mathieu Barthet, and György Fazekas. The impact of compressor ballistics on the perceived style of music. In Audio Engineering Society Convention 145. Audio Engineering Society, 2018.
- BMR+20
Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, and others. Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901, 2020.
- BKBF+21
Nick Bryan-Kinns, Berker Banar, Corey Ford, C Reed, Yixiao Zhang, Simon Colton, Jack Armitage, and others. Exploring XAI for the arts: explaining latent space in generative music. In 1st Workshop on eXplainable AI approaches for debugging and diagnosis (XAI4Debugging@NeurIPS2021). 2021.
- CBS22
Jonah Casebeer, Nicholas J Bryan, and Paris Smaragdis. Meta-AF: meta-learning for adaptive filters. arXiv preprint arXiv:2204.11942, 2022.
- Che84
Chi-Tsong Chen. Linear system theory and design. Saunders college publishing, 1984.
- CKNH20
Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. In International conference on machine learning, 1597–1607. PMLR, 2020.
- CR17
Emmanouil T Chourdakis and Joshua D Reiss. A machine-learning approach to application of intelligent artificial reverberation. Journal of the Audio Engineering Society, 65(1/2):56–65, 2017.
- CR16
Emmanouil Theofanis Chourdakis and Joshua D Reiss. Automatic control of a digital reverberation effect using hybrid models. In Audio Engineering Society Conference: 60th International Conference: DREAMS (Dereverberation and Reverberation of Audio, Music, and Speech). Audio Engineering Society, 2016.
- CComunitaR22
Joseph T Colonel, Marco Comunità, and Joshua Reiss. Reverse engineering memoryless distortion effects with differentiable waveshapers. In 153rd Convention of the Audio Engineering Society. Audio Engineering Society, 2022.
- CR21
Joseph T Colonel and Joshua Reiss. Reverse engineering of a recording mix with differentiable digital signal processing. The Journal of the Acoustical Society of America, 150(1):608–619, 2021.
- CSMR22
Joseph T Colonel, Christian J Steinmetz, Marcus Michelen, and Joshua D Reiss. Direct design of biquad filter cascades with deep learning by sampling random polynomials. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 3104–3108. IEEE, 2022.
- CR+22
Joseph T. Colonel, Joshua D Reiss, and others. Approximating ballistics in a differentiable dynamic range compressor. In 153rd Convention of the Audio Engineering Society. Audio Engineering Society, 2022.
- DamskaggJValimaki+19
Eero-Pekka Damskägg, Lauri Juvela, Vesa Välimäki, and others. Real-time modeling of audio distortion circuits with deep learning. In Proc. Int. Sound and Music Computing Conf. (SMC-19), Malaga, Spain, 332–339. 2019.
- Dan18
Roger B. Dannenberg. Loudness concepts and panning laws. Introduction to Computer Music, 2018.
- DM17
Brecht De Man. Towards a better understanding of mix engineering. PhD thesis, Queen Mary University of London, 2017.
- DMR13
Brecht De Man and Joshua D Reiss. A knowledge-engineered autonomous mixing system. In 135th Audio Engineering Society Convention. Audio Engineering Society, 2013.
- DMR14
Brecht De Man and Joshua D Reiss. APE: audio perceptual evaluation toolbox for MATLAB. In Audio Engineering Society Convention 136. 2014.
- DMRS17
Brecht De Man, Joshua D Reiss, and Ryan Stables. Ten years of automatic mixing. In 3rd AES Workshop on Intelligent Music Production. September 2017.
- DV22
Fotios Drakopoulos and Sarah Verhulst. A differentiable optimisation framework for the design of individualised DNN-based hearing-aid strategies. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 351–355. IEEE, 2022.
- Dug75
Dan Dugan. Automatic microphone mixing. In 51st Convention of the Audio Engineering Society. Audio Engineering Society, 1975.
- DefossezUBB19
Alexandre Défossez, Nicolas Usunier, Léon Bottou, and Francis Bach. Music source separation in the waveform domain. arXiv preprint arXiv:1911.13254, 2019.
- EHGR21
Jesse Engel, Lamtharn Hantrakul, Chenjie Gu, and Adam Roberts. DDSP: differentiable digital signal processing. ICLR, 2021.
- Far00
Angelo Farina. Simultaneous measurement of impulse response and distortion with a swept-sine technique. In Audio Engineering Society Convention 108. 2000.
- Fen18
Steven Fenton. Automatic mixing of multitrack material using modified loudness models. In Audio Engineering Society Convention 145. Audio Engineering Society, 2018.
- Gay04
Patrick Gaydecki. Foundations of digital signal processing: theory, algorithms and hardware design. Volume 15. IET, 2004.
- GR07
E Perez Gonzalez and Joshua D Reiss. Automatic mixing: live downmixing stereo panner. In Proceedings of the 10th International Conference on Digital Audio Effects (DAFx’07), 63–68. 2007.
- GBC16
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep learning. MIT press, 2016.
- GPAM+20
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial networks. Communications of the ACM, 63(11):139–144, 2020.
- G+89
Andreas Griewank and others. On automatic differentiation. Mathematical Programming: recent developments and applications, 6(6):83–107, 1989.
- HZRS15
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE international conference on computer vision, 1026–1034. 2015.
- HCE+17
Shawn Hershey, Sourish Chaudhuri, Daniel PW Ellis, Jort F Gemmeke, Aren Jansen, R Channing Moore, Manoj Plakal, Devin Platt, Rif A Saurous, Bryan Seybold, and others. CNN architectures for large-scale audio classification. In ICASSP, 131–135. IEEE, 2017.
- HJA20
Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33:6840–6851, 2020.
- HKNR+20
Cheng-Zhi Anna Huang, Hendrik Vincent Koops, Ed Newton-Rex, Monica Dinculescu, and Carrie J Cai. AI song contest: human-AI co-creation in songwriting. arXiv preprint arXiv:2010.05388, 2020.
- IR11
Rec ITU-R. ITU-R BS. 1770-2, algorithms to measure audio programme loudness and true-peak audio level. International Telecommunications Union, Geneva, 2011.
- IR15
Rec ITU-R. ITU-R BS. 1534-3, method for the subjective assessment of intermediate quality level of audio systems. International Telecommunications Union, Geneva, 2015.
- JMM+15
missing journal in jillings2015web
- JS22
Nicolas Jonason and Bob L. T. Sturm. TimbreCLIP: connecting timbre to text and images. arXiv:2211.11225, 2022.
- KZRS19
Kevin Kilgour, Mauricio Zuluaga, Dominik Roblek, and Matthew Sharifi. Fréchet audio distance: a reference-free metric for evaluating music enhancement algorithms. In INTERSPEECH, 2350–2354. 2019.
- KW13
Diederik P Kingma and Max Welling. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114, 2013.
- KMartinezRamirezL+22
Junghyun Koo, Marco A Martínez-Ramírez, Wei-Hsiang Liao, Stefan Uhlich, Kyogu Lee, and Yuki Mitsufuji. Music mixing style transfer: a contrastive learning approach to disentangle audio effects. arXiv preprint arXiv:2211.02247, 2022.
- KPL22
Junghyun Koo, Seungryeol Paik, and Kyogu Lee. End-to-end music remastering system using self-supervised and adversarial training. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4608–4612. IEEE, 2022.
- Kuh58
Walter Kuhl. The acoustical and technological properties of the reverberation plate. EBU Review, Part A-Technical, 49:8–14, 1958.
- KPE20
Boris Kuznetsov, Julian D Parker, and Fabián Esqueda. Differentiable IIR filters for machine learning applications. In Proc. Int. Conf. Digital Audio Effects (eDAFx-20), 297–303. 2020.
- LCL22
Sungho Lee, Hyeong-Seok Choi, and Kyogu Lee. Differentiable artificial reverberation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 30:2541–2556, 2022.
- LBFM21
M Nyssim Lefford, Gary Bromham, György Fazekas, and David Moffat. Context aware intelligent mixing systems. Journal of the Audio Engineering Society, 2021.
- LE22
Søren Vøgg Lyster and Cumhur Erkut. A differentiable neural network approach to parameter estimation of reverberation. In 19th Sound and Music Computing Conference, SMC 2022, 358–364. Sound and Music Computing Network, 2022.
- MDMP+15
Zheng Ma, Brecht De Man, Pedro DL Pestana, Dawn AA Black, and Joshua D Reiss. Intelligent multitrack dynamic range compression. Journal of the Audio Engineering Society, 63(6):412–426, 2015.
- MJZF21
Pranay Manocha, Zeyu Jin, Richard Zhang, and Adam Finkelstein. CDPAM: contrastive learning for perceptual audio similarity. In ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 196–200. IEEE, 2021.
- MFR12
Stuart Mansbridge, Saoirse Finn, and Joshua D Reiss. Implementation and evaluation of autonomous multi-track fader control. In Audio Engineering Society Convention 132. Audio Engineering Society, 2012.
- MartinezRamirez20
Marco A Martínez-Ramírez. Deep learning for audio effects modeling. PhD thesis, Queen Mary University of London, 2020.
- MartinezRamirezLF+22
Marco A Martínez-Ramírez, Wei-Hsiang Liao, Giorgio Fabbro, Stefan Uhlich, Chihiro Nagashima, and Yuki Mitsufuji. Automatic music mixing with deep learning and out-of-domain data. In ISMIR. 2022.
- MartinezRamirezSM21
Marco A Martínez-Ramírez, Daniel Stoller, and David Moffat. A deep learning approach to intelligent drum mixing with the Wave-U-Net. Journal of the Audio Engineering Society, 2021.
- MartinezRamirezWSB21
Marco A Martínez-Ramírez, Oliver Wang, Paris Smaragdis, and Nicholas J Bryan. Differentiable signal processing with black-box audio effects. In ICASSP, 66–70. IEEE, 2021.
- MS21
Naotake Masuda and Daisuke Saito. Synthesizer sound matching with differentiable DSP. In ISMIR, 428–434. 2021.
- MBS20
Stylianos I Mimilakis, Nicholas J Bryan, and Paris Smaragdis. One-shot parametric audio production style transfer with application to frequency equalization. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 256–260. IEEE, 2020.
- MS19a
Dave Moffat and Mark Sandler. Machine learning multitrack gain mixing of drums. In 147th Audio Engineering Society Convention. 2019.
- MS19b
David Moffat and Mark B Sandler. Approaches in intelligent music production. Arts, 8(4):125, 2019.
- Ner20
Shahan Nercessian. Neural parametric equalizer matching using differentiable biquads. In Proc. Int. Conf. Digital Audio Effects (eDAFx-20), 265–272. 2020.
- PD00
François Pachet and Olivier Delerue. On-the-fly multi-track mixing. In 109th Convention of the Audio Engineering Society. Audio Engineering Society, 2000.
- PB09
Julian Parker and Stefan Bilbao. Spring reverberation: a physical perspective. In 12th International Conference on Digital Audio Effects (DAFx-09). 2009.
- PGM+19
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, and others. PyTorch: an imperative style, high-performance deep learning library. Advances in neural information processing systems, 2019.
- Pee04
Geoffroy Peeters. A large set of audio features for sound description (similarity and classification) in the CUIDADO project. Analysis/Synthesis Team. IRCAM, Paris, France, 54(0):1–25, 2004.
- PSDV+18
Ethan Perez, Florian Strub, Harm De Vries, Vincent Dumoulin, and Aaron Courville. FiLM: visual reasoning with a general conditioning layer. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32. 2018.
- PGR08
Enrique Perez Gonzalez and Joshua Reiss. Determination and correction of individual channel time offsets for signals involved in an audio mixture. In Audio Engineering Society Convention 125. Audio Engineering Society, 2008.
- PGR09
Enrique Perez-Gonzalez and Joshua Reiss. Automatic gain and fader control for live mixing. In 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 1–4. IEEE, 2009.
- pgr09
Enrique Perez-Gonzalez and Joshua Reiss. Automatic equalization of multichannel audio using cross-adaptive methods. Journal of the Audio Engineering Society, October 2009.
- PR14
Pedro D Pestana and Joshua D Reiss. A cross-adaptive dynamic spectral panning technique. In DAFx, 303–307. Erlangen, 2014.
- PRB17
Pedro Duarte Pestana, Joshua D Reiss, and Álvaro Barbosa. User preference on artificial reverberation and delay time parameters. Journal of the Audio Engineering Society, 65(1/2):100–107, 2017.
- RKH+21
Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, and others. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, 8748–8763. PMLR, 2021.
- RamirezR18
Marco A Martínez Ramírez and Joshua D Reiss. End-to-end equalization with convolutional neural networks. In 21st International Conference on Digital Audio Effects (DAFx-18). 2018.
- RM14
Joshua D Reiss and Andrew McPherson. Audio effects: theory, implementation and application. CRC Press, 2014.
- RM15
Danilo Rezende and Shakir Mohamed. Variational inference with normalizing flows. In International conference on machine learning, 1530–1538. PMLR, 2015.
- RFB15
Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention, 234–241. Springer, 2015.
- SBStoter+18
Michael Schoeffler, Sarah Bartoschek, Fabian-Robert Stöter, Marlene Roess, Susanne Westphal, Bernd Edler, and Jürgen Herre. webMUSHRA: a comprehensive framework for web-based listening tests. Journal of Open Research Software, 2018.
- SL61
Manfred R Schroeder and Benjamin F Logan. Colorless artificial reverberation. IRE Transactions on Audio, pages 209–214, 1961.
- SPSK11
Jeffrey Scott, Matthew Prockup, Erik M Schmidt, and Youngmoo E Kim. Automatic multi-track mixing using linear dynamical systems. In Proceedings of the 8th Sound and Music Computing Conference, Padova, Italy, 12. Citeseer, 2011.
- SerraPP21
Joan Serrà, Jordi Pons, and Santiago Pascual. SESQA: semi-supervised learning for speech quality assessment. In ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 381–385. IEEE, 2021.
- SHC+22
Siyuan Shan, Lamtharn Hantrakul, Jitong Chen, Matt Avent, and David Trevelyan. Differentiable wavetable synthesis. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4598–4602. IEEE, 2022.
- SF19
Di Sheng and György Fazekas. A feature learning siamese model for intelligent control of the dynamic range compressor. In 2019 International Joint Conference on Neural Networks (IJCNN), 1–8. IEEE, 2019.
- Sko16
Esben Skovenborg. Development of semantic scales for music mastering. In Audio Engineering Society Convention 141. 2016.
- SDWMG15
Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. Deep unsupervised learning using nonequilibrium thermodynamics. In International Conference on Machine Learning, 2256–2265. PMLR, 2015.
- SE19
Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. Advances in Neural Information Processing Systems, 2019.
- Spa97
James C Spall. A one-measurement form of simultaneous perturbation stochastic approximation. Automatica, 33(1):109–112, 1997.
- SB21
Janne Spijkervet and John Ashley Burgoyne. Contrastive learning of musical representations. arXiv preprint arXiv:2103.09410, 2021.
- SRDM19
Ryan Stables, Joshua D. Reiss, and Brecht De Man. Intelligent Music Production. Focal Press, 2019.
- SBR22
Christian J Steinmetz, Nicholas J Bryan, and Joshua D Reiss. Style transfer of audio effects with differentiable signal processing. arXiv preprint arXiv:2207.08759, 2022.
- SIC21
Christian J Steinmetz, Vamsi Krishna Ithapu, and Paul Calamia. Filtered noise shaping for time domain room impulse response estimation from reverberant speech. In 2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 221–225. IEEE, 2021.
- SPPSerra21
Christian J Steinmetz, Jordi Pons, Santiago Pascual, and Joan Serrà. Automatic multitrack mixing with a differentiable mixing console of neural audio effects. In ICASSP. IEEE, 2021.
- SR20
Christian J Steinmetz and Joshua D Reiss. Auraloss: audio focused loss functions in PyTorch. In Digital Music Research Network One-day Workshop. 2020.
- SED18
Daniel Stoller, Sebastian Ewert, and Simon Dixon. Wave-U-Net: a multi-scale neural network for end-to-end audio source separation. ISMIR, 2018.
- StoterULM19
Fabian-Robert Stöter, Stefan Uhlich, Antoine Liutkus, and Yuki Mitsufuji. Open-Unmix: a reference implementation for music source separation. Journal of Open Source Software, 4(41):1667, 2019.
- TMB21
Zehai Tu, Ning Ma, and Jon Barker. DHASP: differentiable hearing aid speech processing. In ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 296–300. IEEE, 2021.
- TSK+22
Joseph Turian, Jordie Shier, Humair Raj Khan, Bhiksha Raj, Björn W Schuller, Christian J Steinmetz, Colin Malloy, George Tzanetakis, Gissel Velarde, Kirk McNally, and others. HEAR: holistic evaluation of audio representations. In NeurIPS 2021 Competitions and Demonstrations Track, 125–145. PMLR, 2022.
- TJM07
George Tzanetakis, Randy Jones, and Kirk McNally. Stereo panning features for classifying recording production style. In ISMIR, 441–444. 2007.
- VZolzerA06
Vincent Verfaille, U. Zölzer, and Daniel Arfib. Adaptive digital audio effects (A-DAFx): a new class of sound transformations. IEEE Transactions on Audio, Speech and Language Processing, 14(5):1817–1831, 2006.
- ValimakiPS+12
Vesa Välimäki, Julian D Parker, Lauri Savioja, Julius O Smith, and Jonathan S Abel. Fifty years of artificial reverberation. IEEE Transactions on Audio, Speech, and Language Processing, 20(5):1421–1448, 2012.
- ValimakiR16
Vesa Välimäki and Joshua D Reiss. All about audio equalization: solutions and frontiers. Applied Sciences, 6(5):129, 2016.
- WRA12
Dominic Ward, Joshua D Reiss, and Cham Athwal. Multitrack mixing using a model of loudness and partial loudness. In Audio Engineering Society Convention 133. Audio Engineering Society, 2012.
- WWM+17
Dominic Ward, Hagen Wierstorf, Russell Mason, Mark Plumbley, and Christopher Hummersone. Estimating the loudness balance of musical mixtures using audio source separation. In Proceedings of the 3rd Workshop on Intelligent Music Production (WIMP). 2017.
- WMMS20
Thomas Wilmering, David Moffat, Alessia Milo, and Mark B Sandler. A history of audio effects. Applied Sciences, 10(3):791, 2020.
- WFBS20
Minz Won, Andres Ferraro, Dmitry Bogdanov, and Xavier Serra. Evaluation of CNN-based automatic music tagging models. In Proc. of 17th Sound and Music Computing. 2020.
- WValimaki+22
Alec Wright, Vesa Välimäki, and others. Grey-box modelling of dynamic range compression. In Proc. Int. Conf. Digital Audio Effects (DAFX), Vienna, Austria, 304–311. 2022.
- WCZ+22
Yusong Wu, Ke Chen, Tianyu Zhang, Yuchen Hui, Taylor Berg-Kirkpatrick, and Shlomo Dubnov. Large-scale contrastive language-audio pretraining with feature fusion and keyword-to-caption augmentation. arXiv preprint arXiv:2211.06687, 2022.
- YSK20
Ryuichi Yamamoto, Eunwoo Song, and Jae-Min Kim. Parallel WaveGAN: a fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 6199–6203. IEEE, 2020.
- Zolzer11
Udo Zölzer. DAFX: digital audio effects. John Wiley and Sons, 2011.
- ZolzerAA+02
Udo Zölzer, Xavier Amatriain, Daniel Arfib, Jordi Bonada, Giovanni De Poli, Pierre Dutilleux, Gianpaolo Evangelista, Florian Keiler, Alex Loscos, Davide Rocchesso, and others. DAFX: digital audio effects. John Wiley and Sons, 2002.