Fundamentals of music processing : using Python and Jupyter Notebooks

The textbook provides both profound technological knowledge and a comprehensive treatment of essential topics in music processing and music information retrieval (MIR). Including numerous examples, figures, and exercises, this book is suited for students, lecturers, and researchers working in audio...

Bibliographic Details
Main Author Müller, Meinard
Format eBook Book
Language English
Published Cham Springer 2021
Springer International Publishing AG
Springer International Publishing
Edition 2
ISBN 9783030698072
3030698076
DOI 10.1007/978-3-030-69808-9

Table of Contents:
  • Intro -- Preface to the Second Edition -- Preface to the First Edition -- Content -- Target Readership -- View: A First Course in Music Processing -- View: Introduction to Fourier Analysis and Applications -- View: Data Representations and Algorithms -- Acknowledgements -- Contents -- Basic Symbols and Notions -- Chapter 1 Music Representations -- 1.1 Sheet Music Representations -- 1.1.1 Musical Notes and Pitches -- 1.1.2 Western Music Notation -- 1.2 Symbolic Representations -- 1.2.1 Piano-Roll Representations -- 1.2.2 MIDI Representations -- 1.2.3 Score Representations -- 1.2.4 Optical Music Recognition -- 1.3 Audio Representation -- 1.3.1 Waves and Waveforms -- 1.3.2 Frequency and Pitch -- 1.3.3 Dynamics, Intensity, and Loudness -- 1.3.4 Timbre -- 1.4 Summary and Further Readings -- 1.5 FMP Notebooks -- References -- Exercises -- Chapter 2 Fourier Analysis of Signals -- 2.1 The Fourier Transform in a Nutshell -- 2.1.1 Fourier Transform for Analog Signals -- 2.1.1.1 The Role of the Phase -- 2.1.1.2 Computing Similarity with Integrals -- 2.1.1.3 First Definition of the Fourier Transform -- 2.1.1.4 Complex Numbers -- 2.1.1.5 Complex Definition of the Fourier Transform -- 2.1.1.6 Fourier Representation -- 2.1.2 Examples -- 2.1.3 Discrete Fourier Transform -- 2.1.4 Short-Time Fourier Transform -- 2.2 Signals and Signal Spaces -- 2.2.1 Analog Signals -- 2.2.2 Digital Signals -- 2.2.2.1 Sampling -- 2.2.2.2 Quantization -- 2.2.3 Signal Spaces -- 2.2.3.1 Complex Numbers -- 2.2.3.2 Vector Spaces -- 2.2.3.3 Inner Products -- 2.2.3.4 The Space l2(Z) -- 2.2.3.5 The Space L2(R) -- 2.2.3.6 The Space L2([0, 1)) -- 2.2.3.7 Hilbert Spaces -- 2.3 Fourier Transform -- 2.3.1 Fourier Transform for Periodic CT-Signals -- 2.3.2 Complex Formulation of the Fourier Transform -- 2.3.2.1 Exponential Function -- 2.3.2.2 Polar Coordinates -- 2.3.2.3 Complex Fourier Series
  • 2.3.2.4 Relation Between Complex and Real Fourier Series -- 2.3.3 Fourier Transform for CT-Signals -- 2.3.3.1 Interference -- 2.3.3.2 Fourier Transform for Impulses -- 2.3.3.3 Translation and Modulation -- 2.3.4 Fourier Transform for DT-Signals -- 2.3.4.1 Periodicity and Aliasing -- 2.3.4.2 Riemann Approximation -- 2.3.4.3 Chirp Signal Example -- 2.4 Discrete Fourier Transform (DFT) -- 2.4.1 Signals of Finite Length -- 2.4.2 Definition of the DFT -- 2.4.3 Fast Fourier Transform (FFT) -- 2.4.4 Interpretation of the DFT -- 2.5 Short-Time Fourier Transform (STFT) -- 2.5.1 Definition of the STFT -- 2.5.1.1 Alternative Definition of the STFT -- 2.5.1.2 Role of the Window Function -- 2.5.2 Spectrogram Representation -- 2.5.3 Discrete Version of the STFT -- 2.5.3.1 Summary -- 2.5.3.2 Examples -- 2.6 Summary and Further Readings -- 2.7 FMP Notebooks -- References -- Exercises -- Chapter 3 Music Synchronization -- 3.1 Audio Features -- 3.1.1 Log-Frequency Spectrogram -- 3.1.2 Chroma Features -- 3.1.2.1 Logarithmic Compression -- 3.1.2.2 Transpositions -- 3.1.2.3 Concluding Example -- 3.2 Dynamic Time Warping -- 3.2.1 Basic Approach -- 3.2.1.1 Warping Path -- 3.2.1.2 Optimal Warping Path and DTW Distance -- 3.2.1.3 Dynamic Programming Algorithm -- 3.2.2 DTW Variants -- 3.2.2.2 Local Weights -- 3.2.2.3 Global Constraints -- 3.2.2.4 Multiscale DTW -- 3.3 Applications -- 3.3.1 Multimodal Music Navigation -- 3.3.1.1 Interpretation Switcher Interface -- 3.3.1.2 Score Viewer Interface -- 3.3.2 Tempo Curves -- 3.4 Summary and Further Readings -- Audio Features -- Dynamic Time Warping -- Music Synchronization -- Applications -- 3.5 FMP Notebooks -- References -- Exercises -- Chapter 4 Music Structure Analysis -- 4.1 General Principles -- 4.1.1 Segmentation and Structure Analysis -- 4.1.2 Musical Structure -- 4.1.3 Musical Dimensions -- 4.2 Self-Similarity Matrices
  • 4.2.1 Basic Definitions and Properties -- 4.2.2 Enhancement Strategies -- 4.2.2.1 Feature Representation -- 4.2.2.2 Path Smoothing -- 4.2.2.3 Transposition Invariance -- 4.2.2.4 Thresholding -- 4.3 Audio Thumbnailing -- 4.3.1 Fitness Measure -- 4.3.1.1 Path Family -- 4.3.1.2 Optimization Scheme -- 4.3.1.3 Definition of Fitness Measure -- 4.3.1.4 Thumbnail Selection -- 4.3.2 Scape Plot Representation -- 4.3.3 Discussion of Properties -- 4.4 Novelty-Based Segmentation -- 4.4.1 Novelty Detection -- 4.4.2 Structure Features -- 4.5 Evaluation -- Precision, Recall, F-Measure -- Structure Annotations -- Labeling Evaluation -- Boundary Evaluation -- Thumbnail Evaluation -- 4.6 Summary and Further Readings -- Self-Similarity Matrices -- Audio Thumbnailing -- Segmentation Approaches -- Evaluation -- 4.7 FMP Notebooks -- References -- Exercises -- Chapter 5 Chord Recognition -- 5.1 Basic Theory of Harmony -- 5.1.1 Intervals -- 5.1.1.1 Semitone Differences -- 5.1.1.2 Frequency Ratios -- 5.1.1.3 Consonance and Dissonance -- 5.1.2 Chords and Scales -- 5.1.2.1 Triads -- 5.1.2.2 Major and Minor Chords -- 5.1.2.3 Musical Scales -- 5.1.2.4 Circle of Fifths -- 5.1.2.5 Functional Relation of Chords -- 5.1.2.6 Chord Progressions -- 5.2 Template-Based Chord Recognition -- 5.2.1 Basic Approach -- 5.2.2 Evaluation -- 5.2.2.1 Manual Annotation -- 5.2.2.2 Precision, Recall, F-measure -- 5.2.3 Ambiguities in Chord Recognition -- 5.2.3.1 Chord Ambiguities -- 5.2.3.2 Acoustic Ambiguities -- 5.2.3.3 Tuning -- 5.2.3.4 Segmentation Ambiguities -- 5.2.4 Enhancement Strategies -- 5.2.4.1 Templates with Harmonics -- 5.2.4.2 Templates from Examples -- 5.2.4.3 Spectral Enhancement -- 5.2.4.4 Prefiltering -- 5.3 HMM-Based Chord Recognition -- 5.3.1 Markov Chains and Transition Probabilities -- 5.3.2 Hidden Markov Models -- 5.3.3 Evaluation and Model Specification
  • 5.3.3.1 Evaluation Problem -- 5.3.3.2 Uncovering Problem -- 5.3.3.3 Estimation Problem -- 5.3.4 Application to Chord Recognition -- 5.3.4.1 Specification of Emission Probabilities -- 5.3.4.2 Specification of Transition Probabilities -- 5.3.4.3 Effect of HMM-Based Postfiltering -- 5.4 Summary and Further Readings -- Chord Recognition -- Hidden Markov Models -- 5.5 FMP Notebooks -- References -- Exercises -- Chapter 6 Tempo and Beat Tracking -- 6.1 Onset Detection -- 6.1.1 Energy-Based Novelty -- 6.1.2 Spectral-Based Novelty -- 6.1.3 Phase-Based Novelty -- 6.1.4 Complex-Domain Novelty -- 6.2 Tempo Analysis -- 6.2.1 Tempogram Representations -- 6.2.2 Fourier Tempogram -- 6.2.3 Autocorrelation Tempogram -- 6.2.4 Cyclic Tempogram -- 6.3 Beat and Pulse Tracking -- 6.3.1 Predominant Local Pulse -- 6.3.1.1 Definition of PLP Function -- 6.3.1.2 Discussion of Properties -- 6.3.2 Beat Tracking by Dynamic Programming -- 6.3.3 Adaptive Windowing -- 6.4 Summary and Further Readings -- Onset Detection -- Tempo Analysis -- Beat Tracking -- 6.5 FMP Notebooks -- References -- Exercises -- Chapter 7 Content-Based Audio Retrieval -- 7.1 Audio Identification -- 7.1.1 General Requirements -- 7.1.2 Audio Fingerprints Based on Spectral Peaks -- 7.1.2.1 Design of Audio Fingerprints -- 7.1.2.2 Fingerprint Matching -- 7.1.3 Indexing, Retrieval, Inverted Lists -- 7.1.4 Index-Based Audio Identification -- 7.2 Audio Matching -- 7.2.1 General Requirements and Feature Design -- 7.2.2 Diagonal Matching -- 7.2.3 DTW-Based Matching -- 7.3 Version Identification -- 7.3.1 Versions in Music -- 7.3.1.1 Types of Versions -- 7.3.1.2 Types of Modifications -- 7.3.2 Identification Procedure -- 7.3.3 Evaluation Measures -- 7.4 Summary and Further Readings -- Audio Identification -- Audio Matching -- Version Identification -- Alignment Scenarios -- 7.5 FMP Notebooks -- References -- Exercises
  • Chapter 8 Musically Informed Audio Decomposition -- 8.1 Harmonic-Percussive Separation -- 8.1.1 Horizontal-Vertical Spectrogram Decomposition -- 8.1.1.1 Median Filtering -- 8.1.1.2 Binary and Soft Masking -- 8.1.2 Signal Reconstruction -- 8.1.2.1 Signal Reconstruction from Original STFT -- 8.1.2.2 Signal Reconstruction from a Modified STFT -- 8.1.3 Applications -- 8.2 Melody Extraction -- 8.2.1 Instantaneous Frequency Estimation -- 8.2.2 Salience Representation -- 8.2.2.1 Refined Log-Frequency Spectrogram -- 8.2.2.2 Using Instantaneous Frequency -- 8.2.2.3 Harmonic Summation -- 8.2.3 Informed Fundamental Frequency Tracking -- 8.2.3.1 Continuity Constraints -- 8.2.3.2 Score-Informed Constraints -- 8.2.3.3 Applications -- 8.3 NMF-Based Audio Decomposition -- 8.3.1 Nonnegative Matrix Factorization -- 8.3.1.1 Formal Definition of NMF -- 8.3.1.2 Gradient Descent -- 8.3.1.3 Learning the Factorization Using Gradient Descent -- 8.3.1.4 Multiplicative Update Rules -- 8.3.2 Spectrogram Factorization -- 8.3.2.1 Template Constraints -- 8.3.2.2 Score-Informed Constraints -- 8.3.2.3 Onset Models -- 8.3.3 Audio Decomposition -- 8.3.3.1 Separation Process Using Spectral Masking -- 8.3.3.2 Notewise Audio Processing -- 8.3.3.3 Audio Editing -- 8.4 Summary and Further Readings -- Harmonic-Percussive Separation -- Melody Extraction -- NMF-Based Audio Decomposition -- 8.5 FMP Notebooks -- References -- Exercises -- Index