Music detection using spectral peak analysis

Abstract

In one embodiment, a music detection (MD) module accumulates sets of one or more frames and performs FFT processing on each set to recover a set of coefficients, each corresponding to a different frequency k. For each frame, the module identifies candidate musical tones by searching for peak values in the set of coefficients. If a coefficient corresponds to a peak, then a variable TONE[k] corresponding to the coefficient is set equal to one. Otherwise, the variable is set equal to zero. For each variable TONE[k] having a value of one, a corresponding accumulator A[k] is increased. Candidate musical tones that are short in duration are filtered out by comparing each accumulator A[k] to a minimum duration threshold. A determination is made as to whether or not music is present based on a number of candidate musical tones and a sum of candidate musical tone durations using a state machine.

Aleksandr Petiushko Александр Петюшко
Aleksandr Petiushko Александр Петюшко
Director, Head of ML Research / Adjunct Professor / PhD

Principal R&D Researcher (15+ years of experience), R&D Technical Leader (10+ years of experience), and R&D Manager (8+ years of experience). Running and managing industrial research and academic collaboration (35+ publications, 30+ patents). Hiring and transforming AI/ML teams. Inspired by theoretical computer science and how it changes the world.