Chord detection algorithms?

别那么骄傲 2020-12-12 15:40

I am developing software that depends on musical chord detection. I know some algorithms for pitch detection, with techniques based on cepstral analysis or autocorrelation.

7 answers
  • 2020-12-12 16:17

    Also http://www.schmittmachine.com/dywapitchtrack.html

    The dywapitchtrack library computes the pitch of an audio stream in real time. The pitch is the main frequency of the waveform (the 'note' being played or sung). It is expressed as a float in Hz.

    And http://clam-project.org/ may help a little.

  • 2020-12-12 16:21

    The author of Capo, a transcription program for the Mac, has a pretty in-depth blog. The entry "A Note on Auto Tabbing" has some good jumping off points:

    I started researching different methods of automatic transcription in mid-2009, because I was curious about how far along this technology was, and if it could be integrated into a future version of Capo.

    Each of these automatic transcription algorithms starts out with some kind of intermediate representation of the audio data, and then transfers that into a symbolic form (i.e. note onsets and durations).

    This is where I encountered some computationally expensive spectral representations (The Continuous Wavelet Transform (CWT), Constant Q Transform (CQT), and others.) I implemented all of these spectral transforms so that I could also implement the algorithms presented by the papers I was reading. This would give me an idea of whether they would work in practice.

    Capo has some impressive technology. The standout feature is that its main view is not a frequency spectrogram like most other audio programs. It presents the audio like a piano roll, with the notes visible to the naked eye.


    (source: supermegaultragroovy.com)

    (Note: The hard note bars were drawn by a user. The fuzzy spots underneath are what Capo displays.)

  • 2020-12-12 16:25

    This is a very difficult pattern matching problem, probably suitable for an AI technique such as training neural nets or genetic algorithms.

    Basically, at every point in time, you guess the number of notes being played, the notes themselves, the instruments that played them, their amplitudes, and their durations. Then you sum the magnitudes of all the harmonics and overtones that those instruments would generate when played at that volume at that point in their envelope (attack, decay, etc.). Subtract that sum from the spectrum of your signal, then minimize the difference over all possibilities. Pattern recognition of the transient noise at the very onset of the note (thump, squeak, pluck, etc.) might also be important. Then do some decision analysis to make sure your choices make sense (e.g. a clarinet didn't suddenly change into a trumpet playing another note and back again 80 ms later), to minimize the error probability.

    If you can constrain your choices (e.g. only 2 flutes playing only quarter notes, etc.), especially to instruments with very limited overtone energy, it makes the problem a lot easier.
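
    As a rough illustration of the guess, synthesize, subtract, and minimize loop described above, here is a minimal sketch for a heavily constrained case. Everything in it is an assumption made for the example: a single hypothetical instrument whose harmonic amplitudes are given by HARMONIC_PROFILE, four candidate notes whose frequencies are rounded to a 10 Hz grid so they fall exactly on DFT bins, equal note amplitudes, and no envelope or onset modelling. It only shows the shape of the search, not a practical transcriber.

```python
import itertools
import math

SAMPLE_RATE = 4000  # Hz (assumed for the example)
N = 400             # window length, so DFT bins are 10 Hz apart

# Hypothetical instrument model: relative amplitudes of harmonics 1..4.
HARMONIC_PROFILE = [1.0, 0.5, 0.25, 0.125]
# Candidate notes, rounded to the 10 Hz bin grid (real C4 is about 261.63 Hz, etc.).
CANDIDATE_NOTES = {"C4": 260.0, "E4": 330.0, "G4": 390.0, "A4": 440.0}


def synthesize(notes):
    """Time-domain test signal: every note is a sum of its harmonics."""
    signal = []
    for t in range(N):
        sample = 0.0
        for name in notes:
            for h, amp in enumerate(HARMONIC_PROFILE, start=1):
                freq = CANDIDATE_NOTES[name] * h
                sample += amp * math.sin(2 * math.pi * freq * t / SAMPLE_RATE)
        signal.append(sample)
    return signal


def magnitude_spectrum(signal):
    """Naive DFT magnitudes, scaled so a unit sine on a bin gives 1.0."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(signal))
        im = sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(signal))
        spectrum.append(2 * math.hypot(re, im) / n)
    return spectrum


def predicted_spectrum(notes, n_bins):
    """Sum of harmonic magnitudes the guessed notes would produce."""
    bin_hz = SAMPLE_RATE / N
    spectrum = [0.0] * n_bins
    for name in notes:
        for h, amp in enumerate(HARMONIC_PROFILE, start=1):
            k = round(CANDIDATE_NOTES[name] * h / bin_hz)
            if k < n_bins:
                spectrum[k] += amp
    return spectrum


def best_hypothesis(observed, max_notes=2):
    """Try every combination of up to max_notes notes; keep the smallest residual."""
    best = None
    for count in range(1, max_notes + 1):
        for notes in itertools.combinations(sorted(CANDIDATE_NOTES), count):
            predicted = predicted_spectrum(notes, len(observed))
            error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
            if best is None or error < best[0]:
                best = (error, notes)
    return best[1]


observed = magnitude_spectrum(synthesize(("C4", "G4")))
print(best_hypothesis(observed))  # ('C4', 'G4')
```

    Note that the third harmonic of the C and the second harmonic of the G land on the same bin (780 Hz) here, which is exactly the kind of overlap that makes the unconstrained problem hard. With real instruments, unknown amplitudes, and more notes, the search space grows very quickly.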

  • 2020-12-12 16:30

    This is quite a good Open Source Project: https://patterns.enm.bris.ac.uk/hpa-software-package

    It detects chords based on a chromagram, which is a good approach: it breaks a window of the whole spectrum down into an array of 12 pitch classes with float values. Chords can then be detected with a Hidden Markov Model.

    It should provide you with everything you need. :)
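
    To make the chromagram idea concrete, here is a small sketch that folds one spectral window into 12 pitch classes and then picks a chord. It is not the code from the package linked above: in place of the Hidden Markov Model it uses plain template matching against major and minor triads, which ignores the temporal smoothing an HMM would give you. The sample rate, window length, frequency range, and all function names are assumptions for the example.

```python
import cmath
import math

SAMPLE_RATE = 4000  # Hz (assumed)
N = 1000            # window length, so DFT bins are 4 Hz apart
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chromagram(signal, f_min=200.0, f_max=1000.0):
    """Fold the energy of one Hann-windowed frame onto 12 pitch classes."""
    n = len(signal)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * t / n))
                for t, x in enumerate(signal)]
    chroma = [0.0] * 12
    first_bin = int(f_min * n / SAMPLE_RATE)
    last_bin = int(f_max * n / SAMPLE_RATE)
    for k in range(first_bin, last_bin + 1):
        freq = k * SAMPLE_RATE / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(windowed))
        # Nearest MIDI note number (A4 = 440 Hz = 69); note % 12 is the pitch class.
        midi = round(69 + 12 * math.log2(freq / 440.0))
        chroma[midi % 12] += abs(coeff) ** 2
    total = sum(chroma) or 1.0
    return [c / total for c in chroma]


def chord_templates():
    """Pitch-class sets for the 12 major and 12 minor triads."""
    templates = {}
    for root in range(12):
        name = PITCH_CLASSES[root]
        templates[name] = {root, (root + 4) % 12, (root + 7) % 12}
        templates[name + "m"] = {root, (root + 3) % 12, (root + 7) % 12}
    return templates


def detect_chord(chroma):
    """Pick the triad whose pitch classes hold the most chroma energy."""
    templates = chord_templates()
    return max(templates, key=lambda name: sum(chroma[pc] for pc in templates[name]))


def sine_mix(freqs):
    """Test signal: a sum of pure sine tones."""
    return [sum(math.sin(2 * math.pi * f * t / SAMPLE_RATE) for f in freqs)
            for t in range(N)]


c_major = sine_mix([261.63, 329.63, 392.00])  # C4, E4, G4
print(detect_chord(chromagram(c_major)))      # C
```

    The naive DFT is only there to keep the sketch dependency-free; in practice you would use an FFT, and the 4 Hz bin spacing is too coarse to separate semitones much below 200 Hz.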

  • 2020-12-12 16:31

    See my answer to this question: How can I do real-time pitch detection in .Net?

    The reference to this IEEE paper is mainly what you're looking for: http://ieeexplore.ieee.org/Xplore/login.jsp?reload=true&url=/iel5/89/18967/00876309.pdf?arnumber=876309

    The harmonics are throwing you off. Plus, humans can find fundamentals in sound even when the fundamental isn't present! Think of reading with half of the letters covered: the brain fills in the gaps.

    The context of other sounds in the mix, and what came before, is very important to how we perceive notes.
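
    The missing-fundamental effect is easy to reproduce numerically. The sketch below is my own illustration, not taken from the paper: it builds a tone from the 2nd, 3rd, and 4th harmonics of 200 Hz only, shows that the spectrum has essentially nothing at 200 Hz, and then shows that a simple autocorrelation estimate still reports 200 Hz, because the waveform repeats every 1/200 s. The sample rate, window length, and search range are assumptions.

```python
import math

SAMPLE_RATE = 8000  # Hz (assumed)
N = 800             # 0.1 s of audio


def missing_fundamental_tone(f0=200.0, harmonics=(2, 3, 4)):
    """Sum of harmonics of f0 with no energy at f0 itself."""
    return [sum(math.sin(2 * math.pi * f0 * h * t / SAMPLE_RATE) for h in harmonics)
            for t in range(N)]


def magnitude_at(signal, freq):
    """Magnitude of a single DFT component, scaled so a unit sine gives 1.0."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * t / SAMPLE_RATE)
             for t, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * t / SAMPLE_RATE)
             for t, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n


def autocorrelation_pitch(signal, f_min=80.0, f_max=1000.0):
    """Pitch estimate from the lag with the largest autocorrelation."""
    n = len(signal)
    min_lag = int(SAMPLE_RATE / f_max)
    max_lag = int(SAMPLE_RATE / f_min)

    def correlation(lag):
        return sum(signal[t] * signal[t + lag] for t in range(n - lag))

    best_lag = max(range(min_lag, max_lag + 1), key=correlation)
    return SAMPLE_RATE / best_lag


tone = missing_fundamental_tone()
print(round(magnitude_at(tone, 200.0), 6))  # 0.0: no energy at the fundamental
print(round(magnitude_at(tone, 400.0), 6))  # 1.0: the lowest partial actually present
print(autocorrelation_pitch(tone))          # 200.0
```

    A detector that simply picks the strongest or lowest spectral peak would report 400 Hz here, an octave above what a listener hears.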

  • 2020-12-12 16:32

    This post is a bit old, but I thought I'd add the following paper to the discussion:

    Klapuri, Anssi; "Multipitch Analysis of Polyphonic Music and Speech Signals Using an Auditory Model"; IEEE Transactions on Audio, Speech, and Language Processing, vol. 16, no. 2, February 2008, p. 255.

    The paper acts somewhat like a literature review of multipitch analysis and discusses a method based on an auditory model:

    (The image is from the paper. I don't know if I have to get permission to post it.)
