Music transcription [closed]

坚强是说给别人听的谎言 submitted on 2019-12-02 21:30:43

This field is known as machine listening.

Polyphonic transcription of digitally encoded music is one of the holy grails of machine listening. It is an unsolved problem, and an area of active research. The sub-fields include:

  • Onset detection
  • Beat extraction (detection of the metric structure, time signature, etc.)
  • Pitch detection (possible using autocorrelation and other methods on monophonic signals, but an unsolved problem when applied to complex polyphonic music)
  • Key detection (key signature detection).
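
As a rough illustration of the monophonic case mentioned in the list, here is a minimal sketch of autocorrelation-based pitch detection in plain Python. The function name `estimate_pitch_autocorr`, the `fmin`/`fmax` search range, and the synthetic 220 Hz test tone are all assumptions made for this example, not part of any library, and real recordings would need more care (normalisation, octave-error handling, voiced/unvoiced decisions).

```python
import math

def estimate_pitch_autocorr(samples, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental frequency (Hz) of one monophonic frame.

    Hypothetical helper for illustration: it picks the lag with the largest
    raw autocorrelation inside the lag range implied by fmin..fmax.
    """
    n = len(samples)
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered
    best_lag, best_value = None, 0.0
    for lag in range(min_lag, min(max_lag, n - 1) + 1):
        # Correlate the frame with a copy of itself delayed by `lag` samples.
        value = sum(samples[i] * samples[i + lag] for i in range(n - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    # The best lag approximates one period of the signal, in samples.
    return sample_rate / best_lag if best_lag else None

sample_rate = 8000
# Synthetic test signal (an assumption for this demo): 50 ms of a 220 Hz sine.
frame = [math.sin(2 * math.pi * 220.0 * t / sample_rate) for t in range(400)]
pitch = estimate_pitch_autocorr(frame, sample_rate)
print(pitch)  # close to 220 Hz; integer lags limit the resolution
```

Restricting the lag range matters: without a lower bound, the trivial peak near lag 0 would always win. On polyphonic input the autocorrelation has competing peaks from every note, which is one way to see why this approach does not carry over.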

Depending on the nature of your project, you might find it useful to explore the SuperCollider programming environment. SC is a language designed for projects such as this; it already has a large number of machine listening plugins (UGens) and a comprehensive framework for dealing with FFT, audio signals, and much more.

Colin Pickard

This question about note onset detection contains a lot of information which may be useful to you.

This sounds like a huge but very interesting project. Good luck to you.

Music transcription means creating music notation from sound (or audio data). While accomplished musicians and especially composers are able to do this, it's an extremely difficult task for a machine, and as far as I know, there has been little success so far - mostly academic experiments.

Basically, to recognize notes, you want to know where they start, where they end, and what their pitch is. The Fourier transform is the most basic way to turn the time domain (audio data) into the frequency domain (pitches) - in principle. In practice, musical instruments generate lots of harmonics (overtones), and once polyphony (many simultaneous F0s) is added, it's a mess.

You could try feeding something like 50 millisecond sequential slices of the audio data to the FFT. This way you would get the spectrum of each slice, then detect the strongest peaks in each slice, and infer the rhythm from what happens between successive slices.
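
The slicing idea above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the helper names (`magnitude_spectrum`, `strongest_peak_per_slice`, `tone`) are made up for this example, the input is a clean synthetic two-note sine melody, and a naive O(N^2) DFT stands in for a real FFT routine so the snippet needs nothing beyond the standard library.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (a real FFT would be used in practice)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def strongest_peak_per_slice(samples, sample_rate, slice_ms=50):
    """Return the frequency (Hz) of the strongest bin in each consecutive slice."""
    size = int(sample_rate * slice_ms / 1000)
    # Hann window to reduce spectral leakage at the slice edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (size - 1)) for i in range(size)]
    peaks = []
    for start in range(0, len(samples) - size + 1, size):
        frame = [x * w for x, w in zip(samples[start:start + size], window)]
        mags = magnitude_spectrum(frame)
        k = max(range(1, len(mags)), key=mags.__getitem__)  # skip the DC bin
        peaks.append(k * sample_rate / size)  # bin index -> frequency in Hz
    return peaks

def tone(freq, seconds, sample_rate):
    """Synthetic sine tone, used here only as test input."""
    count = round(seconds * sample_rate)
    return [math.sin(2 * math.pi * freq * t / sample_rate) for t in range(count)]

sample_rate = 4000
signal = tone(440.0, 0.1, sample_rate) + tone(660.0, 0.1, sample_rate)
peaks = strongest_peak_per_slice(signal, sample_rate)
print(peaks)  # [440.0, 440.0, 660.0, 660.0]
```

A change in the strongest peak between successive slices (here between the second and third) marks a candidate note boundary. Note the trade-off built into the slice length: 50 ms slices give a bin spacing of 20 Hz, which is too coarse to separate adjacent semitones in the lower registers, while longer slices blur the note timing.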

Sorry, I couldn't help much... But just wanted to point out that what you're trying to do is extremely difficult, seriously. Perhaps you should start from something simpler, like detecting one-note sine wave melodies. Good luck!

For detecting the fundamental frequency of the melody in polyphonic music, you can try out the MELODIA Vamp plug-in (non-commercial use only): http://mtg.upf.edu/technologies/melodia

If you want to implement a melody extraction algorithm yourself, you're going to have to check out the current state of the art in research. A good place to start might be the MIREX melody extraction annual evaluation campaign: http://www.music-ir.org/mirex/wiki/Audio_Melody_Extraction

That, or just google "melody extraction" ;)
