I recently discovered the GNSDK (Gracenote SDK), which seems to provide examples in several programming languages for recognizing music samples by fingerprinting them and then querying Gracenote's audio database to get the corresponding artist and song title.
But the documentation is horrible.
How can I recognize an audio sample file using Python and the GNSDK? There aren't any examples or tutorials in the provided docs.
Edit: I really want to use the GNSDK with Python. Don't post anything unrelated, you'll waste your time.
I ended up using ACRCloud, which works very well. It seems that everyone who wants to use Gracenote falls back to ACRCloud for some reason... Now I know why.
Python example:
from acrcloud.recognizer import ACRCloudRecognizer
config = {
    'host': 'eu-west-1.api.acrcloud.com',
    'access_key': 'access key',
    'access_secret': 'secret key',
    'debug': True,
    'timeout': 10
}
acrcloud = ACRCloudRecognizer(config)
print(acrcloud.recognize_by_file('sample of a track.wav', 0))
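The call above prints the raw server response, which, as far as I can tell, comes back as a JSON string, so the artist and title still have to be pulled out of it. Here is a minimal sketch, assuming the response layout ACRCloud documents for music results (`status.code`, `metadata.music[].title`, `metadata.music[].artists[].name`). The helper name `extract_track` and the sample payload are made up for illustration, so check them against a real response before relying on this.

```python
import json

def extract_track(response_text):
    """Return (artists, title) of the first match, or None if nothing matched."""
    response = json.loads(response_text)
    # Assumption: status code 0 means success; anything else is "no result" or an error.
    if response.get("status", {}).get("code") != 0:
        return None
    matches = response.get("metadata", {}).get("music", [])
    if not matches:
        return None
    best = matches[0]
    artists = ", ".join(a.get("name", "") for a in best.get("artists", []))
    return artists, best.get("title", "")

# Hand-written sample payload, not real service output.
sample = json.dumps({
    "status": {"msg": "Success", "code": 0},
    "metadata": {"music": [{"title": "Example Song",
                            "artists": [{"name": "Example Artist"}]}]},
})
print(extract_track(sample))
```

With the real SDK you would pass the string returned by `recognize_by_file` instead of `sample`.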
Keywords are: Beat Spectrum Analysis and Rhythm Detection.
This well-known Python library may contain a solution to your question: https://github.com/aubio/aubio
I also recommend checking this page for other libraries: https://wiki.python.org/moin/PythonInMusic
Lastly, this project is a more Python-friendly solution and an easy way to start: https://github.com/librosa/librosa
An example from librosa that calculates the tempo (beats per minute) of a song:
# Beat tracking example
import librosa
import numpy as np
# 1. Get the file path to an included audio example
#    (recent librosa releases replaced librosa.util.example_audio_file()
#    with librosa.example(), which downloads the sample on first use)
filename = librosa.example('nutcracker')
# 2. Load the audio as a waveform `y`
#    Store the sampling rate as `sr`
y, sr = librosa.load(filename)
# 3. Run the default beat tracker
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
# tempo may be a float or a one-element array depending on the librosa version
print('Estimated tempo: {:.2f} beats per minute'.format(float(np.atleast_1d(tempo)[0])))
# 4. Convert the frame indices of beat events into timestamps
beat_times = librosa.frames_to_time(beat_frames, sr=sr)
print('Saving output to beat_times.csv')
# librosa.output was removed in recent releases, so write the CSV with NumPy
np.savetxt('beat_times.csv', beat_times, fmt='%.3f')
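Step 4 is plain arithmetic: a frame index maps to seconds as frame * hop_length / sr. Here is a sketch in pure Python, assuming librosa's default hop length of 512 samples and its default 22050 Hz sampling rate. The beat frame values are made up, and the median-interval tempo is a simplification for illustration, not what `beat_track` does internally.

```python
from statistics import median

def frames_to_seconds(frames, sr=22050, hop_length=512):
    """Convert frame indices to timestamps in seconds."""
    return [f * hop_length / sr for f in frames]

def tempo_from_beats(beat_times):
    """Estimate beats per minute from the median gap between consecutive beats."""
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    return 60.0 / median(intervals)

# Hypothetical beat frames, roughly half a second apart.
beat_frames = [0, 21, 43, 64, 86]
beat_times = frames_to_seconds(beat_frames)
print([round(t, 3) for t in beat_times])
print(round(tempo_from_beats(beat_times), 1))
```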
I should mention, though, that this is still a very immature area of computer science and new papers come out all the time, so it is worth following the researchers in the field for recent results.
ADDITION:
Web API Wrappers mentioned in Gracenote's official docs: https://developer.gracenote.com/web-api#python
For Python:
https://github.com/cweichen/pygn
But as you can see, this wrapper is immature and not well documented. Because of that, I suggest using this Ruby wrapper instead of the Python one:
For Ruby:
https://github.com/JDiPierro/tmsapi
require 'tmsapi'
# Create an instance of the API
tms = TMSAPI::API.new :api_key => 'API_KEY_HERE'
# Get all movie showtimes for Austin, Texas
movie_showings = tms.movies.theatres.showings({ :zip => "78701" })
# Print out the movie name, theatre name, and date/time of the showing.
movie_showings.each do |movie|
  movie.showtimes.each do |showing|
    puts "#{movie.title} is playing at '#{showing.theatre.name}' at #{showing.date_time}."
  end
end
# 12 Years a Slave is playing at 'Violet Crown Cinema' at 2013-12-23T12:45.
# A Christmas Story is playing at 'Alamo Drafthouse at the Ritz' at 2013-12-23T16:00.
# American Hustle is playing at 'Violet Crown Cinema' at 2013-12-23T11:00.
# American Hustle is playing at 'Violet Crown Cinema' at 2013-12-23T13:40.
# American Hustle is playing at 'Violet Crown Cinema' at 2013-12-23T16:20.
# American Hustle is playing at 'Violet Crown Cinema' at 2013-12-23T19:00.
# American Hustle is playing at 'Violet Crown Cinema' at 2013-12-23T21:40.
If you are not comfortable with Ruby or Ruby on Rails, then the only option is to develop your own Python wrapper.
Going only by your headline question, and because there are no examples or tutorials for the GNSDK, try looking at other options.
For one:
dejavu
An audio fingerprinting and recognition algorithm implemented in Python. From its explanation:
Dejavu can memorize audio by listening to it once and fingerprinting it. Then by playing a song and recording microphone input, Dejavu attempts to match the audio against the fingerprints held in the database, returning the song being played.
https://github.com/worldveil/dejavu
It seems about right.
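To give a feel for the "match the audio against the fingerprints held in the database" step, here is a toy sketch of offset voting, the general idea behind landmark-style fingerprint matching. It is not Dejavu's actual code: the function names, the string hashes, and the integer offsets are all made up for illustration, and a real system would derive its hashes from spectrogram peaks.

```python
from collections import Counter, defaultdict

def build_index(songs):
    """Map each hash to the (song, offset) pairs where it occurs."""
    index = defaultdict(list)
    for name, hashes in songs.items():
        for offset, h in hashes:
            index[h].append((name, offset))
    return index

def best_match(index, sample):
    """Vote on (song, offset difference); a consistent alignment collects the most votes."""
    votes = Counter()
    for sample_offset, h in sample:
        for name, song_offset in index.get(h, []):
            votes[(name, song_offset - sample_offset)] += 1
    if not votes:
        return None
    (name, shift), count = votes.most_common(1)[0]
    return name, shift, count

# Toy fingerprints as (time offset, hash) pairs.
songs = {
    "song_a": [(0, "h1"), (1, "h2"), (2, "h3"), (3, "h4"), (4, "h5")],
    "song_b": [(0, "h3"), (1, "h9"), (2, "h1"), (3, "h7")],
}
index = build_index(songs)
sample = [(0, "h3"), (1, "h4"), (2, "h5")]  # a clip starting 2 steps into song_a
print(best_match(index, sample))
```

A single shared hash is not enough to win: song_b also contains "h3", but only song_a has all three hashes at the same relative shift.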
Source: https://stackoverflow.com/questions/38075577/how-to-recognize-a-music-sample-using-python-and-gracenote