Reading in pydub AudioSegment from url. BytesIO returning “OSError [Errno 2] No such file or directory” on heroku only; fine on localhost

Posted by 拈花ヽ惹草 on 2019-12-08 06:45:43

Question


EDIT 1, for anyone with the same error: installing ffmpeg did indeed solve the BytesIO error.

EDIT 1, for anyone still willing to help: my problem now is that when I call AudioSegment.export("filename.mp3", format="mp3"), the file is created but has a size of 0 bytes. Details are below (under "EDIT 1").


EDIT 2: All problems now solved.

  • Files can be read in as AudioSegment using BytesIO
  • I found buildpacks to ensure ffmpeg was installed correctly on my app, with lame support for exporting proper mp3 files

Answer below


Original question

I have pydub working nicely locally to crop a particular mp3 file based on parameters in the url. (?start_time=3.8&end_time=5.1)

When I run foreman start it all looks good on localhost. The html renders nicely. The key lines from the views.py include reading in a file from a url using

url = "https://s3.amazonaws.com/shareducate02/The_giving_tree__by_Alex_Blumberg__sponsored_by_mailchimp-short.mp3"
mp3 = urllib.urlopen(url).read() # inspired by http://nbviewer.ipython.org/github/ipython-books/cookbook-code/blob/master/notebooks/chapter11_image/06_speech.ipynb
original=AudioSegment.from_mp3(BytesIO(mp3))  # AudioSegment.from_mp3 is a pydub command, see http://pydub.com
section = original[start_time_ms:end_time_ms]
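For readers on Python 3, where urllib.urlopen no longer exists, the same read-into-memory step might look like the sketch below. The helper name fetch_bytes and the timeout value are hypothetical; the Request form with an empty User-Agent follows the notebook cited above. pydub still needs a working ffmpeg to decode the result, so this only covers the download half.

```python
from io import BytesIO
from urllib.request import Request, urlopen


def fetch_bytes(url, timeout=30):
    """Download url and return its body wrapped in an in-memory BytesIO.

    fetch_bytes is a hypothetical helper name, not part of pydub.
    """
    req = Request(url, headers={"User-Agent": ""})
    with urlopen(req, timeout=timeout) as response:
        return BytesIO(response.read())


# Usage (requires pydub plus a working ffmpeg):
#   from pydub import AudioSegment
#   original = AudioSegment.from_mp3(fetch_bytes(url))
#   section = original[start_time_ms:end_time_ms]
```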

That all works great... until I push to heroku (it's a django app) and run it online. When I load the same page on herokuapp.com, I get this error:

OSError at /path/to/page
[Errno 2] No such file or directory
Request Method: GET
Request URL:    http://my.website.com/path/to/page?start_time=3.8&end_time=5
Django Version: 1.6.5
Exception Type: OSError
Exception Value:    
[Errno 2] No such file or directory
Exception Location: /app/.heroku/python/lib/python2.7/subprocess.py in _execute_child, line 1327
Python Executable:  /app/.heroku/python/bin/python
Python Version: 2.7.8
Python Path:    
['/app',
 '/app/.heroku/python/bin',
 '/app/.heroku/python/lib/python2.7/site-packages/setuptools-5.4.1-py2.7.egg',
 '/app/.heroku/python/lib/python2.7/site-packages/distribute-0.6.36-py2.7.egg',
 '/app/.heroku/python/lib/python2.7/site-packages/pip-1.3.1-py2.7.egg',
 '/app',
 '/app/.heroku/python/lib/python27.zip',
 '/app/.heroku/python/lib/python2.7',
 '/app/.heroku/python/lib/python2.7/plat-linux2',
 '/app/.heroku/python/lib/python2.7/lib-tk',
 '/app/.heroku/python/lib/python2.7/lib-old',
 '/app/.heroku/python/lib/python2.7/lib-dynload',
 '/app/.heroku/python/lib/python2.7/site-packages',
 '/app/.heroku/python/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg-info']


Traceback:
File "/app/.heroku/python/lib/python2.7/site-packages/django/core/handlers/base.py" in get_response
  112.                     response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/app/evernote/views.py" in finalize
  105.       original=AudioSegment.from_mp3(BytesIO(mp3))
File "/app/.heroku/python/lib/python2.7/site-packages/pydub/audio_segment.py" in from_mp3
  318.         return cls.from_file(file, 'mp3')
File "/app/.heroku/python/lib/python2.7/site-packages/pydub/audio_segment.py" in from_file
  302.         retcode = subprocess.call(convertion_command, stderr=open(os.devnull))
File "/app/.heroku/python/lib/python2.7/subprocess.py" in call
  522.     return Popen(*popenargs, **kwargs).wait()
File "/app/.heroku/python/lib/python2.7/subprocess.py" in __init__
  710.                                 errread, errwrite)
File "/app/.heroku/python/lib/python2.7/subprocess.py" in _execute_child
  1327.                 raise child_exception

I commented out parts of the original view to convince myself that, sure enough, the single line original=AudioSegment.from_mp3(BytesIO(mp3)) is where the problem kicks in... but this is not a problem locally.
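The traceback ends inside subprocess._execute_child, which suggests the missing "file" is the program pydub tried to launch through subprocess.call, not the BytesIO data. A minimal sketch (Python 3, with a hypothetical helper name launch_errno) that reproduces the same Errno 2 for any executable that is not on PATH:

```python
import errno
import os
import subprocess


def launch_errno(command):
    """Try to run command; return None if it started, else the OSError errno."""
    try:
        with open(os.devnull, "w") as devnull:
            subprocess.call(command, stdout=devnull, stderr=devnull)
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    code = launch_errno(["ffmpeg", "-version"])
    if code is None:
        print("ffmpeg could be launched")
    elif code == errno.ENOENT:
        print("ffmpeg not found on PATH: the same Errno 2 that pydub surfaces")
    else:
        print("could not launch ffmpeg, errno", code)
```

Running this in the heroku shell is one way to check whether the dyno can launch ffmpeg at all, independently of pydub.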

The full function in views.py starts like this:

from django.shortcuts import render, get_object_or_404 
from django.http import HttpResponseRedirect #, Http404, HttpResponse
from django.core.urlresolvers import reverse
from django.views import generic
import pydub
# Maybe only need: 
from pydub import AudioSegment # == see below
from time import gmtime, strftime

import boto
from boto.s3.connection import S3Connection
from boto.s3.key import Key

# http://nbviewer.ipython.org/github/ipython-books/cookbook-code/blob/master/notebooks/chapter11_image/06_speech.ipynb
import urllib
from io import BytesIO
# import numpy as np
# import scipy.signal as sg
# import pydub # mentioned above already
# import matplotlib.pyplot as plt
# from IPython.display import Audio, display
# import matplotlib as mpl
# %matplotlib inline

import os
# from settings import AWS_ACCESS_KEY, AWS_SECRET_KEY, AWS_BUCKET_NAME
AWS_ACCESS_KEY = os.environ.get('AWS_ACCESS_KEY') # there must be a better way?
AWS_SECRET_KEY = os.environ.get('AWS_SECRET_KEY')
AWS_BUCKET_NAME = os.environ.get('S3_BUCKET_NAME')

# http://stackoverflow.com/questions/415511/how-to-get-current-time-in-python

boto_conn = S3Connection(AWS_ACCESS_KEY, AWS_SECRET_KEY)
bucket = boto_conn.get_bucket(AWS_BUCKET_NAME)
s3_url_format = 'https://s3.amazonaws.com/shareducate02/{end_path}'

and specifically the view in views.py that's called when I visit the page:

def finalize(request):

    start_time = request.GET.get('start_time')

    end_time = request.GET.get('end_time')

    original_file = "https://s3.amazonaws.com/shareducate02/The_giving_tree__by_Alex_Blumberg__sponsored_by_mailchimp-short.mp3"


    if start_time:

      # original=AudioSegment.from_mp3(original_file)  #...that didn't work 
      # but this works below:

      # next three uncommented lines from http://nbviewer.ipython.org/github/ipython-books/cookbook-code/blob/master/notebooks/chapter11_image/06_speech.ipynb
      # python 2.x
      url = original_file
      # req = urllib.Request(url, headers={'User-Agent': ''}) # Note: I commented this out because I got an error that "Request" did not exist (in Python 2 it lives in urllib2, not urllib)
      mp3 = urllib.urlopen(url).read()
      # That's for my 2.7

      # If I ever upgrade to python 3.x, would need to change it to:
      # req = urllib.request.Request(url, headers={'User-Agent': ''}) 
      # mp3 = urllib.request.urlopen(req).read()
      # as per instructions on http://nbviewer.ipython.org/github/ipython-books/cookbook-code/blob/master/notebooks/chapter11_image/06_speech.ipynb

      original=AudioSegment.from_mp3(BytesIO(mp3))
      # original=AudioSegment.from_mp3("static/givingtree.mp3") # alternative that works locally (on laptop) but no use for heroku

      start_time_ms = int(float(start_time) * 1000)
      if end_time:
        end_time_ms = int(float(end_time) * 1000)
      else:
        end_time_ms = int(float(original.duration_seconds) * 1000)
      duration_ms = end_time_ms - start_time_ms
      # duration = end_time - start_time
      duration = duration_ms / 1000.0  # float division; plain / truncates ints in Python 2

   #   section = original[start_time_ms:end_time_ms]
   #   section_with_fading = section.fade_in(100).fade_out(100)

      clip = "demo-"
      number = strftime("%Y-%m-%d_%H-%M-%S", gmtime())
      clip += number
      clip += ".mp3" 

      # DON'T BOTHER writing locally:
      # clip_with_path = "evernote/static/"+clip
      # section_with_fading.export(clip_with_path, format = "mp3")

   #   tempclip = section_with_fading.export(format = "mp3")

      # commented out while de-bugging, but was working earlier if run on localhost
      # c = boto.connect_s3()
      # b = c.get_bucket(AWS_BUCKET_NAME)  # as defined above
      # k = Key(b)
      # k.key=clip
      # # k.set_contents_from_filename(clip_with_path)
      # k.set_contents_from_file(tempclip)
      # k.set_acl('public-read')
      clip_made = True
    else: 
      duration = 0.0
      clip_made = False
      clip = ""
    context = {'original_file':original_file, 'new_file':clip, 'start_time': start_time, 'end_time':end_time, 'duration':duration, 'clip_made':clip_made} 
    return render(request, 'finalize.html' , context) 
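The time handling in the view can be pulled out into a small pure function, which makes it easier to check without Django or pydub. This is only a sketch: the name clip_bounds_ms is hypothetical, and the rounding and clamping to the audio length are additions of mine rather than something the original view does.

```python
def clip_bounds_ms(start_time, end_time, total_ms):
    """Return (start_ms, end_ms, duration_seconds) for a clip request.

    start_time and end_time are strings of seconds, as they arrive from
    request.GET (end_time may be None or empty); total_ms is the length
    of the source audio in milliseconds.
    """
    # round() avoids losing a millisecond to float truncation.
    start_ms = int(round(float(start_time) * 1000))
    if end_time:
        end_ms = int(round(float(end_time) * 1000))
    else:
        end_ms = total_ms
    # Clamp to the audio length and keep the range well-ordered.
    end_ms = max(0, min(end_ms, total_ms))
    start_ms = max(0, min(start_ms, end_ms))
    # Divide by a float so fractional seconds survive on Python 2 as well.
    return start_ms, end_ms, (end_ms - start_ms) / 1000.0
```

With the URL parameters from the question (?start_time=3.8&end_time=5.1) and a 10-second source, this gives a start of 3800 ms, an end of 5100 ms and a duration of 1.3 seconds.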

Any suggestions?

Potentially related: I have ffmpeg installed locally, but I have been unable to install it on heroku because I don't understand buildpacks. I tried just a moment ago (http://stackoverflow.com/questions/14407388/how-to-install-ffmpeg-for-a-django-app-on-heroku and https://github.com/shunjikonishi/heroku-buildpack-ffmpeg), but so far ffmpeg is not working on heroku (ffmpeg is not recognised when I run "heroku run ffmpeg --version"). Do you think this is the reason?

An answer like any of these would be much appreciated as I'm going round in circles here:

  1. "I think ffmpeg is indeed your problem. Try harder to sort that out, to get it installed on heroku"
  2. "Actually, I think this is why BytesIO is not working for you: ..."
  3. "Your approach is terrible anyway... if you want to read in an audio file to process using pydub, you should just do this instead: ..." (since I'm just hacking my way through pydub for my first time... my approach may be poor)

EDIT 1

ffmpeg is now installed (e.g., I can output wav files).

However, I still can't create usable mp3 files: the export produces a file, but its size is zero.

(venv-app)moriartymacbookair13:getstartapp macuser$ heroku config:add BUILDPACK_URL=https://github.com/ddollar/heroku-buildpack-multi.git 
Setting config vars and restarting awe01... done, v93
BUILDPACK_URL: https://github.com/ddollar/heroku-buildpack-multi.git
(venv-app)moriartymacbookair13:getstartapp macuser$ vim .buildpacks 
(venv-app)moriartymacbookair13:getstartapp macuser$ cat .buildpacks 
https://github.com/shunjikonishi/heroku-buildpack-ffmpeg.git
https://github.com/heroku/heroku-buildpack-python.git
(venv-app)moriartymacbookair13:getstartapp macuser$ git add --all
(venv-app)moriartymacbookair13:getstartapp macuser$ git commit -m "need multi, not just ffmpeg, so adding back in multi + shun + heroku, with trailing .git in .buildpacks file"
[master cd99fef] need multi, not just ffmpeg, so adding back in multi + shun + heroku, with trailing .git in .buildpacks file
 1 file changed, 2 insertions(+), 2 deletions(-)
(venv-app)moriartymacbookair13:getstartapp macuser$ git push heroku master
Fetching repository, done.
Counting objects: 5, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 372 bytes | 0 bytes/s, done.
Total 3 (delta 2), reused 0 (delta 0)

-----> Fetching custom git buildpack... done
-----> Multipack app detected
=====> Downloading Buildpack: https://github.com/shunjikonishi/heroku-buildpack-ffmpeg.git
=====> Detected Framework: ffmpeg
-----> Install ffmpeg
       DOWNLOAD_URL =  http://flect.github.io/heroku-binaries/libs/ffmpeg.tar.gz
       exporting PATH and LIBRARY_PATH
=====> Downloading Buildpack: https://github.com/heroku/heroku-buildpack-python.git
=====> Detected Framework: Python
-----> Installing dependencies with pip
       Cleaning up...

-----> Preparing static assets
       Collectstatic configuration error. To debug, run:
       $ heroku run python ./example/manage.py collectstatic --noinput

Using release configuration from last framework (Python).
-----> Discovering process types
       Procfile declares types -> web

-----> Compressing... done, 198.1MB
-----> Launching... done, v94
       http://[redacted].herokuapp.com/ deployed to Heroku

To git@heroku.com:awe01.git
   78d6b68..cd99fef  master -> master
(venv-app)moriartymacbookair13:getstartapp macuser$ heroku run ffmpeg
Running `ffmpeg` attached to terminal... up, run.6408
ffmpeg version git-2013-06-02-5711e4f Copyright (c) 2000-2013 the FFmpeg developers
  built on Jun  2 2013 07:38:40 with gcc 4.4.3 (Ubuntu 4.4.3-4ubuntu5.1)
  configuration: --enable-shared --disable-asm --prefix=/app/vendor/ffmpeg
  libavutil      52. 34.100 / 52. 34.100
  libavcodec     55. 13.100 / 55. 13.100
  libavformat    55.  8.102 / 55.  8.102
  libavdevice    55.  2.100 / 55.  2.100
  libavfilter     3. 74.101 /  3. 74.101
  libswscale      2.  3.100 /  2.  3.100
  libswresample   0. 17.102 /  0. 17.102
Hyper fast Audio and Video encoder
usage: ffmpeg [options] [[infile options] -i infile]... {[outfile options] outfile}...

Use -h to get full help or, even better, run 'man ffmpeg'
(venv-app)moriartymacbookair13:getstartapp macuser$ heroku run bash
Running `bash` attached to terminal... up, run.9660
~ $ python
Python 2.7.8 (default, Jul  9 2014, 20:47:08) 
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pydub
>>> from pydub import AudioSegment
>>> exit()
~ $ which ffmpeg
/app/vendor/ffmpeg/bin/ffmpeg
~ $ python 

Python 2.7.8 (default, Jul  9 2014, 20:47:08) 
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pydub
>>> from pydub import AudioSegment
>>> AudioSegment.silent(5000).export("/tmp/asdf.mp3", "mp3")
<open file '/tmp/asdf.mp3', mode 'wb+' at 0x7f9a37d44780>
>>> exit ()
~ $ cd /tmp/
/tmp $ ls
asdf.mp3
/tmp $ open asdf.mp3
bash: open: command not found
/tmp $ ls -lah
total 8.0K
drwx------  2 u36483 36483 4.0K 2014-10-22 04:14 .
drwxr-xr-x 14 root   root  4.0K 2014-09-26 07:08 ..
-rw-------  1 u36483 36483    0 2014-10-22 04:14 asdf.mp3

Note the file size of 0 above for the mp3 file. When I do the same thing on my macbook, the file size is never zero.
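In the session above, export returned a file object without raising, even though nothing was written. A small check after each export can catch that kind of silent failure. The sketch below uses only the standard library; export_succeeded is a hypothetical helper name.

```python
import os


def export_succeeded(path):
    """Return True only if path exists and contains at least one byte."""
    return os.path.isfile(path) and os.path.getsize(path) > 0


# Usage after a pydub export (requires pydub and ffmpeg):
#   AudioSegment.silent(5000).export("/tmp/asdf.mp3", format="mp3")
#   if not export_succeeded("/tmp/asdf.mp3"):
#       raise RuntimeError("export produced an empty file")
```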

Back to the heroku shell:

/tmp $ python
Python 2.7.8 (default, Jul  9 2014, 20:47:08) 
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pydub
>>> from pydub import AudioSegment
>>> pydub.AudioSegment.ffmpeg = "/app/vendor/ffmpeg/bin/ffmpeg" 
>>> AudioSegment.silence(1200).export("/tmp/herokuSilence.mp3", format="mp3")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'AudioSegment' has no attribute 'silence'
>>> AudioSegment.silent(1200).export("/tmp/herokuSilence.mp3", format="mp3")
<open file '/tmp/herokuSilence.mp3', mode 'wb+' at 0x7fcc2017c780>
>>> exit()
/tmp $ ls
asdf.mp3  herokuSilence.mp3
/tmp $ ls -lah
total 8.0K
drwx------  2 u36483 36483 4.0K 2014-10-22 04:29 .
drwxr-xr-x 14 root   root  4.0K 2014-09-26 07:08 ..
-rw-------  1 u36483 36483    0 2014-10-22 04:14 asdf.mp3
-rw-------  1 u36483 36483    0 2014-10-22 04:29 herokuSilence.mp3

The first time, I realised that I had forgotten the pydub.AudioSegment.ffmpeg = "/app/vendor/ffmpeg/bin/ffmpeg" command, but as you can see above, the file still has zero size even with it set.

Out of desperation, I even tried adding ".heroku" into the path to match your example verbatim, but that didn't fix it:

/tmp $ python
Python 2.7.8 (default, Jul  9 2014, 20:47:08) 
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pydub
>>> from pydub import AudioSegment
>>> pydub.AudioSegment.ffmpeg = "/app/.heroku/vendor/ffmpeg/bin/ffmpeg"
>>> AudioSegment.silent(1200).export("/tmp/herokuSilence03.mp3", format="mp3")
<open file '/tmp/herokuSilence03.mp3', mode 'wb+' at 0x7fc92aca7780>
>>> exit()
/tmp $ ls -lah
total 8.0K
drwx------  2 u36483 36483 4.0K 2014-10-22 04:31 .
drwxr-xr-x 14 root   root  4.0K 2014-09-26 07:08 ..
-rw-------  1 u36483 36483    0 2014-10-22 04:14 asdf.mp3
-rw-------  1 u36483 36483    0 2014-10-22 04:31 herokuSilence03.mp3
-rw-------  1 u36483 36483    0 2014-10-22 04:29 herokuSilence.mp3

Finally, I tried exporting a .wav file to check that pydub was at least working correctly:

/tmp $ python
Python 2.7.8 (default, Jul  9 2014, 20:47:08) 
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pydub
>>> from pydub import AudioSegment
>>> pydub.AudioSegment.ffmpeg = "/app/vendor/ffmpeg/bin/ffmpeg"
>>> AudioSegment.silent(1300).export("/tmp/heroku_wav_silence01.wav", format="wav")
<open file '/tmp/heroku_wav_silence01.wav', mode 'wb+' at 0x7fa33cbf3780>
>>> exit()
/tmp $ ls
asdf.mp3  herokuSilence03.mp3  herokuSilence.mp3  heroku_wav_silence01.wav
/tmp $ ls -lah
total 40K
drwx------  2 u36483 36483 4.0K 2014-10-22 04:42 .
drwxr-xr-x 14 root   root  4.0K 2014-09-26 07:08 ..
-rw-------  1 u36483 36483    0 2014-10-22 04:14 asdf.mp3
-rw-------  1 u36483 36483    0 2014-10-22 04:31 herokuSilence03.mp3
-rw-------  1 u36483 36483    0 2014-10-22 04:29 herokuSilence.mp3
-rw-------  1 u36483 36483  29K 2014-10-22 04:42 heroku_wav_silence01.wav
/tmp $ 

At least the .wav file size is non-zero, so pydub itself is working.

My current theory is that either I'm still not using ffmpeg correctly, or this ffmpeg build is insufficient: maybe I need an additional mp3 encoder installed on top of basic ffmpeg.

Several sites mention "libavcodec-extra-53" (https://github.com/jiaaro/pydub/issues/36), but I'm not sure how to install that on heroku, or how to check whether I already have it. Similarly, tutorials on libmp3lame seem to be geared towards laptop installation rather than heroku, so I'm at a loss (http://superuser.com/questions/196857/how-to-install-libmp3lame-for-ffmpeg).

In case relevant, I also have youtube-dl in my requirements.txt... this also works locally on my macbook, but fails when I run it in the heroku shell:

~/ytdl $ youtube-dl --restrict-filenames -x --audio-format mp3 n2anDgdUHic
[youtube] Setting language
[youtube] Confirming age
[youtube] n2anDgdUHic: Downloading webpage
[youtube] n2anDgdUHic: Downloading video info webpage
[youtube] n2anDgdUHic: Extracting video information
[download] Destination: Boyce_Avenue_feat._Megan_Nicole_-_Skyscraper_Patrick_Ebert_Edit-n2anDgdUHic.m4a
[download] 100% of 5.92MiB in 00:00
[ffmpeg] Destination: Boyce_Avenue_feat._Megan_Nicole_-_Skyscraper_Patrick_Ebert_Edit-n2anDgdUHic.mp3
ERROR: audio conversion failed: Unknown encoder 'libmp3lame'
~/ytdl $ 

The informative link is that it too reports an mp3 failure, so perhaps the two issues are related.
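One hint about a missing mp3 encoder is already in the ffmpeg banner printed earlier: its "configuration:" line lists the options the binary was built with, and ffmpeg's configure script uses --enable-libmp3lame to switch on LAME support. A rough sketch that looks for that flag in banner text (built_with is a hypothetical helper name; treat a missing flag as a strong hint about the build rather than a full diagnosis):

```python
def built_with(banner, flag="--enable-libmp3lame"):
    """Return True if ffmpeg's 'configuration:' line contains flag."""
    for line in banner.splitlines():
        line = line.strip()
        if line.startswith("configuration:"):
            return flag in line.split()
    return False


# The configuration line reported by `heroku run ffmpeg` in the question:
heroku_banner = (
    "  configuration: --enable-shared --disable-asm --prefix=/app/vendor/ffmpeg"
)
print(built_with(heroku_banner))  # False
```

That result is consistent with youtube-dl's "Unknown encoder 'libmp3lame'" message on the same dyno.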


EDIT 2

See answer, all problems solved


Answer 1:


All problems sorted, thanks.

I can now read in AudioSegments from a url using BytesIO, and I can now export either mp3 or wav after processing.

The ffmpeg issue was solved using the buildpacks recommended here: http://blog.pogoapp.com/youtube-mp3-with-node-js-and-ffmpeg/ (replacing "nodejs" with my language, "python"). The ffmpeg buildpack recommended there (https://github.com/jayzes/heroku-buildpack-ffmpeg) already includes the lame support I needed. For some reason, https://github.com/integricho/heroku-buildpack-python-ffmpeg didn't quite do the job for me.

I also had to add "ffprobe" to requirements.txt to allow youtube-dl to run properly (I mention that here since it was also previously complaining about lame missing; adding ffprobe was the second step to getting this to work).

A full writeup of my answer is here: https://github.com/rg3/youtube-dl/issues/302#issuecomment-60146845




Answer 2:


Pydub uses ffmpeg to encode/decode all formats other than wav, so the first issue is getting ffmpeg installed on heroku.

You may find that, using heroku run bash, you can cd around and find the ffmpeg binary (try in /app/.heroku/vendor).

If that is the case, you can explicitly tell pydub where to look, like so:

import pydub


pydub.AudioSegment.converter = "/app/.heroku/vendor/ffmpeg/bin/ffmpeg" # or wherever you find it
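Since the install location depends on the buildpack, one option is to probe a few candidate paths and only override the converter when one of them exists. This is a sketch, not something pydub provides: first_executable is a hypothetical helper, and the two candidate paths are simply the locations that appear in this thread.

```python
import os


def first_executable(candidates):
    """Return the first candidate path that is an executable file, else None."""
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


# Both locations appear in this thread; adjust for your own buildpack.
CANDIDATES = [
    "/app/vendor/ffmpeg/bin/ffmpeg",
    "/app/.heroku/vendor/ffmpeg/bin/ffmpeg",
]

ffmpeg_path = first_executable(CANDIDATES)
if ffmpeg_path is not None:
    # Only touch pydub when a binary was actually found.
    from pydub import AudioSegment
    AudioSegment.converter = ffmpeg_path
```

If neither path exists, pydub is left on its default lookup, so the same code can run unchanged on a laptop.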

Edit

I was able to get the following to work:

Create a requirements.txt in the root directory of an empty git repo.

requirements.txt:

pydub

Add this and push to heroku.

Then run:

heroku config:add BUILDPACK_URL=https://github.com/integricho/heroku-buildpack-python-ffmpeg.git

Then:

heroku run bash

And in the shell:

$ which ffmpeg
/app/.heroku/vendor/ffmpeg/bin/ffmpeg

$ python
>>> from pydub import AudioSegment
>>> AudioSegment.silent(5000).export("/tmp/asdf.mp3", "mp3")
<open file '/tmp/asdf.mp3', mode 'wb+' at 0x7ffa8aac0390>

After that, ffmpeg is found at /app/.heroku/vendor/ffmpeg/bin/ffmpeg



Source: https://stackoverflow.com/questions/26477786/reading-in-pydub-audiosegment-from-url-bytesio-returning-oserror-errno-2-no
