Split audio file into several files, each below a size threshold

Question

I have a FLAC file which I need to split into several distinct FLAC files, each of which must be below 100 MB in size. Are there any UNIX tools which can do this for me? Can I implement this logic myself?

Side-note: since FLAC is compressed, I figure that the easiest solution will require first converting the file to WAV.

Answer 1:

There are two parts to your question:

1) Convert the existing FLAC audio file to another format such as WAV.
2) Split the converted WAV file into chunks of a specific size.

Obviously, there is more than one way to do this. However, pydub provides convenient methods for both steps. Details can be found in the pydub documentation.

1) Convert the existing FLAC audio file to another format such as WAV

Using pydub, you can read the FLAC file and then export it as WAV, as shown below:

from pydub import AudioSegment

flac_audio = AudioSegment.from_file("sample.flac", "flac")
flac_audio.export("audio.wav", format="wav")

2) Split the converted WAV file into chunks of a specific size

Again, there are various ways to do this. The way I did it was to determine the total length and size of the converted WAV file, and from those work out the duration that corresponds to the desired chunk size.

The sample WAV file used was 101,612 KB in size and about 589 seconds long, a little over 9 minutes.

WAV file size by observation:

Stereo audio files at a frame rate of 44.1 kHz are approximately 10 MB per minute (assuming 16-bit samples); 48 kHz would be a little larger. The corresponding mono file would be about 5 MB per minute.

The approximation holds for our sample file, at about 10 MB per minute.

WAV file size by math:

The relation between WAV file size and duration is given by

wav_file_size_in_bytes = (sample rate (44100) * bit depth (16 bits per sample) * number of channels (2 for stereo) * number of seconds) / 8 (8 bits = 1 byte)

Source: http://manual.audacityteam.org/o/man/digital_audio.html
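As a quick check of the formula, here is a minimal sketch that computes the size of the uncompressed audio data. The function name wav_data_size_bytes is made up for illustration, and the result ignores the small WAV header.

```python
def wav_data_size_bytes(sample_rate, bit_depth, channels, seconds):
    """Size of uncompressed PCM audio data in bytes (WAV header not included)."""
    return (sample_rate * bit_depth * channels * seconds) // 8

# One minute of 16-bit stereo at 44.1 kHz: the "about 10 MB per minute" case
per_minute = wav_data_size_bytes(44100, 16, 2, 60)
print(per_minute)  # 10584000

# The 589-second sample file used in this answer
print(wav_data_size_bytes(44100, 16, 2, 589))  # 103899600
```

The second figure is close to the 101,612 KB reported for the sample file. The small difference comes from the duration being rounded down to whole seconds.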

The formula I used to calculate the chunk duration:

If a duration of X seconds gives a WAV file size of Y bytes,
then the duration K (in seconds) for a file size of 10 MB is

K = X * 10 MB / Y
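Plugging in the numbers from the sample file gives a feel for the result. This is only a sketch: chunk_duration_sec is a hypothetical helper name, and rounding down to whole seconds is my choice here so that a chunk does not exceed the target size.

```python
def chunk_duration_sec(duration_sec, file_size_bytes, target_bytes):
    """Longest whole number of seconds whose share of the file fits in target_bytes."""
    return (duration_sec * target_bytes) // file_size_bytes

# X = 589 s, Y = 103,899,600 bytes, target = 10,000,000 bytes
k = chunk_duration_sec(589, 103899600, 10000000)
print(k)  # 56
```

The exact value is about 56.7 seconds, so 56-second chunks stay just under 10 MB each.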

pydub.utils has a function, make_chunks, that splits audio into chunks of a specific duration (in milliseconds). We determine the duration for the desired size using the formula above.

We use that to create chunks of 10 MB (or close to 10 MB) and export each chunk separately. The last chunk may be smaller, depending on the total length.
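To illustrate why the last chunk can be shorter, the sketch below computes fixed-length chunk boundaries in plain Python. It mirrors what make_chunks is understood to do (slice the audio every chunk_ms milliseconds), but it is not pydub's code, and chunk_bounds_ms is a name invented for this example.

```python
import math

def chunk_bounds_ms(total_ms, chunk_ms):
    """Return (start, end) millisecond offsets for fixed-length chunks."""
    count = int(math.ceil(total_ms / float(chunk_ms)))
    return [(i * chunk_ms, min((i + 1) * chunk_ms, total_ms)) for i in range(count)]

# 589 s of audio cut into 56 s pieces
bounds = chunk_bounds_ms(589000, 56000)
print(len(bounds))   # 11
print(bounds[-1])    # (560000, 589000)
```

Ten full 56-second chunks are followed by one 29-second remainder, which matches the eleven files listed in the output of this answer.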

Here is the working code:

from __future__ import print_function  # print() then works on Python 2 and 3
from pydub import AudioSegment
from pydub.utils import make_chunks

# Decode the FLAC file and write it out as WAV
flac_audio = AudioSegment.from_file("sample.flac", "flac")
flac_audio.export("audio.wav", format="wav")
myaudio = AudioSegment.from_file("audio.wav", "wav")

channel_count = myaudio.channels        # number of channels
sample_width = myaudio.sample_width     # bytes per sample
duration_in_sec = len(myaudio) // 1000  # len() is in milliseconds
sample_rate = myaudio.frame_rate
bit_depth = sample_width * 8            # bits per sample

print("sample_width=", sample_width)
print("channel_count=", channel_count)
print("duration_in_sec=", duration_in_sec)
print("frame_rate=", sample_rate)

# Size of the raw audio data (the WAV header adds a few dozen bytes)
wav_file_size = (sample_rate * bit_depth * channel_count * duration_in_sec) // 8
print("wav_file_size = ", wav_file_size)

file_split_size = 10000000  # 10 MB, i.e. 10,000,000 bytes

# duration_in_sec (X) corresponds to wav_file_size (Y), so the duration K
# for file_split_size is K = X * file_split_size / Y.
# Round down so that each chunk stays below the size limit.
chunk_length_in_sec = (duration_in_sec * file_split_size) // wav_file_size
chunk_length_ms = chunk_length_in_sec * 1000
chunks = make_chunks(myaudio, chunk_length_ms)

# Export each chunk as its own WAV file
for i, chunk in enumerate(chunks):
    chunk_name = "chunk{0}.wav".format(i)
    print("exporting", chunk_name)
    chunk.export(chunk_name, format="wav")

Output:

Python 2.7.9 (default, Dec 10 2014, 12:24:55) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> ================================ RESTART ================================
>>> 
sample_width= 2
channel_count= 2
duration_in_sec= 589
frame_rate= 44100
wav_file_size =  103899600
exporting chunk0.wav
exporting chunk1.wav
exporting chunk2.wav
exporting chunk3.wav
exporting chunk4.wav
exporting chunk5.wav
exporting chunk6.wav
exporting chunk7.wav
exporting chunk8.wav
exporting chunk9.wav
exporting chunk10.wav
>>>
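If pydub is not available, the same idea can be sketched with Python's standard wave module. This is an alternative to the approach above, not part of the original answer: it assumes Python 3, an uncompressed PCM WAV input, and a 44-byte header on each output file. The names split_wav and WAV_HEADER_BYTES are invented for this example.

```python
import wave

WAV_HEADER_BYTES = 44  # assumed header size for a plain PCM WAV file

def split_wav(path, max_bytes, prefix="part"):
    """Split a PCM WAV file into files of at most max_bytes each; return their names."""
    names = []
    with wave.open(path, "rb") as src:
        # One frame holds one sample for every channel
        frame_bytes = src.getnchannels() * src.getsampwidth()
        frames_per_part = (max_bytes - WAV_HEADER_BYTES) // frame_bytes
        if frames_per_part <= 0:
            raise ValueError("max_bytes is too small to hold any audio")
        while True:
            frames = src.readframes(frames_per_part)
            if not frames:
                break  # end of input
            name = "{0}{1}.wav".format(prefix, len(names))
            with wave.open(name, "wb") as dst:
                dst.setnchannels(src.getnchannels())
                dst.setsampwidth(src.getsampwidth())
                dst.setframerate(src.getframerate())
                dst.writeframes(frames)
            names.append(name)
    return names
```

For example, split_wav("audio.wav", 10000000) would write part0.wav, part1.wav, and so on. Because the split is computed in bytes rather than in seconds, each file is filled right up to the limit.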

Source: https://stackoverflow.com/questions/36632511/split-audio-file-into-several-files-each-below-a-size-threshold

Tags

unix

audio

ffmpeg

flac