sox | 易学教程

Finding speed and tone of speech in an audio file using Python

Question: Given an audio file, I want to calculate the pace of the speech, i.e. how fast or slow it is. Currently I am doing the following: - convert the speech to text to obtain a transcript (using a free tool). - count the number of words in the transcript. - calculate the duration of the file. - finally, pace = (number of words in transcript / duration of file). However, the accuracy of the pace obtained depends entirely on the transcription, which I think is an unnecessary step. Is there any python-library
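The calculation described in the question can be sketched as follows. This is only an illustration, assuming the transcript is already available as a string and the duration is known in seconds; the function name `speech_pace` and the words-per-minute unit are choices made here, not part of the original question.

```python
def speech_pace(transcript: str, duration_seconds: float) -> float:
    """Return the speaking pace in words per minute.

    Words are counted by splitting the transcript on whitespace, which is
    a simplification (hypothetical helper, not from the original post).
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    word_count = len(transcript.split())
    # words / seconds, scaled to words per minute
    return word_count * 60.0 / duration_seconds
```

As the question notes, the result is only as good as the transcript that is fed in.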

How to change the sample rate during format conversion with the SoX C libraries?

Question: I am trying to convert between two audio file formats using the SoX libraries. I can convert one file to another without changing any parameters using the API provided by the library, just as this command does: sox a.wav b.ul Now my question is how to change the sample rate while converting the audio files. Please give me a hand! Thanks! Answer 1: The rate effect is used for resampling. See src/example3.c in the SoX git repository for an example of how to use it with

Frequency distribution from wav files

Question: I found out that I can use SoX's play file.wav stat -freq to generate a table of levels against frequencies for a file. However, it seems to run in real time, i.e. it takes as long to complete as the audio takes to play. How can I generate the same table of frequencies and levels in the shortest time possible? Answer 1: The output of SoX is rather slow, but this is mainly caused by the displaying. One solution could be to redirect the output of SoX (which is on the standard error (stderr) stream
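One way to act on the redirection idea is to capture stderr in a pipe instead of letting it print to the terminal, and to send the audio to SoX's null output (`-n`) rather than playing it. The sketch below assumes that `sox` is on the PATH and that `stat -freq` writes its frequency/level pairs as two numeric columns on stderr; the helper names `parse_freq_table` and `freq_table` are hypothetical.

```python
import subprocess


def parse_freq_table(stderr_text):
    """Keep only lines that consist of exactly two numbers.

    Assumption: each table row is "<frequency> <level>"; the summary
    lines that stat prints afterwards are skipped.
    """
    rows = []
    for line in stderr_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        try:
            rows.append((float(parts[0]), float(parts[1])))
        except ValueError:
            continue
    return rows


def freq_table(path):
    """Run sox with the null output and parse what it wrote to stderr."""
    cmd = ["sox", path, "-n", "stat", "-freq"]
    result = subprocess.run(cmd, stderr=subprocess.PIPE, text=True, check=True)
    return parse_freq_table(result.stderr)
```

Calling `freq_table("file.wav")` would then return a list of `(frequency, level)` tuples without any playback taking place.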

Mix .L and .R files into a stereo file using SOX in bulk

Question: I have a folder full of WAV files with separate L and R channels. I've been using SoX for things like changing the sample rate of the audio files inside a specific folder, using this code: for file in *.wav; do sox $file -r 44100 -b 24 converted/$(basename $file) -V; done For example, I have these two files that I want to merge: - CLOSE_1_02.L.wav - CLOSE_1_02.R.wav I would like to merge them into a stereo file (L in the left channel and R in the right channel) with the name: "CLOSE_1_02
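A possible way to pair the files in bulk is sketched below. It builds one `sox -M` (merge) command per `.L.wav`/`.R.wav` pair, which places the first input in the left channel and the second in the right. The output naming (`<stem>.stereo.wav` inside a `stereo` folder) and the function name are assumptions made here, since the desired name is cut off in the excerpt.

```python
import os


def stereo_merge_commands(filenames, out_dir="stereo"):
    """Return one sox merge command per matching L/R pair.

    filenames: iterable of file names such as "CLOSE_1_02.L.wav".
    The ".stereo.wav" output suffix is a placeholder, not the asker's name.
    """
    names = set(filenames)
    commands = []
    for left in sorted(names):
        if not left.endswith(".L.wav"):
            continue
        stem = left[: -len(".L.wav")]
        right = stem + ".R.wav"
        if right not in names:
            continue  # no matching right channel, skip this file
        output = os.path.join(out_dir, stem + ".stereo.wav")
        # -M merges the inputs into one multi-channel file
        commands.append(["sox", "-M", left, right, output])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the folder that holds the files, provided the output folder already exists.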

Converting a call center recording to something useful

Question: I have a call center recording (when played it sounds like gibberish) for which mediainfo shows the following: ion@aurora:~/Inbound$ mediainfo 48401-3405-48403--18042018170000.wav General Complete name : 48401-3405-48403--18042018170000.wav Format : Wave File size : 327 KiB Duration : 4mn 11s Overall bit rate : 10.7 Kbps Audio Format : G.723.1 Codec ID : A100 Duration : 4mn 11s Bit rate : 10.7 Kbps Channel(s) : 2 channels Sampling rate : 8 000 Hz Stream size : 327 KiB (100%) The ffmpeg info shows

Raspberry pi: generate and play tone from python code (with sox)

Question: I am building a simple GUI with a Raspberry Pi, Tkinter, and sox, using Python 3. I want to play a tone, generated on the fly, every time a button in the GUI is pressed. Here's the code: from Tkinter import Tk, Label, Button import os class MyFirstGUI: def __init__(self, master): self.master = master master.title("Random Tone Generator") self.label = Label(master, text="Press Generate and enjoy") self.label.pack() self.generate_button = Button(master, text="Generate", command=self.generate) self

Trying to Split Wav file into two pieces with SoX

Question: I'm trying to split one .wav file into two pieces where there are a few seconds of silence. Based on the documentation I've found, the following should work: sox testfile.wav tester.wav silence 1 0.50 0.1% 1 1.0 0.1% : newfile : restart "testfile.wav" is a voice recording and I put about 4 seconds of silence right in the middle of it to test. The expected result is that I would get "tester001.wav" and "tester002.wav" from running this. Instead I get one file - "tester.wav" which is the first

SoX running slow using a ProcessBuilder

Question: I am running SoX using a ProcessBuilder in Java to trim WAV files into 30-second-long WAV files. SoX is running, because I can get the first 30 seconds of the file successfully trimmed and saved as a new file, but it stops there; however, the process is still running. This is the code for the command generation: command.add (soxCommand); if (SoxWrapper.getMetadata (srcFile, MetadataField.SAMPLE_RATE) != 16000) { command.add ("-V3"); command.add ("-G"); command.add (FilenameUtils.normalize

Converting PCM 16bit LE to WAV

Question: I'm trying to write a program in C that converts a captured raw 16 kHz, 16-bit PCM file to a 16-bit WAV. I've read some posts where people recommended using libsox. I installed it and now I'm really struggling to understand the man page. So far (by reading the example in the source dist) I've figured out that the structs sox_format_t and sox_signalinfo_t can probably be used to describe the data I'm inputting. I also know how much data I'm processing (time), if that is somehow necessary? Some
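For comparison, the same conversion can be sketched without libsox, using Python's standard `wave` module instead of the C API the question asks about. The sketch assumes the raw file holds headerless, signed 16-bit little-endian samples at 16 kHz in a single channel (the channel count is not stated in the question) and that the script runs on a little-endian host; the function name is hypothetical.

```python
import wave


def raw_pcm_to_wav(raw_path, wav_path, sample_rate=16000, channels=1):
    """Wrap headerless 16-bit little-endian PCM in a WAV container.

    Assumption: mono input unless told otherwise. On a little-endian
    host the wave module writes the sample bytes unchanged.
    """
    with open(raw_path, "rb") as raw_file:
        pcm_bytes = raw_file.read()
    with wave.open(wav_path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        # The header sizes are filled in when the file is closed
        wav_file.writeframes(pcm_bytes)
```

Note that the duration does not have to be supplied: the number of frames follows from the byte count of the raw data.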

Sox batch process under Debian

Question: I want to resample a bunch of wav files that I have in a folder. My script is this: for f in *.wav; do sox “$f” -r 48000 “${f%%%.wav}.wav”; done The console gives me this error: "sox FAIL formats: can't open input file `“90.wav”': No such file or directory" and so on for the 300 files that are in that folder. How can I batch process these files correctly? Why is it giving me this error? Thanks a lot! Solution: for i in *wav; do echo $i; sox $i -r 48000 ${i%%.wav}r.wav; done Answer 1: Summary:
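The error message shows the typographic quotes (“ ”) arriving at sox as part of the file name, since a shell only treats straight quotes as quoting characters. One way to sidestep shell quoting altogether is to build the argument list in Python. The sketch below mirrors the naming used in the posted solution (an `r` appended before `.wav`); the function name and the default rate parameter are illustrative assumptions.

```python
from pathlib import Path


def resample_command(wav_path, rate=48000):
    """Build a sox command that resamples one file to `rate` Hz.

    The output is written next to the input as "<name>r.wav", so the
    input file is never overwritten.
    """
    src = Path(wav_path)
    dst = src.with_name(src.stem + "r.wav")
    # A list of arguments needs no quoting, even for names with spaces
    return ["sox", str(src), "-r", str(rate), str(dst)]
```

Looping over `sorted(Path(".").glob("*.wav"))` and passing each result to `subprocess.run(cmd, check=True)` would process the whole folder, assuming `sox` is installed and on the PATH.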