I\'m trying to do a voice quality test (pesq), but I don\'t understand how to start. I trying to compile a public source code (http://www.itu.int/itu-t/recommendations/index.asp
You will need a C compiler (The ITU PESQ reference implementation is actually C, so you don't need a C++ compiler, although both should work just fine)
For instance, on linux, you would enter the source
directory and compile with gcc
:
$ cd Software/P862_annex_A_2005_CD/source
$ gcc -o PESQ *.c
This will compile the files dsp.c, pesqdsp.c, pesqio.c, pesqmain.c, pesqmod.c
into a binary file PESQ
which you can then run with ./PESQ
:
$ ./PESQ
Perceptual Evaluation of Speech Quality (PESQ)
Reference implementation for ITU-T Recommendations P.862, P.862.1 and P.862.2.
Version 2.0 October 2005.
Usage:
PESQ HELP Displays this text
PESQ [options] ref deg
Run model on reference ref and degraded deg
Options: +8000 +16000 +swap +wb
Sample rate - No default. Must select either +8000 or +16000.
Swap byte order - machine native format by default. Select +swap for byteswap.
Default mode of operation is P.862 (narrowband handset listening). Select +wb
to use P.862.2 wideband extension (headphone listening).
File names may not begin with a + character.
Files with names ending .wav or .WAV are assumed to have a 44-byte header, which is automatically skipped. All other file types are assumed to have no header.
To run this binary and test your algorithm, you need the "reference" .wav file (This is the clean, original speech) and the "degraded" .wav file (This is the output of your algorithm). Simply pass both into PESQ
, and it will give you the output of the test. An example run on two .wav files included in the source distribution from the ITU:
$ cd Software/P862_annex_A_2005_CD/conform
$ ../source/PESQ +8000 or105.wav dg105.wav
Perceptual Evaluation of Speech Quality (PESQ)
Reference implementation for ITU-T Recommendations P.862, P.862.1 and P.862.2.
Version 2.0 October 2005.
Reading reference file or105.wav...done.
Reading degraded file dg105.wav...done.
Level normalization...
IRS filtering...
Variable delay compensation...
Acoustic model processing...
P.862 Prediction (Raw MOS, MOS-LQO): = 2.237 1.844
Where the +8000
parameter denotes that the wav files are sampled at 8000Hz.