问题
I'm trying to implement
AudioRecord (MIC) ->
PCM -> AAC Encoder
AAC -> PCM Decode
-> AudioTrack?? (SPEAKER)
with MediaCodec
on Android 4.1+ (API16).
Firstly, I successfully (but not sure correctly optimized) implemented PCM -> AAC Encoder
by MediaCodec
as intended as below
private boolean setEncoder(int rate)
{
encoder = MediaCodec.createEncoderByType("audio/mp4a-latm");
MediaFormat format = new MediaFormat();
format.setString(MediaFormat.KEY_MIME, "audio/mp4a-latm");
format.setInteger(MediaFormat.KEY_CHANNEL_COUNT, 1);
format.setInteger(MediaFormat.KEY_SAMPLE_RATE, 44100);
format.setInteger(MediaFormat.KEY_BIT_RATE, 64 * 1024);//AAC-HE 64kbps
format.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectHE);
encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
return true;
}
INPUT: PCM Bitrate = 44100(Hz) x 16(bit) x 1(Monoral) = 705600 bit/s
OUTPUT: AAC-HE Bitrate = 64 x 1024(bit) = 65536 bit/s
So, the data size is approximately compressed x11
,and I confirmed this working by observing a log
- AudioRecoder﹕ 4096 bytes read
- AudioEncoder﹕ 369 bytes encoded
the data size is approximately compressed x11
, so far so good.
Now, I have a UDP server to receive the encoded data, then decode it.
The decoder profile is set as follows:
private boolean setDecoder(int rate)
{
decoder = MediaCodec.createDecoderByType("audio/mp4a-latm");
MediaFormat format = new MediaFormat();
format.setString(MediaFormat.KEY_MIME, "audio/mp4a-latm");
format.setInteger(MediaFormat.KEY_CHANNEL_COUNT, 1);
format.setInteger(MediaFormat.KEY_SAMPLE_RATE, 44100);
format.setInteger(MediaFormat.KEY_BIT_RATE, 64 * 1024);//AAC-HE 64kbps
format.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectHE);
decoder.configure(format, null, null, 0);
return true;
}
Since UDPserver packet buffer size is 1024
- UDPserver ﹕ 1024 bytes received
and since this is the compressed AAC data, I would expect the decoding size will be
approximately 1024 x11
, however the actual result is
- AudioDecoder﹕ 8192 bytes decoded
It's approximately x8
, and I feel something wrong.
The decoder code is as follows:
IOudpPlayer = new Thread(new Runnable()
{
public void run()
{
SocketAddress sockAddress;
String address;
int len = 1024;
byte[] buffer2 = new byte[len];
DatagramPacket packet;
byte[] data;
ByteBuffer[] inputBuffers;
ByteBuffer[] outputBuffers;
ByteBuffer inputBuffer;
ByteBuffer outputBuffer;
MediaCodec.BufferInfo bufferInfo;
int inputBufferIndex;
int outputBufferIndex;
byte[] outData;
try
{
decoder.start();
isPlaying = true;
while (isPlaying)
{
try
{
packet = new DatagramPacket(buffer2, len);
ds.receive(packet);
sockAddress = packet.getSocketAddress();
address = sockAddress.toString();
Log.d("UDP Receiver"," received !!! from " + address);
data = new byte[packet.getLength()];
System.arraycopy(packet.getData(), packet.getOffset(), data, 0, packet.getLength());
Log.d("UDP Receiver", data.length + " bytes received");
//===========
inputBuffers = decoder.getInputBuffers();
outputBuffers = decoder.getOutputBuffers();
inputBufferIndex = decoder.dequeueInputBuffer(-1);
if (inputBufferIndex >= 0)
{
inputBuffer = inputBuffers[inputBufferIndex];
inputBuffer.clear();
inputBuffer.put(data);
decoder.queueInputBuffer(inputBufferIndex, 0, data.length, 0, 0);
}
bufferInfo = new MediaCodec.BufferInfo();
outputBufferIndex = decoder.dequeueOutputBuffer(bufferInfo, 0);
while (outputBufferIndex >= 0)
{
outputBuffer = outputBuffers[outputBufferIndex];
outputBuffer.position(bufferInfo.offset);
outputBuffer.limit(bufferInfo.offset + bufferInfo.size);
outData = new byte[bufferInfo.size];
outputBuffer.get(outData);
Log.d("AudioDecoder", outData.length + " bytes decoded");
decoder.releaseOutputBuffer(outputBufferIndex, false);
outputBufferIndex = decoder.dequeueOutputBuffer(bufferInfo, 0);
}
//===========
}
catch (IOException e)
{
}
}
decoder.stop();
}
catch (Exception e)
{
}
}
});
the full code:
https://gist.github.com/kenokabe/9029256
also need Permission:
<uses-permission android:name="android.permission.INTERNET"></uses-permission>
<uses-permission android:name="android.permission.RECORD_AUDIO"></uses-permission>
A member fadden who works for Google told me
Looks like I'm not setting position & limit on the output buffer.
I have read VP8 Encoding Nexus 5 returns empty/0-Frames , but not sure how to implement correctly.
UPDATE: I sort of understood where to modify for
Looks like I'm not setting position & limit on the output buffer.
, so add 2 lines within the while loop of Encoder and Decoder as follows:
outputBuffer.position(bufferInfo.offset);
outputBuffer.limit(bufferInfo.offset + bufferInfo.size);
https://gist.github.com/kenokabe/9029256/revisions
However the result is the same.
and now, I think, the errors:
W/SoftAAC2﹕ AAC decoder returned error 16388, substituting silence.
indicates this decoder fails completely from the first. It's again the data is not seekable
issue. Seeking in AAC streams on Android It's very disappointing if the AAC decoder cannot handle the streaming data in this way but only with adding some header.
UPDATE2: UDP receiver did wrong, so modified
https://gist.github.com/kenokabe/9029256
Now, the error
W/SoftAAC2﹕ AAC decoder returned error 16388, substituting silence.
disappeared!!
So, it indicates the decoder works without an error, at least,
however, this is the log of 1 cycle:
D/AudioRecoder﹕ 4096 bytes read
D/AudioEncoder﹕ 360 bytes encoded
D/UDP Receiver﹕ received !!! from /127.0.0.1:39000
D/UDP Receiver﹕ 360 bytes received
D/AudioDecoder﹕ 8192 bytes decoded
PCM(4096)->AACencoded(360)->UDP-AAC(360)->(supposed to be )PCM(8192)
The final result is about 2x size of the original PCM, something is still wrong.
So my Question here would be
Can you properly optimize my sample code to work correctly?
Is it a right way to use
AudioTrack
API to play the decoded PCM raw data on the fly, and can you show me the proper way to do that? A example code is appreciated.
Thank you.
PS. My project targets on Android4.1+(API16), I've read things are easier on API18(Andeoid 4.3+), but for obvious compatibility reasons, unfortunately, I have to skip MediaMuxer etc. here...
回答1:
After testing this is what I came up with from modifying your code:
package com.example.app;
import android.app.Activity;
import android.media.AudioManager;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.os.Bundle;
import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.AudioTrack;
import android.media.MediaCodec;
import android.media.MediaRecorder.AudioSource;
import android.util.Log;
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketAddress;
import java.net.SocketException;
import java.nio.ByteBuffer;
public class MainActivity extends Activity
{
private AudioRecord recorder;
private AudioTrack player;
private MediaCodec encoder;
private MediaCodec decoder;
private short audioFormat = AudioFormat.ENCODING_PCM_16BIT;
private short channelConfig = AudioFormat.CHANNEL_IN_MONO;
private int bufferSize;
private boolean isRecording;
private boolean isPlaying;
private Thread IOrecorder;
private Thread IOudpPlayer;
private DatagramSocket ds;
private final int localPort = 39000;
@Override
protected void onCreate(Bundle savedInstanceState)
{
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
IOrecorder = new Thread(new Runnable()
{
public void run()
{
int read;
byte[] buffer1 = new byte[bufferSize];
ByteBuffer[] inputBuffers;
ByteBuffer[] outputBuffers;
ByteBuffer inputBuffer;
ByteBuffer outputBuffer;
MediaCodec.BufferInfo bufferInfo;
int inputBufferIndex;
int outputBufferIndex;
byte[] outData;
DatagramPacket packet;
try
{
encoder.start();
recorder.startRecording();
isRecording = true;
while (isRecording)
{
read = recorder.read(buffer1, 0, bufferSize);
// Log.d("AudioRecoder", read + " bytes read");
//------------------------
inputBuffers = encoder.getInputBuffers();
outputBuffers = encoder.getOutputBuffers();
inputBufferIndex = encoder.dequeueInputBuffer(-1);
if (inputBufferIndex >= 0)
{
inputBuffer = inputBuffers[inputBufferIndex];
inputBuffer.clear();
inputBuffer.put(buffer1);
encoder.queueInputBuffer(inputBufferIndex, 0, buffer1.length, 0, 0);
}
bufferInfo = new MediaCodec.BufferInfo();
outputBufferIndex = encoder.dequeueOutputBuffer(bufferInfo, 0);
while (outputBufferIndex >= 0)
{
outputBuffer = outputBuffers[outputBufferIndex];
outputBuffer.position(bufferInfo.offset);
outputBuffer.limit(bufferInfo.offset + bufferInfo.size);
outData = new byte[bufferInfo.size];
outputBuffer.get(outData);
// Log.d("AudioEncoder ", outData.length + " bytes encoded");
//-------------
packet = new DatagramPacket(outData, outData.length,
InetAddress.getByName("127.0.0.1"), localPort);
ds.send(packet);
//------------
encoder.releaseOutputBuffer(outputBufferIndex, false);
outputBufferIndex = encoder.dequeueOutputBuffer(bufferInfo, 0);
}
// ----------------------;
}
encoder.stop();
recorder.stop();
}
catch (Exception e)
{
e.printStackTrace();
}
}
});
IOudpPlayer = new Thread(new Runnable()
{
public void run()
{
SocketAddress sockAddress;
String address;
int len = 2048
byte[] buffer2 = new byte[len];
DatagramPacket packet;
byte[] data;
ByteBuffer[] inputBuffers;
ByteBuffer[] outputBuffers;
ByteBuffer inputBuffer;
ByteBuffer outputBuffer;
MediaCodec.BufferInfo bufferInfo;
int inputBufferIndex;
int outputBufferIndex;
byte[] outData;
try
{
player.play();
decoder.start();
isPlaying = true;
while (isPlaying)
{
try
{
packet = new DatagramPacket(buffer2, len);
ds.receive(packet);
sockAddress = packet.getSocketAddress();
address = sockAddress.toString();
// Log.d("UDP Receiver"," received !!! from " + address);
data = new byte[packet.getLength()];
System.arraycopy(packet.getData(), packet.getOffset(), data, 0, packet.getLength());
// Log.d("UDP Receiver", data.length + " bytes received");
//===========
inputBuffers = decoder.getInputBuffers();
outputBuffers = decoder.getOutputBuffers();
inputBufferIndex = decoder.dequeueInputBuffer(-1);
if (inputBufferIndex >= 0)
{
inputBuffer = inputBuffers[inputBufferIndex];
inputBuffer.clear();
inputBuffer.put(data);
decoder.queueInputBuffer(inputBufferIndex, 0, data.length, 0, 0);
}
bufferInfo = new MediaCodec.BufferInfo();
outputBufferIndex = decoder.dequeueOutputBuffer(bufferInfo, 0);
while (outputBufferIndex >= 0)
{
outputBuffer = outputBuffers[outputBufferIndex];
outputBuffer.position(bufferInfo.offset);
outputBuffer.limit(bufferInfo.offset + bufferInfo.size);
outData = new byte[bufferInfo.size];
outputBuffer.get(outData);
// Log.d("AudioDecoder", outData.length + " bytes decoded");
player.write(outData, 0, outData.length);
decoder.releaseOutputBuffer(outputBufferIndex, false);
outputBufferIndex = decoder.dequeueOutputBuffer(bufferInfo, 0..
回答2:
Self Answer, here's my best effort so far
package com.example.app;
import android.app.Activity;
import android.media.AudioManager;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.os.Bundle;
import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.AudioTrack;
import android.media.MediaCodec;
import android.media.MediaRecorder.AudioSource;
import android.util.Log;
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketAddress;
import java.net.SocketException;
import java.nio.ByteBuffer;
public class MainActivity extends Activity
{
private AudioRecord recorder;
private AudioTrack player;
private MediaCodec encoder;
private MediaCodec decoder;
private short audioFormat = AudioFormat.ENCODING_PCM_16BIT;
private short channelConfig = AudioFormat.CHANNEL_IN_MONO;
private int bufferSize;
private boolean isRecording;
private boolean isPlaying;
private Thread IOrecorder;
private Thread IOudpPlayer;
private DatagramSocket ds;
private final int localPort = 39000;
@Override
protected void onCreate(Bundle savedInstanceState)
{
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
IOrecorder = new Thread(new Runnable()
{
public void run()
{
int read;
byte[] buffer1 = new byte[bufferSize];
ByteBuffer[] inputBuffers;
ByteBuffer[] outputBuffers;
ByteBuffer inputBuffer;
ByteBuffer outputBuffer;
MediaCodec.BufferInfo bufferInfo;
int inputBufferIndex;
int outputBufferIndex;
byte[] outData;
DatagramPacket packet;
try
{
encoder.start();
recorder.startRecording();
isRecording = true;
while (isRecording)
{
read = recorder.read(buffer1, 0, bufferSize);
// Log.d("AudioRecoder", read + " bytes read");
//------------------------
inputBuffers = encoder.getInputBuffers();
outputBuffers = encoder.getOutputBuffers();
inputBufferIndex = encoder.dequeueInputBuffer(-1);
if (inputBufferIndex >= 0)
{
inputBuffer = inputBuffers[inputBufferIndex];
inputBuffer.clear();
inputBuffer.put(buffer1);
encoder.queueInputBuffer(inputBufferIndex, 0, buffer1.length, 0, 0);
}
bufferInfo = new MediaCodec.BufferInfo();
outputBufferIndex = encoder.dequeueOutputBuffer(bufferInfo, 0);
while (outputBufferIndex >= 0)
{
outputBuffer = outputBuffers[outputBufferIndex];
outputBuffer.position(bufferInfo.offset);
outputBuffer.limit(bufferInfo.offset + bufferInfo.size);
outData = new byte[bufferInfo.size];
outputBuffer.get(outData);
// Log.d("AudioEncoder", outData.length + " bytes encoded");
//-------------
packet = new DatagramPacket(outData, outData.length,
InetAddress.getByName("127.0.0.1"), localPort);
ds.send(packet);
//------------
encoder.releaseOutputBuffer(outputBufferIndex, false);
outputBufferIndex = encoder.dequeueOutputBuffer(bufferInfo, 0);
}
// ----------------------;
}
encoder.stop();
recorder.stop();
}
catch (Exception e)
{
e.printStackTrace();
}
}
});
IOudpPlayer = new Thread(new Runnable()
{
public void run()
{
SocketAddress sockAddress;
String address;
int len = 1024;
byte[] buffer2 = new byte[len];
DatagramPacket packet;
byte[] data;
ByteBuffer[] inputBuffers;
ByteBuffer[] outputBuffers;
ByteBuffer inputBuffer;
ByteBuffer outputBuffer;
MediaCodec.BufferInfo bufferInfo;
int inputBufferIndex;
int outputBufferIndex;
byte[] outData;
try
{
player.play();
decoder.start();
isPlaying = true;
while (isPlaying)
{
try
{
packet = new DatagramPacket(buffer2, len);
ds.receive(packet);
sockAddress = packet.getSocketAddress();
address = sockAddress.toString();
// Log.d("UDP Receiver"," received !!! from " + address);
data = new byte[packet.getLength()];
System.arraycopy(packet.getData(), packet.getOffset(), data, 0, packet.getLength());
// Log.d("UDP Receiver", data.length + " bytes received");
//===========
inputBuffers = decoder.getInputBuffers();
outputBuffers = decoder.getOutputBuffers();
inputBufferIndex = decoder.dequeueInputBuffer(-1);
if (inputBufferIndex >= 0)
{
inputBuffer = inputBuffers[inputBufferIndex];
inputBuffer.clear();
inputBuffer.put(data);
decoder.queueInputBuffer(inputBufferIndex, 0, data.length, 0, 0);
}
bufferInfo = new MediaCodec.BufferInfo();
outputBufferIndex = decoder.dequeueOutputBuffer(bufferInfo, 0);
while (outputBufferIndex >= 0)
{
outputBuffer = outputBuffers[outputBufferIndex];
outputBuffer.position(bufferInfo.offset);
outputBuffer.limit(bufferInfo.offset + bufferInfo.size);
outData = new byte[bufferInfo.size];
outputBuffer.get(outData);
// Log.d("AudioDecoder", outData.length + " bytes decoded");
player.write(outData, 0, outData.length);
decoder.releaseOutputBuffer(outputBufferIndex, false);
outputBufferIndex = decoder.dequeueOutputBuffer(bufferInfo, 0);
}
//===========
}
catch (IOException e)
{
}
}
decoder.stop();
player.stop();
}
catch (Exception e)
{
}
}
});
//===========================================================
int rate = findAudioRecord();
if (rate != -1)
{
Log.v("=========media ", "ready: " + rate);
Log.v("=========media channel ", "ready: " + channelConfig);
boolean encoderReady = setEncoder(rate);
Log.v("=========encoder ", "ready: " + encoderReady);
if (encoderReady)
{
boolean decoderReady = setDecoder(rate);
Log.v("=========decoder ", "ready: " + decoderReady);
if (decoderReady)
{
Log.d("=======bufferSize========", "" + bufferSize);
try
{
setPlayer(rate);
ds = new DatagramSocket(localPort);
IOudpPlayer.start();
IOrecorder.start();
}
catch (SocketException e)
{
e.printStackTrace();
}
}
}
}
}
protected void onDestroy()
{
recorder.release();
player.release();
encoder.release();
decoder.release();
}
/*
protected void onResume()
{
// isRecording = true;
}
protected void onPause()
{
isRecording = false;
}
*/
private int findAudioRecord()
{
for (int rate : new int[]{44100})
{
try
{
Log.v("===========Attempting rate ", rate + "Hz, bits: " + audioFormat + ", channel: " + channelConfig);
bufferSize = AudioRecord.getMinBufferSize(rate, channelConfig, audioFormat);
if (bufferSize != AudioRecord.ERROR_BAD_VALUE)
{
// check if we can instantiate and have a success
recorder = new AudioRecord(AudioSource.MIC, rate, channelConfig, audioFormat, bufferSize);
if (recorder.getState() == AudioRecord.STATE_INITIALIZED)
{
Log.v("===========final rate ", rate + "Hz, bits: " + audioFormat + ", channel: " + channelConfig);
return rate;
}
}
}
catch (Exception e)
{
Log.v("error", "" + rate);
}
}
return -1;
}
private boolean setEncoder(int rate)
{
encoder = MediaCodec.createEncoderByType("audio/mp4a-latm");
MediaFormat format = new MediaFormat();
format.setString(MediaFormat.KEY_MIME, "audio/mp4a-latm");
format.setInteger(MediaFormat.KEY_CHANNEL_COUNT, 1);
format.setInteger(MediaFormat.KEY_SAMPLE_RATE, rate);
format.setInteger(MediaFormat.KEY_BIT_RATE, 64 * 1024);//AAC-HE 64kbps
format.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectHE);
encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
return true;
}
private boolean setDecoder(int rate)
{
decoder = MediaCodec.createDecoderByType("audio/mp4a-latm");
MediaFormat format = new MediaFormat();
format.setString(MediaFormat.KEY_MIME, "audio/mp4a-latm");
format.setInteger(MediaFormat.KEY_CHANNEL_COUNT, 1);
format.setInteger(MediaFormat.KEY_SAMPLE_RATE, rate);
format.setInteger(MediaFormat.KEY_BIT_RATE, 64 * 1024);//AAC-HE 64kbps
format.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectHE);
decoder.configure(format, null, null, 0);
return true;
}
private boolean setPlayer(int rate)
{
int bufferSizePlayer = AudioTrack.getMinBufferSize(rate, AudioFormat.CHANNEL_OUT_MONO, audioFormat);
Log.d("====buffer Size player ", String.valueOf(bufferSizePlayer));
player= new AudioTrack(AudioManager.STREAM_MUSIC, rate, AudioFormat.CHANNEL_OUT_MONO, audioFormat, bufferSizePlayer, AudioTrack.MODE_STREAM);
if (player.getState() == AudioTrack.STATE_INITIALIZED)
{
return true;
}
else
{
return false;
}
}
}
回答3:
I have tried the above code and it didn't work properly.I was getting lot of silence injected to the decoded output. Issue was not setting proper "csd" value to the decoder.
So if you see "silence" in log or Decoder throwing error make sure you have added the following to your media decoder format
int profile = 2; //AAC LC
int freqIdx = 11; //8KHz
int chanCfg = 1; //Mono
ByteBuffer csd = ByteBuffer.allocate(2);
csd.put(0, (byte) (profile << 3 | freqIdx >> 1));
csd.put(1, (byte)((freqIdx & 0x01) << 7 | chanCfg << 3));
mediaFormat.setByteBuffer("csd-0", csd);
回答4:
Your networking code is combining data. You got 369 bytes of compressed data, but on the receiving end you ended up with 1024 bytes. Those 1024 bytes consist of two whole and one partial frame. The two whole frames each decode to 4096 bytes again, for the total of 8192 bytes that you saw. The remaining partial frame will probably be decoded once you send sufficiently more data to the decoder, but you should generally send only whole frames to the decoder.
In addition, MediaCodec.dequeueOutputBuffer()
does not only return (positive) buffer indices, but also (negative) status codes. One of the possible codes is MediaCodec.INFO_OUTPUT_FORMAT_CHANGED
, which indicates that you need to call MediaCodec.getOutputFormat()
to get the format of the audio data. You might see the codec output stereo even if the input was mono. The code you posted simply breaks out of the loop when it receives one of these status codes.
回答5:
I have tested with your souce. there are some points.
Bit Rate is a natural number of K, but not computer K. 64k = 64000, but not 64 * 1024
It's not recommended to write a long code that shares some variables. A. separate Encoder Thread and Decoder Thread into 2 independet classes. B. The DatagramSocket is shared by Sender and Receiver, it not good.
Enumerate Audio Format need more values. i.e. sample rates should be picked from : 8000, 11025, 22050, 44100
回答6:
D/AudioRecoder﹕ 4096 bytes read D/AudioEncoder﹕ 360 bytes encoded D/UDP Receiver﹕ received !!! from /127.0.0.1:39000 D/UDP Receiver﹕ 360 bytes received D/AudioDecoder﹕ 8192 bytes decoded
This is because acc decoder always decode to stereo channels,even if the encoded data is MONO. so if your encoding side is set to stereo channels, it will be like:
D/AudioRecoder﹕ 8192 bytes read D/AudioEncoder﹕ 360 bytes encoded D/UDP Receiver﹕ received !!! from /127.0.0.1:39000 D/UDP Receiver﹕ 360 bytes received D/AudioDecoder﹕ 8192 bytes decoded
来源:https://stackoverflow.com/questions/21804390/pcm-aac-encoder-pcmdecoder-in-real-time-with-correct-optimization