So I\'m trying to blur an image as fast as possible(instant feel like), as the activity needs to be updated as I press the Blur button.
The problem I am having is that,
Gaussian blur is expensive to do accurately. A much faster approximation can be done by just iteratively averaging the pixels. It's still expensive to blur the image a lot but you can redraw between each iteration to at least give instant feedback and a nice animation of the image blurring.
static void blurfast(Bitmap bmp, int radius) {
int w = bmp.getWidth();
int h = bmp.getHeight();
int[] pix = new int[w * h];
bmp.getPixels(pix, 0, w, 0, 0, w, h);
for(int r = radius; r >= 1; r /= 2) {
for(int i = r; i < h - r; i++) {
for(int j = r; j < w - r; j++) {
int tl = pix[(i - r) * w + j - r];
int tr = pix[(i - r) * w + j + r];
int tc = pix[(i - r) * w + j];
int bl = pix[(i + r) * w + j - r];
int br = pix[(i + r) * w + j + r];
int bc = pix[(i + r) * w + j];
int cl = pix[i * w + j - r];
int cr = pix[i * w + j + r];
pix[(i * w) + j] = 0xFF000000 |
(((tl & 0xFF) + (tr & 0xFF) + (tc & 0xFF) + (bl & 0xFF) + (br & 0xFF) + (bc & 0xFF) + (cl & 0xFF) + (cr & 0xFF)) >> 3) & 0xFF |
(((tl & 0xFF00) + (tr & 0xFF00) + (tc & 0xFF00) + (bl & 0xFF00) + (br & 0xFF00) + (bc & 0xFF00) + (cl & 0xFF00) + (cr & 0xFF00)) >> 3) & 0xFF00 |
(((tl & 0xFF0000) + (tr & 0xFF0000) + (tc & 0xFF0000) + (bl & 0xFF0000) + (br & 0xFF0000) + (bc & 0xFF0000) + (cl & 0xFF0000) + (cr & 0xFF0000)) >> 3) & 0xFF0000;
}
}
}
bmp.setPixels(pix, 0, w, 0, 0, w, h);
}
Try to scale down the image 2, 4, 8, ... times and then scale it up again. That is fast. Otherwise implement it in renderscript.
If you want more than the scaling you can look at this code snippet in renderscript. It does the same kind of bluring as given in another answer. The same algorithm can be implemented in Java and is an optimization of the other answer. This code blurs one line. To blur a bitmap you should invoke this for all lines and then the same for all columns (you need to reimplement it to handle columns). To get a quick blur just do this once. If you want a better looking blur do it several times. I usually only do it twice.
The reason for doing one line is that I tried to parallelize the algorithm, that gave some improvement and is really simple in renderscript. I invoked the below code for all lines in parallel, and then the same for all columns.
int W = 8;
uchar4 *in;
uchar4 *out;
int N;
float invN;
uint32_t nx;
uint32_t ny;
void init_calc() {
N = 2*W+1;
invN = 1.0f/N;
nx = rsAllocationGetDimX(rsGetAllocation(in));
ny = rsAllocationGetDimY(rsGetAllocation(in));
}
void root(const ushort *v_in) {
float4 sum = 0;
uchar4 *head = in + *v_in * nx;
uchar4 *tail = head;
uchar4 *p = out + *v_in * nx;
uchar4 *hpw = head + W;
uchar4 *hpn = head + N;
uchar4 *hpx = head + nx;
uchar4 *hpxmw = head + nx - W - 1;
while (head < hpw) {
sum += rsUnpackColor8888(*head++);
}
while (head < hpn) {
sum += rsUnpackColor8888(*head++);
*p++ = rsPackColorTo8888(sum*invN);
}
while (head < hpx) {
sum += rsUnpackColor8888(*head++);
sum -= rsUnpackColor8888(*tail++);
*p++ = rsPackColorTo8888(sum*invN);
}
while (tail < hpxmw) {
sum -= rsUnpackColor8888(*tail++);
*p++ = rsPackColorTo8888(sum*invN);
}
}
Here is for the vertical bluring:
int W = 8;
uchar4 *in;
uchar4 *out;
int N;
float invN;
uint32_t nx;
uint32_t ny;
void init_calc() {
N = 2*W+1;
invN = 1.0f/N;
nx = rsAllocationGetDimX(rsGetAllocation(in));
ny = rsAllocationGetDimY(rsGetAllocation(in));
}
void root(const ushort *v_in) {
float4 sum = 0;
uchar4 *head = in + *v_in;
uchar4 *tail = head;
uchar4 *hpw = head + nx*W;
uchar4 *hpn = head + nx*N;
uchar4 *hpy = head + nx*ny;
uchar4 *hpymw = head + nx*(ny-W-1);
uchar4 *p = out + *v_in;
while (head < hpw) {
sum += rsUnpackColor8888(*head);
head += nx;
}
while (head < hpn) {
sum += rsUnpackColor8888(*head);
*p = rsPackColorTo8888(sum*invN);
head += nx;
p += nx;
}
while (head < hpy) {
sum += rsUnpackColor8888(*head);
sum -= rsUnpackColor8888(*tail);
*p = rsPackColorTo8888(sum*invN);
head += nx;
tail += nx;
p += nx;
}
while (tail < hpymw) {
sum -= rsUnpackColor8888(*tail);
*p = rsPackColorTo8888(sum*invN);
tail += nx;
p += nx;
}
}
And here is the Java code that calls into the rs code:
private RenderScript mRS;
private ScriptC_horzblur mHorizontalScript;
private ScriptC_vertblur mVerticalScript;
private ScriptC_blur mBlurScript;
private Allocation alloc1;
private Allocation alloc2;
private void hblur(int radius, Allocation index, Allocation in, Allocation out) {
mHorizontalScript.set_W(radius);
mHorizontalScript.bind_in(in);
mHorizontalScript.bind_out(out);
mHorizontalScript.invoke_init_calc();
mHorizontalScript.forEach_root(index);
}
private void vblur(int radius, Allocation index, Allocation in, Allocation out) {
mHorizontalScript.set_W(radius);
mVerticalScript.bind_in(in);
mVerticalScript.bind_out(out);
mVerticalScript.invoke_init_calc();
mVerticalScript.forEach_root(index);
}
Bitmap blur(Bitmap org, int radius) {
Bitmap out = Bitmap.createBitmap(org.getWidth(), org.getHeight(), org.getConfig());
blur(org, out, radius);
return out;
}
private Allocation createIndex(int size) {
Element element = Element.U16(mRS);
Allocation allocation = Allocation.createSized(mRS, element, size);
short[] rows = new short[size];
for (int i = 0; i < rows.length; i++) rows[i] = (short)i;
allocation.copyFrom(rows);
return allocation;
}
private void blur(Bitmap src, Bitmap dst, int r) {
Allocation alloc1 = Allocation.createFromBitmap(mRS, src);
Allocation alloc2 = Allocation.createTyped(mRS, alloc1.getType());
Allocation hIndexAllocation = createIndex(alloc1.getType().getY());
Allocation vIndexAllocation = createIndex(alloc1.getType().getX());
// Iteration 1
hblur(r, hIndexAllocation, alloc1, alloc2);
vblur(r, vIndexAllocation, alloc2, alloc1);
// Iteration 2
hblur(r, hIndexAllocation, alloc1, alloc2);
vblur(r, vIndexAllocation, alloc2, alloc1);
// Add more iterations if you like or simply make a loop
alloc1.copyTo(dst);
}