Optimizing data types to increase speed

Submitted by 淺唱寂寞╮ on 2019-12-14 02:38:55

Question


To write efficient code, you should use the simplest possible data types. This is even more true for RenderScript, where the same calculation is repeated many times in a kernel. I wanted to write a very simple kernel that takes a (color) bitmap as input and produces an int[] array as output:

#pragma version(1)
#pragma rs java_package_name(com.example.xxx)
#pragma rs_fp_relaxed

uint __attribute__((kernel)) grauInt(uchar4 in) {
    uint gr = (uint) (0.21*in.r + 0.72*in.g + 0.07*in.b);
    return gr;
}

Java side:

int[] data1 = new int[width*height];
ScriptC_gray graysc;
graysc=new ScriptC_gray(rs);
Type.Builder TypeOut = new Type.Builder(rs, Element.U32(rs));
TypeOut.setX(width).setY(height);
Allocation outAlloc = Allocation.createTyped(rs, TypeOut.create());

Allocation inAlloc = Allocation.createFromBitmap(rs, bmpfoto1, Allocation.MipmapControl.MIPMAP_NONE, Allocation.USAGE_SCRIPT);
graysc.forEach_grauInt(inAlloc,outAlloc);
outAlloc.copyTo(data1);

This takes 40 ms on my Samsung S5 (5.0) and 180 ms on my Samsung Tab2 (4.2) for a 600k pixel bitmap. Then I tried to optimize. Since the output is actually an 8-bit unsigned integer (0-255), I tried the following:

uchar __attribute__((kernel)) grauInt(uchar4 in) {
    uchar gr = 0.2125*in.r + 0.7154*in.g + 0.0721*in.b;
    return gr;
}

and in Java changed the 4th line to:

Type.Builder TypeOut = new Type.Builder(rs, Element.U8(rs));

However, this produces the error "32 bit integer source does not match allocation type UNSIGNED_8". My explanation is that the forEach_grauInt(inAlloc, outAlloc) call expects the same Element type on the input and output sides. So I tried to "disconnect" the input and output allocations and treat the input allocation (the bitmap) as a global variable bmpAllocIn, as follows:

#pragma version(1)
#pragma rs java_package_name(com.example.dani.oldgauss)
#pragma rs_fp_relaxed

rs_allocation bmpAllocIn;
int32_t width;
int32_t height;

uchar __attribute__((kernel)) grauInt(uint32_t x, uint32_t y) {
    uchar4 c = rsGetElementAt_uchar4(bmpAllocIn, x, y);
    // Parenthesize the sum so the cast applies to the whole expression, not only to 0.2125
    uchar gr = (uchar) (0.2125*c.r + 0.7154*c.g + 0.0721*c.b);
    return gr;
}

With Java side:

int[] data1 = new int[width*height];
ScriptC_gray graysc;
graysc=new ScriptC_gray(rs);

graysc.set_bmpAllocIn(Allocation.createFromBitmap(rs,bmpfoto1));
Type.Builder TypeOut = new Type.Builder(rs, Element.U8(rs));
TypeOut.setX(width).setY(height);
Allocation outAlloc = Allocation.createTyped(rs, TypeOut.create());

graysc.forEach_grauInt(outAlloc);
outAlloc.copyTo(data1);

The surprising thing is that I again get the same error message: "32 bit integer source does not match allocation type UNSIGNED_8". I cannot understand this. What am I doing wrong here?


Answer 1:


The reason is the

int[] data1 = new int[width * height];

line. You are passing the int[] array it creates as the target for copyTo(), but the output allocation's element type is U8, so copyTo() rejects the 32-bit array and throws the exception. Change it to

byte[] data1 = new byte[width * height];

and all will be OK. By the way, input and output allocations can be of different types.
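One thing to watch for after switching to byte[]: Java bytes are signed, so gray values of 128 and above read back as negative numbers. If you still need the result as an int[] in the 0-255 range, a minimal sketch of the conversion might look like the following. The class and method names are hypothetical and not part of the original code.

```java
// Hypothetical helper, not part of the original question or answer.
final class UnsignedBytes {
    // Widen each signed Java byte to its unsigned value (0..255).
    static int[] toUnsignedInts(byte[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            // Masking with 0xFF discards the sign extension done by byte-to-int promotion.
            dst[i] = src[i] & 0xFF;
        }
        return dst;
    }
}
```

You would call this on the byte[] filled by copyTo(), for example int[] gray = UnsignedBytes.toUnsignedInts(data1);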

As a side note, you could also eliminate floating-point computation from your RS filter entirely; this will improve performance on some architectures.
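To illustrate the integer-only idea, here is a rough sketch of the arithmetic in plain Java, so it can be checked off-device. It assumes the weights 0.2125, 0.7154 and 0.0721 from the question, scaled by 256. The class name, constants and rounding choice are my own assumptions, and the result can differ from the floating-point version by one gray level.

```java
// Hypothetical sketch of fixed-point luminance; names and rounding are assumptions.
final class GrayFixedPoint {
    // 0.2125, 0.7154, 0.0721 scaled by 256 and rounded to 54, 183, 18;
    // the blue weight is nudged up to 19 so the three sum to exactly 256.
    static final int WR = 54;
    static final int WG = 183;
    static final int WB = 19;

    // r, g, b are expected in 0..255; the result is also in 0..255.
    static int gray(int r, int g, int b) {
        // Weighted sum in integer arithmetic, then divide by 256 with a shift.
        return (WR * r + WG * g + WB * b) >> 8;
    }
}
```

Because the weights sum to 256, pure white maps to exactly 255 and the result never overflows a uchar. The same expression should carry over to the kernel body with the result cast to uchar, though I have not measured whether it is faster on any particular device.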



Source: https://stackoverflow.com/questions/32015660/optimizing-data-types-to-increase-speed
