Java Array Efficiency

Posted by 流过昼夜 on 2019-12-07 15:08:26

Question


I am not 100% sure of the mechanism in action so I decided to post here for further clarifications.

I am doing a project that should handle large amounts of data in Java (it has to be Java). I would like it to be as efficient as possible. By efficient I mean that memory use and speed come first and readability comes second.

Now I have two ways to store my data. Create one array of MyObject:

1) MyObject[][] V = new MyObject[m][n]

Or create two arrays of int:

2) int[][] V = new int[m][n]

3) int[][] P = new int[m][n]

Clearly MyObject contains at least two fields and some methods. Now I notice that while looping over the MyObject array to assign values I have to call new for each element, or else I get a NullPointerException. This means that the new in line 1 didn't suffice. Is this a more expensive operation than, for the sake of argument, P[i][j]=n, considering that arrays are also objects in Java?


Answer 1:


Is this a more expensive operation than, for the sake of argument, P[i][j]=n, considering that arrays are also objects in Java?

In the first case you create an array object that stores references to other objects of type MyObject. Both the array and the objects it refers to must be instantiated, so you need about m * n + 1 object instantiations (plus m row arrays, because a two-dimensional array in Java is an array of arrays) and roughly (m * n + 1) * objectSize of memory, plus one reference per element.
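To make the instantiation count concrete, here is a minimal sketch. The MyObject class below is a hypothetical stand-in with two int fields (the question does not show the real one), and the class and method names are chosen only for illustration.

```java
class ObjectArrayDemo {
    // Hypothetical stand-in for the question's MyObject: two int fields.
    static final class MyObject {
        int v;
        int p;

        MyObject(int v, int p) {
            this.v = v;
            this.p = p;
        }
    }

    // Allocates the outer array, the m row arrays, and one MyObject per cell.
    static MyObject[][] fill(int m, int n) {
        MyObject[][] grid = new MyObject[m][n]; // every cell starts as null
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                grid[i][j] = new MyObject(i, j); // one heap allocation per cell
            }
        }
        return grid;
    }

    public static void main(String[] args) {
        MyObject[][] empty = new MyObject[2][3];
        System.out.println(empty[0][0] == null); // prints "true"

        MyObject[][] grid = fill(2, 3);
        System.out.println(grid[1][2].v + "," + grid[1][2].p); // prints "1,2"
    }
}
```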

In the second case you only have to instantiate the array objects; int primitives are not objects, so this should be faster and also more memory efficient, since an object is several times larger than an int. Here you basically have 1 array instantiation (again plus the m row arrays) and (m * n) * intSize + objectSize memory consumption.
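A minimal sketch of the primitive layout, assuming the two grids V and P from the question; the class and method names are hypothetical. The cells are usable immediately after the arrays are created, because int elements default to 0 rather than null.

```java
class PrimitiveArrayDemo {
    // Builds the two parallel int grids; no per-cell allocation is needed.
    static int[][][] fill(int m, int n) {
        int[][] v = new int[m][n]; // cells start at 0, never null
        int[][] p = new int[m][n];
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                v[i][j] = i; // plain store into existing memory
                p[i][j] = j;
            }
        }
        return new int[][][] { v, p };
    }

    public static void main(String[] args) {
        int[][] fresh = new int[2][3];
        System.out.println(fresh[0][0]); // prints "0"

        int[][][] grids = fill(2, 3);
        System.out.println(grids[0][1][2] + "," + grids[1][1][2]); // prints "1,2"
    }
}
```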

Another reason for using primitives is that, when used as local variables, they are kept on the stack. You will probably use intermediate local variables inside a method before storing the computed value in the array, and allocating and releasing stack space for these variables is much cheaper than allocating an object, which lives on the heap.




Answer 2:


I've often found through profiling that replacing an array of objects with several arrays of scalars improves memory consumption and performance.

However, only profiling can tell whether or not it is a worthwhile optimization in your case.

A good profiler will let you measure both the performance and the memory footprint of your code.
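As a rough starting point before reaching for a profiler, the two layouts can be timed side by side. This is only a sketch: the Cell class and all method names are hypothetical, and a single un-warmed System.nanoTime measurement is noisy, so a proper harness such as JMH or a real profiler gives more trustworthy numbers.

```java
class LayoutTiming {
    // Hypothetical two-field element type standing in for MyObject.
    static final class Cell {
        final int v;
        final int p;

        Cell(int v, int p) {
            this.v = v;
            this.p = p;
        }
    }

    static Cell[][] buildObjects(int m, int n) {
        Cell[][] grid = new Cell[m][n];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < n; j++)
                grid[i][j] = new Cell(i, j);
        return grid;
    }

    // Returns { v, p } as two parallel int grids with the same contents.
    static int[][][] buildPrimitives(int m, int n) {
        int[][] v = new int[m][n];
        int[][] p = new int[m][n];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < n; j++) {
                v[i][j] = i;
                p[i][j] = j;
            }
        return new int[][][] { v, p };
    }

    static long sumObjects(Cell[][] grid) {
        long sum = 0;
        for (Cell[] row : grid)
            for (Cell c : row)
                sum += c.v + c.p;
        return sum;
    }

    static long sumPrimitives(int[][] v, int[][] p) {
        long sum = 0;
        for (int i = 0; i < v.length; i++)
            for (int j = 0; j < v[i].length; j++)
                sum += v[i][j] + p[i][j];
        return sum;
    }

    public static void main(String[] args) {
        int m = 1000, n = 1000; // arbitrary sizes for the sketch
        Cell[][] cells = buildObjects(m, n);
        int[][][] grids = buildPrimitives(m, n);

        long t0 = System.nanoTime();
        long a = sumObjects(cells);
        long t1 = System.nanoTime();
        long b = sumPrimitives(grids[0], grids[1]);
        long t2 = System.nanoTime();

        // Both sums must match; only the elapsed times differ.
        System.out.println("objects:    " + (t1 - t0) / 1_000 + " us, sum=" + a);
        System.out.println("primitives: " + (t2 - t1) / 1_000 + " us, sum=" + b);
    }
}
```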




Answer 3:


For fast processing of truly massive amounts of data it's better to lay the data out in a single contiguous block of memory, so that data you access together sit close to each other. This should minimize cache misses, which are one of today's worst performance killers.

In Java you achieve this by using a single one-dimensional array of primitives. If you use two arrays, or even a two-dimensional array (which in Java is an array of separately allocated row arrays), the data is no longer guaranteed to be in one contiguous block.
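One way to sketch this, assuming the two int values per cell from the question, is to interleave them in a single row-major int[] and compute the index by hand. The FlatGrid class and its method names are hypothetical.

```java
class FlatGrid {
    private final int cols;
    // Row-major, interleaved layout: [v(0,0), p(0,0), v(0,1), p(0,1), ...]
    private final int[] data;

    FlatGrid(int rows, int cols) {
        this.cols = cols;
        this.data = new int[2 * rows * cols]; // one allocation for everything
    }

    // Offset of the first of the two ints that belong to cell (i, j).
    private int base(int i, int j) {
        return 2 * (i * cols + j);
    }

    void set(int i, int j, int v, int p) {
        int b = base(i, j);
        data[b] = v;
        data[b + 1] = p;
    }

    int v(int i, int j) {
        return data[base(i, j)];
    }

    int p(int i, int j) {
        return data[base(i, j) + 1];
    }

    public static void main(String[] args) {
        FlatGrid g = new FlatGrid(2, 3);
        g.set(1, 2, 7, 9);
        System.out.println(g.v(1, 2) + "," + g.p(1, 2)); // prints "7,9"
    }
}
```

Interleaving keeps the two values of a cell next to each other, which suits code that reads both together. If the values are mostly processed separately, two flat arrays may be the better fit. Note that a single int[] is limited to about 2^31 elements.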

Another, slightly more involved solution is using an off-heap data structure, like here: http://mechanical-sympathy.blogspot.com/2012/10/compact-off-heap-structurestuples-in.html




Answer 4:


First of all, you should use a List or Set, i.e. the Collections in Java, instead of an array, because you may not know the size of the data you need to handle. Moreover, collections have API methods that let you perform operations such as inserting or deleting elements easily. Working with arrays is more complex and error prone because you may need to iterate over them again and again, and an array's size is fixed once it has been created, which is awkward if you have variable-size data.

Also, allocating memory at runtime (i.e. using the new keyword) is more expensive than just assigning a value to an already existing element, i.e. p[i][j]=v;
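If the amount of data really is unknown up front, one possible compromise is to collect it in a list first and pack it into a primitive array once the size is known. This is only a sketch with hypothetical names; keep in mind that an ArrayList<Integer> stores boxed Integer objects, so the list itself has the per-element object cost discussed in the other answers.

```java
import java.util.ArrayList;
import java.util.List;

class GrowThenPack {
    // Copies the boxed values into a compact int[] once the size is known.
    static int[] pack(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        List<Integer> incoming = new ArrayList<>(); // grows as data arrives
        for (int k = 0; k < 5; k++) {
            incoming.add(k * k);
        }
        int[] packed = pack(incoming);
        System.out.println(packed.length + " " + packed[4]); // prints "5 16"
    }
}
```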



Source: https://stackoverflow.com/questions/15585757/java-array-efficiency
