I had a generic question about JavaScript arrays. Are array indices in JavaScript internally handled as strings? I read somewhere that because arrays are objects in JavaScri
Formally, all property names are strings. That means that array-like numeric property names really aren't any different from any other property names.
If you check step 6 in the relevant part of the spec, you'll see that property accessor expressions are always coerced to strings before looking up the property. That process is followed (formally) regardless of whether the object is an array instance or another sort of object. (Again, it just has to seem like that's what's happening.)
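As a rough illustration of that coercion (a minimal sketch; the `key` object is a hypothetical helper introduced only for this example), a number, a string, and an object whose toString() returns "1" all end up addressing the same property:

```javascript
// Hypothetical key object: when used in a property accessor, it is
// converted to a string via its toString() method.
const key = { toString() { return "1"; } };
const arr = ["a", "b", "c"];

const viaNumber = arr[1];    // the number 1 is converted to the string "1"
const viaString = arr["1"];  // already a string
const viaObject = arr[key];  // the object is converted to "1" via toString()

console.log(viaNumber, viaString, viaObject); // b b b
console.log(Object.keys(arr));                // [ '0', '1', '2' ]
```

Object.keys reports the indexes as strings, which matches the formal model even if the engine stores them differently.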
Now, internally, the JavaScript runtime is free to implement array functionality any way it wants.
Edit: I had the idea of playing with Number.toString to demonstrate that a number-to-string conversion happens, but it turns out that the spec explicitly describes that specific type conversion as taking place via an internal process, and not by an implicit cast followed by a call to .toString() (which is probably a good thing for performance reasons).
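One way to observe this, as a sketch rather than a proof: temporarily wrap Number.prototype.toString with a counter (the names `toStringCalls` and `originalToString` are made up for this example) and index an array with a primitive number. The second lookup, with a Number wrapper object, is my own addition for contrast; an object key does go through toString().

```javascript
const originalToString = Number.prototype.toString;
let toStringCalls = 0;

// Temporarily wrap Number.prototype.toString to count invocations.
Number.prototype.toString = function (...args) {
  toStringCalls += 1;
  return originalToString.apply(this, args);
};

const arr = ["a", "b", "c"];

const viaPrimitive = arr[1];              // primitive number key
const callsAfterPrimitive = toStringCalls;

const viaWrapper = arr[new Number(1)];    // Number object key
const callsAfterWrapper = toStringCalls;

// Restore the original method before doing anything else.
Number.prototype.toString = originalToString;

console.log(viaPrimitive, callsAfterPrimitive); // b 0
console.log(viaWrapper, callsAfterWrapper);     // b 1
```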
In JavaScript there are two types of arrays: standard arrays and associative arrays (or an object with properties).
So ...
var arr = [ 0, 1, 2, 3 ];
... is defined as a standard array where indexes can only be integers. When you do arr["something"], since something (which is what you use as the index) is not an integer, you are basically defining a property on the arr object (arrays are objects in JavaScript). But you are not adding an element to the standard array.
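A small sketch of that difference, using the arr from above (the variable names holding the lengths are only for illustration):

```javascript
const arr = [0, 1, 2, 3];

arr["something"] = "extra";             // not an array index: a plain property
const lengthAfterProperty = arr.length;

arr["4"] = 4;                           // "4" is an array index: an element
const lengthAfterElement = arr.length;

console.log(lengthAfterProperty);       // 4
console.log(lengthAfterElement);        // 5
console.log(arr.something);             // extra
console.log(Object.keys(arr));          // [ '0', '1', '2', '3', '4', 'something' ]
```

The property is stored on the array object and is reachable by name, but it never counts toward length.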
Let's see:
[1]["0"] === 1 // true
Oh, but that's not conclusive, since the runtime could be coercing "0" to +"0" and +"0" === 0.
[1][false] === undefined // true
Now, +false === 0, so no, the runtime isn't coercing the value to a number.
var arr = [];
arr.false = "foobar";
arr[false] === "foobar" // true
So actually, the runtime is coercing the value to a string. So yep, it's a hash table lookup (externally).
Yes, technically array indexes are strings, but as Flanagan elegantly put it in his 'Definitive guide':
"It is helpful to clearly distinguish an array index from an object property name. All indexes are property names, but only property names that are integers between 0 and 2^32-1 are indexes."
Usually you should not care what the browser (or, more generally, the script host) does internally, as long as the outcome conforms to a predictable and (usually/hopefully) specified result. In fact, JavaScript (ECMA-262, the ECMAScript specification) is described only in terms of the conceptual steps that are needed. That intentionally leaves room for script hosts (and browsers) to come up with clever, smaller, and faster ways to implement the specified behavior.
In fact, modern browsers use a number of different algorithms for different types of arrays internally: it matters what they contain, how big they are, whether they are in order, whether they are fixed and optimizable at (JIT) compile time, and whether they are sparse or dense (yes, it often pays to do new Array(length_val) instead of a ninja []).
As a mental model (when learning JavaScript) it might help to know that arrays are just a special kind of object. But they do not always behave the way one might expect, for example:
var a=[];
a['4294967295']="I'm not the only one..";
a['4294967296']="Yes you are..";
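alert(a); // shows an empty string: neither key is a valid array index (the largest is 4294967294), so both become plain properties and a.length stays 0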
That said, it is easy, and fairly transparent to an unaware programmer, to have an array (with indexes) and also attach properties to the array object.
The best answer (I think) is from the spec (15.4) itself:
Array Objects
Array objects give special treatment to a certain class of property names. A property name P (in the form of a String value) is an array index if and only if ToString(ToUint32(P)) is equal to P and ToUint32(P) is not equal to 2^32−1. A property whose property name is an array index is also called an element. Every Array object has a length property whose value is always a nonnegative integer less than 2^32. The value of the length property is numerically greater than the name of every property whose name is an array index; whenever a property of an Array object is created or changed, other properties are adjusted as necessary to maintain this invariant. Specifically, whenever a property is added whose name is an array index, the length property is changed, if necessary, to be one more than the numeric value of that array index; and whenever the length property is changed, every property whose name is an array index whose value is not smaller than the new length is automatically deleted. This constraint applies only to own properties of an Array object and is unaffected by length or array index properties that may be inherited from its prototypes.
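The array-index test in that definition can be approximated in plain JavaScript. This is only a sketch: `isArrayIndex` is a hypothetical helper name, and `>>> 0` is used as a stand-in for the spec's ToUint32.

```javascript
// Hypothetical helper: true when the string p is an array index
// according to the definition quoted above.
function isArrayIndex(p) {
  const asUint32 = p >>> 0;            // approximates ToUint32(P)
  return String(asUint32) === p        // ToString(ToUint32(P)) is equal to P
    && asUint32 !== 2 ** 32 - 1;       // and ToUint32(P) is not 2^32 - 1
}

console.log(isArrayIndex("0"));          // true
console.log(isArrayIndex("4294967294")); // true  (2^32 - 2, the largest index)
console.log(isArrayIndex("4294967295")); // false (2^32 - 1 is excluded)
console.log(isArrayIndex("4294967296")); // false (wraps around to 0)
console.log(isArrayIndex("01"));         // false (not the canonical form)
console.log(isArrayIndex("prop"));       // false
```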
An object, O, is said to be sparse if the following algorithm returns true:
1. Let len be the result of calling the [[Get]] internal method of O with argument "length".
2. For each integer i in the range 0≤i<ToUint32(len)
   a. Let elem be the result of calling the [[GetOwnProperty]] internal method of O with argument ToString(i).
   b. If elem is undefined, return true.
3. Return false.
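The sparseness check quoted above could be sketched like this. `isSparse` is a hypothetical helper, hasOwnProperty stands in for the internal [[GetOwnProperty]] call, and `>>> 0` approximates ToUint32:

```javascript
// Hypothetical helper following the quoted algorithm step by step.
function isSparse(o) {
  const len = o.length >>> 0;                    // read "length", as uint32
  for (let i = 0; i < len; i++) {
    // A missing own property at any index below length means "sparse".
    if (!Object.prototype.hasOwnProperty.call(o, String(i))) {
      return true;
    }
  }
  return false;
}

console.log(isSparse([1, 2, 3]));     // false
console.log(isSparse([1, , 3]));      // true  (hole at index 1)
console.log(isSparse(new Array(3)));  // true  (length 3, no elements)
console.log(isSparse([undefined]));   // false (element exists, value is undefined)
```

Note that the loop visits every index below length, so this is only practical for small arrays.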
Effectively, the ECMAScript spec (ECMA-262) just guarantees the JavaScript programmer unambiguous array references, regardless of whether you get/set arr['42'] or arr[42], up to the unsigned 32-bit limit.
The main difference lies in, for example, the (auto-updating of) array.length, array.push, and other array sugar like array.concat, etc.
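A short sketch of that difference, comparing an array with a plain object that receives the same key (variable names are illustrative only):

```javascript
const arr = [];
const obj = {};

arr["42"] = "x";   // addresses the same element as arr[42]
obj["42"] = "x";   // just a property named "42"

const sameElement = arr[42] === arr["42"];
const arrLength = arr.length;               // auto-updated to 43
const objLength = obj.length;               // objects have no automatic length
const objKeyCount = Object.keys(obj).length;

arr.length = 0;    // shrinking length deletes the elements at or above it
const afterTruncate = arr[42];

console.log(sameElement);    // true
console.log(arrLength);      // 43
console.log(objLength);      // undefined
console.log(objKeyCount);    // 1
console.log(afterTruncate);  // undefined
```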
While, yes, JavaScript also lets one loop over the properties one has set on an object, we cannot read how many we have set (without a loop). And yes, to the best of my knowledge, modern browsers (especially Chrome, with what they call, but don't exactly specify, 'small integers') are wicked fast with true (pre-initialized) small-int arrays.
Also see for example this related question.
Edit: as per @Felix Kling's test (from his comment above):
After arr[4294967294] = 42;, arr.length correctly shows 4294967295. However, calling arr.push(21); throws a RangeError: Invalid array length. arr[arr.length] = 21 works, but doesn't change length.
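For reference, the test described above can be written out roughly as follows (a sketch; the try/catch and the variable names are my additions, and the exact error message may differ between engines):

```javascript
const arr = [];
arr[4294967294] = 42;              // 2^32 - 2, the largest valid array index
const lengthAtLimit = arr.length;

let pushError = null;
try {
  arr.push(21);                    // would require length 2^32, which is not allowed
} catch (err) {
  pushError = err;
}

arr[arr.length] = 21;              // key "4294967295" is a plain property, not an element
const lengthAfterAssign = arr.length;

console.log(lengthAtLimit);                    // 4294967295
console.log(pushError instanceof RangeError);  // true
console.log(arr[4294967295]);                  // 21
console.log(lengthAfterAssign);                // 4294967295
```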
The explanation for this (predictable and intended) behavior should be clear after this answer.
Edit2:
Now, someone gave the comment:
for (var i in a) console.log(typeof i) shows 'string' for all indexes.
Since for in is the (unordered, I must add) property iterator in JavaScript, it is kind of obvious that it returns a string (I'd be pretty darned if it didn't).
From MDN:
for..in should not be used to iterate over an Array where index order is important.
Array indexes are just enumerable properties with integer names and are otherwise identical to general Object properties. There is no guarantee that for...in will return the indexes in any particular order and it will return all enumerable properties, including those with non–integer names and those that are inherited.
Because the order of iteration is implementation dependent, iterating over an array may not visit elements in a consistent order. Therefore it is better to use a for loop with a numeric index (or Array.forEach or the for...of loop) when iterating over arrays where the order of access is important.
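To make the contrast concrete, here is a small sketch (the array and its extra property are made up for the example) comparing for...in with for...of and a numeric loop:

```javascript
const a = ["x", "y", "z"];
a.extra = "not an element";   // an ordinary property on the array object

// for...in visits enumerable property names, always as strings.
const forInKeys = [];
for (const key in a) forInKeys.push(key);

// for...of and the numeric loop visit elements only, in index order.
const forOfValues = [];
for (const value of a) forOfValues.push(value);

const indexedValues = [];
for (let i = 0; i < a.length; i++) indexedValues.push(a[i]);

console.log(forInKeys.every((k) => typeof k === "string")); // true
console.log(forInKeys.includes("extra"));                   // true
console.log(forOfValues);                                   // [ 'x', 'y', 'z' ]
console.log(indexedValues);                                 // [ 'x', 'y', 'z' ]
```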
So, what have we learned? If order is important to us (it often is with arrays), then we NEED this quirky array in JavaScript, and having a 'length' is rather useful for looping in numerical order.
Now think of the alternative: give your objects an id/order, but then you'd need to loop over your objects again for every next id/order (property).
Edit 3:
Someone answered along the lines of:
var a = ['a','b','c'];
a['4'] = 'e';
a[3] = 'd';
alert(a); // returns a,b,c,d,e
Now using the explanation in my answer: what happened is that '4' is coercible to the integer 4, and that is in the range [0, 4294967294], making it a valid array index, also called an element. Since var a is an array ([]), the array element 4 gets added as an array element, not as a property (which is what would have happened if var a were an object ({})).
An example to further outline the difference between array and object:
var a = ['a','b','c'];
a['prop']='d';
alert(a);
see how it returns a,b,c
with no 'd' to be seen.
Edit 4:
You commented: "In that case, an integer index should be handled as a string, as it is a property of the array, which is a special type of JS object."
That is wrong in terms of terminology because (strings representing) integer indexes (in the range [0, 4294967294]) create array indexes or elements, not properties.
It's better to say: both an actual integer and a string representing an integer (both in the range [0, 4294967294]) are valid array indexes (and should conceptually be regarded as integers) and create/change array elements (the only 'things'/values that get returned when you do arr.join() or arr.concat(), for example).
Everything else creates/changes a property (and should conceptually be regarded as string).
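A quick sketch of that distinction (the JSON.stringify line is my own addition; it shows the same behavior as join and concat):

```javascript
const a = ["a", "b", "c"];
a["3"] = "d";          // string representing an integer: an element
a.prop = "ignored";    // anything else: a property

const joined = a.join();          // elements only
const copied = a.concat();        // a new array built from the elements only
const json = JSON.stringify(a);   // also serializes elements only

console.log(joined);              // a,b,c,d
console.log(copied.length);       // 4
console.log("prop" in copied);    // false
console.log(json);                // ["a","b","c","d"]
console.log(a.prop);              // ignored
```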
What the browser really does usually shouldn't interest you, noting that the simpler and more clearly specified your code, the better the chance the browser has to recognize: 'oh, let's optimize this to an actual array under the hood'.
That is correct, so:
> var a = ['a','b','c']
undefined
> a
[ 'a', 'b', 'c' ]
> a[0]
'a'
> a['0']
'a'
> a['4'] = 'e'
'e'
> a[3] = 'd'
'd'
> a
[ 'a', 'b', 'c', 'd', 'e' ]