I have Googled this and got patchy / contradictory opinions - is there actually any difference between doing a map and doing a collect on an array?
Ruby aliases the method Array#map to Array#collect; they can be used interchangeably. (Ruby Monk)
In other words, same source code:
static VALUE
rb_ary_collect(VALUE ary)
{
    long i;
    VALUE collect;

    RETURN_SIZED_ENUMERATOR(ary, 0, 0, ary_enum_length);
    collect = rb_ary_new2(RARRAY_LEN(ary));
    for (i = 0; i < RARRAY_LEN(ary); i++) {
        rb_ary_push(collect, rb_yield(RARRAY_AREF(ary, i)));
    }
    return collect;
}
http://ruby-doc.org/core-2.2.0/Array.html#method-i-map
I've been told they are the same.
Actually they are documented in the same place under ruby-doc.org:
http://www.ruby-doc.org/core/classes/Array.html#M000249
- ary.collect {|item| block } → new_ary
- ary.map {|item| block } → new_ary
- ary.collect → an_enumerator
- ary.map → an_enumerator
Invokes block once for each element of self. Creates a new array containing the values returned by the block. See also Enumerable#collect.
If no block is given, an enumerator is returned instead.
a = [ "a", "b", "c", "d" ]
a.collect {|x| x + "!" } #=> ["a!", "b!", "c!", "d!"]
a #=> ["a", "b", "c", "d"]
The collect and collect! methods are aliases to map and map!, so they can be used interchangeably. Here is an easy way to confirm that:
Array.instance_method(:map) == Array.instance_method(:collect)
=> true
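The equivalence can also be observed from the results themselves, including the in-place variants. The following is a minimal sketch with illustrative variable names (they are not from the answer above), comparing what each pair of methods produces:

```ruby
words = %w[a b c d]

# Non-destructive forms return a new array and leave the receiver alone.
mapped    = words.map     { |w| w + "!" }
collected = words.collect { |w| w + "!" }
puts mapped == collected   # both build the same new array
puts words.inspect         # the receiver is unchanged

# Destructive forms replace the elements of the receiver in place.
via_map     = %w[a b c d]
via_collect = %w[a b c d]
via_map.map!         { |w| w.upcase }
via_collect.collect! { |w| w.upcase }
puts via_map == via_collect
```

Both comparisons should print true, since each pair runs the same code under a different name.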
I ran a benchmark to try to answer this question, then found this post, so here are my findings (which differ slightly from the other answers).
Here is the benchmark code:
require 'benchmark'

h = { abc: 'hello', 'another_key' => 123, 4567 => 'third' }
# Note: 1..10 is a Range, not an Array, so the "array" reports below
# go through Enumerable#collect / Enumerable#map.
a = 1..10
many = 500_000

Benchmark.bm do |b|
  GC.start
  b.report("hash keys collect") do
    many.times do
      h.keys.collect(&:to_s)
    end
  end

  GC.start
  b.report("hash keys map") do
    many.times do
      h.keys.map(&:to_s)
    end
  end

  GC.start
  b.report("array collect") do
    many.times do
      a.collect(&:to_s)
    end
  end

  GC.start
  b.report("array map") do
    many.times do
      a.map(&:to_s)
    end
  end
end
And the results I got were:
                        user     system      total        real
hash keys collect   0.540000   0.000000   0.540000 (  0.570994)
hash keys map       0.500000   0.010000   0.510000 (  0.517126)
array collect       1.670000   0.020000   1.690000 (  1.731233)
array map           1.680000   0.020000   1.700000 (  1.744398)
Perhaps an alias isn't free?
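Differences of a few hundredths of a second could also be run-to-run noise, and the benchmark above assigns a Range (1..10) to a rather than an Array. One way to check is sketched below; it assumes a real Array, repeats each measurement several times, and alternates the two methods. The round and iteration counts are arbitrary choices, not values from the benchmark above.

```ruby
require 'benchmark'

array  = (1..10).to_a   # a real Array, so Array#map / Array#collect are what run
rounds = 5              # arbitrary number of repeated measurements
many   = 20_000         # arbitrary iteration count per measurement

timings = { map: [], collect: [] }

rounds.times do
  # Alternate the two methods within each round so that warm-up and GC
  # effects are spread across both rather than favouring one of them.
  timings[:collect] << Benchmark.realtime { many.times { array.collect(&:to_s) } }
  timings[:map]     << Benchmark.realtime { many.times { array.map(&:to_s) } }
end

# Report the spread of each method's samples.
timings.each do |name, samples|
  puts format("%-8s min %.4fs  max %.4fs", name, samples.min, samples.max)
end
```

If the min-to-max spread within one method is as large as the gap between the two methods, the gap is probably not attributable to the alias.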
There's no difference; in fact, map is implemented in C as rb_ary_collect and enum_collect (i.e. there is a difference between map on an array and on any other enum, but no difference between map and collect).
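The split between the two implementations can be seen from Ruby itself by asking where each method is defined. This is a small sketch rather than anything from the Ruby source; Range is used here only as an example of a class that relies on Enumerable.

```ruby
# Array carries its own definitions of map and collect...
array_map_owner     = Array.instance_method(:map).owner
array_collect_owner = Array.instance_method(:collect).owner

# ...while Range has none of its own and inherits both from Enumerable.
range_map_owner     = Range.instance_method(:map).owner
range_collect_owner = Range.instance_method(:collect).owner

puts array_map_owner      # Array
puts array_collect_owner  # Array
puts range_map_owner      # Enumerable
puts range_collect_owner  # Enumerable
```

Within each class, map and collect share one owner; the owner only changes between Array and other enumerables.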
Why do both map and collect exist in Ruby? The map function goes by different names in different languages. Wikipedia provides an overview:
The map function originated in functional programming languages but is today supported (or may be defined) in many procedural, object oriented, and multi-paradigm languages as well: In C++'s Standard Template Library, it is called transform, in C# (3.0)'s LINQ library, it is provided as an extension method called Select. Map is also a frequently used operation in high level languages such as Perl, Python and Ruby; the operation is called map in all three of these languages. A collect alias for map is also provided in Ruby (from Smalltalk) [emphasis mine]. Common Lisp provides a family of map-like functions; the one corresponding to the behavior described here is called mapcar (-car indicating access using the CAR operation).
Ruby provides an alias for programmers from the Smalltalk world to feel more at home.
Why is there a different implementation for arrays and enums? An enum is a generalized iteration structure, which means that Ruby cannot predict what the next element will be (you can define infinite enums; see Prime for an example). Therefore it must call a function to get each successive element (typically this will be the each method).
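To illustrate the reliance on each, here is a minimal sketch of a custom enumerable. Countdown and its each_calls counter are hypothetical names invented for this example; the counter exists only to show that both map and collect go through each.

```ruby
# Countdown is a hypothetical class used only for illustration.
class Countdown
  include Enumerable

  attr_reader :each_calls

  def initialize(from)
    @from = from
    @each_calls = 0
  end

  # Enumerable builds map, collect and the rest on top of this one method.
  def each
    return enum_for(:each) unless block_given?

    @each_calls += 1
    @from.downto(1) { |n| yield n }
  end
end

countdown = Countdown.new(3)
doubled_map     = countdown.map     { |n| n * 2 }
doubled_collect = countdown.collect { |n| n * 2 }

p doubled_map            # [6, 4, 2]
p doubled_collect        # [6, 4, 2]
p countdown.each_calls   # 2, one call to each per map/collect
```

Because the class only supplies each, Enumerable has no shortcut to the elements and has to ask for them one at a time.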
Arrays are the most common collection, so it is reasonable to optimize their performance. Since Ruby knows a lot about how arrays work, it doesn't have to call each but can instead use simple pointer manipulation, which is significantly faster.
Similar optimizations exist for a number of Array methods like zip or count.
#collect is actually an alias for #map. That means the two methods can be used interchangeably and produce the same behavior.