If I create an Enumerator like so:
enum = [1,2,3].each
=> #<Enumerator: [1, 2, 3]:each>
enum
is an Enumerator. What is the purpose of this object?
Consider what happens if you do enum = [1,2,3].each and then call enum.next repeatedly:
enum = [1,2,3].each
=> #<Enumerator: [1, 2, 3]:each>
enum.next
=> 1
enum.next
=> 2
enum.next
=> 3
enum.next
StopIteration: iteration reached an end
This can be useful when you have an Enumerator that does a calculation, such as a prime-number calculator, or a Fibonacci-sequence generator. It provides flexibility in how you write your code.
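As a minimal sketch of such a calculating enumerator, here is a Fibonacci generator. The name fib and the variable names are assumptions made for illustration; they do not come from the answer above.

```ruby
# Hypothetical example: an endless Fibonacci sequence wrapped in an Enumerator.
fib = Enumerator.new do |yielder|
  a, b = 0, 1
  loop do
    yielder << a        # hand one value to the consumer, then pause
    a, b = b, a + b
  end
end

# Pull values one at a time (external iteration).
first_three = [fib.next, fib.next, fib.next]

# Enumerable methods run the block from the start, independently of next.
first_ten = fib.take(10)
```

The generator never decides how many values to produce; the caller does, either with next or with a method such as take.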
As answered so far, Enumerator comes in handy when you want to iterate through a sequence of data of potentially infinite length.
Take a prime number generator prime_generator built on Enumerator, for example. If we want the first 5 primes, we can simply write prime_generator.take 5 instead of embedding the "limit" into the generating logic. This separates generating primes from taking a certain number of them, which makes the generator reusable.
I for one like chaining Enumerable methods, some of which return an Enumerator, as in the following example (it may not be a "purpose", but I just want to point out an aesthetic aspect of it):
prime_generator.take_while{|p| p < n}.each_cons(2).find_all{|pair| pair[1] - pair[0] == 2}
Here prime_generator is an instance of Enumerator that returns primes one by one. We can take the primes below n using the take_while method of Enumerable. Called without a block, each_cons returns an Enumerator, so find_all can be chained onto it. This example generates the twin primes below n. It may not be an efficient implementation, but it fits on one line and is, IMHO, suitable for prototyping.
Here is a pretty straightforward implementation of prime_generator based on Enumerator:
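# Naive trial division: 2 is prime; otherwise n must be odd and
# have no odd divisor between 3 and n - 1.
def prime?(n)
  n == 2 ||
    (n >= 3 && n.odd? && (3...n).step(2).all? { |k| n % k != 0 })
end

# Endless generator: yields each prime as it is found.
prime_generator = Enumerator.new do |yielder|
  n = 1
  loop do
    yielder << n if prime?(n)
    n += 1
  end
end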
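To tie the pieces together, here is a self-contained sketch that repeats the definitions above and then runs both calls discussed earlier. The variable limit stands in for n and its value of 20 is an arbitrary assumption.

```ruby
def prime?(n)
  n == 2 || (n >= 3 && n.odd? && (3...n).step(2).all? { |k| n % k != 0 })
end

prime_generator = Enumerator.new do |yielder|
  n = 1
  loop do
    yielder << n if prime?(n)
    n += 1
  end
end

# The limit lives at the call site, not inside the generator.
first_five = prime_generator.take(5)

limit = 20 # hypothetical upper bound, plays the role of n above
twin_primes = prime_generator.take_while { |p| p < limit }
                             .each_cons(2)
                             .find_all { |pair| pair[1] - pair[0] == 2 }
```

Because take_while stops at the first prime that is not below the limit, the endless generator is only run as far as needed.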
I think the main purpose is to get elements on demand instead of getting them all in a single loop. I mean something like this:
e = [1, 2, 3].each
# ... do stuff ...
first = e.next
# ... do stuff with first ...
second = e.next
# ... do more stuff with second ...
Note that those "do stuff" parts can be in different functions, far away from each other.
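A small sketch of that idea, with the consuming code split across two methods. The method names read_header and read_record and the sample rows are hypothetical, chosen only to illustrate passing one enumerator around.

```ruby
# Hypothetical helpers: each one pulls the next element from a shared enumerator.
def read_header(rows)
  rows.next
end

def read_record(rows)
  rows.next
end

rows = ["name,age", "alice,30", "bob,25"].each

header = read_header(rows)  # consumes the first element
first  = read_record(rows)  # continues where read_header stopped
second = read_record(rows)
```

The enumerator remembers its position, so neither method needs to know how many elements were consumed before it was called.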
Lazily evaluated infinite sequences (e.g. primes, Fibonacci numbers, string keys like 'a'..'z','aa'..'az','ba'..'zz','aaa'.. etc.) are a good use case for enumerators.
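The string-key case could be sketched as follows, assuming String#succ is an acceptable way to produce the next key. The name keys is made up for this example.

```ruby
# Hypothetical endless key sequence: "a", "b", ..., "z", "aa", "ab", ...
keys = Enumerator.new do |yielder|
  key = "a"
  loop do
    yielder << key
    key = key.succ   # "z".succ is "aa", "zz".succ is "aaa"
  end
end

first_keys = keys.take(3)

# lazy lets filters and drops run on the endless sequence without hanging.
two_letter = keys.lazy.select { |k| k.length == 2 }.first(3)
around_z   = keys.lazy.drop(24).first(4)
```

Without lazy, select would try to walk the whole infinite sequence before returning; with it, only as many keys are generated as first asks for.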
To understand the major advantage of the Enumerator class, you first need to distinguish internal and external iterators. With internal iterators, the iterator itself controls the iteration. With external iterators, the client (often the programmer) controls the iteration. Clients that use an external iterator must advance the traversal and request the next element explicitly from the iterator. In contrast, the client hands an internal iterator an operation to perform, and the iterator applies that operation to every element in the collection.
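The difference can be sketched side by side on a plain array. The variable names here are illustrative assumptions.

```ruby
numbers = [1, 2, 3]

# Internal iteration: each drives the loop and calls our block per element.
internal = []
numbers.each { |n| internal << n * 10 }

# External iteration: we drive the loop and ask for each element ourselves.
external = []
e = numbers.each
loop do
  external << e.next * 10   # loop rescues StopIteration and exits quietly
end
```

Both arrays end up with the same contents; what differs is which side decides when the next element is fetched.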
In Ruby, the Enumerator class enables you to make use of external iterators. Once you understand external iterators, you will begin to discover a lot of advantages. First, let's look at how the Enumerator class facilitates external iteration:
class Fruit
  def initialize
    @kinds = %w(apple orange pear banana)
  end

  def kinds
    yield @kinds.shift
    yield @kinds.shift
    yield @kinds.shift
    yield @kinds.shift
  end
end
f = Fruit.new
enum = f.to_enum(:kinds)
enum.next
=> "apple"
f.instance_variable_get :@kinds
=> ["orange", "pear", "banana"]
enum.next
=> "orange"
f.instance_variable_get :@kinds
=> ["pear", "banana"]
enum.next
=> "pear"
f.instance_variable_get :@kinds
=> ["banana"]
enum.next
=> "banana"
f.instance_variable_get :@kinds
=> []
enum.next
StopIteration: iteration reached an end
It's important to note that calling to_enum on an object and passing a symbol that names a method instantiates the Enumerator class; in our example, the enum local variable holds an Enumerator instance. We then use external iteration to step through the enumeration method we created. Our enumeration method is called "kinds", and notice that it uses the yield keyword, which we typically use with blocks. Here, the enumerator yields one value at a time and pauses after each yield. When asked for another value, it resumes immediately after the last yield and executes up to the next one. When there is nothing left to yield and you call next, it raises a StopIteration exception.
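The pause-and-resume behaviour can be made visible with a small trace. This is a sketch using Enumerator.new rather than to_enum, and the names steps and log are assumptions for illustration.

```ruby
log = []

steps = Enumerator.new do |yielder|
  log << "before first"
  yielder << 1
  log << "between"
  yielder << 2
  log << "after last"
end

a = steps.next               # runs the block up to the first yield
log_after_first = log.dup

b = steps.next               # resumes after the first yield, stops at the second
log_after_second = log.dup

# A third call runs the rest of the block, finds nothing to yield, and raises.
ended = begin
  steps.next
  false
rescue StopIteration
  true
end
```

Comparing the two log snapshots shows that the code between the yields does not run until the next value is requested.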
So what is the power of external iteration in Ruby? There are several benefits and I will highlight a few of them. First, the Enumerator class allows for chaining. For example, with_index is defined in the Enumerator class, and it lets us specify the starting value of the index when iterating over an Enumerator object:
f.instance_variable_set :@kinds, %w(apple orange pear banana)
enum.rewind
enum.with_index(1) do |name, i|
  puts "#{name}: #{i}"
end
apple: 1
orange: 2
pear: 3
banana: 4
Second, it provides a TON of useful convenience methods from the Enumerable module. Remember Enumerator is a class and Enumerable is a module, but the Enumerable module is included in the Enumerator class and so Enumerators are Enumerable:
Enumerator.ancestors
=> [Enumerator, Enumerable, Object, Kernel, BasicObject]
f.instance_variable_set :@kinds, %w(apple orange pear banana)
enum.rewind
enum.detect {|kind| kind =~ /^a/}
=> "apple"
enum
=> #<Enumerator: #<Fruit:0x007fb86c09bdf8 @kinds=["orange", "pear", "banana"]>:kinds>
And there is one other major benefit of Enumerator that might not be immediately clear. Let me explain this through a demonstration. As you probably know, you can make any of your user-defined classes Enumerable by including the Enumerable module and defining an each instance method:
class Fruit
  include Enumerable

  attr_accessor :kinds

  def initialize
    @kinds = %w(apple orange pear banana)
  end

  def each
    @kinds.each { |kind| yield kind }
  end
end
This is cool. Now we have a ton of Enumerable instance method goodies available to us, like chunk, drop_while, flat_map, grep, lazy, partition, reduce, take_while and more.
f.partition {|kind| kind =~ /^a/ }
=> [["apple"], ["orange", "pear", "banana"]]
It's interesting to note that each of the instance methods of the Enumerable module actually calls our each method behind the scenes in order to get the enumerable items. So if we were to implement the reduce method, it might look something like this:
module Enumerable
  def reduce(acc)
    each do |value|
      acc = yield(acc, value)
    end
    acc
  end
end
Notice how it passes a block to the each method and so our each method is expected to yield something back to the block.
But look what happens if client code calls the each method without specifying a block:
f.each
LocalJumpError: no block given (yield)
So now we can modify our each method to use enum_for, which will return an Enumerator object when a block is not given:
class Fruit
  include Enumerable

  attr_accessor :kinds

  def initialize
    @kinds = %w(apple orange pear banana)
  end

  def each
    return enum_for(:each) unless block_given?
    @kinds.each { |kind| yield kind }
  end
end
f = Fruit.new
f.each
=> #<Enumerator: #<Fruit:0x007ff70aa3b548 @kinds=["apple", "orange", "pear", "banana"]>:each>
And now we have an Enumerator instance we could control with our client code for later use.
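As a sketch of what that "later use" might look like, the following repeats the class in compact form and then drives the returned enumerator from client code. The variable names are assumptions for illustration.

```ruby
class Fruit
  include Enumerable

  def initialize
    @kinds = %w(apple orange pear banana)
  end

  def each
    return enum_for(:each) unless block_given?
    @kinds.each { |kind| yield kind }
  end
end

f = Fruit.new

# Without a block we get an Enumerator and can step through it ourselves.
enum   = f.each
first  = enum.next
peeked = enum.peek   # look at the upcoming element without consuming it
second = enum.next

# With a block, each still behaves as before, so Enumerable keeps working.
shouted = f.map(&:upcase)
```

The same each method now serves both styles: internal iteration when a block is given, external iteration when it is not.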
It is possible to combine enumerators:
array.each.with_index { |el, idx| ... }
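A few concrete combinations, as a sketch; the sample array and variable names are made up for illustration.

```ruby
array = %w(a b c d)

# each without a block returns an Enumerator, which with_index then wraps.
indexed = array.each.with_index.map { |el, idx| "#{idx}:#{el}" }

# map without a block also returns an Enumerator, so the index can start at 1.
labelled = array.map.with_index(1) { |el, idx| "#{idx}. #{el}" }

# each_slice combined with with_index numbers the slices instead of the elements.
pairs = array.each_slice(2).with_index.to_a
```

Each blockless call hands back an Enumerator, and the next method in the chain decides what is done with the values passing through.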