I have the following array:
names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
How do I produce a count for each distinct element?
Enumerable#each_with_object saves you from returning the final hash.
names.each_with_object(Hash.new(0)) { |name, hash| hash[name] += 1 }
Returns:
=> {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}
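To make "saves you from returning the final hash" concrete, here is a small side-by-side sketch with inject, where the block has to hand the accumulator back itself. The variable names with_inject and with_object are mine, not part of the answer above.

```ruby
names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]

# With inject, the block's return value becomes the next accumulator,
# so the hash must be returned explicitly as the last expression.
with_inject = names.inject(Hash.new(0)) do |hash, name|
  hash[name] += 1
  hash
end

# each_with_object passes the same object to every iteration and
# returns it at the end, whatever the block itself evaluates to.
with_object = names.each_with_object(Hash.new(0)) { |name, hash| hash[name] += 1 }

p with_inject == with_object # prints true
```

Note that the block arguments are in the opposite order: inject yields (accumulator, element), while each_with_object yields (element, object).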
names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
counts = Hash.new(0)
names.each { |name| counts[name] += 1 }
# => {"Jason" => 2, "Teresa" => 1, ....
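If you are on Ruby 2.7 or later (an assumption about your setup, since the question does not say), the same counting loop is available as the built-in Enumerable#tally. A minimal sketch:

```ruby
names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]

# tally returns a hash mapping each distinct element to its number of occurrences.
# Requires Ruby 2.7 or later.
counts = names.tally
# => {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}
```

Unlike Hash.new(0), the hash returned by tally has no default value, so looking up a missing name gives nil rather than 0.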
arr = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
arr.uniq.inject({}) {|a, e| a.merge({e => arr.count(e)})}
Time elapsed 0.028 milliseconds
Interestingly, stupidgeek's implementation benchmarked:
Time elapsed 0.041 milliseconds
and the winning answer:
Time elapsed 0.011 milliseconds
:)
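The numbers above were posted without the timing harness, so the following is only a sketch of how one might reproduce such a comparison with the standard library's Benchmark module. The iteration count and the choice of which two approaches to time are my assumptions, and absolute timings will differ by machine and Ruby version.

```ruby
require 'benchmark'

names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
iterations = 10_000 # arbitrary; a single run is too short to time reliably

# Approach 1: uniq + inject + count (rescans the array for every distinct name).
uniq_count_seconds = Benchmark.realtime do
  iterations.times do
    names.uniq.inject({}) { |a, e| a.merge({e => names.count(e)}) }
  end
end

# Approach 2: a single pass with a default-zero hash.
default_hash_seconds = Benchmark.realtime do
  iterations.times do
    counts = Hash.new(0)
    names.each { |name| counts[name] += 1 }
  end
end

puts format("uniq + count:  %.4f ms per run", uniq_count_seconds * 1000 / iterations)
puts format("Hash.new(0):   %.4f ms per run", default_hash_seconds * 1000 / iterations)
```

Averaging over many runs smooths out noise that dominates a single sub-millisecond measurement.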
Lots of great implementations here.
But as a beginner, I would consider this the easiest to read and implement:
names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
name_frequency_hash = {}
names.each do |name|
count = names.count(name)
name_frequency_hash[name] = count
end
#=> {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}
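The steps we took:
- we created an empty hash
- we looped over the names array
- we counted how many times each name appears in the names array
- we created a key using the name and a value using the count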
It may be slightly more verbose (and performance-wise it does some unnecessary work, since count rescans the whole array and the key is overwritten for every repeated name), but in my opinion it is easier to read and understand for what you want to achieve.
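The redundant work comes from calling count once per element, duplicates included. One small variation, sketched here rather than taken from the answer, is to loop over names.uniq so that each key is written exactly once:

```ruby
names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
name_frequency_hash = {}

# uniq yields each distinct name once, so no key is ever overwritten.
names.uniq.each do |name|
  name_frequency_hash[name] = names.count(name)
end

name_frequency_hash
# => {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}
```

This still scans the array once per distinct name, so the single-pass Hash.new(0) approach remains cheaper on large inputs.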
There's actually a data structure which does this: MultiSet. Unfortunately, there is no MultiSet implementation in the Ruby core library or standard library, but there are a couple of implementations floating around the web.
This is a great example of how the choice of a data structure can simplify an algorithm. In fact, in this particular example, the algorithm even completely goes away. It's literally just:
Multiset.new(*names)
And that's it. Example, using https://GitHub.Com/Josh/Multimap/:
require 'multiset'
names = %w[Jason Jason Teresa Judah Michelle Judah Judah Allison]
histogram = Multiset.new(*names)
# => #<Multiset: {"Jason", "Jason", "Teresa", "Judah", "Judah", "Judah", "Michelle", "Allison"}>
histogram.multiplicity('Judah')
# => 3
Example, using http://maraigue.hhiro.net/multiset/index-en.php:
require 'multiset'
names = %w[Jason Jason Teresa Judah Michelle Judah Judah Allison]
histogram = Multiset[*names]
# => #<Multiset:#2 'Jason', #1 'Teresa', #3 'Judah', #1 'Michelle', #1 'Allison'>
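If you would rather not pull in a third-party library, the core idea is small enough to sketch by hand. The class below, TinyMultiset, is a hypothetical illustration of mine and not the API of either library above; it only borrows the method name multiplicity from the first example.

```ruby
# Hypothetical minimal multiset: stores each element with its count.
class TinyMultiset
  def initialize(items = [])
    @counts = Hash.new(0)
    items.each { |item| add(item) }
  end

  # Adds one occurrence of item and returns self so calls can be chained.
  def add(item)
    @counts[item] += 1
    self
  end

  # Number of times item occurs; 0 for items never added.
  def multiplicity(item)
    @counts.fetch(item, 0)
  end

  # A copy of the element => count mapping.
  def to_h
    @counts.dup
  end
end

names = %w[Jason Jason Teresa Judah Michelle Judah Judah Allison]
histogram = TinyMultiset.new(names)
histogram.multiplicity('Judah')  # => 3
histogram.multiplicity('Nobody') # => 0
```

A real multiset library offers much more (set operations, iteration, removal), so treat this only as a picture of what the data structure stores.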
The following is a slightly more functional programming style:
array_with_lower_case_a = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
hash_grouped_by_name = array_with_lower_case_a.group_by {|name| name}
hash_grouped_by_name.map{|name, names| [name, names.length]}
=> [["Jason", 2], ["Teresa", 1], ["Judah", 3], ["Michelle", 1], ["Allison", 1]]
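The map call above returns an array of pairs rather than a hash. If a hash is what you need, here is a sketch of two ways to get one, assuming Ruby 2.4 or later for Hash#transform_values; the names counts and pairs are mine.

```ruby
names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]

# Replace each group (an array of equal names) with its length.
counts = names.group_by { |name| name }.transform_values(&:length)
# => {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}

# Equivalent: build the pairs as in the answer, then convert them with to_h.
pairs = names.group_by { |name| name }.map { |name, group| [name, group.length] }
p counts == pairs.to_h # prints true
```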
One advantage of group_by is that you can use it to group equivalent but not exactly identical items:
another_array_with_lower_case_a = ["Jason", "jason", "Teresa", "Judah", "Michelle", "Judah Ben-Hur", "JUDAH", "Allison"]
hash_grouped_by_first_name = another_array_with_lower_case_a.group_by {|name| name.split(" ").first.capitalize}
hash_grouped_by_first_name.map{|first_name, names| [first_name, names.length]}
=> [["Jason", 2], ["Teresa", 1], ["Judah", 3], ["Michelle", 1], ["Allison", 1]]
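The same normalization works without building the intermediate groups. As a sketch, the key computation can be pulled into a lambda (first_name is a name I made up for illustration) and combined with the each_with_object counting shown earlier in this thread:

```ruby
names = ["Jason", "jason", "Teresa", "Judah", "Michelle", "Judah Ben-Hur", "JUDAH", "Allison"]

# Normalize a full name to its capitalized first word, e.g. "JUDAH" -> "Judah".
first_name = ->(name) { name.split(" ").first.capitalize }

# Count by the normalized key instead of the raw string.
counts = names.each_with_object(Hash.new(0)) do |name, hash|
  hash[first_name.call(name)] += 1
end
# => {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}
```

This keeps only the counts in memory, whereas group_by holds every original string in its groups, which is useful if you also need the members and wasteful if you do not.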