I am trying to sort the words in a document by the number of times each word appears, and then alphabetically by word, so that the output looks something like this.
When you use the sort method on a hash, your comparison block receives two-element arrays, so you can do both comparisons in one pass.
hsh = { 'the' => '6', 'we' => '6', 'those' => '5', 'have' => '3' }
ary = hsh.sort do |a, b|
  # a and b are two-element arrays in the format [key, value]
  value_comparison = a.last <=> b.last
  if value_comparison.zero?
    # compare keys if values are equal
    a.first <=> b.first
  else
    value_comparison
  end
end
# => [["have", "3"], ["those", "5"], ["the", "6"], ["we", "6"]]
Note that the result is an array of arrays, because sort always returns an array and, before Ruby 1.9, hashes had no intrinsic order.
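One caveat with the snippet above: the counts are strings, so `<=>` compares them character by character. That happens to work for single-digit counts but not once a count reaches two digits. A minimal sketch of the difference, using made-up sample data (the `'10'` entry is an assumption added for illustration):

```ruby
# Hypothetical sample data; '10' is added to show the multi-digit case.
hsh = { 'the' => '10', 'we' => '6', 'those' => '5' }

# Comparing the raw strings sorts "10" before "5" (lexicographic order).
as_strings = hsh.sort { |a, b| a.last <=> b.last }

# Converting to integers inside the block gives numeric order.
as_integers = hsh.sort { |a, b| a.last.to_i <=> b.last.to_i }

p as_strings   # [["the", "10"], ["those", "5"], ["we", "6"]]
p as_integers  # [["those", "5"], ["we", "6"], ["the", "10"]]
```

If the counts can be stored as integers in the first place, the conversion is unnecessary.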
Try this:
Assuming:
a = {
'the' => '6',
'we' => '7',
'those' => '5',
'have' => '3',
'hav' => '3',
'haven' => '3'
}
then after doing this:
b = a.sort_by { |x, y| [ -Integer(y), x ] }
b
will look like this:
[
["we", "7"],
["the", "6"],
["those", "5"],
["hav", "3"],
["have", "3"],
["haven", "3"]
]
Edited to sort by reverse frequencies.
histogram = { 'the' => 6, 'we' => 7, 'those' => 5, 'have' => 3, 'and' => 6 }
Hash[histogram.sort_by {|word, freq| [-freq, word] }]
# {
# 'we' => 7,
# 'and' => 6,
# 'the' => 6,
# 'those' => 5,
# 'have' => 3
# }
Note: this assumes that you use numbers to store the numbers. In your data model, you appear to use strings to store the numbers. I have no idea why you would want to do this, but if you do want to do this, you would obviously have to convert them to numbers before sorting and then back to strings.
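If the counts really do have to stay as strings, the round trip described in the note could look roughly like this. It is only a sketch: the variable names (`raw`, `numeric`, `result`) are made up here, and it assumes every value parses cleanly with `Integer()`.

```ruby
# Counts stored as strings, as in the question's data model.
raw = { 'the' => '6', 'we' => '7', 'those' => '5', 'have' => '3', 'and' => '6' }

# 1. Convert the counts to integers (Integer() raises on malformed input).
numeric = raw.map { |word, freq| [word, Integer(freq)] }

# 2. Sort by descending count, then alphabetically by word.
sorted = numeric.sort_by { |word, freq| [-freq, word] }

# 3. Convert the counts back to strings and rebuild the hash (Ruby 1.9+).
result = Hash[sorted.map { |word, freq| [word, freq.to_s] }]

p result
# {"we"=>"7", "and"=>"6", "the"=>"6", "those"=>"5", "have"=>"3"}
```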
Also, this assumes Ruby 1.9. In Ruby 1.8, hashes aren't ordered, so converting the sorted result back to a hash would lose the ordering information; you would have to keep it as an array.
1.9.1
>> words = {'the' => 6,'we' => 7, 'those' => 5, 'have' => 3}
=> {"the"=>6, "we"=>7, "those"=>5, "have"=>3}
>> words.sort_by{ |x| x.last }.reverse
=> [["we", 7], ["the", 6], ["those", 5], ["have", 3]]
word_counts = {
'the' => 6,
'we' => 7,
'those' => 5,
'have' => 3,
'and' => 6
};
word_counts_sorted = word_counts.sort do |a, b|
  # sort on last field descending, then first field ascending if necessary;
  # nonzero? returns nil on a tie, so || falls through to the word comparison
  (b.last <=> a.last).nonzero? || (a.first <=> b.first)
end
puts "Unsorted\n"
word_counts.each do |word, count|
  puts "#{word} #{count}"
end
puts "\n"
puts "Sorted\n"
word_counts_sorted.each do |word, count|
  puts "#{word} #{count}"
end
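A note on chaining comparisons with `||` in a sort block: in Ruby, `0` is truthy, so a plain `x <=> y || ...` never reaches the second comparison when the first one is a tie. `nonzero?` returns `nil` for zero and the number itself otherwise, which makes the fallback work. A small sketch with made-up data:

```ruby
pairs = [['the', 6], ['we', 7], ['and', 6]]

# 0 is truthy, so the right-hand side of || is never evaluated here.
tie = (6 <=> 6) || :fallback

# nonzero? turns a tie (0) into nil, so || falls through to the word comparison.
chained = pairs.sort do |a, b|
  (b.last <=> a.last).nonzero? || (a.first <=> b.first)
end

p tie      # 0
p chained  # [["we", 7], ["and", 6], ["the", 6]]
```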
words = {'the' => 6,'we' => 7,'those' => 5,'have' => 3}
sorted_words = words.sort { |a,b| b.last <=> a.last }
sorted_words.each { |k,v| puts "#{k} #{v}"}
produces:
we 7
the 6
those 5
have 3
You probably want the values to be integers rather than strings for comparison purposes.
EDIT
Oops, overlooked the requirement that it needs to be sorted by the key too. So:
words = {'the' => 6,'we' => 7,'those' => 5,'have' => 3,'zoo' => 3,'foo' => 3}
sorted_words = words.sort do |a,b|
a.last == b.last ? a.first <=> b.first : b.last <=> a.last
end
sorted_words.each { |k,v| puts "#{k} #{v}"}
produces:
we 7
the 6
those 5
foo 3
have 3
zoo 3