Question
I have a .csv file that, for simplicity, has two fields: ID and comment. IDs are duplicated wherever a comment hit the maximum character length of the table it was generated from and had to continue on another row. I now need to merge the associated comments together, creating one row for each unique ID, using Ruby.
To illustrate, I'm trying, in Ruby, to turn this:
ID | COMMENT
1 | fragment 1
1 | fragment 2
2 | fragment 1
3 | fragment 1
3 | fragment 2
3 | fragment 3
into this:
ID | COMMENT
1 | fragment 1 fragment 2
2 | fragment 1
3 | fragment 1 fragment 2 fragment 3
I've come close to a way to do this using inject({}) and a hash, but I'm still working on getting all the data merged correctly. Meanwhile, my code seems to be getting too complicated, with multiple hashes and arrays, just to merge selected rows.
What's the best/simplest way to achieve this type of row merge? Could it be done with just arrays?
Would appreciate advice on how one would normally do this in Ruby.
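Answer 1:
Keep the headers and group by ID:
require 'csv'
rows = CSV.read 'comment.csv', :headers => true
# Group the rows that share an ID, then print each ID once with its comments joined by spaces.
rows.group_by{|row| row['ID']}.values.each do |group|
  puts [group.first['ID'], group.map{|r| r['COMMENT']}.join(' ')].join(' | ')
end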
You can index the fields by position (0 and 1) instead, but I think it's clearer to use the header field names.
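The snippet above only prints the merged rows. If the goal is a merged CSV rather than printed text, a minimal sketch along the same lines could look like this. It assumes the input really is comma-separated with an ID,COMMENT header row; the merge_comments helper name and the inline sample data are hypothetical, not part of the original answer.

```ruby
require 'csv'

# Hypothetical helper: takes CSV text with ID and COMMENT headers and
# returns CSV text with one row per ID, comments joined by a single space.
def merge_comments(csv_text)
  rows = CSV.parse(csv_text, headers: true)
  CSV.generate do |out|
    out << %w[ID COMMENT]
    # group_by keeps IDs in order of first appearance.
    rows.group_by { |row| row['ID'] }.each do |id, group|
      out << [id, group.map { |r| r['COMMENT'] }.join(' ')]
    end
  end
end

# Sample data for illustration only.
sample = <<~DATA
  ID,COMMENT
  1,fragment 1
  1,fragment 2
  2,fragment 1
DATA

puts merge_comments(sample)
```

To work with files, the returned string could be written out with File.write, and the input read with File.read.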
Answer 2:
With the following CSV file, tmp.csv:
1,fragment 11
1,fragment 21
2,fragment 21
2,fragment 22
3,fragment 31
3,fragment 32
3,fragment 33
Try this (demonstrated using irb)
irb> require 'csv'
=> true
irb> h = Hash.new
=> {}
irb> CSV.foreach("tmp.csv") {|r| h[r[0]] = h.key?(r[0]) ? h[r[0]] + ' ' + r[1] : r[1]}
=> nil
irb> h
=> {"1"=>"fragment 11 fragment 21", "2"=>"fragment 21 fragment 22", "3"=>"fragment 31 fragment 32 fragment 33"}
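Outside irb, the same idea can be sketched by collecting each ID's fragments into an array and joining them once at the end, which also speaks to the question of doing the merge with arrays. This assumes headerless id,comment rows as in tmp.csv; the merge_rows name and the inline sample are hypothetical.

```ruby
require 'csv'

# Hypothetical helper: expects headerless CSV text of the form "id,comment"
# and returns a hash mapping each ID to its space-joined comments.
def merge_rows(csv_text)
  # Each new ID starts with an empty array of fragments.
  fragments = Hash.new { |hash, id| hash[id] = [] }
  CSV.parse(csv_text) { |id, comment| fragments[id] << comment }
  fragments.transform_values { |parts| parts.join(' ') }
end

# Sample data for illustration only.
sample = <<~DATA
  1,fragment 11
  1,fragment 21
  2,fragment 21
DATA

merge_rows(sample).each { |id, text| puts "#{id} | #{text}" }
```

Keeping the fragments in arrays until the end avoids repeated string concatenation and makes it easy to change the separator in one place.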
Source: https://stackoverflow.com/questions/10973182/merge-rows-csv-by-id-ruby