What is a clever way to remove rows from a CSV file in Ruby where a particular value exists in a particular column?
Here's an example of a file:
350 lbs.,
Well, I don't think this example will get the answer you're looking for, but this would work:
tmp.txt =>
350 lbs., Outrigger Footprint, 61" x 53", Weight, 767 lbs., 300-2080
350 lbs., Outrigger Footprint, 61" x 53", Weight, 817 lbs., 300-2580
350 lbs., Outrigger Footprint, 61" x 53", Weight, 817 lbs., 300-2580
350 lbs., Outrigger Footprint, 69" x 61", Weight, 867 lbs., 300-3080
350 lbs., Outrigger Footprint, 69" x 61", Weight, 867 lbs., 300-3080
File.readlines('tmp.txt').uniq
will return this:
350 lbs., Outrigger Footprint, 61" x 53", Weight, 767 lbs., 300-2080
350 lbs., Outrigger Footprint, 61" x 53", Weight, 817 lbs., 300-2580
350 lbs., Outrigger Footprint, 69" x 61", Weight, 867 lbs., 300-3080
So you could also easily sort or filter with Array methods. Have a look at the Array docs and you can decide whether to keep a line based on a comparison against a desired string, as in the sketch below.
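A minimal sketch of that idea, assuming the goal is to drop every line containing the (hypothetical) part number 300-2580:
lines = File.readlines('tmp.txt').uniq
# Reject any line that contains the unwanted value
kept = lines.reject { |line| line.include?('300-2580') }
# The lines still carry their trailing newlines, so print rather than puts
File.open('filtered.txt', 'w') { |f| kept.each { |line| f.print line } }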
You can use this to get the unique lines of a CSV file as an array:
File.readlines("file.csv").uniq
=> ["350 lbs., Outrigger Footprint, 61\" x 53\", Weight, 767 lbs., 300-2080\n", "350 lbs., Outrigger Footprint, 61\" x 53\", Weight, 817 lbs., 300-2580\n", "350 lbs., Outrigger Footprint, 69\" x 61\", Weight, 867 lbs., 300-3080\n"]
To write the result to a new file, open a file in write mode and write the unique lines into it:
File.open("new_csv", "w+") { |file| file.puts File.readlines("csv").uniq }
For comparing values, you can split each row on "," to access the individual columns:
rows = File.readlines("csv").map(&:chomp) # equivalent to File.readlines.map { |f| f.chomp }
mapped_columns = rows.map { |r| r.split(",").map(&:strip) }
=> [["350 lbs.", " Outrigger Footprint", " 61\" x 53\"", " Weight", " 767 lbs.", " 300-2080"], ["350 lbs.", " Outrigger Footprint", " 61\" x 53\"", " Weight", " 817 lbs.", " 300-2580"], .....]
mapped_columns[0][5]
=> "300-2080"
If you want more functionality, you are better off installing the FasterCSV gem (it was merged into Ruby 1.9 as the standard csv library).
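A rough equivalent with the stdlib csv, combining de-duplication with the row removal. The column index and target value are assumptions, and note that the sample rows contain bare quotes (61" x 53"), which the default parser rejects, hence liberal_parsing (available since Ruby 2.4):
require 'csv'

seen = {}
CSV.open('filtered.csv', 'w') do |out|
  # liberal_parsing tolerates the bare quotes inside fields like 61" x 53"
  CSV.foreach('tmp.txt', liberal_parsing: true) do |row|
    next if row[5] && row[5].strip == '300-2580' # drop rows carrying the unwanted value
    next if seen.key?(row)                       # drop duplicate rows
    seen[row] = true
    out << row
  end
end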
You can also build a Hash, whose keys are necessarily unique, so duplicate records collapse into a single entry. For example, the following code should help:
require 'optparse'

options = Hash.new
OptionParser.new do |opts|
  opts.banner = "Usage: remove_extras.rb [options] file1 ..."
  options[:input_file] = ''
  opts.on('-i', '--input_file FILENAME', 'File to have extra rows removed') do |file|
    options[:input_file] = file
  end
end.parse!

if File.exist?(options[:input_file])
  p "Parsing: #{options[:input_file]}"
  # Hash keys are unique, so storing each row as a key discards duplicates
  uniq_rows = Hash.new
  File.foreach(options[:input_file]) do |row|
    uniq_rows.store(row, row.hash)
  end
  puts "Please enter the output filename:"
  # Open with a block so the output file is closed when writing finishes
  File.open(gets.chomp, "a+") do |out|
    uniq_rows.each_key { |row| out.write(row) }
  end
end
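Invoked as ruby remove_extras.rb -i tmp.txt, the script prompts for an output filename and appends one copy of each distinct row to it.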