Question
I have the following code, which parses an HTML table as simply as possible.
# Timestamp (Column 1 of the table)
page = agent.page.search("tbody td:nth-child(1)").each do |item|
  Call.create!(:time => item.text.strip)
end
# Source (Column 2 of the table)
page = agent.page.search("tbody td:nth-child(2)").each do |item|
  Call.create!(:source => item.text.strip)
end
# Destination (Column 3 of the table)
page = agent.page.search("tbody td:nth-child(3)").each do |item|
  Call.create!(:destination => item.text.strip)
end
# Duration (Column 4 of the table)
page = agent.page.search("tbody td:nth-child(4)").each do |item|
  Call.create!(:duration => item.text.strip)
end
Although the above code runs without errors, it treats each "item" as a new record: it adds one record for every cell in the time column, another record for every cell in the source column, and so on.
What is the simplest way to loop over the table so that the four items from each row go into one record before moving on to the next record?
For additional info, here is the migration file, which shows my database structure:
class CreateCalls < ActiveRecord::Migration
  def change
    create_table :calls do |t|
      t.datetime :time
      t.string :source
      t.string :destination
      t.string :duration
      t.timestamps
    end
  end
end
Any help is appreciated.
Answer 1:
Consider iterating over each row instead of each column.
page = agent.page.search("table tbody tr").each do |row|
  time = row.at("td:nth-child(1)").text.strip
  source = row.at("td:nth-child(2)").text.strip
  destination = row.at("td:nth-child(3)").text.strip
  duration = row.at("td:nth-child(4)").text.strip
  Call.create!(:time => time, :source => source, :destination => destination, :duration => duration)
end
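The same row-by-row idea can also be written without repeating the selector for each column, by pairing a list of column names with the cell text of each row. This is only a minimal sketch, assuming the four columns always appear in the order time, source, destination, duration. The `COLUMNS` constant and the `row_to_attributes` helper are hypothetical names, and the rows here are plain arrays of strings standing in for something like `row.search("td").map(&:text)`, so the snippet runs without Mechanize or a database:

```ruby
# Column names in the order they are assumed to appear in the table.
COLUMNS = [:time, :source, :destination, :duration].freeze

# Hypothetical helper: turn one row's cell texts into an attributes
# hash describing a single record.
def row_to_attributes(cells)
  COLUMNS.zip(cells.map(&:strip)).to_h
end

# Stand-in data (made up for illustration): each inner array holds
# the text of one <tr>'s <td> cells.
rows = [
  [" 2012-01-09 10:15:00 ", "5551234", "5559876 ", "00:02:31"],
  ["2012-01-09 10:20:00", " 5550000", "5551111", "00:00:45 "]
]

# One attributes hash per table row, not one per cell.
records = rows.map { |cells| row_to_attributes(cells) }
records.each { |attrs| puts attrs.inspect }
```

In the scraper itself, each resulting hash could then be passed to `Call.create!`, so that one record is created per table row.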
Answer 2:
Instead of calling Call.create! every time, just append all the source values to a string and then save the record at the end.
Source: https://stackoverflow.com/questions/8792074/parse-html-into-rails-without-new-record-every-time