I want to convert .json file into .csv file using ruby. Pleases help me to do this.
Also propose any tool to achieve this.
Edit:
This functionality described below is now available as a gem. After installing with gem install json_converter
, the following snippet can be used to generate a CSV from a valid JSON string or object:
require 'json_converter'
json_converter= JsonConverter.new
# Assume json is a valid JSON string or object
csv = json_converter.generate_csv json
Original Answer:
If your JSON data is relatively simple (no nesting or arrays), Alex's answer is probably the cleanest way of handling this problem.
However, if you do need to take arrays and nested objects into account, I've attempted to port a web version of such a converter to ruby. It can be found here. The methods that handle the actual restructuring of data are array_from
and flatten
.
The array_from
method attempts to identify what a "row" of data looks like for a given dataset. It is not perfect, and you may want to tweak this part for different datasets.
# Attempt to identify what a "row" should look like
def array_from(json)
queue, next_item = [], json
while !next_item.nil?
return next_item if next_item.is_a? Array
if next_item.is_a? Hash
next_item.each do |k, v|
queue.push next_item[k]
end
end
next_item = queue.shift
end
return [json]
end
The flatten
method recursively iterates over the JSON object(s), and generates an object that represents headers and values. If an object is nested, the header for its column will be prefixed with its parent key(s), delimited by the /
character.
# The path argument is used to construct header columns for nested elements
def flatten(object, path='')
scalars = [String, Integer, Fixnum, FalseClass, TrueClass]
columns = {}
if [Hash, Array].include? object.class
object.each do |k, v|
new_columns = flatten(v, "#{path}#{k}/") if object.class == Hash
new_columns = flatten(k, "#{path}#{k}/") if object.class == Array
columns = columns.merge new_columns
end
return columns
elsif scalars.include? object.class
# Remove trailing slash from path
end_path = path[0, path.length - 1]
columns[end_path] = object
return columns
else
return {}
end
end
If there are any null
values in the original JSON, you'll need to convert these to something other than nil
before attempting the conversion - you'll generally end up with uneven rows if you don't. The nils_to_strings
method handles that:
# Recursively convert all nil values of a hash to empty strings
def nils_to_strings(hash)
hash.each_with_object({}) do |(k,v), object|
case v
when Hash
object[k] = nils_to_strings v
when nil
object[k] = ''
else
object[k] = v
end
end
end
Here's a brief example of how this would be used:
json = JSON.parse(File.open('in.json').read)
in_array = array_from json
in_array.map! { |x| nils_to_strings x }
out_array = []
in_array.each do |row|
out_array[out_array.length] = flatten row
end
headers_written = false
CSV.open('out.csv', 'w') do |csv|
out_array.each do |row|
csv << row.keys && headers_written = true if headers_written === false
csv << row.values
end
end
And finally, here's some example input/output:
Input:
{
"Forms": [
{
"Form": {
"id": "x",
"version_id": "x",
"name": "x",
"category": "",
"subcategory": null,
"is_template": null,
"moderation_status": "x",
"display_status": "x",
"use_ssl": "x",
"modified": "x",
"Aggregate_metadata": {
"id": "x",
"response_count": "x",
"submitted_count": "x",
"saved_count": "x",
"unread_count": "x",
"dropout_rate": "x",
"average_completion_time": null,
"is_uptodate": "x"
}
},
"User": {
"username": "somedude@example.com"
}
},
{
"Form": {
"id": "x",
"version_id": "x",
"name": "x",
"category": "",
"subcategory": null,
"is_template": null,
"moderation_status": "x",
"display_status": "x",
"use_ssl": "x",
"modified": "x",
"Aggregate_metadata": {
"id": "x",
"response_count": "x",
"submitted_count": "x",
"saved_count": "x",
"unread_count": "x",
"dropout_rate": "x",
"average_completion_time": null,
"is_uptodate": "x"
}
},
"User": {
"username": "somedude@example.com"
}
}
]
}
Output:
Form/id,Form/version_id,Form/name,Form/category,Form/subcategory,Form/is_template,Form/moderation_status,Form/display_status,Form/use_ssl,Form/modified,Form/Aggregate_metadata/id,Form/Aggregate_metadata/response_count,Form/Aggregate_metadata/submitted_count,Form/Aggregate_metadata/saved_count,Form/Aggregate_metadata/unread_count,Form/Aggregate_metadata/dropout_rate,Form/Aggregate_metadata/average_completion_time,Form/Aggregate_metadata/is_uptodate,User/username
x,x,x,"","","",x,x,x,x,x,x,x,x,x,x,"",x,somedude@example.com
x,x,x,"","","",x,x,x,x,x,x,x,x,x,x,"",x,somedude@example.com
This is handle headers & nested json.
require 'csv'
require 'json'
@headers = []
file = File.open('file.json')
JSON.parse(file.read).each do |h|
h.keys.each do |key|
@headers << key
end
end
uniq_headers = @headers.uniq
file = File.open('file.json')
finalrow = []
JSON.parse(file.read).each do |h|
final = {}
@headers.each do |key2|
final[key2] = h[key2]
end
finalrow << final
end
CSV.open('output.csv' , 'w') do |csv|
csv << uniq_headers
finalrow.each do |deal|
csv << deal.values
end
end
Try something like this:
require 'csv'
require 'json'
csv_string = CSV.generate do |csv|
JSON.parse(File.open("foo.json").read).each do |hash|
csv << hash.values
end
end
puts csv_string
Ruby code to convert from json to csv.
Handles arrays by converting them to double quoted strings.
#json-to-csv.rb
require 'json'
require 'csv'
file = File.read('./input.json')
hash = JSON.parse(file)
CSV.open('./output.csv', 'w') do |csv|
headers = hash.first.keys
csv << headers
hash.each do |item|
values = item.values
printable_values = Array.new
values.each do |value|
printable_values << value.to_s.gsub(/\[|\]/,'').gsub(/"/,'\'')
end
csv << printable_values
end
end
To actually write to file...
require 'csv'
require 'json'
CSV.open("your_csv.csv", "w") do |csv| #open new file for write
JSON.parse(File.open("your_json.json").read).each do |hash| #open json to parse
csv << hash.values #write value to file
end
end
Based on @Alex's answer but adding csv headers and example test.
# utils.rb
require "csv"
module Utils
def self.array_of_hashes_to_csv(array_of_hashes)
CSV.generate do |csv|
csv << array_of_hashes.first.keys
array_of_hashes.each { |hash| csv << hash.values }
end
end
end
# utils_test.rb
class UtilsTest < MiniTest::Unit::TestCase
def test_array_of_hashes_to_csv
array_of_hashes = [
{ :key1 => "value1", :key2 => "value2" },
{ :key1 => "value3", :key2 => "value4" }
]
expected_result = "key1,key2\nvalue1,value2\nvalue3,value4\n"
assert_equal(expected_result, Utils.array_of_hashes_to_csv(array_of_hashes))
end
end