Convert .json to .csv in ruby

前端 未结 7 1411
心在旅途
心在旅途 2020-12-24 09:42

I want to convert .json file into .csv file using ruby. Pleases help me to do this.

Also propose any tool to achieve this.

相关标签:
7条回答
  • 2020-12-24 09:49

    Edit:

    This functionality described below is now available as a gem. After installing with gem install json_converter, the following snippet can be used to generate a CSV from a valid JSON string or object:

    require 'json_converter'
    json_converter= JsonConverter.new
    
    # Assume json is a valid JSON string or object
    csv = json_converter.generate_csv json
    

    Original Answer:

    If your JSON data is relatively simple (no nesting or arrays), Alex's answer is probably the cleanest way of handling this problem.

    However, if you do need to take arrays and nested objects into account, I've attempted to port a web version of such a converter to ruby. It can be found here. The methods that handle the actual restructuring of data are array_from and flatten.

    The array_from method attempts to identify what a "row" of data looks like for a given dataset. It is not perfect, and you may want to tweak this part for different datasets.

    # Attempt to identify what a "row" should look like
    def array_from(json)
      queue, next_item = [], json
      while !next_item.nil?
    
        return next_item if next_item.is_a? Array
    
        if next_item.is_a? Hash
          next_item.each do |k, v|
            queue.push next_item[k]
          end
        end
    
        next_item = queue.shift
      end
    
      return [json]
    end
    

    The flatten method recursively iterates over the JSON object(s), and generates an object that represents headers and values. If an object is nested, the header for its column will be prefixed with its parent key(s), delimited by the / character.

    # The path argument is used to construct header columns for nested elements
    def flatten(object, path='')
      scalars = [String, Integer, Fixnum, FalseClass, TrueClass]
      columns = {}
    
      if [Hash, Array].include? object.class
        object.each do |k, v|
          new_columns = flatten(v, "#{path}#{k}/") if object.class == Hash
          new_columns = flatten(k, "#{path}#{k}/") if object.class == Array
          columns = columns.merge new_columns
        end
    
        return columns
      elsif scalars.include? object.class
          # Remove trailing slash from path
          end_path = path[0, path.length - 1]
          columns[end_path] = object
          return columns
      else
        return {}
      end
    end
    

    If there are any null values in the original JSON, you'll need to convert these to something other than nil before attempting the conversion - you'll generally end up with uneven rows if you don't. The nils_to_strings method handles that:

    # Recursively convert all nil values of a hash to empty strings
    def nils_to_strings(hash)
      hash.each_with_object({}) do |(k,v), object|
        case v
        when Hash
          object[k] = nils_to_strings v
        when nil
          object[k] = ''
        else
          object[k] = v
        end
      end
    end
    

    Here's a brief example of how this would be used:

    json = JSON.parse(File.open('in.json').read)
    in_array = array_from json
    in_array.map! { |x| nils_to_strings x }
    
    out_array = []
    in_array.each do |row|
      out_array[out_array.length] = flatten row
    end
    
    headers_written = false
    CSV.open('out.csv', 'w') do |csv|
      out_array.each do |row|
        csv << row.keys && headers_written = true if headers_written === false
        csv << row.values
      end
    end
    

    And finally, here's some example input/output:

    Input:

    {
      "Forms": [
        {
          "Form": {
            "id": "x",
            "version_id": "x",
            "name": "x",
            "category": "",
            "subcategory": null,
            "is_template": null,
            "moderation_status": "x",
            "display_status": "x",
            "use_ssl": "x",
            "modified": "x",
            "Aggregate_metadata": {
              "id": "x",
              "response_count": "x",
              "submitted_count": "x",
              "saved_count": "x",
              "unread_count": "x",
              "dropout_rate": "x",
              "average_completion_time": null,
              "is_uptodate": "x"
            }
          },
          "User": {
            "username": "somedude@example.com"
          }
        },
        {
          "Form": {
            "id": "x",
            "version_id": "x",
            "name": "x",
            "category": "",
            "subcategory": null,
            "is_template": null,
            "moderation_status": "x",
            "display_status": "x",
            "use_ssl": "x",
            "modified": "x",
            "Aggregate_metadata": {
              "id": "x",
              "response_count": "x",
              "submitted_count": "x",
              "saved_count": "x",
              "unread_count": "x",
              "dropout_rate": "x",
              "average_completion_time": null,
              "is_uptodate": "x"
            }
          },
          "User": {
            "username": "somedude@example.com"
          }
        }
      ]
    }
    

    Output:

    Form/id,Form/version_id,Form/name,Form/category,Form/subcategory,Form/is_template,Form/moderation_status,Form/display_status,Form/use_ssl,Form/modified,Form/Aggregate_metadata/id,Form/Aggregate_metadata/response_count,Form/Aggregate_metadata/submitted_count,Form/Aggregate_metadata/saved_count,Form/Aggregate_metadata/unread_count,Form/Aggregate_metadata/dropout_rate,Form/Aggregate_metadata/average_completion_time,Form/Aggregate_metadata/is_uptodate,User/username
    x,x,x,"","","",x,x,x,x,x,x,x,x,x,x,"",x,somedude@example.com
    x,x,x,"","","",x,x,x,x,x,x,x,x,x,x,"",x,somedude@example.com
    
    0 讨论(0)
  • 2020-12-24 09:49

    This is handle headers & nested json.

    require 'csv'
    require 'json'
    
    @headers = []
    file = File.open('file.json')
    JSON.parse(file.read).each do |h|
      h.keys.each do |key|
        @headers << key
      end
    end
    
    uniq_headers = @headers.uniq
    file = File.open('file.json')
    finalrow = []
    
    JSON.parse(file.read).each do |h|
      final = {}
      @headers.each do |key2|
        final[key2] = h[key2]
      end
    
      finalrow << final
    end
    
    CSV.open('output.csv' , 'w') do |csv|
      csv << uniq_headers
      finalrow.each do |deal|
        csv << deal.values
      end
    end
    
    0 讨论(0)
  • 2020-12-24 09:54

    Try something like this:

    require 'csv'
    require 'json'
    
    csv_string = CSV.generate do |csv|
      JSON.parse(File.open("foo.json").read).each do |hash|
        csv << hash.values
      end
    end
    
    puts csv_string
    
    0 讨论(0)
  • 2020-12-24 09:56

    Ruby code to convert from json to csv.

    Handles arrays by converting them to double quoted strings.

    #json-to-csv.rb
    require 'json'
    require 'csv'
    
    file = File.read('./input.json')
    hash = JSON.parse(file)
    
    CSV.open('./output.csv', 'w') do |csv|
      headers = hash.first.keys
      csv << headers
    
      hash.each do |item|
        values = item.values
        printable_values = Array.new
        values.each do |value|
    
          printable_values << value.to_s.gsub(/\[|\]/,'').gsub(/"/,'\'')
    
        end
    
        csv << printable_values
    
      end
    
    end
    
    0 讨论(0)
  • 2020-12-24 10:01

    To actually write to file...

    require 'csv'
    require 'json'
    
    CSV.open("your_csv.csv", "w") do |csv| #open new file for write
      JSON.parse(File.open("your_json.json").read).each do |hash| #open json to parse
        csv << hash.values #write value to file
      end
    end
    
    0 讨论(0)
  • 2020-12-24 10:01

    Based on @Alex's answer but adding csv headers and example test.

    # utils.rb
    require "csv"
    module Utils
      def self.array_of_hashes_to_csv(array_of_hashes)
        CSV.generate do |csv|
          csv << array_of_hashes.first.keys
          array_of_hashes.each { |hash| csv << hash.values }
        end
      end
    end
    
    # utils_test.rb
    class UtilsTest < MiniTest::Unit::TestCase
      def test_array_of_hashes_to_csv
        array_of_hashes = [
          { :key1 => "value1", :key2 => "value2" },
          { :key1 => "value3", :key2 => "value4" }
        ]
        expected_result = "key1,key2\nvalue1,value2\nvalue3,value4\n"
        assert_equal(expected_result, Utils.array_of_hashes_to_csv(array_of_hashes))
      end
    end
    
    0 讨论(0)
提交回复
热议问题