“OutOfMemoryError: GC overhead limit exceeded”: parse large json file with java

北荒 2021-01-24 13:36

I am trying to parse a large JSON file (more than 600 MB) with Java. My JSON file looks like this:

{
    "0" : {"link_id": "2381317", "overview": "mjklmk
3 Answers
  • 2021-01-24 13:46

    Increase the JVM heap space by setting the environment variable:

    SET _JAVA_OPTIONS=-Xms512m -Xmx1024m


    But this can't be a permanent solution, as your file may keep growing in the future.
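    The same options can also be passed directly on the `java` command line instead of through the environment variable; the jar and file names below are placeholders, not from the question:

```shell
# Start with a 512 MB heap and allow it to grow to 1024 MB.
# (myparser.jar and bigfile.json stand in for your actual program and input.)
java -Xms512m -Xmx1024m -jar myparser.jar bigfile.json
```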

  • If you have to read huge JSON files you can't keep all the information in memory. Extending memory can be a solution for a 1 GB file, but what if tomorrow the file is 2 GB?

    The right approach to this problem is to parse the JSON element by element using a streaming parser. Instead of loading the whole JSON into memory and building one big object representing it, you read single elements of the JSON and convert them to objects step by step.

    Here you can find a nice article explaining how to do it with the Jackson library.
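    As a sketch of that element-by-element approach with Jackson's streaming API (`jackson-core` on the classpath), assuming the file has the flat shape shown in the question; the class name and sample values are illustrative, not from the answer:

```java
// Minimal sketch of Jackson's streaming API; sample data is hypothetical.
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import java.util.ArrayList;
import java.util.List;

public class StreamingDemo {

    // Walk a top-level object shaped like {"0": {"link_id": ...}, "1": {...}}
    // and collect every link_id without materializing the whole document.
    public static List<String> extractLinkIds(JsonParser parser) throws Exception {
        List<String> ids = new ArrayList<>();
        parser.nextToken();                              // consume the opening '{'
        while (parser.nextToken() != JsonToken.END_OBJECT) {
            parser.nextToken();                          // move onto the entry's '{'
            while (parser.nextToken() != JsonToken.END_OBJECT) {
                String field = parser.getCurrentName();  // "link_id", "overview", ...
                parser.nextToken();                      // move onto the value
                if ("link_id".equals(field)) {
                    ids.add(parser.getText());
                } else {
                    parser.skipChildren();               // no-op for scalars; skips nested values
                }
            }
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        String json = "{\"0\": {\"link_id\": \"2381317\", \"overview\": \"mjklmk\"},"
                    + " \"1\": {\"link_id\": \"2381318\", \"overview\": \"other\"}}";
        JsonFactory factory = new JsonFactory();
        // For the real 600 MB file, pass a java.io.File or InputStream here
        // instead of a String, e.g. factory.createParser(new File("big.json")).
        try (JsonParser parser = factory.createParser(json)) {
            System.out.println(extractLinkIds(parser));  // [2381317, 2381318]
        }
    }
}
```

    Only one entry's worth of data is held at a time, so memory use stays flat no matter how large the file grows.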

  • 2021-01-24 14:03

    You have two choices:

    1. Give more memory to the Java program by specifying the -Xmx argument, e.g. -Xmx1g to give it 1 GB of memory.
    2. Use a "streaming" JSON parser. This will scale to arbitrarily large JSON files.

    json-simple has a streaming API. See https://code.google.com/p/json-simple/wiki/DecodingExamples#Example_5_-_Stoppable_SAX-like_content_handler

    There are other libraries with good streaming parsers, e.g. Jackson.
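    Gson is another such library (not named in the answer); its `JsonReader` offers the same SAX-like, token-by-token model. A minimal sketch, assuming the question's flat structure; the class name and sample data are hypothetical:

```java
// Minimal sketch of Gson's streaming JsonReader (com.google.gson:gson);
// sample data and class name are illustrative, not from the answer.
import com.google.gson.stream.JsonReader;
import java.io.StringReader;

public class GsonStreamingDemo {
    public static void main(String[] args) throws Exception {
        String json = "{\"0\": {\"link_id\": \"2381317\", \"overview\": \"mjklmk\"}}";
        // For a 600 MB file, wrap a FileReader / InputStreamReader instead.
        try (JsonReader reader = new JsonReader(new StringReader(json))) {
            reader.beginObject();                    // top-level '{'
            while (reader.hasNext()) {
                String key = reader.nextName();      // "0", "1", ...
                reader.beginObject();                // entry '{'
                while (reader.hasNext()) {
                    if (reader.nextName().equals("link_id")) {
                        System.out.println(key + " -> " + reader.nextString());
                    } else {
                        reader.skipValue();          // don't buffer fields we don't need
                    }
                }
                reader.endObject();
            }
            reader.endObject();
        }
    }
}
```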
