Issues parsing a 1GB JSON file using JSON.NET

闹比i 2021-01-28 02:29

I have gotten an application whose input has been scaled up from 50K location records to 1.1 million location records. This has caused serious issues, as the entire file was

2 Answers
  • 2021-01-28 03:04

    Thanks for all the help. I've managed to get it doing what I want, which is deserializing individual location objects.

    If the item is loaded as a JObject, the reader consumes the full object, which can then be deserialized; repeating this in a loop processes the file one location at a time.

    This is the code I settled on:

    while (reader.Read())
    {
        // Depth 2 matches objects that are entries of an array held by the root object.
        if (reader.TokenType == JsonToken.StartObject && reader.Depth == 2)
        {
            // Load only this one object into memory, then map it to a Location.
            location = JObject.Load(reader).ToObject<Location>();
    
            var lv = new LocationValidator(location, FootprintInfo.OperatorId, FootprintInfo.RoamingGroups, true);
            var vr = lv.IsValid();
            if (vr.Successful)
            {
                yield return location;
            }
            else
            {
                errors.Add(new Error(elNumber, location.LocationId, vr.Error.Field, vr.Error.Detail));
                // Stop reading once too many validation errors have accumulated.
                if (errors.Count >= maxErrors)
                {
                    yield break;
                }
            }
    
            ++elNumber;
        }
    }
    
  • 2021-01-28 03:28

    When the reader is positioned at the beginning of the object you want to deserialize (an entry in the Locations array in your case), you can just call ser.Deserialize<T>(reader) and it will work, advancing to the end of the object at that level, and no further. Thus the following should iterate through the Location objects in your file, loading each one separately:

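        using System.Collections.Generic;
        using System.IO;
        using Newtonsoft.Json;

        public static IEnumerable<T> DeserializeNestedItems<T>(TextReader textReader)
        {
            var ser = new JsonSerializer();
            using (var reader = new JsonTextReader(textReader))
            {
                // Lets the reader continue if the stream holds more than one root value.
                reader.SupportMultipleContent = true;
    
                while (reader.Read())
                {
                    // Depth 2 matches each entry of an array held by the root object.
                    if (reader.TokenType == JsonToken.StartObject && reader.Depth == 2)
                    {
                        // Reads exactly one object and leaves the reader at its end.
                        var item = ser.Deserialize<T>(reader);
                        yield return item;
                    }
                }
            }
        }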
    

    An example of use with your test string:

            // Requires System.Diagnostics and System.Linq.
            Debug.Assert(DeserializeNestedItems<Location>(new StringReader(json)).Count() == 2); // The assertion passes.
    
            var list = DeserializeNestedItems<Location>(new StringReader(json))
                .SelectMany(l => l.AccessPoints)
                .Select(a => new { a.Latitude, a.Longitude })
                .ToList();
    
            Debug.WriteLine(JsonConvert.SerializeObject(list, Formatting.Indented));
    

    Which outputs:

    [
      {
        "Latitude": 40.59485,
        "Longitude": -73.96174
      },
      {
        "Latitude": 40.59485,
        "Longitude": -73.96174
      }
    ]
    

    Note: the Location class was generated by pasting your JSON into http://json2csharp.com/.
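
    Since the generated classes are not shown in this thread, here is a minimal sketch of what they might look like. It is inferred only from the members the code above touches (LocationId, AccessPoints, Latitude, Longitude). The AccessPoint class name and all property types are assumptions, and the real generated classes will carry whatever other properties your JSON contains.

```csharp
using System.Collections.Generic;

// Hypothetical shape: only the members referenced in this thread are listed.
public class AccessPoint
{
    public double Latitude { get; set; }   // type assumed from the sample output
    public double Longitude { get; set; }  // type assumed from the sample output
}

public class Location
{
    public string LocationId { get; set; }              // type assumed
    public List<AccessPoint> AccessPoints { get; set; } // element class name assumed
}
```

    For the real 1GB file you would probably pass a StreamReader (for example from File.OpenText) to DeserializeNestedItems instead of a StringReader, so that the input is never held in memory as one string.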
