I have a large JSON file, about 5 million records and about 32 GB in size, that I need to load into our Snowflake Data Warehouse. I need to get this file broken up into smaller files so that it can be loaded.
Snowflake handles JSON in a very particular way, and once you understand it, the design is easy to sketch out.
When loading the JSON data from the stage, set strip_outer_array = true in the file format so that the outer array is removed and each element becomes its own row (see the COPY statement below).
Each row cannot exceed 16 MB compressed when loaded into Snowflake. Use a utility that splits the file on line boundaries and keeps each output file no larger than about 100 MB; that gives you parallel loading without breaking any record apart. For your data set, that works out to roughly 320 files of ~100 MB each (32 GB / 100 MB). Review your warehouse configuration and throughput, and see Snowflake's best practices for loading semi-structured data. A splitting sketch follows below.
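Here is a minimal splitting sketch in Python. It assumes the records are newline-delimited (one JSON document per line), and the file and directory names are placeholders; if the source is instead a single top-level JSON array, use a streaming parser such as ijson rather than splitting on lines:

```python
import os

SRC = "big_file.json"            # ~32 GB source file, one record per line (assumed)
OUT_DIR = "chunks"               # output directory for the split files
MAX_BYTES = 100 * 1024 * 1024    # ~100 MB cap per output file, per the advice above

os.makedirs(OUT_DIR, exist_ok=True)

chunk_idx, written = 0, 0
out = open(os.path.join(OUT_DIR, f"chunk_{chunk_idx:05d}.json"), "wb")

with open(SRC, "rb") as src:
    for line in src:             # streams line by line; never holds 32 GB in memory
        # Roll over to a new chunk once the current one would exceed the cap,
        # but never split a single record across two files.
        if written and written + len(line) > MAX_BYTES:
            out.close()
            chunk_idx += 1
            written = 0
            out = open(os.path.join(OUT_DIR, f"chunk_{chunk_idx:05d}.json"), "wb")
        out.write(line)
        written += len(line)

out.close()
```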
Once the split files are uploaded to the stage, the load itself is a single COPY (table name and stage path are placeholders):

```sql
copy into <table_name>
from @~/<path_to_split_files>/
file_format = (type = json strip_outer_array = true);
```
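If you drive the whole flow from Python, it looks roughly like the sketch below, which uses the snowflake-connector-python package; every connection parameter, the json_chunks stage path, and the my_json_table table name are assumptions:

```python
import snowflake.connector

# Connection parameters are placeholders; use your own account details.
conn = snowflake.connector.connect(
    account="my_account",
    user="my_user",
    password="...",
    warehouse="my_wh",
    database="my_db",
    schema="public",
)
cur = conn.cursor()

# Upload the split files to the user stage; PUT gzips them by default
# and uploads several files in parallel.
cur.execute("PUT file:///path/to/chunks/chunk_*.json @~/json_chunks/ PARALLEL=8")

# Bulk load: Snowflake loads the staged files in parallel across the warehouse.
cur.execute("""
    copy into my_json_table
    from @~/json_chunks/
    file_format = (type = json strip_outer_array = true)
""")

cur.close()
conn.close()
```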