Stream a Massive Gzipped Json File in Python
I found myself with the problem of having to process massive JSON files that were gzipped. The files were 6.8GB gzipped and over 50GB un-gzipped (de-gzipped?).
I found myself here because a new regulatory requirement called “Transparency in Coverage” that took effect on July 1, 2022 for Insurance companies to post pricing information that is machine readable. There is a guide with examples out on GitHub. Unfortunately the examples in no way reflect the actual size of the data.