@aral I'm actually working on something similar in Python, but I need to process tens to hundreds of millions of JSON Lines records, which requires a different approach...