2 years ago
#57478
rfengineer
How to read large CSVs in multiple zip files quickly using pandas?
I'm using the code below to read multiple CSVs from multiple zip files (one CSV per zip file). The data is huge (each CSV is 1.5 GB, and I have more than 30 zipped CSVs), which is why I prefer not to unzip them. However, it gets very slow when I concatenate many of them. Is there more efficient code to speed this up?
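import os
import glob
import pandas as pd

# Work from the folder that holds the zip archives
os.chdir(r"path")
extension = 'zip'
# glob.glob already returns a list of matching file names
all_filenames = glob.glob('*.{}'.format(extension))
# Each archive holds one CSV; pandas infers zip compression from the extension
combined_csv = pd.concat([pd.read_csv(f) for f in all_filenames])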
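For comparison, here is a minimal sketch of one variation that might be worth trying. It is not a verified fix. It reads each archive in chunks and passes any extra keyword arguments (for example `usecols` or `dtype`) through to `pd.read_csv`, so that only the needed columns are parsed. The names `iter_zipped_csv_chunks` and `load_all`, the `folder` argument, and the default `chunksize` are assumptions made for illustration. Whether this helps depends on how much of the time goes to decompression versus parsing and memory pressure.

```python
import glob
import os

import pandas as pd


def iter_zipped_csv_chunks(folder, chunksize=1_000_000, **read_csv_kwargs):
    """Yield DataFrame chunks from every single-CSV zip archive in `folder`.

    `folder` and `chunksize` are illustrative parameters, not taken
    from the original post.
    """
    for path in sorted(glob.glob(os.path.join(folder, "*.zip"))):
        # compression="zip" expects exactly one data file per archive,
        # which matches the one-CSV-per-zip setup described above.
        reader = pd.read_csv(
            path, compression="zip", chunksize=chunksize, **read_csv_kwargs
        )
        for chunk in reader:
            yield chunk


def load_all(folder, chunksize=1_000_000, **read_csv_kwargs):
    """Concatenate all chunks once, with a fresh 0..n-1 index."""
    chunks = list(iter_zipped_csv_chunks(folder, chunksize, **read_csv_kwargs))
    return pd.concat(chunks, ignore_index=True)
```

With something like `load_all("path", usecols=[...], dtype={...})`, the column names and dtypes would have to come from the actual files. If the combined frame still does not fit in memory, the chunks from `iter_zipped_csv_chunks` could be aggregated or written out one at a time instead of being concatenated.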
python
pandas
csv
zip
glob
0 Answers