2 years ago
#61576

Abdullah Odibat
How to read LZO-compressed files in PySpark
I am using PySpark 3.1.2 in a virtual environment.
I am trying to read LZO-compressed files, but I can't find proper documentation on how to do that. I understand that, because of licensing issues, the LZO codec has to be added to Spark manually, but I can't find step-by-step documentation for doing so.
I have already checked the question "Read Lzo file in PySpark", but it did not really help me: I would like to add the codec to Spark itself so that I can read a directory of LZO files the same way I do with Parquet and JSON.
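To make the goal concrete, here is a minimal sketch of the kind of setup being asked about. It assumes a hadoop-lzo jar has been built separately and that the native LZO library is installed on every node. The jar path, the data path, and the helper names `lzo_spark_conf` and `build_session` are hypothetical. Whether these settings alone are enough for PySpark 3.1.2 is exactly what the question is trying to find out.

```python
def lzo_spark_conf(hadoop_lzo_jar):
    """Return Spark settings intended to register the hadoop-lzo codecs.

    hadoop_lzo_jar is a hypothetical path to a separately built hadoop-lzo jar.
    The native LZO library must also be available on every node (not shown here).
    """
    codecs = ",".join([
        "org.apache.hadoop.io.compress.DefaultCodec",
        "org.apache.hadoop.io.compress.GzipCodec",
        "com.hadoop.compression.lzo.LzoCodec",
        "com.hadoop.compression.lzo.LzopCodec",  # handles files ending in .lzo
    ])
    return {
        # Ship the jar to the driver and the executors.
        "spark.jars": hadoop_lzo_jar,
        # The spark.hadoop.* prefix forwards a property to the Hadoop configuration.
        "spark.hadoop.io.compression.codecs": codecs,
        "spark.hadoop.io.compression.codec.lzo.class":
            "com.hadoop.compression.lzo.LzoCodec",
    }


def build_session(hadoop_lzo_jar):
    """Create a SparkSession with the settings above applied."""
    # Imported here so the config helper can be inspected without PySpark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("read-lzo")
    for key, value in lzo_spark_conf(hadoop_lzo_jar).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

The intended usage would then be something like `spark = build_session("/path/to/hadoop-lzo.jar")` followed by `spark.read.text("/data/logs/*.lzo")`, so that a whole directory is read in one call.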
Any help will be appreciated :)
apache-spark
pyspark
compression
virtualenv
lzo
0 Answers