python (12.9k questions)
javascript (9.2k questions)
reactjs (4.7k questions)
java (4.2k questions)
c# (3.5k questions)
html (3.3k questions)
Opening arrow files using vaex slower and using more memory than expected
I have multiple .arrow files, each about 1GB (total filesize is larger than my RAM). I tried to open all of them using vaex.open_many() to read them into a single dataframe, and saw that the memory us...
Rayne
Votes: 0
Answers: 0
Pyarrow basic auth: How to prevent `Stream is closed`?
I am new to Arrow Flight and pyarrow (v=6.0.1), and am trying to implement basic auth but I am always getting an error:
OSError: Stream is closed
I have created a minimal reproducing sample, by runni...
JBSnorro
Votes: 0
Answers: 2
Chunked tokenization in huggingface has an arrow error
I'm following the code from this video at 1m25s, which shows:
def tokenize_and_chunk(texts):
    return tokenizer(
        texts["text"], truncation=True, max_length=context_length,
        return ove...
Mittenchops
Votes: 0
Answers: 1
Ray Dataset: ArrowInvalid: Unrecognized filesystem type in URI: gs://
I've recently begun exploring Ray and am having trouble reading data from my GCS bucket.
Here is the code:
ray.data.read_parquet("gs://path")
Here is the error:
ArrowInvalid: Unrecogni...
madst
Votes: 0
Answers: 1