python (12.9k questions)
javascript (9.2k questions)
reactjs (4.7k questions)
java (4.2k questions)
c# (3.5k questions)
html (3.3k questions)
How to get vocabulary size of word2vec?
I have a pretrained word2vec model in PySpark and I would like to know how big its vocabulary is (and perhaps get a list of the words in the vocabulary).
Is this possible? I would guess it has to be store...

Luca Clissa
Votes: 0
Answers: 1
How to troubleshoot this PySpark GLM error?
Trying to run a GLM using the Poisson family and a log link function, and getting the following errors:
2022-01-11 15:56:55,143 root ERROR An error occurred while calling o266.fit.
: java.lang.NullPointerExc...
Evan Zamir
Votes: 0
Answers: 1
Remove specific stopwords Pyspark
I'm new to PySpark and would like to remove some French stopwords from a PySpark column.
Due to some constraints, I can't use NLTK or spaCy; StopWordsRemover is the only option I have.
Below is what I have tried...

A2N15
Votes: 0
Answers: 2
How best to shuffle columns in pyspark to calculate permutation feature importance
It seems that PySpark ML doesn't have a built-in permutation feature importance method. So I want to code this up, and to do so I have to shuffle each column in the dataframe individually. I found this r...
Gaurav Bansal
Votes: 0
Answers: 1
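
For readers unfamiliar with the technique named in the last question, here is a minimal sketch of permutation feature importance in plain Python, not PySpark. It only illustrates the idea of shuffling one column at a time and measuring how much a metric drops. The function names (permutation_importance, accuracy), the toy rows, and the threshold "model" are all hypothetical and are not taken from the question or its answer.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(rows, feature_names, target, predict, metric, seed=0):
    """Return {feature: baseline_score - score_after_shuffling_that_feature}.

    rows is a list of dicts, predict maps one row to a prediction, and
    metric(y_true, y_pred) returns a score where higher is better.
    """
    rng = random.Random(seed)
    y_true = [row[target] for row in rows]
    baseline = metric(y_true, [predict(row) for row in rows])

    importances = {}
    for name in feature_names:
        # Shuffle only this column; every other column stays aligned with its row.
        shuffled_values = [row[name] for row in rows]
        rng.shuffle(shuffled_values)
        permuted_rows = [
            dict(row, **{name: value}) for row, value in zip(rows, shuffled_values)
        ]
        score = metric(y_true, [predict(row) for row in permuted_rows])
        importances[name] = baseline - score
    return importances


# Toy data (hypothetical): the label depends on x1 only; x2 is unrelated.
rows = [
    {"x1": i / 10, "x2": (i * 7) % 10, "label": int(i / 10 > 0.5)}
    for i in range(10)
]


def predict(row):
    # Stand-in for a fitted model: a fixed threshold on x1.
    return int(row["x1"] > 0.5)


importances = permutation_importance(rows, ["x1", "x2"], "label", predict, accuracy)
print(importances)
```

Because the stand-in model never reads x2, shuffling x2 cannot change its predictions, so its importance is exactly zero. On a Spark dataframe the shuffle step would have to be done differently, since rows have no fixed order there. This sketch does not attempt that.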