I'm using the scikit-learn library for the first time and got this error:

    ValueError: empty vocabulary; perhaps the documents only contain stop words

    File "c:\users\a605563\desktop\velibprojetpreso\traitementtwitterdico.py", line 33, in <module>
      x_train_counts = count_vect.fit_transform(filetweets)
    File "c:\python27\lib\site-packages\sklearn\feature_extraction\text.py", line 804, in fit_transform
      self.fixed_vocabulary_)
    File "c:\python27\lib\site-packages\sklearn\feature_extraction\text.py", line 751, in _count_vocab
      raise ValueError("empty vocabulary; perhaps the documents only contain stop words")

but I don't understand why that's happening.

    import sklearn
    from sklearn.feature_extraction.text import CountVectorizer
    import pandas as pd
    import numpy
    import unicodedata
    import nltk

    tweetsfile = open('tweets2015-08-13.csv', 'r+')
    f2 = open('analyzer.txt', 'a')
    print tweetsfile.readline()
    count_vect = CountVectorizer(strip_accents='ascii...
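One way to narrow down an error like this is to check whether the input yields any usable tokens at all before it reaches `fit_transform`. The sketch below is only a diagnostic aid, not part of scikit-learn: `count_tokens` is a hypothetical helper, the sample documents are made up, and the regular expression is assumed to match the documented default `token_pattern` of `CountVectorizer` (tokens of two or more word characters). It does not reproduce accent stripping or any other preprocessing.

```python
import re

# Assumption: this mirrors CountVectorizer's documented default
# token_pattern, which keeps only tokens of two or more word characters.
DEFAULT_TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")


def count_tokens(documents, stop_words=()):
    """Return how many tokens survive per document (hypothetical helper)."""
    stop = set(stop_words)
    counts = []
    for doc in documents:
        # Lowercase first, then tokenize, then drop stop words.
        tokens = DEFAULT_TOKEN_PATTERN.findall(doc.lower())
        counts.append(len([t for t in tokens if t not in stop]))
    return counts


# Made-up inputs: single characters and blank lines produce no tokens.
empty_docs = ["a", "?", "\n"]
real_docs = ["velib station is full", "no bikes today"]

empty_counts = count_tokens(empty_docs)
real_counts = count_tokens(real_docs)
print(empty_counts)
print(real_counts)
```

If every count is zero, there is nothing to build a vocabulary from, which is consistent with the message in the traceback. An input iterable that yields no documents at all, such as a file object that has already been read to the end, would leave the vocabulary empty in the same way.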