python - Error training a logistic regression model on Apache Spark (SPARK-5063)


I am trying to build a logistic regression model with Apache Spark. Here is the code.

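    from pyspark.mllib.classification import LogisticRegressionWithSGD
    from pyspark.mllib.feature import StandardScaler
    from pyspark.mllib.regression import LabeledPoint

    # mapper is a function that turns each raw record into a LabeledPoint
    # (a label plus a feature vector)
    parseddata = raw_data.map(mapper)

    # the feature vectors from the parsed data
    featurevectors = parseddata.map(lambda point: point.features)

    # this creates a standardization model to scale the features
    scaler = StandardScaler(True, True).fit(featurevectors)

    # transform the features to mean 0 and unit standard deviation
    # (this is the line that raises the error: scaler is used inside map)
    scaleddata = parseddata.map(
        lambda lp: LabeledPoint(lp.label, scaler.transform(lp.features)))

    modelscaledsgd = LogisticRegressionWithSGD.train(scaleddata, iterations=10)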

But I get this error:

    Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that is run on workers. For more information, see SPARK-5063.

I am not sure how to work around this. Any help would be greatly appreciated.

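The problem you see is pretty much the same as the one I've described in "How to use Java/Scala function from an action or a transformation?". To transform a vector, the scaler has to call a Scala function, and that requires access to the SparkContext, hence the error you see when scaler.transform runs inside map on the workers.

The standard way to handle this is to process only the required part of the data (here, the feature vectors) as a whole RDD on the driver side, and then zip the results back together with the labels.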

    labels = parseddata.map(lambda point: point.label)
    # transform the whole RDD at once instead of calling the scaler inside map
    featurestransformed = scaler.transform(featurevectors)

    scaleddata = (labels
        .zip(featurestransformed)
        .map(lambda p: LabeledPoint(p[0], p[1])))

    modelscaledsgd = LogisticRegressionWithSGD.train(...)
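If it helps to see what the split, transform, zip pattern computes without a cluster, here is a rough plain-Python sketch in which lists stand in for the RDDs. The toy data and the helper names fit_scaler and transform are made up for illustration, and dividing by n - 1 (sample standard deviation) is my assumption about how the scaling is done, so treat the numbers as indicative only. The point to take away is that zip pairs elements purely by position. With real RDDs that works here because the labels and the transformed features both come from the same parent RDD through map-style operations, so they keep the same partitioning and element order.

```python
import math

# Hypothetical toy data: (label, feature vector) pairs standing in for parseddata.
parsed = [
    (0.0, [1.0, 10.0]),
    (1.0, [2.0, 20.0]),
    (0.0, [3.0, 30.0]),
    (1.0, [4.0, 40.0]),
]


def fit_scaler(vectors):
    """Return per-column means and sample standard deviations."""
    n = len(vectors)
    columns = list(zip(*vectors))
    means = [sum(col) / n for col in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / (n - 1))
        for col, m in zip(columns, means)
    ]
    return means, stds


def transform(vectors, means, stds):
    """Center and scale every vector; zero-variance columns map to 0.0."""
    return [
        [(x - m) / s if s else 0.0 for x, m, s in zip(vec, means, stds)]
        for vec in vectors
    ]


# Split the pairs into two aligned sequences.
labels = [label for label, _ in parsed]
features = [vec for _, vec in parsed]

# Transform the feature collection as a whole, not element by element
# from inside another transformation.
means, stds = fit_scaler(features)
features_scaled = transform(features, means, stds)

# Pair labels and scaled features back up by position.
scaled = list(zip(labels, features_scaled))
```

Each scaled column ends up with mean 0, and the labels stay attached to the rows they came from.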

If you don't plan to implement your own methods based on MLlib components, it may be easier to use the high-level ML API.

Edit:

There are two possible problems here.

  1. At this point LogisticRegressionWithSGD supports only binomial classification (thanks to eliasah for pointing that out). If you need multiclass classification you can replace it with LogisticRegressionWithLBFGS.
  2. StandardScaler, when it centers the data with the mean as it does here, supports only dense vectors, so it has limited applications.
