Local Spark talking to remote HDFS?


I have a file in HDFS inside a Hortonworks HDP 2.3_1 VirtualBox VM.

If I go to the guest's spark-shell and refer to the file like this, it works fine:

    val words = sc.textFile("hdfs:///tmp/people.txt")
    words.count

However, if I try to access the file from a local Spark app on my Windows host, it doesn't work:

    val conf = new SparkConf().setMaster("local").setAppName("My App")
    val sc = new SparkContext(conf)
    val words = sc.textFile("hdfs://localhost:8020/tmp/people.txt")
    words.count

This emits:

    Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-452094660-10.0.2.15-1437494483194:blk_1073742905_2098 file=/tmp/people.txt
        at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:838)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:526)
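If I'm reading the trace right, the NameNode lookup itself succeeds and it is the block read from a DataNode that fails (note the guest-internal 10.0.2.15 in the block pool ID). A minimal diagnostic sketch, assuming the standard Hadoop FileSystem API and the same localhost:8020 endpoint, would show which DataNode addresses the client is being handed:

    import java.net.URI
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    // Metadata calls go to the NameNode over the forwarded port 8020, so
    // listing block locations should succeed; the printed hosts are what
    // the HDFS client then tries to read the actual blocks from.
    val fs = FileSystem.get(new URI("hdfs://localhost:8020"), new Configuration())
    val status = fs.getFileStatus(new Path("/tmp/people.txt"))
    val locations = fs.getFileBlockLocations(status, 0, status.getLen)
    locations.foreach(loc => println(loc.getHosts.mkString(", ")))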

Port 8020 is open, and if I choose a wrong file name, it tells me:

    Input path does not exist: hdfs://localhost:8020/tmp/people.txt!!

localhost:8020 should be correct, since the guest HDP VM has NAT port forwarding to the host Windows box.

And it is giving me the appropriate exception when I supply a wrong file name, so the connection to the NameNode itself seems fine.
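If the problem is that the NameNode hands back the guest-internal 10.0.2.15 address for the DataNode, one client-side workaround I've seen suggested is telling the HDFS client to contact DataNodes by hostname instead, so the name can be remapped on the host. A sketch, assuming the DataNode's hostname is pointed at 127.0.0.1 in the Windows hosts file and its transfer port (50010 by default) is NAT-forwarded alongside 8020:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().setMaster("local").setAppName("My App")
    val sc = new SparkContext(conf)

    // Connect to DataNodes by hostname rather than the guest-internal IP
    // the NameNode reports; the hostname must then resolve to 127.0.0.1
    // on the host, and the DataNode port must be forwarded like 8020.
    sc.hadoopConfiguration.set("dfs.client.use.datanode.hostname", "true")

    val words = sc.textFile("hdfs://localhost:8020/tmp/people.txt")
    println(words.count())

I haven't verified this against the sandbox; it's just the direction the BlockMissingException seems to point in.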

My pom has:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>1.4.1</version>
        <scope>provided</scope>
    </dependency>
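In case it matters: spark-core 1.4.1 pulls in an older hadoop-client by default (2.2.0, if I remember the Spark poms correctly), while the HDP 2.3 sandbox runs Hadoop 2.7.x, so a client/cluster version mismatch is one variable I could rule out by pinning the client explicitly. A sketch (the 2.7.1 version is my assumption based on HDP 2.3):

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.7.1</version>
        <scope>provided</scope>
    </dependency>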

What am I doing wrong, and what is the BlockMissingException trying to tell me?

