hadoop - Error processing complex JSON object of Twitter with Pig JsonLoader() of elephant-bird jars


I wanted to process a Twitter JSON object in Pig using the elephant-bird jars, so I wrote the Pig script below.

register '/usr/lib/pig/lib/elephant-bird-hadoop-compat-4.1.jar';
register '/usr/lib/pig/lib/elephant-bird-pig-4.1.jar';
a = load '/user/flume/tweets/data.json' using com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') as mymap;
b = foreach a generate mymap#'id' as id, mymap#'created_at' as createdat;
dump b;

This gave me the error below:

2015-08-25 11:06:34,295 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - hadoopjobid: job_1439883208520_0177
2015-08-25 11:06:34,295 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - processing aliases a,b
2015-08-25 11:06:34,295 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: m: a[3,4],b[4,4] c:  r:
2015-08-25 11:06:34,303 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2015-08-25 11:06:34,303 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - running jobs [job_1439883208520_0177]
2015-08-25 11:07:06,449 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2015-08-25 11:07:06,449 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - running jobs [job_1439883208520_0177]
2015-08-25 11:07:09,458 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - ooops! job has failed! specify -stop_on_failure if want pig stop on failure.
2015-08-25 11:07:09,458 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1439883208520_0177 has failed! stop running dependent jobs
2015-08-25 11:07:09,459 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-08-25 11:07:09,667 [main] INFO  org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - timeline service address: http://trinityhadoopmaster.com:8188/ws/v1/timeline/
2015-08-25 11:07:09,668 [main] INFO  org.apache.hadoop.yarn.client.RMProxy - connecting resourcemanager @ trinityhadoopmaster.com/192.168.1.135:8032
2015-08-25 11:07:09,678 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - application state completed. finalapplicationstatus=failed. redirecting job history server
2015-08-25 11:07:09,779 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: java.lang.ClassNotFoundException: org.json.simple.parser.ParseException
2015-08-25 11:07:09,779 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2015-08-25 11:07:09,780 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - script statistics:

hadoopversion   pigversion      userid  startedat               finishedat              features
2.6.0           0.14.0          hdfs    2015-08-25 11:06:33     2015-08-25 11:07:09     unknown

failed!

failed jobs:
jobid   alias   feature message outputs
job_1439883208520_0177  a,b     map_only        message: job failed!    hdfs://trinityhadoopmaster.com:9000/tmp/temp1554332510/tmp835744559,

input(s):
failed read data "hdfs://trinityhadoopmaster.com:9000/user/flume/tweets/data.json"

output(s):
failed produce result in "hdfs://trinityhadoopmaster.com:9000/tmp/temp1554332510/tmp835744559"

counters:
total records written : 0
total bytes written : 0
spillable memory manager spill count : 0
total bags proactively spilled: 0
total records proactively spilled: 0

job dag:
job_1439883208520_0177

2015-08-25 11:07:09,780 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - failed!
2015-08-25 11:07:09,787 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: unable open iterator alias b. backend error : java.lang.ClassNotFoundException: org.json.simple.parser.ParseException
details @ logfile: /tmp/pig-err.log
grunt>

I have no clue how to approach this. Can anyone help me with it?

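The job fails with a ClassNotFoundException for org.json.simple.parser.ParseException. That class ships in the json-simple jar, which the script in the question never registers. Register elephant-bird-core, google-collections and json-simple alongside the two elephant-bird jars:

register '/tmp/elephant-bird-core-4.1.jar';
register '/tmp/elephant-bird-pig-4.1.jar';
register '/tmp/elephant-bird-hadoop-compat-4.1.jar';
register '/tmp/google-collections-1.0.jar';
register '/tmp/json-simple-1.1.jar';

With these jars registered, it works.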
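
When a job dies with a ClassNotFoundException like the one above, it can help to check which local jar actually contains the missing class before adding register lines. The following is only a rough sketch in Python, not part of Pig or elephant-bird: the helper name jars_providing is made up for illustration, and the jar paths are simply the ones used in this post, so adjust them to your setup.

```python
import zipfile


def jars_providing(class_name, jar_paths):
    """Return the jars (from jar_paths) that contain the given class.

    class_name is a fully qualified Java class name, for example
    "org.json.simple.parser.ParseException". A jar is a zip archive,
    so the class is stored as org/json/simple/parser/ParseException.class.
    """
    entry = class_name.replace(".", "/") + ".class"
    found = []
    for path in jar_paths:
        try:
            with zipfile.ZipFile(path) as jar:
                if entry in jar.namelist():
                    found.append(path)
        except (FileNotFoundError, zipfile.BadZipFile):
            # Skip jars that are missing or unreadable.
            continue
    return found


if __name__ == "__main__":
    # Paths taken from the register statements in this post (assumed to exist).
    candidates = [
        "/tmp/elephant-bird-core-4.1.jar",
        "/tmp/elephant-bird-pig-4.1.jar",
        "/tmp/elephant-bird-hadoop-compat-4.1.jar",
        "/tmp/google-collections-1.0.jar",
        "/tmp/json-simple-1.1.jar",
    ]
    print(jars_providing("org.json.simple.parser.ParseException", candidates))
```

Any jar the script reports is one that should appear in a register statement at the top of the Pig script. An empty list means none of the candidate jars provides the class, so the right jar still has to be found.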

