hadoop - Error processing complex JSON object of Twitter with Pig JsonLoader() of elephant-bird jars
I wanted to process a Twitter JSON object in Pig using the elephant-bird jars, so I wrote the Pig script below.
register '/usr/lib/pig/lib/elephant-bird-hadoop-compat-4.1.jar';
register '/usr/lib/pig/lib/elephant-bird-pig-4.1.jar';
a = load '/user/flume/tweets/data.json' using com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') as mymap;
b = foreach a generate mymap#'id' as id, mymap#'created_at' as createdat;
dump b;
It gave me the error below:
2015-08-25 11:06:34,295 [main] info org.apache.pig.backend.hadoop.executionengine.mapreducelayer.mapreducelauncher - hadoopjobid: job_1439883208520_0177
2015-08-25 11:06:34,295 [main] info org.apache.pig.backend.hadoop.executionengine.mapreducelayer.mapreducelauncher - processing aliases a,b
2015-08-25 11:06:34,295 [main] info org.apache.pig.backend.hadoop.executionengine.mapreducelayer.mapreducelauncher - detailed locations: m: a[3,4],b[4,4] c: r:
2015-08-25 11:06:34,303 [main] info org.apache.pig.backend.hadoop.executionengine.mapreducelayer.mapreducelauncher - 0% complete
2015-08-25 11:06:34,303 [main] info org.apache.pig.backend.hadoop.executionengine.mapreducelayer.mapreducelauncher - running jobs [job_1439883208520_0177]
2015-08-25 11:07:06,449 [main] info org.apache.pig.backend.hadoop.executionengine.mapreducelayer.mapreducelauncher - 50% complete
2015-08-25 11:07:06,449 [main] info org.apache.pig.backend.hadoop.executionengine.mapreducelayer.mapreducelauncher - running jobs [job_1439883208520_0177]
2015-08-25 11:07:09,458 [main] warn org.apache.pig.backend.hadoop.executionengine.mapreducelayer.mapreducelauncher - ooops! job has failed! specify -stop_on_failure if want pig stop on failure.
2015-08-25 11:07:09,458 [main] info org.apache.pig.backend.hadoop.executionengine.mapreducelayer.mapreducelauncher - job job_1439883208520_0177 has failed! stop running dependent jobs
2015-08-25 11:07:09,459 [main] info org.apache.pig.backend.hadoop.executionengine.mapreducelayer.mapreducelauncher - 100% complete
2015-08-25 11:07:09,667 [main] info org.apache.hadoop.yarn.client.api.impl.timelineclientimpl - timeline service address: http://trinityhadoopmaster.com:8188/ws/v1/timeline/
2015-08-25 11:07:09,668 [main] info org.apache.hadoop.yarn.client.rmproxy - connecting resourcemanager @ trinityhadoopmaster.com/192.168.1.135:8032
2015-08-25 11:07:09,678 [main] info org.apache.hadoop.mapred.clientservicedelegate - application state completed. finalapplicationstatus=failed. redirecting job history server
2015-08-25 11:07:09,779 [main] error org.apache.pig.tools.pigstats.pigstats - error 0: java.lang.classnotfoundexception: org.json.simple.parser.parseexception
2015-08-25 11:07:09,779 [main] error org.apache.pig.tools.pigstats.mapreduce.mrpigstatsutil - 1 map reduce job(s) failed!
2015-08-25 11:07:09,780 [main] info org.apache.pig.tools.pigstats.mapreduce.simplepigstats - script statistics:
hadoopversion pigversion userid startedat finishedat features
2.6.0 0.14.0 hdfs 2015-08-25 11:06:33 2015-08-25 11:07:09 unknown
failed!
failed jobs:
jobid alias feature message outputs
job_1439883208520_0177 a,b map_only message: job failed! hdfs://trinityhadoopmaster.com:9000/tmp/temp1554332510/tmp835744559,
input(s):
failed read data "hdfs://trinityhadoopmaster.com:9000/user/flume/tweets/data.json"
output(s):
failed produce result in "hdfs://trinityhadoopmaster.com:9000/tmp/temp1554332510/tmp835744559"
counters:
total records written : 0
total bytes written : 0
spillable memory manager spill count : 0
total bags proactively spilled: 0
total records proactively spilled: 0
job dag:
job_1439883208520_0177
2015-08-25 11:07:09,780 [main] info org.apache.pig.backend.hadoop.executionengine.mapreducelayer.mapreducelauncher - failed!
2015-08-25 11:07:09,787 [main] error org.apache.pig.tools.grunt.grunt - error 1066: unable open iterator alias b. backend error : java.lang.classnotfoundexception: org.json.simple.parser.parseexception
details @ logfile: /tmp/pig-err.log
grunt>
I have no clue how to approach this. Can anyone help me with it?
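The ClassNotFoundException for org.json.simple.parser.ParseException means the json-simple jar is not on the job's classpath. After registering the following jars, including json-simple:
register '/tmp/elephant-bird-core-4.1.jar';
register '/tmp/elephant-bird-pig-4.1.jar';
register '/tmp/elephant-bird-hadoop-compat-4.1.jar';
register '/tmp/google-collections-1.0.jar';
register '/tmp/json-simple-1.1.jar';
it works.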
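For reference, here is a sketch of what the complete script might look like once the jars from the answer are combined with the load and projection from the question. It assumes the five jars sit under /tmp as in the answer and that the input path is unchanged. The map[] schema annotation and the upper-case keywords are illustrative choices, not something stated in the original post.

```pig
-- Jars listed in the answer; the /tmp paths are the answer's, adjust to your install.
REGISTER '/tmp/elephant-bird-core-4.1.jar';
REGISTER '/tmp/elephant-bird-pig-4.1.jar';
REGISTER '/tmp/elephant-bird-hadoop-compat-4.1.jar';
REGISTER '/tmp/google-collections-1.0.jar';
-- json-simple supplies org.json.simple.parser.ParseException,
-- the class the failed job could not find.
REGISTER '/tmp/json-simple-1.1.jar';

-- Load each JSON record as a single map field (schema annotation is an assumption).
a = LOAD '/user/flume/tweets/data.json'
    USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad')
    AS (mymap:map[]);

-- Project two top-level keys out of the map.
b = FOREACH a GENERATE mymap#'id' AS id, mymap#'created_at' AS createdat;

DUMP b;
```

If the job still fails with a ClassNotFoundException, the class named in the message usually indicates which additional jar needs a REGISTER line.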