java - How to For Each RDD Spark Streaming
I have a CSV file, queries.txt, which I am reading like this:
JavaRDD<String> distFile = sc.textFile("queries.txt");
The schema of the queries.txt file is: uniq_id, ...some numeric values in CSV...
For each line I need to create a HashMap, where the key is the first column of the queries.txt file (uniq_id) and the value holds the other columns of that line.
Example (this is not a real, working example; I just want to convey the essence):
HashMap<Integer, NumericValues> totalMap = new HashMap<Integer, NumericValues>();
for (int i = 0; i < distFile.size(); i++) {
    String line = distFile[i].getColumns();
    for (int y = 0; y < line.size(); y++) {
        totalMap.put(line.getFirstColumn, line.getRemainingColumns);
    }
}
Here NumericValues is a custom class that has variables mapping to the columns in the file.
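(For illustration only, a minimal sketch of what such a class might look like; the field names below are invented placeholders, and the real class would mirror the actual columns. Making it Serializable matters if it is used inside Spark operations.)

// Hypothetical sketch -- the field names are invented placeholders.
public class NumericValues implements java.io.Serializable {
    private final double metric1;
    private final double metric2;
    // ...one field per numeric column in queries.txt...

    public NumericValues(double metric1, double metric2) {
        this.metric1 = metric1;
        this.metric2 = metric2;
    }
}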
Any other suggestions are also helpful.
I guess this is what you are looking for; the example doesn't parse the CSV line itself.
JavaRDD<String> distFile = sc.textFile("queries.txt");
HashMap<Integer, NumericValues> totalMap = new HashMap<Integer, NumericValues>();
distFile.foreach(new VoidFunction<String>() {
    public void call(String line) {
        totalMap.put(yourCSVParser(line)); // this is a dummy function call
    }
});
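Note that on a real cluster, mutating a driver-local HashMap inside foreach will not behave as expected: the closure is serialized and run on the executors, so the driver's map is never updated. A common alternative is to map each line to a key/value pair and collect the result. Below is a minimal sketch assuming comma-separated lines with an integer uniq_id in the first column; parseColumns is a hypothetical helper that builds a NumericValues from the remaining columns.

import java.util.Map;

import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.PairFunction;

import scala.Tuple2;

// ...assuming an existing JavaSparkContext sc and the NumericValues class from the question...
JavaRDD<String> distFile = sc.textFile("queries.txt");

// Turn each CSV line into a (uniq_id, NumericValues) pair on the executors.
JavaPairRDD<Integer, NumericValues> pairs = distFile.mapToPair(
        new PairFunction<String, Integer, NumericValues>() {
            public Tuple2<Integer, NumericValues> call(String line) {
                String[] cols = line.split(",");
                Integer id = Integer.parseInt(cols[0]);
                // parseColumns is a hypothetical helper that builds a
                // NumericValues object from the remaining columns.
                return new Tuple2<Integer, NumericValues>(id, parseColumns(cols));
            }
        });

// collectAsMap() brings the pairs back to the driver as a single java.util.Map.
Map<Integer, NumericValues> totalMap = pairs.collectAsMap();

Keep in mind that collectAsMap() pulls the whole dataset into driver memory, so this only works if the file is small enough to fit there.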