java - How to improve speed of parsing in Jsoup
import java.io.IOException;
import java.net.MalformedURLException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class Main {
    public static void main(String[] args) throws MalformedURLException, IOException {
        Document doc = Jsoup.connect("my_url")
                .userAgent("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.4 Safari/537.36")
                .get();
        int i = 1; // row counter, printed at the start of each row
        for (Element table : doc.select("tbody")) {
            for (Element row : table.select("tr")) {
                for (Element sale1 : row.select("td.sale_type.bottomline div.inner.pl4")) {
                    System.out.print(i + " : " + sale1.text() + " / ");
                }
                for (Element sale2 : row.select("td.sale_type2.bottomline div.inner")) {
                    System.out.print(sale2.text() + " / ");
                }
                for (Element date : row.select("td.bottomline div.inner.inner_mark span.mark4")) {
                    System.out.print(date.text() + " / ");
                }
                for (Element add : row.select("td.align_l.name div.inner")) {
                    System.out.print(add.text() + " / ");
                }
                for (Element size : row.select("td.num div.inner")) {
                    System.out.print(size.text() + " / ");
                }
                for (Element floor : row.select("td.num2 div.inner")) {
                    System.out.print(floor.text() + " / ");
                }
                for (Element price : row.select("td.num.align_r div.inner")) {
                    System.out.print(price.text() + " / ");
                }
                for (Element cont : row.select("td.contact.bottomline div.inner")) {
                    System.out.println(cont.text()); // the contact cell ends the row
                    i++;
                }
            }
        }
    }
}
I'm parsing sites using Jsoup, and I need to visit 20,000 of them. I don't know whether this code is well designed for performance; I haven't tested it yet, but I'm worried that processing will be slow. If it is slow, I want to improve the design. How should I modify the code for better performance?
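A few fixes: build the output in a StringBuilder instead of printing every cell separately, and move the table extraction into its own method: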
// Requires: import org.jsoup.nodes.Document; import org.jsoup.nodes.Element;
public static String getTableData(Document doc) {
    StringBuilder sb = new StringBuilder();
    for (Element table : doc.select("tbody")) {
        int i = 0; // row counter, restarts for each table
        for (Element row : table.select("tr")) {
            i++;
            for (Element sale1 : row.select("td.sale_type.bottomline div.inner.pl4")) {
                sb.append(i).append(" : ").append(sale1.text()).append(" / ");
            }
            for (Element sale2 : row.select("td.sale_type2.bottomline div.inner")) {
                sb.append(sale2.text()).append(" / ");
            }
            for (Element date : row.select("td.bottomline div.inner.inner_mark span.mark4")) {
                sb.append(date.text()).append(" / ");
            }
            for (Element add : row.select("td.align_l.name div.inner")) {
                sb.append(add.text()).append(" / ");
            }
            for (Element size : row.select("td.num div.inner")) {
                sb.append(size.text()).append(" / ");
            }
            for (Element floor : row.select("td.num2 div.inner")) {
                sb.append(floor.text()).append(" / ");
            }
            for (Element price : row.select("td.num.align_r div.inner")) {
                sb.append(price.text()).append(" / ");
            }
            for (Element cont : row.select("td.contact.bottomline div.inner")) {
                // End the row with a line break, as println did in the question's version.
                sb.append(cont.text()).append('\n');
            }
        }
    }
    return sb.toString();
}
and call it from main:
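// Requires: import java.io.IOException; import java.net.MalformedURLException;
//           import org.jsoup.Jsoup; import org.jsoup.nodes.Document;
public static void main(String[] args) throws MalformedURLException, IOException {
    Document doc = Jsoup.connect("my_url")
            .userAgent("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.4 Safari/537.36")
            .get();
    String result = getTableData(doc); // extract all rows into one string
    System.out.println(result);        // print once instead of once per cell
}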
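To see the buffering idea on its own, without a network connection or Jsoup, here is a minimal sketch that uses only the standard library. The class name RowFormatter, the method formatRows, and the sample cell values are hypothetical and are not part of the code above; the sketch assumes each row has already been reduced to a list of cell texts.

```java
import java.util.List;

public class RowFormatter {
    // Hypothetical helper: numbers each row, joins its cell texts with " / ",
    // and collects everything in one StringBuilder so the caller prints once.
    static String formatRows(List<List<String>> rows) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        for (List<String> cells : rows) {
            i++;
            sb.append(i).append(" : ")
              .append(String.join(" / ", cells))
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the text of the table cells.
        List<List<String>> rows = List.of(
                List.of("sale", "2015-07-01", "84"),
                List.of("rent", "2015-07-02", "59"));
        // Prints:
        // 1 : sale / 2015-07-01 / 84
        // 2 : rent / 2015-07-02 / 59
        System.out.print(formatRows(rows));
    }
}
```

Buffering like this only cuts the number of output calls; it does not change how long each page takes to download or parse, so it is worth measuring before assuming it is the bottleneck.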