java - How to improve speed of parsing in Jsoup


    import java.io.IOException;
    import java.net.MalformedURLException;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;

    public static void main(String[] args) throws MalformedURLException, IOException {
        Document doc = Jsoup.connect("my_url")
                .userAgent("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.4 Safari/537.36")
                .get();

        int i = 1; // running row counter, printed in front of each row
        for (Element table : doc.select("tbody")) {
            for (Element row : table.select("tr")) {
                for (Element sale1 : row.select("td.sale_type.bottomline div.inner.pl4")) {
                    System.out.print(i + " : " + sale1.text() + " / ");
                }
                for (Element sale2 : row.select("td.sale_type2.bottomline div.inner")) {
                    System.out.print(sale2.text() + " / ");
                }
                for (Element date : row.select("td.bottomline div.inner.inner_mark span.mark4")) {
                    System.out.print(date.text() + " / ");
                }
                for (Element add : row.select("td.align_l.name div.inner")) {
                    System.out.print(add.text() + " / ");
                }
                for (Element size : row.select("td.num div.inner")) {
                    System.out.print(size.text() + " / ");
                }
                for (Element floor : row.select("td.num2 div.inner")) {
                    System.out.print(floor.text() + " / ");
                }
                for (Element price : row.select("td.num.align_r div.inner")) {
                    System.out.print(price.text() + " / ");
                }
                for (Element cont : row.select("td.contact.bottomline div.inner")) {
                    // The contact cell ends the row, so finish the line and count it.
                    System.out.println(cont.text());
                    i++;
                }
            }
        }
    }

I'm parsing sites using Jsoup, and I need to visit about 20,000 sites. I haven't tested this code yet, so I don't know whether its design is well optimized, but I'm worried that processing will be slow. If it is slow, how should I change the design or modify the code to get better performance?

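A few fixes: move the parsing into its own method, collect the output in a StringBuilder instead of calling System.out.print for every cell, and number the rows per table: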

    public static String getTableData(Document doc) {
        StringBuilder sb = new StringBuilder();
        for (Element table : doc.select("tbody")) {
            int i = 0; // row counter, restarted for every table
            for (Element row : table.select("tr")) {
                i++;
                for (Element sale1 : row.select("td.sale_type.bottomline div.inner.pl4")) {
                    sb.append(i + " : " + sale1.text() + " / ");
                }
                for (Element sale2 : row.select("td.sale_type2.bottomline div.inner")) {
                    sb.append(sale2.text() + " / ");
                }
                for (Element date : row.select("td.bottomline div.inner.inner_mark span.mark4")) {
                    sb.append(date.text() + " / ");
                }
                for (Element add : row.select("td.align_l.name div.inner")) {
                    sb.append(add.text() + " / ");
                }
                for (Element size : row.select("td.num div.inner")) {
                    sb.append(size.text() + " / ");
                }
                for (Element floor : row.select("td.num2 div.inner")) {
                    sb.append(floor.text() + " / ");
                }
                for (Element price : row.select("td.num.align_r div.inner")) {
                    sb.append(price.text() + " / ");
                }
                for (Element cont : row.select("td.contact.bottomline div.inner")) {
                    // End the row's line here, as println did in the question's code.
                    sb.append(cont.text()).append('\n');
                }
            }
        }
        return sb.toString();
    }

and call it from main:

    public static void main(String[] args) throws MalformedURLException, IOException {
        Document doc = Jsoup.connect("my_url")
                .userAgent("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.4 Safari/537.36")
                .get();

        String result = getTableData(doc);
        System.out.println(result);
    }
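
For 20,000 sites, most of the time will probably go to waiting on the network rather than to Jsoup's selectors, so fetching several pages at once may help more than tuning the parsing. Below is a minimal sketch of that idea, not a tested recommendation for your sites. The class `ParallelScraper`, the method `scrapeAll`, the `pageToText` parameter and the thread count are all hypothetical names and values I made up for illustration. The sketch uses only the JDK, so the Jsoup call appears only in a comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not part of Jsoup.
final class ParallelScraper {

    /**
     * Applies pageToText to every URL on a fixed pool of worker threads and
     * returns the results in the same order as the input list. A URL whose
     * task throws is reported as an "ERROR ..." entry instead of aborting
     * the whole run.
     *
     * Assumed usage with the method from the answer above:
     *   url -> {
     *       try {
     *           return getTableData(Jsoup.connect(url).get());
     *       } catch (IOException e) {
     *           throw new java.io.UncheckedIOException(e);
     *       }
     *   }
     */
    static List<String> scrapeAll(List<String> urls,
                                  Function<String, String> pageToText,
                                  int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String url : urls) {
                tasks.add(() -> pageToText.apply(url));
            }

            // invokeAll blocks until every task has finished and returns
            // the futures in the same order as the submitted tasks.
            List<Future<String>> futures = pool.invokeAll(tasks);

            List<String> results = new ArrayList<>();
            for (int i = 0; i < futures.size(); i++) {
                try {
                    results.add(futures.get(i).get());
                } catch (ExecutionException e) {
                    results.add("ERROR " + urls.get(i) + ": " + e.getCause());
                }
            }
            return results;
        } catch (InterruptedException e) {
            // Restore the interrupt flag and give up on the run.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while scraping", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Something like `ParallelScraper.scrapeAll(urls, fetchAndParse, 8)` would then replace a plain loop over the URLs. The pool size is a guess that you would need to tune by measuring, and too many parallel requests against the same host may get you throttled or blocked.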
