python - What's the relationship between 'BeautifulSoup' and 'lxml'?
lxml's documentation says:
lxml can interface to the parsing capabilities of BeautifulSoup through the lxml.html.soupparser module. It provides three main functions: fromstring() and parse() to parse a string or file using BeautifulSoup into an lxml.html document, and convert_tree() to convert an existing BeautifulSoup tree into a list of top-level Elements.
Meanwhile, BS's documentation says it can use the lxml parser. [ref]
Beautiful Soup supports the HTML parser included in Python's standard library, but it also supports a number of third-party Python parsers. One is the lxml parser.
BS suggests using the lxml parser for speed.
So what happens if lxml uses BS for parsing when BS's parser is lxml, or conversely?
I have been scratching my head trying to understand the relationship. Please help.
There should be nothing confusing about the BS parser and the lxml.html parser. BS has an HTML parser, and lxml has its own HTML parser.
The BS documentation you quoted says that you can parse HTML into a BS soup object using the lxml parser or other possible third-party parsers, as an alternative to using the default BS parser:
BeautifulSoup(markup, "lxml")
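To make that concrete, here is a minimal sketch, assuming bs4 is installed and lxml may or may not be. The helper name make_soup and the sample markup are made up for illustration. The point is that the parser is only an argument: whichever one does the parsing, you get back the same kind of BS soup object.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup):
    """Hypothetical helper: prefer the lxml parser, else use html.parser."""
    try:
        return BeautifulSoup(markup, "lxml")
    except FeatureNotFound:
        # lxml is not installed; fall back to the standard-library parser.
        return BeautifulSoup(markup, "html.parser")


soup = make_soup("<p>Hello <b>world")
print(type(soup).__name__)  # BeautifulSoup, whichever parser was used
print(soup.b.get_text())
```

The two parsers may repair broken markup differently (for example, lxml wraps fragments in html and body tags), but the object you navigate afterwards is a BS tree in both cases.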
Similarly, the lxml documentation says that you can parse HTML into an lxml tree object using the BS parser, as an alternative to using the default lxml.html parser:
root = lxml.html.soupparser.fromstring(tag_soup)
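A small side-by-side sketch of that, assuming both lxml and bs4 are installed (soupparser needs BeautifulSoup to be importable). The sample markup and the variable names are only for illustration. Either way the result is an lxml element, so the usual lxml API works on it.

```python
import lxml.html
import lxml.html.soupparser

tag_soup = "<p>Hello <b>world"

# Default route: lxml's own HTML parser does the parsing.
root_default = lxml.html.fromstring(tag_soup)

# Alternative route: BeautifulSoup parses, then lxml builds its tree from that.
root_soup = lxml.html.soupparser.fromstring(tag_soup)

for root in (root_default, root_soup):
    # Both are lxml elements, so ElementPath lookups work on either.
    print(root.tag, root.findtext(".//b"))
```

So the direction of the arrow depends on which library you call: the library you call decides what kind of tree you get, and the other one is just borrowed as the parser.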