python - What's the relationship between 'BeautifulSoup' and 'lxml'?
In lxml's documentation, it says:
"lxml can interface to the parsing capabilities of BeautifulSoup through the lxml.html.soupparser module. It provides three main functions: fromstring() and parse() to parse a string or file using BeautifulSoup into an lxml.html document, and convert_tree() to convert an existing BeautifulSoup tree into a list of top-level Elements."
Meanwhile, BS's documentation says it can use lxml as a parser. [ref]
"Beautiful Soup supports the HTML parser included in Python's standard library, but it also supports a number of third-party Python parsers. One is the lxml parser."
BS also suggests using the lxml parser for speed.
So lxml can use BS for parsing, while BS can conversely use lxml as its parser?
I have been scratching my head trying to understand this relationship. Please help.
There is nothing confusing here once you separate the two parsers. BS has a default HTML parser (Python's built-in html.parser in Beautiful Soup 4), and lxml has its own HTML parser. Each library can also build its own kind of tree using the other's parser.
The BS documentation you quoted says you can parse HTML into a BS soup object using the lxml parser, or another third-party parser, as an alternative to the default BS parser:
beautifulsoup(markup, "lxml")
Similarly, the lxml documentation says you can parse HTML into an lxml tree object using the BS parser, as an alternative to the default lxml.html parser:
root = lxml.html.soupparser.fromstring(tag_soup)
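To make the two directions concrete, here is a minimal sketch that runs both on the same piece of broken markup. It assumes the third-party packages beautifulsoup4 and lxml are installed; the sample string and the variable names are only illustrative and are not taken from either library's documentation.

```python
# Assumes: pip install beautifulsoup4 lxml
from bs4 import BeautifulSoup
from lxml.html import soupparser

# Illustrative, deliberately malformed markup (unclosed <p> and <b>).
tag_soup = "<p>Hello<b>world"

# Direction 1: Beautiful Soup API on top of the lxml parser.
# The result is a BeautifulSoup object, navigated the BS way.
soup = BeautifulSoup(tag_soup, "lxml")
print(type(soup).__name__)   # BeautifulSoup
print(soup.b.get_text())     # world

# Direction 2: lxml API on top of Beautiful Soup's parsing.
# The result is an lxml element tree, queried the lxml way.
root = soupparser.fromstring(tag_soup)
print(root.tag)              # html
print(root.findtext(".//b")) # world
```

In both cases the second library only does the parsing; the object you get back, and the API you use on it, belongs to the library you called.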