parsing - Python BeautifulSoup: parsing <p></p>
I'm learning how to parse with BeautifulSoup. Can someone tell me how to parse the <p></p> tags inside the element
div class="article-content"?
After the script runs, I only want to see the article content. The script does show me what I want from
div class="article-content"
, but I can't get just the text inside the <p></p> tags.
My code looks like this:

    import urllib2
    from BeautifulSoup import BeautifulSoup

    html = urllib2.urlopen('http://www.engadget.com/2014/10/17/local-Multiplayer-is-to-and-play/')
    parsed_html = BeautifulSoup(html)
    print parsed_html.body.find('div', attrs={'class': 'article-content'}).text
But I also get a lot of junk:
$ python engadget_parser.py
Ever wish that you simply whip out your android device and play games Passer-one can bother with you? It's such a thing that for example, Nintendo DS users are using Thanks for the company's StreetPus feature, but so far, Google's smartphones are not available. Now, however, the company has added an update to its game infrastructure, which enables the "ambient, real-time" game with more than one user - so long that the game is a Google Playhouse multiplayer backend Depends on, however, it may be that do not increase the road and challenge people to double, because they can get wrong ideas. OnBreak ({0: function () {var a = {mobilePlacementID: "348-14-15-135b", width: "320", height: "115"}; madserver.requestAd (a); }) ();}, 768: function () {}}); Source: Android Developer (G +) Tags: Android, AndroidGames, Gaming, Google, googleplaygames, mobile, mobilepostcross Comments 0Comments Hide _when_.eng ("eng.livefyre.init", {articleId: 20,979,699, domain: "engadget.fyre. Co ", site id:" 296092 ", L:" liveifier_2097969 ", initial adaptability: 2}) _when_.eng (" eng.perm.init "); Lab.scriptBs ('gravity.js') onBreak ({0: function () {}, 320: function () {}, 768: function () {}});
Thanks!
I would use BeautifulSoup's select method in this case (it is available in Beautiful Soup 4, the bs4 package). Replace:

    print parsed_html.body.find('div', attrs={'class': 'article-content'}).text

with:

    for p in parsed_html.select('div.article-content p'):
        print p.text
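
To see the difference without hitting the network, here is a minimal sketch in Python 3 with Beautiful Soup 4. The SAMPLE_HTML string is a made-up stand-in for the real page (its markup is an assumption, not Engadget's actual HTML); it just puts two paragraphs next to a script and a tags block inside the article div.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page: the article div holds two
# paragraphs plus non-paragraph content (an inline script and a tags block).
SAMPLE_HTML = """
<html><body>
<div class="article-content">
  <p>First paragraph.</p>
  <script>onBreak({0: function () {}});</script>
  <p>Second paragraph.</p>
  <div class="tags">Tags: Android, Gaming</div>
</div>
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# .text on the whole div joins the text of every descendant,
# so the tags block ends up in the result alongside the paragraphs.
div_text = soup.find("div", attrs={"class": "article-content"}).text

# The CSS selector matches only <p> elements inside the article div,
# so nothing outside the paragraphs is collected.
paragraphs = [p.get_text(strip=True) for p in soup.select("div.article-content p")]

for paragraph in paragraphs:
    print(paragraph)
```

Whether the script body itself shows up in div_text can depend on the Beautiful Soup version, but selecting the p elements avoids the question either way.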