parsing - Python BeautifulSoup. Parse <p></p>


I'm learning how to parse with Beautiful Soup. Can someone tell me how to parse the <p></p> elements inside the div class="article-content" element? I only want to see the article content when the script runs. This is what I want:

[screenshot of the desired output]

I can parse div class="article-content", but I cannot get only the information inside its <p></p> elements. My code looks like this:

  # Python 2 (urllib2 and the print statement)
  from BeautifulSoup import BeautifulSoup
  import urllib2

  html = urllib2.urlopen('http://www.engadget.com/2014/10/17/local-Multiplayer-is-to-and-play/')
  parsed_html = BeautifulSoup(html)
  # Prints all text under the div, not only the <p> elements
  print parsed_html.body.find('div', attrs={'class': 'article-content'}).text

But I also get a lot of junk:

  $ python engadget_parser.py
  Ever wish that you simply whip out your Android device and play games Passer-one can bother with you? It's such a thing that for example, Nintendo DS users are using Thanks for the company's StreetPass feature, but so far, Google's smartphones are not available. Now, however, the company has added an update to its game infrastructure, which enables the "ambient, real-time" game with more than one user - so long that the game is a Google Playhouse multiplayer backend Depends on, however, it may be that do not increase the road and challenge people to double, because they can get wrong ideas. OnBreak ({0: function () {var a = {mobilePlacementID: "348-14-15-135b", width: "320", height: "115"}; madserver.requestAd (a); }) ();}, 768: function () {}}); Source: Android Developer (G +) Tags: Android, AndroidGames, Gaming, Google, googleplaygames, mobile, mobilepostcross Comments 0Comments Hide _when_.eng ("eng.livefyre.init", {articleId: 20,979,699, domain: "engadget.fyre. Co ", site id:" 296092 ", L:" liveifier_2097969 ", initial adaptability: 2}) _when_.eng (" eng.perm.init "); Lab.scriptBs ('gravity.js') onBreak ({0: function () {}, 320: function () {}, 768: function () {}});

Thanks!

I like Beautiful Soup's select() method for this case. Replace:

  print parsed_html.body.find('div', attrs={'class': 'article-content'}).text

with:

  for p in parsed_html.select('div.article-content p'):
      print p.text
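
For anyone trying this on a current setup, here is a minimal sketch of the same idea, assuming Python 3 and Beautiful Soup 4 (the bs4 package, which is where select() lives). SAMPLE_HTML and article_paragraphs are hypothetical names made up for illustration, and the markup is a small stand-in for the real page rather than what Engadget actually serves.

```python
from bs4 import BeautifulSoup  # assumption: Beautiful Soup 4 is installed

# Hypothetical stand-in for the downloaded page: two paragraphs with an
# inline script between them, plus an unrelated div that must be ignored.
SAMPLE_HTML = """
<html><body>
<div class="article-content">
  <p>First paragraph of the article.</p>
  <script>onBreak({0: function () {}});</script>
  <p>Second paragraph of the article.</p>
</div>
<div class="comments"><p>Not part of the article.</p></div>
</body></html>
"""


def article_paragraphs(html):
    """Return the text of every <p> nested inside div.article-content."""
    soup = BeautifulSoup(html, "html.parser")
    # The CSS selector matches only <p> descendants of the target div,
    # so <script> elements and other divs are never visited.
    return [p.get_text() for p in soup.select("div.article-content p")]


if __name__ == "__main__":
    for text in article_paragraphs(SAMPLE_HTML):
        print(text)
```

Because the selector targets the <p> elements directly, nothing has to be stripped out afterwards. To run it against a live page, the HTML string would come from the HTTP response body instead of SAMPLE_HTML.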
