
Basic BeautifulSoup usage

Original article:

http://www.cnblogs.com/yupeng/p/3362031.html

This article also covers the topic quite thoroughly:

http://www.cnblogs.com/twinsclover/archive/2012/04/26/2471704.html

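I spent a little time with the bs4 library, and everything I ran worked fine. It parses the various structures of an HTML document, much like the ElementTree library does for XML, and using it feels about the same.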
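To make the comparison with ElementTree concrete, here is a minimal sketch that reads the same small snippet with both libraries. The markup and the variable names (`markup`, `et_items`, `bs_items`) are made up for illustration and are not from the original post.

```python
import xml.etree.ElementTree as ET
from bs4 import BeautifulSoup

# Hypothetical sample markup, small enough to be valid for both parsers.
markup = '<root><item id="a">first</item><item id="b">second</item></root>'

# ElementTree: parse the string, then search child elements by tag name.
root = ET.fromstring(markup)
et_items = [(el.get("id"), el.text) for el in root.findall("item")]

# BeautifulSoup: same idea, with find_all and dict-style attribute access.
soup = BeautifulSoup(markup, "html.parser")
bs_items = [(tag["id"], tag.get_text()) for tag in soup.find_all("item")]

print(et_items)
print(bs_items)
```

Both lists should hold the same (id, text) pairs. The main practical difference is that BeautifulSoup tolerates messy real-world HTML, while `ET.fromstring` expects well-formed XML.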

You can step through the script below in a debugger to get familiar with how it works.

# Uncomment any of the print lines below to try it out (Python 3 syntax).
from bs4 import BeautifulSoup

html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
"""

soup = BeautifulSoup(html_doc, 'html.parser')
# print(soup.title)
# print(soup.title.name)
# print(soup.title.string)
# print(soup.p)
# print(soup.a)
# print(soup.find_all('a'))
# a = soup.find_all('a')
# print(len(a))
# print(soup.find_all('p'))  # returns a list-like ResultSet
# p = soup.find_all('p')
# print(len(p))
# print(soup.find(id='link3'))

# print(soup.get_text())  # all the text in the document
# print(soup.p.get_text())  # text of the selected node only (the first <p>)
# for i in soup.find_all('p'):
#     print(i.get_text())
#     print(i.contents)
# print(soup.a['href'], soup.a['class'], soup.a['id'], soup.a.text)  # every attribute and the text of a single tag
# print(soup.html, soup.head, soup.body)  # whole document, head, and body, with their full structure
# print(soup.p.contents, soup.head.contents)  # child nodes returned as a list
# for i in list(soup.head.children):  # iterate over children without knowing their names
#     print(i, end=' ')
# print(soup.a.parent)  # look upward; .parents yields all ancestors
# for i in soup.html.parents:
#     print(i, len(i))
# print(soup.a.parent)
# print(soup.find_all(class_="sister"))
print(soup.find_all('a', limit=1))  # limit the number of results
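As a small follow-up to the calls listed above, here is a rough sketch that combines `find_all`, attribute access, and `get_text()` to collect the links into plain dictionaries. The helper name `extract_links` and its `css_class` parameter are my own, for illustration only. The HTML is a shortened copy of the sample document.

```python
from bs4 import BeautifulSoup

html_doc = """
<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
"""

soup = BeautifulSoup(html_doc, "html.parser")


def extract_links(soup, css_class):
    """Hypothetical helper: return id, href, and text for each <a> with the given class."""
    return [
        {"id": a.get("id"), "href": a.get("href"), "text": a.get_text()}
        for a in soup.find_all("a", class_=css_class)
    ]


links = extract_links(soup, "sister")
for link in links:
    print(link["id"], link["href"], link["text"])

# "class" is a multi-valued attribute, so bs4 returns it as a list.
first_class = soup.a["class"]
print(first_class)

# find() returns a single tag; limit= caps how many results find_all() returns.
third = soup.find(id="link3")
first_two = soup.find_all("a", limit=2)
print(third.get_text(), len(first_two))
```

Using `a.get("href")` instead of `a["href"]` returns `None` for a missing attribute rather than raising `KeyError`, which is usually more forgiving on real pages.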


Source: https://www.cnblogs.com/dahu-daqing/p/6558812.html