(venv) D:\pytest>pip install beautifulsoup
Collecting beautifulsoup
Using cached https://files.pythonhosted.org/packages/1e/ee/295988deca1a5a7accd783d0dfe14524867e31abb05b6c0eeceee49c759d/BeautifulSoup-3.2.1.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\1\AppData\Local\Temp\pip-install-mav7d0bo\beautifulsoup\setup.py", line 22
print "Unit tests have failed!"
^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print("Unit tests have failed!")?
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in C:\Users\1\AppData\Local\Temp\pip-install-mav7d0bo\beautifulsoup\
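Oh, the old beautifulsoup package (BeautifulSoup 3.2.1 here) is apparently dead: its setup.py uses the Python 2 print statement, so it cannot be installed under Python 3. Use pip install beautifulsoup4 instead, or simply pip install bs4.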
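A quick way to check the install is to import the package and parse a tiny document. This is only a minimal sketch: it assumes beautifulsoup4 is already installed in the active venv, and the sample HTML string is made up for illustration. Note that the package is installed as beautifulsoup4 but imported as bs4.

```python
import bs4
from bs4 import BeautifulSoup

# Parse a made-up one-line document with the parser from the standard library,
# so no extra parser package (such as lxml) is required.
soup = BeautifulSoup("<p>Hello, bs4</p>", "html.parser")
text = soup.p.string

print(bs4.__version__)  # the installed Beautiful Soup 4 version
print(text)
```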
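Basic elements of the BeautifulSoup class
| Element | Description |
| --- | --- |
| Tag | A tag, the most basic unit of information organization; it opens with <> and closes with </> |
| Name | The tag's name; the name of <p>…</p> is 'p'. Access: <tag>.name |
| Attributes | The tag's attributes, organized as a dictionary. Access: <tag>.attrs |
| NavigableString | The non-attribute string inside a tag, i.e. the text between <> and </>. Access: <tag>.string |
| Comment | The comment part of a string inside a tag; a special kind of NavigableString |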
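The elements in the table can be tried out on a small snippet. The sketch below assumes beautifulsoup4 is installed; the HTML string and the variable names are invented for illustration.

```python
from bs4 import BeautifulSoup
from bs4.element import Comment, NavigableString

# Made-up HTML: one ordinary tag and one tag that contains only a comment.
html = '<p class="title" id="intro">Hello</p><b><!--a comment--></b>'
soup = BeautifulSoup(html, "html.parser")

tag = soup.p                # Tag: the first <p> in the document
print(tag.name)             # Name: the tag's name
print(tag.attrs)            # Attributes: a dict (class is stored as a list)
print(tag.string)           # NavigableString: the text inside the tag
print(type(tag.string))

comment = soup.b.string     # the only child of <b> is a comment
print(type(comment))        # Comment, which subclasses NavigableString
print(isinstance(comment, NavigableString))
```

Because a Comment also counts as a NavigableString, checking the type with isinstance(x, Comment) is the usual way to tell comments apart from ordinary text.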
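Downward traversal of the tag tree
| Attribute | Description |
| --- | --- |
| .contents | List of child nodes; stores all direct children of <tag> in a list |
| .children | Iterator over child nodes; like .contents, but meant for looping over the direct children |
| .descendants | Iterator over descendant nodes; covers all children, grandchildren and so on, meant for looping |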
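The three attributes can be compared side by side on a small tree. This is a sketch under the assumption that beautifulsoup4 is installed; the HTML and the variable names are invented for the example.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Made-up tree: <div> has two <p> children, and the first <p> contains a <b>.
html = "<div><p>one <b>bold</b></p><p>two</p></div>"
soup = BeautifulSoup(html, "html.parser")
div = soup.div

# .contents is a real list of the direct children.
contents = div.contents
print(len(contents))

# .children walks the same direct children, but as an iterator.
child_names = [child.name for child in div.children]
print(child_names)

# .descendants walks every node below <div>, text nodes included,
# so filter on Tag to keep only the tags.
descendant_tags = [node.name for node in div.descendants if isinstance(node, Tag)]
descendant_texts = [str(node) for node in div.descendants if not isinstance(node, Tag)]
print(descendant_tags)
print(descendant_texts)
```

Text nodes show up in all three attributes, so real pages with whitespace between tags usually need the isinstance(node, Tag) filter shown here.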