  • Scraping nicknames with Python and saving them as CSV

    Code:

    import io
    import sys

    from bs4 import BeautifulSoup

    # Re-wrap stdout as gb18030 so the console can print the scraped names.
    sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='gb18030')

    # Exact class attribute of the <div> blocks that hold the names.
    NAME_DIV_CLASS = ('babynology_textevidence babynology_bg_grey '
                      'babynology_shadow babynology_radius left overflow_scroll')


    def html_save(s):
        # Append one value per line to Name.csv.
        with open('Name.csv', 'a') as f:
            f.write(s + '\n')


    def getName_link():
        # Parse the locally saved page; naming the parser avoids bs4's warning.
        with open('Girl.html') as f:
            soup = BeautifulSoup(f, 'html.parser')
        for div in soup.find_all('div', {'class': NAME_DIV_CLASS}):
            for strong in div.find_all('strong'):
                link = strong.find_all('a')[0]
                # Strip spaces and newlines from the link text.
                i = link.text.replace('    ', '').replace(' ', '').replace('\n', '')
                print(i)
                html_save(i)
                # To save the link as well:
                # j = link.get('href').replace('\n', '')
                # html_save(j)


    getName_link()
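The script above writes one name per line by hand. As a possible variation, here is a minimal sketch that routes the same output through the standard-library `csv` module, so a value containing a comma or a quote would still be written as a valid CSV field. The names `clean_name` and `save_names`, the `name` header column, and the sample values are assumptions made for illustration; none of them come from the original post. Unlike `html_save`, this sketch overwrites the file instead of appending to it.

```python
import csv


def clean_name(text):
    # Remove every whitespace character (spaces, tabs, newlines) from the text.
    return "".join(text.split())


def save_names(names, path):
    # newline="" lets the csv module control the line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name"])  # header row; the column name is hypothetical
        for raw in names:
            name = clean_name(raw)
            if name:  # skip entries that contained only whitespace
                writer.writerow([name])


if __name__ == "__main__":
    # Made-up sample value, only to show the whitespace cleanup.
    print(clean_name(" Aada\n"))
```

In the scraper, the text of each link could be collected into a list and passed in a single call such as `save_names(names, 'Name.csv')`, which opens the file once instead of once per name.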

    Output:

  • Original article: https://www.cnblogs.com/huanghuangwei/p/11997503.html