  • Python crawler study notes: a simple job-listing scraper with urllib.request and regular expressions

    1: Scrape job listings with urllib.request and regular expressions, and write them to a local file

    # coding:utf-8

    import re
    # import requests  # only needed for the commented-out requests variant below
    import urllib.request

    # Extract data from a web page with urllib and re.
    # This first attempt is wrapped in a string literal so that it does not run.

    '''
    url = 'https://search.51job.com/list/020000,000000,0124,01,9,99,%2520,2,1.html?lang=c&stype=&postchannel=0000&workyear=99&cotype=99&degreefrom=99&jobterm=99&companysize=99&providesalary=99&lonlat=0%2C0&radius=-1&ord_field=0&confirmdate=9&fromType=&dibiaoid=0&address=&line=&specialarea=00&from=&w'
    # response = requests.get(url)
    # response.encoding = 'gbk'
    # wbdata = response.text

    wbdata = urllib.request.urlopen(url).read().decode('gbk')
    # print(len(wbdata))

    pat = '<a target="_blank" title="(.*?)"'
    data = re.compile(pat).findall(wbdata)
    # print(data)

    # Write to a file
    # with open('jobs.txt', 'w') as f:
    #     for title in data:
    #         print(title)
    #         f.write(title + '\n')

    # Print to the console
    for title in data:
        print(title)
    '''
    print("--" * 20)
    # Timeout setting
    # for i in range(0, 20):
    #     try:
    #         file = urllib.request.urlopen("http://baidu.com", timeout=0.2).read().decode('gbk')
    #         print(len(file))
    #     except Exception as err:
    #         print("Exception: the page may have timed out! " + str(err))

    # GET request in practice: fetch job listings from 51job
    keywd = "Python"
    # Each match of pat1 yields (job title, detail-page URL, text of the "t4" span).
    pat1 = '<div class="el">.*?title="(.*?)" href="(http.*?)".*?<span class="t4">(.*?)</span>.*?</div>'
    pat2 = '<span class="t4">(.*?)</span>'

    # Percent-encode the keyword if it contains non-ASCII characters:
    # keywd = urllib.request.quote(keywd)
    for i in range(1, 11):
        url = "https://search.51job.com/list/020000,000000,0000,00,9,99," + keywd + ",2," + str(i) + ".html"
        file = urllib.request.urlopen(url)
        # print(file.geturl())
        data = file.read().decode('gbk')
        # Page header; "第N页" means "page N"
        print("----------------第" + str(i) + "页-----------------")
        # re.S lets '.' match newlines, so the pattern can span a multi-line <div>
        rst1 = re.compile(pat1, re.S).findall(data)
        # rst2 = re.compile(pat2).findall(data)
        # rst = list(zip(rst1, rst2))
        # Append every match to jobs.txt, one tuple per line
        with open('jobs.txt', 'a', encoding='utf-8') as f:
            for row in rst1:
                print(row)
                f.write(str(row) + '\n')

        # rst2 = re.compile(pat2).findall(data)
        # for salary in rst2:
        #     print(salary)
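To see what pat1 captures without requesting the live site, the pattern can be run against a small hand-written fragment. This is only a sketch: the HTML below is a hypothetical sample shaped to fit the pattern (it is not copied from 51job, whose real markup may differ), and extract_jobs is an illustrative helper name. It also shows why the re.S flag is needed when a listing spans several lines.

```python
import re

# Hypothetical fragment shaped like what pat1 expects; not real 51job markup.
SAMPLE_HTML = '''
<div class="el">
    <p class="t1">
        <a target="_blank" title="Python Developer" href="https://example.com/job/1.html">Python Developer</a>
    </p>
    <span class="t4">1-1.5万/月</span>
</div>
<div class="el">
    <p class="t1">
        <a target="_blank" title="Data Engineer" href="https://example.com/job/2.html">Data Engineer</a>
    </p>
    <span class="t4"></span>
</div>
'''

# Same pattern as pat1 in the script above.
PAT1 = '<div class="el">.*?title="(.*?)" href="(http.*?)".*?<span class="t4">(.*?)</span>.*?</div>'


def extract_jobs(html):
    """Return a list of (title, url, salary_text) tuples, one per listing."""
    # re.S makes '.' match newlines, so '.*?' can cross line breaks in the <div>.
    return re.compile(PAT1, re.S).findall(html)


rows = extract_jobs(SAMPLE_HTML)
for row in rows:
    print(row)

# Without re.S the pattern cannot cross the line break after <div class="el">.
rows_without_flag = re.findall(PAT1, SAMPLE_HTML)
print(len(rows_without_flag))
```

An empty third field, as in the second sample listing, matches the rows in the output that have '' in the salary position.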
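The script leaves two ideas commented out: percent-encoding the keyword and passing a timeout to urlopen. A minimal sketch of how they might be combined is shown here. The names build_search_url and fetch_page, the retry count, the delay, and the opener parameter (present only so the function can be exercised without network access) are all assumptions for illustration, not part of the original script. Whether the site expects a singly or doubly encoded keyword is not stated in the original, so plain quote() is an assumption as well.

```python
import time
import urllib.request
from urllib.parse import quote

# URL prefix taken from the script above.
BASE_URL = "https://search.51job.com/list/020000,000000,0000,00,9,99,"


def build_search_url(keyword, page):
    """Build the URL of one result page; quote() percent-encodes non-ASCII text."""
    return BASE_URL + quote(keyword) + ",2," + str(page) + ".html"


def fetch_page(url, timeout=5.0, retries=3, delay=1.0, opener=urllib.request.urlopen):
    """Fetch a page and decode it as GBK, retrying on network errors.

    Assumes retries >= 1. URLError and socket timeouts are both OSError
    subclasses, so one except clause covers them.
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            with opener(url, timeout=timeout) as response:
                return response.read().decode("gbk")
        except OSError as err:
            last_error = err
            if attempt < retries:
                time.sleep(delay)  # brief pause before the next attempt
    raise last_error


if __name__ == "__main__":
    print(build_search_url("Python", 1))
```

A call such as fetch_page(build_search_url("Python", 1)) would replace the bare urlopen(url) line in the loop.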

    2: The scraped output looks like this

    ----------------第1页-----------------
    ('自动化测试工程师Selenium', 'https://jobs.51job.com/shanghai-ypq/114603381.html?s=01&t=5', '1-1.5万/月')
    ('大数据研发工程师', 'https://jobs.51job.com/shanghai/67963188.html?s=01&t=6', '')
    ('Python爬虫工程师', 'https://jobs.51job.com/shanghai-pdxq/121129060.html?s=01&t=0', '1-1.5万/月')
    ('Python高级开发工程师', 'https://jobs.51job.com/shanghai-pdxq/114332244.html?s=01&t=0', '2-4万/月')
    ('Python爬虫工程师', 'https://jobs.51job.com/shanghai/120028078.html?s=01&t=0', '1-1.5万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-xhq/119981428.html?s=01&t=0', '1-1.5万/月')
    ('python开发工程师/大数据建模', 'https://jobs.51job.com/shanghai-hpq/114718480.html?s=01&t=0', '6-8千/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai/120395604.html?s=01&t=0', '1.2-1.5万/月')
    ('C/C++/Python开发工程师', 'https://jobs.51job.com/shanghai-xhq/120909208.html?s=01&t=0', '15-20万/年')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/89716807.html?s=01&t=0', '1-1.5万/月')
    ('erlang/python服务器开发工程师', 'https://jobs.51job.com/shanghai-xhq/98416948.html?s=01&t=0', '1.5-3.5万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120657799.html?s=01&t=0', '1.5-2万/月')
    ('Python开发工程师 (MJ000231)', 'https://jobs.51job.com/shanghai-jaq/117808653.html?s=01&t=0', '0.6-1万/月')
    ('python开发工程师-A0122', 'https://jobs.51job.com/shanghai-jaq/119864919.html?s=01&t=0', '1-1.5万/月')
    ('高级python后端工程师(AI平台)', 'https://jobs.51job.com/shanghai-ypq/120959109.html?s=01&t=0', '1.5-2万/月')
    ('初级Python工程师', 'https://jobs.51job.com/shanghai-pdxq/116357032.html?s=01&t=0', '10-15万/年')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai/115980776.html?s=01&t=0', '')
    ('python数据分析', 'https://jobs.51job.com/shanghai-bsq/120583326.html?s=01&t=0', '1.5-2.5万/月')
    ('Python开发工程师', 'https://jobs.51job.com/hefei/121179694.html?s=01&t=0', '0.8-1.2万/月')
    ('Python开发专家', 'https://jobs.51job.com/shanghai-mhq/121173046.html?s=01&t=0', '3-3.5万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-jaq/120911076.html?s=01&t=0', '1-1.5万/月')
    ('python Web开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120283479.html?s=01&t=0', '1-1.3万/月')
    ('python web后台开发', 'https://jobs.51job.com/shanghai-cnq/119461330.html?s=01&t=0', '0.6-1.5万/月')
    ('Python开发工程师(金融科技)', 'https://jobs.51job.com/shanghai-xhq/117251350.html?s=01&t=0', '0.7-1.5万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-ypq/107196999.html?s=01&t=0', '0.8-1.5万/月')
    ('Python高级软件工程师', 'https://jobs.51job.com/shanghai-mhq/120966514.html?s=01&t=0', '21-35万/年')
    ('P0076-python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/121163335.html?s=01&t=0', '1-2万/月')
    ('Python 应用开发工程师', 'https://jobs.51job.com/shanghai-cnq/121160736.html?s=01&t=0', '0.7-1.3万/月')
    ('软件开发工程师(GO/Lua/python)', 'https://jobs.51job.com/shanghai-jaq/119529417.html?s=01&t=0', '1.5-2.5万/月')
    ('运维开发工程师(Python开发)', 'https://jobs.51job.com/shanghai/119187608.html?s=01&t=0', '2.5-4万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/98528369.html?s=01&t=0', '1-1.5万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-jaq/120386104.html?s=01&t=0', '1-2.2万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-mhq/118338654.html?s=01&t=0', '0.8-1万/月')
    ('PFS-Python Developer', 'http://durrgroup.51job.com/jobinfo1.html?id=120187451', '')
    ('python数据分析师', 'https://jobs.51job.com/shanghai-pdxq/112471902.html?s=01&t=0', '1-1.5万/月')
    ('25923-Python高级工程师(深圳)', 'https://jobs.51job.com/shanghai-hpq/118208611.html?s=01&t=0', '')
    ('硬件工程师(python)', 'https://jobs.51job.com/shanghai/120159337.html?s=01&t=0', '')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-ypq/121106633.html?s=01&t=0', '1.3-1.9万/月')
    ('Python后端开发工程师', 'https://jobs.51job.com/shanghai-xhq/118007556.html?s=01&t=0', '1.2-1.8万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120944265.html?s=01&t=0', '1-1.5万/月')
    ('Python全栈工程师', 'https://jobs.51job.com/shanghai-mhq/109925368.html?s=01&t=0', '0.8-1.6万/月')
    ('python开发', 'https://jobs.51job.com/shanghai-hkq/117623816.html?s=01&t=0', '2-3万/月')
    ('高级Python/Django后端软件工程师', 'https://jobs.51job.com/shanghai-cnq/109764789.html?s=01&t=0', '1.5-2万/月')
    ('软件工程师(汽车行业优先)精通python', 'https://jobs.51job.com/shanghai-pdxq/120189871.html?s=01&t=0', '0.8-2万/月')
    ('Python 爬虫工程师(薪智)', 'https://jobs.51job.com/shanghai-mhq/119329837.html?s=01&t=0', '1.5-2万/月')
    ('高级Python开发工程师', 'https://jobs.51job.com/shanghai/105294644.html?s=01&t=0', '2.5-5万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai/101535573.html?s=01&t=0', '1.5-2万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/115644674.html?s=01&t=0', '1.5-2.2万/月')
    ('Python开发经理', 'https://jobs.51job.com/shanghai-pdxq/114043106.html?s=01&t=0', '2-2.5万/月')
    ('Python高级开发工程师', 'https://jobs.51job.com/shanghai-pdxq/118673386.html?s=01&t=0', '1.8-3万/月')
    ('高级Python开发工程师', 'https://jobs.51job.com/shanghai-mhq/98639401.html?s=01&t=0', '1.5-2.5万/月')
    ('python开发', 'https://jobs.51job.com/shanghai-hkq/117622468.html?s=01&t=0', '2-3万/月')
    ----------------第2页-----------------
    ('python工程师', 'https://jobs.51job.com/shanghai-sjq/120620177.html?s=01&t=0', '0.8-2万/月')
    ('Python(Odoo)工程师', 'https://jobs.51job.com/shanghai-pdxq/119746269.html?s=01&t=0', '1-1.5万/月')
    ('Python/Odoo高级开发工程师', 'https://jobs.51job.com/shanghai/116881344.html?s=01&t=0', '2.5-3万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/116927659.html?s=01&t=0', '0.8-1.5万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-cnq/120895491.html?s=01&t=0', '1-1.5万/月')
    ('Python高级开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120349450.html?s=01&t=0', '1.5-2万/月')
    ('中级后端工程师(python/odoo)', 'https://jobs.51job.com/shanghai-sjq/107147430.html?s=01&t=0', '1.5-2万/月')
    ('实习生(Python开发)', 'https://jobs.51job.com/shanghai/119951678.html?s=01&t=0', '1-5千/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-ypq/120474003.html?s=01&t=0', '0.8-1.2万/月')
    ('Python后端开发', 'https://jobs.51job.com/shanghai-xhq/121023917.html?s=01&t=0', '1.5-2.5万/月')
    ('Python软件工程师', 'https://jobs.51job.com/shanghai-xhq/117038800.html?s=01&t=0', '1-1.5万/月')
    ('初级python/R 工程师', 'https://jobs.51job.com/shanghai-xhq/116055667.html?s=01&t=0', '0.5-1万/月')
    ('python', 'https://jobs.51job.com/shanghai-pdxq/120778793.html?s=01&t=0', '6.5-9.5千/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120941553.html?s=01&t=0', '0.8-1.5万/月')
    ('python开发项目经理', 'https://jobs.51job.com/shanghai-ypq/117944790.html?s=01&t=0', '2.6-4万/月')
    ('Python/PHP后端程序员', 'https://jobs.51job.com/shanghai-xhq/114897348.html?s=01&t=0', '1-1.5万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/119493296.html?s=01&t=0', '15-25万/年')
    ('Python开发工程师(外汇岗位)', 'https://jobs.51job.com/shanghai-pdxq/120797741.html?s=01&t=0', '1-1.6万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/112330285.html?s=01&t=0', '1-1.7万/月')
    ('Python开发工程师(***)', 'https://jobs.51job.com/shanghai/120694247.html?s=01&t=0', '1000元/天')
    ('python开发', 'https://jobs.51job.com/shanghai-hkq/117614339.html?s=01&t=0', '2-3万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/118308129.html?s=01&t=0', '1.1-2万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/119981957.html?s=01&t=0', '1.5-2万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shenzhen/118443903.html?s=01&t=0', '3-4万/月')
    ('Python高级开发工程师', 'https://jobs.51job.com/shanghai-sjq/120173104.html?s=01&t=0', '1.5-2万/月')
    ('python 工程师', 'https://jobs.51job.com/shanghai-xhq/120570442.html?s=01&t=0', '1.5-2万/月')
    ('Python高级开发工程师', 'https://jobs.51job.com/shanghai-mhq/120105386.html?s=01&t=0', '1.5-3万/月')
    ('软件开发工程师(Python)', 'https://jobs.51job.com/shenzhen-nsq/118492627.html?s=01&t=0', '0.6-1.2万/月')
    ('python开发', 'https://jobs.51job.com/shanghai-pdxq/118924590.html?s=01&t=0', '6-8千/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-xhq/115691023.html?s=01&t=0', '1.3-1.8万/月')
    ('Python工程师', 'https://jobs.51job.com/shanghai-cnq/119488897.html?s=01&t=0', '1.5-2万/月')
    ('python开发', 'https://jobs.51job.com/shanghai-xhq/115310459.html?s=01&t=0', '6-8千/月')
    ('大数据算法开发/Python开发', 'https://jobs.51job.com/shanghai-pdxq/120055740.html?s=01&t=0', '1.5-2万/月')
    ('Python开发(09)', 'https://jobs.51job.com/shanghai-pdxq/120786708.html?s=01&t=0', '0.8-1.1万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120472220.html?s=01&t=0', '1-1.5万/月')
    ('Python工程师', 'https://jobs.51job.com/shanghai/119281603.html?s=01&t=0', '2-3.5万/月')
    ('Python高级开发工程师', 'https://jobs.51job.com/shanghai-hkq/119893009.html?s=01&t=0', '1.5-2万/月')
    ('Python运维开发工程师', 'https://jobs.51job.com/shanghai-mhq/119681490.html?s=01&t=0', '1.5-2万/月')
    ('Python工程师', 'https://jobs.51job.com/shanghai/102367533.html?s=01&t=0', '1.5-2万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-ypq/120946896.html?s=01&t=0', '1.5-2万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-hpq/109016702.html?s=01&t=0', '2.3-2.8万/月')
    ('Senior Python Software Engineer', 'https://jobs.51job.com/shanghai/119066163.html?s=01&t=0', '1.5-3万/月')
    ('高级软件工程师  Golang/Python', 'https://jobs.51job.com/shanghai-cnq/120220700.html?s=01&t=0', '2-6万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai/114088389.html?s=01&t=0', '1-1.5万/月')
    ('Python 开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120139836.html?s=01&t=0', '0.8-1.5万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/115129775.html?s=01&t=0', '1.5-2万/月')
    ('python开发', 'https://jobs.51job.com/shanghai-hkq/117611952.html?s=01&t=0', '2-3万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/110125924.html?s=01&t=0', '1-1.5万/月')
    ('Python开发工程师', 'https://jobs.51job.com/shanghai-xhq/118610557.html?s=01&t=0', '1-2万/月')
    ('Python 架构', 'https://jobs.51job.com/shanghai-pdxq/120317481.html?s=01&t=0', '2.5-3.5万/月')
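Because the script writes str() of each tuple, every line of jobs.txt is a Python tuple literal, and the page headers are printed but not written. One way to read the file back is ast.literal_eval, sketched here. The helper names parse_line and load_jobs, the dictionary keys, and the default encoding are assumptions for illustration. Pass whatever encoding the file was actually written with.

```python
import ast


def parse_line(line):
    """Turn one saved line, e.g. "('title', 'url', 'salary')", into a dict."""
    # literal_eval only evaluates Python literals, so it is safer than eval().
    title, url, salary = ast.literal_eval(line.strip())
    return {"title": title, "url": url, "salary": salary}


def load_jobs(path, encoding="utf-8"):
    """Read every non-empty line of the file written by the scraper."""
    with open(path, encoding=encoding) as f:
        return [parse_line(line) for line in f if line.strip()]


# One row taken from the output above.
sample = "('Python爬虫工程师', 'https://jobs.51job.com/shanghai-pdxq/121129060.html?s=01&t=0', '1-1.5万/月')"
record = parse_line(sample)
print(record["title"], record["salary"])
```

Writing the rows with the csv module instead of str() would avoid this round trip, at the cost of changing the file format the original script produces.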
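The salary field in the output arrives as free text in several forms: 万/月 (ten thousand yuan per month), 千/月 (thousand yuan per month), 万/年 (ten thousand yuan per year), 元/天 (yuan per day), or an empty string. A rough sketch of normalizing it to yuan amounts follows. The function name parse_salary and the returned tuple shape are assumptions for illustration, and only the formats visible in the output above are handled. Anything else returns None.

```python
import re
from decimal import Decimal

# Multipliers for the amount units seen in the output above.
UNIT_FACTORS = {"万": 10000, "千": 1000, "元": 1}

# Examples: "1-1.5万/月", "6-8千/月", "15-20万/年", "1000元/天"
SALARY_RE = re.compile(r"^(\d+(?:\.\d+)?)(?:-(\d+(?:\.\d+)?))?(万|千|元)/(月|年|天)$")


def parse_salary(text):
    """Return (low_yuan, high_yuan, period) or None if the text is not recognized.

    period is kept as the original character: 月 (month), 年 (year) or 天 (day).
    """
    match = SALARY_RE.match(text.strip())
    if match is None:
        return None
    low, high, unit, period = match.groups()
    factor = UNIT_FACTORS[unit]
    # Decimal avoids binary floating-point rounding on values such as 0.7.
    low_yuan = int(Decimal(low) * factor)
    high_yuan = int(Decimal(high) * factor) if high else low_yuan
    return low_yuan, high_yuan, period


for text in ["1-1.5万/月", "6-8千/月", "15-20万/年", "1000元/天", ""]:
    print(repr(text), parse_salary(text))
```

The period is returned rather than converted, since turning yearly or daily figures into monthly ones would require assumptions the listings do not state.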
  • Original article: https://www.cnblogs.com/Jeffrey-xu/p/12657909.html