  • Python: Scraping Job Postings from Lagou

    If Lagou changes its HTML pages, the code will need to be adjusted accordingly.

    The code is as follows:

    #!/usr/bin/env python3
    #coding:utf-8
    import sys
    import json
    import requests
    
    """
    Usage:
            python3 lagou.py  <Number> <positionName>
    
    """
    
    def get_jobs(pn=1,kw="python"):
            url = 'https://www.lagou.com/jobs/positionAjax.json?needAddtionalResult=false'
            payload = {"first":"false","pn":pn,"kd":kw}
            #payload = {'needAddtionalResult':'false'}
            rr = requests.post(url,data=payload)
            #print(r.json())
            jobs_data = rr.json()
            #print(jobs_data["content"]["positionResult"]["result"][0])
            jobs = jobs_data["content"]["positionResult"]["result"]
    
            for i in jobs:
                    print("Education: " + i["education"])
                    print("City: " + i["city"])
            #       print("Benefits: " + i["companyLabelList"])
                    print("Salary: " + i["salary"])
                    print("Position: " + i["positionName"])
                    print("Advantages: " + i["positionAdvantage"])
                    print("Experience: " + i["workYear"])
                    print("Link: " + "https://www.lagou.com/jobs/" + str(i["positionId"]) + ".html")
                    print()
    
    if __name__ == '__main__':
            # pn is the page number, kw is the search keyword
            pn = sys.argv[1]
            kw = sys.argv[2]
            get_jobs(pn,kw)
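
    The listing above indexes straight into the JSON response and leaves the companyLabelList line commented out. The following is a minimal sketch, not part of the original script, of how the extraction and printing could be made more tolerant of a changed or incomplete response. It assumes that companyLabelList is a list of strings (or missing), which would explain why concatenating it to a string fails. The helper names extract_jobs and format_job are hypothetical.

```python
def extract_jobs(data):
    """Return the list of job dicts, or [] if the expected keys are missing."""
    node = data
    # Walk the same path the original script uses:
    # data["content"]["positionResult"]["result"]
    for key in ("content", "positionResult", "result"):
        if not isinstance(node, dict) or key not in node:
            return []
        node = node[key]
    return node if isinstance(node, list) else []


def format_job(job):
    """Build the printable text for one job dict, tolerating missing fields."""
    # Assumption: companyLabelList is a list of strings, or None/absent.
    labels = job.get("companyLabelList") or []
    lines = [
        "Education: " + str(job.get("education", "")),
        "City: " + str(job.get("city", "")),
        "Benefits: " + ", ".join(str(label) for label in labels),
        "Salary: " + str(job.get("salary", "")),
        "Position: " + str(job.get("positionName", "")),
        "Experience: " + str(job.get("workYear", "")),
        "Link: https://www.lagou.com/jobs/" + str(job.get("positionId", "")) + ".html",
    ]
    return "\n".join(lines)
```

    Inside get_jobs, the loop could then read `for job in extract_jobs(rr.json()): print(format_job(job))`, so that an unexpected response yields no output instead of a KeyError.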
  • Original article: https://www.cnblogs.com/linyouyi/p/11409869.html