  • Fetching Baidu Promotion (百度推广) data with requests

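    First blog post of 2019. Well, it has been a long time since I wrote one. I, uh, forgot, haha. Honestly I'm just too lazy.

    Today I'm writing a crawler that pulls data from Baidu Promotion. Mine is certainly not perfect, but it works (crying).

    Note: sometimes a GET or POST request fails with an SSL error because the site's certificate cannot be verified. I haven't worked out exactly why. Every answer I found online says to pass verify=False, which turns off certificate verification, and I was too lazy to analyze it further; look into it if you're interested. With that setting, every GET or POST prints a warning, so add this: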

    # Suppress the warnings urllib3 emits for unverified HTTPS requests
    requests.packages.urllib3.disable_warnings()
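disable_warnings() silences the warning for the whole process. A narrower option, sketched here on the assumption that you only want to hide it around specific calls, is the standard library's warnings.catch_warnings() context manager. DemoWarning, noisy_call and call_quietly are names invented for this example so that it runs without network access; in the crawler the category would be urllib3's InsecureRequestWarning.

```python
import warnings


class DemoWarning(Warning):
    """Hypothetical stand-in for urllib3.exceptions.InsecureRequestWarning."""


def noisy_call():
    # Stand-in for session.post(..., verify=False), which would emit the warning.
    warnings.warn("unverified HTTPS request", DemoWarning)
    return "ok"


def call_quietly(func, category):
    """Run func() with warnings of `category` ignored only for this call."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category)
        return func()


result = call_quietly(noisy_call, DemoWarning)
```

The filter is restored when the with block exits, so warnings raised elsewhere in the program still show up.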

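    Here is the code:
    import json

    import requests

    import setting  # local config module holding Cookie, userid, reqid, eventId and token

    def main():
        # Open a session
        main_session = requests.Session()
        return main_session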

    def get_cost_info(main_session, endtime, flag1, flag2):
        # Fetch cost data split by province or by city, aggregated by day or by month.
        # flag1: 'province' splits by province, anything else by province-city.
        # flag2: 'month' aggregates by month, anything else by day.
        if flag1 == 'province':
            splitDimension = "provinceName"
        else:
            splitDimension = "provinceCityName"
        if flag2 == 'month':
            unitOfTime = 3
        else:
            unitOfTime = 5
        # Content-Length is omitted: requests computes it from the body.
        headers = {
            'Accept': 'application/json',
            'Accept-Encoding': 'gzip, deflate',
            'Accept-Language': 'zh-CN,zh;q=0.8',
            'Connection': 'keep-alive',
            'Content-Type': 'application/x-www-form-urlencoded',
            'Cookie': setting.Cookie,
            'DNT': '1',
            'Host': 'fengchao.baidu.com',
            'Origin': 'https://fengchao.baidu.com',
            'Referer': 'https://fengchao.baidu.com/fc/report/dashboard/user/%s/account' % setting.userid,
            'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 BIDUBrowser/8.7 Safari/537.36',
            'X-DevTools-Emulate-Network-Conditions-Client-Id': '87F3C66D-3166-46F6-8B46-141057354EBC'
        }
        params = {
            "userId": setting.userid,
            "ids": [setting.userid],
            "idType": 2,
            "splitDimension": splitDimension,
            "limit": [0, 1000],
            "sortRules": [],
            "levelOfDetails": 211,
            "startDate": "2019-01-01",
            "endDate": endtime,
            "predicateConditions": [],
            "unitOfTime": unitOfTime,
            "columns": ["userId", "date", "accountName", "impression", "click", "cost", "cpc", "ctr", "conversion", "phoneConversion", "bridgeConversion"]
        }
        # The report parameters travel as a JSON string inside an ordinary form field.
        formdata = {
            'reqid': setting.reqid,
            'eventId': setting.eventId,
            'userid': setting.userid,
            'token': setting.token,
            'path': 'mars/GET/AccountReportDataService/getAccountDataCenterReportData',
            'params': json.dumps(params)
        }
        # The URL was copied from the browser's F12 developer tools (Network tab)
        url_1 = 'https://fengchao.baidu.com/hairuo/request.ajax?path=mars/GET/AccountReportDataService/getAccountDataCenterReportData&reqid=%s' % formdata['reqid']
        cont_1 = main_session.post(url_1, headers=headers, data=formdata, verify=False)
        datas = cont_1.json()
        cont_list_1 = datas['data']['ACCOUNT']['rows']
        # Rows collected across every account in the response
        cont_list = []
        for i in cont_list_1:
            cont_list_2 = i['subRows']
            for j in cont_list_2:
                if flag2 == 'month':
                    # '2019-01-01' becomes '2019年1月份' (year 2019, month 1)
                    time_1 = j['date'][0:7]
                    time_list1 = time_1.split('-')
                    date = '%s年%s月份' % (time_list1[0], str(int(time_list1[1])))
                else:
                    date = j['date']
                zhanghu = j['accountName']  # account name
                province, city = '', ''
                if flag1 == 'province':
                    province = j['provinceName']
                else:
                    # The value has the form 'province-city'
                    city_list = j['provinceCityName'].split('-')
                    province, city = city_list[0], city_list[1]
                zhanxian = str(j['impression'])  # impressions
                dianji = str(j['click'])  # clicks
                xiaofei = str(j['cost'])  # cost
                pinjunjiage = '%.2f' % j['cpc']  # average cost per click
                dianjilv = '{:.2%}'.format(j['ctr'])  # click-through rate as a percentage
                wangyezhuanhua = str(j['conversion'])  # web page conversions
                dianhuazhuanhua = str(j['phoneConversion'])  # phone conversions
                shangqiaozhuanhua = str(j['bridgeConversion'])  # bridge (商桥) conversions
                # city is parsed above but is not part of the output row
                cont_list.append([date, zhanghu, province, zhanxian, dianji, xiaofei, pinjunjiage, dianjilv, wangyezhuanhua, dianhuazhuanhua, shangqiaozhuanhua])
        return cont_list
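The row formatting is easier to check offline when it is pulled into its own function. The sketch below does that; format_row is a hypothetical helper name, and the sample dictionary is made-up data that only imitates the fields used in the code, not real output from the Baidu endpoint. The column order mirrors the list built in the code.

```python
def format_row(j, flag1, flag2):
    """Format one report sub-row (a dict) as a list of display strings."""
    if flag2 == 'month':
        # '2019-01-01' -> '2019年1月份' (year 2019, month 1)
        year, month = j['date'][0:7].split('-')
        date = '%s年%s月份' % (year, int(month))
    else:
        date = j['date']
    if flag1 == 'province':
        province = j['provinceName']
    else:
        # 'province-city' -> keep the province part, as in the output row
        province = j['provinceCityName'].split('-')[0]
    return [
        date,
        j['accountName'],
        province,
        str(j['impression']),
        str(j['click']),
        str(j['cost']),
        '%.2f' % j['cpc'],            # two decimal places
        '{:.2%}'.format(j['ctr']),    # ratio rendered as a percentage
        str(j['conversion']),
        str(j['phoneConversion']),
        str(j['bridgeConversion']),
    ]


# Made-up sample row, for illustration only.
sample = {
    'date': '2019-01-01', 'accountName': 'demo-account',
    'provinceCityName': '广东-深圳',
    'impression': 100, 'click': 7, 'cost': 12.5, 'cpc': 1.7857, 'ctr': 0.07,
    'conversion': 0, 'phoneConversion': 0, 'bridgeConversion': 0,
}
row = format_row(sample, 'city', 'month')
```

Keeping the network call and the formatting apart means the date, price and percentage formats can be tested without a valid cookie.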

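    In my testing, the cookie usually stays valid for about 7 days. After that you have to log in again in the browser and copy a fresh cookie from the F12 developer tools.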
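One way to avoid being surprised by an expired cookie is to record when it was copied and check its age before each run. This is a minimal sketch: cookie_is_stale and COOKIE_MAX_AGE are hypothetical names, and the 7-day limit is only the lifetime observed above, not a documented value.

```python
from datetime import datetime, timedelta

# Assumption: about 7 days, the lifetime observed in testing.
COOKIE_MAX_AGE = timedelta(days=7)


def cookie_is_stale(saved_at, now=None, max_age=COOKIE_MAX_AGE):
    """Return True when a cookie copied at `saved_at` has probably expired."""
    if now is None:
        now = datetime.now()
    return now - saved_at >= max_age


# Example: a cookie copied on 1 January 2019 is checked on 9 January 2019.
if cookie_is_stale(datetime(2019, 1, 1), now=datetime(2019, 1, 9)):
    print("Cookie is probably expired: log in again and copy a new one from F12.")
```

The timestamp could live next to the cookie in the setting module, so both are updated together.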

  • Original post: https://www.cnblogs.com/wozuilang-mdzz/p/10736115.html