zoukankan      html  css  js  c++  java
  • web scraper——爬取知乎|微博用户数据模板【三】

    前言

    在这里呢,我就只给模板,不写具体的教程啦,具体的可以参考我之前写的博文。
    https://www.cnblogs.com/wangyang0210/p/10338574.html

    模板

    1. 进入微博选择粉丝较多的博主

    2. 复制下面的模板导入站点即可

    3. 修改地址,编辑好名称,点击Import Sitemap即可

    微博

    {"_id":"weibo_chenglong","startUrl":["https://weibo.com/p/1006051234552257/follow?relate=fans&page=[1-5]"],"selectors":[{"id":"userinfo","type":"SelectorElement","parentSelectors":["_root"],"selector":"li.follow_item","multiple":true,"delay":6},{"id":"username","type":"SelectorText","parentSelectors":["userinfo"],"selector":"a.S_txt1","multiple":false,"regex":"","delay":0},{"id":"avatar","type":"SelectorImage","parentSelectors":["userinfo"],"selector":"img","multiple":false,"delay":0},{"id":"city","type":"SelectorText","parentSelectors":["userinfo"],"selector":"div.info_add span","multiple":false,"regex":"","delay":0}]}
    

    知乎

    {"_id":"zhihuranqiqigongzuoshi","startUrl":["https://www.zhihu.com/people/xie-ling-520/followers?page=[1-45]"],"selectors":[{"id":"list","type":"SelectorElement","parentSelectors":["_root"],"selector":"div.List-item","multiple":true,"delay":0},{"id":"username","type":"SelectorText","parentSelectors":["list"],"selector":"div.UserItem-title","multiple":false,"regex":"","delay":0},{"id":"avatar","type":"SelectorImage","parentSelectors":["list"],"selector":"img","multiple":false,"delay":0}]}
    
  • 相关阅读:
    JS基础_自增和自减
    计算机组成原理
    SyntaxHighlighter
    10个经典的C语言面试基础算法及代码
    知名互联网公司面试题
    计算机网络基础知识(笔试题)
    面试准备之常见上机题目搜罗
    小米2013年校园招聘笔试题-简单并查集
    2014华为上机试题
    C++学习笔记
  • 原文地址:https://www.cnblogs.com/wangyang0210/p/11115690.html
Copyright © 2011-2022 走看看