  • A Douban auto-reply bot in Java

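    I've been writing crawlers for my boss lately, and when I got tired of it I went looking for something fun. I happen to browse Douban a lot, so I decided to write a bot that replies to posts automatically. Without further ado, let's get to it: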

    It mainly uses two open-source tools: Jsoup and HttpClient.

    Step 1: Simulate the login

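    // Imports needed by the snippets in this post (HttpClient 4.3+)
    import java.io.*;
    import java.util.*;
    import org.apache.http.NameValuePair;
    import org.apache.http.client.entity.UrlEncodedFormEntity;
    import org.apache.http.client.methods.CloseableHttpResponse;
    import org.apache.http.client.methods.HttpPost;
    import org.apache.http.message.BasicNameValuePair;

    public static boolean login() throws IOException {
        // Download the captcha image to a local file (note the escaped backslash)
        String captcha_id = downloadPic(login_url, "D:\\yz.png");
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        System.out.println("Enter the captcha:");
        String yan = br.readLine();
        HttpPost httppost = new HttpPost(login_url);
        List<NameValuePair> params = new ArrayList<NameValuePair>();
        params.add(new BasicNameValuePair("captcha-id", captcha_id)); // find it yourself with Firebug
        params.add(new BasicNameValuePair("captcha-solution", yan)); // the captcha text
        params.add(new BasicNameValuePair("form_email", form_email)); // user name
        params.add(new BasicNameValuePair("form_password", form_password)); // password
        params.add(new BasicNameValuePair("redir", redir)); // redirect target, usually your own home page
        params.add(new BasicNameValuePair("source", "main"));
        params.add(new BasicNameValuePair("login", "登录")); // label of the login button, sent as-is
        // Encode as UTF-8 explicitly; HttpClient 4.x defaults to ISO-8859-1, which would mangle the value above
        httppost.setEntity(new UrlEncodedFormEntity(params, "utf-8"));
        CloseableHttpResponse response = httpclient.execute(httppost); // send the POST
        try {
            int status_code = response.getStatusLine().getStatusCode(); // status code returned by the server
            if (status_code != 302) {
                System.err.println("Login failed");
                return false;
            }
            System.out.println("Login succeeded");
            return true;
        } finally {
            // Release resources on the failure path as well
            response.close();
            httppost.releaseConnection();
        }
    }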
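    The login request above is an ordinary form POST. As a rough illustration of what such a form body looks like once it is encoded, here is a sketch that uses only the JDK (Java 10 or later). The class name FormBodySketch and all sample values are made up for this example, and it is not how HttpClient itself is implemented. UTF-8 is assumed as the charset, which matters for non-ASCII values such as the login button label.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
class FormBodySketch {

    /** Encodes name/value pairs as application/x-www-form-urlencoded text, in insertion order. */
    static String encodeForm(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&'); // pairs are separated by '&'
            }
            body.append(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8))
                .append('=')
                .append(URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("captcha-solution", "hello world");  // placeholder captcha text
        fields.put("form_email", "user@example.com");   // placeholder account
        fields.put("login", "\u767b\u5f55");            // the login button label, written as Unicode escapes
        System.out.println(encodeForm(fields));
    }
}
```

    Spaces become "+", "@" becomes "%40", and each Chinese character becomes three percent-encoded UTF-8 bytes, so the server only decodes the label correctly if both sides agree on the charset.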

    Step 2: Use the Firebug add-on for Firefox to see which parameters are POSTed to the server when you reply

    Usually there are these 4 parameters: ck, rv_comment, start, submit_btn
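    Of these four, ck is the one that has to be looked up for your own session. Instead of copying it by hand from Firebug, it might be read from the page HTML. The sketch below assumes (the original post does not confirm this) that the reply form carries the value in a hidden input named ck, with the name attribute before the value attribute. The class CkExtractor and the sample markup are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class CkExtractor {

    // Assumed markup: <input type="hidden" name="ck" value="...">
    private static final Pattern CK_INPUT =
            Pattern.compile("<input[^>]*name=\"ck\"[^>]*value=\"([^\"]*)\"");

    /** Returns the ck value if the assumed hidden input is present in the HTML. */
    static Optional<String> extractCk(String html) {
        Matcher m = CK_INPUT.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String sample = "<form><input type=\"hidden\" name=\"ck\" value=\"xNxg\"/></form>";
        System.out.println(extractCk(sample).orElse("(ck not found)"));
    }
}
```

    Since the project already uses Jsoup, a selector such as input[name=ck] would probably be a more robust way to do the same lookup than a regular expression.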

    The posting code is as follows:

    public static boolean startPost(String url) { // url is the address of the topic
        try {
            String html = getPageHtml(url);
            // Douban's "what you are looking for is not here" page: the topic does not exist
            if (html.contains("你想要的东西不在这儿")) {
                return false;
            }
            // "The group admin has disabled replies for this topic"
            if (html.contains("该话题已被小组管理员设为不允许回应")) {
                return false;
            }
            // "Please enter the word shown in the image above": a captcha is being requested
            if (html.contains("请输入上图中的单词")) {
                System.out.println("Captcha required, pausing for 10 minutes");
                Thread.sleep(600000);
                return false;
            }
            HttpPost httppost = new HttpPost(url + "add_comment#last");
            httppost.addHeader("Connection", "keep-alive");
            List<NameValuePair> params2 = new ArrayList<NameValuePair>();
            // This parameter is essential: look up your own value with Firebug, otherwise the reply will not be posted
            params2.add(new BasicNameValuePair("ck", "xNxg"));
            params2.add(new BasicNameValuePair("rv_comment", "your comment text"));
            params2.add(new BasicNameValuePair("start", "0"));
            params2.add(new BasicNameValuePair("submit_btn", "加上去")); // label of the submit button, sent as-is
            httppost.setEntity(new UrlEncodedFormEntity(params2, "utf-8"));
            CloseableHttpResponse response = httpclient.execute(httppost);
            int status_code = response.getStatusLine().getStatusCode();
            if (status_code == 302) {
                System.out.println("Comment posted: " + url);
            } else {
                System.out.println("Comment failed: " + url);
            }
            httppost.releaseConnection();
            Thread.sleep(1500); // short pause between replies
        } catch (Exception e) {
            return false;
        }
        return true;
    }
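    The checks at the top of startPost decide, from the page text alone, whether a reply should be attempted at all. One way to make that decision easier to test is to pull it into a small pure function, sketched below. The class TopicPageCheck, the State enum, and the sample URL are hypothetical; the marker strings are the ones used in the code above. The commentUrl helper also tolerates a topic URL without a trailing slash, which the plain url + "add_comment#last" concatenation does not, and it leaves off the #last fragment because URL fragments are not sent to the server.

```java
// Hypothetical helper, for illustration only.
class TopicPageCheck {

    enum State { NOT_FOUND, REPLIES_DISABLED, CAPTCHA_REQUIRED, OK }

    /** Classifies a topic page by the marker text it contains. */
    static State classify(String html) {
        if (html.contains("你想要的东西不在这儿")) {
            return State.NOT_FOUND;          // topic does not exist
        }
        if (html.contains("该话题已被小组管理员设为不允许回应")) {
            return State.REPLIES_DISABLED;   // admin has closed the topic to replies
        }
        if (html.contains("请输入上图中的单词")) {
            return State.CAPTCHA_REQUIRED;   // the site is asking for a captcha
        }
        return State.OK;
    }

    /** Builds the comment endpoint, adding the trailing slash if it is missing. */
    static String commentUrl(String topicUrl) {
        String base = topicUrl.endsWith("/") ? topicUrl : topicUrl + "/";
        return base + "add_comment";
    }

    public static void main(String[] args) {
        System.out.println(classify("<p>请输入上图中的单词</p>"));
        System.out.println(classify("<p>an ordinary topic page</p>"));
        // Placeholder topic id, not a real topic
        System.out.println(commentUrl("https://www.douban.com/group/topic/0000000"));
    }
}
```

    With this split, the caller only has to sleep when the state is CAPTCHA_REQUIRED and post when it is OK.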

    For the complete code, see my GitHub:   https://github.com/wqpod2g/Douban

    Thanks to the author of this post: http://www.cnblogs.com/lzzgym/p/3322685.html

  • Original article: https://www.cnblogs.com/mrpod2g/p/4176307.html