  • Use a raw socket to fetch a page from Baidu

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    import socket

    hostname = "www.baidu.com"
    addr = socket.gethostbyname(hostname)
    print(addr)

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((addr, 80))

    # Build the request; swap in "HEAD / HTTP/1.0\r\n" to fetch only the headers.
    request = "GET / HTTP/1.0\r\n"
    request += "Host: www.baidu.com\r\n"
    request += "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20100101 Firefox/17.0\r\n"
    request += "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n"
    request += "Accept-Language: zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3\r\n"
    request += "Cookie: BAIDUID=4782C3288E4A1689E0F8CBC0DF82BB1D:FG=1; BDUT=sc2x4782C3288E4A1689E0F8CBC0DF82BB1D13bda69e4000; H_PS_PSSID=1428_1667_1662\r\n"
    request += "Cache-Control: max-age=0\r\n"
    request += "\r\n"  # a blank line ends the header section
    s.sendall(request.encode("ascii"))

    first = True
    count = 0
    while True:
        msg = s.recv(40960)
        if not msg:  # empty result: the server closed the connection
            break
        count += 1
        if first:
            first = False
            # Print the response headers if they end within the first chunk.
            headpos = msg.find(b"\r\n\r\n")
            if headpos != -1:
                print(msg[:headpos].decode("iso-8859-1"))
        # Save each received chunk to its own file (1.txt, 2.txt, ...),
        # decoding it as GBK and dropping any bytes that cannot be decoded.
        with open("./%s.txt" % count, "w", encoding="utf-8") as ff:
            ff.write(msg.decode("gbk", "ignore"))
    s.close()

    # The same page can also be fetched with the standard library:
    # import urllib.request
    # print(urllib.request.urlopen("http://www.baidu.com").read())
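
The loop above saves every recv() chunk to a separate file and only looks for the end of the headers in the first chunk. As a rough sketch of another way to handle the response, the whole reply can be collected first and then split into status line, headers, and body. The helper names `recv_all` and `split_response` are hypothetical (they are not in the script above), and the demo uses `socket.socketpair()` as a stand-in for the connection to Baidu so that it runs without network access.

```python
import socket


def recv_all(sock, bufsize=4096):
    """Read from sock until the peer closes the connection; return all bytes."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # empty bytes means the peer closed the connection
            break
        chunks.append(chunk)
    return b"".join(chunks)


def split_response(raw):
    """Split a raw HTTP response into (status_line, headers, body).

    headers is a dict keyed by lower-cased header name; if a name repeats,
    the last value wins. body is returned as undecoded bytes.
    """
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


if __name__ == "__main__":
    # Offline demo: one end of a socket pair plays the server.
    client, server = socket.socketpair()
    server.sendall(b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>")
    server.close()
    status, headers, body = split_response(recv_all(client))
    client.close()
    print(status)
    print(headers)
    print(body)
```

With a real connection, `recv_all(s)` would replace the while loop after `s.sendall(...)`. Reading to end-of-stream is sufficient here because an HTTP/1.0 server closes the connection once the response is sent, and decoding the full body in one step avoids losing multi-byte characters that straddle a chunk boundary.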

  • Original article: https://www.cnblogs.com/lexus/p/2846897.html