  • Python Web - Week 3 - Networks and Sockets (Using Python to Access Web Data)

    1.Networked Programs


    1.Internet

    For now we study the Internet part, that is, what our browser does in everyday use; the client side comes later.


    2.TCP (Transmission Control Protocol)


    3.Socket


     


    Port 80 is the HTTP port, the one used to communicate with browsers.


    4.Sockets in Python

    import socket

    # Create the socket, much like opening a file
    # AF_INET: make an Internet socket; SOCK_STREAM: make a stream (TCP) socket
    mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Establish a connection between this program and port 80 on www.py4inf.com
    mysock.connect(('www.py4inf.com', 80))
    Python has built-in support for TCP sockets:
    docs.python.org/library/socket.html  
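To try the two calls above without depending on a remote host, here is a minimal sketch that runs both ends of a TCP connection on the local machine. The helper `serve_once`, the loopback address, and the upper-casing behaviour are assumptions made for illustration; they are not part of the course material.

```python
import socket
import threading


def serve_once(server_sock):
    """Accept one connection, read one message, send it back upper-cased."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


# Listening socket on the loopback interface; port 0 lets the OS pick a free port
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))
server.listen(1)
port = server.getsockname()[1]

worker = threading.Thread(target=serve_once, args=(server,))
worker.start()

# Client side: the same socket() and connect() calls, pointed at the local server
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.settimeout(5)
client.connect(('127.0.0.1', port))
client.sendall(b'hello socket')
reply = client.recv(1024)
client.close()

worker.join()
server.close()
print(reply)
```

The client half is the part that matters here: once `connect` returns, the socket behaves like a two-way pipe to whatever program is listening on the other port.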

    2.From Sockets to Applications


    1.HTTP (HyperText Transfer Protocol)


    http://www.dr-chuck.com/page1.htm

    protocol: http    host: www.dr-chuck.com    document: /page1.htm
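The same three-way split can be reproduced with the standard library. This is a small sketch using `urllib.parse.urlsplit`; the variable names `protocol`, `host`, and `document` simply mirror the labels above and are not names the library itself uses.

```python
from urllib.parse import urlsplit

# Break the example URL into the three parts labelled in the notes
parts = urlsplit('http://www.dr-chuck.com/page1.htm')
protocol = parts.scheme     # which protocol to speak
host = parts.hostname       # which machine to connect the socket to
document = parts.path       # which document to ask for in the GET line
print(protocol, host, document)
```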

    2.Sockets


    Clicking the link to the Second Page is just a socket connection: the browser connects to the host and requests the document.

    3.Hacking HTTP


    Use telnet plus a GET command to fetch a page's content (Windows 7 does not come with telnet enabled by default).

    Every page visit actually involves a dozen or two GETs: GET the HTML, GET the CSS, GET the images, and so on.
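To see where those extra GETs come from, the sketch below scans a page for the resources a browser would go on to request. The HTML in `sample_page`, the class name `ResourceCollector`, and the file names are all made up for illustration, and the rule "every `link`, `script`, and `img` triggers a GET" is a simplification of what real browsers do.

```python
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Collect the URLs of stylesheets, scripts, and images referenced by a page."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ('img', 'script') and attrs.get('src'):
            self.resources.append(attrs['src'])
        elif tag == 'link' and attrs.get('href'):
            self.resources.append(attrs['href'])


# Hypothetical page, not taken from any real site
sample_page = """<html><head>
<link rel="stylesheet" href="style.css">
<script src="app.js"></script>
</head><body>
<img src="logo.png"><img src="photo.jpg">
</body></html>"""

collector = ResourceCollector()
collector.feed(sample_page)

# One GET for the page itself, then one per referenced resource
get_lines = ['GET /page.htm'] + ['GET /' + name for name in collector.resources]
for line in get_lines:
    print(line)
```

Even this tiny page needs five requests, which is why a real page easily reaches a dozen or more.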

    3.Let's Write a Web Browser


    1.An HTTP Request in Python

    import socket

    # Create the socket, much like opening a file
    # AF_INET: make an Internet socket; SOCK_STREAM: make a stream (TCP) socket
    mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Establish a connection between this program and port 80 on www.py4inf.com
    mysock.connect(('www.py4inf.com', 80))
    # The request line must be followed by a blank line, hence the two newlines
    toSend = 'GET http://www.py4inf.com/code/romeo.txt HTTP/1.0\n\n'
    mysock.send(toSend.encode('ascii'))
    while True:
        # 65 is the buffer length; here it limits how much data is displayed per chunk
        data = mysock.recv(65)
        if len(data) < 1:
            break
        print(data)
    mysock.close()
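The loop above prints everything the server sends, response headers included. As a rough sketch of what a browser does next, the helper below joins the received chunks and separates the headers from the body. The function name `split_response` and the bytes in `chunks` are invented for this example; they stand in for real `recv()` results and are not the actual reply from www.py4inf.com.

```python
def split_response(raw):
    """Split raw HTTP response bytes into (status_line, headers, body)."""
    # The headers end at the first blank line; servers use CRLF line endings
    head, _sep, body = raw.partition(b'\r\n\r\n')
    lines = head.decode('iso-8859-1').split('\r\n')
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _colon, value = line.partition(':')
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# Made-up chunks, as recv() might return them in pieces of arbitrary size
chunks = [
    b'HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n',
    b'Content-Length: 5\r\n\r\nhe',
    b'llo',
]
raw = b''.join(chunks)
status_line, headers, body = split_response(raw)
print(status_line)
print(headers)
print(body)
```

Joining the chunks before parsing matters because a chunk boundary can fall anywhere, including in the middle of a header line.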

    2.The encoding error and how to fix it

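    In Python 3, send() expects bytes rather than str, so passing the string directly raises a TypeError. Converting it with encode fixes this:

    toSend = 'GET http://www.py4inf.com/code/romeo.txt HTTP/1.0\n\n'
    mysock.send(toSend.encode('ascii'))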
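The error and the fix can both be reproduced offline. The sketch below uses `socket.socketpair()`, which returns two already-connected sockets, as a stand-in for the connection to the web server; the variable names and the shortened request text are assumptions for illustration.

```python
import socket

# Two connected sockets on the local machine, standing in for client and server
left, right = socket.socketpair()

error_name = None
try:
    left.send('GET /\n\n')              # a str: Python 3 refuses to send it
except TypeError as exc:
    error_name = type(exc).__name__

left.send('GET /\n\n'.encode('ascii'))  # bytes: accepted
received = right.recv(64)

left.close()
right.close()
print(error_name, received)
```

On the receiving side the data is again bytes, so `received.decode('ascii')` would be the matching step to get a str back.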

    3.Making HTTP Easier With urllib

    socket sits closer to the low level than urllib; in other words, urllib is simpler to use.

    socket works at the Transport Layer, urllib at the Application Layer.

     

    Note: Python 2.x uses import urllib, but Python 3.x uses import urllib.request

    import urllib.request

    fhand = urllib.request.urlopen('http://www.py4inf.com/code/romeo.txt')
    for line in fhand:
        # Each line arrives as bytes; strip() removes the trailing newline
        print(line.strip())

    4.Like a file

    urllib turns URLs into file-like objects, so we can work with them just as we would with a file:

    import urllib.request

    fhand = urllib.request.urlopen('http://www.py4inf.com/code/romeo.txt')
    counts = dict()
    for line in fhand:
        words = line.split()
        for word in words:
            # Count how many times each word appears
            counts[word] = counts.get(word, 0) + 1
    print(counts)
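One detail worth noting: in Python 3 the lines read from urlopen are bytes, so the dictionary keys above print as b'...'. A possible variation, sketched here, decodes each line before splitting. To keep the example runnable offline, an `io.BytesIO` object with made-up text stands in for the network response; both the stand-in and the choice of UTF-8 are assumptions.

```python
import io

# Stand-in for the object returned by urlopen(); the text is invented
fake_response = io.BytesIO(b'the quick fox\nthe lazy dog\n')

counts = dict()
for line in fake_response:
    # Decode bytes to str so the dictionary keys are ordinary strings
    words = line.decode('utf-8').split()
    for word in words:
        counts[word] = counts.get(word, 0) + 1
print(counts)
```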

    Words:

    subtlety: a fine or delicate distinction (微妙)

  • Original article: https://www.cnblogs.com/moonache/p/5112060.html