  • Study notes on "Foundations of Python Network Programming": fetching a JSON document with the Google Geocoding API

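    Foundations of Python Network Programming, Third Edition. The code from the book can be found by searching for fopnp on GitHub.

    Chapter 1 of the book uses the Google Maps API to look up the longitude and latitude of an address. For well-known reasons the API may be unreachable, so the requests have to go through a proxy.

    The code in the book therefore has to be adapted to the local setup. The proxy on my machine listens on 127.0.0.1:1080. My code is below; change it to match the proxy settings on your own machine.

    Environment: Windows 10, Anaconda3, Python 3.6.3, PyCharm Edu 2017.3

    Calling a library: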

    #search1.py
    
    from pygeocoder import Geocoder
    
    if __name__ == '__main__':
        a = Geocoder()
        a.proxy = "127.0.0.1:1080"
        address = '207 N. Defiance St,Archbold,OH'
        print(a.geocode(address)[0].coordinates)

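    This sets the proxy through the proxy attribute of Geocoder (pygeocoder has to be installed with pip first), so a Geocoder instance must be created first; the result cannot be printed in a single call as in the book.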

    Application layer (requests):

    #search2.py

    import requests

    # Route both http and https traffic through the local proxy.
    proxies = {
        "http": "http://127.0.0.1:1080",
        "https": "http://127.0.0.1:1080",
    }


    def geocode(address):
        parameters = {'address': address, 'sensor': 'false'}
        base = 'http://maps.googleapis.com/maps/api/geocode/json'
        response = requests.get(base, params=parameters, proxies=proxies)
        answer = response.json()
        print(answer['results'][0]['geometry']['location'])


    if __name__ == '__main__':
        geocode('207 N. Defiance St,Archbold, OH')

    This uses the proxies parameter of requests to set the proxy.
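    Since the proxy address differs from machine to machine, it can help to build the proxies mapping in one place instead of hard-coding it in every script. A minimal sketch, assuming the address is taken from an environment variable with a fallback to 127.0.0.1:1080; the helper proxy_settings and the variable name GEOCODE_PROXY are made up for this example and are not part of requests:

```python
import os


def proxy_settings(default='http://127.0.0.1:1080'):
    """Build a requests-style proxies mapping from a single proxy URL."""
    # GEOCODE_PROXY is a hypothetical variable name chosen for this sketch.
    url = os.environ.get('GEOCODE_PROXY', default)
    # requests expects one entry per URL scheme it should proxy.
    return {'http': url, 'https': url}


proxies = proxy_settings()
print(proxies)
```

    The resulting dict can be passed as proxies=proxies to requests.get, exactly like the hard-coded one in search2.py.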

    Using the HTTP protocol:

    # search3.py
    
    import http.client
    import json
    from urllib.parse import quote_plus
    
    base = '/maps/api/geocode/json'
    
    
    def geocode(address):
        path = '{}?address={}&sensor=false'.format(base, quote_plus(address))
        connection = http.client.HTTPSConnection('127.0.0.1', 1080)
        connection.set_tunnel('map.google.com')
        connection.request('GET', path)
        rawreply = connection.getresponse().read()
        reply = json.loads(rawreply.decode('utf-8'))
        print(reply['results'][0]['geometry']['location'])
    
    
    if __name__ == '__main__':
        geocode('207 N. Defiance St,Archbold, OH')

    Running this fails with:

    Traceback (most recent call last):
      File "E:/Learn Python/Python网络编程/search3.py", line 21, in <module>
        geocode('207 N. Defiance St,Archbold, OH')
      File "E:/Learn Python/Python网络编程/search3.py", line 16, in geocode
        reply = json.loads(rawreply.decode('utf-8'))
      File "D:\Anaconda3\lib\json\__init__.py", line 354, in loads
        return _default_decoder.decode(s)
      File "D:\Anaconda3\lib\json\decoder.py", line 339, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "D:\Anaconda3\lib\json\decoder.py", line 357, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    Process finished with exit code 1
    
    

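    This is clearly a json.decoder.JSONDecodeError: the request did not return the expected data, so decoding the reply as JSON failed.

    Adding print(rawreply) shows that rawreply holds an HTML document like this: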

    b'<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
    
    <TITLE>301 Moved</TITLE></HEAD><BODY>
    <H1>301 Moved</H1>
    
    The document has moved
    
    <A HREF="https://maps.google.com/maps/api/geocode/json?address=207+N.+Defiance+St%2CArchbold%2C+OH&sensor=false">here</A>.
    
    </BODY>
    </HTML>
    
    '

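    The server returned a 301, which means the request has to be redirected. http.client does not follow redirects on its own the way a browser does. My first impression was that this is some kind of anti-crawler behaviour by Google, although the redirect points to maps.google.com while the code tunnels to map.google.com, so the differing host name is the more likely cause.

    So we extract the new URL from the reply with a regular expression (method from https://www.cnblogs.com/rj81/p/5933838.html). The modified code is as follows: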

    # search3.py
    
    import http.client
    import json
    from urllib.parse import quote_plus
    import re
    
    base = '/maps/api/geocode/json'
    
    
    def geocode(address):
        path = '{}?address={}&sensor=false'.format(base, quote_plus(address))
        connection = http.client.HTTPSConnection('127.0.0.1', 1080)
        connection.set_tunnel('map.google.com')
        connection.request('GET', path)
        rawreply = connection.getresponse().read().decode()
        # findall returns a list; the redirect URL is its first element
        newweb = re.findall(r'HREF="(.+?)"', string=rawreply)
        # print(newweb)
        connection.request('GET', newweb[0])
        rawreply = connection.getresponse().read()
        # print(path)
        # print(rawreply)
        reply = json.loads(rawreply.decode('utf-8'))
        print(reply['results'][0]['geometry']['location'])
    
    
    if __name__ == '__main__':
        geocode('207 N. Defiance St, Archbold, OH')




    This prints the correct result:

    {'lat': 41.5219645, 'lng': -84.3066496}
    
    Process finished with exit code 0

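    One thing to note: at first I assumed newweb was a str and called connection.request('GET', newweb) directly.

    That raised AttributeError: 'list' object has no attribute 'startswith'. After changing the argument to newweb[0], the output is correct.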
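    As an alternative to pulling the URL out of the HTML, a redirect reply normally also carries the new address in its Location header. A minimal sketch, assuming the caller passes in the status code, the Location header value (or None) and the decoded body; redirect_target and REDIRECT_CODES are names made up for this example and are not part of http.client:

```python
import re
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def redirect_target(status, location, body, request_url):
    """Return the absolute URL to retry, or None if the reply is not a redirect."""
    if status not in REDIRECT_CODES:
        return None
    if location:
        # Location may be relative, so resolve it against the URL we asked for.
        return urljoin(request_url, location)
    # No header: fall back to the first HREF in the HTML body.
    match = re.search(r'HREF="(.+?)"', body, flags=re.IGNORECASE)
    return urljoin(request_url, match.group(1)) if match else None


url = 'https://map.google.com/maps/api/geocode/json?address=x'
html = '<A HREF="https://maps.google.com/maps/api/geocode/json?address=x">here</A>'
print(redirect_target(301, None, html, url))
```

    With the code above it could be called as redirect_target(response.status, response.getheader('Location'), body, url), where response is the object returned by connection.getresponse() and body is its decoded content.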

    Talking to Google Maps directly over a socket:

    Ways to set up a proxy (from http://www.jb51.net/article/50510.htm):

             urllib2 (Python 2 only):

    proxy_handler = urllib2.ProxyHandler({'http' : 'http://address:port'})
    opener = urllib2.build_opener(proxy_handler, urllib2.HTTPHandler)
    urllib2.install_opener(opener)

               socket:

    import socks, socket
    socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "address", port)
    socket.socket = socks.socksocket
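    The urllib2 module only exists in Python 2. Under Python 3, the environment used for these notes, the same classes live in urllib.request. A minimal sketch of the equivalent setup, assuming an HTTP proxy at 127.0.0.1:1080; build_proxy_opener is a helper name made up for this example:

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends http and https requests through proxy_url."""
    handler = urllib.request.ProxyHandler({'http': proxy_url, 'https': proxy_url})
    # build_opener adds the usual default handlers around our ProxyHandler.
    return urllib.request.build_opener(handler)


opener = build_proxy_opener('http://127.0.0.1:1080')
# Optional: make this opener the default used by urllib.request.urlopen().
# urllib.request.install_opener(opener)
```

    Requests can then be made with opener.open(url), or with urllib.request.urlopen(url) once install_opener has been called.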

    The code for search4.py is as follows:

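    #!/usr/bin/env python3
    #search4.py
    
    import socket
    import socks
    from urllib.parse import quote_plus
    
    # Each header line must end with \r\n; the trailing backslash keeps
    # the newline of the Python source out of the request.
    request_text = """\
    GET /maps/api/geocode/json?address={}&sensor=false HTTP/1.1\r\n\
    Host: maps.google.com:80\r\n\
    User-Agent: search4.py (Foundations of Python Network Programming)\r\n\
    Connection: close\r\n\
    \r\n\
    """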
    
    
    def geocode(address):
        socks.set_default_proxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 1080)
        socket.socket = socks.socksocket
        sock = socket.socket()
        sock.connect(('maps.google.com', 80))
        request = request_text.format(quote_plus(address))
        sock.sendall(request.encode('ascii'))
        raw_reply = b''
        while True:
            more = sock.recv(4096)
            if not more:
                break
            raw_reply += more
        print(raw_reply.decode('utf-8'))
    
    
    if __name__ == '__main__':
        geocode('207 N. Defiance St, Archbold, OH')

    Output:

    HTTP/1.1 200 OK
    Content-Type: application/json; charset=UTF-8
    Date: Fri, 12 Jan 2018 07:21:20 GMT
    Expires: Sat, 13 Jan 2018 07:21:20 GMT
    Cache-Control: public, max-age=86400
    Access-Control-Allow-Origin: *
    Server: mafe
    X-XSS-Protection: 1; mode=block
    X-Frame-Options: SAMEORIGIN
    Accept-Ranges: none
    Vary: Accept-Language,Accept-Encoding
    Connection: close
    
    {
       "results" : [
          {
             "address_components" : [
                {
                   "long_name" : "207",
                   "short_name" : "207",
                   "types" : [ "street_number" ]
                },
                {
                   "long_name" : "North Defiance Street",
                   "short_name" : "N Defiance St",
                   "types" : [ "route" ]
                },
                {
                   "long_name" : "Archbold",
                   "short_name" : "Archbold",
                   "types" : [ "locality", "political" ]
                },
                {
                   "long_name" : "German Township",
                   "short_name" : "German Township",
                   "types" : [ "administrative_area_level_3", "political" ]
                },
                {
                   "long_name" : "Fulton County",
                   "short_name" : "Fulton County",
                   "types" : [ "administrative_area_level_2", "political" ]
                },
                {
                   "long_name" : "Ohio",
                   "short_name" : "OH",
                   "types" : [ "administrative_area_level_1", "political" ]
                },
                {
                   "long_name" : "United States",
                   "short_name" : "US",
                   "types" : [ "country", "political" ]
                },
                {
                   "long_name" : "43502",
                   "short_name" : "43502",
                   "types" : [ "postal_code" ]
                },
                {
                   "long_name" : "1160",
                   "short_name" : "1160",
                   "types" : [ "postal_code_suffix" ]
                }
             ],
             "formatted_address" : "207 N Defiance St, Archbold, OH 43502, USA",
             "geometry" : {
                "bounds" : {
                   "northeast" : {
                      "lat" : 41.521994,
                      "lng" : -84.30646179999999
                   },
                   "southwest" : {
                      "lat" : 41.521935,
                      "lng" : -84.30683739999999
                   }
                },
                "location" : {
                   "lat" : 41.5219645,
                   "lng" : -84.3066496
                },
                "location_type" : "ROOFTOP",
                "viewport" : {
                   "northeast" : {
                      "lat" : 41.5233134802915,
                      "lng" : -84.30530061970849
                   },
                   "southwest" : {
                      "lat" : 41.5206155197085,
                      "lng" : -84.3079985802915
                   }
                }
             },
             "place_id" : "ChIJk4BHnIy0PYgRXbKj5GjFe_U",
             "types" : [ "premise" ]
          }
       ],
       "status" : "OK"
    }
    
    
    Process finished with exit code 0
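
    The socket version prints the whole reply, headers included. To get the same coordinates as the earlier scripts, the headers have to be split off before the body is decoded. A minimal sketch, assuming the body is not chunked (the reply above has no Transfer-Encoding header) and using a shortened, made-up sample in place of a live reply; parse_raw_reply is a helper name invented for this example:

```python
import json


def parse_raw_reply(raw_reply):
    """Split a raw HTTP reply into (status_line, headers, decoded JSON body)."""
    # A blank line (\r\n\r\n) separates the header section from the body.
    head, _, body = raw_reply.partition(b'\r\n\r\n')
    lines = head.decode('iso-8859-1').split('\r\n')
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(':')
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, json.loads(body.decode('utf-8'))


# Shortened sample in the same shape as the output above.
sample = (b'HTTP/1.1 200 OK\r\n'
          b'Content-Type: application/json; charset=UTF-8\r\n'
          b'Connection: close\r\n'
          b'\r\n'
          b'{"results": [{"geometry": {"location": '
          b'{"lat": 41.5219645, "lng": -84.3066496}}}], "status": "OK"}')

status_line, headers, reply = parse_raw_reply(sample)
print(status_line)
print(reply['results'][0]['geometry']['location'])
```

    In search4.py the same function could be applied to raw_reply instead of printing it whole.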
  • Original post: https://www.cnblogs.com/take-fetter/p/8278864.html