  • Intermediate Python ----> storing JSON data with pymongo

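      This post introduces pymongo, a third-party library for working with MongoDB from Python, along with a simple crawler built on the requests library. The warmth and coldness of human relationships are like flowers blooming and fading; better to think of them as an inevitable season.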

    Installing pymongo and preliminary setup

    1. Installing and starting MongoDB

    Test machine: Windows 10, MongoDB v3.4.0, Python 3.6.3.

    MongoDB is installed in D:\Database\DataBase\Mongo, and the data is stored in E:\data\database\mango\data. First, start the MongoDB server; the log shows that it started successfully on local port 27017.

    D:\Database\DataBase\Mongo\Server\3.4\bin>mongod --dbpath E:\data\database\mango\data
    2017-11-21T20:48:38.458+0800 I CONTROL  [initandlisten] MongoDB starting : pid=20484 port=27017 dbpath=E:\data\database\mango\data 64-bit host=Linux
    2017-11-21T20:48:38.461+0800 I CONTROL  [initandlisten] targetMinOS: Windows 7/Windows Server 2008 R2
    2017-11-21T20:48:38.462+0800 I CONTROL  [initandlisten] db version v3.4.0
    2017-11-21T20:48:38.463+0800 I CONTROL  [initandlisten] git version: f4240c60f005be757399042dc12f6addbc3170c1
    2017-11-21T20:48:38.464+0800 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1t-fips  3 May 2016
    2017-11-21T20:48:38.465+0800 I CONTROL  [initandlisten] allocator: tcmalloc
    2017-11-21T20:48:38.466+0800 I CONTROL  [initandlisten] modules: none
    2017-11-21T20:48:38.466+0800 I CONTROL  [initandlisten] build environment:
    2017-11-21T20:48:38.467+0800 I CONTROL  [initandlisten]     distmod: 2008plus-ssl
    2017-11-21T20:48:38.468+0800 I CONTROL  [initandlisten]     distarch: x86_64
    2017-11-21T20:48:38.469+0800 I CONTROL  [initandlisten]     target_arch: x86_64
    2017-11-21T20:48:38.469+0800 I CONTROL  [initandlisten] options: { storage: { dbPath: "E:\data\database\mango\data" } }
    2017-11-21T20:48:38.491+0800 I -        [initandlisten] Detected data files in E:\data\database\mango\data created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
    2017-11-21T20:48:38.493+0800 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=5573M,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
    2017-11-21T20:48:39.931+0800 I CONTROL  [initandlisten]
    2017-11-21T20:48:39.933+0800 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
    2017-11-21T20:48:39.936+0800 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
    2017-11-21T20:48:39.940+0800 I CONTROL  [initandlisten]
    2017-11-21T20:48:41.253+0800 I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory 'E:/data/database/mango/data/diagnostic.data'
    2017-11-21T20:48:41.259+0800 I NETWORK  [thread1] waiting for connections on port 27017
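
    The last log line says the server is waiting for connections on port 27017. As a minimal sketch, this can be checked from Python with the standard library alone. The helper name is_port_open is made up for this illustration, and the check only shows that something accepts TCP connections on the port, not that it is MongoDB.

```python
import socket


def is_port_open(host="127.0.0.1", port=27017, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection() raises an OSError subclass on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 27017 is the default mongod port shown in the log above.
print("mongod reachable:", is_port_open())
```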

    Starting the MongoDB client: D:\Database\DataBase\Mongo\Server\3.4\bin\mongo.exe. Double-click it to run.

    MongoDB shell version v3.4.0
    connecting to: mongodb://127.0.0.1:27017
    MongoDB server version: 3.4.0
    Server has startup warnings:
    2017-11-21T20:48:39.931+0800 I CONTROL  [initandlisten]
    2017-11-21T20:48:39.933+0800 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
    2017-11-21T20:48:39.936+0800 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
    2017-11-21T20:48:39.940+0800 I CONTROL  [initandlisten]
    >

    2. Installing pymongo for Python

    pip install pymongo
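
    To confirm that the package is visible to the interpreter you are actually running, a small check along these lines may help. It is only a sketch: is_installed is a hypothetical helper name, and it uses nothing beyond the standard library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


print("pymongo installed:", is_installed("pymongo"))
```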

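    Here is a brief introduction to using pymongo. The code below is taken from the getting-started example on GitHub. (The u'...' prefixes in the output come from Python 2; on Python 3 the same strings are printed without the prefix.)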

    >>> import pymongo
    >>> client = pymongo.MongoClient("localhost", 27017)
    >>> db = client.test
    >>> db.name
    u'test'
    >>> db.my_collection
    Collection(Database(MongoClient('localhost', 27017), u'test'), u'my_collection')
    >>> db.my_collection.insert_one({"x": 10}).inserted_id
    ObjectId('4aba15ebe23f6b53b0000000')
    >>> db.my_collection.insert_one({"x": 8}).inserted_id
    ObjectId('4aba160ee23f6b543e000000')
    >>> db.my_collection.insert_one({"x": 11}).inserted_id
    ObjectId('4aba160ee23f6b543e000002')
    >>> db.my_collection.find_one()
    {u'x': 10, u'_id': ObjectId('4aba15ebe23f6b53b0000000')}
    >>> for item in db.my_collection.find():
    ...     print(item["x"])
    ...
    10
    8
    11
    >>> db.my_collection.create_index("x")
    u'x_1'
    >>> for item in db.my_collection.find().sort("x", pymongo.ASCENDING):
    ...     print(item["x"])
    ...
    8
    10
    11
    >>> [item["x"] for item in db.my_collection.find().limit(2).skip(1)]
    [8, 11]
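
    One detail in the last call is easy to miss: limit(2) is written before skip(1), yet the result is [8, 11] rather than [8]. MongoDB applies sort first, then skip, then limit, regardless of the order in which the methods are chained. The following is a plain-Python model of that behaviour, not pymongo code; apply_cursor_options is a name invented for this sketch.

```python
def apply_cursor_options(docs, sort_key=None, skip=0, limit=0):
    """Model how a cursor combines its options: sort, then skip, then limit."""
    result = list(docs)
    if sort_key is not None:
        result.sort(key=lambda doc: doc[sort_key])
    result = result[skip:]
    if limit:  # as in MongoDB, a limit of 0 means "no limit"
        result = result[:limit]
    return result


# Same documents, in the same insertion order, as in the session above.
docs = [{"x": 10}, {"x": 8}, {"x": 11}]
print([d["x"] for d in apply_cursor_options(docs, skip=1, limit=2)])  # [8, 11]
print([d["x"] for d in apply_cursor_options(docs, sort_key="x")])     # [8, 10, 11]
```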

    pymongo usage example

    1. A Python crawler that stores its data with pymongo

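    import json

    import pymongo
    import requests


    def requestData():
        """POST the query parameters and return the decoded JSON response."""
        url = 'http://****.com/*.do'
        data = {
            'projectId': 90,
            'myTaskFlag': 1,
            'userId': 40
        }
        # The request body is sent as a JSON string.
        json_data = requests.post(url, data=json.dumps(data)).json()
        return json_data


    def output_data(json_data):
        """Store the 'taskList' documents in the tasks collection of the test database."""
        client = pymongo.MongoClient(host='localhost', port=27017)
        db = client.test
        collection = db.tasks
        tasks_data = json_data.get('taskList')
        # insert() is deprecated and was removed in PyMongo 4; insert_many() takes a list of documents.
        collection.insert_many(tasks_data)
        client.close()


    if __name__ == '__main__':
        json_data = requestData()
        output_data(json_data)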
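
    The script above assumes that the request succeeds and that the response always contains a non-empty taskList. A slightly more defensive variant could look like the sketch below. The names fetch_tasks and store_tasks are made up for illustration. The session and collection are passed in as arguments so the functions can be tried without a live server, and json=payload is used on the assumption that the endpoint accepts a regular JSON request (it also sets the Content-Type header, which the original call does not).

```python
def fetch_tasks(session, url, payload, timeout=10):
    """POST payload as JSON and return the decoded response body.

    session is expected to behave like requests.Session().
    """
    response = session.post(url, json=payload, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return response.json()


def store_tasks(collection, json_data):
    """Insert the 'taskList' documents and return how many were written.

    collection is expected to behave like a pymongo Collection.
    """
    tasks = json_data.get("taskList") or []
    if not tasks:
        return 0  # insert_many() rejects an empty list, so skip the call
    result = collection.insert_many(tasks)
    return len(result.inserted_ids)


# Usage, given a running mongod and the real endpoint URL:
#   import pymongo, requests
#   client = pymongo.MongoClient("localhost", 27017)
#   payload = {"projectId": 90, "myTaskFlag": 1, "userId": 40}
#   data = fetch_tasks(requests.Session(), url, payload)
#   print(store_tasks(client.test.tasks, data), "tasks stored")
#   client.close()
```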

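    The fetched data is stored in the tasks collection, using MongoDB's default test database. After the program finishes, the data can be inspected from the MongoDB client: running db.tasks.find().pretty() returns all documents in the tasks collection.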

    {
            "_id" : ObjectId("5a1427a2edc9f04be40bc02d"),
            "taskId" : 1,
            "summary" : "PC版“个人信息”页面优化",
            "status" : 8,
            "categoryId" : 3,
            "creatorId" : 7,
            "projectId" : 1,
            "dateSubmit" : NumberLong("1481105108000"),
            "level" : 1,
            "handlerId" : 2,
            "ViewState" : 2,
            "priority" : 2
    }
    {
            "_id" : ObjectId("5a1427a2edc9f04be40bc02e"),
            "taskId" : 2,
            "summary" : "PC版“添加新任务”界面字体太大",
            "status" : 8,
            "categoryId" : 3,
            "creatorId" : 7,
            "projectId" : 1,
            "dateSubmit" : NumberLong("1481105195000"),
            "level" : 1,
            "handlerId" : 2,
            "ViewState" : 2,
            "priority" : 1
    }
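
    The dateSubmit values are stored as 64-bit integers (NumberLong). They look like milliseconds since the Unix epoch, although the source API does not say so, so treat that as an assumption. Under that assumption, a value can be turned into a readable timestamp as sketched here; millis_to_datetime is a hypothetical helper name.

```python
from datetime import datetime, timezone


def millis_to_datetime(millis):
    """Convert milliseconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


# dateSubmit of the first task in the output above.
print(millis_to_datetime(1481105108000).isoformat())  # 2016-12-07T10:05:08+00:00
```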

  • Original article: https://www.cnblogs.com/huhx/p/baseusepythonpymongo.html