zoukankan      html  css  js  c++  java
  • The Guardian’s Migration from MongoDB to PostgreSQL on Amazon RDS

    转载一片mongodb 迁移pg 数据库的文章

    The Guardian migrated their CMS's datastore in 2018 from a self-managed MongoDBcluster to PostgreSQL on Amazon RDS for a fully managed solution. The team did an API-based migration without any downtime.

    Guardian’s in-house CMS - called Composer - which stores articles, blog content, photo galleries and video was originally built on top of MongoDB as a datastore. This was preceded by a vendor software backed by an Oracle database. This setup had downtimes whenever the schema had to be migrated. As an alternative, the team looked at various NoSQL dbs, and one of the key reasons for choosing MongoDB seems to have been flexibility. Originally hosted on their own datacenter, they moved their MongoDB to their AWS servers after an outage. The installation and management scripts had to be handwritten by Guardian’s team. They opted for a support contract and bought the OpsManager tool, which is a frontend application for managing MongoDB. However, the team did not go for MongoDB's Atlas offering, which is a "fully managed database", for reasons which are unclear. OpsManager does not manage deployments.

    After moving to AWS, the team faced two MongoDB outages. Some of them the reasons were basic system administration issues, like not allowing NTP to access time servers to keep clocks in sync. Others pertained to the difficulty of managing OpsManager itself and obtaining timely support from the vendor, according to the article. The team felt that moving to a solution which had minimal database management would suit them best.

    The team chose PostgreSQL due to its maturity and support for the jsonb data type, as a hosted database on Amazon’s RDS. The jsonb type allows for indexing of fields inside the JSON object. The migration plan was to write a new API over Postgres and use a proxy that would send traffic to both APIs to keep them in sync for new, incoming data. The existing data would be migrated using the APIs, and then the proxy would switch to the new API. Their previous migration from Oracle was also done using a similar approach. The migration script logs were pushed to Elasticsearch so that the migration could be tracked. In the process, they also improved their structured logging.

    The proxy directed all traffic to the MongoDB API in real time, and asynchronously to the Postgres API . Any difference in the responses was logged and analyzed. To ensure that the new API and backend can hold up to production traffic, GoReplay processes were run to generate traffic. GoReplay can capture traffic and replay it against a different environment - in this case, the pre-production one. A complete migration was done on the pre-production environment. The final step in the production migration was to switch the DNS name from the proxy's endpoint (an Amazon ELB) to the Postgres API (another ELB). This allowed their clients to function without any change. Post-migration, their integration tests failed as they had not been migrated to the new API.

    Other organizations have moved from MongoDB to PostgreSQL for a variety of reasons.

  • 相关阅读:
    软件编写和设计中的18大原则
    Ubuntu CTRL+ALT+F1~F6 进入命令模式后不支持中文显示的解决办法
    BM串匹配算法
    KMP串匹配算法解析与优化
    mongodb随机查询一条记录的正确方法!
    这真的该用try-catch吗?
    计算机的本质与数值、文字、声音、图像
    编程语言的概念
    linux服务方式启动程序脚本(init.d脚本)
    linux的7种运行级别
  • 原文地址:https://www.cnblogs.com/rongfengliang/p/10259045.html
Copyright © 2011-2022 走看看