zoukankan      html  css  js  c++  java
  • Singer 学习二 使用Singer进行gitlab 2 postgres 数据转换

    Singer 可以方便的进行数据的etl 处理,我们可以处理的数据可以是api 接口,也可以是数据库数据,或者
    是文件
    备注: 测试使用docker-compose 运行&&提供数据库内容,使用virtualenv && python 3.5 以及以上

    环境准备

    • docker-compose 文件
     
    version: "3"
    services:
      gogs-service:
        image: gogs/gogs
        ports:
          - "10022:22"
          - "10080:3000"
      mysql:
        image: mysql:5.7.16
        ports:
          - 3306:3306
        command: --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci
        environment:
          MYSQL_ROOT_PASSWORD: dalongrong
          MYSQL_DATABASE: gogs
          MYSQL_USER: gogs
          MYSQL_PASSWORD: dalongrong
          TZ: Asia/Shanghai
      postgres:
        image: postgres:9.6.11
        ports:
        - "5432:5432"
        environment:
        - "POSTGRES_PASSWORD:dalong"
     
     
    • postgres target 配置
      target.json
     
    {
        "host": "localhost",
        "port": 5432,
        "dbname": "postgres",
        "user": "postgres",
        "password": "postgres",
        "schema": "public"
    }
     
     
    • 创建gitlab virtualenv
    virtualenv gitlab  
    source ./gitlab/bin/activate
    pip install tap-gitlab
     
    • 创建access_token
      从gitlab 官方网站创建即可
    • gitlab tap 配置文件
      格式如下,因为隐私没有暴露:
     
    {
     "api_url": "https://gitlab.com/api/v4",
     "private_token": "xxxxxxx",
     "groups": "",
     "projects": "", 
     "start_date":"2010-01-01T00:00:00Z"
    }
     
     

    运行&&效果

    • 运行
    ./gitlab/bin/tap-gitlab -c gitlab.json | ./postgres/bin/target-postgres -c target.json
     
    • 效果
    INFO Starting sync
    INFO GET https://gitlab.com/api/v4/projects/dalongrong%2Fdemoapp?private_token=XXXXXXXXX
    INFO Table 'projects' does not exist. Creating... CREATE TABLE public.projects ("approvals_before_merge" bigint, "archived" boolean, "avatar
    _url" character varying, "builds_enabled" boolean, "container_registry_enabled" boolean, "created_at" timestamp without time zone, "creator_
    id" bigint, "default_branch" character varying, "description" character varying, "forks_count" bigint, "http_url_to_repo" character varying,
     "id" bigint, "issues_enabled" boolean, "last_activity_at" timestamp without time zone, "lfs_enabled" boolean, "merge_requests_enabled" bool
    ean, "name" character varying, "name_with_namespace" character varying, "namespace__id" bigint, "namespace__kind" character varying, "namespace__name" character varying, "namespace__path" character varying, "only_allow_merge_if_all_discussions_are_resolved" boolean, "only_allow_merge_if_build_succeeds" boolean, "open_issues_count" bigint, "owner_id" bigint, "path" character varying, "path_with_namespace" character varying, "public" boolean, "public_builds" boolean, "request_access_enabled" boolean, "shared_runners_enabled" boolean, "shared_with_groups" jsonb, "snippets_enabled" boolean, "ssh_url_to_repo" character varying, "star_count" bigint, "tag_list" jsonb, "visibility_level" bigint, "web_url" character varying, "wiki_enabled" boolean, PRIMARY KEY ("id"))
    INFO Table 'branches' does not exist. Creating... CREATE TABLE public.branches ("commit_id" character varying, "developers_can_merge" boolean, "developers_can_push" boolean, "merged" boolean, "name" character varying, "project_id" bigint, "protected" boolean, PRIMARY KEY ("project_id", "name"))
    INFO Table 'commits' does not exist. Creating... CREATE TABLE public.commits ("allow_failure" boolean, "author_email" character varying, "author_name" character varying, "committer_email" character varying, "committer_name" character varying, "created_at" timestamp without time zone, "id" character varying, "message" character varying, "project_id" bigint, "short_id" character varying, "title" character varying, PRIMARY KEY ("id"))
    INFO Table 'issues' does not exist. Creating... CREATE TABLE public.issues ("assignee_id" bigint, "author_id" bigint, "confidential" boolean, "created_at" timestamp without time zone, "description" character varying, "due_date" character varying, "id" bigint, "iid" bigint, "labels" jsonb, "milestone_id" bigint, "project_id" bigint, "state" character varying, "subscribed" boolean, "title" character varying, "updated_at" timestamp wi
     
     

    说明

    使用类似的方法,我们也可以转换github 的以及jira 等基于api 开发的模型

    参考资料

    https://github.com/singer-io/tap-gitlab
    https://github.com/rongfengliang/singer-mysql2postges-demo

  • 相关阅读:
    微信運動步數
    JS逐页转pdf文件为图片格式
    js学习笔记]PDF.js专题
    PDF轉圖片流並jquery顯示到頁面
    使用 pdf.js 在网页中加载 pdf 文件
    使用pdfobject.js实现在线浏览PDF
    Echarts的使用
    C# ffmpeg 视频处理
    C#文件/文件夾壓縮,解壓縮
    epplus插入圖片/鏈接
  • 原文地址:https://www.cnblogs.com/rongfengliang/p/10239449.html
Copyright © 2011-2022 走看看