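  • Trying out sources, a new feature in dbt 0.13.0

    dbt 0.13 adds a new feature, sources, which can be used to:

    • select from source tables in base models
    • test assumptions about the source data
    • calculate the freshness of the source data

    Working with sources

    • Template for defining a source

      Note: for databases such as PostgreSQL, where tables live inside a schema, you may need to set extra parameters such as `schema`, or name the source after the schema, since dbt falls back to the source name when `schema` is not set.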

    # This example defines two sources: `source_1`, containing the tables
    # `table_1` and `table_2`, and `source_2`, containing `table_1`.
    # Only names are given, which is the minimum a source definition needs.
    version: 2
    sources:
      - name: source_1
        tables:
          - name: table_1
          - name: table_2
      - name: source_2
        tables:
          - name: table_1
     
     
    • Defining a source with an explicit database and schema
    # This source entry describes the table:
    # "raw"."public"."Orders_"
    #
    # It can be referenced with:
    # {{ source('ecommerce', 'orders') }}
    version: 2
    sources:
      - name: ecommerce
        database: raw # Tell dbt to look for the source in the "raw" database
        schema: public # You wouldn't put your source data in public, would you?
        tables:
          - name: orders
            identifier: Orders_ # To alias table names to account for strange casing or naming of tables
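
    The other two uses listed earlier, testing assumptions and checking freshness, are configured in the same file. The following is a minimal sketch, not the configuration from the demo project: the columns `id` and `updated_at` are hypothetical, and it assumes that the 0.13 source schema accepts `loaded_at_field` and `freshness` at the table level.

    ```yaml
    version: 2
    sources:
      - name: ecommerce
        database: raw
        schema: public
        tables:
          - name: orders
            identifier: Orders_
            # Hypothetical timestamp column recording when each row was loaded
            loaded_at_field: updated_at
            freshness:
              warn_after: {count: 12, period: hour}   # warn if newest row is older than 12 hours
              error_after: {count: 24, period: hour}  # fail if newest row is older than 24 hours
            columns:
              - name: id # hypothetical primary key column
                tests:
                  - unique
                  - not_null
    ```

    With a definition like this, `dbt test` should run the column tests against the source table, and `dbt source snapshot-freshness` should compare the latest `updated_at` value with the two thresholds.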
     
     

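    A simple example

    I placed the source definitions directly in the models folder; see https://github.com/rongfengliang/dbt-source-demo.
    The table structures can also be found in that project.

    • Environment setup (managed with Python venv)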
    python3 -m venv venv 
    source venv/bin/activate
    pip install dbt
    • Test database setup (using docker-compose)
    version: '3.6'
    services:
      postgres:
        image: postgres:9.6.11
        ports: 
        - "5432:5432"
        environment:
        - "POSTGRES_PASSWORD=dalong"
      graphql-engine:
        image: hasura/graphql-engine:v1.0.0-beta.2
        ports:
        - "8080:8080"
        depends_on:
        - "postgres"
        environment:
        - "HASURA_GRAPHQL_DATABASE_URL=postgres://postgres:dalong@postgres:5432/postgres"
        - "HASURA_GRAPHQL_ENABLE_CONSOLE=true"
        - "HASURA_GRAPHQL_ENABLE_ALLOWLIST=true"
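
    dbt also needs a connection profile before it can reach this database. The post does not show one, so the following `~/.dbt/profiles.yml` is only a sketch: the profile name `dbt_source_demo` is hypothetical and has to match the `profile:` entry in `dbt_project.yml`. The target name, thread count, and schema are taken from the run log further down, and the credentials from the compose file above.

    ```yaml
    dbt_source_demo: # hypothetical profile name
      target: dev
      outputs:
        dev:
          type: postgres
          host: localhost
          port: 5432       # port published by the postgres service
          user: postgres   # default superuser of the postgres image
          pass: dalong     # password set in the compose file
          dbname: postgres
          schema: public   # models are built in this schema
          threads: 3
    ```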
    • Model and source layout
    models
    ├── apps
    │   ├── app_summary.sql
    │   └── sources.yml
    └── users
        ├── sources.yml
        ├── user_summary.sql
        └── user_summary2.sql
    • Source contents

      The content is simple: it only declares the table


    version: 2
    sources:
      - name: apps
        schema: public
        tables:
          - name: apps
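
    The `users` folder carries its own `sources.yml`, which is not reproduced in this post. By analogy with the file above it would presumably look something like the sketch below; the source name `users` and the table name `users` are assumptions, and the demo repository is the authority on the actual content.

    ```yaml
    version: 2
    sources:
      - name: users # assumed source name, mirroring the folder name
        schema: public
        tables:
          - name: users # assumed table name
    ```

    dbt collects every `sources.yml` under `models`, which is consistent with the run log reporting 2 sources.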
    • Running the project
    dbt run

    Output


    Running with dbt=0.13.1
    Found 3 models, 0 tests, 0 archives, 0 analyses, 94 macros, 0 operations, 0 seed files, 2 sources
    17:43:42 | Concurrency: 3 threads (target='dev')
    17:43:42 | 
    17:43:42 | 1 of 3 START view model public.app_summary........................... [RUN]
    17:43:42 | 2 of 3 START view model public.user_summary.......................... [RUN]
    17:43:42 | 3 of 3 START table model public.user_summary2........................ [RUN]
    17:43:44 | 2 of 3 OK created view model public.user_summary..................... [CREATE VIEW in 0.26s]
    17:43:45 | 1 of 3 OK created view model public.app_summary...................... [CREATE VIEW in 0.27s]
    17:43:46 | 3 of 3 OK created table model public.user_summary2................... [SELECT 2 in 0.27s]
    17:43:46 | 
    17:43:46 | Finished running 2 view models, 1 table models in 4.46s.
    Completed successfully
    Done. PASS=3 ERROR=0 SKIP=0 TOTAL=3

    References

    https://github.com/rongfengliang/dbt-source-demo

  • Original article: https://www.cnblogs.com/rongfengliang/p/10988774.html