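  • DataX example: reading data from a stream and printing it to the console

    Read data from a stream and print it to the console

    1) View the configuration template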

    [jason@hadoop102 bin]$ python datax.py -r streamreader -w streamwriter

    DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
    Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.

    Please refer to the streamreader document:
         https://github.com/alibaba/DataX/blob/master/streamreader/doc/streamreader.md

    Please refer to the streamwriter document:
         https://github.com/alibaba/DataX/blob/master/streamwriter/doc/streamwriter.md

    Please save the following configuration as a json file and  use
         python {DATAX_HOME}/bin/datax.py {JSON_FILE_NAME}.json
    to run the job.

    {
        "job": {
            "content": [
                {
                    "reader": {
                        "name": "streamreader",
                        "parameter": {
                            "column": [],
                            "sliceRecordCount": ""
                        }
                    },
                    "writer": {
                        "name": "streamwriter",
                        "parameter": {
                            "encoding": "",
                            "print": true
                        }
                    }
                }
            ],
            "setting": {
                "speed": {
                    "channel": ""
                }
            }
        }
    }
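The template leaves several values empty for the user to fill in. As a rough sketch (not part of DataX), the Python snippet below parses the template and lists every path whose value is still an empty string or empty list. The helper name `find_placeholders` is hypothetical and exists only for illustration.

```python
import json

# The template as printed by `datax.py -r streamreader -w streamwriter`.
TEMPLATE = """
{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "streamreader",
                    "parameter": {"column": [], "sliceRecordCount": ""}
                },
                "writer": {
                    "name": "streamwriter",
                    "parameter": {"encoding": "", "print": true}
                }
            }
        ],
        "setting": {"speed": {"channel": ""}}
    }
}
"""


def find_placeholders(node, path=""):
    """Return the paths of values the template leaves empty ("" or [])."""
    if isinstance(node, dict):
        found = []
        for key, value in node.items():
            found += find_placeholders(value, f"{path}.{key}" if path else key)
        return found
    if isinstance(node, list) and node:
        found = []
        for index, value in enumerate(node):
            found += find_placeholders(value, f"{path}[{index}]")
        return found
    # An empty string or an empty list is a slot that still has to be filled.
    return [path] if node in ("", []) else []


placeholders = find_placeholders(json.loads(TEMPLATE))
for item in placeholders:
    print(item)
```

The four paths it reports (`column`, `sliceRecordCount`, `encoding`, `channel`) are exactly the values that step 2 fills in.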

    2) Write the configuration file based on the template

    [jason@hadoop102 job]$ vim stream2stream.json

    Fill in the following content:

    {
      "job": {
        "content": [
          {
            "reader": {
              "name": "streamreader",
              "parameter": {
                "sliceRecordCount": 10,
                "column": [
                  {
                    "type": "long",
                    "value": "10"
                  },
                  {
                    "type": "string",
                    "value": "hello,DataX"
                  }
                ]
              }
            },
            "writer": {
              "name": "streamwriter",
              "parameter": {
                "encoding": "UTF-8",
                "print": true
              }
            }
          }
        ],
        "setting": {
          "speed": {
            "channel": 1
          }
        }
      }
    }
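If you would rather generate the file than type it by hand, the same configuration can be produced from Python. This is a minimal sketch: the function `build_stream_job` is a hypothetical helper, not a DataX API, and the file is written to the system temp directory here rather than to /opt/module/datax/job.

```python
import json
import os
import tempfile


def build_stream_job(record_count, columns, channel=1, encoding="UTF-8"):
    """Build a streamreader -> streamwriter job with the same shape as the template."""
    return {
        "job": {
            "content": [
                {
                    "reader": {
                        "name": "streamreader",
                        "parameter": {
                            "sliceRecordCount": record_count,
                            "column": columns,
                        },
                    },
                    "writer": {
                        "name": "streamwriter",
                        "parameter": {"encoding": encoding, "print": True},
                    },
                }
            ],
            "setting": {"speed": {"channel": channel}},
        }
    }


job = build_stream_job(
    10,
    [
        {"type": "long", "value": "10"},
        {"type": "string", "value": "hello,DataX"},
    ],
)

# Assumption: a temp directory stands in for the job directory used in this walkthrough.
path = os.path.join(tempfile.gettempdir(), "stream2stream.json")
with open(path, "w", encoding="utf-8") as handle:
    json.dump(job, handle, indent=2)

# Reload the file to confirm that what was written is valid JSON.
with open(path, encoding="utf-8") as handle:
    reloaded = json.load(handle)
print(reloaded == job)
```

Serializing with `json.dump` avoids the quoting and trailing-comma mistakes that are easy to make when editing JSON by hand in vim.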

    3) Run

    [jason@hadoop102 job]$ python /opt/module/datax/bin/datax.py /opt/module/datax/job/stream2stream.json
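To get a feel for what the job should print, the sketch below mimics the record stream in plain Python. It is not DataX code and rests on two assumptions: that streamreader emits `sliceRecordCount` copies of the constant row for each channel, and that streamwriter joins fields with a tab. The real run also prints log lines and a job summary, which are not reproduced here.

```python
def simulate_stream_job(job):
    """Return the record lines a stream-to-stream job is expected to print."""
    content = job["job"]["content"][0]
    reader = content["reader"]["parameter"]
    channels = int(job["job"]["setting"]["speed"]["channel"])
    # Assumption: every channel emits sliceRecordCount copies of the constant row.
    total = int(reader["sliceRecordCount"]) * channels
    # Assumption: fields are separated by a tab character.
    row = "\t".join(str(column["value"]) for column in reader["column"])
    return [row] * total


job = {
    "job": {
        "content": [
            {
                "reader": {
                    "name": "streamreader",
                    "parameter": {
                        "sliceRecordCount": 10,
                        "column": [
                            {"type": "long", "value": "10"},
                            {"type": "string", "value": "hello,DataX"},
                        ],
                    },
                },
                "writer": {
                    "name": "streamwriter",
                    "parameter": {"encoding": "UTF-8", "print": True},
                },
            }
        ],
        "setting": {"speed": {"channel": 1}},
    }
}

lines = simulate_stream_job(job)
for line in lines:
    print(line)
```

With the configuration from step 2, this yields ten identical lines, each containing `10` and `hello,DataX`.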
  • Original article: https://www.cnblogs.com/LIAOBO/p/13665320.html