  • Installing PDI (Kettle) on CentOS 7

    Introduction to Kettle

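    PDI (Kettle) is an open-source ETL solution. The book describes how to use PDI for common ETL tasks such as data profiling, cleansing, validation, extraction, transformation, and loading.
    Beyond larger applications such as ODS/DW, Kettle can also give small and medium-sized businesses flexible data extraction and processing.
    In addition to relational databases and NoSQL sources such as HBase and MongoDB, Kettle supports small data sources such as Excel and Access,
    and plugins extend it to further data sources. The book covers in detail the data sources Kettle can handle
    and how to extract incremental data with it. Kettle's data processing is also capable: besides common operations such as select, filter, group, join, and sort,
    its Java expressions, regular expressions, JavaScript, and Java class steps are flexible and well suited to a wide range of processing tasks.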
    

    Downloading Kettle

    Installing Kettle

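    • Kettle depends on Java, so Java must be installed first
    • On CentOS 7, webkitgtk must be installed, along with a desktop environment (install one yourself)
      • yum install epel-release
      • yum install webkitgtk
    • Kettle itself needs no installation; just unzip it and run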

    The official site recommends installing the following dependencies:

    How to get PDI up and running
     
    Linux
     
    Ubuntu 12.04 and later:
    The libwebkitgtk package needs to be installed. This can be done by running
    apt-get install libwebkitgtk-1.0.0
    Unzip the downloaded file. Run spoon.sh file, it should be under /data-integration.
    On some installations of Ubuntu 14.04, Unity doesn't display the menu bar. In order to fix that, spoon.sh has a setting to disable this integration, export
    UBUNTU_MENUPROXY=0
    You can try to remove that setting if you wish to see if it works properly on your machine
     
    CentOS 6 Desktop:
    The libwebkitgtk package needs to be installed. This can be done by running
    yum install libwebkitgtk
    Unzip the downloaded file and run spoon.sh, it should be under /data-integration.
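
The dependency commands above differ by distribution. As a minimal sketch, the hypothetical helper `webkit_install_cmd` below only prints the command to run and installs nothing. The package names are the ones mentioned in this post and may differ on newer releases.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the WebKitGTK install command
# for a distribution ID and major version, using the package names from this post.
webkit_install_cmd() {
    distro="$1"
    major="$2"
    case "$distro:$major" in
        centos:7)
            # EPEL is enabled first, then webkitgtk is installed, as in this post
            echo "yum install -y epel-release && yum install -y webkitgtk" ;;
        centos:6)
            echo "yum install -y libwebkitgtk" ;;
        ubuntu:*)
            # package name taken from the startup warning quoted later in this post
            echo "apt-get install -y libwebkitgtk-1.0-0" ;;
        *)
            echo "unsupported: $distro $major" >&2
            return 1 ;;
    esac
}

webkit_install_cmd centos 7
```

Printing the command rather than executing it keeps the sketch safe to try on any machine; run the printed line as root once it looks right for your system.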
    

    Starting Kettle

    • Windows startup script
      • Spoon.bat
    • CentOS 7 startup script (must be launched from a desktop environment, otherwise it fails)
      • spoon.sh
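
Because Spoon is a graphical program, it can help to check for a display before launching it. The following is a minimal sketch, assuming an X11 session that exposes the `DISPLAY` variable. The function names `has_gui_session` and `start_spoon` and the example install path are hypothetical.

```shell
#!/bin/sh
# Return success only when an X11 display appears to be available.
has_gui_session() {
    [ -n "${DISPLAY:-}" ]
}

# Hypothetical wrapper: refuse to start Spoon without a display.
# $1 is the directory that contains spoon.sh (an assumed install path).
start_spoon() {
    if ! has_gui_session; then
        echo "No DISPLAY set: start Spoon from a desktop session" >&2
        return 1
    fi
    ( cd "$1" && ./spoon.sh )
}

# Example (not executed here): start_spoon /opt/data-integration
```

Failing early with a short message is easier to read than the SWT stack trace that Spoon prints when no display is available.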

    Troubleshooting Kettle errors (on CentOS, Kettle must be started from a desktop environment)

    • CentOS 7 needs webkitgtk installed

      • WARNING: no libwebkitgtk-1.0 detected, some features will be unavailable
    • Java 8 no longer supports the MaxPermSize option; simply remove it from the startup script

      • Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
    • The full error output:

    #######################################################################
    WARNING:  no libwebkitgtk-1.0 detected, some features will be unavailable
        Consider installing the package with apt-get or yum.
        e.g. 'sudo apt-get install libwebkitgtk-1.0-0'
    #######################################################################
    Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
    org.eclipse.swt.SWTError: No more handles [gtk_init_check() failed]
    	at org.eclipse.swt.SWT.error(Unknown Source)
    	at org.eclipse.swt.widgets.Display.createDisplay(Unknown Source)
    	at org.eclipse.swt.widgets.Display.create(Unknown Source)
    	at org.eclipse.swt.graphics.Device.<init>(Unknown Source)
    	at org.eclipse.swt.widgets.Display.<init>(Unknown Source)
    	at org.eclipse.swt.widgets.Display.<init>(Unknown Source)
    	at org.pentaho.di.ui.spoon.Spoon.main(Spoon.java:649)
    	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    	at java.lang.reflect.Method.invoke(Method.java:498)
    	at org.pentaho.commons.launcher.Launcher.main(Launcher.java:92)
    
    • Solution

      • yum install webkitgtk
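
The `MaxPermSize` warning is harmless on Java 8, but the option can be stripped from the startup script as described above. This is a minimal sketch, assuming the script passes a token of the form `-XX:MaxPermSize=256m` (as the warning suggests). `strip_maxpermsize` is a hypothetical filter, and the sample line is illustrative rather than copied from spoon.sh.

```shell
#!/bin/sh
# Hypothetical filter: remove any -XX:MaxPermSize=<value> token from stdin.
strip_maxpermsize() {
    sed -E 's/[[:space:]]*-XX:MaxPermSize=[^[:space:]"]+//g'
}

# Demonstration on a sample line (an assumed shape, not copied from spoon.sh)
printf '%s\n' 'JAVA_OPTS="-Xms1024m -Xmx2048m -XX:MaxPermSize=256m"' | strip_maxpermsize
```

To apply it to the real script, keep a backup first, for example `cp spoon.sh spoon.sh.bak && strip_maxpermsize < spoon.sh.bak > spoon.sh`.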

    Launching Kettle by double-clicking on the desktop

    • Create a launcher file named kettle.desktop on the desktop
    [Desktop Entry]
    # Version is the Desktop Entry Specification version, not the Kettle version
    Version=1.0
    Name=kettle
    Exec=path to start script xxx/spoon.sh
    Icon=path to ico /spoon.ico
    Terminal=false
    Type=Application
    Categories=Application;
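
The launcher file can also be generated from the install directory instead of being typed by hand. This is a minimal sketch: `make_kettle_desktop` is a hypothetical helper, and `/opt/data-integration` is an assumed install path that should be replaced with the directory holding your spoon.sh and spoon.ico. `Version` is set to 1.0 because, in the Desktop Entry Specification, that key refers to the specification version rather than the application version.

```shell
#!/bin/sh
# Hypothetical helper: print a desktop entry for a given PDI directory.
# $1 is the directory holding spoon.sh and spoon.ico (an assumed layout).
make_kettle_desktop() {
    pdi_dir="$1"
    cat <<EOF
[Desktop Entry]
Version=1.0
Type=Application
Name=kettle
Exec=$pdi_dir/spoon.sh
Icon=$pdi_dir/spoon.ico
Terminal=false
EOF
}

make_kettle_desktop /opt/data-integration
```

Redirect the output to the desktop folder and mark it executable, for example `make_kettle_desktop /opt/data-integration > ~/Desktop/kettle.desktop && chmod +x ~/Desktop/kettle.desktop` (the `~/Desktop` location is also an assumption).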
    

    Other errors

    • Kettle crashes on startup after installing the KDE desktop (the problem did not occur with the GNOME desktop)
    # A fatal error has been detected by the Java Runtime Environment:
    #
    #  SIGSEGV (0xb) at pc=0x00007f4ab4f35164, pid=4011, tid=0x00007f4b09bd7700
    #
    # JRE version: OpenJDK Runtime Environment (8.0_151-b12) (build 1.8.0_151-b12)
    # Java VM: OpenJDK 64-Bit Server VM (25.151-b12 mixed mode linux-amd64 compressed oops)
    # Problematic frame:
    # C  [libglib-2.0.so.0+0x5e164]  g_match_info_unref+0x4
    #
    # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
    
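    • Solution
      • Change the system theme so it is not the GTK one (the bug-report comment quoted below switches the GTK2 theme from 'oxygen-gtk' to 'Radiance')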
    As I already mentioned on #1245468 I could not verify that changing GTK_MODULES, UBUNTU_MENUPROXY, or GTK_IM_MODULE helps in any way.
    
    However, I could verify that the problem GOES AWAY IN KUBUNTU/KDE when doing:
    
    System Settings -> Application Themes -> GTK -> Choose GTK2 Theme
    
    Choose 'Radiance' instead of 'oxygen-gtk'
    

    Error: ERROR (version 7.1.0.0-12, build 1 from 2017-05-16 17.18.02 by buildguy) : java.io.IOException: Cannot run program "lsb_release": error=2, No such file or directory

    • Solution
      • yum -y install redhat-lsb

    Garbled characters in inserted data

    In Kettle's startup file spoon.sh, add the following to the JVM startup options:
    -Dfile.encoding=utf8 (specify the character set you need)
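
Adding the flag by hand can duplicate it on later edits. The following is a minimal sketch that appends it only when no `file.encoding` setting is present, assuming the JVM options live in one space-separated string. Some PDI versions read them from an environment variable named `PENTAHO_DI_JAVA_OPTIONS`, but verify that in your copy of spoon.sh. `with_file_encoding` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: append -Dfile.encoding=<charset> to a JVM option
# string unless a file.encoding setting is already present.
# $1 = existing options, $2 = charset (defaults to UTF-8)
with_file_encoding() {
    opts="$1"
    charset="${2:-UTF-8}"
    case " $opts " in
        *" -Dfile.encoding="*) printf '%s\n' "$opts" ;;                  # already set: keep as-is
        "  ") printf '%s\n' "-Dfile.encoding=$charset" ;;                # no options yet
        *) printf '%s\n' "$opts -Dfile.encoding=$charset" ;;             # append the flag
    esac
}

with_file_encoding "-Xms1024m -Xmx2048m" utf8
```

The resulting string can then be pasted into the options line of spoon.sh.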
    
  • Original article: https://www.cnblogs.com/zhanmeiliang/p/7844362.html