    Programming Impala Applications

    The core development language with Impala is SQL. You can also use Java or other languages to interact with Impala through the standard JDBC and ODBC interfaces used by many business intelligence tools. For specialized kinds of analysis, you can supplement the SQL built-in functions by writing user-defined functions (UDFs) in C++ or Java.

    Overview of the Impala SQL Dialect

    The Impala SQL dialect is descended from the SQL syntax used in the Apache Hive component (HiveQL). As such, it is familiar to users who already run SQL queries on the Hadoop infrastructure. Currently, Impala SQL supports a subset of HiveQL statements, data types, and built-in functions.

    For users coming to Impala from traditional database backgrounds, the following aspects of the SQL dialect might seem familiar or unusual:

    • Impala SQL is focused on queries and includes relatively little DML. There is no UPDATE or DELETE statement. Stale data is typically discarded (by DROP TABLE or ALTER TABLE ... DROP PARTITION statements) or replaced (by INSERT OVERWRITE statements).
    • Data loading is typically done by INSERT statements, which insert data in bulk by querying from other tables. There are two variations: INSERT INTO, which appends to the existing data, and INSERT OVERWRITE, which replaces the entire contents of a table or partition (similar to TRUNCATE TABLE followed by a new INSERT). An INSERT ... VALUES syntax exists for creating a small number of rows in a single statement, but it is not suited to loading data in bulk.
    • You often construct Impala table definitions and data files in some other environment, and then attach Impala so that it can run real-time queries. The same data files and table metadata are shared with other components of the Hadoop ecosystem.
    • Because Hadoop and Impala are focused on data warehouse-style operations on large data sets, Impala SQL includes some idioms that you might find in the import utilities for traditional database systems. For example, you can create a table that reads comma-separated or tab-separated text files, specifying the separator in the CREATE TABLE statement. You can create external tables that read existing data files but do not move or transform them.
    • Because Impala reads large quantities of data that might not be perfectly tidy and predictable, it does not impose length constraints on string data types. For example, you can define a database column as STRING with unlimited length, rather than CHAR(1) or VARCHAR(64). (Although in Impala 2.0 and later, you can also use length-constrained CHAR and VARCHAR types.)
    • For query-intensive applications, you will find familiar notions such as joins, built-in functions for processing strings, numbers, and dates, aggregate functions, subqueries, and comparison operators such as IN() and BETWEEN.
    • From the data warehousing world, you will recognize the notion of partitioned tables.
    • In Impala 1.2 and higher, UDFs let you perform custom comparisons and transformation logic during SELECT and INSERT...SELECT statements.
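
    The points above can be illustrated with a short SQL sketch. It is a minimal example under stated assumptions, not a recommended schema: the table names (raw_events, events_by_year), the column names, and the HDFS path are all hypothetical, and the Parquet storage clause assumes an Impala release that accepts the PARQUET keyword. It shows an external table over existing comma-separated files, a partitioned table, the two INSERT variations, discarding stale data by dropping a partition, and a query that uses aggregate functions with IN() and BETWEEN.

```sql
-- All table names, column names, and paths below are hypothetical.

-- External table over existing comma-separated text files.
-- The data files are read in place; they are not moved or transformed.
CREATE EXTERNAL TABLE raw_events (
  event_id   BIGINT,
  user_name  STRING,      -- STRING has no length constraint
  event_time TIMESTAMP,
  amount     DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/example/raw_events';

-- Partitioned table, with one partition per year.
CREATE TABLE events_by_year (
  event_id  BIGINT,
  user_name STRING,
  amount    DOUBLE
)
PARTITIONED BY (event_year INT)
STORED AS PARQUET;

-- INSERT INTO appends to whatever the partition already contains.
INSERT INTO events_by_year PARTITION (event_year = 2014)
SELECT event_id, user_name, amount
FROM raw_events
WHERE year(event_time) = 2014;

-- INSERT OVERWRITE replaces the entire contents of the partition.
INSERT OVERWRITE events_by_year PARTITION (event_year = 2014)
SELECT event_id, user_name, amount
FROM raw_events
WHERE year(event_time) = 2014;

-- Stale data is discarded by dropping the partition rather than by DELETE.
ALTER TABLE events_by_year DROP IF EXISTS PARTITION (event_year = 2013);

-- A typical query: aggregate functions plus IN() and BETWEEN.
SELECT user_name, COUNT(*) AS event_count, SUM(amount) AS total_amount
FROM events_by_year
WHERE event_year BETWEEN 2013 AND 2014
  AND user_name IN ('alice', 'bob')
GROUP BY user_name
ORDER BY total_amount DESC
LIMIT 10;
```

    Filtering on the partition key column (event_year here) lets Impala skip partitions that cannot match, which is the main reason to partition a large table in the first place.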

    Related information: Impala SQL Language Reference, especially SQL Statements and Built-in Functions

    Overview of Impala Programming Interfaces

    You can connect and submit requests to the Impala daemons through:

    • The impala-shell interactive command interpreter.
    • The Apache Hue web-based user interface.
    • JDBC.
    • ODBC.

    With these options, you can use Impala in heterogeneous environments, with JDBC or ODBC applications running on non-Linux platforms. You can also use Impala in combination with various Business Intelligence tools that use the JDBC and ODBC interfaces.

    Each impalad daemon process, running on separate nodes in a cluster, listens to several ports for incoming requests. Requests from impala-shell and Hue are routed to the impalad daemons through the same port. The impalad daemons listen on separate ports for JDBC and ODBC requests.

  • Original article: https://www.cnblogs.com/tmeily/p/4271589.html