目录:
- 连接hadoop配置语法
- 配置hadoop连接
- Pushdown配置
- Create external tables for Azure blob storage
连接hadoop配置语法:
global configuration settings for PolyBase Hadoop and Azure blob storage connectivity, Syntax:
- sp_configure: List all of the configuration options
- 配置与hadoop连接语法:
-
sp_configure [ @configname = ] 'hadoop connectivity', [ @configvalue = ] { 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 } [;] RECONFIGURE [;]
- Option 0: Disable Hadoop connectivity
- Option 1: Hortonworks HDP 1.3 on Windows Server
- Option 1: Azure blob storage (WASB[S])
- Option 2: Hortonworks HDP 1.3 on Linux
- Option 3: Cloudera CDH 4.3 on Linux
- Option 4: Hortonworks HDP 2.0 on Windows Server
- Option 4: Azure blob storage (WASB[S])
- Option 5: Hortonworks HDP 2.0 on Linux
- Option 6: Cloudera 5.1, 5.2, 5.3, 5.4, and 5.5 on Linux
- Option 7: Hortonworks 2.1, 2.2, and 2.3 on Linux
- Option 7: Hortonworks 2.1, 2.2, and 2.3 on Windows Server
- Option 7: Azure blob storage (WASB[S])
- 示例代码:
- 显示所有可用的配置列表: EXEC sp_configure;
- 显示某个配置项的配置信息: EXEC sp_configure @configname='hadoop connectivity';
- 设置配置项示例语法如下:
-
sp_configure @configname = 'hadoop connectivity', @configvalue = 7; GO RECONFIGURE GO
配置hadoop连接
- 在本节示例中,连接HDInsight3.2集,对应关系如下:
- 配置polybase连接配置项信息信息如下:
-
sp_configure @configname = 'hadoop connectivity', @configvalue = 7; GO RECONFIGURE GO
- 命令行: services.msc, 重启下面的服务
- SQL Server PolyBase Data Movement Service
- SQL Server PolyBase Engine
Pushdown配置:
- To improve query performance, enable pushdown computation to a Hadoop cluster
- Find the file yarn-site.xml in the installation path of SQL Server. Typically, the path is: C:Program FilesMicrosoft SQL ServerMSSQL13.MSSQLSERVERMSSQLBinnPolybaseHadoopconf
- On the Hadoop machine, find the analogous(类似) file in the Hadoop configuration directory. In the file, find and copy the value of the configuration key yarn.application.classpath
- On the SQL Server machine, in the yarn-site.xml file, find the yarn.application.classpath property. Paste the value from the Hadoop machine into the value element.
-
$HADOOP_CONF_DIR,/usr/hdp/current/hadoop-client/*,/usr/hdp/current/hadoop-client/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*
Create external tables for Azure blob storage
- The Elastic(弹性) Database query feature relies on(依靠) the these four DDL statements. Typically, these DDL statements are used once or rarely when the schema of your application changes
- [CREATE MASTER KEY] (https://msdn.microsoft.com/library/ms174382.aspx)
- [CREATE CREDENTIAL] (https://msdn.microsoft.com/library/ms189522.aspx)
- [CREATE DATABASE SCOPED CREDENTIAL] (https://msdn.microsoft.com/library/mt270260.aspx)
- [CREATE/DROP EXTERNAL DATA SOURCE] (https://msdn.microsoft.com/library/dn935022.aspx)
- [CREATE/DROP EXTERNAL TABLE] (https://msdn.microsoft.com/library/dn935021.aspx)
- You can use the following syntax to drop the master key and credentials
- DROP CREDENTIAL <credential_name> ON DATABASE;
- DROP DATABASE SCOPED CREDENTIAL <credential_name>;
- DROP MASTER KEY;
- 模拟简单文本数据,如下:
- 将文本文件上传到 Azure blob storage, 如下图:
- 远程连接ploybase1, 打开ssms(注:此版本需单独安装),执行脚本如下:
-
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'Password01!'; CREATE DATABASE SCOPED CREDENTIAL JNAzureCredit WITH IDENTITY = 'zhushy', Secret = XXXX'; /*存储访问Key*/ CREATE EXTERNAL DATA SOURCE JNAzureStorage with ( TYPE = HADOOP, LOCATION ='wasb://<blob_container_name>@<azure_storage_account_name>.blob.core.chinacloudapi.cn', CREDENTIAL = JNAzureCredit ); CREATE EXTERNAL FILE FORMAT TextFileFormat WITH ( FORMAT_TYPE = DELIMITEDTEXT, FORMAT_OPTIONS (FIELD_TERMINATOR =' ', USE_TYPE_DEFAULT = TRUE)) CREATE EXTERNAL TABLE [dbo].[hospital] ( [ID] varchar(128) NULL, [name] varchar(128) null ) WITH (LOCATION='/zhu/data1.txt', DATA_SOURCE = JNAzureStorage, FILE_FORMAT = TextFileFormat );
- 验证数据,输入如下SQL: select * from [dbo].[hospital] ,如下图,则OK