  • Spark SQL built-in functions: string functions

     When reposting, please credit the source: http://www.cnblogs.com/feiyumo/p/8763186.html

    1. concat: concatenates strings

    concat(str1, str2, ..., strN) - Returns the concatenation of str1, str2, ..., strN.

    Examples:> SELECT concat('Spark', 'SQL');  SparkSQL

    2. concat_ws: concatenates strings with a separator inserted between them

    concat_ws(sep, [str | array(str)]+) - Returns the concatenation of the strings separated by sep.

    Examples:> SELECT concat_ws(' ', 'Spark', 'SQL');  Spark SQL

    3. decode: decodes binary data into a string

    decode(bin, charset) - Decodes the first argument using the second argument character set.

    Examples: > SELECT decode(encode('abc', 'utf-8'), 'utf-8');   abc

    4. encode: encodes a string using a given character set

    encode(str, charset) - Encodes the first argument using the second argument character set.

    Examples: > SELECT encode('abc', 'utf-8');   abc
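
    The encode/decode round trip can be tried outside Spark as well. The following is a minimal Python sketch that uses Python's own str/bytes methods as an analogy; it is not Spark code, and the variable names are illustrative only.

```python
# Python analogy for Spark SQL's encode/decode (not Spark itself).
text = "abc"
binary = text.encode("utf-8")      # like encode('abc', 'utf-8'): string -> binary
restored = binary.decode("utf-8")  # like decode(bin, 'utf-8'): binary -> string
print(binary, restored)
```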

    5. format_string / printf: formats a string

    format_string(strfmt, obj, ...) - Returns a formatted string from printf-style format strings.

    Examples:> SELECT format_string("Hello World %d %s", 100, "days");  Hello World 100 days
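
    For readers more familiar with Python, the % operator follows similar printf-style conventions. This is only an analogy for the placeholders used above, not a statement about how Spark implements format_string.

```python
# printf-style formatting in Python: %d takes an integer, %s takes a string.
message = "Hello World %d %s" % (100, "days")
print(message)
```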

    6. initcap: capitalizes the first letter of each word and lowercases the rest; lower converts everything to lowercase, upper to uppercase

    initcap(str) - Returns str with the first letter of each word in uppercase. All other letters are in lowercase. Words are delimited by white space.

    Examples:> SELECT initcap('sPark sql');  Spark Sql
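
    The behaviour described above can be sketched in plain Python. The helper below is hypothetical (it is not a Spark API) and, as a simplifying assumption, treats only the single space character as a word delimiter.

```python
def initcap(s: str) -> str:
    """Uppercase the first letter of each space-delimited word, lowercase the rest."""
    # str.capitalize() uppercases the first character and lowercases the others.
    return " ".join(word.capitalize() for word in s.split(" "))

print(initcap("sPark sql"))
print("Spark SQL".lower(), "Spark SQL".upper())
```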

    7. length: returns the length of a string

    Examples:> SELECT length('Spark SQL ');  10

    8. levenshtein: edit distance (the number of single-character edits needed to turn one string into another)

    levenshtein(str1, str2) - Returns the Levenshtein distance between the two given strings.

    Examples:> SELECT levenshtein('kitten', 'sitting');   3
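
    To see where the 3 comes from, here is a minimal sketch of the classic dynamic-programming computation of Levenshtein distance in Python. It illustrates the definition only and is not taken from Spark's source.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (free if equal)
            ))
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))
```

    For 'kitten' and 'sitting' the three edits are k→s, e→i, and appending g.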

    9. lpad: returns a string of fixed length, padded on the left with the given characters if it is too short; rpad pads on the right

    lpad(str, len, pad) - Returns str, left-padded with pad to a length of len. If str is longer than len, the return value is shortened to len characters.

    Examples:> SELECT lpad('hi', 5, '??');   ???hi
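
    The padding and truncation rules described above can be sketched in Python as follows. The helpers are hypothetical stand-ins, not Spark APIs, and the sketch assumes a non-empty pad string and a non-negative len.

```python
def lpad(s: str, length: int, pad: str) -> str:
    """Left-pad s with pad up to length; truncate s if it is already longer."""
    if length <= len(s):
        return s[:max(length, 0)]
    fill = length - len(s)
    # Repeat the pad string and cut it to exactly the number of missing characters.
    return (pad * fill)[:fill] + s

def rpad(s: str, length: int, pad: str) -> str:
    """Right-pad s with pad up to length; truncate s if it is already longer."""
    if length <= len(s):
        return s[:max(length, 0)]
    fill = length - len(s)
    return s + (pad * fill)[:fill]

print(lpad("hi", 5, "??"), lpad("hi", 1, "??"), rpad("hi", 5, "??"))
```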

    10. ltrim: removes leading spaces or specified leading characters; rtrim removes from the right, trim from both ends

    ltrim(str) - Removes the leading space characters from str.

    ltrim(trimStr, str) - Removes leading characters from str that appear in trimStr.

    Examples:

    > SELECT ltrim('    SparkSQL   ');   SparkSQL
    > SELECT ltrim('Sp', 'SSparkSQLS');   arkSQLS
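
    The second example shows that the trim string acts as a set of characters, not as a prefix: every leading S or p is removed until a different character appears. Python's str.lstrip behaves the same way, which makes it a convenient analogy (the helper name is hypothetical, and the argument order here follows Python rather than the SQL signature).

```python
def ltrim(s: str, trim_chars: str = " ") -> str:
    """Strip leading characters of s that belong to the set trim_chars."""
    return s.lstrip(trim_chars)

print(repr(ltrim("    SparkSQL   ")))
print(ltrim("SSparkSQLS", "Sp"))
```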

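    11. regexp_extract: extracts the part of a string that matches a regular expression group; regexp_replace: replaces matches of a regular expression

    Examples:> SELECT regexp_extract('100-200', '(\\d+)-(\\d+)', 1);   100

    Examples: > SELECT regexp_replace('100-200', '(\\d+)', 'num');   num-num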
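
    The same two operations can be tried with Python's re module. This is a sketch for experimenting with patterns: the helper names mirror the SQL functions but are hypothetical, Python's regex dialect differs from Java's in some details, and returning an empty string when nothing matches is an assumption of this sketch.

```python
import re

def regexp_extract(s: str, pattern: str, idx: int) -> str:
    """Return capture group idx of the first match, or '' if there is none."""
    match = re.search(pattern, s)
    return (match.group(idx) or "") if match else ""

def regexp_replace(s: str, pattern: str, replacement: str) -> str:
    """Replace every match of pattern in s with replacement."""
    return re.sub(pattern, replacement, s)

# Raw strings keep the backslash in \d (digit) intact.
print(regexp_extract("100-200", r"(\d+)-(\d+)", 1))
print(regexp_extract("100-200", r"(\d+)-(\d+)", 2))
print(regexp_replace("100-200", r"(\d+)", "num"))
```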

    12. repeat: repeats the given string n times

    Examples: > SELECT repeat('123', 2);  123123

    13. instr / locate: returns the position of a substring within a string

    instr(str, substr) - Returns the (1-based) index of the first occurrence of substr in str.

    Examples:> SELECT instr('SparkSQL', 'SQL');  6

    Examples:> SELECT locate('bar', 'foobarbar');   4
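
    Note that the two functions take their arguments in opposite order, and both count positions from 1. A small Python sketch of that convention (hypothetical helpers, covering only the two-argument forms shown above):

```python
def instr(s: str, substr: str) -> int:
    """1-based index of the first occurrence of substr in s, or 0 if absent."""
    return s.find(substr) + 1  # str.find is 0-based and returns -1 when absent

def locate(substr: str, s: str) -> int:
    """Same lookup as instr, with the substring given first."""
    return instr(s, substr)

print(instr("SparkSQL", "SQL"), locate("bar", "foobarbar"))
```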

    14. space: returns a string of n spaces (for example, to prepend n spaces to a string)

    space(n) - Returns a string consisting of n spaces.

    Examples:> SELECT concat(space(2), '1');  1

    15. split: splits a string around matches of a regular expression

    split(str, regex) - Splits str around occurrences that match regex.

    Examples:> SELECT split('oneAtwoBthreeC', '[ABC]');      ["one","two","three",""]
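
    The trailing empty string in the result appears because the input ends with a delimiter (C), so there is an empty piece after it. Python's re.split shows the same effect for this input; this is an analogy, not Spark code.

```python
import re

# Split around any single character A, B, or C.
parts = re.split("[ABC]", "oneAtwoBthreeC")
print(parts)
```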

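    16. substr: extracts part of a string; substring_index: returns the part of a string before a given number of delimiter occurrences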

    Examples:

    > SELECT substr('Spark SQL', 5);  k SQL
    > SELECT substr('Spark SQL', -3);  SQL
    > SELECT substr('Spark SQL', 5, 1);   k
    > SELECT substring_index('www.apache.org', '.', 2);   www.apache
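
    The examples rely on 1-based positions, with negative positions counting from the end of the string. A minimal Python sketch of these rules follows. The helpers are hypothetical, cover only in-range positions like those shown above, and the handling of a negative count in substring_index is an assumption not stated in the text.

```python
from typing import Optional

def substr(s: str, pos: int, length: Optional[int] = None) -> str:
    """Substring starting at 1-based pos (negative pos counts from the end)."""
    start = pos - 1 if pos > 0 else len(s) + pos
    end = len(s) if length is None else start + length
    return s[start:end]

def substring_index(s: str, delim: str, count: int) -> str:
    """Part of s before the count-th delim; for negative count, the part after it from the right."""
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])
    if count < 0:  # assumption: mirror image of the positive case
        return delim.join(parts[count:])
    return ""

print(substr("Spark SQL", 5), substr("Spark SQL", -3), substr("Spark SQL", 5, 1))
print(substring_index("www.apache.org", ".", 2))
```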

    17. translate: replaces individual characters with their corresponding replacement characters

    Examples: > SELECT translate('AaBbCc', 'abc', '123');   A1B2C3
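
    The replacement works character by character: a becomes 1, b becomes 2, c becomes 3, and the uppercase letters are untouched because they are not in the source set. Python's str.translate offers a close analogy (not Spark code):

```python
# Build a character-to-character mapping: a->1, b->2, c->3.
table = str.maketrans("abc", "123")
result = "AaBbCc".translate(table)
print(result)
```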

    18. get_json_object

    get_json_object(json_txt, path) - Extracts a json object from path.

    Examples:> SELECT get_json_object('{"a":"b"}', '$.a');  b
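
    As a rough illustration of what a path such as $.a selects, the sketch below walks a parsed JSON document in Python. The helper is hypothetical and supports only simple dotted paths of object keys; it does not attempt to reproduce the full path syntax or the return types of the Spark function.

```python
import json

def get_by_path(json_txt: str, path: str):
    """Follow a '$.key1.key2' style path through a JSON object; None if missing."""
    if not path.startswith("$"):
        return None
    node = json.loads(json_txt)
    for key in filter(None, path[1:].split(".")):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node

print(get_by_path('{"a":"b"}', "$.a"))
```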

    19. unhex

    unhex(expr) - Converts hexadecimal expr to binary.

    Examples:> SELECT decode(unhex('537061726B2053514C'), 'UTF-8');   Spark SQL
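
    The hex string in the example is simply the UTF-8 bytes of the result, two hex digits per byte (53 = S, 70 = p, and so on). The same conversion in Python, as an analogy:

```python
# Hex digits -> raw bytes -> text, mirroring decode(unhex(...), 'UTF-8').
raw = bytes.fromhex("537061726B2053514C")
text = raw.decode("utf-8")
print(text)
```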

    20. to_json

    to_json(expr[, options]) - Returns a json string with a given struct value

    Examples:

    > SELECT to_json(named_struct('a', 1, 'b', 2));   {"a":1,"b":2}
    > SELECT to_json(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));   {"time":"26/08/2015"}
    > SELECT to_json(array(named_struct('a', 1, 'b', 2)));   [{"a":1,"b":2}]
    > SELECT to_json(map('a', named_struct('b', 1)));  {"a":{"b":1}}
    > SELECT to_json(map(named_struct('a', 1),named_struct('b', 2)));   {"[1]":{"b":2}}
    > SELECT to_json(map('a', 1));  {"a":1}
    > SELECT to_json(array((map('a', 1))));  [{"a":1}]

    Since: 2.2.0
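
    The shape of the output for structs, arrays, and maps matches ordinary compact JSON serialization. As a loose analogy, Python dictionaries and lists stand in for structs, maps, and arrays below; this is not Spark code and does not cover the options argument.

```python
import json

def compact_json(value) -> str:
    """Serialize without spaces after separators, matching the compact output above."""
    return json.dumps(value, separators=(",", ":"))

print(compact_json({"a": 1, "b": 2}))
print(compact_json([{"a": 1, "b": 2}]))
print(compact_json({"a": {"b": 1}}))
```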

  • Original article: https://www.cnblogs.com/feiyumo/p/8763186.html