  • Spark Tutorial (18): Spark SQL User-Defined Functions

    Spark SQL also lets users define their own functions, including UDFs and UDAFs, but not UDTFs.

    Official API

    class pyspark.sql.UDFRegistration(sparkSession)

        register(name, f, returnType=None)

          Register a Python function (including lambda function) or a user-defined function as a SQL function.

          Parameters

          name – name of the user-defined function in SQL statements.

          f – a Python function, or a user-defined function. The user-defined function can be either row-at-a-time or vectorized. See pyspark.sql.functions.udf() and pyspark.sql.functions.pandas_udf().

          returnType – the return type of the registered user-defined function. The value can be either a pyspark.sql.types.DataType object or a DDL-formatted type string.

          Returns

          a user-defined function.

      registerJavaFunction(name, javaClassName, returnType=None)

      registerJavaUDAF(name, javaClassName)

     Example code

    # No returnType is given, so the result is returned as a string.
    strlen = spark.udf.register("stringLengthString", lambda x: len(x))
    spark.sql("SELECT stringLengthString('test')").collect()        # 'test' is just a string literal
    # [Row(stringLengthString(test)=u'4')]
    spark.sql("SELECT stringLengthString(name) from hive1101.person limit 3").collect()     # read from a Hive table
    # [Row(stringLengthString(name)=u'4'), Row(stringLengthString(name)=u'4'), Row(stringLengthString(name)=u'4')]
    
    
    
    from pyspark.sql.types import IntegerType
    from pyspark.sql.functions import udf
    slen = udf(lambda s: len(s), IntegerType())
    _ = spark.udf.register("slen", slen)
    spark.sql("SELECT slen('test')").collect()
    # [Row(slen(test)=4)]

    References:

    https://spark.apache.org/docs/latest/api/python/pyspark.sql.html#pyspark.sql.UDFRegistration  (official documentation, with more examples)

  • Original article: https://www.cnblogs.com/yanshw/p/11982652.html