  • SQL Server: Comparing the Similarity of Two Strings (Function Algorithms)

      

    I. Overview

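    Someone recently asked about functions for measuring how similar two strings are, so I wrote this article. It covers three approaches: "simple fuzzy matching", "sequential matching", and "one-to-one positional matching". Functions like these come up from time to time in everyday work. Business requirements differ, so the versions here are only a reference; adapt them to your situation. Every function in this article divides the match count by the length of the first argument: when A is compared with B, the number of matching characters is divided by the length of A. Dividing by the length of B instead could push the similarity above 100% when A is longer than B, for example with 'abbc' and 'ab'.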

    To divide by the length of B instead, change 'SET @num=@num*1.0/LEN(@Cloumna)' at the end of the function to 'SET @num=@num*1.0/LEN(@Cloumnb)'.
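To make the choice of denominator concrete, the sketch below works through the 'abbc' / 'ab' example: it counts how many characters of A occur in B and divides that count by each length in turn. The names @A, @B, @i and @hits are illustrative and are not part of the functions in this article.

```sql
-- Illustrative only: @A, @B, @i and @hits are names chosen for this sketch.
DECLARE @A NVARCHAR(10) = N'abbc', @B NVARCHAR(10) = N'ab';
DECLARE @i INT = 1, @hits INT = 0;

-- Count the characters of @A that occur somewhere in @B
WHILE @i <= LEN(@A)
BEGIN
    IF CHARINDEX(SUBSTRING(@A, @i, 1), @B) > 0
        SET @hits = @hits + 1;
    SET @i = @i + 1;
END

SELECT @hits                 AS matched_chars,  -- 3 ('a', 'b', 'b')
       @hits * 1.0 / LEN(@A) AS divided_by_A,   -- 0.75
       @hits * 1.0 / LEN(@B) AS divided_by_B;   -- 1.5, i.e. above 100%
```

Dividing by the length of A keeps the result between 0 and 1, which is why the functions here use it.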

    1. Simple similarity between two strings

    -- Simple similarity between two fields:
    -- the share of characters in @Cloumna that occur anywhere in @Cloumnb
    CREATE FUNCTION DBO.FN_Resemble
        (@Cloumna NVARCHAR(MAX),
         @Cloumnb NVARCHAR(MAX)
        )
    RETURNS FLOAT
    AS
    BEGIN
        DECLARE @num FLOAT,@len INT
        DECLARE @a NVARCHAR(4)
        -- Treat NULL as an empty string
        SET @Cloumna=ISNULL(@Cloumna,N'')
        SET @Cloumnb=ISNULL(@Cloumnb,N'')
        SET @len=1
        SET @num=0
        -- If either string is empty, the similarity stays 0
        IF(LEN(@Cloumna)<>0 AND LEN(@Cloumnb)<>0)
        BEGIN
            WHILE(@len<=LEN(@Cloumna))
            BEGIN
                -- Take one character of @Cloumna and look for it in @Cloumnb
                SET @a=SUBSTRING(@Cloumna,@len,1)
                IF(CHARINDEX(@a,@Cloumnb)>0)
                BEGIN
                    SET @num=@num+1
                END
                SET @len=@len+1
            END
            SET @num=@num*1.0/LEN(@Cloumna)
        END

        RETURN @num
    END
    GO

    -- Test code
    SELECT DBO.FN_Resemble('ABDC321G','ABDC123G')  -- 1
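Because the simple match only asks whether each character of the first string occurs somewhere in the second, the result depends on argument order and ignores both position and repetition. The calls below illustrate this; they assume DBO.FN_Resemble from this section has already been created, and the column aliases are made up for the example.

```sql
-- Assumes DBO.FN_Resemble from this section has already been created.
SELECT DBO.FN_Resemble(N'AB',   N'ABCD') AS short_vs_long,   -- 1:   both characters of 'AB' occur in 'ABCD'
       DBO.FN_Resemble(N'ABCD', N'AB')   AS long_vs_short,   -- 0.5: only 'A' and 'B' of 'ABCD' occur in 'AB'
       DBO.FN_Resemble(N'AAAA', N'A')    AS repeated_chars;  -- 1:   every 'A' counts, however often it repeats
```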

    2. Sequential similarity between two strings

    -- Sequential similarity between two fields:
    -- the length of the common prefix, divided by the length of @Cloumna
    CREATE FUNCTION DBO.FN_Resemble_order
        (@Cloumna NVARCHAR(MAX),
         @Cloumnb NVARCHAR(MAX)
        )
    RETURNS FLOAT
    AS
    BEGIN
        DECLARE @num FLOAT,@len INT,@minlen INT
        DECLARE @a NVARCHAR(4),@b NVARCHAR(4)
        -- Treat NULL as an empty string
        SET @Cloumna=ISNULL(@Cloumna,N'')
        SET @Cloumnb=ISNULL(@Cloumnb,N'')
        SET @len=1
        SET @num=0
        -- If either string is empty, the similarity stays 0
        IF(LEN(@Cloumna)<>0 AND LEN(@Cloumnb)<>0)
        BEGIN
            -- Compare only the positions that exist in both strings
            SET @minlen=CASE WHEN LEN(@Cloumna)<LEN(@Cloumnb) THEN LEN(@Cloumna) ELSE LEN(@Cloumnb) END
            WHILE(@len<=@minlen)
            BEGIN
                SET @a=SUBSTRING(@Cloumna,@len,1)
                SET @b=SUBSTRING(@Cloumnb,@len,1)
                -- Stop counting at the first mismatch
                IF(@a<>@b)
                BEGIN
                    BREAK
                END
                SET @num=@num+1
                SET @len=@len+1
            END
            SET @num=@num*1.0/LEN(@Cloumna)
        END
        RETURN @num
    END
    GO

    -- Test code
    SELECT DBO.FN_Resemble_order('ABDC456G','ABDC123G')  -- 0.5
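One way such a function might be used is to rank candidate values by how long a prefix they share with a reference string. The sketch below assumes DBO.FN_Resemble_order from this section exists; the table variable @Candidates, its column code, and its contents are invented for illustration.

```sql
-- Assumes DBO.FN_Resemble_order from this section has already been created.
-- @Candidates and its rows are made up for this example.
DECLARE @Candidates TABLE (code NVARCHAR(20));

INSERT INTO @Candidates (code)
VALUES (N'ABDC123G'),   -- shares the prefix 'ABDC' with the reference
       (N'ABXX123G'),   -- shares the prefix 'AB'
       (N'ZBDC456G');   -- differs at the first character

SELECT code,
       DBO.FN_Resemble_order(N'ABDC456G', code) AS prefix_similarity
FROM @Candidates
ORDER BY prefix_similarity DESC;
-- prefix_similarity: 0.5, 0.25 and 0 respectively
```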

    3. One-to-one similarity between two strings

     

    -- One-to-one similarity between two fields:
    -- the number of positions holding the same character, divided by the length of @Cloumna
    CREATE FUNCTION DBO.FN_Resemble_onebyone
        (@Cloumna NVARCHAR(MAX),
         @Cloumnb NVARCHAR(MAX)
        )
    RETURNS FLOAT
    AS
    BEGIN
        DECLARE @num FLOAT,@len INT,@minlen INT
        DECLARE @a NVARCHAR(4),@b NVARCHAR(4)
        -- Treat NULL as an empty string
        SET @Cloumna=ISNULL(@Cloumna,N'')
        SET @Cloumnb=ISNULL(@Cloumnb,N'')
        SET @len=1
        SET @num=0
        -- If either string is empty, the similarity stays 0
        IF(LEN(@Cloumna)<>0 AND LEN(@Cloumnb)<>0)
        BEGIN
            -- Compare only the positions that exist in both strings
            SET @minlen=CASE WHEN LEN(@Cloumna)<LEN(@Cloumnb) THEN LEN(@Cloumna) ELSE LEN(@Cloumnb) END
            WHILE(@len<=@minlen)
            BEGIN
                SET @a=SUBSTRING(@Cloumna,@len,1)
                SET @b=SUBSTRING(@Cloumnb,@len,1)
                -- Count every position that matches; a mismatch does not stop the scan
                IF(@a=@b)
                BEGIN
                    SET @num=@num+1
                END
                SET @len=@len+1
            END
            SET @num=@num*1.0/LEN(@Cloumna)
        END
        RETURN @num
    END
    GO

    -- Test code
    SELECT DBO.FN_Resemble_onebyone('ABDC456G','ABDC123G')  -- 0.625
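The same position-by-position comparison can also be written without a loop. The sketch below is a set-based alternative, not the author's method: it generates one row per character position and counts the matching ones. The variables @A and @B, the CTE name positions, and the use of sys.all_objects as a convenient row source are all choices made for this example, so it only covers strings no longer than the number of rows in that catalog view.

```sql
-- Sketch only: a set-based take on the one-to-one comparison.
-- @A, @B, positions and the sys.all_objects row source are assumptions of this example.
DECLARE @A NVARCHAR(100) = N'ABDC456G', @B NVARCHAR(100) = N'ABDC123G';

WITH positions AS (
    -- One row per character position of @A: 1, 2, ..., LEN(@A)
    SELECT TOP (LEN(@A)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS i
    FROM sys.all_objects
    ORDER BY i
)
SELECT SUM(CASE WHEN i <= LEN(@B)
                 AND SUBSTRING(@A, i, 1) = SUBSTRING(@B, i, 1)
                THEN 1 ELSE 0 END) * 1.0 / LEN(@A) AS similarity  -- 0.625 for these inputs
FROM positions;
```

The `i <= LEN(@B)` guard keeps positions beyond the end of the shorter string from being compared.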

    4. Comparing two version numbers

    The function returns 1 if the first version is greater than the second, -1 if it is smaller, and 0 if the two are equal.

    CREATE FUNCTION dbo.FNStrCompare
    (@Val1  VARCHAR(50), -- first string to compare
     @Val2  VARCHAR(50), -- second string to compare
     @Break VARCHAR(10)  -- separator
    )
    RETURNS INT
    AS
    BEGIN
    DECLARE @Num1 INT
    DECLARE @Num2 INT
    DECLARE @Val1Num INT
    DECLARE @Val2Num INT
    DECLARE @a INT
    IF CHARINDEX(@Break,@Val1)>0 AND CHARINDEX(@Break,@Val2)>0
        BEGIN
            WHILE LEN(@Val1)>0 AND LEN(@Val2)>0
            BEGIN
                IF CHARINDEX(@Break,@Val1)>0 AND CHARINDEX(@Break,@Val2)>0
                BEGIN
                        -- Split off the leading segment of each string (assumes a one-character separator)
                        SET @Num1=CHARINDEX(@Break,@Val1)-1
                        SET @Val1Num=LEFT(@Val1,@Num1)
                        SET @Val1=SUBSTRING(@Val1,@Num1+2,LEN(@Val1))
    
                        SET @Num2=CHARINDEX(@Break,@Val2)-1
                        SET @Val2Num=LEFT(@Val2,@Num2)
                        SET @Val2=SUBSTRING(@Val2,@Num2+2,LEN(@Val2))
    
                END
                ELSE
                BEGIN
                        SET @Val1Num=CONVERT(INT,@Val1)
                        SET @Val2Num=CONVERT(INT,@Val2)
    
                        IF @Val1Num=@Val2Num
                        BEGIN
                            SET @a=0
                            BREAK
                        END
    
                END
    
                IF @Val1Num>@Val2Num
                BEGIN
                     SET @a=1
                     BREAK
                END
                IF @Val1Num<@Val2Num
                BEGIN
                     SET @a=-1
                     BREAK
                END
    
            END
        END
    ELSE
        BEGIN
            SET @Val1Num=CONVERT(INT,@Val1)
            SET @Val2Num=CONVERT(INT,@Val2)
            IF @Val1Num>@Val2Num
            BEGIN
                    SET @a=1
            END
            IF @Val1Num<@Val2Num
            BEGIN
                    SET @a=-1
            END
            IF @Val1Num=@Val2Num
            BEGIN
                    SET @a=0
            END
    
        END
    
    RETURN @a
    
    END

    Usage

    SELECT dbo.FNStrCompare('1.15.1','1.15.1','.')  -- 0
    
    SELECT dbo.FNStrCompare('1.15.2','1.15.1','.')  -- 1
    
    SELECT dbo.FNStrCompare('1.15.2','2.3.1','.')   -- -1
    
    SELECT dbo.FNStrCompare('1.08.2','1.15.1','.')  -- -1
    
    SELECT dbo.FNStrCompare('1','2','.')            -- -1
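FNStrCompare appears to assume that both strings have the same number of segments: once only one of them still contains the separator, the remainder of the other (for example '15.1') is passed to CONVERT(INT, ...) and the conversion fails. The sketch below is one possible variant that treats a missing segment as 0. The name dbo.FNVersionCompare_Sketch and the variables @Pos, @Seg1, @Seg2 and @BreakLen are hypothetical, and the code assumes every segment consists of digits only and fits in an INT.

```sql
-- Sketch only: dbo.FNVersionCompare_Sketch is a hypothetical name, not the author's function.
-- Assumes every segment is made of digits and fits in an INT.
CREATE FUNCTION dbo.FNVersionCompare_Sketch
(@Val1  VARCHAR(50), -- first version string
 @Val2  VARCHAR(50), -- second version string
 @Break VARCHAR(10)  -- separator
)
RETURNS INT
AS
BEGIN
    DECLARE @Pos INT, @Seg1 INT, @Seg2 INT
    -- Separator length; appending 'x' keeps LEN from ignoring trailing spaces
    DECLARE @BreakLen INT = LEN(@Break + 'x') - 1

    WHILE LEN(@Val1) > 0 OR LEN(@Val2) > 0
    BEGIN
        -- Next segment of @Val1; an empty string converts to 0
        SET @Pos = CHARINDEX(@Break, @Val1)
        IF @Pos > 0
        BEGIN
            SET @Seg1 = CONVERT(INT, LEFT(@Val1, @Pos - 1))
            SET @Val1 = SUBSTRING(@Val1, @Pos + @BreakLen, LEN(@Val1))
        END
        ELSE
        BEGIN
            SET @Seg1 = CONVERT(INT, @Val1)
            SET @Val1 = ''
        END

        -- Next segment of @Val2, handled the same way
        SET @Pos = CHARINDEX(@Break, @Val2)
        IF @Pos > 0
        BEGIN
            SET @Seg2 = CONVERT(INT, LEFT(@Val2, @Pos - 1))
            SET @Val2 = SUBSTRING(@Val2, @Pos + @BreakLen, LEN(@Val2))
        END
        ELSE
        BEGIN
            SET @Seg2 = CONVERT(INT, @Val2)
            SET @Val2 = ''
        END

        -- The first differing segment decides the result
        IF @Seg1 > @Seg2 RETURN 1
        IF @Seg1 < @Seg2 RETURN -1
    END

    -- All segments were equal
    RETURN 0
END
GO

SELECT dbo.FNVersionCompare_Sketch('1.15', '1.15.1', '.')  -- -1: the missing third segment counts as 0
SELECT dbo.FNVersionCompare_Sketch('1.0',  '1',      '.')  -- 0
```

Whether '1.0' and '1' should count as equal is a business decision; this sketch simply pads the shorter version with zeros.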
  • Original article: https://www.cnblogs.com/xuxiaoshuan/p/15656350.html