  • SQL Server: Comparing the Similarity of Two Strings (Function Algorithms)

    I. Overview

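       Someone recently asked me for a function that computes the similarity of two strings, so I wrote this article. It covers three approaches: "simple fuzzy matching", "ordered matching", and "one-to-one positional matching". Functions like these come up from time to time, and business requirements differ, so treat what follows as a reference and adapt it to your own case. Every comparison in this article divides by the length of the first string: when A is compared with B, the number of matches is divided by the length of A. The reason is that if A is longer than B, dividing by the length of B can push the similarity above 100%, for example with 'abbc' and 'ab'.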

    If you would rather divide by the length of B, change the final statement 'SET @num=@num*1.0/LEN(@Cloumna)' to 'SET @num=@num*1.0/LEN(@Cloumnb)'.
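
    To make the 'abbc' / 'ab' example concrete, here is a small sketch of the arithmetic under the simple matching rule (each character of A is counted if it occurs anywhere in B). The variable @matched is only for illustration and is set by hand: three characters of 'abbc' (a, b, b) occur in 'ab'.

    ```sql
    -- Illustrative only: @matched is filled in by hand, not computed
    DECLARE @matched FLOAT = 3;
    SELECT @matched / LEN('abbc') AS DivideByA,  -- 3 / 4 = 0.75
           @matched / LEN('ab')   AS DivideByB;  -- 3 / 2 = 1.5, i.e. above 100%
    ```

    Dividing by the length of A keeps the result between 0 and 1, which is why the functions in this article use it.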

    1. Simple similarity of two strings

    -- Simple similarity of two strings
    CREATE FUNCTION DBO.FN_Resemble
        (@Cloumna NVARCHAR(MAX),
         @Cloumnb NVARCHAR(MAX)
        )
    RETURNS FLOAT
    AS
    BEGIN
        DECLARE @num FLOAT,@len int
        DECLARE @a NVARCHAR(4)
        -- Treat NULL as an empty string (ISNULL(...,0) would turn NULL into the string '0')
        SET @Cloumna=ISNULL(@Cloumna,N'')
        SET @Cloumnb=ISNULL(@Cloumnb,N'')
        SET @len=1
        SET @num=0
        -- Compare only when both strings are non-empty; otherwise the result stays 0
        IF(LEN(@Cloumna)<>0 AND LEN(@Cloumnb)<>0)
        BEGIN
            WHILE(@len<=LEN(@Cloumna))
            BEGIN
                -- Count each character of @Cloumna that occurs anywhere in @Cloumnb
                SET @a=SUBSTRING(@Cloumna,@len,1)
                IF(CHARINDEX(@a,@Cloumnb)>0)
                BEGIN
                    SET @num=@num+1
                END
                SET @len=@len+1
            END
            SET @num=@num*1.0/LEN(@Cloumna)
        END

        RETURN @num
    END
    GO
    
    
    -- Test code
    -- Every character of the first string occurs in the second, so this returns 1
    SELECT DBO.FN_Resemble('ABDC321G','ABDC123G')

    2. Ordered similarity of two strings

    -- Ordered similarity of two strings
    CREATE FUNCTION DBO.FN_Resemble_order
        (@Cloumna NVARCHAR(MAX),
         @Cloumnb NVARCHAR(MAX)
        )
    RETURNS FLOAT
    AS
    BEGIN
        DECLARE @num FLOAT,@len int,@minlen int
        DECLARE @a NVARCHAR(4)
        DECLARE @b NVARCHAR(4)
        -- Treat NULL as an empty string (ISNULL(...,0) would turn NULL into the string '0')
        SET @Cloumna=ISNULL(@Cloumna,N'')
        SET @Cloumnb=ISNULL(@Cloumnb,N'')
        SET @len=1
        SET @num=0
        IF(LEN(@Cloumna)<>0 AND LEN(@Cloumnb)<>0)
        BEGIN
            -- Compare only up to the length of the shorter string
            SET @minlen=CASE WHEN LEN(@Cloumna)>=LEN(@Cloumnb) THEN LEN(@Cloumnb) ELSE LEN(@Cloumna) END
            WHILE(@len<=@minlen)
            BEGIN
                SET @a=SUBSTRING(@Cloumna,@len,1)
                SET @b=SUBSTRING(@Cloumnb,@len,1)
                IF(@a=@b)
                BEGIN
                    SET @num=@num+1
                END
                ELSE
                BEGIN
                    -- Stop at the first mismatch: only the common prefix counts
                    BREAK
                END
                SET @len=@len+1
            END
            SET @num=@num*1.0/LEN(@Cloumna)
        END
        RETURN @num
    END
    GO
    
    -- Test code
    -- The common prefix 'ABDC' is 4 of 8 characters, so this returns 0.5
    SELECT DBO.FN_Resemble_order('ABDC456G','ABDC123G')

    3. One-to-one (positional) similarity of two strings

    -- One-to-one (positional) similarity of two strings
    CREATE FUNCTION DBO.FN_Resemble_onebyone
        (@Cloumna NVARCHAR(MAX),
         @Cloumnb NVARCHAR(MAX)
        )
    RETURNS FLOAT
    AS
    BEGIN
        DECLARE @num FLOAT,@len int,@minlen int
        DECLARE @a NVARCHAR(4)
        DECLARE @b NVARCHAR(4)
        -- Treat NULL as an empty string (ISNULL(...,0) would turn NULL into the string '0')
        SET @Cloumna=ISNULL(@Cloumna,N'')
        SET @Cloumnb=ISNULL(@Cloumnb,N'')
        SET @len=1
        SET @num=0
        IF(LEN(@Cloumna)<>0 AND LEN(@Cloumnb)<>0)
        BEGIN
            -- Compare only up to the length of the shorter string
            SET @minlen=CASE WHEN LEN(@Cloumna)>=LEN(@Cloumnb) THEN LEN(@Cloumnb) ELSE LEN(@Cloumna) END
            WHILE(@len<=@minlen)
            BEGIN
                -- Count every position where both strings hold the same character
                SET @a=SUBSTRING(@Cloumna,@len,1)
                SET @b=SUBSTRING(@Cloumnb,@len,1)
                IF(@a=@b)
                BEGIN
                    SET @num=@num+1
                END
                SET @len=@len+1
            END
            SET @num=@num*1.0/LEN(@Cloumna)
        END
        RETURN @num
    END
    GO
    
    -- Test code
    -- Positions 1-4 and 8 match (5 of 8 characters), so this returns 0.625
    SELECT DBO.FN_Resemble_onebyone('ABDC456G','ABDC123G')
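
    To see how the three rules differ, the following sketch runs all three functions on the same pair of strings. It assumes the three functions above have already been created in the current database; the column aliases are made up for illustration.

    ```sql
    -- Same input, three different similarity rules
    SELECT DBO.FN_Resemble('ABDC321G','ABDC123G')          AS SimpleMatch,     -- all 8 characters occur in the second string: 8/8 = 1
           DBO.FN_Resemble_order('ABDC321G','ABDC123G')    AS OrderedMatch,    -- common prefix 'ABDC': 4/8 = 0.5
           DBO.FN_Resemble_onebyone('ABDC321G','ABDC123G') AS PositionalMatch; -- positions 1-4, 6 and 8 match: 6/8 = 0.75
    ```

    Simple matching ignores position entirely, ordered matching stops at the first mismatch, and positional matching sits between the two.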

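    4. Comparing two version numbers

    The function returns 1 if the first version is greater than the second, -1 if it is smaller, and 0 if they are equal.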

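    -- Use ALTER FUNCTION instead if the function already exists
    CREATE FUNCTION FNStrCompare
    (@Val1  VARCHAR(50),-- first string to compare
     @Val2  VARCHAR(50),-- second string to compare
     @Break VARCHAR(10) -- separator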
    )
    RETURNS INT
    AS
    BEGIN
    DECLARE @Num1 INT
    DECLARE @Num2 INT
    DECLARE @Val1Num INT
    DECLARE @Val2Num INT
    DECLARE @a INT
    IF CHARINDEX(@Break,@Val1)>0 AND CHARINDEX(@Break,@Val2)>0
        BEGIN
            WHILE LEN(@Val1)>0 AND LEN(@Val2)>0
            BEGIN
                IF CHARINDEX(@Break,@Val1)>0 AND CHARINDEX(@Break,@Val2)>0
                BEGIN
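                        -- Split off the leading segment of each string (the +2 assumes a one-character separator)
                        SET @Num1=CHARINDEX(@Break,@Val1)-1
                        SET @Val1Num=LEFT(@Val1,@Num1)
                        SET @Val1=SUBSTRING(@Val1,@Num1+2,LEN(@Val1))
    
                        SET @Num2=CHARINDEX(@Break,@Val2)-1
                        SET @Val2Num=LEFT(@Val2,@Num2)
                        -- Use @Num2 here; using @Num1 cuts @Val2 at the wrong place when the segments differ in length
                        SET @Val2=SUBSTRING(@Val2,@Num2+2,LEN(@Val2))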
    
                END
                ELSE
                BEGIN
                        SET @Val1Num=CONVERT(INT,@Val1)
                        SET @Val2Num=CONVERT(INT,@Val2)
    
                        IF @Val1Num=@Val2Num
                        BEGIN
                            SET @a=0
                            BREAK
                        END
    
                END
    
                IF @Val1Num>@Val2Num
                BEGIN
                     SET @a=1
                     BREAK
                END
                IF @Val1Num<@Val2Num
                BEGIN
                     SET @a=-1
                     BREAK
                END
    
            END
        END
    ELSE
        BEGIN
            SET @Val1Num=CONVERT(INT,@Val1)
            SET @Val2Num=CONVERT(INT,@Val2)
            IF @Val1Num>@Val2Num
            BEGIN
                    SET @a=1
            END
            IF @Val1Num<@Val2Num
            BEGIN
                    SET @a=-1
            END
            IF @Val1Num=@Val2Num
            BEGIN
                    SET @a=0
            END
    
        END
    
    RETURN @a
    
    END

     Run it:

    SELECT chenmh.dbo.FNStrCompare('1.15.1','1.15.1','.')
    
    SELECT chenmh.dbo.FNStrCompare('1.15.2','1.15.1','.')
    
    SELECT chenmh.dbo.FNStrCompare('1.15.2','2.3.1','.')
    
    SELECT chenmh.dbo.FNStrCompare('1.08.2','1.15.1','.')
    
    SELECT dbo.FNStrCompare('1','2','.')
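
    The function can also be applied to a column rather than to literals. A minimal sketch, assuming the function was created in the dbo schema of the current database; the table variable @versions and its contents are hypothetical.

    ```sql
    -- Hypothetical sample data, for illustration only
    DECLARE @versions TABLE (Version VARCHAR(50));
    INSERT INTO @versions (Version) VALUES ('1.15.1'),('1.15.2'),('1.08.2');

    -- Compare every row against a fixed reference version
    SELECT Version,
           dbo.FNStrCompare(Version,'1.15.1','.') AS CompareResult
    FROM @versions;
    -- '1.15.1' gives 0, '1.15.2' gives 1, '1.08.2' gives -1
    ```

    All versions in this sketch have the same number of segments; comparing, say, '1.15' with '1.15.1' would fail on the integer conversion, so inputs of that kind need to be normalized first.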

      

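    Notes:

        Author: pursuer.chen

        Blog: http://www.cnblogs.com/chenmh

    All posts on this site are original. You are welcome to repost them, but you must credit the source and place a clearly visible link at the top of the article; otherwise the author reserves the right to pursue liability.

    Comments and discussion are welcome.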

  • Original article: https://www.cnblogs.com/chenmh/p/3967913.html