  • [Original] POJ 1035 Spell checker: Edit Distance, Solution Report

    http://poj.org/problem?id=1035

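    Method 1: dynamic programming, O(len^2) per pair of words

    Code: EditDist (in the listing after the problem statement)

    Let c[i,j] be the edit distance between s1[0...i] and s2[0...j]. Row 0 and column 0 of the matrix c hold the base cases (c[i,0] = i and c[0,j] = j, the cost of reaching an empty prefix), so in practice the indices are shifted by one and c[len1,len2] is the edit distance between s1 and s2.
    When s1[i]==s2[j], no edit is needed at this position, so c[i,j] = c[i-1,j-1].
    When s1[i]!=s2[j], consider the possible edits to s1:
           Delete s1[i], then compare s1[0...i-1] with s2[0...j]: the distance is c[i-1,j]
           Replace s1[i] with s2[j], so that s1[i]==s2[j], then compare s1[0...i-1] with s2[0...j-1]: the distance is c[i-1,j-1]
           Insert s2[j] after s1[i], then compare s1[0...i] with s2[0...j-1]: the distance is c[i,j-1]
    c[i,j] is then the minimum of c[i-1,j], c[i,j-1] and c[i-1,j-1], plus one for the edit itself.
    Applying the three operations to s2[j] instead leads to the same three entries of c, so nothing new has to be computed.
    This gives the following recurrence:

    c[i,j] = c[i-1,j-1] , if s1[i]==s2[j]
    c[i,j] = min{ c[i,j-1] , c[i-1,j] , c[i-1,j-1] }+1 , otherwise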
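
    The recurrence can be tried out in isolation with a minimal sketch like the one below. The function name editDistance and the use of std::vector are assumptions made for illustration, not part of the original solution; the base cases c[i][0] = i and c[0][j] = j are filled in explicitly.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): edit distance between
// a and b. c[i][j] is the distance between the first i characters of a and
// the first j characters of b.
int editDistance(const std::string& a, const std::string& b) {
    const std::size_t n = a.size();
    const std::size_t m = b.size();
    std::vector<std::vector<int>> c(n + 1, std::vector<int>(m + 1, 0));

    // Base cases: reaching the empty prefix costs one delete per character.
    for (std::size_t i = 0; i <= n; ++i) c[i][0] = static_cast<int>(i);
    for (std::size_t j = 0; j <= m; ++j) c[0][j] = static_cast<int>(j);

    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            if (a[i - 1] == b[j - 1]) {
                // Matching characters: no edit needed at this position.
                c[i][j] = c[i - 1][j - 1];
            } else {
                // Replace, delete, or insert: cheapest of the three, plus one.
                c[i][j] = std::min({c[i - 1][j - 1], c[i - 1][j], c[i][j - 1]}) + 1;
            }
        }
    }
    return c[n][m];
}
```

    For the sample words, editDistance("mre", "more") and editDistance("aware", "award") both evaluate to 1, which is why "more" and "award" are reported as replacements.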

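    Method 2: sequential scan, O(len) per pair of words

    Code: run1035() (in the listing after the problem statement, together with its helper check)

    Only an edit distance of exactly 1 is of interest here, so a linear-time check is enough.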
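
    One compact way to express the linear scan is sketched below: skip the common prefix, step over the single differing position, and require the remaining suffixes to match. The name withinOneEdit is hypothetical, and this is an alternative formulation for illustration rather than the author's check function.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original post): true when a and b differ
// by exactly one replace, insert, or delete. Identical strings return false.
bool withinOneEdit(const std::string& a, const std::string& b) {
    const std::string& shorter = a.size() <= b.size() ? a : b;
    const std::string& longer  = a.size() <= b.size() ? b : a;
    if (longer.size() - shorter.size() > 1) return false;

    // Skip the common prefix.
    std::size_t i = 0;
    while (i < shorter.size() && shorter[i] == longer[i]) ++i;

    if (shorter.size() == longer.size()) {
        if (i == shorter.size()) return false;  // no difference at all
        // Replace: everything after position i must match.
        return shorter.compare(i + 1, std::string::npos,
                               longer, i + 1, std::string::npos) == 0;
    }
    // Insert/delete: longer[i] is the extra character; the rest must line up.
    return shorter.compare(i, std::string::npos,
                           longer, i + 1, std::string::npos) == 0;
}
```

    Each call touches every character at most a constant number of times, so checking one word against the whole dictionary costs O(len) per dictionary entry.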

    Description

    You, as a member of a development team for a new spell checking program, are to write a module that will check the correctness of given words using a known dictionary of all correct words in all their forms.
    If the word is absent in the dictionary then it can be replaced by correct words (from the dictionary) that can be obtained by one of the following operations:
    • deleting of one letter from the word;
    • replacing of one letter in the word with an arbitrary letter;
    • inserting of one arbitrary letter into the word.
    Your task is to write the program that will find all possible replacements from the dictionary for every given word.

    Input

    The first part of the input file contains all words from the dictionary. Each word occupies its own line. This part is finished by the single character '#' on a separate line. All words are different. There will be at most 10000 words in the dictionary.
    The next part of the file contains all words that are to be checked. Each word occupies its own line. This part is also finished by the single character '#' on a separate line. There will be at most 50 words that are to be checked.
    All words in the input file (words from the dictionary and words to be checked) consist only of small alphabetic characters and each one contains 15 characters at most.

    Output

    Write to the output file exactly one line for every checked word in the order of their appearance in the second part of the input file. If the word is correct (i.e. it exists in the dictionary) write the message: "<checked word> is correct". If the word is not correct then write this word first, then write the character ':' (colon), and after a single space write all its possible replacements, separated by spaces. The replacements should be written in the order of their appearance in the dictionary (in the first part of the input file). If there are no replacements for this word then the line feed should immediately follow the colon.
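
    The output rules above are easy to get subtly wrong, in particular the case with no replacements, where nothing may follow the colon. A minimal sketch of the line formatting, assuming a hypothetical helper formatResult that receives the replacements already in dictionary order:

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): builds one output line
// (without the trailing newline) for a checked word.
std::string formatResult(const std::string& word,
                         bool inDictionary,
                         const std::vector<std::string>& replacements) {
    if (inDictionary) return word + " is correct";

    std::string line = word + ":";
    // Put the space before each replacement, so a word with no replacements
    // yields just "word:" with nothing after the colon.
    for (const std::string& r : replacements) {
        line += ' ';
        line += r;
    }
    return line;
}
```

    Emitting the separator before each replacement rather than after it avoids both a trailing space at the end of the line and a stray space after the colon.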

    Sample Input

    i
    is
    has
    have
    be
    my
    more
    contest
    me
    too
    if
    award
    #
    me
    aware
    m
    contest
    hav
    oo
    or
    i
    fi
    mre
    #

    Sample Output

    me is correct
    aware: award
    m: i my me
    contest is correct
    hav: has have
    oo: too
    or:
    i is correct
    fi: i
    mre: more me


    #include <iostream>
    #include <set>
    #include <string>
    #include <vector>
    using namespace std;

    // Method 1: fills the DP table c and returns the edit distance of s1 and s2.
    // c[i][j] is the distance between the first i characters of s1 and the
    // first j characters of s2. Both words are at most 15 characters long.
    int EditDist(const string& s1, const string& s2, int c[15+1][15+1])
    {
        int len1 = s1.size();
        int len2 = s2.size();

        // Base cases: first column and first row of the table.
        for (int i = 0; i <= len1; ++i) c[i][0] = i;
        for (int j = 0; j <= len2; ++j) c[0][j] = j;

        for (int i = 1; i <= len1; ++i)
        {
            for (int j = 1; j <= len2; ++j)
            {
                // The table is shifted by one relative to the string indices.
                if (s1[i-1] == s2[j-1])
                    c[i][j] = c[i-1][j-1];
                else
                {
                    int d1 = c[i-1][j-1];  // replace
                    int d2 = c[i-1][j];    // delete from s1
                    int d3 = c[i][j-1];    // insert into s1
                    int best = d1 < d2 ? d1 : d2;
                    best = best < d3 ? best : d3;
                    c[i][j] = best + 1;
                }
            }
        }
        return c[len1][len2];
    }

    // Method 2: returns true if s1 and s2 are at most one replace, insert,
    // or delete apart. Identical strings also return true, so the caller
    // must filter exact dictionary matches first.
    bool check(const string& s1, const string& s2)
    {
        int len1 = s1.size();
        int len2 = s2.size();
        int i;
        if (len1 == len2)  // replace
        {
            i = 0;
            while (i < len1 && s1[i] == s2[i])
                ++i;
            // Now i == len1 or s1[i] != s2[i]. The distance is 1 only if
            // position i is the sole difference, so skip it and compare the
            // rest, e.g. hello / hollo.
            while (++i < len1)
                if (s1[i] != s2[i])
                    return false;
        }
        else if (len1 == len2 + 1)  // s1 has one extra character
        {
            i = 0;
            while (i < len2 && s1[i] == s2[i])
                ++i;
            // s1[i] must be the inserted character, so skip it and compare
            // the rest, e.g. more / mre.
            while (++i < len1)
                if (s1[i] != s2[i-1])
                    return false;
        }
        else if (len1 + 1 == len2)  // s2 has one extra character
        {
            i = 0;
            while (i < len1 && s1[i] == s2[i])
                ++i;
            // s2[i] must be the inserted character, so skip it and compare
            // the rest, e.g. mre / more.
            while (++i < len2)
                if (s1[i-1] != s2[i])
                    return false;
        }
        else
            return false;

        return true;
    }

    // Driver for Method 2: uses check(), not EditDist().
    void run1035()
    {
        vector<string> dictVec;   // dictionary in input order, for the output order
        dictVec.reserve(10000);
        // std::set is used here in place of the original stdext::hash_set,
        // which is specific to older Microsoft compilers.
        set<string> dictSet;      // dictionary for exact-match lookups
        string str;

        while (cin >> str && str != "#")
        {
            dictVec.push_back(str);
            dictSet.insert(str);
        }
        while (cin >> str && str != "#")
        {
            if (dictSet.find(str) != dictSet.end())
            {
                cout << str << " is correct" << endl;
                continue;
            }

            // Print the space before each replacement so that a word with no
            // replacements ends right after the colon, as the statement requires.
            cout << str << ":";
            for (vector<string>::iterator it = dictVec.begin(); it != dictVec.end(); ++it)
                if (check(*it, str))
                    cout << " " << *it;
            cout << endl;
        }
    }

    int main()
    {
        run1035();
        return 0;
    }
  • Original article: https://www.cnblogs.com/allensun/p/1869398.html