  • HDU1298 T9 (Trie, DFS)

    T9

    Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 65536/32768 K (Java/Others)
    Total Submission(s): 629    Accepted Submission(s): 250


    Problem Description
    A while ago it was quite cumbersome to create a message for the Short Message Service (SMS) on a mobile phone. This was because you only have nine keys and the alphabet has more than nine letters, so most characters could only be entered by pressing one key several times. For example, if you wanted to type "hello" you had to press key 4 twice, key 3 twice, key 5 three times, again key 5 three times, and finally key 6 three times. This procedure is very tedious and keeps many people from using the Short Message Service.

    This led manufacturers of mobile phones to try and find an easier way to enter text on a mobile phone. The solution they developed is called T9 text input. The "9" in the name means that you can enter almost arbitrary words with just nine keys and without pressing them more than once per character. The idea of the solution is that you simply start typing the keys without repetition, and the software uses a built-in dictionary to look for the "most probable" word matching the input. For example, to enter "hello" you simply press keys 4, 3, 5, 5, and 6 once. Of course, this could also be the input for the word "gdjjm", but since this is no sensible English word, it can safely be ignored. By ruling out all other "improbable" solutions and only taking proper English words into account, this method can speed up writing of short messages considerably. Of course, if the word is not in the dictionary (like a name) then it has to be typed in manually using key repetition again.


    Figure 8: The Number-keys of a mobile phone.
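
    The figure itself is missing from this copy of the problem. The sketch below assumes the standard phone keypad layout (2=abc, 3=def, 4=ghi, 5=jkl, 6=mno, 7=pqrs, 8=tuv, 9=wxyz), which agrees with the "hello" example above. The names KEY_LETTERS, key_for_letter and word_to_keys are illustrative and not part of the problem.

```c
#include <assert.h>
#include <string.h>

/* Assumed keypad layout: index = key number, value = letters on that key.
   Keys 0 and 1 carry no letters. */
static const char *const KEY_LETTERS[10] = {
    "", "", "abc", "def", "ghi", "jkl", "mno", "pqrs", "tuv", "wxyz"
};

/* Return the digit character ('2'..'9') of the key that carries letter c. */
char key_for_letter(char c)
{
    assert(c >= 'a' && c <= 'z');
    for (int key = 2; key <= 9; ++key)
        for (const char *p = KEY_LETTERS[key]; *p != '\0'; ++p)
            if (*p == c)
                return (char)('0' + key);
    return '?'; /* not reached for lowercase letters */
}

/* Write the key sequence of word into out, which must hold
   strlen(word) + 1 bytes. */
void word_to_keys(const char *word, char *out)
{
    size_t n = strlen(word);
    for (size_t i = 0; i < n; ++i)
        out[i] = key_for_letter(word[i]);
    out[n] = '\0';
}
```

    With this mapping, "hello" and "gdjjm" both translate to the same key sequence 43556, which is why a dictionary is needed to tell them apart.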


    More precisely, with every character typed, the phone will show the most probable combination of characters it has found up to that point. Let us assume that the phone knows about the words "idea" and "hello", with "idea" occurring more often. Pressing the keys 4, 3, 5, 5, and 6, one after the other, the phone offers you "i", "id", then switches to "hel", "hell", and finally shows "hello".

    Write an implementation of the T9 text input which offers the most probable character combination after every keystroke. The probability of a character combination is defined to be the sum of the probabilities of all words in the dictionary that begin with this character combination. For example, if the dictionary contains three words "hell", "hello", and "hellfire", the probability of the character combination "hell" is the sum of the probabilities of these words. If some combinations have the same probability, your program is to select the first one in alphabetic order. The user should also be able to type the beginning of words. For example, if the word "hello" is in the dictionary, the user can also enter the word "he" by pressing the keys 4 and 3 even if this word is not listed in the dictionary.
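
    To make the probability rule concrete, here is a minimal brute-force sketch that sums the probabilities of all words starting with a given prefix. The function name prefix_probability and its array-based interface are assumptions made for illustration; a real solution would normally accumulate these sums in a trie instead of rescanning the dictionary.

```c
#include <assert.h>
#include <string.h>

/* Sum the probabilities of all dictionary words that begin with prefix.
   words and probs are parallel arrays of length n. */
int prefix_probability(const char *const words[], const int probs[], int n,
                       const char *prefix)
{
    assert(prefix != NULL);
    size_t len = strlen(prefix);
    int total = 0;
    for (int i = 0; i < n; ++i) {
        /* strncmp stops at the end of a word shorter than the prefix
           and reports a mismatch, so such words are not counted. */
        if (strncmp(words[i], prefix, len) == 0)
            total += probs[i];
    }
    return total;
}
```

    For a dictionary holding "hell" with 3, "hello" with 4 and "idea" with 8, the prefix "he" scores 3 + 4 = 7 and "id" scores 8, so after the keys 4 and 3 the phone shows "id". After the third key 5, no word continues "id" with j, k or l, so the display switches to "hel".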
     

    Input
    The first line contains the number of scenarios.

    Each scenario begins with a line containing the number w of distinct words in the dictionary (0<=w<=1000). These words are given in the next w lines. (They are not guaranteed in ascending alphabetic order, although it's a dictionary.) Every line starts with the word which is a sequence of lowercase letters from the alphabet without whitespace, followed by a space and an integer p, 1<=p<=100, representing the probability of that word. No word will contain more than 100 letters.

    Following the dictionary, there is a line containing a single integer m. Next follow m lines, each consisting of a sequence of at most 100 decimal digits 2-9, followed by a single 1 meaning "next word".
     

    Output
    The output for each scenario begins with a line containing "Scenario #i:", where i is the number of the scenario starting at 1.

    For every number sequence s of the scenario, print one line for every keystroke stored in s, except for the 1 at the end. In this line, print the most probable word prefix defined by the probabilities in the dictionary and the T9 selection rules explained above. Whenever none of the words in the dictionary match the given number sequence, print "MANUALLY" instead of a prefix.

    Terminate the output for every number sequence with a blank line, and print an additional blank line at the end of every scenario.
     

    Sample Input
    2
    5
    hell 3
    hello 4
    idea 8
    next 8
    super 3
    2
    435561
    43321
    7
    another 5
    contest 6
    follow 3
    give 13
    integer 6
    new 14
    program 4
    5
    77647261
    6391
    4681
    26684371
    77771
     

    Sample Output
    Scenario #1:
    i
    id
    hel
    hell
    hello

    i
    id
    ide
    idea


    Scenario #2:
    p
    pr
    pro
    prog
    progr
    progra
    program

    n
    ne
    new

    g
    in
    int

    c
    co
    con
    cont
    anoth
    anothe
    another

    p
    pr
    MANUALLY
    MANUALLY
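      A thoroughly nasty problem; all told it took me a full day to write. A voice drifts in from afar: "You're just not good enough!"
    The problem asks for a predictive input system: as the digit keys are pressed, it automatically matches the letter combination with the highest probability.
       The basic approach is as follows. First insert every word into a trie, which speeds up lookups considerably. Then comes the search. For every keystroke count, the search restarts from the first keystroke; since the dictionary is small, the number of search steps stays modest. Candidates are chosen by weight. Another important step during the search is recording the search path, that is, keeping the best path, which can be done with a stack.
      My solution follows the same basic approach, but I did not record the path during the search. To keep the code simpler, each node stores the index of a word it belongs to, so at the end the answer can be read from the information in the best node.
      The code is as follows: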
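
    The stack-based variant described above can be sketched as follows. This is a minimal illustration, not the solution posted below: the names TNode, KEY_RANGE, trie_add and best_prefix are made up for this sketch, and it assumes the key string contains only the digits 2-9.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct TNode {
    int val;                    /* summed probability of words through here */
    struct TNode *child[26];
} TNode;

/* First and last letter index (0 = 'a') per key; keys 0 and 1 are empty. */
static const int KEY_RANGE[10][2] = {
    {0, -1}, {0, -1}, {0, 2}, {3, 5}, {6, 8}, {9, 11},
    {12, 14}, {15, 18}, {19, 21}, {22, 25}
};

TNode *tnode_new(void)
{
    TNode *n = malloc(sizeof *n);
    assert(n != NULL);
    n->val = 0;
    for (int c = 0; c < 26; ++c)
        n->child[c] = NULL;
    return n;
}

/* Add prob to every node on the path spelled by word. */
void trie_add(TNode *root, const char *word, int prob)
{
    for (; *word != '\0'; ++word) {
        int c = *word - 'a';
        if (root->child[c] == NULL)
            root->child[c] = tnode_new();
        root = root->child[c];
        root->val += prob;
    }
}

void trie_free(TNode *p)
{
    for (int c = 0; c < 26; ++c)
        if (p->child[c] != NULL)
            trie_free(p->child[c]);
    free(p);
}

/* DFS over the first len keys. path is the stack of letters chosen so far.
   At depth len the path is copied to best if it is strictly more probable,
   so among equal candidates the alphabetically first one is kept. */
static void dfs(const TNode *p, const char *keys, int depth, int len,
                char *path, char *best, int *best_val)
{
    if (depth == len) {
        if (p->val > *best_val) {
            *best_val = p->val;
            memcpy(best, path, (size_t)len);
            best[len] = '\0';
        }
        return;
    }
    int k = keys[depth] - '0';
    for (int c = KEY_RANGE[k][0]; c <= KEY_RANGE[k][1]; ++c) {
        if (p->child[c] != NULL) {
            path[depth] = (char)('a' + c);  /* push the letter */
            dfs(p->child[c], keys, depth + 1, len, path, best, best_val);
            /* the pop is implicit: the next sibling overwrites this slot */
        }
    }
}

/* Returns 1 and fills best (len + 1 bytes) if some prefix matches, else 0. */
int best_prefix(const TNode *root, const char *keys, int len, char *best)
{
    char path[128];
    int best_val = 0;
    assert(len >= 0 && len < (int)sizeof path);
    dfs(root, keys, 0, len, path, best, &best_val);
    return best_val > 0;
}
```

    Calling best_prefix once per keystroke count, with len running from 1 to the number of digits before the trailing 1, and printing "MANUALLY" whenever it returns 0, produces the per-keystroke output the problem asks for.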
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    typedef struct Node
    {
    	int best[10], val, i;  // val: summed probability; i: index of a word passing through this node; best is not used later on
    	struct Node *child[26];
    }Node;
    
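    // letter index range (0 = 'a') for each key; keys 0 and 1 carry no letters, so their range is empty
    int search[10][2]= { { 0, -1 }, { 0, -1 }, { 0, 2 }, { 3, 5 }, { 6, 8 }, { 9, 11 }, { 12, 14 }, { 15, 18 }, { 19, 21 }, { 22, 25 } };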
    
    char word[1005][105]; 
    
    Node *cmpa;
    
    Node *init(  )
    {
    	Node *n= ( Node * )malloc( sizeof( Node ) );
    	n-> val=0;
        n-> i= -1;
    	for( int i= 0; i< 10; ++i ) // initialize the entry for each key
    	{
    	    n-> best[i]= -1;  // -1 rather than 0, because 0 stands for the 'a' branch
    	}
    	memset( n-> child, 0, sizeof( n-> child ) ); // set all children to NULL
    	return n;
    }
    
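    void insert( Node *p, char *in, int val, int i )  // insert a word into the trie
    {
    	if( *in== '\0' )
    	{
    		return; // for now, do not record whether a word ends at this node
    	}
    	else
    	{
    		if( p-> child[ *in- 'a' ]== NULL ) 
    		{  // no such branch yet, i.e. no earlier word shares this prefix, so add one
    			p-> child[ *in- 'a' ]= init(  ); // every allocation goes through init
    		}
    		p-> child[ *in- 'a' ]-> val+= val; // update the weight along the chain; the problem defines it as the summed probability
    		p-> child[ *in- 'a' ]-> i= i;	 
    		// overwrite: mark which word this prefix belongs to. Overwriting an earlier record is harmless, because both words share this prefix and the printed result is the same
    		insert( p-> child[ *in- 'a' ], in+ 1, val, i ); // recursive insert; the last two arguments never change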
    	}
    }
    
    /*void update( Node *p )  
    {
    	for( int i= 2; i<= 9; ++i ) // solve each of the keys 2-9
    	{
    		int max= 0, rec= -1; // max starts at the 0 of a fresh node, the best choice at its -1
    		for( int j= search[i][0]; j<= search[i][1]; ++j ) // scan the letter range of this key
    		{
    			if( p-> child[j] )
    			{ // this branch exists
    				if( max< p-> child[j]-> val )  // compare with the maximum to see whether it can become the best choice
    				{
    					max= p-> child[j]-> val;
    					rec= j;
    				}
    				update( p-> child[j] );	 // depth-first search
    			}
    		}
    		p-> best[i]= rec;
    	}
    } */
    
    Node *get( Node *p, char *opr, int optimes )
    {
        int op= *( opr )- '0';  // convert the current digit character to its key number
        if( optimes )
        {
            for( int i= search[op][0]; i<= search[op][1]; ++i )
            {
                if( optimes )
                {
                    if( p-> child[i] )
                    {
                        get( p-> child[i], opr+ 1, optimes- 1 );
                    }
                }
            }
            return cmpa;
        }
        else
        {
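            // strict '>' keeps the earlier candidate on ties; the DFS visits letters in alphabetical order, so the alphabetically first prefix wins
            return cmpa= ( p-> val> cmpa-> val? p: cmpa ); 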
        }
    }
    
    void trans( Node * p, char *op )
    { 
    	int len= strlen( op ), optimes;
    	Node *fcmpa;
    	for( int i= 0; i< len- 1; ++i )
    	{ 
    	    fcmpa= cmpa= init(  );
    	    optimes= i+ 1;
    	    get( p, op, optimes );
    		if( cmpa-> i== -1 )
    		{
    		    puts( "MANUALLY" );
    		}
    		else
    		{
    		    for( int j= 0; j<= i; ++j )
    		    {
    			    printf( "%c", word[ cmpa-> i ][j] );
    		    }
    		    puts( "" );
    		}
    		free( fcmpa );
    	}
    } 
    
    void _free( Node * p )
    {  // free the trie nodes; note that the root is freed as well, so the program has to allocate a new one
    	for( int i= 0; i< 26; ++i )
    	{
    		if( p-> child[i] )
    		{
    			_free( p-> child[i] );
    		}
    	}
    	free( p );
    }
    
    int main(  )
    {
    	int T, w, m;	
    	scanf( "%d", &T );
    	for( int t= 1; t<= T; ++t )
    	{
    	    Node *n= init(  );
    		char op[105];	int val; // op[] holds the digit sequence
    		scanf( "%d", &w );
    		for( int i= 0; i< w; ++i ) 
    		{
    			scanf( "%s %d", word[i], &val );
    			insert( n, word[i], val, i );
    		}
    		//update( n );
    		printf( "Scenario #%d:\n", t );
    		scanf( "%d", &m );
    		while( m-- )
    		{
    			scanf( "%s", op );
    			// no special case for w == 0: with an empty trie, trans prints MANUALLY once per keystroke, as the output format requires
    			trans( n, op );
    			if( m> 0 )
    			{
    			    puts( "" );
    			}
    		}
    		_free( n );  // free the trie
    		printf( "\n\n" );
    	}
    }
    
    /*
    1
    3
    abyb 3
    abye 13
    aaza 4
    1
    22921
    
    The output is
    
    a
    ab
    aby
    aaza
    
    */
    
  • Original post: https://www.cnblogs.com/Lyush/p/2105346.html