  • HDU 1277 Full-Text Search (trie application)

    Full-Text Search

    Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 65536/32768 K (Java/Others)
    Total Submission(s): 2553    Accepted Submission(s): 853


    Problem Description
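    We all use Google to look up information, but programs that retrieve information are hard to write. You are asked to write a simple full-text search program.
    The problem is as follows. You are given an information stream file consisting entirely of digits, with at most 60000 digits and at least 60. You are also given a set of keywords: there are at most 10000 keywords, and each keyword has at most 60 digits and at least 5. The first 4 digits of any two different keywords differ. Because the stream file is too long, it has been split across multiple lines. Write a program that finds which keywords appear in the file.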
     
    Input
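    The first line contains two integers M and N: M is the number of lines of digit data and N is the number of keywords. Next come M lines of digits, then a blank line, then N lines of keywords. Each keyword has the form: [Key No. 1] 84336606737854833158.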
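
Splitting a keyword line into its number and its digit string can be sketched as follows. The helper name `parseKeyLine` is hypothetical and does not appear in the original solution; it assumes every keyword line follows the `[Key No. k] digits` form exactly.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper (not from the original post): splits one keyword line
// such as "[Key No. 1] 84336606737854833158" into (number, digit string).
std::pair<int, std::string> parseKeyLine(const std::string& line)
{
    std::istringstream in(line);
    std::string bracketKey, no, digits;
    int number = 0;
    char closing = 0;
    // Tokens in order: "[Key", "No.", the integer, the ']' character, the digits.
    in >> bracketKey >> no >> number >> closing >> digits;
    return {number, digits};
}
```

Reading the number as an `int` stops at the `]`, which is then consumed as a single character, so the digit string is read cleanly as the last token.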
     
    Output
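    The output is a single line. If any keywords are found, print them in the order found, without repeats and separated by spaces, in the form: Found key: [Key No. 9] [Key No. 5]; if none is found, print: No key can be found !.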
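
The output rule can be captured in a small formatting function. This is a sketch only: `formatResult` is a hypothetical name, and it assumes the key numbers have already been collected in the order they were found, without duplicates.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds the single output line from the key numbers
// that were found (already de-duplicated, in order of discovery).
std::string formatResult(const std::vector<int>& found)
{
    if (found.empty())
        return "No key can be found !";
    std::string out = "Found key:";
    for (int number : found)
        out += " [Key No. " + std::to_string(number) + "]";
    return out;
}
```

Putting the separating space in front of each `[Key No. k]` item avoids a trailing space at the end of the line.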
     
    Sample Input
    20 10
    646371829920732613433350295911348731863560763634906583816269
    637943246892596447991938395877747771811648872332524287543417
    420073458038799863383943942530626367011418831418830378814827
    679789991249141417051280978492595526784382732523080941390128
    848936060512743730770176538411912533308591624872304820548423
    057714962038959390276719431970894771269272915078424294911604
    285668850536322870175463184619212279227080486085232196545993
    274120348544992476883699966392847818898765000210113407285843
    826588950728649155284642040381621412034311030525211673826615
    398392584951483398200573382259746978916038978673319211750951
    759887080899375947416778162964542298155439321112519055818097
    642777682095251801728347934613082147096788006630252328830397
    651057159088107635467760822355648170303701893489665828841446
    069075452303785944262412169703756833446978261465128188378490
    310770144518810438159567647733036073099159346768788307780542
    503526691711872185060586699672220882332373316019934540754940
    773329948050821544112511169610221737386427076709247489217919
    035158663949436676762790541915664544880091332011868983231199
    331629190771638894322709719381139120258155869538381417179544
    000361739177065479939154438487026200359760114591903421347697
     
    [Key No. 1] 934134543994403697353070375063
    [Key No. 2] 261985859328131064098820791211
    [Key No. 3] 306654944587896551585198958148
    [Key No. 4] 338705582224622197932744664740
    [Key No. 5] 619212279227080486085232196545
    [Key No. 6] 333721611669515948347341113196
    [Key No. 7] 558413268297940936497001402385
    [Key No. 8] 212078302886403292548019629313
    [Key No. 9] 877747771811648872332524287543
    [Key No. 10] 488616113330539801137218227609
     
    Sample Output
    Found key: [Key No. 9] [Key No. 5]
     
     
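    Problem summary: given one long text string and a number of key strings, determine which key strings occur in the text.
     
    Solution: build a trie from the key strings and match the text against it. While matching, keep moving the starting position of the match forward.
    For example, suppose the key string is 123 and the text is 456123:
    first match against 456123,
    then against 56123,
    then 6123,
    then 123.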
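
The idea above can be sketched in a self-contained form. The names `DigitTrie`, `Node`, and `scan` are illustrative and not taken from the original solution; the sketch assumes the text and the keys contain only the characters '0' to '9'.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// One trie node: a child slot per digit and the key number that ends here.
struct Node
{
    std::array<int, 10> next{}; // child index per digit, 0 means "no child"
    int key = 0;                // key number ending at this node, 0 means none
};

// Illustrative trie over digit strings (assumes inputs contain only '0'..'9').
struct DigitTrie
{
    std::vector<Node> nodes;
    DigitTrie() : nodes(1) {} // node 0 is the root

    void insert(const std::string& digits, int number)
    {
        int cur = 0;
        for (char c : digits)
        {
            int d = c - '0';
            if (!nodes[cur].next[d])
            {
                nodes[cur].next[d] = static_cast<int>(nodes.size());
                nodes.emplace_back();
            }
            cur = nodes[cur].next[d];
        }
        nodes[cur].key = number;
    }

    // Walk the trie from every starting position of the text and report each
    // key number once, in the order in which it is first found.
    std::vector<int> scan(const std::string& text)
    {
        std::vector<int> found;
        for (std::size_t start = 0; start < text.size(); ++start)
        {
            int cur = 0;
            for (std::size_t i = start; i < text.size(); ++i)
            {
                cur = nodes[cur].next[text[i] - '0'];
                if (!cur)
                    break; // no key continues with this digit
                if (nodes[cur].key)
                {
                    found.push_back(nodes[cur].key);
                    nodes[cur].key = 0; // clear the mark to avoid reporting twice
                }
            }
        }
        return found;
    }
};
```

With the single key 123 inserted as key 1, scanning 456123 fails immediately at the starts 4, 5, and 6 and succeeds at the start 1, which mirrors the suffix-by-suffix trace. Since a key has at most 60 digits, each starting position walks at most 60 nodes.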
     
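    Notes: the text is one continuous string that is only split across lines in the input, so a keyword may span a line break; obtaining the length with strlen() will exceed the time limit; pay attention to the output format.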
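
Two of these notes can be illustrated with a short sketch. Both helper names (`joinLines`, `countDigit`) are hypothetical. The sketch also assumes that the time-out comes from calling `strlen()` inside a loop condition, which the original note does not state explicitly.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: glue the input lines back into one continuous stream,
// so that a keyword spanning a line break can still be matched.
std::string joinLines(const std::vector<std::string>& lines)
{
    std::string text;
    for (const std::string& line : lines)
        text += line;
    return text;
}

// Counts the digits equal to `target`. The length is computed once, before
// the loop. Writing `i < std::strlen(text)` as the loop condition may rescan
// the whole buffer on every iteration, which makes the loop quadratic.
std::size_t countDigit(const char* text, char target)
{
    const std::size_t len = std::strlen(text);
    std::size_t count = 0;
    for (std::size_t i = 0; i < len; ++i)
        if (text[i] == target)
            ++count;
    return count;
}
```

An alternative that avoids computing the length at all is to loop until the terminating null character is reached.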
     
    #include<iostream>
    #include<string>
    #include<string.h>
    #include<vector>
    using namespace std;
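    // Only digits occur, so 10 children per node suffice; 10000 keys of up to 60 digits need up to 600000 nodes
    int tree[600005][10], vis[600005];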
    int id, root, len, n, m, num = 0, flag = 0, k = 0;
    string s;
    char str[400005];
    
    void insert(int cnt)
    {
        root = 0;
        for (int i = 0; s[i]; i++) // stop at the terminator; using strlen() for the length would time out
        {
            id = s[i] - '0';
            if (!tree[root][id])
                tree[root][id] = ++num;
            root = tree[root][id];
        }
        vis[root] = cnt;
    }
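    void search(char ss[])
    {
        root = 0;
        for (int i = 0; ; i++)
        {
            // Report a key as soon as its last node is reached, before looking
            // at the next character, so a key ending at the end of the text counts too.
            if (root && vis[root])
            {
                if (flag == 0)
                {
                    cout << "Found key:";
                    flag = 1;
                }
                cout << " [Key No. " << vis[root] << ']';
                vis[root] = 0; // clear the mark so the key is not reported twice
            }
            if (!ss[i]) // end of text (the original condition `i < ss[i]` compared the index with a character code)
                return;
            id = ss[i] - '0';
            if (!tree[root][id])
                return;
            root = tree[root][id];
        }
    }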
    int main()
    {
        ios::sync_with_stdio(false);
        cin >> n >> m;
        while (n--)
        {
            string temp;
            cin >> temp;
            for (int i = 0; i < temp.length(); i++) // append this line to the single text buffer
                str[k++] = temp[i];
        }
        
        for (int i = 1; i <= m; i++)
        {
            string temp;
            cin >> temp >> temp >> temp >> s; // cin stops at whitespace: skip "[Key", "No." and "k]", then read the digits
            insert(i);
        }
        for(int i=0;str[i];i++) // try each starting position in turn
            search(str+i);
        if (flag == 0)
            cout << "No key can be found !" << endl;
        else
            cout<<endl;
        return 0;
    }
  • Original article: https://www.cnblogs.com/-citywall123/p/11177271.html