  • SWERC13 Trending Topic


    Brute force with a map.



    Imagine you are in the hiring process for a company whose principal activity is the analysis
    of information in the Web. One of the tests consists in writing a program for maintaining up to
    date a set of trending topics. You will be hired depending on the efficiency of your solution.
    They provide you with text from the most active blogs. The text is organised daily and you
    have to provide the sorted list of the N most frequent words during the last 7 days, when asked.
    INPUT
    Each input file contains one test case. The text corresponding to a day is delimited by tag
    <text>. Queries of top N words can appear between texts corresponding to two different days.
    A top N query appears as a tag like <top 10 />. To make reading the input easier,
    the number will always be delimited by white space, as in the sample.
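Because every tag and number is separated by white space, the input can be read one token at a time. The sketch below shows one possible way to split the stream into days and queries. `Event` and `parseEvents` are hypothetical names introduced for illustration; the solution later in this post reads tokens with `scanf` instead.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical event type, introduced only for this illustration.
struct Event {
    bool isQuery;                    // true for <top N />, false for a day of text
    int n;                           // N of the query (0 for a day)
    std::vector<std::string> words;  // words of the day (empty for a query)
};

// Splits the whitespace-separated input into days and queries, in order.
std::vector<Event> parseEvents(std::istream& in) {
    std::vector<Event> events;
    std::string tok;
    while (in >> tok) {
        if (tok == "<text>") {
            Event day{false, 0, {}};
            // Everything up to the closing tag is a word of this day.
            while (in >> tok && tok != "</text>") day.words.push_back(tok);
            events.push_back(day);
        } else if (tok == "<top") {
            Event query{true, 0, {}};
            in >> query.n >> tok;  // tok receives the closing "/>"
            assert(tok == "/>");   // shape guaranteed by the statement
            events.push_back(query);
        }
    }
    return events;
}
```

For a quick check, wrap a string in `std::istringstream` and pass it to `parseEvents`; for the real input, pass `std::cin`.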
    Notes:
    • All words are composed only of lowercase letters of size at most 20.
    • The maximum number of different words that can appear is 20000.
    • The maximum number of words per day is 20000.
    • Words of length less than four characters are considered of no interest.
    • The number of days will be at most 1000.
    • 1 ≤ N ≤ 20
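The 7-day window and the length filter from the notes can be sketched as below. This is a deque-of-maps variant written for clarity, not the prefix-sum ring buffer used by the solution later in this post. `WeeklyCounter`, `addDay` and `count` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <string>
#include <vector>

// Hypothetical class, introduced only for this illustration.
class WeeklyCounter {
public:
    // Registers one day of words and evicts the day that leaves the window.
    void addDay(const std::vector<std::string>& words) {
        std::map<std::string, int> today;
        for (const std::string& w : words) {
            if (w.size() < 4) continue;  // short words are of no interest
            ++today[w];
            ++total_[w];
        }
        days_.push_back(today);
        if (days_.size() > kWindow) {
            // Subtract the oldest day's counts from the running totals.
            for (const auto& entry : days_.front()) {
                int& c = total_[entry.first];
                assert(c >= entry.second);
                c -= entry.second;
                if (c == 0) total_.erase(entry.first);
            }
            days_.pop_front();
        }
    }

    // Number of occurrences of w during the last 7 registered days.
    int count(const std::string& w) const {
        std::map<std::string, int>::const_iterator it = total_.find(w);
        return it == total_.end() ? 0 : it->second;
    }

private:
    static constexpr std::size_t kWindow = 7;
    std::deque<std::map<std::string, int> > days_;  // per-day counts, oldest first
    std::map<std::string, int> total_;              // counts over the window
};
```

Each day is added once and evicted once, so the total work stays proportional to the number of words read.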
    OUTPUT
    The list of N most frequent words during the last 7 days must be shown given a query. Words
    must appear in decreasing order of frequency and in alphabetical order when equal frequency.
    All words whose number of appearances equals that of the word at position N must
    be shown, even if the number of words shown then exceeds N.
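The ordering and tie rule above can be sketched as follows: sort by count descending and then alphabetically, and extend the cut past position N while the count stays equal. `WordCount` and `topWithTies` are hypothetical names, and the input is assumed to hold only words that occurred in the current 7-day window.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, int> WordCount;  // (word, count in the window)

// Hypothetical helper: returns the rows a <top N /> query should print.
std::vector<WordCount> topWithTies(std::vector<WordCount> counts, std::size_t n) {
    assert(n >= 1);  // the statement guarantees 1 <= N <= 20
    // Higher count first; alphabetical order among equal counts.
    std::sort(counts.begin(), counts.end(),
              [](const WordCount& a, const WordCount& b) {
                  if (a.second != b.second) return a.second > b.second;
                  return a.first < b.first;
              });
    if (counts.size() <= n) return counts;
    // Count of the word at position N; every word tied with it is kept.
    const int cutoff = counts[n - 1].second;
    std::size_t end = n;
    while (end < counts.size() && counts[end].second == cutoff) ++end;
    counts.erase(counts.begin() + end, counts.end());
    return counts;
}
```

With the counts from the last sample query (blah 9, text 4, corresponding 3, please 3, file 2) and N = 3, four rows are returned because "please" ties with "corresponding".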


    SAMPLE INPUT
    <text>
    imagine you are in the hiring process of a company whose
    main business is analyzing the information that appears
    in the web
    </text>
    <text>
    a simple test consists in writing a program for
    maintaining up to date a set of trending topics
    </text>
    <text>
    you will be hired depending on the efficiency of your solution
    </text>
    <top 5 />
    <text>
    they provide you with a file containing the text
    corresponding to a highly active blog
    </text>
    <text>
    the text is organized daily and you have to provide the
    sorted list of the n most frequent words during last week
    when asked
    </text>
    <text>
    each input file contains one test case the text corresponding
    to a day is delimited by tag text
    </text>
    <text>
    the query of top n words can appear between texts corresponding
    to two different days
    </text>
    <top 3 />
    <text>
    blah blah blah blah blah blah blah blah blah
    please please please
    </text>
    <top 3 />
    SAMPLE OUTPUT
    <top 5>
    analyzing 1
    appears 1
    business 1
    company 1
    consists 1
    date 1
    depending 1
    efficiency 1
    hired 1
    hiring 1
    imagine 1
    information 1
    main 1
    maintaining 1
    process 1
    program 1
    simple 1
    solution 1
    test 1
    that 1
    topics 1
    trending 1
    whose 1
    will 1
    writing 1
    your 1
    </top>
    <top 3>
    text 4
    corresponding 3
    file 2
    provide 2
    test 2
    words 2
    </top>
    <top 3>
    blah 9
    text 4
    corresponding 3
    please 3
    </top>



    #include <iostream>
    #include <cstdio>
    #include <cstring>
    #include <algorithm>
    #include <string>
    #include <map>
    #include <vector>
    
    using namespace std;
    
    typedef pair<int,int> pII;
    
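    map<string,int> Hash;       // word -> ID (IDs start at 1, 0 means unseen)
    vector<int> dy[11];         // dy[d]: IDs seen up to day slot d (may hold duplicates)
    string rHash[20200];        // ID -> word
    int day_sum[11][20200];     // day_sum[d][ID]: cumulative count up to day slot d
    char cache[30];             // current input token
    int now=9,pre=0,id=1;       // current slot, previous slot, next free ID
    int arr[20020],na;          // scratch: 7-day counts and their number
    string rss[20020];          // unused
    bool vis[20020];            // marks IDs already handled in get_top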
    
    void DEBUG(int x)
    {
        int sz=dy[x].size();
        for(int i=0;i<sz;i++)
        {
            cout<<"ID: "<<dy[x][i]<<" : "<<rHash[dy[x][i]]<<endl;
            cout<<"sum: "<<day_sum[x][dy[x][i]]<<endl;
        }
    }
    
    struct RSP
    {
        int times;
        string word;
    }rsp[20020];
    
    bool cmpRSP(const RSP& a,const RSP& b)
    {
        if(a.times!=b.times)
            return a.times>b.times;
        else
            return a.word<b.word;
    }
    
    void get_top(int now,int k)
    {
        int sz=dy[now].size();
        na=0;
        int _7dayago=(now+3)%10;   // slot of the prefix sums 7 days back: (now-7) mod 10
        memset(vis,false,sizeof(vis));
        for(int i=0;i<sz;i++)
        {
            if(vis[dy[now][i]]==false)
            {
                arr[na++]=day_sum[now][dy[now][i]]-day_sum[_7dayago][dy[now][i]];
                vis[dy[now][i]]=true;
            }
        }
        sort(arr,arr+na);
        // threshold: the k-th largest count (the smallest one if fewer than k words)
        int sig=arr[max(0,na-k)];
        int rn=0;
        memset(vis,false,sizeof(vis));
        for(int i=0;i<sz;i++)
        {
            int times=day_sum[now][dy[now][i]]-day_sum[_7dayago][dy[now][i]];
            if(times >= sig &&vis[dy[now][i]]==false)
            {
                // plain member assignment instead of a GNU-only compound literal
                rsp[rn].times=times;
                rsp[rn].word=rHash[dy[now][i]];
                rn++;
                vis[dy[now][i]]=true;
            }
        }
        sort(rsp,rsp+rn,cmpRSP);
        printf("<top %d>\n",k);
        for(int i=0;i<rn;i++)
        {
            cout<<rsp[i].word<<" "<<rsp[i].times<<endl;
        }
        printf("</top>\n");
    }
    
    int main()
    {
        while(scanf("%s",cache)!=EOF)
        {
            if(strcmp(cache,"<text>")==0)
            {
                ///read cache
                pre=now;
                now=(now+1)%10;
                dy[now]=dy[pre];
                memcpy(day_sum[now],day_sum[pre],sizeof(day_sum[0]));
                ///7 day ago    ....
                while(scanf("%s",cache)==1)
                {
                    if(cache[0]=='<') break;
                    if(strlen(cache)<4) continue;
                    string word=cache;
                    if(Hash[word]==0)
                    {
                        rHash[id]=word;
                        Hash[word]=id++;
                    }
                    int ID=Hash[word];
                    if(day_sum[pre][ID]==0)
                        dy[now].push_back(ID);
                    day_sum[now][ID]++;
                }
            }
            else if(strcmp(cache,"<top")==0)
            {
                int top;
                scanf("%d",&top); scanf("%s",cache);
                get_top(now,top);
            }
        }
        return 0;
    }
    


  • Original article: https://www.cnblogs.com/liguangsunls/p/6897280.html