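  • Counting word frequencies in an English text

    Task: count how often each word occurs in a text, and print the ten most frequent words together with their counts.

    Approach: before writing the program, I first decided to write the code in C.

    Program source code: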

    #include <stdio.h>
    #include <string.h>
    //Maximum number of distinct words that can be counted; adjust as needed
    #define MAX_WORD_COUNT 500
    //Struct holding each word and its occurrence count
    typedef struct WordCount
    {
     char cWord[20];
     int  iCount;
    }T_WordCount;
     
    int CalcEachWord(const char *pText);//Count the words and print the results
    void LowerText(char *pText);//Convert a word to lowercase
    void SwapItem(T_WordCount *ItemA, T_WordCount * ItemB);//Swap two elements
    void SortWord(T_WordCount *pWordSet);//Sort by count
     
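    int main(int argc, char *argv[])
    {
     //Test text: open the input file
     FILE *fp=NULL;
     //The backslash must be escaped; "D:\text.txt" would contain a tab character
     fp=fopen("D:\\text.txt","r");
     if(fp == NULL)
     {
         return -1;
     }
     //Only the first 1000 bytes are read; enlarge cBuf for longer files
     char cBuf[1001]={0};
     fread(cBuf, 1, sizeof(cBuf) - 1, fp);
     fclose(fp);
     printf("----------------------------------\n");
     printf("The top 10 words are:\n");
     
     CalcEachWord(cBuf);
     return 0;
    }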
     
    int CalcEachWord(const char *cBuf)
    {
     char cTmp[20] = {0};
     int  i   = 0;
     char *pTmp   = cTmp;
     int  iFlag   = 0;
     
     T_WordCount tWordSet[MAX_WORD_COUNT];
     memset(tWordSet, 0, sizeof(tWordSet));
     
     while (*cBuf != '\0')
     {
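      if ((*cBuf >= 'A' && *cBuf <= 'Z') || (*cBuf >= 'a' && *cBuf <= 'z'))
      {
       //Keep one byte free for the terminating '\0'; extra letters of an overlong word are dropped
       if (pTmp - cTmp < (int)sizeof(cTmp) - 1)
       {
        *pTmp = *cBuf;
        pTmp++;
       }
      }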
      else if (*cBuf == '-')
      {
       ++cBuf;
       continue;
      }
      else
      {
     
       if (strlen(cTmp) > 0)
       {
        LowerText(cTmp);
        iFlag = 0;
        for (i = 0; i < MAX_WORD_COUNT; ++i)
        {
         if (strlen(tWordSet[i].cWord) > 0)
         {
          if (strcmp(tWordSet[i].cWord, cTmp) == 0)
          {
           iFlag = 1;
           tWordSet[i].iCount++;
           break;
          }    
         }
         else
         {
          strcpy(tWordSet[i].cWord, cTmp);
          tWordSet[i].iCount = 1;
          iFlag = 1;
          break;
         }
     
        }
        if (!iFlag)
        {
         printf("No more space to save word.\n");
        }
     
       }
       memset(cTmp, 0, 20);
       pTmp = cTmp;
      }
     
      ++cBuf;
     }
     
     //Sort by count, highest first
     SortWord(tWordSet);
     for (i = 0; i < 10; ++i)
     {
      if (strlen(tWordSet[i].cWord) > 0)
      {
       printf("%s:%d\n",tWordSet[i].cWord,tWordSet[i].iCount);
      }
     }
     
     return 0;
    }
     
    void LowerText(char *cBuf)
    {
     char *pTmp = cBuf;
     while (*pTmp != '\0')
     {
      if ((*pTmp >= 'A' && *pTmp <= 'Z'))
      {
       *pTmp += 'a' - 'A';
      }
     
      pTmp++;
     }
    }
     
    void SwapItem(T_WordCount *ItemA, T_WordCount * ItemB)
    {
     T_WordCount Tmp;
     memset(&Tmp, 0, sizeof(T_WordCount));
     strcpy(Tmp.cWord, ItemA->cWord);
     Tmp.iCount = ItemA->iCount;
     
     strcpy(ItemA->cWord, ItemB->cWord);
     ItemA->iCount = ItemB->iCount;
     strcpy(ItemB->cWord, Tmp.cWord);
     ItemB->iCount = Tmp.iCount;
    }
    //Bubble sort, descending by count
    void SortWord(T_WordCount *pWordSet)
    {
     int i,j;
     for (j = 0; j < MAX_WORD_COUNT - 1; j++)
     {  
      for (i = 0; i < MAX_WORD_COUNT - 1 - j; i++)
      {    
       if (pWordSet[i].iCount < pWordSet[i+1].iCount)    
       {                            
        SwapItem(&pWordSet[i], &pWordSet[i+1]);
       }    
      }
     }
    }
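
SortWord above is a bubble sort over all MAX_WORD_COUNT slots. As a possible alternative, the standard library's qsort can produce the same ordering. The sketch below is not part of the original program: SortWordQsort and CompareByCountDesc are hypothetical names, and the alphabetical tie-break is an assumption (the bubble sort instead keeps tied words in order of first appearance).

```c
#include <stdlib.h>
#include <string.h>

/* Same layout as the T_WordCount struct in the program above. */
typedef struct WordCount
{
    char cWord[20];
    int  iCount;
} T_WordCount;

/* Comparator for qsort: higher counts first.
   Ties are broken alphabetically (an assumption, not the original behavior). */
static int CompareByCountDesc(const void *pA, const void *pB)
{
    const T_WordCount *a = (const T_WordCount *)pA;
    const T_WordCount *b = (const T_WordCount *)pB;

    if (a->iCount != b->iCount)
    {
        return (a->iCount < b->iCount) ? 1 : -1;
    }
    return strcmp(a->cWord, b->cWord);
}

/* Hypothetical replacement for SortWord; nCount is the number of array slots.
   Unused slots have iCount == 0 and therefore end up at the back. */
void SortWordQsort(T_WordCount *pWordSet, size_t nCount)
{
    qsort(pWordSet, nCount, sizeof(T_WordCount), CompareByCountDesc);
}
```

In CalcEachWord, the call would then read SortWordQsort(tWordSet, MAX_WORD_COUNT) in place of SortWord(tWordSet).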

    Text content:

    With the Games officially announced the beginning of each class followed by admission to the ranks of marching parade performances. At this point march suddenly sounded, one after another a class rank of neat formation, dance props set great strides in coming to the podium. Everyone in their bright clothes, smiling, hold our heads up, demonstrating the unique vitality of youth and vitality. The performances of these classes strengths and weaknesses, and some formation varied, and some clothing brightest, and some neat moves, and some arrangement was new, their graceful dance to attract everyone's attention. Props in the hands of these students can be described as great variety, variety. Some, and the dynamic music, holding a pair of chopsticks beating out the rhythm of sonorous; some flapping the ball like a cheerful bright wizard; some dress swayed, holding a dance fan, to draw a beautiful arc Road, and some Qiyuxuanang , hand-held gun salute in the air emitted by colorful fireworks. Their performance to the entire stint of the many splendours of color games, like spring flowers, summer sunshine, with a cool autumn wind blow against our faces, so that the presence of teachers and students are all touched, and are all delighted. It is understood that the road parade costumes and props  in many of them are the students themselves to select and purchase, and they show the formation and movement are also explored and arrangement of their own. This is totally reflects the student's enthusiasm and longing for the Games, but also fully demonstrated their ability to act independently and strong organizational skills
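
Two properties of this sample text matter for the program above. It is longer than the 1000 bytes that main reads, and its last word, "skills", is not followed by any punctuation. The loop in CalcEachWord stores a word only when it reaches a non-letter character, so a word sitting at the very end of the buffer is never counted. One way to avoid that is to move the tokenizing into a helper that also stops cleanly at the end of the text. The sketch below assumes a hypothetical function NextWord, which is not in the original code. It skips '-' inside words just as the original loop does, and it uses isalpha and tolower from <ctype.h> instead of manual range checks.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copies the next word from *ppText into pOut in
   lowercase and advances *ppText past it. nOutSize must be at least 1.
   Returns 1 if a word was found, 0 once the end of the text is reached. */
int NextWord(const char **ppText, char *pOut, size_t nOutSize)
{
    const char *p = *ppText;
    size_t n = 0;

    /* Skip everything that cannot start a word. */
    while (*p != '\0' && !isalpha((unsigned char)*p))
    {
        ++p;
    }

    /* Collect letters; a '-' is skipped so "hand-held" becomes "handheld". */
    while (isalpha((unsigned char)*p) || *p == '-')
    {
        /* Letters that do not fit into pOut are dropped. */
        if (*p != '-' && n + 1 < nOutSize)
        {
            pOut[n++] = (char)tolower((unsigned char)*p);
        }
        ++p;
    }

    pOut[n] = '\0';
    *ppText = p;
    return n > 0;
}
```

A caller would loop with while (NextWord(&pText, cTmp, sizeof(cTmp))) and run the existing lookup-or-insert code on cTmp for each word, so the final word is handled the same way as every other.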

    Test results:

  • Original source: https://www.cnblogs.com/hfxdaj/p/3575790.html