A New Keyword Extraction Method: YAKE! Collection-independent Automatic Keyword Extractor

Extracting keywords from texts has become a challenge for individuals and organizations as information grows in complexity and size. The need to automate this task so that texts can be processed in a timely and adequate manner has led to the emergence of automatic keyword extraction tools. Despite these advances, there is a clear lack of multilingual online tools that automatically extract keywords from single documents. In this paper, we present Yake!, a novel feature-based system for multilingual keyword extraction that supports texts of different sizes, domains, and languages. Unlike most systems, Yake! relies on neither dictionaries nor thesauri, and it is not trained on any corpus. Instead, we follow an unsupervised approach that builds on features extracted from the text itself, which makes it applicable to documents written in different languages without further knowledge. This can benefit a large number of tasks and the many situations where access to training corpora is limited or restricted. In this demo, we offer an easy-to-use interactive session in which users from both academia and industry can try our system, either with a sample document or by entering their own text. As an add-on, we compare our extracted keywords against the output of IBM Natural Language Understanding and the Rake system. This lets users understand the distinctions between the three approaches.
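
To make the idea of "features extracted from the text" more concrete, here is a minimal sketch of an unsupervised scorer that ranks single words using only statistics computed from the input document. It is not Yake!'s actual scoring formula: the two features (term frequency and position of first occurrence), the tiny stopword list, and the function name `score_keywords` are all assumptions chosen for illustration. As a convention in this sketch, a lower score means a more relevant word.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption for this sketch; a real
# system would ship a fuller list per language).
STOPWORDS = {
    "the", "a", "an", "of", "to", "and", "in", "is", "for", "from",
    "as", "that", "this", "are", "on", "it", "with", "has", "by",
}


def score_keywords(text, top=5):
    """Rank single-word candidates using statistics from the text alone.

    Returns a list of (word, score) pairs, lowest score first.
    Toy formula, NOT the one used by Yake!.
    """
    tokens = re.findall(r"[A-Za-z][A-Za-z'-]*", text)
    words = [t.lower() for t in tokens]
    candidates = [w for w in words if w not in STOPWORDS and len(w) > 2]
    if not candidates:
        return []

    freq = Counter(candidates)

    # Index of the first occurrence of each word in the token stream.
    first_pos = {}
    for i, w in enumerate(words):
        first_pos.setdefault(w, i)

    n = len(words)
    scores = {}
    for w, f in freq.items():
        # An earlier first occurrence and a higher frequency both
        # lower the score, i.e. make the word more relevant.
        position = (first_pos[w] + 1) / n  # value in (0, 1]
        scores[w] = position / f

    # Sort by score, breaking ties alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (kv[1], kv[0]))[:top]


sample = (
    "Keyword extraction finds the terms that best describe a document. "
    "Unsupervised keyword extraction needs no training corpus, "
    "so keyword extraction works across languages."
)

for word, score in score_keywords(sample, top=3):
    print(f"{word}\t{score:.3f}")
```

Because nothing here depends on a training corpus or a dictionary, the same function can be pointed at any document; only the stopword list is language-specific.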

Open-source address: https://boiling-castle-88317.herokuapp.com/

Paper: A Text Feature Based Automatic Keyword Extraction Method for Single Documents

Original article: https://www.cnblogs.com/demo-deng/p/13215615.html