Python自然语言处理学习笔记(56)：建模语言模式

zoukankan html css js c++ java

Python自然语言处理学习笔记(56)：建模语言模式

6.7 Modeling Linguistic Patterns 建模语言模式

Classifiers can help us to understand the linguistic patterns that occur in natural language, by allowing us to create explicit models that capture those patterns. Typically, these models are using supervised classification techniques, but it is also possible to build analytically motivated models（以分析为动机的模型）. Either way, these explicit models serve two important purposes: they help us to understand linguistic patterns, and they can be used to make predictions about new language data.

The extent to（到..的程度） which explicit models can give us insights into（深刻理解） linguistic patterns depends largely on what kind of model is used. Some models, such as decision trees, are relatively transparent, and give us direct information about which factors are important in making decisions and about which factors are related to one another. Other models, such as multi-level neural networks（多级神经网络）, are much more opaque. Although it can be possible to gain insight by studying them, it typically takes a lot more work.

But all explicit models can make predictions about new "unseen" language data that was not included in the corpus used to build the model. These predictions can be evaluated to assess the accuracy of the model. Once a model is deemed（被认为） sufficiently accurate, it can then be used to automatically predict information about new language data. These predictive models can be combined into systems that perform many useful language processing tasks, such as document classification, automatic translation, and question answering.

What do models tell us? 模型告诉我们什么？

It's important to understand what we can learn about language from an automatically constructed model（重要的是要明白从自动构建的模型中能学到哪些关于语言的知识）. One important consideration when dealing with models of language is the distinction between descriptive models（描述性模型） and explanatory models（解释性模型）. Descriptive models capture patterns in the data but they don't provide any information about why the data contains those patterns. For example, as we saw in Table 3.1, the synonyms absolutely and definitely are not interchangeable（可互换的）: we say absolutely adore not definitely adore, and definitely prefer not absolutely prefer. In contrast, explanatory models attempt to capture properties and relationships that cause the linguistic patterns. For example, we might introduce the abstract concept of "polar adjective", as one that has an extreme meaning, and categorize some adjectives like adore and detest as polar. Our explanatory model would contain the constraint that absolutely can only combine with polar adjectives（极性形容词）, and definitely can only combine with non-polar adjectives. In summary, descriptive models provide information about correlations in the data, while explanatory models go further to postulate（假设） causal relationships（因果关系）.

Most models that are automatically constructed from a corpus are descriptive models; in other words, they can tell us what features are relevant to a given patterns or construction, but they can't necessarily tell us how those features and patterns relate to one another. If our goal is to understand the linguistic patterns, then we can use this information about which features are related as a starting point for further experiments designed to tease apart（弄清） the relationships between features and patterns. On the other hand, if we're just interested in using the model to make predictions (e.g., as part of a language processing system), then we can use the model to make predictions about new data without worrying about the details of underlying causal relationships.

None

查看全文

相关阅读:
有理想的程序员必须知道的15件事
 C#是唯一能挑战Java的编程语言？
ADO.NET Entity Framework 如何：定义具有修改存储过程的模型（实体框架）
Android sqlite3工具的使用
 SQLite管理工具
 在 Android 应用程序中使用 Internet 数据解析 XML、JSON 和 protocol buffers 数据
 2011年1月编程排行榜：PHP表现不佳,Python夺冠
 NoSQL开篇——为什么要使用NoSQL
Windows 7任务栏与站点的集成
 Apache中 RewriteCond 规则参数介绍

原文地址：https://www.cnblogs.com/yuxc/p/2165555.html