  • Data Mining: Practical Machine Learning Tools and Techniques

    Data Mining: Practical Machine Learning Tools and Techniques (Third Edition)

    Ian H. Witten, Eibe Frank, Mark A. Hall
    The 3rd edition of the data mining book.

    Morgan Kaufmann
    January 2011
    629 pages
    Paper
    ISBN 978-0-12-374856-0

    Comments

    "If you have data that you want to analyze and understand, this book and the associated Weka toolkit are an excellent way to start."

    -Jim Gray, Microsoft Research

    "The authors provide enough theory to enable practical application, and it is this practical focus that separates this book from most, if not all, other books on this subject."

    -Dorian Pyle, Director of Modeling at Numetrics

    "This book would be a strong contender for a technical data mining course. It is one of the best of its kind."

    -Herb Edelstein, Principal, Data Mining Consultant, Two Crows Consulting

    "It is certainly one of my favourite data mining books in my library."

    -Tom Breur, Principal, XLNT Consulting, Tilburg, The Netherlands

    Features
    • Explains how data mining algorithms work.
    • Helps you select appropriate approaches to particular problems and to compare and evaluate the results of different techniques.
    • Covers performance improvement techniques, including input preprocessing and combining output from different methods.
    • Shows you how to use the Weka machine learning workbench.
    Translations

    The book has been translated into German (first edition) and Chinese (second edition).

    Errata

    Click here to get to a list of errata.

    Teaching material

    Slides for Chapters 1-5 of the 3rd edition can be found here.

    Slides for Chapters 6-8 of the 3rd edition can be found here.

    These archives contain .pdf files as well as .odp files in OpenDocument Format that were generated using OpenOffice 2.0. Several free office programs can now read .odp files, and Sun has made a plug-in for Word that reads this format. Corresponding information is on this Wikipedia page.

    Reviews of the first edition

    Review by J. Geller (SIGMOD Record, Vol. 32:2, March 2002).
    Review by E. Davis (AI Journal, Vol. 131:1-2, September 2001).
    Review by P.A. Flach (AI Journal, Vol. 131:1-2, September 2001).


    Table of Contents for the 3rd Edition:
    Sections and chapters with new material are marked in red.

    Preface

    Part I: Practical Machine Learning Tools and Techniques

    1. What’s it all about?
    1.1 Data Mining and Machine Learning
    1.2 Simple Examples: The Weather Problem and Others
    1.3 Fielded Applications
    1.4 Machine Learning and Statistics
    1.5 Generalization as Search
    1.6 Data Mining and Ethics
    1.7 Further Reading

    2. Input: Concepts, instances, attributes
    2.1 What’s a Concept?
    2.2 What’s in an Example?
    2.3 What’s in an Attribute?
    2.4 Preparing the Input
    2.5 Further Reading

    3. Output: Knowledge representation
    3.1 Tables
    3.2 Linear Models
    3.3 Trees
    3.4 Rules
    3.5 Instance-Based Representation
    3.6 Clusters
    3.7 Further Reading

    4. Algorithms: The basic methods
    4.1 Inferring Rudimentary Rules
    4.2 Statistical Modeling
    4.3 Divide-and-Conquer: Constructing Decision Trees
    4.4 Covering Algorithms: Constructing Rules
    4.5 Mining Association Rules
    4.6 Linear Models
    4.7 Instance-Based Learning
    4.8 Clustering
    4.9 Multi-Instance Learning
    4.10 Further Reading
    4.11 Weka Implementations

    5. Credibility: Evaluating what’s been learned
    5.1 Training and Testing
    5.2 Predicting Performance
    5.3 Cross-Validation
    5.4 Other Estimates
    5.5 Comparing Data Mining Schemes
    5.6 Predicting Probabilities
    5.7 Counting the Cost
    5.8 Evaluating Numeric Prediction
    5.9 The Minimum Description Length Principle
    5.10 Applying MDL to Clustering
    5.11 Further Reading

    Part II: Advanced Data Mining

    6. Implementations: Real machine learning schemes
    6.1 Decision Trees
    6.2 Classification Rules
    6.3 Association Rules
    6.4 Extending Linear Models
    6.5 Instance-Based Learning
    6.6 Numeric Prediction with Local Linear Models
    6.7 Bayesian Networks
    6.8 Clustering
    6.9 Semisupervised Learning
    6.10 Multi-Instance Learning
    6.11 Weka Implementations

    7. Data Transformations
    7.1 Attribute Selection
    7.2 Discretizing Numeric Attributes
    7.3 Projections
    7.4 Sampling
    7.5 Cleansing
    7.6 Transforming Multiple Classes to Binary Ones
    7.7 Calibrating Class Probabilities
    7.8 Further Reading
    7.9 Weka Implementations

    8. Ensemble Learning
    8.1 Combining Multiple Models
    8.2 Bagging
    8.3 Randomization
    8.4 Boosting
    8.5 Additive Regression
    8.6 Interpretable Ensembles
    8.7 Stacking
    8.8 Further Reading
    8.9 Weka Implementations

    9. Moving on: Applications and Beyond
    9.1 Applying Data Mining
    9.2 Learning from Massive Datasets
    9.3 Data Stream Learning
    9.4 Incorporating Domain Knowledge
    9.5 Text Mining
    9.6 Web Mining
    9.7 Adversarial Situations
    9.8 Ubiquitous Data Mining
    9.9 Further Reading

    Part III: The Weka Data Mining Workbench

    10. Introduction to Weka
    10.1 What’s in Weka?
    10.2 How Do You Use It?
    10.3 What Else Can You Do?

    11. The Explorer
    11.1 Getting Started
    11.2 Exploring the Explorer
    11.3 Filtering Algorithms
    11.4 Learning Algorithms
    11.5 Meta-Learning Algorithms
    11.6 Clustering Algorithms
    11.7 Association-Rule Learners
    11.8 Attribute Selection

    12. The Knowledge Flow Interface
    12.1 Getting Started
    12.2 Knowledge Flow Components
    12.3 Configuring and Connecting the Components
    12.4 Incremental Learning

    13. The Experimenter
    13.1 Getting Started
    13.2 Simple Setup
    13.3 Advanced Setup
    13.4 The Analyze Panel
    13.5 Distributing Processing over Several Machines

    14. The Command-Line Interface
    14.1 Getting Started
    14.2 The Structure of Weka
    14.3 Command-Line Options

    15. Embedded Machine Learning
    15.1 A Simple Data Mining Application

    16. Writing New Learning Schemes
    16.1 An Example Classifier
    16.2 Conventions for Implementing Classifiers

    17. Tutorial Exercises for the Weka Explorer
    17.1 Introduction to the Explorer Interface
    17.2 Nearest-Neighbor Learning and Decision Trees
    17.3 Classification Boundaries
    17.4 Preprocessing and Parameter Tuning
    17.5 Document Classification
    17.6 Mining Association Rules

  • Original source: https://www.cnblogs.com/lexus/p/2810007.html