
Three ways to detect outliers

Z-score

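import numpy as np

def outliers_z_score(ys):
    """Return the indices of points more than `threshold` standard deviations from the mean."""
    threshold = 3

    ys = np.asarray(ys, dtype=float)
    mean_y = np.mean(ys)
    stdev_y = np.std(ys)  # population standard deviation (ddof=0)
    if stdev_y == 0:
        # All values are identical, so no point can be an outlier.
        return (np.array([], dtype=int),)
    z_scores = (ys - mean_y) / stdev_y
    return np.where(np.abs(z_scores) > threshold)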

Modified Z-score

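import numpy as np

def outliers_modified_z_score(ys):
    """Return the indices of points whose modified z-score exceeds `threshold`."""
    threshold = 3.5

    ys = np.asarray(ys, dtype=float)
    median_y = np.median(ys)
    # Median absolute deviation (MAD): a robust measure of spread.
    # Note: the MAD is 0 when more than half of the values equal the median,
    # in which case the scores below are undefined.
    median_absolute_deviation_y = np.median(np.abs(ys - median_y))
    modified_z_scores = 0.6745 * (ys - median_y) / median_absolute_deviation_y
    return np.where(np.abs(modified_z_scores) > threshold)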
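
The constant 0.6745 is not explained in the snippet above. It is usually taken to be the 75th percentile of the standard normal distribution: for normally distributed data the MAD is roughly 0.6745 times the standard deviation, so the factor puts the modified z-score on about the same scale as an ordinary z-score. A minimal check, assuming Python 3.8+ for `statistics.NormalDist` (the variable name `normal_q3` is only illustrative):

```python
from statistics import NormalDist

# 75th percentile of the standard normal distribution.
# For normal data, MAD is approximately this value times the standard deviation.
normal_q3 = NormalDist().inv_cdf(0.75)
print(round(normal_q3, 4))  # 0.6745
```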

IQR (interquartile range)

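import numpy as np

def outliers_iqr(ys):
    """Return the indices of points more than 1.5 * IQR outside the quartiles."""
    ys = np.asarray(ys, dtype=float)  # make the elementwise comparisons explicit for list input
    quartile_1, quartile_3 = np.percentile(ys, [25, 75])
    iqr = quartile_3 - quartile_1
    lower_bound = quartile_1 - (iqr * 1.5)
    upper_bound = quartile_3 + (iqr * 1.5)
    return np.where((ys > upper_bound) | (ys < lower_bound))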
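
To see how the three rules can disagree, here is a minimal sketch that runs compact versions of them on one small, made-up sample. The data and the helper names (`z_score_outliers` and so on) are assumptions for illustration only, not part of the original post. With nine points, the plain z-score cannot flag anything: using the population standard deviation that `np.std` computes by default, no |z| can exceed sqrt(n - 1), which is about 2.83 here, below the threshold of 3. The two median-based rules are not limited in this way.

```python
import numpy as np

def z_score_outliers(ys, threshold=3.0):
    ys = np.asarray(ys, dtype=float)
    z = (ys - ys.mean()) / ys.std()
    return np.where(np.abs(z) > threshold)[0]

def modified_z_score_outliers(ys, threshold=3.5):
    ys = np.asarray(ys, dtype=float)
    median = np.median(ys)
    mad = np.median(np.abs(ys - median))
    return np.where(np.abs(0.6745 * (ys - median) / mad) > threshold)[0]

def iqr_outliers(ys):
    ys = np.asarray(ys, dtype=float)
    q1, q3 = np.percentile(ys, [25, 75])
    iqr = q3 - q1
    return np.where((ys < q1 - 1.5 * iqr) | (ys > q3 + 1.5 * iqr))[0]

# Hypothetical sample: eight ordinary readings and one extreme value at index 8.
sample = [10, 12, 11, 13, 12, 11, 10, 12, 100]

z_hits = z_score_outliers(sample)
mod_hits = modified_z_score_outliers(sample)
iqr_hits = iqr_outliers(sample)

print(z_hits.tolist())    # []
print(mod_hits.tolist())  # [8]
print(iqr_hits.tolist())  # [8]
```

The extreme value inflates both the mean and the standard deviation, which is why the z-score misses it, while the median, MAD, and quartiles barely move.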

Conclusion

These methods should not be used mechanically.
They are tools for exploring the data: they tell you which points might be worth a closer look.
What to do with that information depends heavily on the situation.
Sometimes it is appropriate to exclude outliers from a dataset so that a model trained on it is more predictive.
Sometimes, however, the presence of outliers is a warning sign that the real-world process generating the data is more complicated than expected.

As an astute commenter on CrossValidated put it: 
“Sometimes outliers are bad data, and should be excluded, such as typos.
Sometimes they are Wayne Gretzky or Michael Jordan, and should be kept.” 

Domain knowledge and practical wisdom are the only ways to tell the difference.

Excerpted from: http://colingorrie.github.io/outlier-detection.html


Original post: https://www.cnblogs.com/standby/p/9403999.html