  • statistics of Python

    statistics

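    The statistics module supports the plain int and float types, as well as the Decimal and Fraction wrapper types.

    All values in the input data should be of the same type.

    The functions fall into two groups:

    (1) Averages and measures of central location -- the mean and the median.

    (2) Measures of spread -- variance and standard deviation.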

    https://docs.python.org/3.7/library/statistics.html

    This module provides functions for calculating mathematical statistics of numeric (Real-valued) data.

    Note

    Unless explicitly noted otherwise, these functions support int, float, decimal.Decimal and fractions.Fraction. Behaviour with other types (whether in the numeric tower or not) is currently unsupported. Mixed types are also undefined and implementation-dependent. If your input data consists of mixed types, you may be able to use map() to ensure a consistent result, e.g. map(float, input_data).
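As a rough illustration of the note above, the sketch below calls mean() on single-type inputs and then normalizes a mixed-type list with map(float, ...) before averaging. The sample values and variable names are made up for illustration.

```python
from decimal import Decimal
from fractions import Fraction
from statistics import mean

# Each call uses a single numeric type, so the result keeps that type.
mean_of_fractions = mean([Fraction(1, 2), Fraction(3, 4)])
mean_of_decimals = mean([Decimal("0.5"), Decimal("0.75")])
print(mean_of_fractions)  # 5/8
print(mean_of_decimals)   # 0.625

# Hypothetical mixed-type input: the result is undefined for such data,
# so convert every value to float before calling mean().
mixed_data = [1, 2.5, Fraction(1, 2), Decimal("4")]
floats = list(map(float, mixed_data))
mean_of_floats = mean(floats)
print(mean_of_floats)  # 2.0
```

Converting to float trades exactness for consistency. If exact results matter, converting everything to Fraction instead is another option.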

    Averages and measures of central location

    These functions calculate an average or typical value from a population or sample.

    mean()

    Arithmetic mean (“average”) of data.

    harmonic_mean()

    Harmonic mean of data.

    median()

    Median (middle value) of data.

    median_low()

    Low median of data.

    median_high()

    High median of data.

    median_grouped()

    Median, or 50th percentile, of grouped data.

    mode()

    Mode (most common value) of discrete data.
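To make the differences between these functions concrete, here is a minimal sketch that runs each of them on small made-up data sets. The harmonic_mean() and median_grouped() inputs are the ones used in the Python documentation; the other values are assumptions chosen so that the list has an even number of items and a single most common value.

```python
from statistics import (harmonic_mean, mean, median, median_grouped,
                        median_high, median_low, mode)

values = [1, 2, 3, 4, 4, 6]  # made-up sample with an even number of items

print(mean(values))         # arithmetic mean: 20 / 6
print(median(values))       # 3.5, the average of the two middle values
print(median_low(values))   # 3, the lower of the two middle values
print(median_high(values))  # 4, the higher of the two middle values
print(mode(values))         # 4, the most common value

# The harmonic mean is suited to averaging rates such as speeds.
print(harmonic_mean([40, 60]))

# median_grouped() treats each value as the midpoint of a class
# interval (width 1 by default) and interpolates within that interval.
print(median_grouped([52, 52, 53, 54]))  # 52.5
```

With an odd number of items, median(), median_low() and median_high() all return the same middle value.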

    Measures of spread

    These functions calculate a measure of how much the population or sample tends to deviate from the typical or average values.

    pstdev()

    Population standard deviation of data.

    pvariance()

    Population variance of data.

    stdev()

    Sample standard deviation of data.

    variance()

    Sample variance of data.

    DEMO

    https://pymotw.com/3/statistics/index.html

    Variance or standard deviation is used to measure how dispersed the data is.

    The smaller the value, the more tightly the data clusters around the mean.

    Statistics uses two values to express how dispersed a set of values is relative to the mean. The variance is the average of the square of the difference of each value and the mean, and the standard deviation is the square root of the variance (which is useful because taking the square root allows the standard deviation to be expressed in the same units as the input data). Large values for variance or standard deviation indicate that a set of data is dispersed, while small values indicate that the data is clustered closer to the mean.
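The definition above can be checked by hand. The sketch below computes the variance and standard deviation step by step and compares them with pvariance() and pstdev(). The data set and variable names are made up for illustration.

```python
import math
from statistics import mean, pstdev, pvariance

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data whose mean is 5

mu = mean(values)
# Square of the difference between each value and the mean
squared_diffs = [(x - mu) ** 2 for x in values]
# Variance: the average of those squared differences
manual_variance = sum(squared_diffs) / len(values)
# Standard deviation: the square root of the variance
manual_stdev = math.sqrt(manual_variance)

print(manual_variance, pvariance(values))  # both equal 4
print(manual_stdev, pstdev(values))        # both equal 2.0
```

Here the standard deviation of 2.0 is in the same units as the input values, while the variance of 4 is in squared units.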

    import subprocess
    from statistics import mean, pstdev, pvariance, stdev, variance


    def get_line_lengths():
        """Yield (line_count, filename) for every non-empty file wc reports."""
        cmd = 'wc -l ../[a-z]*/*.py'
        out = subprocess.check_output(
            cmd, shell=True).decode('utf-8')
        for line in out.splitlines():
            parts = line.split()
            # wc ends its output with a summary row; stop there
            if parts[1].strip().lower() == 'total':
                break
            nlines = int(parts[0].strip())
            if not nlines:
                continue  # skip empty files
            yield (nlines, parts[1].strip())


    # Population: the line count of every file found
    data = list(get_line_lengths())

    lengths = [d[0] for d in data]
    # Sample: the line count of every second file
    sample = lengths[::2]

    print('Basic statistics:')
    print('  count     : {:3d}'.format(len(lengths)))
    print('  min       : {:6.2f}'.format(min(lengths)))
    print('  max       : {:6.2f}'.format(max(lengths)))
    print('  mean      : {:6.2f}'.format(mean(lengths)))

    print('\nPopulation variance:')
    print('  pstdev    : {:6.2f}'.format(pstdev(lengths)))
    print('  pvariance : {:6.2f}'.format(pvariance(lengths)))

    print('\nEstimated variance for sample:')
    print('  count     : {:3d}'.format(len(sample)))
    print('  stdev     : {:6.2f}'.format(stdev(sample)))
    print('  variance  : {:6.2f}'.format(variance(sample)))

    Variance and standard deviation can be computed in either sample mode or population mode.

    stdev -- standard deviation of a sample.

    pstdev -- standard deviation of the whole population, computed over all of the data.

    Python includes two sets of functions for computing variance and standard deviation, depending on whether the data set represents the entire population or a sample of the population. This example uses wc to count the number of lines in the input files for all of the example programs. It then uses pvariance() and pstdev() to compute the variance and standard deviation for the entire population, and variance() and stdev() to compute the sample variance and standard deviation for a subset made up of the length of every second file found.

    $ python3 statistics_variance.py
    
    Basic statistics:
      count     : 1282
      min       :   4.00
      max       : 228.00
      mean      :  27.79
    
    Population variance:
      pstdev    :  17.86
      pvariance : 319.04
    
    Estimated variance for sample:
      count     : 641
      stdev     :  16.94
      variance  : 286.99
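The two sets of functions differ only in the divisor: the population functions divide the sum of squared differences by n, while the sample functions divide by n - 1. The sketch below checks that relationship on a small made-up list, with illustrative variable names.

```python
from statistics import pvariance, variance

values = [1, 2, 3, 4, 5]  # made-up data; the sum of squared differences is 10
n = len(values)

population_var = pvariance(values)  # divides by n
sample_var = variance(values)       # divides by n - 1

# Rescaling the population variance by n / (n - 1) gives the sample variance.
rescaled = population_var * n / (n - 1)
print(population_var, sample_var, rescaled)
```

Because of the n - 1 divisor, variance() and stdev() need at least two data points, and for the same data the sample figures are always at least as large as the population figures.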
    
  • Original article: https://www.cnblogs.com/lightsong/p/13994422.html