Determining sample size from power, effect size, and alpha (significance level)
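# -*- coding: utf-8 -*-
"""
sample size VS effect size VS power
Created on Fri Apr 28 11:00:22 2017

@author: toby
"""
from statsmodels.stats import power

# Leave exactly one of effect_size, nobs1, alpha, power unset;
# tt_ind_solve_power solves for the missing one.

# Required sample size of group 1 (equal group sizes by default, ratio=1)
nobs = power.tt_ind_solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(nobs)
'''
63.76561177540974
'''

# Smallest detectable effect size with 25 observations per group
effect_size = power.tt_ind_solve_power(alpha=0.05, power=0.8, nobs1=25)
print(effect_size)
'''
0.8087077886680407
'''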
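In an independent-samples t-test, the higher the power (sensitivity) you require, the larger the sample size you need. An effect size of 0.5 represents a medium effect. If the effect size is too small, the experiment has little practical meaning even when the result is significant (p < 0.05).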
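The trend described above can be sketched with the standard library alone. This is a minimal sketch, not the statsmodels solver: the helper name `approx_n_per_group` is hypothetical, and it assumes a two-sided test, equal group sizes, and the normal approximation n = 2 (z_{1-alpha/2} + z_{power})^2 / d^2 per group, which tends to give slightly smaller numbers than the t-based `tt_ind_solve_power`.

```python
from math import ceil
from statistics import NormalDist


def approx_n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-sided, two-sample comparison.

    Assumption: normal approximation with equal group sizes; effect_size is
    the standardized difference (difference in means divided by the SD).
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


if __name__ == "__main__":
    # Higher power or a smaller effect size both push the sample size up.
    for d in (0.2, 0.5, 0.8):
        for p in (0.7, 0.8, 0.9):
            n = approx_n_per_group(d, power=p)
            print(f"effect size {d}, power {p}: about {n} per group")
```

Rounding up with `ceil` is a design choice made here so that the planned power is not undershot.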
A better sample-size calculation script is available on GitHub:
https://github.com/thomas-haslwanter/statsintro_python/tree/master/ISP/Code_Quantlets/07_CheckNormality_CalcSamplesize/sampleSize
# -*- coding: utf-8 -*-
"""
Created on Fri Apr 28 11:12:01 2017

@author: toby
"""

'''Calculate the sample size for experiments, for normally distributed groups, for:
- Experiments with one single group
- Comparing two groups
'''

# Copyright(c) 2015, Thomas Haslwanter. All rights reserved, under the CC BY-SA 4.0 International License

# Import standard packages
import numpy as np

# additional packages
from scipy.stats import norm


def sampleSize_oneGroup(d, alpha=0.05, beta=0.2, sigma=1):
    '''Sample size for a single group. The formula corresponds to Eq 6.2 in the book.'''

    n = np.round((norm.ppf(1-alpha/2.) + norm.ppf(1-beta))**2 * sigma**2 / d**2)

    print(('In order to detect a change of {0} in a group with an SD of {1},'.format(d, sigma)))
    print(('with significance {0} and test-power {1}, you need at least {2:d} subjects.'.format(alpha, 100*(1-beta), int(n))))

    return n


def sampleSize_twoGroups(D, alpha=0.05, beta=0.2, sigma1=1, sigma2=1):
    '''Sample size for two groups. The formula corresponds to Eq 6.4 in the book.'''

    n = np.round((norm.ppf(1-alpha/2.) + norm.ppf(1-beta))**2 * (sigma1**2 + sigma2**2) / D**2)

    print(('In order to detect a change of {0} between groups with an SD of {1} and {2},'.format(D, sigma1, sigma2)))
    print(('with significance {0} and test-power {1}, you need in each group at least {2:d} subjects.'.format(alpha, 100*(1-beta), int(n))))

    return n


if __name__ == '__main__':
    sampleSize_oneGroup(0.5)
    print(' ')
    sampleSize_twoGroups(0.4, sigma1=0.6, sigma2=0.6)
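For reference, the two formulas that the script evaluates can be written out as follows. This is read directly off the code (`norm.ppf` is the standard normal quantile function, written z here); the worked numbers are rounded by hand and are meant only as a sanity check of the two calls in the `__main__` block.

```latex
% One group: detect a change d in a group with standard deviation \sigma
n = \frac{\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2} \sigma^{2}}{d^{2}}

% Two groups: detect a difference D, with standard deviations \sigma_1 and \sigma_2
n_{\text{per group}} = \frac{\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2} \left(\sigma_1^{2} + \sigma_2^{2}\right)}{D^{2}}

% Worked check with \alpha = 0.05, \beta = 0.2:
% z_{0.975} \approx 1.960, \quad z_{0.8} \approx 0.842, \quad (1.960 + 0.842)^2 \approx 7.849
% One group,  d = 0.5, \sigma = 1:              n \approx 7.849 / 0.25 \approx 31.4
% Two groups, D = 0.4, \sigma_1 = \sigma_2 = 0.6: n \approx 7.849 \times 0.72 / 0.16 \approx 35.3
```

Because the script uses `np.round`, these values are reported as 31 and 35 subjects. Rounding up instead would be the more conservative choice when planning a study.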