  • UC Berkeley Stat2.2x Probability study notes: Final

    Stat2.2x Probability was taught by the University of California, Berkeley on the edX platform in 2014.

    PDF notes download (Academia.edu)

    ADDITIONAL PRACTICE FOR THE FINAL

    PROBLEM 1

    A box contains 8 dark chocolates, 8 milk chocolates, and 8 white chocolates. (It’s amazing how this box keeps replenishing itself and reappearing. It’s like the Magic Pudding. Australians will know what I mean, and the rest of you might enjoy finding out. It’s one of the classics of children’s literature.) A simple random sample of 6 chocolates is drawn. Find:

    a) the expected number of dark chocolates

    b) the SE of the number of dark chocolates

    c) the chance that there are fewer than 2 dark chocolates

    d) the chance that the second and third chocolates drawn are dark, given that the first and fourth chocolates drawn are not dark

    e) the expected number of dark chocolates among the last four draws

    Solution

    This is a hypergeometric distribution (Zeros and Ones: Sum of a sample without replacement), with $n=6, N=24, G=8$.

    1a) $$E(\text{dark chocolates})=n\cdot\frac{G}{N}=6\times\frac{8}{24}=2$$

    1b) $$SE(\text{dark chocolates})=\sqrt{n\cdot\frac{G}{N}\cdot\frac{N-G}{N}}\cdot\sqrt{\frac{N-n}{N-1}}$$ $$=\sqrt{6\times\frac{8}{24}\times\frac{16}{24}}\times\sqrt{\frac{24-6}{24-1}}\doteq1.021508$$

    1c) $$P(\text{fewer than 2 dark chocolates})=\sum_{x=0}^{1}\frac{C_{G}^{x}\cdot C_{N-G}^{n-x}}{C_{N}^{n}}$$ $$=\sum_{x=0}^{1}\frac{C_{8}^{x}\times C_{16}^{6-x}}{C_{24}^{6}}\doteq0.319118$$ R code:

    sum(dhyper(0:1, 8, 16, 6))
    [1] 0.319118

    1d) Conditioning on two non-dark draws leaves 22 chocolates, 8 of them dark, for the second and third draws: $$P(\text{2nd and 3rd are dark | 1st and 4th are not dark})$$ $$=\frac{8}{22}\times\frac{7}{21}\doteq0.1212121$$

    1e) Given no information about any other draw, the last four draws are probabilistically the same as any other four, say the first four. $$E(\text{dark chocolates among the last four draws})=4\times\frac{8}{24}\doteq1.333333$$
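
As a cross-check on 1a), 1b), and 1e), the expectation and SE can also be computed directly from the hypergeometric probabilities rather than from the closed-form formulas. This is only a minimal sketch; the variable names (`x`, `pmf`, `ev`, `se`, `ev_last4`) are illustrative and do not come from the course.

```r
# Possible numbers of dark chocolates in a sample of 6
x <- 0:6
# Hypergeometric probabilities: 8 dark, 16 non-dark, 6 drawn
pmf <- dhyper(x, 8, 16, 6)

ev <- sum(x * pmf)                  # expected number of dark chocolates
se <- sqrt(sum((x - ev)^2 * pmf))   # SE of the number of dark chocolates

# By symmetry, any four draws behave like a simple random sample of size 4
ev_last4 <- sum(0:4 * dhyper(0:4, 8, 16, 4))

print(c(ev = ev, se = se, ev_last4 = ev_last4))
```

The three values should agree with the answers to 1a), 1b), and 1e) above.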

    PROBLEM 2

    The casino is offering a “house special” at roulette: there are 8 chances in 38 to win, and the bet pays 3 to 1. Suppose you bet $\$1$ on the house special, 200 times, independently. Find:

    a) your expected average net gain per bet (and then pledge that you will never play this game)

    b) the chance that you come out ahead

    c) the chance that you lose more than $\$20$

    Solution

    2a) Sample mean with replacement: $$E=3\times\frac{8}{38}+(-1)\times\frac{30}{38}\doteq-0.1578947$$

    2b) Let $x$ be the number of wins. $$3x+(-1)\cdot(200-x) > 0\Rightarrow x > 50\Rightarrow x\geq51$$ Binomial distribution $n=200, k=51:200, p=\frac{8}{38}$: $$P(\text{come out ahead})=\sum_{k=51}^{200}C_{200}^{k}\times(\frac{8}{38})^k\times(\frac{30}{38})^{200-k}\doteq0.0750046$$ R code:

    sum(dbinom(51:200, 200, 8/38))
    [1] 0.0750046

    2c) $$3x+(-1)\cdot(200-x) < -20\Rightarrow x < 45\Rightarrow x\leq44$$ $$P(\text{lose more than 20})=\sum_{k=0}^{44}C_{200}^{k}\times(\frac{8}{38})^k\times(\frac{30}{38})^{200-k}\doteq0.6660572$$ R code:

    sum(dbinom(0:44, 200, 8/38))
    [1] 0.6660572
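
The two tail sums in 2b) and 2c) can also be read off the binomial CDF with `pbinom`, which avoids listing every term. A short sketch follows; the names `ahead` and `lose20` are illustrative only.

```r
n <- 200; p <- 8 / 38

# 2b) P(X >= 51) = P(X > 50), using the upper tail of the binomial CDF
ahead <- pbinom(50, n, p, lower.tail = FALSE)

# 2c) P(X <= 44), using the lower tail
lose20 <- pbinom(44, n, p)

print(c(ahead = ahead, lose20 = lose20))
```

Both values should match the `dbinom` sums above up to rounding.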

    PROBLEM 3

    Households in a large city contain an average of 2.2 people, with an $SD$ of 1.2 people. A simple random sample of 625 households is taken.

    a) Approximately what is the chance that there are more than 1400 people in the sampled households?

    b) How would your answer to a) have been different had the sample been drawn with replacement?

    Solution

    3a) Sample sum without replacement, but the correction factor is very close to 1 because the city is very large. $\mu=2.2, \sigma=1.2, n=625$: $$SE=\sqrt{n}\cdot\sigma=\sqrt{625}\times1.2=30$$ $$Z=\frac{1400.5-n\cdot\mu}{SE}$$ Calculating by R:

    n = 625; mu = 2.2
    z = (1400.5 - n * mu) / 30
    1 - pnorm(z)
    [1] 0.1976625

    Thus the chance is around $19.77\%$.

    3b) It would not be noticeably different. Because the city is large, the correction factor is very close to 1, so the chance is essentially the same whether the sample is drawn with or without replacement.
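
To see how little the correction factor matters, one can plug in some city sizes. The problem does not give the number of households, so the values of `N` below are purely hypothetical and serve only as an illustration.

```r
n <- 625; mu <- 2.2; sigma <- 1.2

# Hypothetical numbers of households in the city (NOT given in the problem)
N <- c(1e5, 1e6)

fpc <- sqrt((N - n) / (N - 1))        # finite population correction factor
se  <- sqrt(n) * sigma * fpc          # SE of the sample sum without replacement
chance <- 1 - pnorm((1400.5 - n * mu) / se)

print(data.frame(N = N, fpc = fpc, se = se, chance = chance))
```

Under these assumed sizes the correction factor stays very close to 1, so the chance barely moves from the with-replacement answer in 3a).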

    PROBLEM 4

    There are three boxes. Box I contains one gold coin and one silver coin. Box II contains two silver coins. Box III contains two gold coins. A box is selected at random, and then one coin is selected at random from that box. Given that the coin is gold, what is the chance that the other coin in the box is gold? [No, the answer is not 1/2.]

    Solution

    Bayes' rule. The other coin is gold exactly when the selected box is Box III, so: $$P(\text{box 3 | the first coin is gold})=\frac{P(\text{the first coin is gold and it is from box 3})}{P(\text{the first coin is gold})}$$ $$=\frac{\frac{1}{3}\times1}{\frac{1}{3}\times\frac{1}{2}+\frac{1}{3}\times0+\frac{1}{3}\times1}=\frac{2}{3}$$
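
For readers who find the answer 2/3 counterintuitive, a quick simulation can make it more believable. The sketch below is illustrative only: the seed, the number of trials, and the names `coins`, `box`, `pick`, and `est` are arbitrary choices.

```r
set.seed(1)        # arbitrary seed, only for reproducibility
trials <- 100000

# One row per box: Box I = gold/silver, Box II = silver/silver, Box III = gold/gold
coins <- rbind(c("G", "S"), c("S", "S"), c("G", "G"))

box  <- sample(1:3, trials, replace = TRUE)   # which box is selected
pick <- sample(1:2, trials, replace = TRUE)   # which coin in that box is drawn

drawn <- coins[cbind(box, pick)]              # the coin that was drawn
other <- coins[cbind(box, 3 - pick)]          # the coin left in the box

# Among trials where the drawn coin is gold, how often is the other coin gold?
est <- mean(other[drawn == "G"] == "G")
print(est)
```

The estimate should land near 2/3 rather than 1/2, because a gold coin is twice as likely to have come from Box III as from Box I.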

    PROBLEM 5

    A coin is tossed $n$ times. There is about $95\%$ chance that the proportion of heads is in the range $.49$ to $.51$. The number of tosses $n$ is closest to:

    a) 1,000

    b) 5,000

    c) 10,000

    d) 50,000

    Solution

    Sample proportion of ones. $p=0.5$ and the interval $.49$ to $.51$ has to be $0.5\pm2SE$, thus $$2SE=0.01\Rightarrow SE=0.005$$ On the other hand $$SE=\sqrt{\frac{p\cdot(1-p)}{n}}=\sqrt{\frac{\frac{1}{4}}{n}}=0.005\Rightarrow n=10000$$ Thus c) is correct.
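
The same conclusion can be checked by evaluating the normal-approximation chance for each answer choice. This is a rough sketch with illustrative variable names; it ignores any continuity correction.

```r
n <- c(1000, 5000, 10000, 50000)   # the four answer choices
se <- sqrt(0.5 * 0.5 / n)          # SE of the proportion of heads

# Normal-approximation chance that the proportion lands between 0.49 and 0.51
chance <- pnorm(0.01 / se) - pnorm(-0.01 / se)

print(data.frame(n = n, se = se, chance = chance))

# The choice whose chance is closest to 95%
best <- n[which.min(abs(chance - 0.95))]
print(best)
```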

    FINAL EXAM

    PROBLEM 1

    Suppose you are trying to estimate the percent of women in a city. Other things being equal, a simple random sample of 0.1% of the population of a city that has 2,000,000 people is ________ as a simple random sample of 0.1% of the population of a city that has 500,000 people. Fill in the blank with the best of the following choices.

    a) about 1/4 times as accurate

    b) about 1/2 times as accurate

    c) about as accurate

    d) about 2 times as accurate

    e) about 4 times as accurate

    Solution

    Square Root Law. $$2\times10^6\times0.1\%=2000, 5\times10^5\times0.1\%=500$$ $$\Rightarrow\sqrt{\frac{2000}{500}}=2$$ Thus the former is about 2 times as accurate as the latter. d) is correct.
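
The factor of 2 can be seen by computing the two SEs side by side. The true percent of women is not given, so `p = 0.5` below is an assumption made only for illustration; the ratio of the SEs does not depend on it.

```r
p <- 0.5                      # assumed proportion of women (not given in the problem)
N <- c(2e6, 5e5)              # city populations
n <- N * 0.001                # 0.1% samples: 2000 and 500 people

fpc <- sqrt((N - n) / (N - 1))         # correction factor, nearly equal for both cities
se  <- sqrt(p * (1 - p) / n) * fpc     # SE of the sample proportion

print(se)
ratio <- se[2] / se[1]        # SE in the smaller city relative to the larger city
print(ratio)
```

Because both samples take the same fraction of their populations, the correction factors nearly cancel and the ratio is driven by the sample sizes alone.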

    PROBLEM 2

    A group of 30 people consists of 15 children, 10 men, and 5 women. Tom and Jerry are two of the men in the group. Five people are picked at random without replacement.

    2A Find the chance the first person picked is a man, given that the fourth and fifth people picked are children.

    2B Find the chance that more than two women are picked.

    2C Find the chance that Tom and Jerry both get picked.

    Solution

    2A) $$P(\text{1st person is a man | 4th and 5th are children})=\frac{10}{28}\doteq0.3571429$$

    2B) Hypergeometric distribution: $$P(\text{more than 2 women})=\sum_{x=3}^{5}\frac{C_{5}^{x}\cdot C_{25}^{5-x}}{C_{30}^{5}}\doteq0.02193592$$ R code:

    sum(dhyper(3:5, 5, 25, 5))
    [1] 0.02193592

    2C) If both Tom and Jerry are picked, the other 3 people must be chosen from the remaining 28: $$P(\text{both Tom and Jerry get picked})=\frac{C_{28}^{3}}{C_{30}^{5}}\doteq0.02298851$$ R code:

    choose(28, 3) / choose(30, 5)
    [1] 0.02298851
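
The answer to 2C) can be reached in at least two other ways, which makes a useful sanity check. The sketch below treats Tom and Jerry as 2 "marked" people among 30; the names `direct`, `hyper`, and `product` are illustrative.

```r
# Count the samples of 5 that contain both Tom and Jerry
direct <- choose(28, 3) / choose(30, 5)

# Hypergeometric: both of the 2 marked people appear among the 5 picked
hyper <- dhyper(2, 2, 28, 5)

# Tom takes one of the 5 sample places out of 30,
# then Jerry takes one of the remaining 4 places out of 29
product <- (5 / 30) * (4 / 29)

print(c(direct = direct, hyper = hyper, product = product))
```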

    PROBLEM 3

    A gambling game pays 4 to 1 and the chance of winning is 1 in 6. Suppose you bet $\$1$ on this game 600 times independently.

    3A Find the expected number of times you win.

    3B Find the $SE$ of the number of times you win.

    3C Find the chance that you lose more than $\$50$ (that is, your net gain in the 600 bets is less than $-\$50$).

    Solution

    Zeros and Ones: Sum of a sample with replacement, $n=600, p=\frac{1}{6}$.

    3A) $$E(\text{number of wins})=n\cdot p=600\times\frac{1}{6}=100$$

    3B) $$SE(\text{number of wins})=\sqrt{n\cdot p\cdot(1-p)}\doteq9.128709$$

    3C) Let $x$ be the number of wins, $$4x+(-1)\cdot(600-x) < -50\Rightarrow x < 110\Rightarrow x\leq109$$ Binomial distribution $n=600, k=0:109, p=\frac{1}{6}$: $$P(\text{lose more than 50})=\sum_{k=0}^{109}C_{600}^{k}\times(\frac{1}{6})^k\times(\frac{5}{6})^{600-k}\doteq0.8508149$$ R code:

    sum(dbinom(0:109, 600, 1/6))
    [1] 0.8508149
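
Since 3A) and 3B) already give the expected value and SE, the exact binomial answer to 3C) can be compared with a normal approximation. The sketch below uses a continuity correction at 109.5; the variable names are illustrative.

```r
n <- 600; p <- 1 / 6
ev <- n * p                        # 3A: expected number of wins
se <- sqrt(n * p * (1 - p))        # 3B: SE of the number of wins

exact  <- pbinom(109, n, p)            # P(X <= 109), same as the dbinom sum above
approx <- pnorm((109.5 - ev) / se)     # normal curve with continuity correction

print(c(exact = exact, approx = approx))
```

With 600 bets the two values should be close, which is why the normal curve is a reasonable shortcut here.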

    PROBLEM 4

    In a grocery store, butter is sold in “sticks” that are shaped like little bricks. The weights of these sticks are like draws at random with replacement from a population with average 4 ounces and SD 0.2 ounces. The grocery store receives the butter in boxes; each box consists of 100 sticks.

    4A Find the chance that the average weight of the sticks in one box is less than 3.999 ounces.

    4B The grocery store has received 6 boxes of butter. There is about ___________ chance that in at least one of the boxes, the average weight of sticks is less than 3.999 ounces.

    Solution

    4A) Sample mean with replacement, $$\mu=4, \sigma=0.2, n=100\Rightarrow SE=\frac{\sigma}{\sqrt{n}}=0.02$$ $$Z=\frac{3.999-\mu}{SE}$$ Calculating by R:

    z = (3.999 - 4) / 0.02
    pnorm(z)
    [1] 0.4800612

    4B) Following 4A), this is a binomial distribution with $n=6, k=1:6, p=0.4800612$: $$P(\text{at least 1 box is less than 3.999 ounces})$$ $$=\sum_{k=1}^{6}C_{6}^{k}\cdot p^k\cdot(1-p)^{6-k}=0.9802433$$ R code:

    p = pnorm(z)
    sum(dbinom(1:6, 6, p))
    [1] 0.9802433
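
An "at least one" probability is usually easier through the complement: the chance that none of the 6 boxes falls below 3.999 ounces. A short sketch, with illustrative names, assuming the boxes are independent as in 4B):

```r
p <- pnorm((3.999 - 4) / 0.02)             # chance for a single box, from 4A

via_sum        <- sum(dbinom(1:6, 6, p))   # the binomial sum used above
via_complement <- 1 - (1 - p)^6            # 1 - P(no box is below 3.999)

print(c(via_sum = via_sum, via_complement = via_complement))
```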

    PROBLEM 5

    In surveys about sensitive topics, respondents are sometimes given ways to “hide” their answers from the surveyor. In a survey of taxpayers, one of the questions is, “Did you cheat on your taxes?” To answer, the respondent is asked to toss a fair coin. If it lands heads, the respondent must answer “yes.” If it lands tails, the respondent must answer the question truthfully, either “yes” or “no” (the answer has to be the one that is true). Assume that all respondents follow this procedure, and that for 10% of the respondents the truthful answer is “yes.” Also assume that the result of a respondent’s coin toss is independent of whether or not the respondent cheated on his / her taxes. One of the respondents is picked at random.

    5A Given that the respondent cheated on his / her taxes, what is the chance that he / she answered “yes”?

    5B Given that the respondent did not cheat on his / her taxes, what is the chance that he / she answered “yes”?

    5C Given that the respondent answered “yes,” what is the chance that the respondent cheated on his / her taxes?

    Solution

    According to the information, we have $$P(\text{cheated on taxes})=0.1, P(\text{did not cheat on taxes})=0.9$$

    5A) $$P(\text{answered Yes | cheated on taxes})$$ $$=\frac{P(\text{cheated on taxes and answered Yes})}{P(\text{cheated on taxes})}$$ $$=\frac{P(\text{cheated and tossed head})+P(\text{cheated and tossed tail})}{P(\text{cheated on taxes})}$$ $$=\frac{0.1\times0.5+0.1\times0.5}{0.1}=1$$ This result indicates that anyone who cheated on taxes must have answered "Yes".

    5B) $$P(\text{answered Yes | did not cheat on taxes})$$ $$=\frac{P(\text{answered Yes but did not cheat on taxes})}{P(\text{did not cheat on taxes})}$$ $$=\frac{0.9\times0.5}{0.9}=0.5$$

    5C) $$P(\text{cheated on taxes | answered Yes})=\frac{P(\text{cheated on taxes and answered Yes})}{P(\text{answered Yes})}$$ $$=\frac{P(\text{cheated on taxes and answered Yes})}{P(\text{cheated on taxes and answered Yes})+P(\text{did not cheat on taxes and answered Yes})}$$ $$=\frac{0.1\times0.5+0.1\times0.5}{(0.1\times0.5+0.1\times0.5)+0.9\times0.5}\doteq0.1818182$$
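
The three answers follow from the same small calculation, which can be laid out step by step. The sketch below is one way to organize it; all variable names are illustrative.

```r
p_cheat <- 0.1                    # P(cheated on taxes)
p_heads <- 0.5                    # fair coin

# Heads forces "yes"; tails gives the truthful answer
yes_given_cheat    <- p_heads * 1 + (1 - p_heads) * 1   # 5A
yes_given_no_cheat <- p_heads * 1 + (1 - p_heads) * 0   # 5B

# Total probability of answering "yes", then Bayes' rule for 5C
p_yes <- p_cheat * yes_given_cheat + (1 - p_cheat) * yes_given_no_cheat
cheat_given_yes <- p_cheat * yes_given_cheat / p_yes

print(c(yes_given_cheat, yes_given_no_cheat, cheat_given_yes))
```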

    PROBLEM 6

    In a population of 10,000 adults, $20\%$ are smokers. A simple random sample of 600 of the adults is drawn.

    6A Find the expected number of smokers in the sample.

    6B The $SE$ of the number of smokers in the sample is closest to

    6C Find the chance that there are fewer than 115 smokers in the sample.

    Solution

    6A) $$E=n\cdot p=600\times0.2=120$$

    6B) $$SE=\sqrt{n\cdot p\cdot(1-p)}\cdot\sqrt{\frac{N-n}{N-1}}$$ $$=\sqrt{600\times0.2\times0.8}\times\sqrt{\frac{10000-600}{10000-1}}\doteq9.499949$$

    6C) $$Z=\frac{115-120}{SE}$$ Calculating by R:

    n = 600; N = 10000; p = 0.2
    se = sqrt(n * p * (1 - p)) * sqrt((N - n) / (N - 1))
    z = (115 - n * p) / se
    pnorm(z)
    [1] 0.2993334
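
The calculation in 6C) uses the normal curve with the cutoff at 115. As a point of comparison, and not as the course's official method, the sketch below computes the exact hypergeometric chance of at most 114 smokers, together with a normal approximation that places a continuity correction at 114.5. The names `exact` and `approx_cc` are illustrative.

```r
N <- 10000; n <- 600
G <- 2000                           # smokers in the population (20% of 10,000)

# Exact chance of at most 114 smokers in the sample
exact <- phyper(114, G, N - G, n)

# Normal approximation with a continuity correction at 114.5
se <- sqrt(n * 0.2 * 0.8) * sqrt((N - n) / (N - 1))
approx_cc <- pnorm((114.5 - n * 0.2) / se)

print(c(exact = exact, approx_cc = approx_cc))
```

Whether to apply the continuity correction is a modelling choice; the comparison simply shows how much the answer depends on it.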

    PROBLEM 7

    When a die is rolled, the face with six spots appears with chance $\frac{1}{6}$, independently of all other rolls. Rank the three events below in increasing order of probability. For example, if you choose “A B C”, you are saying that A has the smallest chance, B has more chance than A but less chance than C, and C has the biggest chance. [If you think that some of the events have the same chance, please think again.]

    A: The face with six spots shows up on fewer than $16.7\%$ of the rolls when a die is rolled 60,000 times.

    B: The face with six spots shows up on more than $16.7\%$ of the rolls when a die is rolled 30,000 times.

    C: The face with six spots shows up on fewer than $16.7\%$ of the rolls when a die is rolled 30,000 times.

    Solution

    This is a binomial distribution. Let $m=n\cdot p$, where $n$ is the number of rolls and $p=\frac{1}{6}$: $$P(A)=\sum_{k=0}^{m-1}C_{n}^{k}\cdot p^k\cdot(1-p)^{n-k}$$ where $n=60000$. $$P(B)=\sum_{k=m+1}^{n}C_{n}^{k}\cdot p^k\cdot(1-p)^{n-k}$$ where $n=30000$. $$P(C)=\sum_{k=0}^{m-1}C_{n}^{k}\cdot p^k\cdot(1-p)^{n-k}$$ where $n=30000$. R code:

    dieroll = function(n, p, id){ # id=0 means fewer than a fixed proportion 
      m = n * p
      if(id == 0){
        print(sum(dbinom(0:(m - 1), n, p)))
      } else{
        print(sum(dbinom((m + 1):n, n, p)))
      }
    }
    > dieroll(60000, 1/6, 0)
    [1] 0.4983005
    > dieroll(30000, 1/6, 1)
    [1] 0.4962232
    > dieroll(30000, 1/6, 0)
    [1] 0.4975965
    

    Thus $$P(B) < P(C) < P(A)$$
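
The solution above places the cutoff at $n\cdot p=\frac{n}{6}$, whereas the problem states the cutoff as $16.7\%$, which is slightly larger than $\frac{1}{6}$. As an alternative reading, the sketch below takes the cutoff literally as $0.167n$ (10020 sixes for 60,000 rolls and 5010 for 30,000). The helper names `fewer` and `more` are hypothetical and not part of the original notes.

```r
p <- 1 / 6

# Chance of fewer than / more than 16.7% sixes in n rolls
fewer <- function(n) pbinom(round(0.167 * n) - 1, n, p)
more  <- function(n) pbinom(round(0.167 * n), n, p, lower.tail = FALSE)

pA <- fewer(60000)
pB <- more(30000)
pC <- fewer(30000)
print(c(A = pA, B = pB, C = pC))
```

Under this reading the cutoff sits above the expected count, which pushes A and C above one half and B below it, and the ordering of B, C, A is the same as the one found above.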

    PROBLEM 8

    A die has 2 red faces, 2 blue faces, and 2 green faces. It is rolled 240 times. Let $R$ be the number of times red faces appear, and $B$ the number of times blue faces appear.

    8A The random variable $R$ is the sum of 240 draws at random with replacement from

    8B Consider the random variable $D = R - B$. That’s $D$ for “difference.” If all 240 rolls show blue faces, then $D = -240$; if they all show red faces, then $D = 240$; otherwise $D$ is somewhere in between. The random variable $D$ is the sum of 240 draws at random with replacement from

    8C Find $E(D)$

    8D Find $SE(D)$

    Solution

    8A) Note that $R$ ranges from 0 to 240: each roll adds 1 to $R$ if it shows red and 0 otherwise. Thus an equivalent box contains 1s and 0s with one third of the tickets marked 1, such as $$1,1,0,0,0,0$$ or $$1,0,0$$

    8B) Similar to 8A. The equivalent box should contain 1 (red), -1 (blue), and 0 (green) in equal proportions, such as $$1, 0, -1$$ or $$1, 1, -1, -1, 0, 0$$

    8C) & 8D) Sample sum with replacement: $$\mu=0, n=240$$ and $$\sigma=\sqrt{(1-0)^2\times\frac{1}{3}+(-1-0)^2\times\frac{1}{3}+(0-0)^2\times\frac{1}{3}}=\sqrt{\frac{2}{3}}$$ Thus $$E(D)=n\cdot\mu=0$$ $$SE(D)=\sqrt{n}\cdot\sigma=\sqrt{240}\times\sqrt{\frac{2}{3}}=\sqrt{160}\doteq12.64911$$
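
The box average and SD in 8C) and 8D) can be computed directly from the tickets. Note that R's built-in `sd()` divides by the number of tickets minus 1, so the sketch below computes the SD of the box by hand instead. The variable names are illustrative.

```r
box <- c(1, -1, 0)                       # tickets for red, blue, green
n <- 240                                 # number of rolls

mu    <- mean(box)                       # average of the box
sigma <- sqrt(mean((box - mu)^2))        # SD of the box (divides by 3, unlike sd())

e_d  <- n * mu                           # E(D)
se_d <- sqrt(n) * sigma                  # SE(D)
print(c(E = e_d, SE = se_d))
```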


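    Author: Zhao Yin (赵胤)
    Source: http://www.cnblogs.com/zhaoyin/
    Copyright in this article is shared by the author and cnblogs (博客园). Reposting is welcome, but unless the author agrees otherwise, this notice must be retained and a link to the original article must be placed prominently on the page; otherwise the right to pursue legal liability is reserved.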

  • Original article: https://www.cnblogs.com/zhaoyin/p/4197020.html