  • UC Berkeley Stat2.2x Probability, introductory study notes: Section 2 Random sampling with and without replacement

    Stat2.2x Probability is a course taught by the University of California, Berkeley on the edX platform in 2014.

    Download the notes as PDF (Academia.edu)

    Summary

    • Independent events $$P(A\cap B)=P(A)\cdot P(B)$$
    • Binomial distribution $$C_{n}^{k}\cdot p^k\cdot(1-p)^{n-k}$$ R function:
      dbinom(k, n, p)
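
    As a quick sanity check of the summary above, the explicit binomial formula can be compared with dbinom(). This is only a sketch: the helper name binom_formula and the choice $n=4, p=0.5$ are illustrative assumptions, not part of the course material.

    ```r
    # Hypothetical helper: the binomial formula written out term by term.
    binom_formula <- function(k, n, p) {
      choose(n, k) * p^k * (1 - p)^(n - k)
    }

    n <- 4
    p <- 0.5
    k <- 0:n

    by_formula <- binom_formula(k, n, p)
    by_dbinom  <- dbinom(x = k, size = n, prob = p)

    # The two rows should agree for every k.
    print(rbind(k, by_formula, by_dbinom))
    print(all.equal(by_formula, by_dbinom))
    ```

    Note the argument order of dbinom(): the number of successes comes first, then the number of trials, then the success probability.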

    UNGRADED EXERCISE SET A

    PROBLEM 1

    I toss a coin 4 times. Find the chance of getting:

    1A the sequence $HTHT$

    1B 2 heads

    1C more heads than tails

    Solution

    1A) $$P(\text{HTHT})=\frac{1}{2^4}=0.0625$$

    1B) Binomial distribution $n=4, k=2, p=0.5$: $$P(\text{two heads of four tosses})=C_{n}^{k}\cdot p^k\cdot (1-p)^{n-k}=C_{4}^{2}\times0.5^4=0.375$$ R code:

    > dbinom(x = 2, size = 4, prob = 0.5)
    [1] 0.375

    1C) Binomial distribution $n=4, k=3,4, p=0.5$: $$P(\text{more heads than tails})=P(\text{3 heads of 4 tosses})+P(\text{4 heads of 4 tosses})$$ $$=\sum_{k=3}^{4}C_{4}^{k}\cdot 0.5^k\cdot (1-0.5)^{4-k}=0.25+0.0625=0.3125$$ R code:

    > sum(dbinom(x = 3:4, size = 4, prob = 0.5))
    [1] 0.3125
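
    With only $2^4=16$ equally likely outcomes, all three answers to Problem 1 can also be checked by brute-force enumeration. The following is a minimal sketch; the variable names and the 1 = head, 0 = tail coding are assumptions made for illustration.

    ```r
    # Enumerate all 2^4 equally likely outcomes of 4 tosses (1 = head, 0 = tail).
    tosses <- expand.grid(t1 = 0:1, t2 = 0:1, t3 = 0:1, t4 = 0:1)
    heads  <- rowSums(tosses)

    # 1A: the exact sequence HTHT
    p_htht <- mean(tosses$t1 == 1 & tosses$t2 == 0 & tosses$t3 == 1 & tosses$t4 == 0)
    # 1B: exactly 2 heads
    p_two_heads <- mean(heads == 2)
    # 1C: more heads than tails, i.e. 3 or 4 heads
    p_more_heads <- mean(heads > 2)

    print(c(p_htht, p_two_heads, p_more_heads))  # 0.0625 0.3750 0.3125
    ```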

    PROBLEM 2

    A random number generator draws at random with replacement from the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. Find the chance that the digit 5 appears on more than 11% of the draws, if:

    2A 100 draws are made

    2B 1000 draws are made

    Solution

    2A) Binomial distribution $n=100, k=12:100, p=0.1$: $$P(\text{digit 5 appears on more than 11\% of 100 draws})$$ $$=\sum_{k=12}^{100}C_{100}^{k}\cdot 0.1^k\cdot (1-0.1)^{100-k}=0.2969669$$ R code:

    > sum(dbinom(x = 12:100, size = 100, prob = 0.1))
    [1] 0.2969669
    > # alternatively, using the "pbinom" function
    > pbinom(q = 100, size = 100, p = 0.1) - pbinom(q = 11, size = 100, p = 0.1)
    [1] 0.2969669

    2B) Binomial distribution $n=1000, k=111:1000, p=0.1$: $$P(\text{digit 5 appears on more than 11\% of 1000 draws})$$ $$=\sum_{k=111}^{1000}C_{1000}^{k}\cdot 0.1^k\cdot (1-0.1)^{1000-k}=0.1347765$$ R code:

    > sum(dbinom(x = 111:1000, size = 1000, prob = 0.1))
    [1] 0.1347765
    > # Alternatively
    > pbinom(q = 1000, size = 1000, p = 0.1) - pbinom(q = 110, size = 1000, p = 0.1)
    [1] 0.1347765
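
    Both parts of Problem 2 ask for an upper tail $P(X>q)$, which pbinom() can return directly through its lower.tail argument. A short sketch (the variable names p_2a and p_2b are illustrative only):

    ```r
    # P(X > 11) for 100 draws and P(X > 110) for 1000 draws, with p = 0.1.
    p_2a <- pbinom(q = 11,  size = 100,  prob = 0.1, lower.tail = FALSE)
    p_2b <- pbinom(q = 110, size = 1000, prob = 0.1, lower.tail = FALSE)

    # Cross-check against the explicit sums used in the solutions.
    sum_2a <- sum(dbinom(x = 12:100,   size = 100,  prob = 0.1))
    sum_2b <- sum(dbinom(x = 111:1000, size = 1000, prob = 0.1))

    print(c(p_2a, sum_2a))
    print(c(p_2b, sum_2b))
    ```

    Using lower.tail = FALSE avoids subtracting two cumulative probabilities by hand.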

    PROBLEM 3

    A die is rolled 12 times. Find the chance that the face with six spots appears once among the first 6 rolls, and once among the next 6 rolls.

    Solution

    The first six rolls and the second six rolls are independent, and the number of sixes in each block follows a binomial distribution with $n=6, k=1, p=\frac{1}{6}$: $$P(\text{once among first 6 rolls \& once among second 6 rolls})$$ $$=P(\text{once among first 6 rolls})\times P(\text{once among second 6 rolls})$$ $$=C_{6}^{1}\times\frac{1}{6}\times(1-\frac{1}{6})^5\times C_{6}^{1}\times\frac{1}{6}\times(1-\frac{1}{6})^5=0.1615056$$ R code:

    > dbinom(x = 1, size = 6, prob = 1/6) ^ 2
    [1] 0.1615056

    PROBLEM 4

    A quiz consists of 20 true-false questions. The score for each question is 1 point if it is answered correctly, and 0 otherwise.

    4A Suppose a student guesses the answer to Question 1 on the test by tossing a coin: if the coin lands Heads, she answers True, and if it lands Tails, she answers False. What is the chance that she gets the right answer?

    4B Suppose a student guesses the answers to both Questions 1 and 2 as described in 4A, using a different toss for each question. Are the events “gets the right answer to Question 1” and “gets the right answer to Question 2” independent?

    4C To get an A grade on the test, you need a total score of more than 16 points. One of the students knows the correct answer to 6 of the 20 questions. The rest she guesses at random by tossing a coin (one toss per question, as in 4B). What is the chance that she gets an A grade on the test?

    Solution

    4A) No matter what the right answer is, the chance that the coin picks that answer is $\frac{1}{2}$.

    4B) Yes, they are independent. No matter what the pair of correct answers is $(TT, TF, FT, FF)$, the chance that the student gets both right is $$P(\text{Q1 \& Q2 are right})=\frac{1}{4}=\frac{1}{2}\times\frac{1}{2}=P(\text{Q1 is right})\cdot P(\text{Q2 is right})$$

    4C) An A requires at least 17 points. She already has 6 points from the questions she knows, so from the remaining 14 questions she needs at least 11 points. Binomial distribution $n=14, k=11:14, p=0.5$: $$P(\text{at least 11 are right among 14 questions})$$ $$=\sum_{k=11}^{14}C_{14}^{k}\cdot0.5^k\cdot(1-0.5)^{14-k}=0.02868652$$ R code:

    > sum(dbinom(x = 11:14, size = 14, prob = 0.5))
    [1] 0.02868652
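
    Because every guess is a fair coin toss, all $2^{14}$ guess patterns are equally likely, so the answer to 4C can also be obtained by counting. A minimal sketch under that observation (the variable names are illustrative):

    ```r
    # Number of ways to get 11, 12, 13 or 14 of the 14 guesses right.
    favourable <- sum(choose(14, 11:14))   # 364 + 91 + 14 + 1 = 470

    # Each of the 2^14 guess patterns has the same probability.
    p_grade_a <- favourable / 2^14

    print(favourable)
    print(p_grade_a)
    ```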

    PROBLEM 5

    A die has one red face, two blue faces, and three green faces. It is rolled 5 times. Find the chance that the red face appears on one of the rolls and the remaining rolls are green. [Careful what you multiply. The most straightforward method is to follow the derivation of the binomial formula.]

    Solution

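    This follows the derivation of the binomial formula, except that the two probabilities no longer sum to 1: $C_{n}^{k}\cdot {p_1}^k\cdot {p_2}^{n-k}$, where $n=5, k=1, p_1=\frac{1}{6}, p_2=\frac{3}{6}$. There are $C_{5}^{1}$ positions for the red roll, and each such sequence has probability $\frac{1}{6}\times(\frac{3}{6})^4$: $$P(\text{1 red and 4 green among 5 rolls})=C_{5}^{1}\times\frac{1}{6}\times(\frac{3}{6})^4=0.05208333$$ R code: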

    > choose(5, 1) * (1/6) * (3/6)^4
    [1] 0.05208333
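
    The same number can be cross-checked with the multinomial distribution, which the notes do not use but which base R provides as dmultinom(). In this sketch the count vector is ordered (red, blue, green), an ordering chosen here purely for illustration.

    ```r
    # Direct formula from the solution above.
    p_formula <- choose(5, 1) * (1/6) * (3/6)^4

    # Multinomial cross-check: 1 red, 0 blue, 4 green in 5 rolls,
    # with face probabilities 1/6, 2/6 and 3/6.
    p_multinom <- dmultinom(x = c(1, 0, 4), prob = c(1, 2, 3) / 6)

    print(c(p_formula, p_multinom))
    ```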

    Summary

    • Hypergeometric distribution $$\frac{C_{m}^{x}\cdot C_{n}^{k-x}}{C_{m+n}^{k}}$$ R function:
      dhyper(x, m, n, k)
    • Geometric distribution $$p\cdot(1-p)^x$$ R function:
      dgeom(x, p)
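
    The two formulas above can be checked against dhyper() and dgeom() in the same way as the binomial one. This is a sketch only: the helper names hyper_formula and geom_formula and the test values are assumptions for illustration. In dgeom(), x counts the failures before the first success.

    ```r
    # Hypothetical helpers: the two summary formulas written out directly.
    hyper_formula <- function(x, m, n, k) {
      choose(m, x) * choose(n, k - x) / choose(m + n, k)
    }
    geom_formula <- function(x, p) {
      p * (1 - p)^x
    }

    # Hypergeometric: red cards in a 5-card hand from 26 red and 26 black cards.
    x_h <- 0:5
    hyper_ok <- all.equal(hyper_formula(x_h, m = 26, n = 26, k = 5),
                          dhyper(x = x_h, m = 26, n = 26, k = 5))

    # Geometric: number of failures before the first success, with p = 0.01.
    x_g <- 0:10
    geom_ok <- all.equal(geom_formula(x_g, p = 0.01),
                         dgeom(x = x_g, prob = 0.01))

    print(c(hyper_ok, geom_ok))
    ```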

    UNGRADED EXERCISE SET B

    PROBLEM 1

    A poker hand consists of 5 cards dealt at random without replacement from a standard deck of 52 cards of which 26 are red and the rest black. A poker hand is dealt. Find the chance that the hand contains three red cards and two black cards.

    Solution

    Hypergeometric distribution $x=3, m=26, n=26, k=5$: $$P(\text{3 red and 2 black})=\frac{C_{m}^{x}\cdot C_{n}^{k-x}}{C_{m+n}^{k}}=\frac{C_{26}^{3}\cdot C_{26}^{2}}{C_{52}^{5}}=0.3251301$$ R code:

    > dhyper(x = 3, m = 26, n = 26, k = 5)
    [1] 0.3251301

    PROBLEM 2

    In a population of 500 voters, 40% belong to Party X. A simple random sample of 60 voters is taken. What is the chance that a majority (more than 50%) of the sampled voters belong to Party X?

    Solution

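    Party X has $500\times40\%=200$ voters, and a majority of a sample of 60 means at least 31. Hypergeometric distribution $x=31:60, m=200, n=300, k=60$: $$P(\text{majority of sampled voters belong to Party X})$$ $$=\sum_{x=31}^{60}\frac{C_{m}^{x}\cdot C_{n}^{k-x}}{C_{m+n}^{k}}=\frac{\sum_{x=31}^{60}C_{200}^{x}\cdot C_{300}^{60-x}}{C_{500}^{60}}=0.0348151$$ R code: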

    > sum(dhyper(x = 31:60, m = 200, n = 300, k = 60))
    [1] 0.0348151
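
    As with the binomial case, the upper tail can be requested directly from the cumulative function phyper(). A short sketch (the variable names are illustrative):

    ```r
    # P(X > 30): more than 30 Party X voters in a sample of 60
    # drawn without replacement from 200 Party X and 300 other voters.
    p_tail <- phyper(q = 30, m = 200, n = 300, k = 60, lower.tail = FALSE)

    # Cross-check against the explicit sum used in the solution.
    p_sum <- sum(dhyper(x = 31:60, m = 200, n = 300, k = 60))

    print(c(p_tail, p_sum))
    ```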

    PROBLEM 3

    In an egg carton there are 12 eggs, of which 9 are hard-boiled and 3 are raw. Six of the eggs are chosen at random to take to a picnic (yes, the draws are made without replacement). Find the chance that at least one of the chosen eggs is raw.

    Solution

    Hypergeometric distribution $x=1:3, m=3, n=9, k=6$: $$P(\text{at least one is raw})=\sum_{x=1}^{3}\frac{C_{m}^{x}\cdot C_{n}^{k-x}}{C_{m+n}^{k}}=\frac{\sum_{x=1}^{3}C_{3}^{x}\cdot C_{9}^{6-x}}{C_{12}^{6}}=0.9090909$$ R code:

    > sum(dhyper(x = 1:3, m = 3, n = 9, k = 6))
    [1] 0.9090909
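
    "At least one" problems are often shorter via the complement: the only way to fail is to pick all six eggs from the nine hard-boiled ones. A minimal sketch of that route (the variable names are illustrative):

    ```r
    # P(no raw egg): all 6 chosen eggs come from the 9 hard-boiled ones.
    p_none_raw <- choose(9, 6) / choose(12, 6)   # 84 / 924

    # Complement rule.
    p_at_least_one <- 1 - p_none_raw

    print(p_at_least_one)
    ```

    This equals $1-\frac{84}{924}=\frac{10}{11}$, in agreement with the sum above.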

    PROBLEM 4

    A box contains 8 dark chocolates, 8 white chocolates, and 8 milk chocolates. I choose chocolates at random (yes, without replacement; I’m eating them). What is the chance that I have chosen 20 chocolates and still haven’t got all the dark ones?

    Solution

    Hypergeometric distribution $x=0:7, m=8, n=16, k=20$: $$P(\text{fewer than 8 dark chocolates})=\sum_{x=0}^{7}\frac{C_{m}^{x}\cdot C_{n}^{k-x}}{C_{m+n}^{k}}=\frac{\sum_{x=0}^{7}C_{8}^{x}\cdot C_{16}^{20-x}}{C_{24}^{20}}=0.828722$$ Equivalently, it is the complement of getting all 8 dark chocolates. R code:

    > sum(dhyper(x = 0:7, m = 8, n = 16, k = 20))
    [1] 0.828722
    > 1-dhyper(x = 8, m = 8, n = 16, k = 20)
    [1] 0.828722

    PROBLEM 5

    I throw darts repeatedly. Assume that on each throw I have a 1% chance of hitting the bullseye, independently of all other throws. (Note that this implies for example that repetition doesn’t help my aim get any better; in my case that might not be such a bad assumption.) Find the chance that it takes me more than 100 throws to hit the bullseye.

    Solution

    Needing more than 100 throws means that the first 100 throws all miss, so $$P(\text{more than 100 throws to hit the bullseye})=(1-0.01)^{100}=0.3660323$$ Alternatively, take the complement of "hits the bullseye within 100 throws" (geometric distribution with $x=0:99$ misses before the first hit, $p=0.01$): $$P(\text{more than 100 throws to hit the bullseye})$$ $$=1-P(\text{at most 100 throws to hit the bullseye})$$ $$=1-\sum_{x=0}^{99}(1-0.01)^x\cdot0.01=0.3660323$$ R code:

    > 1 - sum(dgeom(x = 0:99, prob = 0.01))
    [1] 0.3660323
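
    The two routes in the solution can be compared side by side, together with the cumulative function pgeom(), which returns the upper tail directly. A sketch only; the variable names are illustrative.

    ```r
    # Route 1: the first 100 throws all miss.
    p_direct <- (1 - 0.01)^100

    # Route 2: complement of the summed geometric probabilities.
    p_complement <- 1 - sum(dgeom(x = 0:99, prob = 0.01))

    # Route 3: P(more than 99 misses before the first hit) via pgeom().
    p_pgeom <- pgeom(q = 99, prob = 0.01, lower.tail = FALSE)

    print(c(p_direct, p_complement, p_pgeom))
    ```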

    PROBLEM 6

    If you bet on “red” at roulette, you have chance 18/38 of winning. (There will be more on roulette later in the course; for now, just treat it as a generic gambling game.) Suppose you make a sequence of independent bets on “red” at roulette, with the decision that you will stop playing once you’ve won 5 times. What is the chance that after 15 bets you are still playing?

    Solution

    Still playing after 15 bets means "at most 4 wins within the first 15 bets", so the number of wins follows a binomial distribution with $n=15, k=0:4, p=\frac{18}{38}$: $$P(\text{at most 4 wins within 15 bets})$$ $$=\sum_{k=0}^{4}C_{15}^{k}\cdot(\frac{18}{38})^k\cdot(1-\frac{18}{38})^{15-k}=0.08739941$$ R code:

    > sum(dbinom(x = 0:4, size = 15, prob = 18/38))
    [1] 0.08739941
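
    The same event can be phrased in terms of waiting time: the 5th win arrives after bet 15 exactly when more than 10 losses come before the 5th win. That is a negative binomial tail, which the notes do not use but which R exposes as pnbinom() (it counts failures before the size-th success). A hedged sketch of this alternative view:

    ```r
    p_win <- 18 / 38

    # Binomial view from the solution: at most 4 wins in 15 bets.
    p_binom <- sum(dbinom(x = 0:4, size = 15, prob = p_win))

    # Negative binomial view: more than 10 losses before the 5th win.
    p_nbinom <- pnbinom(q = 10, size = 5, prob = p_win, lower.tail = FALSE)

    print(c(p_binom, p_nbinom))
    ```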

    PROBLEM 7

    A school is running a raffle. There are 100 tickets, of which 3 are winners. You can assume that tickets are sold by drawing at random without replacement from the available tickets. Teacher X buys 10 raffle tickets, and so does Teacher Y. Find the chance that one of those two teachers gets all three winning tickets.

    Solution

    Hypergeometric distribution $x=3, m=3, n=97, k=10$. The two events are mutually exclusive, because there are only three winning tickets and both teachers cannot hold all of them, so their probabilities add: $$P(\text{teacher X or teacher Y gets all three winning tickets})$$ $$=P(\text{teacher X gets three winning tickets})+P(\text{teacher Y gets three winning tickets})$$ $$=2\times\frac{C_{m}^{x}\cdot C_{n}^{k-x}}{C_{m+n}^{k}}=2\times\frac{C_{3}^{3}\cdot C_{97}^{7}}{C_{100}^{10}}=0.00148423$$ R code:

    > 2 * dhyper(x = 3, m = 3, n = 97, k = 10)
    [1] 0.00148423
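
    The ratio $\frac{C_{97}^{7}}{C_{100}^{10}}$ simplifies to $\frac{10}{100}\times\frac{9}{99}\times\frac{8}{98}$: the chance that each winning ticket in turn lands among one teacher's 10 tickets. A small sketch checking that simplification (the variable names are illustrative):

    ```r
    # Chance that all three winning tickets fall among one teacher's 10 tickets.
    p_one_teacher <- (10 / 100) * (9 / 99) * (8 / 98)

    # The two teachers' events are mutually exclusive, so the chances add.
    p_either <- 2 * p_one_teacher

    # Cross-check with the counting formula and with dhyper().
    p_counting <- 2 * choose(97, 7) / choose(100, 10)
    p_dhyper   <- 2 * dhyper(x = 3, m = 3, n = 97, k = 10)

    print(c(p_either, p_counting, p_dhyper))
    ```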


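    Author: Zhao Yin (赵胤)
    Source: http://www.cnblogs.com/zhaoyin/
    Copyright of this article is shared by the author and cnblogs (博客园). Reposting is welcome, but unless the author agrees otherwise, this notice must be kept and a link to the original article must be shown prominently on the page; otherwise the author reserves the right to pursue legal liability.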

  • Original article: https://www.cnblogs.com/zhaoyin/p/4187955.html