Part 1
The following random experiment is described. There are 5 identical bags, 4 of which contain 4 read beads and 96 black ones, the 5th instead has 7 and 93 resp. Select one bag according to the uniform distribution and sample three beads, 1 red and 2 black. What is the probability that the selected bag was the 5th?Part 2
Let's go back to the initial condition, pick one bag and then pick one bead at a time from that bag and stop when the probability of having picked the 5th bag is greater than 1/2. In the best case, how many beads do you need to pick?Solution
Let \(N\) be the number of beads in each bag, \(n\) the size of the sample, \(m\) and \(m'\) the number of red beads in each bag with \(m' > m\) and \(k\) the number of read beads in the sample. \(N=100\), \(n = 3\), \(m' = 7\), \(m = 4\) and \(k = 1\) in the first part of the interview challenge, with \(n\) and \(k\) becoming variable in the second part. Let \(X\) be the random variable corresponding to the number of red beads present in a sample. Conditional to the knowledge of the bag from which the extraction occurred, this variable has an hypergeometric distribution. Let M be the random variable corresponding to the number of red beads in the chosen bag.
\[
P(X = k|M=m) = \frac{\binom{m}{m}\binom{N-m}{n-k}}{\binom{N}{n}}
\]
that is \(X\) follows the hypergeometric distribution with parameters
\(N,n,m\) conditional to having selected a type of bag and
\[
P(M=m) = 4/5\]
\[
P(M=m') = 1/5
\]
assuming the uniform distribution in bag selection.
We are interested in:
\[
P(M = m'|X=k)
\]
that is distribution over bag types conditional to the outcome of a random draw.
Using the definition of conditional probability we have
\[
P(M = m'|X = k) = \frac{P(M = m' \wedge X = k)}{P(X = k)}
\]
and applying the same definition again we have
\[
P(M = m'|X = k) = \frac{P(X = k | M = m')P(M = m')}{P(X = k)}
\]
The numerator is the product of a hypergeometric distribution with
parameters \(N,n,m'\) and a constant. At the numerator, we apply the
law of alternatives to get
\[
P(X = k) = P(X = k| M= m') P(M = m')+P(X = k| M= m) P(M = m)
\]
Combining the last two we have
\[P(M = m'|X = k) = \]
\[
\frac{P(X = k | M = m')P(M = m')}{P(X = k| M= m') P(M = m')+P(X = k| M= m) P(M = m)} = \]
\[
\frac{1}{1+\frac{P(X = k| M= m) P(M = m)}{P(X = k| M= m') P(M = m')}} = \]
\[
\frac{1} {1+ \frac{\frac{\binom{m}{m}\binom{N-m}{n-k}}{\binom{N}{n}} \frac{4}{5}} {\frac{\binom{m}{m}\binom{N-m'}{n-k}}{\binom{N}{n}} \frac{1}{5} }}
\]
which can be simplified to
\[\frac{1}{1+\frac{4\binom{m}{k}\binom{N-m}{n-k}}{\binom{m'}{k}\binom{N-m'}{n-k}}}\qquad(1)
\]
We need only to substitute in the values to obtain the desired probability.
Now to the second part of the challenge, whereby one needs to find the smallest \(n\) such that for some \(k\) the above expression is greater than 1/2 (with the other parameters as before and within the allowed range for \(n\) and \(k\)). We want to show that the solution is \(n=k=3\) and we will do it in two parts. First we will show that Eq. (1) is maximized, for any given \(n\), when \(k=n\), which supports the intuition that the red bead rich bag will be more promptly identified when all the sampled beads are red. Then we will show that the smallest \(n\) such that Eq. (1) with \(k=n\) is greater than 1/2 is 3. To establish the first part we will show that
\[\frac{1}{1+\frac{4\binom{m}{k'}\binom{N-m}{n-k'}}{\binom{m'}{k'}\binom{N-m'}{n-k'}}} \ge
\frac{1}{1+\frac{4\binom{m}{k}\binom{N-m}{n-k}}{\binom{m'}{k}\binom{N-m'}{n-k}}} \qquad (2)
\]
if and only if \(k' \ge k\). We will prove the special case \(k' = k + 1\) from which the general case follows by induction.
By simple algebraic manipulation and substituting \(k'\) Eq. (2) is equivalent to:
\[\frac{\binom{m}{k+1}\binom{N-m}{n-k-1}}{\binom{m'}{k+1}\binom{N-m'}{n-k-1}} < \frac{\binom{m}{k}\binom{N-m}{n-k}}{\binom{m'}{k}\binom{N-m'}{n-k}} \]
Expanding the binomial coefficients we get:
\[\frac{\frac{m!}{(k+1)!(m-k-1)!}\frac{(N-m)!}{(n-k-1)!(N-m-n+k+1)!}} {\frac{m'!}{(k+1)!(m'-k-1)!}\frac{(N-m')!}{(n-k-1)!(N-m'-n+k+1)!}} < \frac{\frac{m!}{k!(m-k)!}\frac{(N-m)!}{(n-k)!(N-m-n+k)!}} {\frac{m'!}{k!(m'-k)!}\frac{(N-m')!}{(n-k)!(N-m'-n+k)!}} \]
By simple algebraic manipulations we have:
\[\frac{N-m'-n+k+1}{N-m-n+k+1} < \frac{m'-k}{m-k} \]
Since \(m' > m\) we can upper bound the left side with 1 and lower bound
the left side with 1, which completes this part of the proof.
Now we have established Eq. (2), we know that Eq. (1) is
maximized, for every \(n\), by setting \(k=n\). With this substitution
our goal becomes:
\[\frac{1}{1+\frac{4\binom{m}{n}}{\binom{m'}{n}}} \ge \frac{1}{2}
\]
which is equivalent to
\[
4\binom{m}{n}\le \binom{m'}{n}
\]
Substituting in the values of \(m\) and \(m'\) and trying
\(n \in \{1,2,3\}\) we find that \(n=3\) is the solution.
0 comments:
Post a Comment