Introduction to Probability & Statistics

Assignment 4, 2025/26

Instructions

Submit your answers to the four questions marked Hand-in. You should upload your solutions to the VLE as a single pdf file. Marks will be awarded for clear, logical explanations, as well as for correctness of solutions.
Solutions to the questions not marked Hand-in have been released at the same time as this assignment, in case you want to check your answers or need a hint.
You should also look at the other questions in preparation for your Week 9 seminar.

Starters

These questions should help you to gain confidence with the basics.

S1. Let \(X\) be a random variable with \(\mathbb{E}\left[X\right]=5\). What is the expectation of \(3X+5\)? If furthermore \(\mathbb{E}\left[X^2\right]=30\), what is the variance of \(X\)?

Answer

We can use the linearity of expectation to find that \(\mathbb{E}\left[3X+5\right] =3\mathbb{E}\left[X\right]+5=20\). The variance is \(\textup{Var}\left(X\right)=\mathbb{E}\left[X^2\right]-\mathbb{E}\left[X\right]^2=30-5^2=5\).

S2. I arrive at the train station at 12.00 exactly. My train departs at a time which follows a (continuous) uniform distribution on the interval [11.55, 12.15]. What is the probability that I miss my train?

Answer

Let \(X\) denote the random time after 11.55 at which the train leaves. The question tells us that \(X\sim \mbox{\textup{Uniform}}[0,20]\). I miss the train if \(X<5\), which has probability \[ \mathbb{P}\left(X<5\right) = \int_0^5 \frac{1}{20} dx = \frac{1}{4} \,. \]
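If you would like to check answers like this by simulation, here is a quick Monte Carlo sketch in Python (not part of the required solution; the seed, sample size and variable names are our own choices):

```python
import random

random.seed(1)

# Monte Carlo sanity check: X ~ Uniform[0, 20] is the departure time in
# minutes after 11.55, and we estimate P(X < 5).
trials = 100_000
estimate = sum(random.uniform(0, 20) < 5 for _ in range(trials)) / trials
print(estimate)  # should be close to 1/4
```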

S3. Hand-in
Buses leave campus for the train station every 20 minutes, at 5, 25, and 45 minutes past the hour. If a student arrives at the bus stop at a time that follows a (continuous) uniform distribution on the interval between 09.00 and 09.35, find the probability that they wait

  1. less than 5 minutes for a bus;
  2. at least 10 minutes for a bus.
Answer

Let \(Y\) denote the number of minutes past 09.00 that the student arrives at the bus stop: \(Y\sim\mbox{\textup{Uniform}}[0,35]\). [1 mark]

  1. They will wait less than 5 minutes if and only if \(0\leq Y\leq 5\) or \(20\leq Y\leq 25\). This occurs with probability

\[ \mathbb{P}\left(0\leq Y\leq 5\right) + \mathbb{P}\left(20\leq Y\leq 25\right) = \int_{0}^{5}\frac{1}{35}dy + \int_{20}^{25}\frac{1}{35}dy = \frac{2}{7}\,. \]

[2 marks]

  2. Similarly, they will wait at least 10 minutes if they arrive between 09.05 and 09.15, or between 09.25 and 09.35. This has probability \(20/35 = 4/7\). [2 marks]
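The same kind of simulation works for the bus problem; the helper `wait_time` below is ours, with buses at 5, 25 and 45 minutes past 09.00:

```python
import random

random.seed(2)

# Buses leave at 5, 25 and 45 minutes past 09.00; arrival times are
# Uniform[0, 35] in minutes past 09.00.
def wait_time(arrival):
    return min(b - arrival for b in (5, 25, 45) if b >= arrival)

trials = 100_000
waits = [wait_time(random.uniform(0, 35)) for _ in range(trials)]
p_short = sum(w < 5 for w in waits) / trials   # target 2/7
p_long = sum(w >= 10 for w in waits) / trials  # target 4/7
print(round(p_short, 3), round(p_long, 3))
```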

S4. Suppose that you have a lecture at 14.00, and that the time taken to travel from your room to the lecture theatre is normally distributed with mean 30 minutes and standard deviation 4 minutes. What is the latest time you should leave your room if you want to be 99% certain that you will not miss the start of the lecture? (Hint: if \(Z\sim\mbox{\textup{N}}(0,1)\) then the R function qnorm(p) returns the value \(z\in\mathbb{R}\) such that \(\mathbb{P}\left(Z\le z\right) = p\).)

Answer

Let \(X\) denote the travel time to the lecture: \(X\sim\mbox{\textup{N}}(30,16)\). We wish to find \(x\) such that \(\mathbb{P}\left(X\leq x\right) = 0.99\). Now, \[ \mathbb{P}\left(X\leq x\right) = \mathbb{P}\left(\frac{X-30}{4} \leq \frac{x-30}{4}\right) = \mathbb{P}\left(Z\leq \frac{x-30}{4}\right) \] where \(Z\sim\mbox{\textup{N}}(0,1)\).

We can get hold of this value of \(x\) by using R (or by consulting statistical tables): qnorm(0.99) gives the value 2.326, meaning that \(\mathbb{P}\left(Z\leq 2.326\right) = 0.99\). Thus we require \((x-30)/4 = 2.326\), i.e. \(x = 30 + 4\times 2.326 \approx 39.3\). So the latest you should leave your room is 39.3 minutes before the start of the lecture; rounding this margin up to 40 minutes, you should leave by 13.20.
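If you prefer Python to R, the standard library's statistics.NormalDist offers inv_cdf as the analogue of qnorm; here is a quick check of the numbers above:

```python
from statistics import NormalDist

# statistics.NormalDist.inv_cdf plays the role of R's qnorm.
z = NormalDist().inv_cdf(0.99)                 # ≈ 2.326
x = NormalDist(mu=30, sigma=4).inv_cdf(0.99)   # ≈ 39.3 minutes
print(round(z, 3), round(x, 1))
```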

S5. Give an example of a joint probability table for two discrete random variables \(X\) and \(Y\), each having only two possible values, so that \(F_{X,Y}(5,6)=0.4, F_X(5)=0.5, F_Y(6)=0.6\) and \(\mathbb{E}\left[X\right]=10, \mathbb{E}\left[Y\right]=4\).

Answer

One possible example would be

\(y\backslash x\) 0 20 \(p_Y(y)\)
0 0.4 0.2 0.6
10 0.1 0.3 0.4
\(p_X(x)\) 0.5 0.5 1
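One can verify that this table meets all five requirements exactly, for instance with Python's fractions module (the dictionary layout below is our own):

```python
from fractions import Fraction as F

# The proposed table: X takes values 0 and 20, Y takes values 0 and 10;
# joint[(x, y)] holds p_{X,Y}(x, y).
joint = {(0, 0): F(4, 10), (20, 0): F(2, 10),
         (0, 10): F(1, 10), (20, 10): F(3, 10)}

F_XY = sum(pr for (x, y), pr in joint.items() if x <= 5 and y <= 6)
F_X = sum(pr for (x, y), pr in joint.items() if x <= 5)
F_Y = sum(pr for (x, y), pr in joint.items() if y <= 6)
EX = sum(x * pr for (x, y), pr in joint.items())
EY = sum(y * pr for (x, y), pr in joint.items())
print(F_XY, F_X, F_Y, EX, EY)
```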

S6. Let \(X:\Omega\to\{1,2\}\) and \(Y:\Omega\to\{0,1\}\) be two discrete random variables. The following is a partial table of their joint and their marginal mass functions:

\(y\backslash x\) 1 2 \(p_Y(y)\)
0 \(1/6\) \(1/2\)
1
\(p_X(x)\) \(5/12\) \(1\)
  1. Fill in the missing values.
  2. Determine the joint distribution function of \(X\) and \(Y\).
  3. Calculate \(\mathbb{E}\left[X\right]\) and \(\mathbb{E}\left[Y\right]\).
  4. Let \(Z = XY\). Calculate \(\mathbb{E}\left[Z\right]\).
Answer
  1. The missing entries in the probability table are determined by the requirement that summing the joint probabilities across a row or across a column in the table gives the corresponding marginal probability and by the requirement that the marginal probabilities for \(X\) as well as those for \(Y\) have to add up to \(1\). So first we determine \(p_Y(0)=1/6+1/2=2/3\). Then we can determine \(p_Y(1)=1-p_Y(0)=1-2/3=1/3\) and \(p_X(2)=1-p_X(1)=1-5/12=7/12\). Finally we determine \(p_{X,Y}(1,1)=p_X(1)-p_{X,Y}(1,0)=5/12-1/6=1/4\) and \(p_{X,Y}(2,1)=p_X(2)-p_{X,Y}(2,0)=7/12-1/2=1/12\).
\(y\backslash x\) 1 2 \(p_Y(y)\)
0 \(1/6\) \(1/2\) \(2/3\)
1 \(1/4\) \(1/12\) \(1/3\)
\(p_X(x)\) \(5/12\) \(7/12\) \(1\)
  2. The joint distribution function \(F_{X,Y}(x,y)\) is by definition given by \(\mathbb{P}\left(X\leq x,Y\leq y\right)\). So for example \[ F_{X,Y}(1.5,1.5)=p_{X,Y}(1,0)+p_{X,Y}(1,1)=\frac{1}{6}+\frac{1}{4}=\frac{5}{12}. \] By doing more such calculations we find that \[ F_{X,Y}(x,y)=\begin{cases} 0 &\text{ if }x<1 \text{ or }y<0\\ 1/6&\text{ if }x\in[1,2) \text{ and }y\in[0,1)\\ 5/12&\text{ if }x\in[1,2) \text{ and }y\geq 1\\ 2/3&\text{ if }x\geq 2\text{ and }y\in[0,1)\\ 1&\text{ if }x\geq 2 \text{ and }y\geq 1. \end{cases} \]

  3. For calculating the expectations of \(X\) and \(Y\) we can use their marginal mass functions: \[ \mathbb{E}\left[X\right]=1\cdot p_X(1)+2\cdot p_X(2)=1\cdot \frac{5}{12}+2\cdot \frac{7}{12}=\frac{19}{12} \] and \[ \mathbb{E}\left[Y\right]=0\cdot p_Y(0)+1\cdot p_Y(1)=p_Y(1)=\frac{1}{3}\,. \]

  4. The random variable \(Z=XY\) can take the possible values \(0\), \(1\) and \(2\) with probabilities \[ \begin{split} p_Z(0)&=p_{X,Y}(1,0) + p_{X,Y}(2,0) = p_Y(0)=\frac23\\ p_Z(1)&=p_{X,Y}(1,1)=\frac14, ~~~p_Z(2)=p_{X,Y}(2,1)=\frac{1}{12}. \end{split} \] Thus \[ \mathbb{E}\left[Z\right]=1\cdot p_{Z}(1)+2\cdot p_{Z}(2)=\frac14+2\frac{1}{12}=\frac{5}{12}\,. \]
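A short exact check of these calculations, using Python's fractions module (variable names are ours):

```python
from fractions import Fraction as F

# Completed joint mass function for X in {1, 2} and Y in {0, 1}.
joint = {(1, 0): F(1, 6), (2, 0): F(1, 2),
         (1, 1): F(1, 4), (2, 1): F(1, 12)}
assert sum(joint.values()) == 1

EX = sum(x * pr for (x, y), pr in joint.items())
EY = sum(y * pr for (x, y), pr in joint.items())
EZ = sum(x * y * pr for (x, y), pr in joint.items())
print(EX, EY, EZ)  # 19/12 1/3 5/12
```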

S7. Hand-in
Let \(X:\Omega\to\{0,1\}\) and \(Y:\Omega\to\{0,1\}\) be two discrete random variables. The following is a partial table of their joint and their marginal mass functions:

\(y\backslash x\) 0 1 \(p_Y(y)\)
0 \(1/8\) \(1/4\)
1 \(1/2\)
\(p_X(x)\)
  1. Fill in the missing values.
  2. Calculate \(\mathbb{E}\left[XY\right]\).
Answer
  1. The completed table is as follows [2 marks]
\(y\backslash x\) 0 1 \(p_Y(y)\)
0 \(1/8\) \(1/8\) \(1/4\)
1 \(1/4\) \(1/2\) \(3/4\)
\(p_X(x)\) \(3/8\) \(5/8\) \(1\)
  2. Note that the random variable \(XY\) equals 0 unless \((X,Y) = (1,1)\), in which case \(XY=1\). (This means that \(XY\) has a Bernoulli distribution.) Thus \(\mathbb{E}\left[XY\right] = \mathbb{P}\left((X,Y)=(1,1)\right) = 1/2\). [3 marks]

Mains

These are important, and cover some of the most substantial parts of the course.

M1. A random variable \(W\) has probability density function \[ f_W(x) = \begin{cases} \frac{6}{5675}(5x^2+3x+11)& \qquad \text{for } 3\leq x\leq 8\\ 0& \qquad \text{otherwise.} \end{cases} \] Would you expect \(\mathbb{E}\left[W\right]\) to lie closer to 3 or to 8? Calculate \(\mathbb{E}\left[W\right]\) and check whether your intuition was correct.

Answer

Since \(f_W\) is increasing on the interval \([3, 8]\) we know from the interpretation of expectation as centre of mass that the expectation should lie closer to 8 than to 3. The computation: \[\mathbb{E}\left[W\right] = \int_3^8 x f_W(x) dx = \frac{6}{5675}\int_3^8 \left(5x^3+3x^2+11x\right) dx = \frac{2787}{454} = 6.14.\]
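The arithmetic can be double-checked exactly in Python; the function G below is an antiderivative of \(x f_W(x)\) that we introduce just for the check:

```python
from fractions import Fraction as F

# Exact evaluation of E[W]: G is an antiderivative of
# x * f_W(x) = (6/5675)(5x^3 + 3x^2 + 11x).
def G(x):
    x = F(x)
    return F(6, 5675) * (F(5, 4) * x**4 + x**3 + F(11, 2) * x**2)

EW = G(8) - G(3)
print(EW, float(EW))  # 2787/454 ≈ 6.14
```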

M2. Let \(X\sim\mbox{\textup{Geom}}(p)\). Calculate \(\mathbb{E}\left[h(X)\right]\), where \(h(x) = e^{tx}\) for some \(t>0\). For what values of \(t\) is \(\mathbb{E}\left[h(X)\right]<\infty\)?

Answer

We use the formula for the expectation of a function of a discrete random variable: \[ \begin{split} \mathbb{E}\left[h(X)\right] &= \sum_{k=1}^\infty h(k)p(1-p)^{k-1} = \sum_{k=1}^\infty e^{tk}p(1-p)^{k-1} \\ &= pe^t \sum_{k=1}^\infty \left[e^t(1-p)\right]^{k-1} = pe^t \sum_{k=0}^\infty \left[e^t(1-p)\right]^k \\ &= \frac{pe^t}{1-e^t(1-p)}\,. \end{split} \]

This final step requires \(e^t(1-p)<1\); otherwise the geometric sum does not converge to a finite limit. Rearranging, we see that \(\mathbb{E}\left[h(X)\right]<\infty\) precisely when \(t<-\ln(1-p)\).
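A numerical sanity check of the closed form, comparing it with a long partial sum for one (arbitrary) choice of \(p\) and \(t\) satisfying the convergence condition:

```python
import math

# Closed form p e^t / (1 - e^t (1 - p)) versus a long partial sum of
# E[e^{tX}] for X ~ Geom(p); p = 0.3 and t = 0.2 are arbitrary values
# satisfying e^t (1 - p) < 1.
p, t = 0.3, 0.2
assert math.exp(t) * (1 - p) < 1

closed_form = p * math.exp(t) / (1 - math.exp(t) * (1 - p))
partial_sum = sum(math.exp(t * k) * p * (1 - p) ** (k - 1)
                  for k in range(1, 200))
print(round(closed_form, 6), round(partial_sum, 6))
```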

M3. Hand-in
Show that if \(Z\) is a standard normal random variable then, for \(x>0\),

  1. \(\mathbb{P}\left(Z>x\right)=\mathbb{P}\left(Z<-x\right)\);
  2. \(\mathbb{P}\left(|Z|>x\right)=2\mathbb{P}\left(Z>x\right)\);
  3. \(\mathbb{P}\left(|Z|<x\right)=2\mathbb{P}\left(Z<x\right)-1\).

Hint: for part (a), express the probabilities in terms of integrals over the density function \(\phi\), and use the fact that \(\phi\) is an even function (i.e. \(\phi(z) = \phi(-z)\)).

Answer

There are many ways to show these identities. We use the hint about the symmetry of the density function of a standard normal random variable: \[\phi(-z)=\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{(-z)^2}{2}\right) =\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{z^2}{2}\right)=\phi(z).\]

  1. \[\mathbb{P}\left(Z>x\right)=\int_{x}^\infty \phi(z)dz=\int_{-\infty}^{-x} \phi(-u)du=\int_{-\infty}^{-x} \phi(u)du=\mathbb{P}\left(Z<-x\right);\] [1 mark]

  2. \[\mathbb{P}\left(|Z|>x\right)=\mathbb{P}\left(Z>x\right)+\mathbb{P}\left(Z<-x\right)=2\mathbb{P}\left(Z>x\right)\,, \] where the last equality follows from part (a). [2 marks]

  3. \[ \begin{split} \mathbb{P}\left(|Z|<x\right)&=1-\mathbb{P}\left(|Z|>x\right)=1-2\mathbb{P}\left(Z>x\right)\\ &=1-2\left(1-\mathbb{P}\left(Z<x\right)\right)=2\mathbb{P}\left(Z<x\right)-1\,, \end{split} \] where the second equality follows from part (b). [2 marks]
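These identities can also be checked numerically via the standard normal cdf \(\Phi\), writing \(\mathbb{P}\left(Z>x\right)=1-\Phi(x)\) and \(\mathbb{P}\left(Z<-x\right)=\Phi(-x)\); a small Python sketch with arbitrary test points:

```python
from statistics import NormalDist

# Numerical check of the three identities at a few x > 0.
Phi = NormalDist().cdf
for x in (0.5, 1.0, 2.33):
    a = (1 - Phi(x)) - Phi(-x)                       # part 1
    b = ((1 - Phi(x)) + Phi(-x)) - 2 * (1 - Phi(x))  # part 2
    c = (Phi(x) - Phi(-x)) - (2 * Phi(x) - 1)        # part 3
    assert max(abs(a), abs(b), abs(c)) < 1e-12
print("all three identities hold numerically")
```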

M4. Hand-in
Let \(X\sim\mbox{\textup{Exp}}(\lambda)\). Use proof by induction to show that \[ \mathbb{E}\left[X^m\right]=\frac{m!}{\lambda^m} \qquad \text{for all $m\in\mathbb{N}\cup\{0\}$.} \]

Answer

The statement is true for \(m=0\): \[ \mathbb{E}\left[X^0\right]=\mathbb{E}\left[1\right]=1=\frac{0!}{\lambda^0}. \] [1 mark]

Now suppose that the statement holds for some \(k\in\mathbb{N}\cup\{0\}\), and consider the case \(m=k+1\): \[ \begin{split} \mathbb{E}\left[X^{k+1}\right] &=\int_{-\infty}^\infty x^{k+1}f_X(x)dx=\int_{0}^\infty x^{k+1}\lambda e^{-\lambda x}dx\\ &=\int_0^\infty x^{k+1}\frac{d}{dx}\left(-e^{-\lambda x}\right)dx\\ &=-\left[x^{k+1}e^{-\lambda x}\right]^\infty_0+\int_0^\infty(k+1)x^ke^{-\lambda x}dx\\ &=0+\frac{k+1}{\lambda}\int_0^\infty x^k\lambda e^{-\lambda x}dx\\ &=\frac{k+1}{\lambda}\mathbb{E}\left[X^k\right]=\frac{k+1}{\lambda}\frac{k!}{\lambda^k}~~\text{ by our induction hypothesis}\\ &=\frac{(k+1)!}{\lambda^{k+1}}. \end{split} \] Thus the statement holds for all \(m\in\mathbb{N}\cup\{0\}\) by induction. [4 marks]
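A Monte Carlo sketch of the result for the first few moments, with an arbitrary choice of \(\lambda\) and sample size:

```python
import math
import random

random.seed(4)

# Monte Carlo check of E[X^m] = m!/lambda^m for X ~ Exp(lambda);
# lambda = 2 and the sample size are our own choices.
lam, n = 2.0, 200_000
samples = [random.expovariate(lam) for _ in range(n)]
for m in (1, 2, 3):
    estimate = sum(x ** m for x in samples) / n
    print(m, round(estimate, 3), math.factorial(m) / lam ** m)
```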

M5. Let \(X\) and \(Y\) be random variables. Show that \(\textup{Cov}\left(X,Y\right)=\mathbb{E}\left[XY\right]-\mathbb{E}\left[X\right]\mathbb{E}\left[Y\right]\).

Answer

We start from the definition of covariance, and use linearity of expectation: \[ \begin{split} \textup{Cov}\left(X,Y\right)&=\mathbb{E}\left[\left(X-\mathbb{E}\left[X\right]\right)\left(Y-\mathbb{E}\left[Y\right]\right)\right] \\ &=\mathbb{E}\left[XY-X\mathbb{E}\left[Y\right]-\mathbb{E}\left[X\right]Y+\mathbb{E}\left[X\right]\mathbb{E}\left[Y\right]\right] \\ &=\mathbb{E}\left[XY\right]-\mathbb{E}\left[X\right]\mathbb{E}\left[Y\right]-\mathbb{E}\left[X\right]\mathbb{E}\left[Y\right]+\mathbb{E}\left[X\right]\mathbb{E}\left[Y\right] \\ &=\mathbb{E}\left[XY\right]-\mathbb{E}\left[X\right]\mathbb{E}\left[Y\right]. \end{split} \]

M6. The joint probability mass function \(p_{X,Y}(x,y)\) of two random variables \(X\) and \(Y\) is summarised by the following table, where \(\eta\) is some real number:

\(x\backslash y\) -1 0 1
4 \(\eta-1/16\) \(1/4-\eta\) \(0\)
5 \(1/8\) \(3/16\) \(1/8\)
6 \(\eta+1/16\) \(1/16\) \(1/4-\eta\)
  1. Extend the table by including also the marginal probabilities, i.e., the values of the probability mass functions \(p_X\) and \(p_Y\).

  2. Which are the valid choices for \(\eta\)?

  3. Is there a value of \(\eta\) for which \(X\) and \(Y\) are independent?

Answer
  1. We extend the probability table to also include the marginal probability mass functions \(p_X\) and \(p_Y\):
\(x\backslash y\) -1 0 1 \(p_X(x)\)
4 \(\eta-1/16\) \(1/4-\eta\) \(0\) \(3/16\)
5 \(1/8\) \(3/16\) \(1/8\) \(7/16\)
6 \(\eta+1/16\) \(1/16\) \(1/4-\eta\) \(3/8\)
\(p_Y(y)\) \(2\eta+1/8\) \(1/2-\eta\) \(3/8-\eta\) \(1\)
  2. All entries of the probability table must be non-negative and they must sum up to \(1\). In order for \(p_{X,Y}(4,-1)\) to be non-negative we need \(\eta\geq 1/16\). In order for \(p_{X,Y}(4,0)\) and \(p_{X,Y}(6,1)\) to be non-negative we need \(\eta\leq 1/4\). The sum over all entries is not affected by the value of \(\eta\), so does not give any additional constraints. Therefore any \(\eta\in[1/16,1/4]\) is a valid choice.

  3. It is easy to find counterexamples to the factorisation of the joint probability mass function that would have to hold if \(X\) and \(Y\) were independent. For example \[ p_X(4)p_Y(1)=\frac{3}{16}\left(\frac{3}{8}-\eta\right)\neq 0 =p_{X,Y}(4,1) \] unless \(\eta=3/8\). However the value \(\eta=3/8\) is not allowed, and hence \(X\) and \(Y\) can never be independent.

M7. A married couple decide to have children until they have at least one child of each sex: let \(X\) denote the total number of children that they have. The probability of any one child being a boy is \(1/2\) (with the sex of each child being independent of all the others).

  1. What is the mass function of \(X\)? (I.e. write down \(\mathbb{P}\left(X=n\right)\) for all \(n\in \mathbb{N}\).)

  2. Show that \[ \mathbb{E}\left[X\right] = 3 \,. \] Hint: you may find it useful to refer to the result from lectures that if \(Y\sim\mbox{\textup{Geom}}(p)\) then \(\mathbb{E}\left[Y\right] = 1/p\).

Answer
  1. Clearly the couple need to have at least two children, so \(\mathbb{P}\left(X=1\right) = 0\). For \(n\geq 2\), there are two ways in which the couple can have exactly \(n\) children: either they have \(n-1\) boys in a row, and then a girl; or they have \(n-1\) girls and then a boy. Each of these possibilities has probability \((1/2)^n\). Thus \[ \mathbb{P}\left(X=n\right) =(1/2)^n + (1/2)^n = (1/2)^{n-1}\,, \qquad n\geq 2 \,. \]

  2. Here are two possible ways of calculating \(\mathbb{E}\left[X\right]\).

Method 1: We use the usual formula for the expectation of a discrete random variable: \[ \begin{split} \mathbb{E}\left[X\right] &= \sum_{n=2}^\infty n \mathbb{P}\left(X=n\right) = \sum_{n=2}^\infty n (1/2)^{n-1} \end{split} \]

Using the hint, we know that if \(Y \sim\mbox{\textup{Geom}}(1/2)\) then \(\mathbb{E}\left[Y\right] = 2\). That is, \[ \sum_{n=1}^\infty n (1/2)(1/2)^{n-1} = 2. \] We now manipulate our expression for the expectation, until it looks like something involving this result: \[ \begin{split} \mathbb{E}\left[X\right] &= \sum_{n=2}^\infty n (1/2)^{n-1} = 2 \sum_{n=2}^\infty n (1/2)(1/2)^{n-1} \\ &= 2 \left[\sum_{n=1}^\infty n (1/2)(1/2)^{n-1} - 1/2 \right] \\ &= 2[ 2-1/2] = 3 \,. \end{split} \]

Method 2: If we let \(Y=X-1\) then this random variable takes values in the set \(\mathbb{N}\) and has mass function \[ \mathbb{P}\left(Y=n\right) = \mathbb{P}\left(X-1=n\right) = \mathbb{P}\left(X=n+1\right) = (1/2)^n \] for \(n\in\mathbb{N}\). Thus \(Y\sim\mbox{\textup{Geom}}(1/2)\). It follows that \(\mathbb{E}\left[X\right] = \mathbb{E}\left[Y+1\right] =\mathbb{E}\left[Y\right]+1 = 2+1 = 3\). (Here we’re effectively observing that the couple start by having one child, who could be of either sex; they then need to have an additional random number of children until they have one of the opposite sex to the first – this is like repeating independent Bernoulli trials, with “success” meaning that they have a child of the opposite sex.)
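Either method can be checked by direct simulation of the family's coin flips; this sketch is ours, with True/False standing for boy/girl:

```python
import random

random.seed(7)

def children_until_both():
    # Flip fair coins until both a boy (True) and a girl (False) appear;
    # return the number of children needed.
    seen = set()
    count = 0
    while len(seen) < 2:
        seen.add(random.random() < 0.5)
        count += 1
    return count

trials = 100_000
mean = sum(children_until_both() for _ in range(trials)) / trials
print(mean)  # should be close to 3
```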

M8. Let \(X\) be a discrete random variable. Show that for all functions \(h_{1},h_{2}:\mathbb{R}\rightarrow \mathbb{R}\), \[ \mathbb{E}\left[h_1(X) + h_2(X)\right] =\mathbb{E}\left[h_1(X)\right] + \mathbb{E}\left[h_2(X)\right]\,. \]

Answer

Let \(h(x)=h_1(x)+ h_2(x)\). From the formula for the expectation of a function of a discrete random variable it follows that \[ \begin{split} \mathbb{E}\left[h(X)\right]&=\sum_{k\in X(\Omega)}h(k)p_X(k)\\ &=\sum_{k\in X(\Omega)}(h_1(k)+ h_2(k))p_X(k)\\ &=\sum_{k\in X(\Omega)}h_1(k)p_X(k)+ \sum_{k\in X(\Omega)}h_2(k)p_X(k)\\ &=\mathbb{E}\left[h_1(X)\right]+ \mathbb{E}\left[h_2(X)\right]\,. \end{split} \]

M9. Let \(X\) and \(Y\) be random variables and let \(r,s,t,u\in\mathbb{R}\). Show that \[ \rho(rX+s,tY+u)=\begin{cases}\rho(X,Y)&\text{ if }rt>0\\ 0&\text{ if }rt= 0\\-\rho(X,Y)&\text{ if }rt<0\\\end{cases} \] where \(\rho(X,Y)\) denotes the correlation coefficient of \(X\) and \(Y\).

Answer

Let us first assume that \(\textup{Var}\left(X\right)\textup{Var}\left(Y\right)>0\) and \(rt>0\). Then the definition of the correlation coefficient gives \[ \rho(rX+s,tY+u)=\frac{\textup{Cov}\left(rX+s,tY+u\right)}{\sqrt{\textup{Var}\left(rX+s\right)\textup{Var}\left(tY+u\right)}}. \tag{1}\]

We already know that \[ \textup{Var}\left(rX+s\right)=r^2\textup{Var}\left(X\right),~~~\textup{Var}\left(tY+u\right)=t^2\textup{Var}\left(Y\right). \tag{2}\]

We need to derive a similar transformation rule for the covariance. \[ \begin{split} \textup{Cov}\left(rX+s,tY+u\right)&=\mathbb{E}\left[\left(rX+s-\mathbb{E}\left[rX+s\right]\right)\left(tY+u-\mathbb{E}\left[tY+u\right]\right)\right]\\ &=\mathbb{E}\left[\left(rX+s-(r\mathbb{E}\left[X\right]+s)\right)\left(tY+u-(t\mathbb{E}\left[Y\right]+u)\right)\right]\\ &=\mathbb{E}\left[r\left(X-\mathbb{E}\left[X\right]\right)t\left(Y-\mathbb{E}\left[Y\right]\right)\right]\\ &=rt\mathbb{E}\left[\left(X-\mathbb{E}\left[X\right]\right)\left(Y-\mathbb{E}\left[Y\right]\right)\right]\\ &=rt\textup{Cov}\left(X,Y\right)\,, \end{split} \tag{3}\] where we repeatedly used the linearity of expectation. Using the transformation rules Equation 2 and Equation 3 in Equation 1 gives \[ \rho(rX+s,tY+u)=\frac{rt}{\sqrt{r^2t^2}}\frac{\textup{Cov}\left(X,Y\right)}{\sqrt{\textup{Var}\left(X\right)\textup{Var}\left(Y\right)}}. \] The statement now follows from the observation that \[ \frac{rt}{\sqrt{r^2t^2}}=\begin{cases}1&\text{ if }rt>0\\-1&\text{ if }rt<0\,.\end{cases} \] If \(\textup{Var}\left(X\right)\textup{Var}\left(Y\right)=0\) or \(rt=0\), then \(\textup{Var}\left(rX+s\right)\textup{Var}\left(tY+u\right)=r^2t^2\textup{Var}\left(X\right)\textup{Var}\left(Y\right)=0\), and thus \(\rho(rX+s,tY+u)=0\) by definition. This agrees with the statement, because when \(\textup{Var}\left(X\right)\textup{Var}\left(Y\right)=0\) we also have \(\rho(X,Y)=0\).

Desserts

Still hungry for more? Try these if you want to push yourself further. (These are mostly harder than I’d expect you to answer in an exam, or involve non-examinable material.)

D1. Prove that binomial coefficients satisfy the identity \[n{{n-1}\choose{r-1}} = r{n\choose r}.\] Use this to find \(\mathbb{E}\left[X\right]\) and \(\textup{Var}\left(X\right)\), where \(X\sim\mbox{\textup{Bin}}(n,p)\).

Answer

First we prove the identity: \[ n{{n-1}\choose{r-1}} = n \, \frac{(n-1)!}{(r-1)!(n-r)!} = r\, \frac{n!}{r!(n-r)!} = r{n\choose r} \,. \]

For the mean and variance, remember that, since \(p_X(\cdot)\) is a mass function, it must sum to one. That is,

\[ \sum_{k=0}^n \binom{n}{k}p^k(1-p)^{n-k} = 1\,. \tag{4}\]

Now, \[ \begin{split} \mathbb{E}\left[X\right] &= \sum_{k=0}^n k {n\choose k}p^k(1-p)^{n-k} \\ &= \sum_{k=1}^n n {{n-1}\choose{k-1}} p^k(1-p)^{n-k} \quad\text{(by our identity)} \\ &= np \sum_{k=1}^n {{n-1}\choose{k-1}} p^{k-1}(1-p)^{n-k} \\ &= np \sum_{j=0}^{n-1} {{n-1}\choose{j}} p^{j}(1-p)^{(n-1)-j} \quad\text{(putting $j=k-1$)} \\ &= np \,, \end{split} \] thanks to Equation 4.

Furthermore, \[ \begin{split} \mathbb{E}\left[X(X-1)\right]&=\sum_{k=0}^nk(k-1){n\choose k}p^k(1-p)^{n-k}\\ &=\sum_{k=2}^nk(k-1)\frac{n!}{k!(n-k)!}p^k(1-p)^{n-k} \quad\text{(the $k=0$ and $k=1$ terms vanish)}\\ &=\sum_{k=2}^n\frac{n!}{(n-k)!(k-2)!}p^k(1-p)^{n-k}\\ &=n(n-1)p^2\sum_{k=2}^n\frac{(n-2)!}{((n-2)-(k-2))!(k-2)!}p^{k-2}(1-p)^{(n-2)-(k-2)}\\ &=n(n-1)p^2\sum_{j=0}^{n-2}{{n-2}\choose{j}}p^j(1-p)^{(n-2)-j}\quad\text{(putting $j=k-2$)} \\ &=n(n-1)p^2, \end{split} \] again thanks to Equation 4. It follows that

\[ \mathbb{E}\left[X^2\right]=n(n-1)p^2+np \,, \]

and so \[ \textup{Var}\left(X\right)=\mathbb{E}\left[X^2\right]-\mathbb{E}\left[X\right]^2=np(1-p) \,. \]

D2. Let \(X\sim \mbox{\textup{Uniform}}(0,a)\) for some \(a>0\). Show that for any \(n\in\mathbb{N}\), \[ \mathbb{E}\left[X^n\right] =\frac{a^n}{n+1}. \] Use this to determine \(\rho(X,X^2)\), and show that this does not depend upon the value of \(a\).

Answer

For \(n\in\mathbb{N}\) we calculate \[ \mathbb{E}\left[X^n\right]=\int_{-\infty}^\infty x^nf_X(x)dx=\int_0^a\frac{x^n}{a}dx=\frac{1}{a}\left[\frac{x^{n+1}}{n+1}\right]^a_0=\frac{a^n}{n+1}.\] Now we calculate the covariance of \(X\) and \(X^2\): \[ \textup{Cov}\left(X,X^2\right)=\mathbb{E}\left[X^3\right]-\mathbb{E}\left[X\right]\mathbb{E}\left[X^2\right] = \frac{a^3}{4}-\frac{a^2}{3}\frac{a}{2}=\frac{a^3}{12}\,. \] We also have \[ \textup{Var}\left(X\right)=\mathbb{E}\left[X^2\right]-\mathbb{E}\left[X\right]^2=\frac{a^2}{3}-\left(\frac{a}{2}\right)^2=\frac{a^2}{12} \] and \[ \textup{Var}\left(X^2\right)=\mathbb{E}\left[X^4\right]-\mathbb{E}\left[X^2\right]^2=\frac{a^4}{5}-\left(\frac{a^2}{3}\right)^2=\frac{4a^4}{45}. \] Finally, we calculate \[ \rho(X,X^2)=\frac{\textup{Cov}\left(X,X^2\right)}{\sqrt{\textup{Var}\left(X\right)\textup{Var}\left(X^2\right)}}=\frac{a^3/12}{\sqrt{a^6/135}}=\frac{\sqrt{135}}{12}=\frac{\sqrt{15}}{4}, \] which doesn’t depend upon \(a\).
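Because everything here is rational apart from the final square root, the claim that \(\rho(X,X^2)\) does not depend on \(a\) can be checked exactly by computing \(\rho^2=15/16\) for several values of \(a\) (the helper below is ours):

```python
from fractions import Fraction as F
import math

# Using E[X^n] = a^n/(n+1), assemble rho(X, X^2)^2 exactly and check
# that it is the same Fraction for different values of a.
def rho_squared(a):
    a = F(a)
    m = {n: a ** n / (n + 1) for n in range(1, 5)}
    cov = m[3] - m[1] * m[2]
    var_x = m[2] - m[1] ** 2
    var_x2 = m[4] - m[2] ** 2
    return cov ** 2 / (var_x * var_x2)

assert rho_squared(2) == rho_squared(5) == F(15, 16)
print(math.sqrt(15) / 4)  # rho itself, ≈ 0.968
```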

D3. A bag contains 3 cubes, 4 pyramids and 7 spheres. An object is drawn randomly from the bag and its type is recorded. Then the object is replaced. This is repeated 20 times.

  1. Let \(C_i\) be the indicator random variable for the event that the \(i\)-th draw gives a cube, for \(i=1,\dots,20\). Calculate \(\mathbb{E}\left[C_i\right], \mathbb{E}\left[C_i^2\right]\) and \(\mathbb{E}\left[C_iC_j\right]\) for \(i\neq j\).

  2. Let \(C\) be the number of times a cube was drawn. Use that \(C=\sum_{i=1}^{20}C_i\) to calculate \(\mathbb{E}\left[C\right]\) and \(\textup{Var}\left(C\right)\).

  3. Let \(S_i\) be the indicator random variable for the event that the \(i\)-th draw gives a sphere. Calculate \(\mathbb{E}\left[C_iS_i\right]\) and \(\mathbb{E}\left[C_iS_j\right]\) for \(i\neq j\).

  4. Let \(S\) be the number of times a sphere was drawn. Use the above results to calculate \(\mathbb{E}\left[CS\right]\), \(\textup{Cov}\left(C,S\right)\), \(\rho(C,S)\).

Answer
  1. As three of the 14 shapes are cubes, the probability of drawing a cube is \(3/14\). Hence \(C_i\sim\mbox{\textup{Bern}}(3/14)\). This immediately gives \[ \mathbb{E}\left[C_i\right] = \mathbb{E}\left[C_i^2\right] = \frac{3}{14}. \] For \(i\neq j\) the event that the \(i\)-th draw gives a cube and the event that the \(j\)-th draw gives a cube are independent (because we put the shape back after each draw). Thus the indicator random variables \(C_i\) and \(C_j\) for these events are also independent, and thus \[ \mathbb{E}\left[C_iC_j\right]=\mathbb{E}\left[C_i\right]\mathbb{E}\left[C_j\right]=\left(\frac{3}{14}\right)^2=\frac{9}{196}. \]

  2. The linearity of expectation gives \[ \mathbb{E}\left[C\right]=\mathbb{E}\left[\sum_{i=1}^{20} C_i\right]=\sum_{i=1}^{20}\mathbb{E}\left[C_i\right] = 20\frac{3}{14}=\frac{30}{7}. \] Because the \(C_i\) are independent of each other, the variance of their sum equals the sum of their variances: \[ \textup{Var}\left(C\right)=\textup{Var}\left(\sum_{i=1}^{20} C_i\right)=\sum_{i=1}^{20} \textup{Var}\left(C_i\right)=20 \frac{3}{14}\frac{11}{14}=\frac{165}{49}. \]

  3. We observe that \(C_iS_i=0\) because on the same draw one cannot simultaneously have a cube and a sphere. Thus also \(\mathbb{E}\left[C_iS_i\right]=0\). If \(i\neq j\) we can use independence to factorise the expectation: \[ \mathbb{E}\left[C_iS_j\right]=\mathbb{E}\left[C_i\right]\mathbb{E}\left[S_j\right]=\frac{3}{14}\frac{1}{2}=\frac{3}{28}\,, \] where we used that the probability of drawing a sphere is \(1/2\).

  4. We have \[ \mathbb{E}\left[CS\right]=\mathbb{E}\left[\sum_{i=1}^{20} C_i \sum_{j=1}^{20} S_j\right]=\sum_{i=1}^{20} \sum_{j=1}^{20} \mathbb{E}\left[C_i S_j\right]. \] We split the sum over all pairs \((i,j)\) into the pairs where \(i\neq j\) and the pairs \((i,i)\), so \[ \mathbb{E}\left[CS\right]=\sum_{i=1}^{20} \sum_{\stackrel{j=1}{j\neq i}}^{20} \mathbb{E}\left[C_i S_j\right] + \sum_{i=1}^{20} \mathbb{E}\left[C_i S_i\right]\,. \] Using our above results for \(\mathbb{E}\left[C_iS_i\right]\) and \(\mathbb{E}\left[C_i S_j\right]\) and recognising that there are \(20\cdot 19=380\) pairs where \(i\neq j\) this gives us \[ \mathbb{E}\left[CS\right]=\sum_{i=1}^{20} \sum_{\stackrel{j=1}{j\neq i}}^{20}\frac{3}{28}+ \sum_{i=1}^{20} 0=380\frac{3}{28}=\frac{285}{7}\,. \] We also calculate \[ \mathbb{E}\left[S\right]=\mathbb{E}\left[\sum_{i=1}^{20} S_i\right]=\sum_{i=1}^{20}\mathbb{E}\left[S_i\right] = 20\frac{1}{2}=10. \] The covariance can then be calculated as \[ \textup{Cov}\left(C,S\right)=\mathbb{E}\left[CS\right]-\mathbb{E}\left[C\right]\mathbb{E}\left[S\right]=\frac{285}{7}-\frac{30}{7}10=-\frac{15}{7}. \] To calculate the correlation coefficient we also need \[ \textup{Var}\left(S\right)=\textup{Var}\left(\sum_{i=1}^{20} S_i\right)=\sum_{i=1}^{20} \textup{Var}\left(S_i\right)=20 \frac12\frac12=5. \] The correlation coefficient is \[ \rho(C,S)=\frac{\textup{Cov}\left(C,S\right)}{\sqrt{\textup{Var}\left(C\right)\textup{Var}\left(S\right)}}=-\sqrt{\frac{3}{11}}\approx -0.5222. \]

D4. Consider a random variable \(X\sim\mbox{\textup{Uniform}}[a,b]\), where \(a\) and \(b\) are unknown. You are told that \[ \mathbb{P}\left(X<2\right) = 1/3 \quad \text{and} \quad \mathbb{P}\left(1<X\leq 3\right) = 1/2 \,. \] Given this information, find \(a\) and \(b\).

Answer

From the first equation we immediately know that \(a<2<b\). Now, for a continuous random variable, we obtain the probability that it lies in an interval \((c,d)\) by integrating the density function over that interval, i.e.  \[ \mathbb{P}\left(c\leq X\leq d\right) = \int_c^d f_X(x) dx \,. \] Since \(X\sim\mbox{\textup{Uniform}}[a,b]\), we know that \[ f_X(x) = \begin{cases} 1/(b-a) &\quad x\in[a,b] \\ 0 &\quad\text{otherwise.} \end{cases} \] Thus we obtain \[ 1/3 = \mathbb{P}\left(X<2\right) = \mathbb{P}\left(a\leq X<2\right) = \int_a^2 1/(b-a) dx = (2-a)/(b-a) \,. \tag{5}\]

In order to use the second equation (\(\mathbb{P}\left(1<X\leq 3\right) = 1/2\)) in the same way, we have two possibilities to consider:

  1. \(a<1\)
  2. \(1\leq a<2\)

Suppose first that \(a<1\). Then \[ 1/2 = \mathbb{P}\left(1<X\leq 3\right) = \int_1^3 1/(b-a) dx = 2/(b-a) \,, \tag{6}\] since the density function \(f_X\) is equal to \(1/(b-a)\) for all \(x\in[1,3]\) if \(a<1\).

If \(1\leq a\) however, then instead we obtain \[ 1/2 = \mathbb{P}\left(1<X\leq 3\right) = \int_1^a 0\, dx + \int_a^3 1/(b-a)dx = (3-a)/(b-a) \,. \tag{7}\]

We now have to solve these simultaneous equations in order to find \(a\) and \(b\). If we assume that \(1\leq a<2\), then we must try to solve Equation 5 and Equation 7 together; but this gives \[ 2(3-a) = 3(2-a)\,, \] resulting in \(a=0\). But this contradicts our assumption that \(1\leq a\)!

So it must be the case that \(a<1\): now we must solve Equation 5 and Equation 6, and this is possible, with \(a=2/3\) and \(b=14/3\).
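A quick exact verification that these values reproduce both of the given probabilities (the helper prob is ours):

```python
from fractions import Fraction as F

# Check that a = 2/3, b = 14/3 reproduce both probabilities for
# X ~ Uniform[a, b].
a, b = F(2, 3), F(14, 3)

def prob(c, d):
    # P(c < X <= d): length of the overlap of (c, d] with [a, b],
    # divided by b - a.
    lo, hi = max(a, F(c)), min(b, F(d))
    return max(hi - lo, 0) / (b - a)

assert prob(a, 2) == F(1, 3)  # P(X < 2)
assert prob(1, 3) == F(1, 2)  # P(1 < X <= 3)
print("a = 2/3 and b = 14/3 match both conditions")
```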

D5. Let \(X\) and \(Y\) be two independent geometrically distributed random variables with parameter \(p\), i.e., \(X\sim\mbox{\textup{Geom}}(p)\) and \(Y\sim\mbox{\textup{Geom}}(p)\). For any natural numbers \(i\) and \(n\) with \(i<n\) calculate the conditional probability \(\mathbb{P}\left(X=i\,|\,X+Y=n\right)\). Describe in words the meaning in terms of Bernoulli trials of what you just calculated.

Answer

According to the definition of conditional probability, \[ \mathbb{P}\left(X=i\,|\,X+Y=n\right)=\frac{\mathbb{P}\left(X=i,\,X+Y=n\right)}{\mathbb{P}\left(X+Y=n\right)}. \]

For the numerator, note that the event \(\{X=i,X+Y=n\}\) is the same as the event \(\{X=i,Y=n-i\}\). The independence of \(X\) and \(Y\) then lets us factorise the probability: \[ \mathbb{P}\left(X=i,\,X+Y=n\right)=\mathbb{P}\left(X=i,Y=n-i\right)=\mathbb{P}\left(X=i\right)\mathbb{P}\left(Y=n-i\right). \] We can now substitute in the probability mass function for the geometric distribution with parameter \(p\): \[ \mathbb{P}\left(X=i\right)=(1-p)^{i-1}p\] and similarly \[ \mathbb{P}\left(Y=n-i\right)=(1-p)^{n-i-1}p.\]

This gives \[\mathbb{P}\left(X=i,\,X+Y=n\right)=(1-p)^{i-1}p(1-p)^{n-i-1}p=(1-p)^{n-2}p^2. \] Note that this is independent of \(i\).

For the denominator we use the partition theorem to write \[ \mathbb{P}\left(X+Y=n\right)=\sum_{i=1}^{n-1}\mathbb{P}\left(X=i,X+Y=n\right). \] From our calculation above we see that every term in the sum equals \((1-p)^{n-2}p^2\), so \[ \mathbb{P}\left(X+Y=n\right)=(n-1)(1-p)^{n-2}p^2. \] Putting this all together we finally find that \[ \mathbb{P}\left(X=i\,|\,X+Y=n\right)=\frac{(1-p)^{n-2}p^2}{(n-1)(1-p)^{n-2}p^2}=\frac{1}{n-1}. \]

A geometric random variable counts the number of turns until the first success in repeated Bernoulli trials. Therefore the sum \(X+Y\) of two identical and independent geometric random variables counts the number of turns until the second success. So the conditional probability we calculated is the probability that the first success happens on a particular trial \(i\) given that the second success happens on the \(n\)-th trial. The result shows that the first success is then equally likely to occur on any of the \(n-1\) trials before the \(n\)-th trial.
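The uniform conditional distribution can also be seen in simulation; p = 0.4 and n = 5 below are arbitrary test values of ours:

```python
import random

random.seed(5)

# Simulate pairs of independent Geom(p) variables and record the
# conditional distribution of X given X + Y = n.
p, n, trials = 0.4, 5, 200_000

def geom():
    # Number of Bernoulli(p) trials up to and including the first success.
    k = 1
    while random.random() >= p:
        k += 1
    return k

counts = {i: 0 for i in range(1, n)}
total = 0
for _ in range(trials):
    x, y = geom(), geom()
    if x + y == n:
        counts[x] += 1
        total += 1
print({i: round(c / total, 3) for i, c in counts.items()})
# each conditional frequency should be close to 1/(n-1) = 0.25
```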