03. Discrete Probability Distributions

이 글은 컴퓨터학부 확률과통계 수업에서 배운 자료들을 정리한 내용입니다.

A binomial distribution(이항 분포) with parameter $n$ and $p$
- The random variable is defined by the parameter $p$, 0 ≤ p ≤ 1 , which is the probability that the outcome is 1
- The number of successes within a fixed number of trials $n$
- n independent Bernoulli trials $X_{1}, … , X_{n}$
  - $X = X_{1} + . . . + X_{n}$
- $X \sim B(n, p)$
  - $P( X = x) = {n \choose x} p^{x} (1 - p)^{n-x}$
  - ${n \choose x} == \frac{n!}{x!(n - x)!}$
Consider an experiment consisting of $n$ Bernoulli trials with a constant probability $p$ of success
A random variable $X$: the total number of successes
- A binomial distribution with parameter $n$ and $p$
- $X \sim B(n, p)$ → 성공 확률이 p인 실험을 n번 실행하였을 때, 성공 횟수의 분포 ⇒ 이항 분포
The probability mass function of a $B( n, p )$
- $P( X = x) = {n \choose x} p^{x} (1 - p)^{n-x}$, for $x = 0, 1, 2, …, n$
- the expectation and the variance
  - $E(X) = np$ and $Var(X) = np(1-p)$
The cumulative distribution function of a $B(n, p)$
- $P( X ≤ x ) = \sum_{j = 0}^{i} P( X ≤ x_{j} )$

The Geometric Distribution(기하 분포) with Parameter p
- A probability mass function
  - $P( X = x) = (1 - p)^{x-1}p$, for $x = 1, 2, 3, 4, …$
  - $(1 - p)^{x-1}$ → The probability value of x - 1 failures
  - $p$ → The probability value of $x{th}$ success
  - 베르누이 시행이 처음 성공할 때까지의 시행횟수를 확률변수 X라고 했을 때, 이 확률변수 X의 분포를 말한다.
  - 최초로 발생한 것 (처음 성공할 확률)
- The cumulative distribution function
  - $P( X ≤ x) = \sum_{i = 1}^{x} P( X = i ) = 1 - ( 1 - p)^{x}$
  - 의미: $x$ 번을 시도했을 때 다 실패하지 않은 값 ( ⇒ $x$번째까지 성공한 것을 다 더한 값)

A random variable that counts the number of “events” that occur within certain specified boundaries
A probability mass function for a random variable Poisson distribution
- $X \sim P(\lambda)$
- $P( X = x ) = \frac{e^{-\lambda}\lambda^{x}}{x!}$ , for $x = 0, 1, 2, 3, …$ 해석) $x$ 개 이벤트가 발생할 확률
The Poisson distribution can be used to approximate the $B(n, p)$ distribution

when 1) $n$ is very large (larger than 150, say) and 2) the success probability $p$ is very small (smaller than 0.01, say).
A parameter value of $\lambda = np$ should be used for the Poisson distribution, so that it has the same expected value as the binomial distribution.

The Multinomial Distribution

Consider a sequence of $n$ independent trials where each individual trial can have $k$ outcomes that occur with constant probability values $p_{1}, …, p_{k}$ with $\sum_{i = 1}^{k} p_{i} = 1$
The random variables $X_{1}, …, X_{k}$ that count the number of occurrences of each outcome are said to have a multinomial distribution, and their joint pmf, $P( X_{1} = x_{1}, …, X_{k} = x_{k} ) = \frac{n!}{x_{1}! … x_{k}!} \times p_{1}^{x_{1}} \times … \times p_{k}^{x_{k}}$, for non-negative integer values of the $x_{i}$ satisfying $x_{1} + … + x_{k} = n$
The expectation of the variance of each random variable $X_{i}$
- $E(X_{i}) = np_{i} , Var(X_{i}) = np_{i}(1 - p_{i})$
- The random variables $X_{1}, …, X_{k}$ are not independent

Reference