A random variable is a variable whose value is subject to variation due to random chance. One can think of a random variable as the result of a random experiment, such as rolling a die, flipping a coin, or picking a number from a given interval.
The idea is that each time you perform the experiment, you obtain a sample of the random variable. Since the variable is random, you expect to get different values as you obtain multiple samples.
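As a minimal sketch of this idea, the snippet below treats a fair six-sided die as the random experiment and draws ten samples. The function name `roll_die` and the fixed seed are illustrative choices, not part of the original discussion.

```python
import random
from collections import Counter


def roll_die(rng):
    """Return one sample of the random variable: a fair six-sided die roll."""
    return rng.randint(1, 6)  # randint includes both endpoints


rng = random.Random(0)  # fixed seed (an assumption) so the run is repeatable
samples = [roll_die(rng) for _ in range(10)]
counts = Counter(samples)

print("samples:", samples)
print("counts: ", sorted(counts.items()))
```

Repeating the experiment gives a different value on most draws, and the counts only hint at the underlying probabilities until many more samples are taken.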
A probability distribution is a function that describes how likely you are to obtain each of the possible values of the random variable.
It turns out that probability distributions have quite different forms depending on whether the random variable takes on discrete values (such as numbers from the set {1,2,3,4,5,6}) or takes on any value from a continuum (such as any real number in the interval [0,1]).
Despite their different forms, one can do the same manipulations and calculations with either discrete or continuous random variables. The main difference is usually just whether one uses a sum or an integral.
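To illustrate the sum-versus-integral parallel, here is a rough sketch that performs the same two calculations (total probability and mean) for a discrete and a continuous case. The choice of a fair die and a uniform density on [0,1], the helper `integrate`, and the use of the mean as the example calculation are all assumptions made for illustration.

```python
from fractions import Fraction

# Discrete case: a fair die, with probability 1/6 on each face.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
total_discrete = sum(pmf.values())                  # sum of probabilities
mean_discrete = sum(x * p for x, p in pmf.items())  # sum of x * p(x)


# Continuous case: uniform density on [0, 1] (assumed example).
def density(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def integrate(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h


total_continuous = integrate(density, 0.0, 1.0)                  # integral of f(x)
mean_continuous = integrate(lambda x: x * density(x), 0.0, 1.0)  # integral of x * f(x)

print(total_discrete, mean_discrete)
print(total_continuous, mean_continuous)
```

In both cases the total probability comes out to 1; only the operation used to accumulate it differs.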
Discrete probability distribution
A discrete random variable is a random variable that can take on any value from a discrete set of values. The set of possible values could be finite, such as in the case of rolling a six-sided die, where the values lie in the set {1,2,3,4,5,6}.
However, the set of possible values could also be countably infinite, such as the set of integers {0,1,−1,2,−2,3,−3,…}.
The requirement for a discrete random variable is that we can enumerate all the values in the set of its possible values, as we will need to sum over all these possibilities.
If we rolled two six-sided dice, and let X be the sum, then X could take on any value in the set {2,3,4,5,6,7,8,9,10,11,12}. The probability mass function for this X is plotted as a bar graph in the following figure.
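The probability mass function for the sum of two dice can be worked out by enumerating all 36 equally likely outcomes. The sketch below assumes both dice are fair and prints a simple text bar chart. It is an illustration of the calculation, not a reproduction of any particular figure.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely (first die, second die) outcomes.
outcomes = product(range(1, 7), repeat=2)

# Count how many outcomes give each possible sum.
counts = Counter(a + b for a, b in outcomes)

# Probability mass function: P(X = total) = (number of outcomes) / 36.
pmf = {total: Fraction(n, 36) for total, n in sorted(counts.items())}

for total, p in pmf.items():
    # One '#' per outcome, so bar length is proportional to probability.
    print(f"{total:2d}  {str(p):>5}  {'#' * counts[total]}")
```

The bars rise from 2 up to a peak at 7 and fall symmetrically to 12, because there are more ways to roll a 7 than any other sum.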
Continuous probability distribution
A continuous random variable is a random variable that can take on any value from a continuum, such as the set of all real numbers or an interval. We cannot form a sum over such a set of numbers. (There are too many, since such a continuum is uncountable.)
Instead, we replace the sum used for discrete random variables with an integral over the set of possible values.
For a continuous random variable X, we cannot form its probability distribution function by assigning a probability that X is exactly equal to each value. The probability distribution function we must use in this case is called a probability density function, which essentially assigns the probability that X is near each value. For intuition behind why we must use such a density rather than assigning individual probabilities, see the page that describes the idea behind the probability density function.
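A small numerical sketch may help make "near each value" concrete. It assumes a hypothetical density f(x) = 2x on [0,1] (chosen only for illustration) and approximates probabilities by integrating that density over an interval. The helper name `probability` is likewise an assumption.

```python
def density(x):
    """Hypothetical density f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def probability(a, b, n=10_000):
    """Approximate P(a <= X <= b) by a midpoint-rule integral of the density."""
    h = (b - a) / n
    return sum(density(a + (i + 0.5) * h) for i in range(n)) * h


c = 0.5
# An interval of zero width carries zero probability.
print("P(X = c) as an interval of zero width:", probability(c, c))

for eps in (0.1, 0.01, 0.001):
    p = probability(c - eps, c + eps)
    # Dividing by the interval width 2*eps recovers the density at c.
    print(f"eps={eps}: P(|X - c| <= eps) ~ {p:.6f}, ratio to width ~ {p / (2 * eps):.6f}")
```

The probability of landing in a small interval around c shrinks with the interval width, while the ratio of probability to width stays close to the density at c. That ratio is what the density function records.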