What the heck is a moment?
This was (and still is) one of the biggest questions left over in my brain from my first semester. In Theory I we were introduced to moment generating functions and told about their uses, but the way it was taught and the fact that we weren't tested on it mean it's still an underdeveloped and fuzzy concept in my head. After reading around online, I think this is a pretty useful description:
The function f(X) = (X - z)^n is a particularly common and useful one for various values of z and n, and in order to specify what the values of z and n take, we have to use phrases like "the nth moment of X about z", or "the nth central moment of X" which is shorthand for "the nth moment of X about E[X]", or "the nth non-central moment of X" which is shorthand for "the nth moment of X about 0".
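Here's a minimal Python sketch of that definition, assuming a discrete X stored as a dictionary of value-to-probability pairs. The helper name `moment` and the fair-die example are made up for illustration; they aren't from the quoted description.

```python
from fractions import Fraction

def moment(pmf, n, z=0):
    """n-th moment of a discrete random variable about z: E[(X - z)^n]."""
    return sum(p * (x - z) ** n for x, p in pmf.items())

# Hypothetical example: a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

mean = moment(die, 1)               # 1st non-central moment (about 0)
variance = moment(die, 2, z=mean)   # 2nd central moment (about E[X])

print(mean)      # 7/2
print(variance)  # 35/12
```

Using `Fraction` keeps the arithmetic exact, so the results match what you'd get by hand.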
Another really good explanation of the topic is one I got from some Penn State online course material. Basically, a moment is the expected value of a power of a random variable. You can find a lot of these expected values (E(Y), E(Y^2), E(Y^3), E(Y^4), ...), and some are more useful than others. For example, E(Y) is the mean of a random variable, and E(Y^2) is useful for finding the variance of a random variable:
Var(Y) = E(Y^2) − [E(Y)]^2
The mean and variance are functions of moments, and sometimes the moments can be hard to calculate directly, so moment generating functions (mgfs) are used instead.
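As a quick sanity check on the "mean and variance are functions of moments" idea, here's a small Python sketch that computes the first few moments about 0 straight from a probability function and then builds the variance out of them. The distribution (number of heads in two fair coin flips) and the helper name `raw_moment` are assumptions chosen for illustration.

```python
from fractions import Fraction

# Hypothetical example: Y = number of heads in two fair coin flips.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def raw_moment(pmf, k):
    """k-th moment about 0: E(Y^k) = sum over y of y^k * p(y)."""
    return sum(p * y ** k for y, p in pmf.items())

# E(Y), E(Y^2), E(Y^3), E(Y^4)
first_four = [raw_moment(pmf, k) for k in range(1, 5)]

# Variance as a function of the first two moments.
variance = raw_moment(pmf, 2) - raw_moment(pmf, 1) ** 2

print([str(m) for m in first_four])  # ['1', '3/2', '5/2', '9/2']
print(variance)                      # 1/2
```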
Let Y be a discrete random variable with probability function p(y). Then the mgf of Y is:
M(t) = E(e^{tY}) = Σ_y e^{ty} p(y)
If a moment-generating function exists for a random variable Y, then:
The mean of Y can be found by evaluating the first derivative of the moment-generating function at t = 0. That is:
μ = E(Y) = M′(0)
The variance of Y can be found by evaluating the first and second derivatives of the moment-generating function at t = 0. That is:
σ^2 = E(Y^2) − [E(Y)]^2 = M″(0) − [M′(0)]^2
So to find the mean, find the mgf, M(t) = Σ_y e^{ty} p(y), take the first derivative, and evaluate it at t = 0. For the variance, take the second derivative of the mgf, evaluate it at t = 0, and subtract the square of the first derivative evaluated at t = 0.
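The recipe above can be tried out numerically. The sketch below is only an illustration under some assumptions: the distribution (number of heads in two fair coin flips) is made up, and instead of differentiating by hand it approximates M′(0) and M″(0) with central finite differences, so the answers are approximate rather than exact.

```python
import math

# Hypothetical example: Y = number of heads in two fair coin flips.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

def mgf(t):
    """M(t) = E(e^{tY}) = sum over y of e^{ty} * p(y)."""
    return sum(math.exp(t * y) * p for y, p in pmf.items())

def first_derivative_at_zero(f, h=1e-3):
    """Central-difference approximation of f'(0)."""
    return (f(h) - f(-h)) / (2 * h)

def second_derivative_at_zero(f, h=1e-3):
    """Central-difference approximation of f''(0)."""
    return (f(h) - 2 * f(0.0) + f(-h)) / (h * h)

mean = first_derivative_at_zero(mgf)                    # approximates M'(0)
variance = second_derivative_at_zero(mgf) - mean ** 2   # M''(0) - [M'(0)]^2

# Direct calculation from the probability function, for comparison.
direct_mean = sum(y * p for y, p in pmf.items())
direct_variance = sum(y ** 2 * p for y, p in pmf.items()) - direct_mean ** 2

print(round(mean, 4), direct_mean)
print(round(variance, 4), direct_variance)
```

The two columns should agree to several decimal places. The small gap comes from the finite-difference step size h, not from the mgf method itself.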
MGFs are useful because they allow you to prove that a random variable has a particular probability distribution, or that two random variables have the same distribution. In many cases where you need to prove that two distributions are equal, it is much easier to prove equality of the moment generating functions than to prove equality of the distribution functions. From my textbook:
"If the moment-generating functions for two random variables Y and Z are equal (for all |t| < b for some b > 0), then Y and Z must have the same probability distribution. It follows that, if we can recognize the moment-generating function of a random variable Y to be one associated with a specific distribution, then Y must have that distribution."
After this little review, it now makes sense why we would care about moments and moment generating functions. Without really knowing it, we were dealing with moments every time we calculated the mean or variance (which was a lot). As for why we care whether two random variables have the same distribution, I turned to Wikipedia.
A sequence or other collection of random variables is independent and identically distributed (i.i.d.) if each random variable has the same probability distribution as the others and all are mutually independent. The abbreviation i.i.d. is particularly common in statistics (often as iid, sometimes written IID), where observations in a sample are often assumed to be effectively i.i.d. for the purposes of statistical inference
Also from the same Wikipedia page, this caught my eye:
One of the simplest statistical tests, the z-test, is used to test hypotheses about means of random variables. When using the z-test, one assumes (requires) that all observations are i.i.d. in order to satisfy the conditions of the central limit theorem.
So maybe moment generating functions in practice are used to explicitly show/prove this. Hopefully we'll go over this more in Theory