What Is Statistics?

Perhaps the best definition of statistics I have found comes from S. Garfunkel: "Statistics is the science of drawing conclusions from data with the aid of the mathematics of probability." We usually use a small portion of something (a sample) to make inferences about the whole thing (the population). The concept of sample and population is extremely important, and we will cover it early in our lessons. Almost every statistical problem can be viewed through the lens of using a sample to make inferences about a population.
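
To make the sample and population idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the population is simulated (100,000 made-up heights drawn from a normal distribution with mean 170 and standard deviation 10), and the sample size of 500 is arbitrary. In a real problem we would not have access to the whole population, which is exactly why we rely on the sample.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 simulated heights in centimeters.
population = [random.gauss(170.0, 10.0) for _ in range(100_000)]

# In practice we only get to observe a small portion of the population.
sample = random.sample(population, 500)

population_mean = statistics.fmean(population)  # normally unknown to us
sample_mean = statistics.fmean(sample)          # our estimate from the sample

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The sample mean should land close to the population mean even though it uses only 0.5% of the values. How close, and how confident we can be about it, is what the later lessons on inference address.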

What Is Machine Learning?

Machine learning is not much different, although the problems it tackles tend to be broader and more complicated. For now, you can view machine learning as more advanced statistics in which conclusions tend to be reached numerically rather than analytically. Numerically means that a computer algorithm uses some method to approximate the correct answer. Analytically means we have a formula or mathematical relationship that gives us the answer directly; that answer is usually exact, though it may rest on some assumptions, and in some cases it is itself an approximation. The difference between solving a problem analytically and numerically is an important concept that we will discuss later, and it will give you a better understanding of why we use certain methods.
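
As a rough illustration of the analytical versus numerical distinction, consider finding the single number m that minimizes the sum of squared distances to a set of data points. Analytically, calculus tells us the answer is the mean of the data. Numerically, we can start from a guess and repeatedly nudge it downhill (a simple gradient descent). The data values, function names, learning rate, and step count below are all assumptions chosen for this sketch, not part of any particular library or method from these lessons.

```python
data = [2.0, 4.0, 4.0, 5.0, 10.0]  # made-up data for illustration


def analytical_minimizer(values):
    """Closed-form answer: the mean minimizes the sum of squared distances."""
    return sum(values) / len(values)


def numerical_minimizer(values, learning_rate=0.05, steps=100):
    """Approximate the same answer by gradient descent from a starting guess."""
    m = 0.0  # arbitrary starting guess
    for _ in range(steps):
        # Derivative of sum((m - v)^2) with respect to m.
        gradient = sum(2 * (m - v) for v in values)
        m -= learning_rate * gradient
    return m


analytical = analytical_minimizer(data)
numerical = numerical_minimizer(data)

print(f"analytical: {analytical}")
print(f"numerical:  {numerical}")
```

Here both routes agree because a formula exists and the problem is easy. The numerical route becomes essential when no such formula is available, which is common in machine learning.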

Before we can get to the fun stuff of drawing conclusions from our data (this is called statistical inference), we must build a foundation in statistical theory. Our first major topic will be probability.