Linear Regression

The earliest form of regression analysis, Linear Regression using Least Squares, was first published by Adrien-Marie Legendre in 1805 and then again by Carl Friedrich Gauss in 1809. Both used the method to predict astronomical orbits, specifically the paths of bodies orbiting the sun. Gauss published further work on the theory of Least Squares in 1821; however, the term "regression" wasn't coined until the late nineteenth century, by Francis Galton. Galton observed a linear relationship between the weights of mother and daughter seeds across generations; to him, regression was merely a term describing a biological phenomenon. It was not until Udny Yule and Karl Pearson extended the method that it took on its more general statistical meaning.

To this day, Linear Regression has evolved with many variations and solving methods, but the usefulness of the algorithm has remained the same. Linear Regression is used not only in machine learning and data science but also in epidemiology, finance, economics, and beyond. In practically any situation that presents a relationship between continuous variables, Linear Regression can model that relationship in one way or another. In general, Linear Regression is applied to predict or forecast a response variable from related explanatory features; it can also be used to quantify the strength of the linear relationship between two variables and to identify possibly redundant or misleading features in a dataset.

Linear Regression models the relationship between one response, or target, variable and one or more features, or explanatory variables, that may be linearly correlated with the target. Linear Regression with two variables, where a single feature describes the target, is called Simple Linear Regression. On the other hand, Linear Regression that models multiple feature variables against one target variable is referred to as Multiple Linear Regression. The goal of Linear Regression is to find a function that produces a "line of best fit" modeling the relationship between the features and the target. There are many ways to determine this line; one of the most common and broadly applicable is gradient descent.
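To make this concrete, in the Simple Linear Regression case the model predicts

$$\hat{y} = wx + b,$$

and gradient descent repeatedly nudges the slope $w$ and intercept $b$ downhill on the mean squared error over the $m$ training points,

$$J(w, b) = \frac{1}{m}\sum_{i=1}^{m}\left(\hat{y}^{(i)} - y^{(i)}\right)^2 .$$

The NumPy snippet below is a minimal sketch of this idea, not a production implementation; the learning rate, iteration count, and toy data are illustrative assumptions rather than values from the text.

```python
import numpy as np

def fit_simple_linear_regression(x, y, lr=0.01, n_iters=5000):
    """Fit y ~ w * x + b by gradient descent on the mean squared error.

    lr and n_iters are illustrative choices, not tuned values.
    """
    w, b = 0.0, 0.0
    m = len(x)
    for _ in range(n_iters):
        y_hat = w * x + b          # current predictions
        error = y_hat - y          # residuals
        w -= lr * (2.0 / m) * np.dot(error, x)  # gradient dJ/dw
        b -= lr * (2.0 / m) * error.sum()       # gradient dJ/db
    return w, b

# Toy usage: recover the line y = 3x + 1 from noisy samples.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 3 * x + 1 + rng.normal(0, 0.5, 200)
w, b = fit_simple_linear_regression(x, y)
print(f"w = {w:.2f}, b = {b:.2f}")  # roughly 3 and 1
```

Multiple Linear Regression follows the same pattern, with a weight per feature updated from the same kind of gradient.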

Linear Regression serves as the first algorithm for hundreds of thousands of data scientists, and for good reason. Not only is gradient descent used widely across many machine learning algorithms, but modeling the trend of data based on features remains one of the crucial uses of machine learning to this day, with applications in forecasting, business, medicine, and many more.