Logistic Regression in R Tutorial
Regression describes a statistical relationship between two or more variables in which a change in the independent variable is associated with a change in the dependent variable. Logistic regression is used to predict discrete outcomes (usually binary values like 0 and 1) from a set of independent variables. It does this by fitting the data to a logistic function, which yields the probability that the event occurs.
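To make the idea of a logistic function concrete, here is a minimal sketch in R. The helper name logistic, the sample inputs, and the 0.5 cutoff are illustrative assumptions and do not come from the article's dataset; base R also provides the same function as plogis().

```r
# Logistic (sigmoid) function: maps any real number to a value between 0 and 1.
# The name `logistic` is a hypothetical helper chosen for this illustration.
logistic <- function(z) 1 / (1 + exp(-z))

# A few example inputs (assumed values, for illustration only)
z <- c(-4, 0, 4)
p <- logistic(z)

# Turn each probability into a binary label using an assumed 0.5 cutoff
labels <- as.integer(p >= 0.5)

print(data.frame(z = z, probability = round(p, 3), label = labels))
```

Large negative inputs give probabilities near 0, large positive inputs give probabilities near 1, and an input of 0 gives exactly 0.5.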
The name logistic regression is something of a misnomer. When most people think of regression, they think of linear regression, which predicts a continuous variable. Logistic regression, by contrast, is used for classification: it estimates the probability of a class, and that probability is then turned into a discrete label.
This Logistic Regression in R article deals with the following topics:
Why is logistic regression used?
What is logistic regression and how does it work?
Use-case implementation using logistic regression to predict college admission.
Let us begin by understanding: Why do we use regression?
Let’s say you have a website whose revenue depends on its traffic, and you want to predict revenue from that traffic. The more traffic driven to your website, the higher your revenue would be, or at least that’s what you would intuitively assume.
In a plot of revenue versus website traffic, traffic would be the independent variable and revenue would be the dependent variable. The independent variable is also called the explanatory variable, and the dependent variable is also called the response variable, although the terms independent and dependent are more common. Our intuition tells us that the independent variable drives the dependent variable. If there is a relationship between the two, you can use the independent variable to make predictions about the dependent variable.
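The revenue and traffic example above can be sketched as a simple linear regression in R. The data here are simulated purely for illustration: the variable names visits and revenue, the assumed slope of 0.02, and the noise level are all made up and are not taken from any real website.

```r
# Simulated data (assumption): revenue grows roughly linearly with traffic
set.seed(42)
visits  <- seq(1000, 10000, by = 500)   # independent (explanatory) variable
revenue <- 50 + 0.02 * visits + rnorm(length(visits), sd = 10)  # dependent (response) variable

# Fit a straight line: revenue as a function of visits
fit <- lm(revenue ~ visits)
print(coef(fit))

# Use the fitted line to predict revenue for a new traffic level
predicted <- predict(fit, newdata = data.frame(visits = 12000))
print(predicted)
```

The estimated slope tells you how much additional revenue is associated with each extra visit, which is exactly the kind of prediction the intuition above suggests.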