Logistic Regression and First derivative test: Difference between pages

From TCS Wiki
(Difference between pages)
 
imported>Auntof6Bot
m (Missing/miscoded ref display (WP ck error 3) and/or general cleanup using AWB)
 
Line 1: Line 1:
[[File:Logistic-curve.png|thumb|Figure 1: Example of a Logistic Curve. The values of y cannot be less than 0 or greater than 1.]]
{{simplify|date=May 2015}}
'''Logistic Regression''', also known as ''Logit Regression'' or the ''Logit Model'', is a mathematical model used in [[statistics]] to estimate (guess) the probability of an event occurring, given some previous data. Logistic Regression works with [[binary]] data, where either the event happens (1) or the event does not happen (0). Given some feature '''x''', it tries to find out whether some event '''y''' happens or not, so '''y''' can be either 0 or 1: if the event happens, '''y''' is given the value 1, and if it does not, '''y''' is given the value 0. For example, if '''y''' represents whether a sports team wins a match, then '''y''' will be 1 if they win the match and 0 if they do not. This is known as ''Binomial Logistic Regression''. Another form of Logistic Regression allows more than two values for the variable '''y'''; it is known as ''Multinomial Logistic Regression''.
In [[calculus]], the '''first derivative test''' is used to determine local [[:en:Maxima_and_minima|maxima and minima]] of a [[function]]. Furthermore, the first derivative test can be used to determine [[:en:Interval_(mathematics)|intervals]] of increase and intervals of decrease.<ref>https://www.math.hmc.edu/calculus/tutorials/extrema/</ref>


Logistic Regression uses the logistic function to find a model that fits the data points. The function gives an 'S'-shaped curve to model the data. The curve is restricted to values between 0 and 1, so it is easy to apply when '''y''' is binary. Logistic Regression can then model such events better than linear regression, as it gives the [[probability]] of '''y''' being 1 for a given '''x''' value. Logistic Regression is used in [[statistics]] and [[machine learning]] to predict the outcome for a new [[input]] from previously observed data.
==Critical Points (and how to find them)==
[[Critical point]]s are values in the [[:en:Domain_of_a_function|domain]] of a function where its [[:en:Derivative|derivative]] is 0 or [[:en:Undefined_(mathematics)|undefined]] (<math>f'(x)=0</math>  or <math>f'(x)</math> doesn't exist).<ref>http://tutorial.math.lamar.edu/Classes/CalcI/CriticalPoints.aspx</ref>


== Basics ==
To find the critical points of a function, compute the first derivative of the function, set it equal to zero, and solve the resulting [[:en:Equation|equation]] for its [[:en:Zero_of_a_function|zeroes]]. Similarly, if you have a [[:en:Rational_function|rational function]] (<math>f(x)=\dfrac{a}{b}</math>, where <math>b \neq 0</math>), find the values that make the [[:en:Fraction_(mathematics)|denominator]] equal to 0: the function is undefined there, and therefore not [[:en:Differentiable_function|differentiable]]. Strictly speaking, these values lie outside the domain, so they are not critical points under the definition above, but the sign of the derivative can still change across them, so they are used as interval endpoints alongside the critical numbers.
Logistic regression is an alternative to the simpler [[Linear regression|Linear Regression]]. Linear regression tries to predict the data by finding a linear (straight line) equation to model or predict future data points. Logistic regression does not treat the relationship between the two variables as a straight line. Instead, it models the natural logarithm of the odds as a linear function of the input and uses previously observed data to find the coefficients. The function can then predict future results using these coefficients in the logistic equation.


Logistic regression uses the concept of odds to calculate the probability. The odds of an event are the ratio of the probability of the event happening to the probability of it not happening. For example, the probability of a sports team winning a certain match might be 0.75. The probability of that team losing would be 1 – 0.75 = 0.25. The odds of that team winning would be 0.75/0.25 = 3. This can be stated as: the odds of the team winning are 3 to 1.<ref>https://www.strath.ac.uk/aer/materials/5furtherquantitativeresearchdesignandanalysis/unit6/whatislogisticregression/</ref>
Plug these numbers into your original function to find the exact [[:en:Coordinate_system|coordinate]] of these critical points.


The odds can be defined as:
=== Example: <math>f(x)=\dfrac{1}{3}x^3-\dfrac{5}{2}x^2+6x</math> ===
<math>f'(x)=x^2-5x+6</math>


<math>Odds = {P(y=1|x) \over 1-P(y=1|x)}</math>
<math>x^2-5x+6=0</math>


The natural logarithm of the odds is then taken in order to create the logistic equation. The new equation is known as the logit:
<math>(x-3)(x-2)=0</math>


<math>Logit(P(x)) = \ln \left ( {P(y = 1|x) \over 1 - P(y = 1|x)} \right) </math>
<math>x=3</math> and <math>x=2</math>


In Logistic regression the Logit of the probability is said to be linear with respect to x, so the logit becomes:
<math>f(3)=\dfrac{1}{3}3^3-\dfrac{5}{2}3^2+6(3)=\dfrac{9}{2}</math>


<math>Logit(P(x)) = a + bx</math>
<math>f(2)=\dfrac{1}{3}2^3-\dfrac{5}{2}2^2+6(2)=\dfrac{14}{3}</math>


Using the two equations together then gives the following:
So the two critical points are <math>(3,\dfrac{9}{2})</math> and <math>(2,\dfrac{14}{3})</math>.


<math>{P(y = 1|x) \over 1 - P(y = 1|x)} = e^{a+bx}</math>
==First derivative test==
To determine whether these points are local maxima, local minima, or neither, consider the intervals between the critical points, together with the interval to the left of the first point and the interval to the right of the last. Plug a value from each interval into the first derivative of the function. If the result is positive, the function is increasing on that interval. If the result is negative, the function is decreasing on that interval. If the derivative changes sign between two neighboring intervals, the critical point between them is a local maximum or minimum. If the derivative is positive on the first interval and negative on the second, the critical point is a local maximum. If it is negative on the first interval and positive on the second, the critical point is a local minimum. If the derivative is positive on both intervals or negative on both, the critical point is neither a local maximum nor a local minimum. For the example above, <math>f'(1)=2</math>, <math>f'(\tfrac{5}{2})=-\tfrac{1}{4}</math> and <math>f'(4)=2</math>, so <math>x=2</math> is a local maximum and <math>x=3</math> is a local minimum.


This then leads to the probability:
The first derivative test is used in calculus [[mathematical optimization|optimization]] problems.


<math>P(y=1|x) = {e^{a+bx} \over 1 + e^{a+bx}} = {1 \over 1 + e^{-(a + bx)}}
==References==
</math> <ref>http://faculty.cas.usf.edu/mbrannick/regression/Logistic.html</ref>
{{reflist}}


This final equation is the logistic curve for Logistic regression. It models the non-linear relationship between x and y with an ‘S’-like curve for the probability that y = 1, that is, that the event y occurs. In this example '''a''' is the intercept and '''b''' is the gradient (slope) of the logit, just like in linear regression. The logit equation can then be expanded to handle more coefficients (gradients). This gives more freedom in how the logistic curve matches the data. The multiplication of two [[vector]]s can then be used to model more coefficients and give the following equation:
[[Category:Calculus]]
 
<math>Logit(P(x)) = w_0x^0 + w_1x^1 + w_2x^2 + ... + w_nx^n = w^Tx</math>
 
In this equation '''w''' = [ w<sub>0</sub> , w<sub>1</sub> , w<sub>2</sub> , ... , w<sub>n</sub> ] holds the n + 1 coefficients (gradients) of the equation. The powers of x are given by the vector '''x''' = [ 1 , x , x<sup>2</sup> , ... , x<sup>n</sup> ]. These two vectors give the new logit equation with multiple coefficients. The logistic equation can then be rewritten as:
 
<math>P(y=1|x) = {1 \over 1 + e^{-(w^Tx)}}
</math>
 
This is a more general logistic equation that allows for more coefficients.
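
The odds, logit and general logistic equations above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the example weights are hypothetical and do not come from the article, and the code makes no attempt to handle very large negative exponents, where <code>math.exp</code> would overflow.

```python
import math

def odds(p):
    """Odds of an event with probability p, for 0 <= p < 1."""
    return p / (1.0 - p)

def logit(p):
    """Natural logarithm of the odds."""
    return math.log(odds(p))

def logistic(w, x):
    """P(y = 1 | x) for weights w = [w_0, ..., w_n] and a scalar feature x.

    The feature vector is [1, x, x**2, ..., x**n], as in the text,
    so the exponent is w_0 + w_1*x + ... + w_n*x**n.
    """
    z = sum(w_k * x**k for k, w_k in enumerate(w))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights (a = -1.0, b = 2.0), chosen only for illustration.
w = [-1.0, 2.0]
p = logistic(w, 0.5)  # the exponent is -1 + 2*0.5 = 0, so p is 0.5
```

With two weights this reduces to the earlier form with '''a''' and '''b'''. Applying <code>logit</code> to the output of <code>logistic</code> recovers the linear expression, which is the sense in which the logit is linear in the coefficients.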
 
== References ==
 
{{Reflist}}
 
== Other websites ==
* http://faculty.cas.usf.edu/mbrannick/regression/Logistic.html
* https://www.strath.ac.uk/aer/materials/5furtherquantitativeresearchdesignandanalysis/unit6/whatislogisticregression/
 
<!--- Categories --->
[[Category:Statistics]]

Latest revision as of 06:03, 4 July 2015

In calculus, the first derivative test is used to determine local maxima and minima of a function. Furthermore, the first derivative test can be used to determine intervals of increase and intervals of decrease.[1]

Critical Points (and how to find them)

Critical points are values in the domain of a function where its derivative is 0 or undefined ([math]\displaystyle{ f'(x)=0 }[/math] or [math]\displaystyle{ f'(x) }[/math] doesn't exist).[2]

To find the critical points of a function, compute the first derivative of the function, set it equal to zero, and solve the resulting equation for its zeroes. Similarly, if you have a rational function ([math]\displaystyle{ f(x)=\dfrac{a}{b} }[/math], where [math]\displaystyle{ b \neq 0 }[/math]), find the values that make the denominator equal to 0: the function is undefined there, and therefore not differentiable. Strictly speaking, these values lie outside the domain, so they are not critical points under the definition above, but the sign of the derivative can still change across them, so they are used as interval endpoints alongside the critical numbers.

Plug these numbers into your original function to find the exact coordinate of these critical points.

Example: [math]\displaystyle{ f(x)=\dfrac{1}{3}x^3-\dfrac{5}{2}x^2+6x }[/math]

[math]\displaystyle{ f'(x)=x^2-5x+6 }[/math]

[math]\displaystyle{ x^2-5x+6=0 }[/math]

[math]\displaystyle{ (x-3)(x-2)=0 }[/math]

[math]\displaystyle{ x=3 }[/math] and [math]\displaystyle{ x=2 }[/math]

[math]\displaystyle{ f(3)=\dfrac{1}{3}3^3-\dfrac{5}{2}3^2+6(3)=\dfrac{9}{2} }[/math]

[math]\displaystyle{ f(2)=\dfrac{1}{3}2^3-\dfrac{5}{2}2^2+6(2)=\dfrac{14}{3} }[/math]

So the two critical points are [math]\displaystyle{ (3,\dfrac{9}{2}) }[/math] and [math]\displaystyle{ (2,\dfrac{14}{3}) }[/math].
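
The worked example can be checked with a short Python sketch. The helper name <code>quadratic_roots</code> is hypothetical, and the approach only applies here because the derivative is a quadratic; it is not a general method for finding critical points.

```python
from fractions import Fraction
import math

def f(x):
    """The example function f(x) = x^3/3 - 5x^2/2 + 6x, in exact arithmetic."""
    return Fraction(1, 3) * x**3 - Fraction(5, 2) * x**2 + 6 * x

def f_prime(x):
    """Its first derivative f'(x) = x^2 - 5x + 6."""
    return x**2 - 5 * x + 6

def quadratic_roots(a, b, c):
    """Real roots of a*x^2 + b*x + c = 0, smallest first."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    r = math.sqrt(disc)
    return sorted({(-b - r) / (2 * a), (-b + r) / (2 * a)})

# Solve f'(x) = 0, then plug each root back into the original function.
critical_xs = quadratic_roots(1, -5, 6)
# Fraction(x) is exact here because the roots are whole numbers.
critical_points = [(x, f(Fraction(x))) for x in critical_xs]

for x, y in critical_points:
    print(f"x = {x}, f(x) = {y}")
```

Using <code>Fraction</code> keeps the y-coordinates exact, which makes it easy to compare them with the hand calculation.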

First derivative test

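To determine whether these points are local maxima, local minima, or neither, consider the intervals between the critical points, together with the interval to the left of the first point and the interval to the right of the last. Plug a value from each interval into the first derivative of the function. If the result is positive, the function is increasing on that interval. If the result is negative, the function is decreasing on that interval. If the derivative changes sign between two neighboring intervals, the critical point between them is a local maximum or minimum. If the derivative is positive on the first interval and negative on the second, the critical point is a local maximum. If it is negative on the first interval and positive on the second, the critical point is a local minimum. If the derivative is positive on both intervals or negative on both, the critical point is neither a local maximum nor a local minimum. For the example above, [math]\displaystyle{ f'(1)=2 }[/math], [math]\displaystyle{ f'(\tfrac{5}{2})=-\tfrac{1}{4} }[/math] and [math]\displaystyle{ f'(4)=2 }[/math], so [math]\displaystyle{ x=2 }[/math] is a local maximum and [math]\displaystyle{ x=3 }[/math] is a local minimum.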
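
The sign check described above can be sketched in Python. The function <code>classify</code> and its <code>pad</code> parameter are hypothetical names used for illustration. The sketch assumes that every critical point of the function is supplied and that the derivative is defined at each sample point, so the derivative cannot be zero at a sample.

```python
def f_prime(x):
    """Derivative of the example function: f'(x) = x^2 - 5x + 6."""
    return x**2 - 5 * x + 6

def classify(derivative, critical_xs, pad=1.0):
    """Apply the first derivative test to each critical point.

    Samples the derivative once to the left of the first point, once
    between each neighboring pair, and once to the right of the last.
    """
    xs = sorted(critical_xs)
    samples = (
        [xs[0] - pad]
        + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
        + [xs[-1] + pad]
    )
    increasing = [derivative(s) > 0 for s in samples]

    result = {}
    for i, x in enumerate(xs):
        left, right = increasing[i], increasing[i + 1]
        if left and not right:
            result[x] = "local maximum"   # increasing, then decreasing
        elif right and not left:
            result[x] = "local minimum"   # decreasing, then increasing
        else:
            result[x] = "neither"         # no sign change
    return result

print(classify(f_prime, [2, 3]))
```

For a function such as x<sup>3</sup>, whose derivative 3x<sup>2</sup> is positive on both sides of the critical point at 0, the same routine reports "neither".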

The first derivative test is used in calculus optimization problems.
