Cost Function - An Intuitive Explanation

When the parameters of a model are varied, the fitted curve changes shape, and the total error changes with it. Plotting this error for every candidate set of parameters gives the cost function: a map from parameter values to total error.
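
Here is a minimal sketch of that idea. The data, the one-parameter line model y = w * x, and the `cost` helper are all illustrative assumptions, not from the original post:

```python
# Sketch: the cost function of a one-parameter model y = w * x,
# evaluated over a range of candidate values for w.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]  # roughly y = 2x

def cost(w):
    """Total squared error of the line y = w * x on the data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys))

# Varying the parameter changes the curve, which changes the error;
# tabulating the error for each parameter value traces out the cost function.
for w in [0.0, 1.0, 2.0, 3.0]:
    print(f"w = {w:.1f}  cost = {cost(w):.2f}")
```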

The choice between the L1 and L2 loss depends on the dataset. L1 largely ignores outliers while curve fitting because each point contributes only its absolute difference, whereas in L2 the differences are squared, so the error from an outlier is magnified.
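
A small illustration of that difference (the predictions, targets, and the outlier below are made up for this sketch): the squared L2 error lets a single outlier dominate the total, while the absolute L1 error keeps its contribution proportional:

```python
# Sketch: how one outlier affects L1 (absolute) vs L2 (squared) error.
predictions = [2.0, 4.0, 6.0, 8.0]
targets     = [2.1, 3.9, 6.2, 28.0]  # last target is an outlier

errors = [p - t for p, t in zip(predictions, targets)]

l1 = sum(abs(e) for e in errors)  # outlier contributes |8 - 28| = 20
l2 = sum(e ** 2 for e in errors)  # outlier contributes (8 - 28)^2 = 400

print(f"L1 total error: {l1:.2f}")   # 20.40 -- outlier is one term among four
print(f"L2 total error: {l2:.2f}")   # 400.06 -- outlier dominates everything
```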

Gradient descent relies on differentiation. Start somewhere on the cost surface, step in the direction of the negative gradient, and keep moving downhill until you can no longer descend. Intuitively, a convex function has its global minimum at the point where the derivative/gradient (the tangent to the curve) is 0.
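
A minimal sketch of that loop on a convex one-dimensional cost (the function f(w) = (w - 3)^2, the starting point, and the step size are arbitrary choices for illustration):

```python
# Sketch: gradient descent on the convex cost f(w) = (w - 3)^2,
# whose derivative f'(w) = 2 * (w - 3) is 0 at the global minimum w = 3.
def grad(w):
    return 2 * (w - 3)

w = 0.0            # arbitrary starting point
learning_rate = 0.1

for step in range(50):
    w -= learning_rate * grad(w)   # move in the negative gradient direction

print(f"converged near w = {w:.4f}")  # approaches 3, where the tangent is flat
```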

Some cost functions are convex whereas others are non-convex. A convex function has only a global minimum; there are no other local minima. When the cost depends on several parameters, partial derivatives (one per parameter) take the place of the ordinary derivative.
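
To make the partial-derivative step concrete, here is a hedged sketch with a two-parameter convex cost (the function below is just an example, chosen so the minimum is easy to verify by hand):

```python
# Sketch: partial derivatives of the two-parameter convex cost
# f(a, b) = (a - 1)^2 + (b + 2)^2, which is minimized at (1, -2).
def partial_a(a, b):
    return 2 * (a - 1)   # d f / d a, holding b fixed

def partial_b(a, b):
    return 2 * (b + 2)   # d f / d b, holding a fixed

a, b = 0.0, 0.0
lr = 0.1
for _ in range(100):
    # The gradient is the vector of partial derivatives; step against it.
    a, b = a - lr * partial_a(a, b), b - lr * partial_b(a, b)

print(f"minimum near a = {a:.3f}, b = {b:.3f}")  # approaches (1, -2)
```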

If there are no local optima, initialization doesn't matter. If there are local optima, initialization does matter, because the solution can get stuck in the local minimum nearest to the initialization point.
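
To see initialization mattering, here is a sketch on a non-convex cost with two minima (the specific polynomial is an assumption, picked so the two basins are easy to see):

```python
# Sketch: gradient descent on the non-convex cost f(w) = (w^2 - 1)^2,
# which has two minima, at w = -1 and w = +1.
def grad(w):
    return 4 * w * (w ** 2 - 1)

def descend(w, lr=0.01, steps=200):
    for _ in range(steps):
        w -= lr * grad(w)
    return w

# The minimum reached depends entirely on where we start.
print(f"init = -2.0 -> w = {descend(-2.0):+.3f}")  # settles near -1
print(f"init = +2.0 -> w = {descend(+2.0):+.3f}")  # settles near +1
```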

Smooth cost functions are differentiable everywhere, whereas non-smooth functions are generally not differentiable at every point. Hence, plain differentiation is generally not usable for non-smooth cost functions, and alternatives such as subgradients are used instead.
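
As a sketch of one such alternative (the cost f(w) = |w - 2|, the starting point, and the step schedule are illustrative assumptions): the kink at the minimum has no unique derivative, but any slope between the left and right slopes is a valid subgradient, and descending along it still works:

```python
# Sketch: subgradient descent on the non-smooth cost f(w) = |w - 2|,
# which is not differentiable at its minimum, w = 2.
def subgrad(w):
    if w > 2:
        return 1.0    # slope to the right of the kink
    if w < 2:
        return -1.0   # slope to the left of the kink
    return 0.0        # any value in [-1, 1] is a valid subgradient at the kink

w = 4.0
for step in range(100):
    lr = 0.5 / (step + 1)   # shrinking steps, since |subgradient| never decays
    w -= lr * subgrad(w)

print(f"near the minimum: w = {w:.3f}")  # hovers close to 2
```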
