Regularization

L1 vs L2

From the perspective of constraint

We can view the regularization term in the loss function as the Lagrange multiplier term in the method of Lagrange multipliers: minimizing the loss plus a weighted penalty on the weights corresponds to minimizing the loss subject to a constraint on the norm of the weights. The L1 and L2 constraints define differently shaped regions, as shown in the article below. In two dimensions the L2 constraint boundary is a circle, while the L1 constraint boundary is a diamond (a square rotated by 45 degrees) whose corners lie on the coordinate axes.
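To make the difference concrete, here is a minimal sketch of the penalized (Lagrangian) form on a toy problem. It assumes a simple quadratic loss 0.5*||w - w0||^2, where w0 is the unregularized solution, because both penalized minimizers then have closed forms. The function names, w0, and lam are illustrative choices and do not come from the original notes.

```python
import numpy as np

def l2_penalized_solution(w0, lam):
    # Minimizer of 0.5*||w - w0||^2 + 0.5*lam*||w||_2^2.
    # Every coordinate is scaled toward zero by the same factor.
    return w0 / (1.0 + lam)

def l1_penalized_solution(w0, lam):
    # Minimizer of 0.5*||w - w0||^2 + lam*||w||_1 (soft-thresholding).
    # Coordinates with |w0_i| <= lam are set exactly to zero.
    return np.sign(w0) * np.maximum(np.abs(w0) - lam, 0.0)

# Hypothetical unregularized solution and penalty weight.
w0 = np.array([2.0, 0.3])
lam = 0.5

w_l2 = l2_penalized_solution(w0, lam)
w_l1 = l1_penalized_solution(w0, lam)

# Each penalized solution lies on the boundary of a constraint region:
# a circle of radius ||w_l2||_2 for L2, a diamond of size ||w_l1||_1 for L1.
print("L2 solution:", w_l2, "circle radius:", np.linalg.norm(w_l2, 2))
print("L1 solution:", w_l1, "diamond size:", np.linalg.norm(w_l1, 1))
```

In this toy case the L1 solution lands on a corner of the diamond, so its small coordinate is exactly zero, while the L2 solution only shrinks both coordinates.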

From the perspective of gradient penalty
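
One way to read this heading, assumed here, is to compare the gradients that each penalty contributes to a weight update. The sketch below uses the penalties 0.5*lam*||w||_2^2 and lam*||w||_1. The function names and sample values are illustrative and do not come from the original notes.

```python
import numpy as np

def l2_penalty_grad(w, lam):
    # Gradient of 0.5*lam*||w||_2^2: proportional to w,
    # so the pull toward zero fades as a weight gets small.
    return lam * w

def l1_penalty_subgrad(w, lam):
    # Subgradient of lam*||w||_1: constant magnitude lam for any
    # nonzero weight (np.sign returns 0 at exactly w = 0).
    return lam * np.sign(w)

# Hypothetical weights: one large, one tiny, one negative.
w = np.array([2.0, 0.01, -0.5])
lam = 0.1

g_l2 = l2_penalty_grad(w, lam)
g_l1 = l1_penalty_subgrad(w, lam)

print("L2 penalty gradient:", g_l2)
print("L1 penalty subgradient:", g_l1)
```

Under this reading, the L1 penalty keeps pushing small weights toward zero with the same force, whereas the L2 penalty pushes them less and less, which is consistent with L1 tending to produce sparse weights.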
