Strong convexity is one of the most important concepts in optimization, especially for guaranteeing a linear convergence rate of many gradient decent based algorithms. In this post, which is mainly based on my course project report, I would like to present some useful results on strong convexity.
Let’s us first begin with the definition.
A differentiable function is strongly convex if
for some and all .
Note: Strong convexity doesn’t necessarily require the function to be differentiable, and the gradient is replaced by the sub-gradient when the function is non-smooth.
Intuitively speaking, strong convexity means that there exists a quartic lower bound on the growth of the function. This directly implies that a strong convex function is strictly convex since the quartic lower bound growth is of course strictly grater than the linear growth.
Although the definition in is commonly used, it would be quite useful for us to note that there are several equivalent conditions for strong convexity.
Equivalent Conditions of Strong Convexity
The following proposition gives equivalent conditions for strong convexity. The key insight behind this result and its proof is that we can relate a strongly-convex function () to another convex function (), which enables us to apply the equivalent conditions for a convex function to obtain the result.
Proposition The following conditions are all equivalent to the condition that a differentiable function is strongly-convex with constant .
It follows from the first-order condition for convexity of , i.e., is convex if and only if
It follows from the monotone gradient condition for convexity of , i.e., is convex if and only if
It simply follows from the definition of convexity, i.e., is convex if
Note: The equivalence still holds for non-smooth function with gradient replaced by sub-gradient. For a proof, please refer to my notes here.
Implications of Strong Convexity
There are also some other conditions that are implied by strong convexity while are not equivalent to strong convexity, which means that these conditions are more general than strong convexity.
Proposition For a continuously differentiable function , the following conditions are all implied by strong convexity (SC) condition.
Note: The condition (a) is often called Polyak-Lojasiewicz (PL) inequality which is quite useful for establishing linear convergence for a lot of algorithms.
SC (a): Taking minimization respect to on both sides of yields
Re-arranging it we have the PL-inequality.
SC (b): Using Cauchy-Schwartz on the equivalent condition (iii) in the last Proposition, we have
and dividing both sides by (assuming ) gives (a) (while (a) clearly holds if ).
SC (c): Let us consider the function . It is easy to see that is strongly-convex with the same since
where we again use the equivalent condition (iii) in the last Proposition for strong convexity.
Then, the condition (c) can be directly obtained by applying the PL-inequality to the strongly-convex function with
SC (d): Interchanging and in condition (c) yields,
Condition (d) is obtained by adding and condition (c).
We have already seen the application of condition (iii) in proving useful results in the last section. In this section, let us look at one simple application of condition (iv) for strong convexity.
Proposition Let where is strongly convex function and is a convex function, then is strongly convex function
where the first inequality follows from the fact that is strongly convex; the second inequality holds since is convex.
Note: Although the proof is simple, it has its own importance. For example, any L2-regularized problem of the form , where is convex and , is strongly convex.
Now, it’s time to take a break by appreciating the masterpiece of Monet.
courtesy of www.Claude-Monet.com