Gradient methods for minimizing composite functions.

*(English)*Zbl 1287.90067Let \(E\) be a finite-dimensional linear space with dual \(E^*\). The author provides new extended gradient methods for solving optimization problems of the form
\[
\phi(x)= f(x)+ \psi(x)\to\min,\quad\text{s.t. } x\in Q,
\]
i.e. optimization problems where the objective function \(\phi: E\to\mathbb{R}\) is the sum of a smooth (not necessary convex) function \(f\) and a convex (not necessary smooth) function \(\psi\). The set \(Q\subset E\) is assumed to be a convex set.

First it is pointed out that for general nonsmooth, nonconvex functions even resolving the question of whether a descent direction from a point exists is NP-hard. However, for the above-mentioned special form of the objective function, this problem is better solvable using the so-called composite gradient mapping \(g_L: E\to E^*\) introduced by \[ \begin{gathered} T_L(y):= \text{argmin}\Biggl\{f(y)+ \langle\nabla f(y),x-y\rangle+{L\over 2}\| x-y\|^2+ \psi(x)\mid x\in Q\Biggr\},\\ g_L(y):= L\cdot B(y- T_L(y)),\end{gathered} \] where \(B: E\to E^*\) is a fixed positive definite self-adjoint operator which defines the norm \(\| h\|=\langle Bh,h\rangle^{1/2}\) and \(L\) is a positive constant. That means that the objective of the composite gradient mapping is the sum of the objective of the known gradient mapping for smooth problems and the nonsmooth convex term.

After the discussion of this mapping, some generalized gradient methods for the optimization problem are presented. It is shown that in convex and nonconvex cases there are exactly the same complexity results as in the usual smooth situation \((\psi=0)\). At the end of the paper, the author gives some examples of applications and some computational results of the proposed methods.

First it is pointed out that for general nonsmooth, nonconvex functions even resolving the question of whether a descent direction from a point exists is NP-hard. However, for the above-mentioned special form of the objective function, this problem is better solvable using the so-called composite gradient mapping \(g_L: E\to E^*\) introduced by \[ \begin{gathered} T_L(y):= \text{argmin}\Biggl\{f(y)+ \langle\nabla f(y),x-y\rangle+{L\over 2}\| x-y\|^2+ \psi(x)\mid x\in Q\Biggr\},\\ g_L(y):= L\cdot B(y- T_L(y)),\end{gathered} \] where \(B: E\to E^*\) is a fixed positive definite self-adjoint operator which defines the norm \(\| h\|=\langle Bh,h\rangle^{1/2}\) and \(L\) is a positive constant. That means that the objective of the composite gradient mapping is the sum of the objective of the known gradient mapping for smooth problems and the nonsmooth convex term.

After the discussion of this mapping, some generalized gradient methods for the optimization problem are presented. It is shown that in convex and nonconvex cases there are exactly the same complexity results as in the usual smooth situation \((\psi=0)\). At the end of the paper, the author gives some examples of applications and some computational results of the proposed methods.

Reviewer: Jörg Thierfelder (Ilmenau)

##### MSC:

90C30 | Nonlinear programming |

90C25 | Convex programming |

90C32 | Fractional programming |

65K05 | Numerical mathematical programming methods |

##### Software:

PDCO
PDF
BibTeX
XML
Cite

\textit{Yu. Nesterov}, Math. Program. 140, No. 1 (B), 125--161 (2013; Zbl 1287.90067)

Full Text:
DOI

**OpenURL**

##### References:

[1] | Chen, S; Donoho, D; Saunders, M, Atomic decomposition by basis pursuit, SIAM J. Sci. Comput., 20, 33-61, (1998) · Zbl 0919.94002 |

[2] | Claerbout, J; Muir, F, Robust modelling of eratic data, Geophysics, 38, 826-844, (1973) |

[3] | Figueiredo, M., Novak, R., Wright, S.J.: Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems. Submitted for publication |

[4] | Fukushima, M; Mine, H, A generalized proximal point algorithm for certain nonconvex problems, Int. J. Sys. Sci., 12, 989-1000, (1981) · Zbl 0467.65028 |

[5] | Kim, S.-J., Koh, K., Lustig, M., Boyd, S., Gorinevsky, D.: A method for large-scale \(l_1\)-regularized least-squares problems with applications in signal processing and statistics. Stanford University, March 20, Research report (2007) |

[6] | Levy, S; Fullagar, P, Reconstruction of a sparse spike train from a portion of its spectrum and application to high-resolution deconvolution, Geophysics, 46, 1235-1243, (1981) |

[7] | Miller, A.: Subset Selection in Regression. Chapman and Hall, London (2002) · Zbl 1051.62060 |

[8] | Nemirovsky, A., Yudin, D.: Informational Complexity and Efficient Methods for Solution of Convex Extremal Problems. Wiley, New-York (1983) |

[9] | Nesterov, Y.: Introductory Lectures on Convex Optimization. Kluwer, Boston (2004) · Zbl 1086.90045 |

[10] | Nesterov, Y, Smooth minimization of non-smooth functions, Math. Program. (A), 103, 127-152, (2005) · Zbl 1079.90102 |

[11] | Nesterov, Y.: Gradient methods for minimizing composite objective function. CORE Discussion Paper \(\#\) 2007/76, CORE (2007) |

[12] | Nesterov, Y, Rounding of convex sets and efficient gradient methods for linear programming problems, Optim. Methods Softw., 23, 109-135, (2008) · Zbl 1192.90119 |

[13] | Nesterov, Y, Accelerating the cubic regularization of newton’s method on convex problems, Math. Program., 112, 159-181, (2008) · Zbl 1167.90013 |

[14] | Nesterov, Y., Nemirovskii, A.: Interior Point Polynomial Methods in Convex Programming: Theory and Applications. SIAM, Philadelphia (1994) · Zbl 0824.90112 |

[15] | Ortega, J., Rheinboldt, W.: Iterative Solution of Nonlinear Equations in Several Variables. Academic Press, New York (1970) · Zbl 0241.65046 |

[16] | Santosa, F; Symes, W, Linear inversion of band-limited reflection histograms, SIAM J. Sci. Stat. Comput., 7, 1307-1330, (1986) · Zbl 0602.73113 |

[17] | Taylor, H; Bank, S; McCoy, J, Deconvolution with the \(l_1\) norm, Geophysics, 44, 39-52, (1979) |

[18] | Tibshirani, R, Regression shrinkage and selection via the lasso, J. R. Stat. Soc. B, 58, 267-288, (1996) · Zbl 0850.62538 |

[19] | Tropp, J, Just relax: convex programming methods for identifying sparse signals, IEEE Trans. Inf. Theory, 51, 1030-1051, (2006) · Zbl 1288.94025 |

[20] | Wright, S.J.: Solving \(l_{1}\)-Regularized Regression Problems. Talk at International Conference “Combinatorics and Optimization”, Waterloo (June 2007) · Zbl 1192.90119 |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.