8.4 Exploiting Structure

By default, the functions cp() and cpl() do not exploit problem structure. Two mechanisms are provided for implementing customized solvers that take advantage of problem structure.

Providing a function for solving KKT equations.
The most expensive step of each iteration of cp() is the solution of a set of linear equations (‘KKT equations’) of the form
⌊                 ⌋⌊     ⌋  ⌊    ⌋
  H  AT     G˜T        ux       bx
⌈ A   0      0    ⌉⌈  uy ⌉ = ⌈ by ⌉ ,
  ˜G   0   - W TW      uz       bz
(8.2)

where

    ∑m    2          ˜  [                       T ]T
H =    zk∇ fk(x),   G =   ∇f1 (x)  ⋅⋅⋅ ∇fm (x)  G     .
    k=0

The matrix W depends on the current iterates and is defined as follows. Suppose

u = (unl, ul, uq,0, ..., uq,M-1, vec(us,0), ..., vec(us,N- 1)), unl ∈ Rm,   ul ∈ Rl,   uq,k ∈ Rrk, k = 0,...,M - 1,  us,k ∈ Stk, k = 0,...,N - 1.

Then W is a block-diagonal matrix,

W u = (Wnlunl, Wlul, Wq,0uq,0, ..., Wq,M- 1uq,M- 1, Ws,0vec(us,0), ..., Ws,N-1vec (us,N-1))

with the following diagonal blocks.

It is often possible to exploit problem structure to solve (8.2) faster than by standard methods. The last argument kktsolver of cp() allows the user to supply a Python function for solving the KKT equations. This function will be called as ”f = kktsolver(x, z, W)”. The argument x is the point at which the derivatives in the KKT matrix are evaluated. z is a positive vector of length it m + 1, containing the coefficients in the 1,1 block H. W is a dictionary that contains the parameters of the scaling:

The function call ”f = kktsolver(x, z, W)” should return a routine for solving the KKT system (8.2) defined by x, z, W. It will be called as ”f(bx, by, bz)”. On entry, bx, by, bz contain the righthand side. On exit, they should contain the solution of the KKT system, with the last component scaled, i.e., on exit,

bx := ux,    by := uy,  bz := W uz.

The role of the argument kktsolver in the function cpl() is similar, except that in (8.2),

     m∑-1                  [                           ]
H  =    zk∇2fk(x),    ˜G =  ∇f0 (x) ⋅⋅⋅ ∇fm - 1(x)  GT  T .
     k=0

Specifying constraints via Python functions.
In the default use of cp(), the arguments G and A are the coefficient matrices in the constraints of (8.2). It is also possible to specify these matrices by providing Python functions that evaluate the corresponding matrix-vector products and their adjoints.

If the argument G of conelp() is a Python function, it should be defined as follows:

G(u, v [, alpha[, beta[, trans]]])

This evaluates the matrix-vector products

v := αGu  + βv  (trans = ′N′),   v := αGT u +βv   (trans = ′T′).

The default values of the optional arguments must be alpha = 1.0, beta = 0.0, trans = ’N’.

Similarly, if the argument A is a Python function, then it must be defined as follows.

A(u, v [, alpha[, beta[, trans]]])

This evaluates the matrix-vector products

v :=  αAu + βv  (trans = ′N′),   v := αAT u +βv  (trans = ′T′).

The default values of the optional arguments must be alpha = 1.0, beta = 0.0, trans = ’N’.

In a similar way, when the first argument F() of cp() returns matrices of first derivatives or second derivatives Df, H, these matrices can be specified as Python functions. If Df is a Python function, it should be defined as follows:

Df(u, v [, alpha[, beta[, trans]]])

This evaluates the matrix-vector products

v := αDf (x)u + βv (trans = ′N′),   v := αDf (x)Tu+ βv  (trans = ′T ′).

The default values of the optional arguments must be alpha = 1.0, beta = 0.0, trans = ’N’.

If H is a Python function, it should be defined as follows:

H(u, v [, alpha[, beta]])

This evaluates the matrix-vector product

v := αHu  +βv.

The default values of the optional arguments must be alpha = 1.0, beta = 0.0, trans = ’N’.

If G, A, Df, or H are Python functions, then the argument kktsolver must also be provided.

As an example, we consider the unconstrained problem

                      2  ∑n          2
minimize  (1∕2)∥Ax - b∥2 -  i=1 log(1- xi)

where A is an m by n matrix with m less than n. The Hessian of the objective is diagonal plus a low-rank term:

                           2(1 + x2)
H = AT A + diag(d),   di = -----2i2-.
                           (1 - xi)

We can exploit this property when solving (8.2) by applying the matrix inversion lemma. We first solve

(Adiag (d)-1AT + I)v = (1∕z )Adiag(d)-1b ,
                         0            x

and then obtain

ux = diag (d)-1(bx∕z0 - ATv).

The following code follows this method. It also uses BLAS functions for matrix-matrix and matrix-vector products.

from cvxopt import matrix, spdiag, mul, div, log, blas, lapack, solvers, base  
 
def l2ac(A, b):  
    """  
    Solves  
 
        minimize  (1/2) * ||A*x-b||_2^2 - sum log (1-xi^2)  
 
    assuming A is m x n with m << n.  
    """  
 
    m, n = A.size  
    def F(x = None, z = None):  
        if x is None:  
            return 0, matrix(0.0, (n,1))  
        if max(abs(x)) >= 1.0:  
            return None  
        # r = A*x - b  
        r = -b  
        blas.gemv(A, x, r, beta = -1.0)  
        w = x**2  
        f = 0.5 * blas.nrm2(r)**2  - sum(log(1-w))  
        # gradf = A’*r + 2.0 * x ./ (1-w)  
        gradf = div(x, 1.0 - w)  
        blas.gemv(A, r, gradf, trans = ’T’, beta = 2.0)  
        if z is None:  
            return f, gradf.T  
        else:  
            def Hf(u, v, alpha = 1.0, beta = 0.0):  
               # v := alpha * (A’*A*u + 2*((1+w)./(1-w)).*u + beta *v  
               v *= beta  
               v += 2.0 * alpha * mul(div(1.0+w, (1.0-w)**2), u)  
               blas.gemv(A, u, r)  
               blas.gemv(A, r, v, alpha = alpha, beta = 1.0, trans = ’T’)  
            return f, gradf.T, Hf  
 
 
    # Custom solver for the Newton system  
    #  
    #     z[0]*(A’*A + D)*x = bx  
    #  
    # where D = 2 * (1+x.^2) ./ (1-x.^2).^2.  We apply the matrix inversion  
    # lemma and solve this as  
    #  
    #     (A * D^-1 *A’ + I) * v = A * D^-1 * bx / z[0]  
    #     D * x = bx / z[0] - A’*v.  
 
    S = matrix(0.0, (m,m))  
    v = matrix(0.0, (m,1))  
    def Fkkt(x, z, W):  
        ds = (2.0 * div(1 + x**2, (1 - x**2)**2))**-0.5  
        Asc = spdiag(ds) * A  
        blas.syrk(Asc, S)  
        S[::m+1] += 1.0  
        lapack.potrf(S)  
        a = z[0]  
        def g(x, y, z):  
            x[:] = mul(x, ds) / a  
            blas.gemv(Asc, x, v)  
            lapack.potrs(S, v)  
            blas.gemv(Asc, v, x, alpha = -1.0, beta = 1.0, trans = ’T’)  
            x[:] = mul(x, ds)  
        return g  
 
    return solvers.cp(F, kktsolver = Fkkt)[’x’]