{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Working with Callback Functions\n",
"\n",
"MIP solvers are complicated combinations of many techniques: cutting planes, heuristics, branching rules, etc.\n",
"\n",
"Some solvers allow you to customize aspects of the solve process in a deeper way than just setting options for these parameters. You can provide code to be run when certain events happen, and the solver **calls back** to these functions to ask what action(s) should be taken. Why might you want to do this? Some situations:\n",
"\n",
"* you might want to extract the intermediate solutions found in the branch&bound tree, or other useful information during the solution process, and not work only with the optimal solution.\n",
"\n",
"* you might not want to enumerate all of your constraints at the onset, and enforce certain constraints only when they're needed.\n",
"\n",
"We'll explore these cases in this notebook."
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"## Warm-Up: Linear Regression and its variants"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"srand(123)\n",
"\n",
"n = 20\n",
"p = 5\n",
"\n",
"real_x = 10*rand(p)\n",
"A = rand(n,p)\n",
"b = A*real_x + rand(n);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We start with linear regression: suppose we have a data matrix $A$, and vector $b$, and we want to estimate a vector $x$ such that:\n",
"\n",
"$$\n",
"\\underset{x}{\\min}\\ || Ax-b ||_2^2\n",
"$$\n",
"\n",
"One way of expressing it in JuMP is through the following formulation:\n",
"\n",
"$$\n",
"\\underset{x}{\\min}\\ \\sum_i z_i^2 \\\\\n",
"\\text{s.t.}\\quad z_i = a_i^\\top x - b_i \\quad\\forall i\n",
"$$\n",
"\n",
"where $a_i^\\top$ is the $i$-th row of the data matrix $A$."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Optimize a model with 20 rows, 25 columns and 120 nonzeros\n",
"Model has 20 quadratic objective terms\n",
"Coefficient statistics:\n",
" Matrix range [9e-03, 1e+00]\n",
" Objective range [0e+00, 0e+00]\n",
" QObjective range [2e+00, 2e+00]\n",
" Bounds range [0e+00, 0e+00]\n",
" RHS range [9e+00, 2e+01]\n",
"Presolve time: 0.01s\n",
"Presolved: 20 rows, 25 columns, 120 nonzeros\n",
"Presolved model has 20 quadratic objective terms\n",
"Ordering time: 0.00s\n",
"\n",
"Barrier statistics:\n",
" Free vars : 25\n",
" AA' NZ : 1.900e+02\n",
" Factor NZ : 2.100e+02\n",
" Factor Ops : 2.870e+03 (less than 1 second per iteration)\n",
" Threads : 1\n",
"\n",
" Objective Residual\n",
"Iter Primal Dual Primal Dual Compl Time\n",
" 0 4.54321690e+00 -4.54321690e+00 7.11e-15 1.09e+00 0.00e+00 0s\n",
" 1 2.19649750e+00 2.19650455e+00 1.05e-09 1.09e-06 0.00e+00 0s\n",
" 2 2.19649750e+00 2.19649750e+00 3.55e-15 1.97e-12 0.00e+00 0s\n",
"\n",
"Barrier solved model in 2 iterations and 0.01 seconds\n",
"Optimal objective 2.19649750e+00\n",
"\n"
]
},
{
"data": {
"text/plain": [
":Optimal"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"using JuMP, Gurobi\n",
"\n",
"m = Model(solver=GurobiSolver())\n",
"@variable(m, x[1:p])\n",
"@variable(m, z[1:n])\n",
"\n",
"@objective(m, Min, sum(z[i]^2 for i in 1:n))\n",
"\n",
"@constraint(m, [i=1:n], z[i] == sum(A[i,j]*x[j] for j in 1:p) - b[i])\n",
" \n",
"solve(m)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"5-element Array{Float64,1}:\n",
" 7.85979\n",
" 9.8411 \n",
" 6.95982\n",
" 3.84803\n",
" 3.20848"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"getvalue(x)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"5-element Array{Float64,1}:\n",
" 7.68448\n",
" 9.40515\n",
" 6.73959\n",
" 3.95453\n",
" 3.13244"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"real_x"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Discussion**: When might you want to do this yourself (in JuMP/etc), versus using a opensource/commercial package?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
">**\\[Exercise\\]**: Non-negative Least Squares\n",
"\n",
"> How might we modify the formulation above to solve for non-negative least squares, i.e. $\\underset{x\\geq 0}{\\min}\\ || Ax-b ||_2^2$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
">**\\[Exercise\\]**: Sparse Linear Regression\n",
"\n",
"> How might we modify the formulation above if we know at most $k$ of the coefficients are non-zero? i.e. $\\underset{x: ||x||_0\\leq k}{\\min}\\ || Ax-b ||_2^2$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Discussion**: Do you know of other ways of getting sparse (as in low number of non-zero coefficients) solutions in a linear regression setting?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Extracting Intermediate Solutions\n",
"Rather than having to parse the solver log, sometimes querying for information (amongst other things) might be useful for you to do convergence plots (or perform other diagnostics). Informational callbacks are added to a JuMP model with the `addinfocallback(m::Model, f::Function; when::Symbol)` function, where the `when` argument should be one of `:MIPNode`, `:MIPSol` or `:Intermediate` (listed under `cbgetstate()` in the [MathProgBase documentation](https://mathprogbasejl.readthedocs.io/en/latest/callbacks.html)).\n",
"\n",
"Suppose we wish to extract all the incumbent solutions generated during the branch&bound process. Here's how we can modify the code above to do so:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Optimize a model with 31 rows, 30 columns and 145 nonzeros\n",
"Model has 20 quadratic objective terms\n",
"Variable types: 25 continuous, 5 integer (5 binary)\n",
"Coefficient statistics:\n",
" Matrix range [9e-03, 1e+04]\n",
" Objective range [0e+00, 0e+00]\n",
" QObjective range [2e+00, 2e+00]\n",
" Bounds range [1e+00, 1e+00]\n",
" RHS range [2e+00, 2e+01]\n",
"Presolve removed 5 rows and 0 columns\n",
"Presolve time: 0.00s\n",
"Presolved: 26 rows, 30 columns, 135 nonzeros\n",
"Presolved model has 20 quadratic objective terms\n",
"Variable types: 25 continuous, 5 integer (5 binary)\n",
"\n",
"Root relaxation: objective 2.196497e+00, 36 iterations, 0.00 seconds\n",
"\n",
" Nodes | Current Node | Objective Bounds | Work\n",
" Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time\n",
"\n",
" 0 0 2.19650 0 5 - 2.19650 - - 0s\n",
"H 0 0 415.3988979 2.19650 99.5% - 0s\n",
"H 0 0 356.2185398 2.19650 99.4% - 0s\n",
" 0 2 2.19650 0 5 356.21854 2.19650 99.4% - 0s\n",
"* 8 2 3 189.0359040 149.35562 21.0% 2.8 0s\n",
"\n",
"Explored 12 nodes (64 simplex iterations) in 0.06 seconds\n",
"Thread count was 4 (of 4 available processors)\n",
"\n",
"Solution count 3: 189.036 356.219 415.399 \n",
"Pool objective bound 189.036\n",
"\n",
"Optimal solution found (tolerance 1.00e-04)\n",
"Best objective 1.890359040380e+02, best bound 1.890359040380e+02, gap 0.0000%\n"
]
},
{
"data": {
"text/plain": [
":Optimal"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"using JuMP, Gurobi\n",
"\n",
"M = 10000\n",
"k = 2\n",
"\n",
"m = Model(solver=GurobiSolver())\n",
"@variable(m, x[1:p] >= 0)\n",
"@variable(m, y[1:p], Bin)\n",
"@variable(m, z[1:n])\n",
"\n",
"@objective(m, Min, sum(z[i]^2 for i in 1:n))\n",
"\n",
"@constraint(m, [i=1:n], z[i] == sum(A[i,j]*x[j] for j in 1:p) - b[i])\n",
"@constraint(m,[j=1:p], x[j] <= M*y[j])\n",
"@constraint(m,[j=1:p], x[j] >= -M*y[j])\n",
"@constraint(m, sum(y[j] for j in 1:p) <= k)\n",
"\n",
"### --- NEW ---\n",
"solutionvalues = []\n",
"function solncallback(cb)\n",
" push!(solutionvalues, JuMP.getvalue(x))\n",
"end\n",
"addinfocallback(m, solncallback, when = :MIPSol)\n",
"### END OF NEW\n",
"\n",
"solve(m)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can inspect the incumbent solutions generated:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"3-element Array{Any,1}:\n",
" [0.0,0.0,0.0,13.563,11.1933] \n",
" [13.5941,16.1585,0.0,0.0,0.0]\n",
" [0.0,17.6869,12.7185,0.0,0.0]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"solutionvalues"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As well as their corresponding objective values:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"3-element Array{Float64,1}:\n",
" 415.399\n",
" 356.219\n",
" 189.036"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"[norm(A*sol_x-b)^2 for sol_x in solutionvalues]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Remark**: Can you identify the objective values of the incumbent solution in the solver log above? Which of the incumbent solutions were found via heuristics, and which ones were found via branch&bound?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
">**\\[Exercise\\]**: Information Callbacks\n",
"\n",
"> How might you organize the code such that you can extract\n",
"\n",
"> - the objective function value of the current solution\n",
"> - the current best bound on the optimal solution\n",
"> - the current time taken since the start of the solution process\n",
"\n",
"> (Hint: http://www.juliaopt.org/JuMP.jl/0.15/callbacks.html#informational-callbacks)"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"## Modelling using Lazy Constraints\n",
"\n",
"### Motivation\n",
"**Question**: Now suppose we wish to solve the sparse linear regression problem\n",
"\n",
"$$\n",
"\\underset{x}{\\min}\\ \\sum_i z_i^2 \\\\\n",
"\\text{s.t.}\\quad z_i = a_i^\\top x - b \\quad\\forall i \\\\\n",
"x_j \\leq My_j \\\\\n",
"x_j \\geq -My_j \\\\\n",
"\\sum_{j=1}^p y_j \\leq k \\\\\n",
"y_j\\in\\{0,1\\}\\quad\\forall j=1,\\dots,p\n",
"$$\n",
"\n",
"but as a sequence of mixed integer linear programs, rather than a single mixed integer quadratic program. How might we go about doing so?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Answer**: ![](outerapprox.jpg)\n",
"(image source: http://www.mit.edu/~dimitrib/Polyhedral_Approx_Tsinghua.pdf)\n",
"\n",
"In general, we might want to perform a \"linearization\" of some (often convex) function, where we perform a succession of \"outer-approximations\" that converges towards the underlying function we wish to optimize over."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Remark**: For readers who are interested in cutting-plane methods, the following might be a good reference: https://web.stanford.edu/class/ee392o/localization-methods.pdf"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Discussion**: What are potential issues we might run into, if we wish to implement this?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Description\n",
"Lazy constraints are useful when the full set of constraints is too large to explicitly include in the initial formulation. We might want to modify the MIP solution process such that when we have a new solution (for example with a heuristic or by solving a problem at a node in the branch-and-bound tree), we are given the opportunity to generate additional constraint(s). This allows us to generate them only upon demand, and stop the process based on our termination criteria.\n",
"\n",
"There are three important steps to providing a lazy constraint callback:\n",
"\n",
"1. we must write a function that will analyze the current solution that takes a single argument, e.g. `mylazycongenerator(cb)`, where cb is a reference to the callback management code inside JuMP.\n",
"2. do whatever analysis of the solution you need to inside your function to generate the new constraint before adding it to the model with `@lazyconstraint(cb, myconstraint)` (instead of the usual `@constraint(m, myconstraint)`).\n",
"3. finally we notify JuMP that this function should be used for lazy constraint generation using the `addlazycallback(m, mylazycongenerator)` function before we call `solve(m)`.\n",
"\n",
"For more information, see http://www.juliaopt.org/JuMP.jl/0.15/callbacks.html#lazy-constraints"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Task\n",
"Now, we'd like to implementing sparse linear regression using lazy constraints, instead of formulating it as a quadratic optimization problem. To do so, we first represent the objective function\n",
"\n",
"$$\n",
"f(z) = ||z||_2^2\n",
"$$\n",
"\n",
"as the pointwise maximum of affine functions:\n",
"\n",
"$$\n",
"f(z) = \\underset{\\beta}{\\sup} ||\\beta||_2^2 + 2\\beta^\\top(z-\\beta)\n",
"$$\n",
"\n",
"(or equivalently)\n",
"\n",
"$$\n",
"f(z) = \\min\\ \\eta \\\\\n",
"\\text{s.t.}\\ \\ \\eta \\geq ||\\beta||_2^2 + 2\\beta^\\top(z-\\beta) \\quad\\forall\\beta\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Code\n",
"So our JuMP model becomes:\n",
"\n",
"$$\n",
"\\underset{x}{\\min}\\ \\eta \\\\\n",
"\\text{s.t.}\\quad \n",
"\\eta \\geq ||\\beta||_2^2 + 2\\beta^\\top(z-\\beta) \\quad\\forall \\beta \\\\\n",
"z_i = a_i^\\top x - b \\quad\\forall i \\\\\n",
"x_j \\leq My_j \\\\\n",
"x_j \\geq -My_j \\\\\n",
"\\sum_{j=1}^p y_j \\leq k \\\\\n",
"y_j\\in\\{0,1\\}\\quad\\forall j=1,\\dots,p\n",
"$$\n",
"\n",
"where we represent the set of (infinite) constraints:\n",
"\n",
"$$\n",
"\\eta \\geq ||\\beta||_2^2 + 2\\beta^\\top(z-\\beta) \\quad\\forall \\beta\n",
"$$\n",
"\n",
"using lazy constraints."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
">**\\[Exercise\\]**: Lazy Callbacks\n",
"\n",
"> Implement the above formulation in JuMP (reference: http://www.juliaopt.org/JuMP.jl/0.15/callbacks.html#lazy-constraints)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"using JuMP, Gurobi\n",
"\n",
"M = 10000\n",
"k = 2\n",
"\n",
"m = Model(solver=GurobiSolver())\n",
"@variable(m, x[1:p] >= 0)\n",
"@variable(m, y[1:p], Bin)\n",
"@variable(m, z[1:n])\n",
"@variable(m, eta >= 0)\n",
"\n",
"@objective(m, Min, eta)\n",
"\n",
"@constraint(m, [i=1:n], z[i] == sum(A[i,j]*x[j] for j in 1:p) - b[i])\n",
"@constraint(m,[j=1:p], x[j] <= M*y[j])\n",
"@constraint(m,[j=1:p], x[j] >= -M*y[j])\n",
"@constraint(m, sum(y[j] for j in 1:p) <= k)\n",
"\n",
"### --- Your solution here ---\n",
"function lazyleastsqs(cb)\n",
"end\n",
"addlazycallback(m, lazyleastsqs)\n",
"### --- End of solution ---\n",
"\n",
"solve(m)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Discussion**: So far, we've been looking at modelling a convex function as the pointwise maximum of a (possibly infinite) set of affine functions in the objective function. Can you anticipate other optimization problems that might be usefully modelled using a large (possibly infinite) number of linear constraints?\n",
"\n",
"See for e.g. Robust Portfolio Optimization and Travelling Salesman by Iain in [last year's IAP class](https://github.com/joehuchette/OR-software-tools-2015/blob/master/7-adv-optimization/Callbacks.ipynb)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Discussion**: Sometimes a good polyhedral outer approximation might need too many linear constraints, and might benefit from \"extended formulations\". See http://www.mit.edu/~mlubin/micp-cribb.pdf (from slides 19 onwards) for some details."
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Julia 0.5.0",
"language": "julia",
"name": "julia-0.5"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "0.5.0"
}
},
"nbformat": 4,
"nbformat_minor": 0
}