Question
I'm trying to solve the following problem: I have a large number (~80,000) of surface patches of an organ that is growing. I measure the area of each patch over time (18 time points) and want to fit a growth curve to it (a bi-logistic model, i.e. the sum of two logistic functions, because there are two "growth spurts" in the observed period).
I have box constraints to ensure that the exponential terms don't explode and a linear constraint that one growth spurt has to happen after the other. Also, in order to enforce some sort of spatial continuity in the fitted parameters, I add a penalty term to the objective function (least squares) given as the squared sum of differences between (some of the) parameters of neighbouring patches, so the individual model fits are not independent anymore.
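One way to write down the problem described above, as a sketch under assumptions rather than the asker's exact formulation: each patch $i$ has measured areas $y_{i,k}$ at times $t_k$, $k = 1,\dots,18$, and a parameter vector $\theta_i$. The symbols $a$, $r$, $\tau$ (amplitude, rate, midpoint), the penalty weight $\lambda$, the neighbour set $\mathcal{N}$ and the selection matrix $P$ (which picks "some of the" parameters) are all hypothetical names introduced here. The penalty is written as a sum of squared differences, which is one reading of the description.

```latex
% Bi-logistic model for patch i (assumed six-parameter form)
A_i(t;\theta_i) =
  \frac{a_{i,1}}{1 + \exp\!\bigl(-r_{i,1}\,(t - \tau_{i,1})\bigr)}
+ \frac{a_{i,2}}{1 + \exp\!\bigl(-r_{i,2}\,(t - \tau_{i,2})\bigr)}

% Coupled least-squares objective with a spatial smoothness penalty
\min_{\theta}\;
  \sum_{i} \sum_{k=1}^{18} \bigl( A_i(t_k;\theta_i) - y_{i,k} \bigr)^2
  \;+\; \lambda \sum_{(i,j)\in\mathcal{N}} \bigl\| P\,(\theta_i - \theta_j) \bigr\|_2^2

% Box constraints and the ordering of the two growth spurts
\text{s.t.}\quad \ell \le \theta_i \le u, \qquad \tau_{i,1} - \tau_{i,2} \le 0 \quad \text{for all } i
```

The penalty term is what couples neighbouring patches, so the Hessian is sparse but not block-diagonal. The log further down reports 8960 variables and 1280 inequality constraints with two Jacobian nonzeros each, which would be consistent with one ordering constraint per patch and seven parameters per patch, so the actual model presumably carries a parameter beyond the six shown here.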
I have gradients and the Hessian of the whole thing, and I solve it at increasingly finer scales: starting with the overall surface area, subdividing, mapping the parameters of the global model fit to each patch, running the solver again, subdividing again, and so on until I reach my desired resolution.
Optimization using IPOPT works, but it is terribly slow, and since I'm new to optimization theory I was wondering whether I'm doing something wrong in the settings. I'm using ma86 as the linear solver, compiled against OpenBLAS, with METIS reordering, running on a machine with 256 GB RAM and 56 cores. The other settings I use are:
% nlp scaling
solv_options.ipopt.nlp_scaling_method = 'gradient-based';
solv_options.ipopt.nlp_scaling_max_gradient = 1;
solv_options.ipopt.nlp_scaling_min_value = 1e-16;
solv_options.ipopt.bound_mult_init_method='constant';
% Barrier Parameter
solv_options.ipopt.mu_strategy = 'adaptive';
solv_options.ipopt.mu_oracle = 'quality-function';
solv_options.ipopt.fixed_mu_oracle = 'average_compl';
solv_options.ipopt.adaptive_mu_globalization = 'kkt-error';
solv_options.ipopt.corrector_type = 'affine';
% linear solver
solv_options.ipopt.max_soc=0;
solv_options.ipopt.accept_every_trial_step='yes';
solv_options.ipopt.linear_system_scaling = 'none';
solv_options.ipopt.neg_curv_test_tol = 0;
solv_options.ipopt.neg_curv_test_reg = 'yes';
solv_options.ipopt.max_refinement_steps=0;
solv_options.ipopt.min_refinement_steps=0;
% ma86 settings
solv_options.ipopt.linear_solver='ma86';
solv_options.ipopt.ma86_order='auto';
solv_options.ipopt.ma86_scaling='mc64';
solv_options.ipopt.ma86_small=1e-10;
solv_options.ipopt.ma86_static=1;
solv_options.ipopt.recalc_y='yes';
This gives me output like the following:
This is Ipopt version 3.12, running with linear solver ma86.
Number of nonzeros in equality constraint Jacobian...: 0
Number of nonzeros in inequality constraint Jacobian.: 2560
Number of nonzeros in Lagrangian Hessian.............: 112280
Total number of variables............................: 8960
variables with only lower bounds: 2560
variables with lower and upper bounds: 3840
variables with only upper bounds: 2560
Total number of equality constraints.................: 0
Total number of inequality constraints...............: 1280
inequality constraints with only lower bounds: 0
inequality constraints with lower and upper bounds: 0
inequality constraints with only upper bounds: 1280
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
0 9.8260736e-01 0.00e+00 1.19e-02 0.0 0.00e+00 - 0.00e+00 0.00e+00 0
1 9.6288666e-01 0.00e+00 1.24e-02 -4.5 1.14e-02 0.0 9.91e-01 1.00e+00f 1
2 9.1582880e-01 0.00e+00 1.16e-02 -4.4 2.72e-02 -0.5 1.00e+00 1.00e+00f 1
3 8.2635857e-01 0.00e+00 1.01e-02 -4.6 1.39e-01 -1.0 9.99e-01 1.00e+00f 1
4 7.8943781e-01 0.00e+00 9.40e-03 -4.9 2.70e-02 -0.5 1.00e+00 1.00e+00f 1
5 7.2123624e-01 0.00e+00 8.12e-03 -5.3 8.70e-02 -1.0 1.00e+00 1.00e+00f 1
6 6.9535003e-01 0.00e+00 7.56e-03 -6.1 2.20e-02 -0.6 1.00e+00 9.06e-01f 1
7 6.6635914e-01 0.00e+00 7.00e-03 -6.6 6.21e-02 -1.1 1.00e+00 5.40e-01f 1
8 6.5683787e-01 0.00e+00 6.78e-03 -7.6 2.14e-02 -0.6 1.00e+00 3.81e-01f 1
9 6.4238130e-01 0.00e+00 6.53e-03 -7.7 1.53e-01 -1.1 1.00e+00 2.90e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
10 6.3440625e-01 0.00e+00 6.33e-03 -8.7 2.20e-02 -0.7 1.00e+00 3.30e-01f 1
11 6.2431009e-01 0.00e+00 6.16e-03 -8.3 3.08e-01 -1.2 1.00e+00 2.04e-01f 1
12 6.1872460e-01 0.00e+00 6.05e-03 -8.8 2.29e-02 -0.7 1.00e+00 2.32e-01f 1
13 6.0753815e-01 0.00e+00 5.86e-03 -8.9 2.73e-01 -1.2 1.00e+00 2.31e-01f 1
14 6.0575477e-01 0.00e+00 5.82e-03 -9.9 2.60e-02 -0.8 1.00e+00 7.44e-02f 1
15 6.0089103e-01 0.00e+00 5.71e-03 -11.0 1.11e-02 -0.4 1.00e+00 4.48e-01f 1
16 5.9426852e-01 0.00e+00 5.58e-03 -11.0 3.14e-02 -0.8 1.00e+00 2.68e-01f 1
17 5.9175418e-01 0.00e+00 5.52e-03 -11.0 1.17e-02 -0.4 1.00e+00 2.26e-01f 1
18 5.8756155e-01 0.00e+00 5.44e-03 -11.0 3.97e-02 -0.9 1.00e+00 1.65e-01f 1
19 5.8651702e-01 0.00e+00 5.42e-03 -9.5 1.26e-02 -0.5 1.00e+00 8.89e-02f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
20 5.8278151e-01 0.00e+00 5.34e-03 -10.3 5.18e-02 -0.9 1.00e+00 1.40e-01f 1
21 5.8100614e-01 0.00e+00 5.30e-03 -11.0 1.37e-02 -0.5 1.00e+00 1.42e-01f 1
22 5.7827149e-01 0.00e+00 5.25e-03 -11.0 7.12e-02 -1.0 1.00e+00 9.77e-02f 1
23 5.7564914e-01 0.00e+00 5.19e-03 -9.4 1.52e-02 -0.6 1.00e+00 1.97e-01f 1
24 5.7340095e-01 0.00e+00 5.15e-03 -10.0 1.06e-01 -1.0 1.00e+00 7.66e-02f 1
25 5.7134734e-01 0.00e+00 5.10e-03 -11.0 1.78e-02 -0.6 1.00e+00 1.45e-01f 1
26 5.6897098e-01 0.00e+00 5.06e-03 -11.0 1.82e-01 -1.1 1.00e+00 7.65e-02f 1
27 5.6722484e-01 0.00e+00 5.02e-03 -11.0 2.11e-02 -0.7 1.00e+00 1.16e-01f 1
28 5.6402838e-01 0.00e+00 5.25e-03 -11.0 3.29e-01 -1.1 1.00e+00 9.57e-02f 1
29 5.6299118e-01 0.00e+00 5.26e-03 -10.5 2.38e-02 -0.7 1.00e+00 6.53e-02f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
30 5.5753178e-01 0.00e+00 5.63e-03 -11.0 1.38e-01 -1.2 1.00e+00 1.62e-01f 1
31 5.5730217e-01 0.00e+00 5.63e-03 -11.0 2.37e-02 -0.8 1.00e+00 1.38e-02f 1
32 5.5363398e-01 0.00e+00 5.71e-03 -11.0 8.55e-02 -1.2 1.00e+00 1.06e-01f 1
33 5.5223091e-01 0.00e+00 5.70e-03 -11.0 2.43e-02 -0.8 1.00e+00 7.99e-02f 1
34 5.5163441e-01 0.00e+00 5.70e-03 -9.9 1.30e-01 -1.3 1.00e+00 1.65e-02f 1
35 5.4699694e-01 0.00e+00 5.67e-03 -10.9 2.56e-02 -0.9 1.00e+00 2.48e-01f 1
36 5.4559780e-01 0.00e+00 5.67e-03 -11.0 2.30e-01 -1.3 1.00e+00 3.71e-02f 1
37 5.4248376e-01 0.00e+00 5.63e-03 -11.0 2.58e-02 -0.9 1.00e+00 1.60e-01f 1
38 5.3955718e-01 0.00e+00 5.63e-03 -10.7 6.78e-01 -1.4 1.00e+00 6.97e-02f 1
39 5.3606672e-01 0.00e+00 5.57e-03 -10.9 2.79e-02 -1.0 1.00e+00 1.73e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
40 5.3481638e-01 0.00e+00 5.55e-03 -11.0 1.32e-02 -0.5 1.00e+00 1.33e-01f 1
41 5.3248177e-01 0.00e+00 5.50e-03 -11.0 3.23e-02 -1.0 1.00e+00 1.10e-01f 1
42 5.3143109e-01 0.00e+00 5.48e-03 -10.4 1.37e-02 -0.6 1.00e+00 1.05e-01f 1
43 5.2975978e-01 0.00e+00 5.45e-03 -11.0 3.92e-02 -1.1 1.00e+00 7.50e-02f 1
44 5.2855042e-01 0.00e+00 5.42e-03 -11.0 1.50e-02 -0.6 1.00e+00 1.13e-01f 1
45 5.2670676e-01 0.00e+00 5.38e-03 -11.0 4.80e-02 -1.1 1.00e+00 7.85e-02f 1
46 5.2428812e-01 0.00e+00 5.31e-03 -11.0 1.64e-02 -0.7 1.00e+00 2.12e-01f 1
47 5.2221881e-01 0.00e+00 5.27e-03 -11.0 5.84e-02 -1.2 1.00e+00 8.45e-02f 1
48 5.2084516e-01 0.00e+00 5.23e-03 -10.8 1.79e-02 -0.8 1.00e+00 1.14e-01f 1
49 5.1928549e-01 0.00e+00 5.19e-03 -11.0 7.06e-02 -1.2 1.00e+00 6.06e-02f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
50 5.1787195e-01 0.00e+00 5.15e-03 -11.0 1.95e-02 -0.8 1.00e+00 1.10e-01f 1
51 5.1639782e-01 0.00e+00 5.12e-03 -10.9 8.49e-02 -1.3 1.00e+00 5.45e-02f 1
52 5.1606363e-01 0.00e+00 5.11e-03 -11.0 2.13e-02 -0.9 1.00e+00 2.45e-02f 1
53 5.1178863e-01 0.00e+00 5.01e-03 -11.0 1.04e-01 -1.3 1.00e+00 1.50e-01f 1
54 5.1173718e-01 0.00e+00 5.01e-03 -11.0 2.33e-02 -0.9 1.00e+00 3.61e-03f 1
55 5.0944130e-01 0.00e+00 4.96e-03 -11.0 1.40e-01 -1.4 1.00e+00 7.70e-02f 1
56 5.0727659e-01 0.00e+00 4.89e-03 -11.0 2.55e-02 -1.0 1.00e+00 1.43e-01f 1
57 5.0638483e-01 0.00e+00 4.85e-03 -11.0 1.11e-02 -0.5 1.00e+00 1.29e-01f 1
58 5.0426908e-01 0.00e+00 4.78e-03 -11.0 2.79e-02 -1.0 1.00e+00 1.32e-01f 1
59 5.0333858e-01 0.00e+00 4.74e-03 -11.0 1.22e-02 -0.6 1.00e+00 1.26e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
60 5.0161958e-01 0.00e+00 4.69e-03 -11.0 3.05e-02 -1.1 1.00e+00 1.02e-01f 1
61 5.0037437e-01 0.00e+00 4.64e-03 -11.0 1.34e-02 -0.6 1.00e+00 1.58e-01f 1
62 4.9971029e-01 0.00e+00 4.62e-03 -11.0 3.34e-02 -1.1 1.00e+00 3.72e-02f 1
63 4.9914235e-01 0.00e+00 4.59e-03 -11.0 1.48e-02 -0.7 1.00e+00 6.69e-02f 1
64 4.9664256e-01 0.00e+00 4.51e-03 -11.0 3.70e-02 -1.2 1.00e+00 1.32e-01f 1
65 4.9621567e-01 0.00e+00 4.50e-03 -11.0 1.62e-02 -0.7 1.00e+00 4.72e-02f 1
66 4.9453149e-01 0.00e+00 4.44e-03 -11.0 4.19e-02 -1.2 1.00e+00 8.39e-02f 1
67 4.9304626e-01 0.00e+00 4.38e-03 -11.0 1.79e-02 -0.8 1.00e+00 1.53e-01f 1
68 4.9163087e-01 0.00e+00 4.34e-03 -11.0 6.89e-02 -1.3 1.00e+00 6.69e-02f 1
69 4.9031174e-01 0.00e+00 4.28e-03 -11.0 1.97e-02 -0.8 1.00e+00 1.28e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
70 4.8795240e-01 0.00e+00 4.22e-03 -8.8 2.01e-01 -1.3 9.38e-01 1.04e-01f 1
71 4.8721723e-01 0.00e+00 4.18e-03 -9.8 2.16e-02 -0.9 1.00e+00 6.73e-02f 1
72 4.8560412e-01 0.00e+00 4.10e-03 -11.0 9.08e-03 -0.5 1.00e+00 3.33e-01f 1
73 4.8395604e-01 0.00e+00 4.03e-03 -11.0 2.38e-02 -0.9 1.00e+00 1.42e-01f 1
74 4.8308417e-01 0.00e+00 3.99e-03 -11.0 9.99e-03 -0.5 1.00e+00 1.69e-01f 1
75 4.8236398e-01 0.00e+00 3.96e-03 -10.8 2.63e-02 -1.0 1.00e+00 5.83e-02f 1
76 4.7919513e-01 0.00e+00 3.79e-03 -10.9 1.11e-02 -0.6 1.00e+00 5.72e-01f 1
77 4.7844811e-01 0.00e+00 3.76e-03 -11.0 3.12e-02 -1.0 1.00e+00 5.79e-02f 1
78 4.7801338e-01 0.00e+00 3.74e-03 -11.0 1.22e-02 -0.6 1.00e+00 7.40e-02f 1
79 4.7631089e-01 0.00e+00 3.67e-03 -11.0 4.30e-02 -1.1 1.00e+00 1.23e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
80 4.7516941e-01 0.00e+00 3.61e-03 -11.0 1.35e-02 -0.7 1.00e+00 1.81e-01f 1
81 4.7405503e-01 0.00e+00 3.57e-03 -11.0 7.24e-02 -1.1 1.00e+00 7.54e-02f 1
82 4.7394863e-01 0.00e+00 3.57e-03 -11.0 1.49e-02 -0.7 1.00e+00 1.57e-02f 1
83 4.7209146e-01 0.00e+00 3.51e-03 -8.9 1.76e-01 -1.2 9.65e-01 1.15e-01f 1
84 4.7199185e-01 0.00e+00 3.50e-03 -10.1 1.66e-02 -0.8 1.00e+00 1.36e-02f 1
85 4.7051045e-01 0.00e+00 3.41e-03 -11.0 6.70e-03 -0.3 1.00e+00 4.74e-01f 1
86 4.7028360e-01 0.00e+00 3.40e-03 -11.0 1.92e-02 -0.8 1.00e+00 2.87e-02f 1
87 4.6892068e-01 0.00e+00 3.31e-03 -11.0 7.44e-03 -0.4 1.00e+00 4.02e-01f 1
88 4.6754658e-01 0.00e+00 3.25e-03 -11.0 2.37e-02 -0.9 1.00e+00 1.62e-01f 1
89 4.6723856e-01 0.00e+00 3.23e-03 -11.0 8.24e-03 -0.4 1.00e+00 8.47e-02f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
90 4.6625331e-01 0.00e+00 3.18e-03 -11.0 3.07e-02 -0.9 1.00e+00 1.08e-01f 1
91 4.6621348e-01 0.00e+00 3.18e-03 -11.0 9.16e-03 -0.5 1.00e+00 1.01e-02f 1
92 4.6511327e-01 0.00e+00 3.17e-03 -11.0 4.12e-02 -1.0 1.00e+00 1.11e-01f 1
93 4.6363418e-01 0.00e+00 3.14e-03 -11.0 1.02e-02 -0.5 1.00e+00 3.45e-01f 1
94 4.6314545e-01 0.00e+00 3.14e-03 -11.0 6.17e-02 -1.0 1.00e+00 4.60e-02f 1
95 4.6235630e-01 0.00e+00 3.12e-03 -11.0 1.13e-02 -0.6 1.00e+00 1.70e-01f 1
96 4.6143809e-01 0.00e+00 3.11e-03 -9.7 1.03e-01 -1.1 1.00e+00 7.93e-02f 1
97 4.6069624e-01 0.00e+00 3.10e-03 -10.6 1.38e-02 -0.6 1.00e+00 1.47e-01f 1
98 4.5824023e-01 0.00e+00 3.68e-03 -8.6 3.14e-01 -1.1 8.52e-01 1.81e-01f 1
99 4.5822224e-01 0.00e+00 3.68e-03 -9.9 2.27e-02 -0.7 1.00e+00 3.30e-03f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
100 4.5747364e-01 0.00e+00 3.71e-03 -11.0 7.25e-03 -0.3 1.00e+00 3.33e-01f 1
101 4.5651156e-01 0.00e+00 3.77e-03 -11.0 2.65e-02 -0.8 1.00e+00 1.62e-01f 1
102 4.5624508e-01 0.00e+00 3.79e-03 -11.0 8.36e-03 -0.3 1.00e+00 1.09e-01f 1
103 4.5460694e-01 0.00e+00 3.91e-03 -10.5 3.10e-02 -0.8 1.00e+00 2.54e-01f 1
104 4.5415701e-01 0.00e+00 3.92e-03 -11.0 9.64e-03 -0.4 1.00e+00 1.69e-01f 1
105 4.5379967e-01 0.00e+00 3.95e-03 -11.0 3.54e-02 -0.9 1.00e+00 5.13e-02f 1
106 4.5282822e-01 0.00e+00 3.99e-03 -11.0 1.10e-02 -0.4 1.00e+00 3.33e-01f 1
107 4.5236345e-01 0.00e+00 4.03e-03 -11.0 4.03e-02 -0.9 1.00e+00 6.16e-02f 1
108 4.5158337e-01 0.00e+00 4.06e-03 -11.0 1.24e-02 -0.5 1.00e+00 2.45e-01f 1
109 4.4983066e-01 0.00e+00 4.19e-03 -8.9 4.54e-02 -1.0 1.00e+00 2.15e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
110 4.4976166e-01 0.00e+00 4.20e-03 -10.3 1.39e-02 -0.5 1.00e+00 2.01e-02f 1
111 4.4733643e-01 0.00e+00 4.34e-03 -8.3 4.72e-02 -1.0 9.30e-01 2.78e-01f 1
112 4.4674741e-01 0.00e+00 4.36e-03 -9.7 1.51e-02 -0.6 1.00e+00 1.60e-01f 1
113 4.4600496e-01 0.00e+00 4.39e-03 -8.6 4.25e-02 -1.1 1.00e+00 8.06e-02f 1
114 4.4596146e-01 0.00e+00 4.39e-03 -10.0 1.64e-02 -0.6 1.00e+00 1.09e-02f 1
115 4.4157951e-01 0.00e+00 4.49e-03 -8.4 4.50e-02 -1.1 1.00e+00 4.49e-01f 1
116 4.4137682e-01 0.00e+00 4.48e-03 -9.6 1.60e-02 -0.7 1.00e+00 4.90e-02f 1
117 4.3873388e-01 0.00e+00 4.47e-03 -7.5 3.70e-02 -1.2 9.19e-01 2.63e-01f 1
118 4.3815153e-01 0.00e+00 4.45e-03 -8.7 1.61e-02 -0.7 1.00e+00 1.33e-01f 1
119 4.3569743e-01 0.00e+00 4.41e-03 -7.5 3.98e-02 -1.2 1.00e+00 2.34e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
120 4.3444019e-01 0.00e+00 4.36e-03 -8.6 1.67e-02 -0.8 1.00e+00 2.71e-01f 1
121 4.3145745e-01 0.00e+00 4.27e-03 -7.3 4.24e-02 -1.3 1.00e+00 2.74e-01f 1
122 4.2975412e-01 0.00e+00 4.19e-03 -8.0 1.81e-02 -0.8 1.00e+00 3.55e-01f 1
123 4.2433966e-01 0.00e+00 3.98e-03 -6.9 4.56e-02 -1.3 1.00e+00 4.92e-01f 1
124 4.2414838e-01 0.00e+00 3.96e-03 -7.5 1.92e-02 -0.9 1.00e+00 4.01e-02f 1
125 4.2194620e-01 0.00e+00 3.87e-03 -6.9 4.80e-02 -1.4 1.00e+00 1.96e-01f 1
126 4.1778190e-01 0.00e+00 3.63e-03 -7.7 2.08e-02 -0.9 1.00e+00 8.34e-01f 1
127 4.1737428e-01 0.00e+00 2.94e-03 -7.1 4.72e-02 -1.4 1.00e+00 3.66e-02f 1
128 4.1315312e-01 0.00e+00 2.54e-03 -7.8 2.22e-02 -1.0 1.00e+00 8.46e-01f 1
129 4.1173197e-01 0.00e+00 2.51e-03 -7.1 6.31e-02 -1.5 1.00e+00 1.27e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
130 4.0915489e-01 0.00e+00 2.46e-03 -7.7 2.32e-02 -1.0 1.00e+00 5.13e-01f 1
131 4.0515277e-01 0.00e+00 2.38e-03 -6.9 9.97e-02 -1.5 1.00e+00 3.50e-01f 1
132 4.0343660e-01 0.00e+00 2.34e-03 -7.3 2.38e-02 -1.1 1.00e+00 3.43e-01f 1
133 3.9516682e-01 0.00e+00 2.17e-03 -6.8 1.68e-01 -1.6 1.00e+00 7.24e-01f 1
134 3.9098427e-01 0.00e+00 2.06e-03 -7.2 2.34e-02 -1.1 1.00e+00 9.07e-01f 1
135 3.9058959e-01 0.00e+00 2.05e-03 -7.0 5.50e-02 -1.6 1.00e+00 3.97e-02f 1
136 3.8628795e-01 0.00e+00 1.93e-03 -7.7 2.39e-02 -1.2 1.00e+00 9.42e-01f 1
137 3.8288703e-01 0.00e+00 1.84e-03 -7.2 6.76e-02 -1.7 1.00e+00 3.49e-01f 1
138 3.8113960e-01 0.00e+00 1.79e-03 -7.8 2.37e-02 -1.2 1.00e+00 4.02e-01f 1
139 3.7895865e-01 0.00e+00 1.74e-03 -7.3 8.98e-02 -1.7 1.00e+00 2.23e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
140 3.7630588e-01 0.00e+00 1.65e-03 -7.9 2.56e-02 -1.3 1.00e+00 6.04e-01f 1
141 3.7040736e-01 0.00e+00 2.44e-03 -7.4 4.40e-01 -1.8 1.00e+00 5.52e-01f 1
142 3.6744301e-01 0.00e+00 2.20e-03 -8.1 5.47e-02 -1.3 1.00e+00 6.87e-01f 1
143 3.6668338e-01 0.00e+00 2.11e-03 -9.7 1.27e-02 -0.9 1.00e+00 4.33e-01f 1
144 3.6563026e-01 0.00e+00 2.04e-03 -10.0 1.45e-01 -1.4 1.00e+00 2.33e-01f 1
145 3.6473497e-01 0.00e+00 1.94e-03 -11.0 1.73e-02 -1.0 1.00e+00 4.85e-01f 1
146 3.6423323e-01 0.00e+00 2.14e-03 -11.0 2.77e-01 -1.5 1.00e+00 1.00e-01f 1
147 3.6397151e-01 0.00e+00 2.15e-03 -10.1 2.02e-02 -1.0 1.00e+00 1.32e-01f 1
148 3.6196968e-01 0.00e+00 2.50e-03 -8.5 1.56e-01 -1.5 1.00e+00 3.92e-01f 1
149 3.6176499e-01 0.00e+00 2.49e-03 -9.6 1.65e-02 -1.1 1.00e+00 1.00e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
150 3.6069639e-01 0.00e+00 2.45e-03 -8.4 3.49e-02 -1.6 1.00e+00 2.18e-01f 1
151 3.6041434e-01 0.00e+00 2.43e-03 -9.4 1.63e-02 -1.1 1.00e+00 1.30e-01f 1
152 3.5921303e-01 0.00e+00 2.37e-03 -8.5 3.78e-02 -1.6 1.00e+00 2.34e-01f 1
153 3.5869485e-01 0.00e+00 2.33e-03 -9.4 1.64e-02 -1.2 1.00e+00 2.27e-01f 1
154 3.5699552e-01 0.00e+00 2.25e-03 -7.9 4.06e-02 -1.7 1.00e+00 3.20e-01f 1
155 3.5674149e-01 0.00e+00 2.23e-03 -8.7 1.77e-02 -1.2 1.00e+00 1.07e-01f 1
156 3.5471902e-01 0.00e+00 2.11e-03 -7.5 4.33e-02 -1.7 1.00e+00 3.71e-01f 1
157 3.5335745e-01 0.00e+00 1.99e-03 -8.3 1.91e-02 -1.3 1.00e+00 5.64e-01f 1
158 3.5245420e-01 0.00e+00 1.94e-03 -7.5 4.53e-02 -1.8 1.00e+00 1.67e-01f 1
159 3.5021073e-01 0.00e+00 1.73e-03 -8.3 2.05e-02 -1.3 1.00e+00 9.16e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
160 3.4610138e-01 0.00e+00 1.04e-03 -7.3 4.72e-02 -1.8 1.00e+00 7.94e-01f 1
161 3.4469412e-01 0.00e+00 9.99e-04 -8.0 2.07e-02 -1.4 1.00e+00 6.40e-01f 1
162 3.4399767e-01 0.00e+00 1.02e-03 -7.7 4.72e-02 -1.9 1.00e+00 1.42e-01f 1
163 3.4260897e-01 0.00e+00 9.60e-04 -8.1 2.22e-02 -1.4 1.00e+00 6.21e-01f 1
164 3.4129508e-01 0.00e+00 9.37e-04 -7.4 4.92e-02 -1.9 1.00e+00 2.68e-01f 1
165 3.4016827e-01 0.00e+00 9.15e-04 -8.0 2.34e-02 -1.5 1.00e+00 5.07e-01f 1
166 3.3825082e-01 0.00e+00 8.81e-04 -7.2 5.10e-02 -2.0 1.00e+00 3.98e-01f 1
167 3.3610567e-01 0.00e+00 8.39e-04 -7.7 2.44e-02 -1.5 1.00e+00 1.00e+00f 1
168 3.3460677e-01 0.00e+00 8.12e-04 -7.1 5.92e-02 -2.0 1.00e+00 3.31e-01f 1
169 3.3266040e-01 0.00e+00 7.74e-04 -7.7 2.51e-02 -1.6 1.00e+00 9.50e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
170 3.3037125e-01 0.00e+00 7.34e-04 -7.1 8.30e-02 -2.1 1.00e+00 5.31e-01f 1
171 3.2873890e-01 0.00e+00 7.02e-04 -7.7 2.53e-02 -1.6 1.00e+00 8.56e-01f 1
172 3.2622239e-01 0.00e+00 6.57e-04 -7.2 5.31e-02 -2.1 1.00e+00 6.32e-01f 1
173 3.2470308e-01 0.00e+00 6.28e-04 -7.8 2.53e-02 -1.7 1.00e+00 8.69e-01f 1
174 3.2372187e-01 0.00e+00 6.27e-04 -7.5 5.26e-02 -2.2 1.00e+00 2.63e-01f 1
175 3.2283458e-01 0.00e+00 5.92e-04 -8.2 2.60e-02 -1.7 1.00e+00 5.20e-01f 1
176 3.2183252e-01 0.00e+00 6.17e-04 -7.5 5.47e-02 -2.2 1.00e+00 2.71e-01f 1
177 3.2046827e-01 0.00e+00 5.47e-04 -8.1 2.70e-02 -1.8 1.00e+00 8.08e-01f 1
178 3.1975003e-01 0.00e+00 6.03e-04 -7.1 5.85e-02 -2.3 1.00e+00 2.04e-01f 1
179 3.1900035e-01 0.00e+00 5.19e-04 -7.6 2.77e-02 -1.8 1.00e+00 4.53e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
180 3.1615561e-01 0.00e+00 4.90e-04 -6.8 6.32e-02 -2.3 1.00e+00 8.53e-01f 1
181 3.1524761e-01 0.00e+00 4.49e-04 -7.3 2.68e-02 -1.9 1.00e+00 5.85e-01f 1
182 3.1326122e-01 0.00e+00 4.87e-04 -7.0 6.24e-02 -2.4 9.98e-01 6.26e-01f 1
183 3.1296679e-01 0.00e+00 4.09e-04 -7.7 2.63e-02 -1.9 1.00e+00 2.01e-01f 1
184 3.1055716e-01 0.00e+00 4.34e-04 -6.8 6.57e-02 -2.4 1.00e+00 8.22e-01f 1
185 3.1020674e-01 0.00e+00 3.60e-04 -7.5 2.70e-02 -2.0 1.00e+00 2.54e-01f 1
186 3.0929569e-01 0.00e+00 4.68e-04 -7.0 6.86e-02 -2.5 1.00e+00 3.18e-01f 1
187 3.0819030e-01 0.00e+00 3.23e-04 -7.8 2.94e-02 -2.0 1.00e+00 8.02e-01f 1
188 3.0763044e-01 0.00e+00 4.47e-04 -7.2 8.21e-02 -2.5 1.00e+00 2.02e-01f 1
189 3.0725472e-01 0.00e+00 3.07e-04 -8.0 3.11e-02 -2.1 1.00e+00 2.85e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
190 3.0578606e-01 0.00e+00 4.05e-04 -7.5 2.33e-01 -2.6 1.00e+00 5.11e-01f 1
191 3.0495222e-01 0.00e+00 3.43e-04 -8.1 2.91e-02 -2.2 1.00e+00 6.59e-01f 1
192 3.0413633e-01 0.00e+00 1.17e-03 -7.6 4.93e-01 -2.6 1.00e+00 2.84e-01f 1
193 3.0331078e-01 0.00e+00 9.60e-04 -7.7 3.14e-02 -2.2 1.00e+00 6.81e-01f 1
194 3.0239991e-01 0.00e+00 8.65e-04 -6.7 7.67e-02 -2.7 9.99e-01 4.00e-01f 1
195 3.0137954e-01 0.00e+00 6.77e-04 -7.5 3.36e-02 -2.3 1.00e+00 8.52e-01f 1
196 3.0056119e-01 0.00e+00 6.24e-04 -7.5 8.05e-02 -2.7 1.00e+00 3.28e-01f 1
197 3.0014178e-01 0.00e+00 5.65e-04 -8.3 3.58e-02 -2.3 1.00e+00 3.62e-01f 1
198 2.9923633e-01 0.00e+00 5.17e-04 -8.2 8.53e-02 -2.8 1.00e+00 3.59e-01f 1
199 2.9888005e-01 0.00e+00 4.75e-04 -8.9 3.83e-02 -2.4 1.00e+00 3.15e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
200 2.9848396e-01 0.00e+00 4.57e-04 -8.1 8.98e-02 -2.8 1.00e+00 1.58e-01f 1
201 2.9821853e-01 0.00e+00 4.30e-04 -8.8 4.15e-02 -2.4 1.00e+00 2.29e-01f 1
202 2.9738245e-01 0.00e+00 3.96e-04 -7.2 9.62e-02 -2.9 9.99e-01 3.39e-01f 1
203 2.9665614e-01 0.00e+00 3.36e-04 -7.8 4.41e-02 -2.5 1.00e+00 6.30e-01f 1
204 2.9578199e-01 0.00e+00 3.07e-04 -6.8 1.02e-01 -2.9 1.00e+00 3.87e-01f 1
205 2.9465429e-01 0.00e+00 2.39e-04 -7.4 4.54e-02 -2.5 1.00e+00 1.00e+00f 1
206 2.9301415e-01 0.00e+00 1.99e-04 -6.7 1.07e-01 -3.0 1.00e+00 8.01e-01f 1
207 2.9292261e-01 0.00e+00 1.94e-04 -7.3 4.47e-02 -2.6 1.00e+00 8.17e-02f 1
208 2.9132376e-01 0.00e+00 2.05e-04 -6.8 1.11e-01 -3.0 1.00e+00 7.08e-01f 1
209 2.9068030e-01 0.00e+00 1.46e-04 -7.4 4.72e-02 -2.6 1.00e+00 5.83e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
210 2.8989341e-01 0.00e+00 2.15e-04 -6.8 1.19e-01 -3.1 1.00e+00 3.71e-01f 1
211 2.8905846e-01 0.00e+00 1.25e-04 -7.4 5.00e-02 -2.7 1.00e+00 7.99e-01f 1
212 2.8817010e-01 0.00e+00 2.35e-04 -6.9 1.32e-01 -3.1 1.00e+00 4.36e-01f 1
213 2.8741077e-01 0.00e+00 1.25e-04 -7.5 5.21e-02 -2.7 1.00e+00 7.41e-01f 1
214 2.8705020e-01 0.00e+00 2.96e-04 -7.2 1.57e-01 -3.2 1.00e+00 1.65e-01f 1
215 2.8639044e-01 0.00e+00 1.29e-04 -8.0 5.54e-02 -2.8 1.00e+00 6.44e-01f 1
216 2.8567025e-01 0.00e+00 2.13e-04 -7.4 4.30e-01 -3.2 1.00e+00 3.16e-01f 1
217 2.8519078e-01 0.00e+00 1.22e-04 -8.1 6.05e-02 -2.8 1.00e+00 4.79e-01f 1
218 2.8508336e-01 0.00e+00 1.93e-04 -7.5 3.80e-01 -3.3 1.00e+00 4.58e-02f 1
219 2.8459190e-01 0.00e+00 1.16e-04 -8.3 6.70e-02 -2.9 1.00e+00 4.75e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
220 2.8382163e-01 0.00e+00 1.63e-04 -7.2 3.30e-01 -3.3 1.00e+00 3.29e-01f 1
221 2.8341130e-01 0.00e+00 1.28e-04 -7.8 7.27e-02 -2.9 1.00e+00 3.93e-01f 1
222 2.8284739e-01 0.00e+00 1.56e-04 -7.0 2.21e-01 -3.4 1.00e+00 2.45e-01f 1
223 2.8183563e-01 0.00e+00 1.19e-04 -7.7 7.96e-02 -3.0 1.00e+00 9.45e-01f 1
224 2.8117611e-01 0.00e+00 1.40e-04 -6.9 4.77e-01 -3.4 1.00e+00 2.96e-01f 1
225 2.8033847e-01 0.00e+00 1.04e-04 -7.6 8.59e-02 -3.0 1.00e+00 7.96e-01f 1
226 2.8005033e-01 0.00e+00 9.73e-05 -9.1 3.26e-02 -2.6 1.00e+00 6.19e-01f 1
Number of Iterations....: 226
(scaled) (unscaled)
Objective...............: 2.8005032523583379e-01 2.8005032523583379e-01
Dual infeasibility......: 9.7272674831558608e-05 9.7272674831558608e-05
Constraint violation....: 0.0000000000000000e+00 0.0000000000000000e+00
Complementarity.........: 3.2315149852880829e-07 3.2315149852880829e-07
Overall NLP error.......: 9.7272674831558608e-05 9.7272674831558608e-05
Number of objective function evaluations = 227
Number of objective gradient evaluations = 227
Number of equality constraint evaluations = 0
Number of inequality constraint evaluations = 228
Number of equality constraint Jacobian evaluations = 0
Number of inequality constraint Jacobian evaluations = 1
Number of Lagrangian Hessian evaluations = 226
Total CPU secs in IPOPT (w/o function evaluations) = 3403.326
Total CPU secs in NLP function evaluations = 214.977
EXIT: Optimal Solution Found.
The timing is acceptable at this resolution, but at higher resolutions the linear solves become very large and extremely slow. So my question is: should I expect a problem like this (at a higher resolution, with ~500,000 parameters) to take a couple of days to solve, or are there settings I could tweak to speed up convergence or reduce the number of iterations needed? The initial guesses are actually quite good, and a local optimum is all I'm looking for.
Answer 1:
It looks like you were very thorough in your approach. With ~500k variables and 100k constraints you have a lot of degrees of freedom. The primary alternative to interior-point approaches (of which IPOPT is a good one) is an active-set approach. Active-set methods tend to be better with few degrees of freedom, so IPOPT is your best bet.
The IPOPT output indicates a few things:
The Hessian is not positive definite: nearly every iteration requires regularization (the lg(rg) column is populated from iteration 1 onward).
After regularization, the problem is sufficiently convex within the variable bounds (no back-tracking line search).
IPOPT time far exceeds function-evaluation time (3403.326 s vs. 214.977 s of CPU time). Most of this time is spent in matrix factorizations.
Nearly every iteration is truncated because of variable bounds (alpha_pr is below 1 in most iterations).
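To put the timing observation in numbers, the figures printed at the end of the log can be divided out. This is plain arithmetic on the reported values; note that they are CPU seconds, which on a multi-threaded run may be summed across threads, so they are not necessarily wall-clock times.

```latex
\frac{3403.326}{226} \approx 15.06 \ \text{CPU s per iteration inside IPOPT},
\qquad
\frac{214.977}{226} \approx 0.95 \ \text{CPU s per iteration in function evaluations}

\frac{3403.326}{214.977} \approx 15.8,
\qquad
\frac{3403.326}{3403.326 + 214.977} \approx 0.94
```

So roughly 94% of the reported CPU time is spent inside IPOPT rather than in the objective, gradient, and Hessian callbacks, which is why the rest of this answer focuses on the linear algebra.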
It is my understanding that when your problem gets too big it exceeds hardware limitations (CPU cache) and the linear algebra time blows up. This is probably your main problem.
For these reasons, I would recommend trying a BFGS approach (IPOPT option). By directly calculating the approximate inverse of the Hessian, you avoid difficult matrix factorizations/solves. Further, the BFGS approach can guarantee positive-definite Hessians. Usually, the BFGS approximation is used when the Hessian is not available since the exact Hessian should provide more accurate steps. But with your regularization, expensive factorization and truncated steps, BFGS will likely be almost as good. Expect more iterations (226 is very small), but each should be much faster.
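As a minimal sketch of what trying this could look like with the options struct from the question: IPOPT exposes a limited-memory quasi-Newton mode through the `hessian_approximation` option. The values below are IPOPT's documented defaults for the related options, not tuned recommendations, and whether the wrapper behind `solv_options` also needs the Hessian callback removed is an assumption to check against that interface.

```matlab
% Hypothetical variation of the settings in the question: replace the exact
% Hessian with IPOPT's limited-memory quasi-Newton approximation.
solv_options.ipopt.hessian_approximation = 'limited-memory';

% Update formula for the approximation ('bfgs' is the IPOPT default,
% 'sr1' is the alternative).
solv_options.ipopt.limited_memory_update_type = 'bfgs';

% Number of stored update pairs (IPOPT default is 6). Larger values give a
% richer approximation at a higher per-iteration cost.
solv_options.ipopt.limited_memory_max_history = 6;
```

With this setting IPOPT no longer uses the exact Lagrangian Hessian, so the "Number of Lagrangian Hessian evaluations" line in the summary should drop to zero. Comparing iteration counts and time per iteration on the 8960-variable problem first would show whether the trade-off pays off before moving to the finest resolution.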
You may also want to play with loosening your variable bounds. Interior-point methods are designed to avoid going to bounds. With that many bounds, it could be slowing down progress.
Answer 2:
Not a definite answer but too long for a comment: Ipopt should be well optimized already so I'm afraid you won't be getting better results unless you change your whole algorithm.
If a local optimum is what you're looking for, then start at a coarse resolution and choose the "most promising" section (i.e. where an optimum is more likely to be found). Then enhance your resolution on that section and start over by dichotomy.
You could also look at neural networks, which can model complex non-linear functions and are now mature, with plenty of implementations that take advantage of CPU+GPU architectures and could help a lot. Gradient backpropagation is designed to find local extrema efficiently.
Source: https://stackoverflow.com/questions/51274069/convergence-of-a-very-large-non-linear-least-squares-optimization