Why does FindMaximum with Newton's method complain it can't find a sufficient decrease in function?

Question


Firstly, this looks (judging from ContourPlot) like a fairly straightforward maximization problem, so why is FindMaximum with Newton's method having trouble?

Secondly, how can I get rid of the warnings?

Thirdly, if I can't get rid of these warnings, how can I tell whether a given warning is meaningful, i.e., that the maximization actually failed?

For instance, in the code below, FindMaximum with Newton's method gives a warning, whereas the PrincipalAxis method doesn't:

o = 1/5 Log[E^(-(h/Sqrt[3]))/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   3/10 Log[E^(h/Sqrt[3])/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   1/5 Log[E^(-(h/Sqrt[3]) - Sqrt[2] j)/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   1/10 Log[E^(h/Sqrt[3] - Sqrt[2] j)/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   1/10 Log[E^(-Sqrt[3] h + Sqrt[2] j)/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   1/10 Log[E^(Sqrt[3] h + Sqrt[2] j)/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))];
(* -1 makes more contours towards maximum *)

contourFunc[n_, p_] := Function[{min, max},
   Module[{range = max - min},
    Table[Exp[p (x - 1)] x range + min, {x, 0, 1, 1/n}]]
   ];
cf = contourFunc[10, -1];
ContourPlot @@ {o, {j, -1, 1}, {h, -1, 1}, Contours -> cf}

FindMaximum @@ {o, {{j, 0}, {h, 0}}, Method -> "Newton"}
FindMaximum @@ {o, {{j, 0}, {h, 0}}, Method -> "PrincipalAxis"}

Note: I thought the problem might be that the gradient is 0 in the direction of one of the components, but if I perturb the initial point I still get the same warning. Here's an example (with a quick gradient check sketched after it):

o = 1/5 Log[E^(-(h/Sqrt[3]))/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   1/5 Log[E^(h/Sqrt[3])/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   1/10 Log[E^(-(h/Sqrt[3]) - Sqrt[2] j)/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   3/10 Log[E^(h/Sqrt[3] - Sqrt[2] j)/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   1/10 Log[E^(-Sqrt[3] h + Sqrt[2] j)/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))] + 
   1/10 Log[E^(Sqrt[3] h + Sqrt[2] j)/(
     2 E^(-(h/Sqrt[3])) + 2 E^(h/Sqrt[3]) + 
      E^(-(h/Sqrt[3]) - Sqrt[2] j) + E^(h/Sqrt[3] - Sqrt[2] j) + 
      E^(-Sqrt[3] h + Sqrt[2] j) + E^(Sqrt[3] h + Sqrt[2] j))];
ContourPlot @@ {o, {j, -1, 1}, {h, -1, 1}}
FindMaximum @@ {o, {{j, -0.008983550852535105`}, {h, 
    0.06931364191023386`}}, Method -> "Newton"}
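
For reference, a quick way to do the gradient check mentioned above is something like the following sketch (just D applied to o, evaluated at the symmetric starting point):

(* sketch: symbolic gradient of o, evaluated at the origin *)
grad = {D[o, j], D[o, h]};
N[grad /. {j -> 0, h -> 0}]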

Answer 1:


Mathematically, I'm not sure exactly why Newton's method fails, but the examples in the documentation for FindMaximum point out this specific problem and error message under Possible Issues: "With machine-precision arithmetic, even functions with smooth maxima may seem bumpy".

Thus, if you increase the working precision, e.g. with the WorkingPrecision -> 20 option to FindMaximum, the warnings go away:

In[25]:= FindMaximum[o, {{j, 0}, {h, 0}}, Method->"Newton", WorkingPrecision->20]

Out[25]= {-2.0694248079871222533, {j -> -0.14189560954670761863, h -> 0}}
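
Note that the default AccuracyGoal and PrecisionGoal in FindMaximum are Automatic, which works out to roughly half of WorkingPrecision, so raising WorkingPrecision also tightens the tolerances being tested. If you want to control the two separately, a minimal sketch is:

(* sketch: higher working precision with explicitly chosen tolerances;
   by default both goals are about WorkingPrecision/2 *)
FindMaximum[o, {{j, 0}, {h, 0}}, Method -> "Newton", 
 WorkingPrecision -> 30, AccuracyGoal -> 10, PrecisionGoal -> 10]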

Given that the text of the error is fairly descriptive:

FindMaximum::lstol: The line search decreased the step size to within the tolerance specified by AccuracyGoal and PrecisionGoal but was unable to find a sufficient increase in the function. You may need more than MachinePrecision digits of working precision to meet these tolerances. >>

... I suspect Newton's method is failing to reach a fixed point with sufficiently small error using machine-precision arithmetic.
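
One way to see this concretely is to sample o at very closely spaced points around the maximum reported above; when the true variation over the window is smaller than machine arithmetic can resolve, the successive differences are dominated by round-off noise, which is the "bumpiness" the documentation describes. A minimal sketch:

(* sketch: successive differences of o sampled near the reported maximum;
   at machine precision these can be dominated by round-off *)
samples = Table[o /. {j -> -0.141896 + k 10.^-9, h -> 0.}, {k, 0, 5}];
Differences[samples]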

As the error message hints, if you don't want to switch to slower high-precision arithmetic, you can instead relax the AccuracyGoal option, which specifies how many digits of accuracy you want in the solution:

In[27]:= FindMaximum[o, {{j, 0}, {h, 0}}, Method -> "Newton", AccuracyGoal -> 5]

Out[27]= {-2.06942, {j -> -0.141896, h -> -2.78113*10^-17}}
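
As for telling programmatically whether the warning is meaningful: one way (a minimal sketch using Check and Quiet) is to trap that specific message, and only silence it once you have convinced yourself the result is fine:

(* sketch: Check returns $Failed if FindMaximum::lstol fires,
   otherwise the usual {value, rules} result *)
result = Check[
   FindMaximum[o, {{j, 0}, {h, 0}}, Method -> "Newton"], 
   $Failed, {FindMaximum::lstol}]

(* once you trust the result, Quiet suppresses just that one message *)
Quiet[FindMaximum[o, {{j, 0}, {h, 0}}, Method -> "Newton"], {FindMaximum::lstol}]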

Hope that helps!



Source: https://stackoverflow.com/questions/3570799/why-does-findmaximum-with-newtons-method-complain-it-cant-find-a-sufficient-de
