Question
The following self-contained code highlights a problem in OCaml, possibly with the code generation. Array x has connectivity information for nodes in [0..9]. Function init_graph originally constructed explicit arrays of incoming nodes for every node. The reduced version shown below just prints the two connected nodes.
Function init_graph2 is identical to init_graph except for a "useless" else branch, but the outputs produced by these two functions are quite different. You can run it and see that init_graph skips over the second if-then-else in some cases!
We have run this program on version 3.12.1 (with make_matrix substituted appropriately), 4.03.0 and 4.03.0+flambda. All of them have the same problem.
I have been dealing with this and related problems where OCaml mysteriously skips branches or, in some cases, takes both branches. Thanks to a collaborator, we were able to pare down the real code to a small self-contained example.
Any ideas on what's going on here? And is there a way to avoid this and related problems?
    let x =
      let arr = Array.make_matrix 10 10 false in
      begin
        arr.( 6).( 4) <- true;
        arr.( 2).( 9) <- true;
      end;
      arr

    let init_graph () =
      for i = 0 to 9 do
        for j = 0 to (i-1) do
          begin
            if x.(i).(j) then
              let (i_inarr, _) = ([||],[||]) in
              begin
                Format.printf "updateA: %d %d \n" i j;
              end
              (* else () *)
            ;
            if x.(j).(i) then
              let (j_inarr, _) = ([||],[||]) in
              begin
                Format.printf "updateB: %d %d \n" i j;
              end
          end
        done
      done;
      Format.printf "init_graph: num nodes is %i\n" 10

    let init_graph2 () =
      for i = 0 to 9 do
        for j = 0 to (i-1) do
          begin
            if x.(i).(j) then
              let (i_inarr, _) = ([||],[||]) in
              begin
                Format.printf "updateA: %d %d \n" i j;
              end
            else ()
            ;
            if x.(j).(i) then
              let (j_inarr, _) = ([||],[||]) in
              begin
                Format.printf "updateB: %d %d \n" i j;
              end
          end
        done
      done;
      Format.printf "init_graph: num nodes is %i\n" 10

    let test1 = init_graph ()
    let test2 = init_graph2 ()
Update: Ocamllint flags the else branch in init_graph2 as "useless", which is clearly wrong.
Second, the indentation method suggested by camlspotter can be misleading in precisely this scenario. We followed Ocamllint's advice and commented out the else branch. Emacs with tuareg-mode doesn't re-indent this code unless explicitly asked, which led us to believe everything was fine.
What is needed is a lint-like tool that raises a warning in these situations. I am waiting for good suggestions for this one.
Thanks.
Answer 1:
Your problem appears to be with the handling of `let ... in`. The body of this construct is a whole series of semicolon-separated expressions, not a single expression: it extends as far to the right as it can. So this code:
    if x.(i).(j) then
      let (i_inarr, _) = ([||],[||]) in
      begin
        Format.printf "updateA: %d %d \n" i j;
      end
      (* else () *)
    ;
    if x.(j).(i) then
      let (j_inarr, _) = ([||],[||]) in
      begin
        Format.printf "updateB: %d %d \n" i j;
      end
Actually parses like this:
    if x.(i).(j) then
      let (i_inarr, _) = ([||],[||]) in
      begin
        Format.printf "updateA: %d %d \n" i j;
      end
      (* else () *)
      ;
      if x.(j).(i) then
        let (j_inarr, _) = ([||],[||]) in
        begin
          Format.printf "updateB: %d %d \n" i j;
        end
In other words, both the first `begin`/`end` and the second `if`/`then` are controlled by the first `if`/`then`.
Another way to say that is that `;` has higher precedence than `let ... in`. So `let x = y in a; b` is parsed as `let x = y in (a; b)`, not as `(let x = y in a); b`.
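The precedence rule above can be checked with a small experiment. This is only a sketch and is not taken from the question: the names `log`, `record`, `without_parens` and `with_parens` are made up for illustration, and the list of recorded strings stands in for the printed output.

```ocaml
(* Hypothetical helpers: [record] appends a tag to a log so that we can
   see which expressions were actually evaluated. *)
let log : string list ref = ref []
let record s = log := s :: !log

(* No parentheses: the body of the let is (record "a"; record "b"),
   so BOTH calls are guarded by the if. *)
let without_parens cond =
  log := [];
  begin
    if cond then
      let _unused = 0 in
      record "a";
      record "b"
  end;
  List.rev !log

(* Parentheses end the let body early, so only record "a" is guarded
   and record "b" always runs. *)
let with_parens cond =
  log := [];
  begin
    if cond then
      (let _unused = 0 in
       record "a");
    record "b"
  end;
  List.rev !log
```

With `cond = false`, `without_parens` records nothing, while `with_parens` still records `"b"`. That is the same difference as between init_graph and init_graph2.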
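When you included the "useless" `else`, things parse the way you think they should. The `else` keyword ends the body of the `let`, and because `if` binds more tightly than `;`, the semicolon then separates the two `if` expressions.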
It's true, you have to be pretty careful when mixing `if`/`then` with `let` in OCaml. I have had problems like this. The general intuition that `if`/`then` and `else` control a single expression, while true, is easy to get wrong when one of the expressions is a `let`.
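One way to sidestep the problem without relying on an `else ()` is to close each `let` body explicitly with parentheses (or `begin`/`end`). The following is a sketch under some assumptions: `init_graph_fixed` is a made-up name, the unused bindings are renamed with a leading underscore, and the output is written to a `Buffer` rather than printed so that it can be inspected.

```ocaml
(* Same connectivity matrix as in the question. *)
let x =
  let arr = Array.make_matrix 10 10 false in
  arr.(6).(4) <- true;
  arr.(2).(9) <- true;
  arr

(* Hypothetical fixed variant of init_graph: each let is wrapped in
   parentheses, so its body cannot extend past the closing paren and
   the second if no longer depends on the first. *)
let init_graph_fixed () =
  let buf = Buffer.create 64 in
  for i = 0 to 9 do
    for j = 0 to i - 1 do
      if x.(i).(j) then
        (let (_i_inarr, _) = ([||], [||]) in
         Printf.bprintf buf "updateA: %d %d\n" i j);
      if x.(j).(i) then
        (let (_j_inarr, _) = ([||], [||]) in
         Printf.bprintf buf "updateB: %d %d\n" i j)
    done
  done;
  Buffer.contents buf
```

Calling `init_graph_fixed ()` should yield both the `updateA: 6 4` and the `updateB: 9 2` lines, whereas the unparenthesized init_graph from the question never reaches the second test when the first one is false.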
Answer 2:
As Jeffrey has answered, the intention suggested by your code's indentation is very different from how the code is actually parsed.
You can avoid this kind of mistake by using proper auto-indentation tools, such as caml-mode, tuareg-mode, ocp-indent, and the Vim plugins for OCaml.
By auto-indenting the second `if` of `init_graph`, you can immediately see that it is under the first `if`'s `then` clause.
Source: https://stackoverflow.com/questions/39262731/possible-ocaml-code-generation-bug