I have the following code to find the diameter of a tree:
ALGORITHM: TREE_DIAMETER (T)
maxlength ← 0
for s ← 0 to s < |V[T]|
do temp ← BSF(T, S)
Here is a simple algorithm with linear time complexity:
1)Pick an arbitrary vertex v
.
2)Find the furthest vertex from v
using BFS(let's call it u
).
3)Find the furthest vertex from u
using BFS one more time(let's call it t
).
Then distance(u, t)
is the diameter.
BFS is called only twice, so the time complexity is linear.
As supplementary to the other answer, a proof can be done as follow:
Let's first denote the two ends of the diameter in the tree as s
and t
, and the function d(u, v)
denotes the distance between node u
and node v
.
Now we choose an arbitrary node X
, then we have 2 cases:
X
is on the diameter;X
is not on the diameter.For case 1, it's easy to see that by doing (2) (the second step in the algorithm described in @user2040251's answer), we will get either d(X, s)
or d(X, t)
. If we get something else, say d(X, u)
, then u
and s
or t
can form a longer path than d(s, t)
, which contradicts the fact. Therefore, by doing (3), we will get d(s, t)
.
For case 2, by doing (2), we have 2 cases:
X
contains at least 1 point on the diameter;X
doesn't share any points with the diameter.For case 2.1, let's denote the first intersection as Y
. Because of case 1, we know the longest path starting from Y
must end at either s
or t
. Therefore, in this case, the longest path starting from X
must end at either s
or t
. Consequently, by doing (3), we will get d(s, t)
.
For case 2.2, let's denote the other end of the longest path starting from X
as Z
. Since neither X
or Z
is on the diameter, and given X
, Z
, s
, t
are on the same tree, we can conclude that there must be a node V
on the path X
to Z
, links to a node W
on the path s
to t
. Because X
to Z
is the longest path starting from X
, so d(X, V) + d(V, W) + d(W, t) < d(X, Z)
. Similarly, we have d(Z, V) + d(V, W) + d(W, s) < d(X, Z)
. Adding them up and simplify will give us: d(X, Z) > 2d(V, W) + d(s, t) > d(s, t)
, which contradicts with the fact that s
to t
is the diameter.
A graph that illustrates case 2.2:
s Z
| |
| |
| |
W-------------V
| |
| |
| |
t X
So we have:
d(X, V) + d(V, W) + d(W, t) < d(X, Z)
because X
to Z
is the longest path starting from X
;
d(Z, V) + d(V, W) + d(W, s) < d(X, Z)
because X
to Z
is the longest path starting from Z
;
Adding up 2 expressions:
d(X, Z) + 2d(V, W) + d(s, t) < 2d(X, Z)
Finally, we have d(X, Z) > 2d(V, W) + d(s, t) > d(s, t)
, which contradicts the fact.