Depth-first search
dfs(G, s)
for each v ∈ V(G) [n]
visited[v] ← false [n]
stack ← { s } [1]
while stack not empty [m]
v ← pop stack [m] - (a)
if visited[v] = false [m]
visited[v] ← true [m]
for each (v → u) ∈ E(G) [m] - (b)
if visited[u] = false [m] - (c)
push u into stack [m]
Let \(n\) and \(m\) be the number of vertices and edges in the graph respectively. The size of a graph is defined as \(n+m\). In the algorithm, the quantity in square brackets on the right-hand side is the maximum number of times that step is executed. The crucial steps for time complexity analysis are marked.
The DFS algorithm runs in time \(O(n+m)\).
The loop (b) is executed at most once for each vertex v
. This is because just before that loop is executed visited[v]
is marked true
and this loop can only be executed if visited[v]
is false
. Also, note that we never set visited[v]
to false
after the initialization. Let the number of outgoing edges for each \(v\) be \(o_v\). For any v
, the statement (c) is executed exactly \(o_v\) times. We also have \(\sum_v o_v = m\). So (c) is executed at most \(m\) times in total.
Now, we establish that (a) is executed at most \(m\) times. Let the number of incoming edges for each \(u\) be \(i_u\). Observe that any vertex \(u\) is pushed into the stack at most \(i_u\) times because an edge v → u
is considered at most once in the loop (b). So the total number of pop
operations can only be at most \(\sum_u i_u = m\) which establishes the claim.
The following claim is why DFS works for path finding in graphs.
If there is a path from \(s\) to \(t\) in \(G\), then dfs(G, s)
will visit \(t\).
We will use induction on the distance from \(s\). Distance is defined as the minimum number of edges in a path from \(s\). The only vertex at distance zero from \(s\) is \(s\) and it will be visited in the first iteration of the loop (a). For the induction step, suppose that all vertices at distance \(d\) will be visited by the algorithm. Let \(v\) be any vertex at distance \(d+1\) from \(s\). Then, there is a vertex \(u\) at distance \(d\) from \(s\) such that \(u→v\) is an edge in \(G\). By the induction hypothesis, \(u\) will be visited. The loop (b) will consider the edge \(u→v\). If \(v\) is already visited, we are done. Otherwise, \(v\) will be pushed into the stack. Since we know the stack eventually becomes empty by our time complexity analysis, we conclude that the vertex \(v\) will be popped from the stack and visited (if not already visited).