Depth-first search (Java)

From LiteratePrograms
Jump to: navigation, search
Other implementations: C++ | Java | Python

Depth-first search is simple to implement in a procedural language like Java because it always expands the node most recently seen, making it ideal to implement with a stack (such as the procedural call stack).

Contents

[edit] Graph representation

Since we'll need to traverse through all the children of each node we visit, we choose an adjacency-list representation for the graph, genericizing over the node data type and creating a List of successor vertices:

<<imports>>=
import java.util.List;
import java.util.ArrayList;
import java.util.Stack;

<<vertex class>>=
public class Vertex<T>
{
    private final T data;
    private final List<Vertex<T>> _successors = new ArrayList<Vertex<T>>();

    vertex public methods

    depth first search

    generate Petersen graph

    main method
}

Each vertex is initialized with its vertex data or label, which we provide an accessor for:

<<vertex public methods>>=
Vertex(T data) { this.data = data; }
T getData() { return data; }

For this simple example, we provide direct access to the successors list, although this violates encapsulation:

<<vertex public methods>>=
List<Vertex<T>> successors() { return _successors; }

[edit] Depth first search

The search will take three parameters:

  • A start node, indicating the first node to expand
  • A goal predicate, which determines whether a given vertex is a goal vertex (a vertex on which the search stops successfully)
  • A partial result stack, describing the path (sequence of vertices) that the search has followed so far

In outline, the search looks like this:

<<depth first search>>=
public static <T> boolean depthFirstSearch(Vertex<T> start,
                                           GoalFunction<T> isGoal,
                                           Stack<Vertex<T>> result)
{
    ensure we're not stuck in a cycle

    result.push(start);

    check if we've found the goal
    expand each child node in order, returning if we find the goal

    // No path was found
    result.pop();
    return false; 
}

We provide a functor interface for the goal predicate:

<<goal functor interface>>=
interface GoalFunction<T>
{
    boolean evaluate(Vertex<T> o);
}

To expand each child, we simply call depth first search recursively on each child in order. If any of them return true, the final path will be in result and we also return true:

<<expand each child node in order, returning if we find the goal>>=
for (Vertex<T> v : start.successors()) {
    if (depthFirstSearch(v, isGoal, result))
    {
        return true;
    }
}

As our terminating condition, we check if the start node is a goal vertex by applying the goal predicate to it. If so, we return true immediately and the result is propagated back up to the initial caller:

<<check if we've found the goal>>=
if (isGoal.evaluate(start))
{
    return true;
}

Finally, if the graph we're processing might contain directed cycles (it's not a directed acyclic graph), we need to ensure we don't get stuck in an infinite recursion following a cycle. To do this, we simply search the list of visited nodes in result for the start node and return false immediately if we've followed a cycle:

<<ensure we're not stuck in a cycle>>=
if (result.contains(start))
{
    return false;
}

It's important of course that this be done before appending start to result. This completes the implementation.

[edit] Sample code

Petersen graph

This small test driver demonstrates use of the search function on the Petersen graph, a small graph with 10 vertices and diameter 2. We'll see that the path found by depth-first search can be much longer than the minimum 2 vertices.

First we have the code that generates the graph as a simple vector of vertices, labelling the vertices with sequential integers:

<<generate Petersen graph>>=
public static List<Vertex<Integer>> petersenGraph()
{
    List<Vertex<Integer>> v = new ArrayList<Vertex<Integer>>();
    for (int i = 0; i < 10; i++)
    {
        v.add(new Vertex<Integer>(i));
    }
    add Petersen graph edges
    return v;
}

Now we add the edges. We simply create a table of the edges and iterate over it. Note that since our data structure describes directed graphs, we need an edge in each direction for each edge of the graph:

<<add Petersen graph edges>>=
int[][] edges =
    {{0,1}, {1,0}, {1,2}, {2,1}, {2,3}, {3,2}, {3,4}, {4,3}, {4,0}, {0,4},
     {5,6}, {6,5}, {6,7}, {7,6}, {7,8}, {8,7}, {8,9}, {9,8}, {9,5}, {5,9},
     {5,0}, {0,5}, {6,2}, {2,6}, {7,4}, {4,7}, {8,1}, {1,8}, {9,3}, {3,9}};
for (int[] e : edges)
{
    v.get(e[0]).successors().add(v.get(e[1]));
}

Now we can complete our sample:

<<main method>>=
public static void main(String[] args)
{
    List<Vertex<Integer>> v = petersenGraph();
    Stack<Vertex<Integer>> path = new Stack<Vertex<Integer>>();
    if (depthFirstSearch(v.get(0), new GoalFunction<Integer>() {
        public boolean evaluate(Vertex<Integer> v)
        {
            return v.getData() == 7;
        }
    }, path))
    {
        System.out.print("Found path: ");
        for (Vertex<Integer> u : path)
        {
            System.out.print(u.getData() + " ");
        }
        System.out.println();
    }
    else
    {
        System.out.println("No path found");
    }
}

<<Vertex.java>>=
imports

goal functor interface

vertex class

When run, this produces the following path:

Found path: 0 1 2 3 4 7

This is far from the optimal path, which is "0 4 7".

[edit] Improvements and tradeoffs

On large graphs, the cycle-checking step can become expensive, because the result list may become very long. One solution is to also maintain a set data structure holding the nodes visited so far. Another more common solution is to give each node a visited flag, which has the advantage that no node is ever expanded twice, but has the disadvantage that we need to make an initial pass to clear the flag on every node of the graph. Visited flags also make it difficult to make the algorithm threadsafe.

Another issue on large graphs is that depth-first search can easily form a very long call chain following a useless path. It's useful to include a threshold on how deep we want to search. By searching with incrementally larger thresholds, we get iterative deepening depth-first search, which finds much shorter paths in general than ordinary depth-first search.

Download code
hijacker
hijacker
hijacker
hijacker