- Courses
- Programming
- Graph Data Structure & Algorithms
- Depth First Search

TUTORIAL

1. Introduction To Graphs

2. Properties Of Graph

3. Graph Traversals ( Dfs And Bfs )

4. Example Implementation Of Bfs And Dfs

5. Breadth First Search

6. Depth First Search

Let’s start with a basic problem. You are given a binary tree and you have to search if an element say B is present in the tree or not.

Few observations. Binary tree is a tree where each node has at max two children: left and right. Now suppose you are at the root of the tree, then there can be one of the four possibilities.

- B is the root element.
- B is present in the left subtree of root.
- B is present in the right subtree of root.
- B is not present in the tree with the given root.

What will we do? If we are at node A, we will simply check if A is equal to B. If not, we will recursively check if B is present in the left subtree of A. If not, we will recursively check if B is present in the right subtree of A, else we will say that B is not present in the tree rooted at A.

- In the above tree, we want to know if B is present in the subtree tree rooted at A or not. At first we are at node A. We check if A is equal to B, which is not. We check if B is present in the left subtree of A rooted at C.
- Now, the question changes to “whether B is present in the subtree tree rooted at C or not”. We see that C is not equal to B. Now we go in the left subtree of C.
- Now, we see that D is not equal to B so we go in left subtree of D which is none so we come back at D.
- Now, go in the right subtree of D to see if B is present there. Since there is no right subtree rooted at D, we come back. Subtree rooted at D is fully explored so we go back to C.
- Now, we check the right subtree of C rooted with G. G is equal to B. We now check left subtree of G (not there) and right subtree(not there). B is not there so we go back to C.
- We have explored all the nodes in the subtree rooted at C. We now go to the right of A i.e., in the subtree rooted at E.
- We see that E is not equal to B. We go in the left subtree of E rooted at B.
- Since B is equal to B we return true and go back to E.
- E gets the return value of true from its left subtree and goes back to A.
- A gets return value of true from its right subtree and returns true, meaning B is present in the subtree rooted at A.

What we have done above is actually a depth-first search, where we traversed branches one by one. We first went to the left subtree of A. When the search was completed in the left subtree, we searched the right subtree of A.

The depth-first search algorithm is the search algorithm which begins the searching from the root node and goes down till the leaf of a branch at a time looking for a particular key. If the key is not found, the searching retraces its steps back to the point from where the other branch was left unexplored and the same procedure is repeated for that branch.

Thus, one at a time, every single branch is explored depth-wise, in turn completing the entire forest.

The data structure used in a Depth-First Search is a stack and a graph. The edges that lead to the unvisited node are called discovery edge, and the edges that lead to the already visited nodes are called block edges.

Now, the question is **why do we need a stack and not any other data structure like queue used in Breadth-First Search?**

We need a stack to perform a depth-first search because **our requirements say so**.

We always want to find the last node we visited to go in the other branches (not visited yet) of that node, and we want to do it in the most efficient way possible. Guess, what? Stack does so!

The way stack works (**Last in First out**) will give the last node visited in **O(1)** operations because the node last visited will be the top element of the stack and we only have to pop it.

**Why not use a queue?**

Because the way queue works(**First in First out**) will give the last node visited in **O(n)** operations as that will be the last element inserted in the queue. To take the last node out, first, we will have to take all the elements out of the queue. We will also need to store these elements except the last one so that we can push them back in the queue. The last element taken out of the queue will be the last visited node. We now have to push all the elements taken out back to queue.

Our requirements tell us to use a stack.

- Push the root on the stack (we need stack because we always want to know the last node we visited and it will be the top element of the stack).
- Keep a data structure to mark if a node is visited or not say, visited[all nodes] = False
- While stack is not empty repeat steps 4 - 7 below
- Pop node from the stack say,
**A** - If
**visited[A]**isgo to step 4.*True* - Mark
**visited[A] = True.** - Check if
**A**is the key. If yes, break dfs. Else, push all the adjacent nodes of A which are not visited into the stack.

This way you will traverse a branch until it is exhausted meaning, either there are no adjacent nodes or all the adjacent nodes are already visited.

**Consider the following example:**

In the above graph, S is the root. We start dfs from S. Initially our stack STACK = [ ] is empty.

We push S into STACK.

STACK = [S]

Now, we repeat the above steps from 4 to 7 until STACK is empty.

Pop the top element i.e., node S out of STACK. Mark S as visited.

STACK = [ ]

Push all the adjacent nodes of S which are not visited yet into STACK.

We push nodes C, B, and A into STACK.

STACK = [C, B, A]

Pop the top element i.e., node A out of STACK. Mark A as visited.

STACK = [C, B]

Push all the adjacent nodes of A which are not visited yet into STACK.

We push node D into STACK.

STACK = [C, B, D]

Pop the top element i.e., node D out of STACK. Mark D as visited.

STACK = [C, B]

Push all the adjacent nodes of D which are not visited yet into STACK.

We push nodes C and B into STACK.

STACK = [C, B, C, B]

Pop the top element i.e., node B out of STACK. Mark B as visited.

STACK = [C, B, C]

Push all the adjacent nodes of B which are not visited yet into STACK.

There are no adjacent nodes of B which are not visited.

STACK = [C, B, C]

Pop the top element i.e., node C out of STACK. Mark C as visited.

STACK = [C, B]

Push all the adjacent nodes of C which are not visited yet into STACK.

There are no adjacent nodes of C which are not visited.

STACK = [C, B]

Now, we pop B from STACK and see that it was visited earlier.

STACK = [C]

We continue with the next iteration and pop C from STACK and see that it was visited earlier.

STACK = [ ]

STACK is now empty and we stop dfs.

This is how a depth-first search works, by traversing the nodes depth-wise.

```
Set all nodes to "not visited";
s = new Stack(); // get a stack
s.push(initial node); // push the root node into the stack
while ( s is not
```** EMPTY** ) do
{
x = s.pop(); // remove the top node from the stack and start exploring its branches
if ( x has not been visited ) // if you are visiting the node first time
{
visited[x] = true; //mark it as visited so that you do not visit it again
for ( every edge (x, y)) //all adjacent nodes of x will have an edge with x
if ( y has not been visited ) //check if adjacent node is not already visited
s.push(y);
}
}

```
DFS(adjacent[][], vertex, visited[]) {
visited[vertex] = True
FOR node in adjacent[vertex]:
IF visited[node] == False:
DFS(adjacent, node, visited)
END of IF
END of FOR
}
```

We will generally use the recursive approach because it is easy to understand and code too.

```
Implementation of Depth-first Search in C:
#include <stdio.h>
#include <stdlib.h>
struct node
{
int vertex;
struct node* next;
};
struct node* createNode(int v);
struct Graph
{
int numVertices;
int* visited;
struct node** adjLists; // we need int** to store a two dimensional array. Similarly, we need struct node** to store an array of Linked lists
};
struct Graph* createGraph(int);
void addEdge(struct Graph*, int, int);
void printGraph(struct Graph*);
void DFS(struct Graph*, int);
int main()
{
struct Graph* graph = createGraph(4);
addEdge(graph, 0, 1);
addEdge(graph, 0, 2);
addEdge(graph, 1, 2);
addEdge(graph, 2, 3);
printGraph(graph);
DFS(graph, 2);
return 0;
}
void DFS(struct Graph* graph, int vertex) {
struct node* ptr = graph->adjLists[vertex];
graph->visited[vertex] = 1;
printf("Visited %d \n", vertex);
while(ptr!=NULL) {
int adj_vertex = ptr->vertex;
if(graph->visited[adj_vertex] == 0) {
DFS(graph, adj_vertex);
}
ptr = ptr->next;
}
}
struct node* createNode(int v)
{
struct node* new_node = malloc(sizeof(struct node));
new_node->vertex = v;
new_node->next = NULL;
return new_node;
}
struct Graph* createGraph(int vertices)
{
struct Graph* graph = malloc(sizeof(struct Graph));
graph->numVertices = vertices;
graph->adjLists = malloc(vertices * sizeof(struct node*));
graph->visited = malloc(vertices * sizeof(int));
int i;
for (i = 0; i < vertices; i++) {
graph->adjLists[i] = NULL;
graph->visited[i] = 0;
}
return graph;
}
void addEdge(struct Graph* graph, int source, int destination)
{
// Add edge from src to dest
struct node* newNode = createNode(destination);
newNode->next = graph->adjLists[source];
graph->adjLists[source] = newNode;
// Add edge from destination to source
newNode = createNode(source);
newNode->next = graph->adjLists[destination];
graph->adjLists[destination] = newNode;
}
void printGraph(struct Graph* graph)
{
int v;
for (v = 0; v < graph->numVertices; v++)
{
struct node* ptr = graph->adjLists[v];
printf("\n Adjacency list of vertex %d\n ", v);
while (ptr)
{
printf("%d -> ", ptr->vertex);
ptr = ptr->next;
}
printf("\n");
}
}
```

```
Implementation of Depth-first Search in C++:
#include <iostream>
#include <vector>
void dfs(int u, vector<vector<int> > &adj, vector<bool> &visited) {
visited[u] = true;
for(int v : adj[u])
if(!visited[v])
dfs(v, adj, visited);
cout << “Done visiting node: “ << u + 1 << endl;
}
int main() {
int nodes, edges, u, v;
cin >> nodes >> edges;
vector<vector<int> > adj(nodes);
vector<bool> visited(nodes);
for(int i = 1; i <= edges; ++i) {
cin >> u >> v;
--u, --v; //there is an undirected edge between u and v (0 based)
adj[u].push_back(v);
adj[v].push_back(u);
}
dfs(0, adj, visited); //assume root is always node 0
return 0;
}
```

```
Implementation of DFS in Java:
import java.io.*;
import java.util.*;
class Graph
{
private int numVertices;
private LinkedList<Integer> adjLists[];
private boolean visited[];
Graph(int vertices)
{
numVertices = vertices;
adjLists = new LinkedList[vertices];
visited = new boolean[vertices];
for (int i = 0; i < vertices; i++)
adjLists[i] = new LinkedList<Integer>();
}
void addEdge(int source, int destination)
{
adjLists[source].add(destination);
}
void DFS(int vertex)
{
visited[vertex] = true;
System.out.print(vertex + " ");
Iterator itr = adjLists[vertex].listIterator();
while (itr.hasNext())
{
int adj_node = itr.next();
if (!visited[adj_node])
DFS(adj_node);
}
}
public static void main(String args[])
{
Graph g = new Graph(4);
g.addEdge(0, 1);
g.addEdge(0, 2);
g.addEdge(1, 2);
g.addEdge(2, 3);
System.out.println("Following is Depth First Traversal");
g.DFS(2);
}
}
```

```
Implementation of DFS in python:
def dfs(graph, vertex, visited=None):
if visited is None:
visited = set()
visited.add(vertex)
print(vertex)
for node in graph[vertex] :
if node not in visited:
dfs(graph, node, visited)
graph = {'0': set(['1', '2']),
'1': set(['0', '3', '4']),
'2': set(['0', ‘4’]),
'3': set(['1', ‘4’]),
'4': set([‘1’, '2', '3'])}
dfs(graph, '0')
```