2 hours ago

Kahn's Algorithm for Topological Sorting Explained with Code & Examples

Shot by Tuur Tisseghem

Learn Kahn's Algorithm for topological sorting with a fun Superman dressing order analogy, interactive visualizations, step-by-step C++ implementation, and a parallel execution extension using level-by-level BFS.

Opening

Welcome to Pants-After Planet! It’s a planet where you wear your pants after any other clothing item. We don’t care if you wear your socks outside your shoes or change your underwear in a Mr. Bean kind of way. Just make sure that putting on your pants is the last step in your getting-ready process.

Our mission is to develop an app that teaches ILL-MANNERED individuals—like Superman 🦸‍♂️, who normally wears his underwear over his pants—the correct dressing order so they can follow it properly.

First of all, let’s break down what the classic Superman wears in his everyday outfit:

🔵 A blue skintight top
👖 A pair of blue skintight pants
🧣 A red cape
🦸‍♂️ The “S” shield emblem on his chest
🩲 RED briefs worn UNDER the pants
👢 Red boots
🟨 A yellow belt

We can follow the typical order in which a human gets dressed to define the relationships between the outfit pieces.

For example, he can put on the red cape if and only if the upper part of the skintight suit is already on him.

We denote this dependency as:

python

      ('Skintight Top', 'Red Cape')

You may interpret this compact symbolic expression as the following completely straightforward and absolutely not-at-all overcomplicated instruction:

“If you would like to wear your red cape and have it rest properly and securely on your shoulders in the way a cape is generally expected to rest on someone’s shoulders, then before you reach for it, before you lift it up, before you swing it behind your back or attempt to settle it neatly around your neck, you would need to have already pulled a garment over your head, guided one arm through one sleeve and then the other through the other sleeve, tugged the fabric down over your chest and back, straightened it along your sides, smoothed out any folds, checked that it sits evenly across both shoulders, and confirmed that it is not twisted, bunched up, or halfway off — in other words, you need to have put on your blue skintight top.”

FAIRLY EASY.

We can keep doing this until Superman makes it past the online media platform’s review bots and DC Comics gets paid its licensing fees — basically, covered up enough to avoid moderation, but still looking enough like Superman to get you in trouble.

The Superhero Dressing Algorithm

It’s been a while since we introduced the syntax that says you put the Skintight Top on first, then you can have the Red Cape on.

python

      ('Skintight Top', 'Red Cape')

It’s obvious — because if you dont do it in that order, you’ll get stuck. So there must be a piece of clothing that we can freely put on without any further consideration. We’ll treat that as our beginning — the starting point of the dressing order.

Therefore, the first step is to find all the clothes that depend on nothing to start with:

python

      ('Skintight Top', 'Red Cape')
('Red Briefs', 'Skintight Pants')
('Skintight Top', 'Shield Emblem')
('Red Boots', 'Skintight Pants')

As shown in the list above, we can see that Skintight Top, Red Briefs and Red Boots can be worn without thinking — they don’t depend on anything else. Let’s create a todo list for the things we plan to wear:

shell

      ○ Skintight Top
○ Red Briefs
○ Red Boots

Feel free to start with any one you like, but for convention, we’ll follow the natural order. So we’ll start with Skintight Top first and see how things go.

Give it a check mark — we’ve already worn the Skintight Top:

shell

      ✔ Skintight Top
○ Red Briefs
○ Red Boots

Now we have more choices. Since some clothes depend on the Skintight Top like, Red Cape and Shield Emblem, they can be worn now. Let’s add them to the end of the choice list:

shell

      ✔ Skintight Top
○ Red Briefs
○ Red Boots
○ Red Cape
○ Shield Emblem

We repeat the process. Next, we’re going to wear Red Briefs, so strike it through:

shell

      ✔ Skintight Top
✔ Red Briefs
○ Red Boots
○ Red Cape
○ Shield Emblem

Now we check if any new choice can be added to the list. Since Skintight Pants depend on Red Briefs, they are now available. Add them to the tail of the list:

shell

      ✔ Skintight Top
✔ Red Briefs
○ Red Boots
○ Red Cape
○ Shield Emblem
○ Skintight Pants

Continue repeating this process until the list can no longer be expanded and all the clothes are worn. Then we’ll get a linear order that instructs the user how they should wear everything.

Here’s the pseudo-code step-by-step:

Find the clothes the user can wear immediately.
Add them to the TO-DO list.
Wear one item, then check whether any other clothes that depend on it can now be worn directly (with no remaining dependencies). If so, add them to the TO-DO list.
Keep popping and appending until the list is empty.

Now, let’s see how this can be translated into a programming language.

From Idea to Implementation: Kahn’s Algorithm Walkthrough

Let’s walk through an example using the following directed edges:

python

      ('A', 'B')
('A', 'C')
('B', 'D')
('C', 'E')
('D', 'F')
('E', 'F')
('F', 'G')
('H', 'I')
('I', 'G')

In the above data structure, each element — described within parentheses — represents a directed edge. The first item in the parentheses is the starting node, and the second item is the ending node.

Calculate In-Degree for Each Node

We want to find which nodes can be processed with no dependencies. To do that, we iterate through the list and track the counts.

Take the first element A -> B. We increment B’s counter in the dependency tracker, meaning one more node must be resolved before B can run.

Drag the slide below to follow along. The top of the diagram shows the direct edges we’re processing—starting with the first one highlighted in yellow. The bottom part is the dependency tracker, which updates as the counter increases.

step

By the end, any node with a counter of 0 can be executed freely. Keep sliding to see the full tracking process in action.

By the way, in graph theory, this counter is called the in-degree, which represents the number of incoming edges a node has—that is, how many dependencies must be resolved before it can be processed.

step

Initialize the Queue

Without a doubt, we identify all nodes with 0 dependencies and add them to a queue.

The layout below is slightly updated to show more of the process. The top now displays each node’s in-degree, moved up from the bottom, while the bottom shows the final result—the execution order of the nodes, or in this case, the dressing order.

Process Nodes in Queue

Now we connect this back to our superhero dressing example. At each step, we pick a node with no dependencies, add it to the result list, and update the counters of any nodes that depend on it. With those changes in place, we repeat the same process for the next available node.

It may sound like a lot, so let me walk you through it:

In the first step below, we focus on the first node in our IN-DEGREE list with no dependencies, highlighted in yellow. Slide forward—you add node A to the result list because it can be executed without further dependencies.

step

Next, we reflect the change in dependency relationships. We identify which nodes are affected. Since we’re not strictly following the machine step-by-step here, we can quickly eyeball the result—you’ll see the affected edges highlighted in yellow in the edge list.

One more step: we decrease the counter of the affected node(s) by 1 to show that the dependency has been updated, as highlighted in the in-degree map.

Just to be clear, this is the picture you should have in mind: we now have more 0-dependency nodes. From here, repeat the steps you learned above—this is the full process playing out.

Before each slide, try to predict what will happen next. Think about which node will be picked, how the dependencies will change, and what the updated result will look like.

step

C++ Implementation of Kahn’s Algorithm

“Less magic, more C++.” Finally, we get to the implementation.

Here’s our dummy input again:

python

      ('A', 'B')
('A', 'C')
('B', 'D')
('C', 'E')
('D', 'F')
('E', 'F')
('F', 'G')
('H', 'I')
('I', 'G')

Now let’s turn each directed edge into a pair—literally. In C++, we call that std::pair, and we’ll wrap them all in a std::vector.

cpp

      #include <utility>
#include <vector>

int main() {
  // Define the directed edges of the DAG
  std::vector<std::pair<char, char>> edges{{'A', 'B'}, {'A', 'C'}, {'B', 'D'},
                                           {'C', 'E'}, {'D', 'F'}, {'E', 'F'},
                                           {'F', 'G'}, {'H', 'I'}, {'I', 'G'}};
}

As you recall, we need a dependency counter. Let’s create an std::unordered_map for that:

cpp

      #include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

int main() {
  // Define the directed edges of the DAG
  std::vector<std::pair<char, char>> edges{{'A', 'B'}, {'A', 'C'}, {'B', 'D'},
                                           {'C', 'E'}, {'D', 'F'}, {'E', 'F'},
                                           {'F', 'G'}, {'H', 'I'}, {'I', 'G'}};

  // In-degree map: node -> number of incoming edges
  std::unordered_map<char, std::size_t> in_degree;
}

Next, we identify which nodes need to be notified when their dependencies change. Instead of scanning all nodes every time a dependency updates, we store this information in a neighbor map. That way, we know exactly who to notify.

cpp

      #include <cstddef>
#include <set>
#include <unordered_map>
#include <utility>
#include <vector>

int main() {
  // Define the directed edges of the DAG
  std::vector<std::pair<char, char>> edges{{'A', 'B'}, {'A', 'C'}, {'B', 'D'},
                                           {'C', 'E'}, {'D', 'F'}, {'E', 'F'},
                                           {'F', 'G'}, {'H', 'I'}, {'I', 'G'}};

  // In-degree map: node -> number of incoming edges
  std::unordered_map<char, std::size_t> in_degree;
  // Adjacency list: node -> set of neighbors
  std::unordered_map<char, std::set<char>> neighbors;
}

It’s worth noting that we use a std::set for neighbors to avoid duplicate entries—just in case the same edge is added more than once.

Now it’s time to fill in all the information we talked about earlier.

For each edge u -> v, we increase the in-degree of v, since v depends on u. We also record v as a neighbor of u, so that once u is pushed to the result list, we can iterate over u’s neighbors and decrease their corresponding in-degree counters.

We’re only adding a few lines of code, and they do exactly that—nothing more. So don’t be scared. Take your time and read it through. ;)

cpp

      #include ...

int main() {
  ...
  // In-degree map: node -> number of incoming edges
  std::unordered_map<char, std::size_t> in_degree;
  // Adjacency list: node -> set of neighbors
  std::unordered_map<char, std::set<char>> neighbors;

  // Step 1: Build the graph
  for (const auto &edge : edges) {
    char u = edge.first;
    char v = edge.second;
    // Add v to the adjacency list of u
    neighbors[u].insert(v);
    // Increase in-degree of v
    in_degree[v]++;
  }
}

One small detail we should make sure of: every node needs to appear in the in_degree map—even those with no incoming edges—so we don’t miss any nodes during sorting.

For example, suppose we have the following input:

python

      # u -> v
('A', 'B')
('A', 'C')

If we only add v to the in_degree map, we end up with just two entries: B and C. That means A is missing. It may seem tiny, but it matters: A is actually where our algorithm starts—a 0-dependency node.

So we add one more branch inside the loop:

cpp

      #include ...

int main() {
  ...
  // In-degree map: node -> number of incoming edges
  std::unordered_map<char, std::size_t> in_degree;
  // Adjacency list: node -> set of neighbors
  std::unordered_map<char, std::set<char>> neighbors;

  // Step 1: Build the graph
  for (const auto &edge : edges) {
    char u = edge.first;
    char v = edge.second;
    // Add v to the adjacency list of u
    neighbors[u].insert(v);
    // Increase in-degree of v
    in_degree[v]++;
    // Make sure u is also in the in-degree map
    if (in_degree.find(u) == in_degree.end()) {
      in_degree[u] = 0;
    }
  }
}

Step 1 done—now onto Step 2. Time to find all nodes with in-degree 0, meaning they’re ready to be executed. We store them in a TO-DO list—implemented as a queue in this case:

cpp

      #include ...
#include <queue>

int main(){
  ...
  // In-degree map: node -> number of incoming edges
  std::unordered_map<char, std::size_t> in_degree;

  // Step 1: Build the graph
  ...

  // Step 2: Initialize queue with all nodes of in-degree 0
  std::queue<char> queue;
  for (const auto &[node, degree] : in_degree) {
    if (degree == 0) {
      queue.emplace(node);
    }
  }
}

As long as the queue is not empty, we dequeue a node—just like picking an item from our TO-DO list—and add it to the result:

cpp

      #include ...

int main(){
  ...
  // Step 1: Build the graph
  ...
  // Step 2: Initialize queue with all nodes of in-degree 0
  ...

  // Step 3: Perform Kahn's algorithm for topological sorting
  std::vector<char> result;
  while (!queue.empty()) {
    char node = queue.front();
    queue.pop();
    result.push_back(node);
  }
}

Then we notify the corresponding nodes by decreasing their counters:

cpp

      #include ...

int main(){
  ...
  // Step 1: Build the graph
  ...
  // Step 2: Initialize queue with all nodes of in-degree 0
  ...

  // Step 3: Perform Kahn's algorithm for topological sorting
  std::vector<char> result;
  while (!queue.empty()) {
    char node = queue.front();
    queue.pop();
    result.push_back(node);
    // For each neighbor, reduce its in-degree
    for (char neighbor : neighbors[node]) {
      --in_degree[neighbor];
    }
  }
}

Once a counter reaches 0, we add that node back to the TO-DO list for further processing. From there, the algorithm just keeps going:

cpp

      #include ...

int main(){
  ...
  // Step 1: Build the graph
  ...
  // Step 2: Initialize queue with all nodes of in-degree 0
  ...

  // Step 3: Perform Kahn's algorithm for topological sorting
  std::vector<char> result;
  while (!queue.empty()) {
    char node = queue.front();
    queue.pop();
    result.push_back(node);
    // For each neighbor, reduce its in-degree
    for (char neighbor : neighbors[node]) {
      --in_degree[neighbor];
      if (--in_degree[neighbor] == 0) {
        // If in-degree becomes 0, add to queue
        queue.push(neighbor);
      }
    }
  }
}

Finally, we print out the result.

cpp

      #include ...
#include <iostream>

int main(){
  ...
  // Step 1: Build the graph
  ...
  // Step 2: Initialize queue with all nodes of in-degree 0
  ...
  // Step 3: Perform Kahn's algorithm for topological sorting
  ...

  // Step 4: Print the topological sort result
  std::cout << "Topological Sort Order:" << std::endl;
  for (std::size_t i = 0; i < result.size(); ++i) {
    std::cout << result[i];
    if (i < result.size() - 1) {
      std::cout << " -> ";
    }
  }
  std::cout << std::endl;
  return 0;
}

And don’t forget to return 0 at the end. :)

python

      Topological Sort Order:
H -> A -> I -> B -> C -> D -> E -> F -> G

And here’s the complete code, putting everything together:

cpp

      #include <cstddef>
#include <iostream>
#include <queue>
#include <set>
#include <unordered_map>
#include <utility>
#include <vector>

int main() {
  // Define the directed edges of the DAG
  std::vector<std::pair<char, char>> edges{{'A', 'B'}, {'A', 'C'}, {'B', 'D'},
                                           {'C', 'E'}, {'D', 'F'}, {'E', 'F'},
                                           {'F', 'G'}, {'H', 'I'}, {'I', 'G'}};

  // In-degree map: node -> number of incoming edges
  std::unordered_map<char, std::size_t> in_degree;

  // Adjacency list: node -> set of neighbors
  std::unordered_map<char, std::set<char>> neighbors;

  // Step 1: Build the graph
  for (const auto &edge : edges) {
    char u = edge.first;
    char v = edge.second;

    // Add v to the adjacency list of u
    neighbors[u].insert(v);

    // Increase in-degree of v
    in_degree[v]++;

    // Make sure u is also in the in-degree map
    if (in_degree.find(u) == in_degree.end()) {
      in_degree[u] = 0;
    }
  }

  // Step 2: Initialize queue with all nodes of in-degree 0
  std::queue<char> queue;
  for (const auto &[node, degree] : in_degree) {
    if (degree == 0) {
      queue.emplace(node);
    }
  }

  // Step 3: Perform Kahn's algorithm for topological sorting
  std::vector<char> result;
  while (!queue.empty()) {
    char node = queue.front();
    queue.pop();
    result.push_back(node);

    // For each neighbor, reduce its in-degree
    for (char neighbor : neighbors[node]) {
      if (--in_degree[neighbor] == 0) {
        // If in-degree becomes 0, add to queue
        queue.push(neighbor);
      }
    }
  }

  // Step 4: Print the topological sort result
  std::cout << "Topological Sort Order:" << std::endl;
  for (std::size_t i = 0; i < result.size(); ++i) {
    std::cout << result[i];
    if (i < result.size() - 1) {
      std::cout << " -> ";
    }
  }
  std::cout << std::endl;

  return 0;
}

Parallel Topological Sorting with Kahn’s Algorithm

Now, if your nodes can be safely executed in parallel (for example, unlike some stateful APIs like OpenGL where certain calls aren’t thread-safe), we can tweak Step 3 a bit. Instead of producing a single flat result, we group nodes into “layers.”

As you can see, instead of a vector of results, we now use a vector of vector, called layers.

cpp

      #include ...

int main(){
  ...
  // Step 1: Build the graph
  ...
  // Step 2: Initialize queue with all nodes of in-degree 0
  ...

  // Step 3: Level-by-level topological sort (Kahn's algorithm)
  std::vector<char> result;
  std::vector<std::vector<char>> layers;
  while (!queue.empty()) {
  }

  // Step 4: Print the topological sort result
  ...

  return 0;
}

Instead of taking one item from the TO-DO list each time, we now process them in batches—because all of them have 0 dependencies and can be executed at the same time. The next batch will only include the nodes added to the TO-DO list during this iteration.

Sounds a bit complex? Let’s break it down.

We first create a result list called layer, and iterate through all nodes currently in the queue:

cpp

      #include ...

int main(){
  ...
  // Step 1: Build the graph
  ...
  // Step 2: Initialize queue with all nodes of in-degree 0
  ...

  // Step 3: Level-by-level topological sort (Kahn's algorithm)
  std::vector<std::vector<char>> layers;
  while (!queue.empty()) {
    std::vector<char> layer;
    std::size_t size = queue.size();
    // Process all nodes at current level
    for (std::size_t i = 0; i < size; ++i) {
    }
  }

  // Step 4: Print the topological sort result
  ...

  return 0;
}

The size variable tells us how many nodes are currently in the queue. Don’t forget—new nodes may be added to the queue during this loop. That’s exactly why we store this number first: we only want to process the nodes that were already in the queue at the start of this iteration.

Inside the loop, we do the same thing as before—just this time, we collect the current batch into layer.

For example, the first layer might look like:

python

      Layer 0: H A

This means these nodes have no dependencies at all—they’re valid starting points. Since they share the same property, we can execute them in parallel, in the same batch.

Next, we figure out which nodes become available for the next batch, and add them to the TO-DO list:

cpp

      #include ...

int main(){
  ...
  // Step 1: Build the graph
  ...
  // Step 2: Initialize queue with all nodes of in-degree 0
  ...

  // Step 3: Level-by-level topological sort (Kahn's algorithm)
  std::vector<char> result;
  std::vector<std::vector<char>> layers;
  while (!queue.empty()) {
    std::size_t size = queue.size();
    std::vector<char> layer;
    // Process all nodes at current level
    for (std::size_t i = 0; i < size; ++i) {
      char node = queue.front();
      queue.pop();
      layer.push_back(node);
      for (char neighbor : neighbors[node]) {
        if (--in_degree[neighbor] == 0) {
          queue.push(neighbor);
        }
      }
    }
    // Store current layer
    layers.push_back(layer);
  }

  // Step 4: Print the topological sort result
  ...

  return 0;
}

Then in Step 4, we update the output to print each layer:

cpp

      #include ...

int main(){
  ...
  // Step 1: Build the graph
  ...
  // Step 2: Initialize queue with all nodes of in-degree 0
  ...

  // Step 3: Level-by-level topological sort (Kahn's algorithm)
  std::vector<std::vector<char>> layers;
  ...

  // Step 4: Print the topological sort result
  for (std::size_t i = 0; i < layers.size(); ++i) {
    std::cout << "Layer " << i << ": ";
    for (char node : layers[i]) {
      std::cout << node << " ";
    }
    std::cout << std::endl;
  }

  return 0;
}

Here’s what the output might look like:

python

      Layer 0: H A
Layer 1: I B C
Layer 2: D E
Layer 3: F
Layer 4: G

You can interpret this as: we execute all nodes in one layer in parallel, then move on to the next layer.

Compared to the previous result:

python

      Topological Sort Order:
H -> A -> I -> B -> C -> D -> E -> F -> G

we’ve effectively reduced a 9-step execution into 5 batches, which can significantly improve performance when parallelism is possible.

Hopefully, this walkthrough gives you a clear picture of how Kahn’s Algorithm works—and how to extend it for parallel execution. Try tweaking the edges or playing with the example to see how the layers change.