From Theory to Practice: A Unique Merge-Insertion-Sort Implementation

Merge-insertion Sort, Reimagined: A Tree-Free Implementation with C++ Containers


Sorting is a fundamental problem in computer science, and while algorithms like Quick Sort and Merge Sort are widely used, the quest for the minimum number of comparisons leads us to more exotic algorithms. One such algorithm is the Merge-insertion Sort, also known as the Ford-Johnson algorithm. It's famous in academic circles for being one of the most comparison-efficient sorting algorithms.

The classic algorithm, as detailed in Donald Knuth's "The Art of Computer Programming," is often explained using tree-like structures to keep track of element relationships. But what if you're constrained to use standard library containers like std::vector, std::list, or std::deque? This is where things get tricky.

Today, I'm sharing a unique implementation of Merge-insertion Sort that elegantly solves this problem. It achieves the optimal number of comparisons prescribed by the theory, all while using a simple nested container approach. The best part? The core recursive function is under 60 lines of code ✨.

The Classic Merge-insertion Sort: A Quick Refresher 🧠

The algorithm is a beautiful blend of merging and insertion. Here's the high-level view:

  1. Pair & Conquer: Group the elements into pairs. Compare each element within a pair, creating "winners" and "losers." This forms a "main chain" (all the winners) and a "pend chain" (all the losers). If there's an odd element out, it's set aside for later.

  2. Recursive Sort: Recursively sort the main chain. Since the main chain is half the size of the original input, you get to the base case quickly. Critically, you must keep track of which loser belongs to which winner.

  3. Insert with Jacobsthal: Insert the elements from the pend chain into the now-sorted main chain. This is not a simple linear insertion. To maintain the minimum number of comparisons, elements are inserted in a special order determined by Jacobsthal numbers.
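
For reference, the Jacobsthal numbers themselves (0, 1, 1, 3, 5, 11, 21, ...) follow the recurrence J(n) = J(n-1) + 2*J(n-2). Below is a minimal C++ sketch of a generator for them; the function name jacobsthal is mine, not something taken from the implementation described here:

#include <cstddef>

// Jacobsthal numbers: J(0) = 0, J(1) = 1, J(n) = J(n - 1) + 2 * J(n - 2),
// giving 0, 1, 1, 3, 5, 11, 21, 43, ...
// The pend chain is inserted in groups that end at these numbers:
// b1, then b3, b2, then b5, b4, then b11 ... b6, and so on.
static std::size_t jacobsthal(std::size_t n)
{
    if (n == 0)
        return 0;
    std::size_t prev = 0; // J(n - 2)
    std::size_t curr = 1; // J(n - 1)
    for (std::size_t i = 1; i < n; ++i)
    {
        std::size_t next = curr + 2 * prev;
        prev = curr;
        curr = next;
    }
    return curr;
}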

The Container Conundrum 🤔

The biggest challenge of implementing this with containers is step #2. When you recursively sort the main chain, how do you remember the original (winner, loser) pairings?

A naive approach might be to use a std::map or another data structure to link winners to losers. However, this often introduces extra comparisons or memory overhead, defeating the purpose of using a comparison-optimal algorithm in the first place. You end up losing the elegant efficiency of the original design.

The Solution: Nested Containers ✨

My implementation overcomes this by changing the very nature of the data we're sorting. Instead of sorting a container of integers (container<int>), we sort a container of containers: container<container<int>>.

Here's the core idea: Each inner container represents a "group." The actual number we are comparing is always at the front() of its inner container. The rest of the elements in that inner container are the "losers" that have become associated with it during previous recursive steps.

When we compare two groups, A and B (which are both containers), we only compare A.front() and B.front(). If A.front() is the winner, we don't just move on. We append the entire B container to the end of the A container.

This way, the loser group is never lost; it's simply chained to its winner. This process perfectly simulates the parent-child relationship of a tree branch, but using only standard container operations.

When we unwind the recursion, we can easily reconstruct the winner and loser chains by splitting these merged containers. The split point is determined by pair_size = container.front().size() / 2, which tells us the size of the winner group at that specific recursion level.
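
To make the descent step concrete, here is a minimal sketch of pairing and merging for a std::vector<std::vector<int>>. The names Group, pairAndMerge, and groups are mine, and an odd trailing group is assumed to have been set aside by the caller before this step:

#include <cstddef>
#include <vector>

// One "group": the value being compared is always group.front();
// everything after it is the losers chained on during earlier levels.
typedef std::vector<int> Group;

// Pair adjacent groups, keep the one with the larger front as the winner,
// and append the losing group to it. Halves the number of groups.
static void pairAndMerge(std::vector<Group> &groups)
{
    std::vector<Group> merged;
    for (std::size_t i = 0; i + 1 < groups.size(); i += 2)
    {
        Group winner = groups[i];
        Group loser = groups[i + 1];
        if (winner.front() < loser.front())
            winner.swap(loser); // exactly one comparison per pair
        winner.insert(winner.end(), loser.begin(), loser.end());
        merged.push_back(winner);
    }
    groups.swap(merged);
}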

A Step-by-Step Walkthrough 🔬

To see how the container-based approach works, let's trace the algorithm with a 16-element sequence. The magic happens in two phases: a recursive descent (pairing and merging) and a recursive ascent (splitting and inserting).

Initial shuffled sequence: {14, 1, 12, 9, 11, 10, 13, 7, 6, 0, 8, 4, 15, 3, 5, 2}

Phase 1: The Recursive Descent (Pairing & Merging)

The goal here is to recursively sort the "winners." We do this by pairing up our containers, comparing their front elements, and appending the loser's entire container to the winner's.

Entering Level 1: Initial Pairing

First, the algorithm takes the 16 raw numbers and groups them into 8 pairs. It compares the numbers in each pair and ensures the larger one is at the front.

--- Entering Recursion Level 1 ----------- Container Size: 16 -> 8

    index[0]: { 14,  1 }
    index[1]: { 12,  9 }
    index[2]: { 11, 10 }
    index[3]: { 13,  7 }
    index[4]: {  6,  0 }
    index[5]: {  8,  4 }
    index[6]: { 15,  3 }
    index[7]: {  5,  2 }

  ... (8 pairs total)

We now have 8 containers, where container.front() is the "winner" and the rest of the elements are the tracked "losers." The algorithm calls itself with this new set of 8 containers.

Entering Level 2: First Merge

The 8 containers are now paired up. For example, {14, 1} is paired with {12, 9}. Since 14 > 12, the second container is appended to the first, creating {14, 1, 12, 9}. This reduces the problem size from 8 containers to 4.

--- Entering Recursion Level 2 ----------- Container Size: 8 -> 4

    index[0]: { 14,  1, 12,  9 }
    index[1]: { 13,  7, 11, 10 }
    index[2]: {  8,  4,  6,  0 }
    index[3]: { 15,  3,  5,  2 }

  ... (4 pairs total)

Notice how the original pairs are perfectly preserved inside the new, larger containers. The process repeats.

Levels 3 & 4: Merging to the Base Case

The same logic applies, halving the number of containers at each step until only one remains.

  • Level 3: 4 containers are merged into 2.

  • Level 4: 2 containers are merged into 1.

--- Entering Recursion Level 4 ----------- Container Size: 2 -> 1

    index[0]: { 15,  3,  5,  2,  8,  4,  6,  0, 14,  1, 12,  9, 13,  7, 11, 10 }

The algorithm has now hit the base case (container.size() <= 1). The recursive descent is over. Now, the magic happens as we unwind.

Phase 2: The Ascent — Unwinding with Precision 🚀

As the recursion unwinds, the real work begins. This phase is not just a simple insertion; it's a highly optimized process driven by two key principles:

  1. The Jacobsthal Looping Engine: Instead of inserting losers (b1, b2, b3,...) in a simple linear order, the algorithm uses Jacobsthal numbers to pick them in a special sequence (b1, then b3, b2, then b5, b4, etc.). This order is designed to maximize the information gained from each binary search, as we insert elements from larger, more uncertain positions first.

  2. The Optimized Binary Search Range: This is the secret sauce. When we are ready to insert a loser b_k, we do not need to search the entire sorted main_chain. The algorithm guarantees that b_k is smaller than its winner a_k, which is in turn smaller than all subsequent winners (a_{k+1}, a_{k+2}, ...). Therefore, we only need to perform a binary search for b_k's position within the main_chain up to the position of a_k. This drastically reduces the size of each search and is fundamental to keeping the comparison count at its theoretical minimum.
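
With the container scheme, that bounded search maps directly onto std::lower_bound with the end iterator capped at the winner's position. Here is a minimal sketch, assuming groups are compared by their front element; the names frontLess and boundedInsert are mine, not taken from the original code:

#include <algorithm>
#include <vector>

typedef std::vector<int> Group;

// Groups are ordered by the value that represents them.
static bool frontLess(const Group &a, const Group &b)
{
    return a.front() < b.front();
}

// Insert `loser` into the sorted main chain, searching only the prefix
// [begin, winnerPos): since the loser is smaller than its winner, no
// position at or after the winner ever needs to be examined.
static void boundedInsert(std::vector<Group> &mainChain,
                          std::vector<Group>::iterator winnerPos,
                          const Group &loser)
{
    std::vector<Group>::iterator pos =
        std::lower_bound(mainChain.begin(), winnerPos, loser, frontLess);
    mainChain.insert(pos, loser);
}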

Let's see this in action, level by level.

The Split: Extracting Losers from the Main Chain ✂️

As the recursion unwinds, our main container holds groups of merged winners and losers. The first action is to split them. We iterate through each group, extracting the "loser" elements into a temporary list (pend_chain), while the sorted "winner" elements remain in the original container, forming the foundation for our final sorted list.

Note: The main_chain refers to the original container after the losers have been extracted and moved into the temporary pend_chain. The insertions then happen back into this main_chain.
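
A rough sketch of that split, assuming (as in the walkthrough) that every merged group holds exactly 2 * pair_size elements at this point; the names extractLosers, mainChain, and pendChain are mine:

#include <cstddef>
#include <vector>

typedef std::vector<int> Group;

// Split every merged group: the first pairSize elements stay in the main
// chain as the winner, the rest are moved out as the matching loser. The
// pend chain keeps the same order, so pendChain[k] is the loser of mainChain[k].
static std::vector<Group> extractLosers(std::vector<Group> &mainChain,
                                        std::size_t pairSize)
{
    std::vector<Group> pendChain;
    for (std::size_t i = 0; i < mainChain.size(); ++i)
    {
        Group &group = mainChain[i]; // holds 2 * pairSize elements here
        pendChain.push_back(Group(group.begin() + pairSize, group.end()));
        group.erase(group.begin() + pairSize, group.end());
    }
    return pendChain;
}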

Leaving Level 4: The First Split & Insertion

The algorithm splits the single, massive container using pair_size = 8.

--- Leaving Recursion Level: 4 ------- Pair Size: 8 ---

Winners (main_chain):
    index[0]: { 15,  3,  5,  2,  8,  4,  6,  0 }    // a1

Losers  (pend_chain):
    index[0]: { 14,  1, 12,  9, 13,  7, 11, 10 }    // b1

Insertion Process:

  • Manual Insert: With only one loser (b1), the process is simple. We insert b1 at the front of the main_chain. No search is needed. All insertions happen in place, modifying the container that holds the main_chain rather than creating a new one.

Leaving Level 3: Reconstructing a Larger Chain

Here, pair_size = 4. The function splits the two groups.

--- Leaving Recursion Level: 3 ------- Pair Size: 4 ---

Winners (main_chain):
    index[0]: { 14,  1, 12,  9 }    // a1
    index[1]: { 15,  3,  5,  2 }    // a2

Losers (pend_chain):
    index[0]: { 13,  7, 11, 10 }    // b1
    index[1]: {  8,  4,  6,  0 }    // b2

Insertion Process:

  1. Manual Insert: Loser b1 is inserted at the front of the main_chain.

  2. Jacobsthal Loop: The next group to insert is for Jacobsthal indices (1, 3]. This includes b2. To insert b2, we use our optimized search.

     • Insert b2 ({8,...}): We only need to binary search in the main_chain up to the position of its winner, a2.

Leaving Level 2: More Granularity

The process continues with the sorted list of 4 groups. The pair_size is now 4 / 2 = 2. The containers are split, creating a main_chain of 4 winners and a corresponding pend_chain of 4 losers.

--- Leaving Recursion Level: 2 ------- Pair Size: 2 ---

Winners (main_chain):
    index[0]: {  8, 4 }    // a1
    index[1]: { 13, 7 }    // a2
    index[2]: { 14, 1 }    // a3
    index[3]: { 15, 3 }    // a4

Losers (pend_chain):
    index[0]: {  6,  0 }    // b1
    index[1]: { 11, 10 }    // b2
    index[2]: { 12,  9 }    // b3
    index[3]: {  5,  2 }    // b4

Insertion Process:

  1. Manual Insert: Loser b1 ({6, 0}) is inserted at the front of the main_chain.

  2. Jacobsthal Loop (prev=1, next=3): This group includes losers b3 and b2. We insert them in descending order:

     • First, insert b3 ({12, 9}). The binary search is performed only on the part of the main_chain up to the position of its winner, a3 ({14, 1}).

     • Next, insert b2 ({11, 10}). The search is limited to the range up to its winner, a2 ({13, 7}).

  3. Jacobsthal Loop (prev=3, next=5): This group includes the last remaining loser, b4.

     • Insert b4 ({5, 2}). The search is limited to the range up to its winner, a4 ({15, 3}).

After this, the function returns a single, sorted list of 8 groups to the final level.

Leaving Level 1: The Final, Precise Insertion

This is where the optimizations are most visible. pair_size = 1.

--- Leaving Recursion Level: 1 ------- Pair Size: 1 ---

Winners (main_chain): {  5,  6,  8, 11, 12, 13, 14, 15  }
//                      a1  a2  a3  a4  a5  a6  a7  a8

Losers  (pend_chain): {  2,  0,  4, 10,  9,  7,  1,  3  }
//                      b1  b2  b3  b4  b5  b6  b7  b8

Insertion Process:

  1. Manual Insert: b1 (2) is inserted at the front of the main_chain. No search needed. The chain is now {2, 5, 6, ...}.

  2. Jacobsthal Loop Engine Begins:

     • Group (1, 3]: Process indices 3, 2.

        • Insert b3 (4): Perform a binary search (lower_bound) on the main_chain, but only up to the position of a3 (8).

        • Insert b2 (0): Perform a binary search on the main_chain only up to the position of a2 (6).

     • Group (3, 5]: Process indices 5, 4.

        • Insert b5 (9): Search for its position up to a5 (12).

        • Insert b4 (10): Search for its position up to a4 (11).

     • Group (5, 11]: Our list of losers ends at index 8, so we process indices 8, 7, 6.

        • Insert b8 (3): Search for its position up to a8 (15).

        • Insert b7 (1): Search for its position up to a7 (14).

        • Insert b6 (7): Search for its position up to a6 (13).

By precisely controlling both the order of insertion (Jacobsthal) and the range of each search (up to a_k), the algorithm avoids redundant comparisons, achieving its famed efficiency while operating in-place on a single growing container.
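
Putting the two principles together, the insertion phase can be sketched roughly as follows. This is a simplified outline, not the author's code: it reuses the jacobsthal() and frontLess() helpers sketched earlier, tracks each winner's current index so that locating the search bound costs no element comparisons, and leaves out the leftover odd element, which would be inserted last with a search over the whole chain.

#include <algorithm>
#include <cstddef>
#include <vector>

// Uses Group, jacobsthal() and frontLess() from the earlier sketches.
static void insertPendChain(std::vector<Group> &mainChain,
                            const std::vector<Group> &pendChain)
{
    if (pendChain.empty())
        return;

    // winnerIdx[k - 1] is the current index of winner a_k in mainChain;
    // it is updated after every insertion instead of being searched for.
    std::vector<std::size_t> winnerIdx(mainChain.size());
    for (std::size_t i = 0; i < winnerIdx.size(); ++i)
        winnerIdx[i] = i;

    // b1 always goes in front of a1: no comparison needed.
    mainChain.insert(mainChain.begin(), pendChain[0]);
    for (std::size_t i = 0; i < winnerIdx.size(); ++i)
        ++winnerIdx[i];

    std::size_t prev = 1; // highest pend index handled so far
    for (std::size_t j = 3; prev < pendChain.size(); ++j)
    {
        std::size_t next = jacobsthal(j); // 3, 5, 11, 21, ...
        if (next > pendChain.size())
            next = pendChain.size();      // losers ran out: clamp the group

        // Insert b_next, b_{next-1}, ..., b_{prev+1}, each with a binary
        // search bounded by the current position of its winner a_k.
        for (std::size_t k = next; k > prev; --k)
        {
            std::vector<Group>::iterator bound = mainChain.begin() + winnerIdx[k - 1];
            std::vector<Group>::iterator pos =
                std::lower_bound(mainChain.begin(), bound, pendChain[k - 1], frontLess);
            std::size_t at = static_cast<std::size_t>(pos - mainChain.begin());
            mainChain.insert(pos, pendChain[k - 1]);
            for (std::size_t i = 0; i < winnerIdx.size(); ++i)
                if (winnerIdx[i] >= at)
                    ++winnerIdx[i];       // everything at or after the slot shifted right
        }
        prev = next;
    }
}

In the walkthrough above, this loop would produce exactly the groups (1, 3], (3, 5], and the clamped (5, 11] shown at Level 1.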

Merge-insertion Sort Pseudocode

function MergeInsertionSort(container_of_containers C):
    // Base case: a list of 0 or 1 is already sorted.
    if C.size() <= 1:
        return

    // Handle any odd element out by setting it aside.
    leftover = handleOddElement(C)

    // Phase 1: Pair up, compare fronts, and merge the losers
    // container into the winners. This halves the number of containers.
    for each adjacent pair (A, B) in C:
        if A.front() < B.front():
            swap(A, B)
        A.append_contents_of(B)
        // Remove B, as its contents are now merged into A.
        C.erase(B)

    // Recursive call on the smaller set of winner-containers.
    MergeInsertionSort(C)

    // --- Unwinding Phase ---

    // Calculate pair_size to know where the split between
    // a winner and its original loser(s) is located.
    pair_size = C.front().size() / 2

    // C now holds the main_chain of winners. Extract the
    // pend_chain of losers from within each container.
    pend_chain = extract_losers(C, pair_size)

    // insert the pend_chain back into the main_chain (C)
    // using the optimal Jacobsthal insert sequence.
    insertion_using_jacobsthal(C, pend_chain, leftover)
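
In actual C++, the entry point would wrap each raw value into its own single-element group, run the recursive sort, and then read the sorted values back off the group fronts; by the time the top-level call returns, every group has collapsed back to a single element. A minimal sketch of that wrapper (sortValues is my name for it, and mergeInsertionSort stands in for the recursive function outlined above):

#include <cstddef>
#include <vector>

typedef std::vector<int> Group;

// Recursive Ford-Johnson sort over groups, as outlined in the pseudocode.
void mergeInsertionSort(std::vector<Group> &groups);

std::vector<int> sortValues(const std::vector<int> &values)
{
    // Wrap every value in a singleton group: container<container<int>>.
    std::vector<Group> groups;
    groups.reserve(values.size());
    for (std::size_t i = 0; i < values.size(); ++i)
        groups.push_back(Group(1, values[i]));

    mergeInsertionSort(groups);

    // After the final unwinding level (pair_size == 1) every group is a
    // singleton again, so flattening just reads the fronts in order.
    std::vector<int> sorted;
    sorted.reserve(groups.size());
    for (std::size_t i = 0; i < groups.size(); ++i)
        sorted.push_back(groups[i].front());
    return sorted;
}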

Conclusion & Key Takeaways 🚀

This container-based implementation of Merge-insertion Sort demonstrates that it's possible to adhere to the algorithm's theoretical efficiency even when faced with practical constraints. By rethinking the data structure, we can simulate the tree-like dependencies required by the algorithm in a clean, concise, and efficient manner.

This approach not only provides a performant sorting solution but also serves as a great example of how abstract algorithmic concepts can be translated into practical code with a bit of creative thinking.


This content originally appeared on DEV Community and was authored by ABDELKADER ADOUAB

