In this post, I document my journey of building a Binary Search Tree (BST) from the ground up in Go. We start with raw memory pointers and end with a high-performance search engine that beats linear scanning by 6x.
1. The Physical Foundation: The Node
A tree is just a collection of “buckets” in memory. In Go, we define a node as a struct holding a value and pointers to its children.
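Here is a minimal sketch of that node; the field names are my own illustration, not necessarily the post’s exact code:

```go
// Node is one "bucket" in memory: a value plus pointers to the two
// child buckets. A nil pointer simply means "no child here yet".
type Node struct {
	Value int
	Left  *Node
	Right *Node
}
```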
2. The Logic: Smart Insertion
To make this a search tree, we need a rule: values smaller than the current node go left, larger values go right. We use recursion to let the tree “find” the correct empty slot.
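A sketch of that recursive insert, assuming duplicates are silently ignored (the post’s actual code may handle them differently):

```go
// Insert walks down until it reaches the nil slot where the value
// belongs, hangs a new Node there, and returns the root of this
// subtree so the parent can re-link it.
func Insert(n *Node, value int) *Node {
	if n == nil {
		return &Node{Value: value}
	}
	if value < n.Value {
		n.Left = Insert(n.Left, value)
	} else if value > n.Value {
		n.Right = Insert(n.Right, value)
	}
	// value == n.Value: duplicate, do nothing (an assumption).
	return n
}
```

Returning the subtree root keeps the call site simple: `root = Insert(root, 10)`.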
3. Visualization: Seeing the Branches
Printing a pointer address like 0xc00000e030 doesn’t help. We wrote a visualizer that tilts the tree sideways, so depth shows up as indentation.
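One way to write such a visualizer is to print the right subtree first and render depth as indentation; the exact formatting below is my guess, not necessarily the post’s:

```go
import (
	"fmt"
	"strings"
)

// PrintSideways prints the tree rotated 90 degrees: the right subtree
// appears above the node, the left subtree below, and each level of
// depth adds one step of indentation.
func PrintSideways(n *Node, depth int) {
	if n == nil {
		return
	}
	PrintSideways(n.Right, depth+1)
	fmt.Printf("%s└── %d\n", strings.Repeat("    ", depth), n.Value)
	PrintSideways(n.Left, depth+1)
}
```

Calling PrintSideways(root, 0) on a small sample tree, built for example by inserting 10, 5, 15, 2, 7, 12 and 20 in that order, produces: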
```
        └── 20
    └── 15
        └── 12
└── 10
        └── 7
    └── 5
        └── 2
```
4. The Search Algorithm
The search is where the performance magic happens. Instead of looking at every number, we “discard” roughly half of the remaining tree at every step.
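A sketch of the lookup, written iteratively here so the hot path does no allocation (the post’s version may well be recursive):

```go
// Search walks from the root toward the target, moving left or right
// at each node and never revisiting the half it has ruled out.
func Search(n *Node, target int) bool {
	for n != nil {
		switch {
		case target == n.Value:
			return true
		case target < n.Value:
			n = n.Left
		default:
			n = n.Right
		}
	}
	return false
}
```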
5. The Performance Trap (Benchmarking)
When I first ran Go benchmarks, the results were shocking. The tree was orders of magnitude slower than a simple array scan!
Initial Results:
TreeSearch: 777,923 ns/op (12,300 allocs/op)
LinearSearch: 38.75 ns/op
The Lesson: Don’t Benchmark the Setup
I was building the entire tree inside the benchmark loop, and allocating memory is expensive. By moving the tree construction outside the loop and calling b.ResetTimer() after the setup, the measured allocations dropped to 0.
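A corrected benchmark would look roughly like this; the dataset size, lookup target, and sink variable are placeholders of mine, not the post’s actual numbers:

```go
import "testing"

var found bool // package-level sink so the compiler cannot elide the call

func BenchmarkTreeSearch(b *testing.B) {
	// Setup: build the tree once, before the timed loop.
	var root *Node
	for v := 0; v < 10_000; v++ { // hypothetical dataset (still sorted here)
		root = Insert(root, v)
	}
	b.ResetTimer() // discard the setup cost measured so far

	for i := 0; i < b.N; i++ {
		found = Search(root, 9_999) // hypothetical lookup target
	}
}
```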
6. The Final Battle: Balance
Even with zero allocations, the tree was still slightly slower than a linear search. Why? Sorted input.
If you insert 1, 2, 3… in order, you don’t really get a tree; you get a straight line, because every new node hangs off the right child of the previous one, and a lookup degrades into a linked-list walk. By “shuffling” the input to produce a bushy, roughly balanced tree, we finally unlocked the true speed of O(log N).
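A sketch of the fix: shuffle a copy of the values before inserting them (BuildBalanced is my name for it, not necessarily the post’s):

```go
import "math/rand"

// BuildBalanced inserts the values in random order. On average this
// keeps the tree bushy, so lookups stay close to O(log N) depth.
func BuildBalanced(values []int) *Node {
	shuffled := make([]int, len(values))
	copy(shuffled, values)
	rand.Shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})

	var root *Node
	for _, v := range shuffled {
		root = Insert(root, v)
	}
	return root
}
```

Shuffling only balances the tree in expectation, but that is enough to unlock the numbers below.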
Final Benchmark Results
| Algorithm | Time per Search | Improvement |
|---|---|---|
| Linear Search | 42.30 ns | 1x (Baseline) |
| Balanced BST | 7.01 ns | 6x Faster |
Conclusion
A Database Index isn’t magic; it’s just a well-balanced tree. By understanding how pointers connect in memory and avoiding the “straight-line” trap, we can build systems that stay fast even as data grows to millions of rows.