Category: Data structures

Splay trees and its applications

It is also a BST and self balancing tree like AVL and red-black trees but is used
in real-world applications where out of 100, 80 times the data which is queried
is 20% data in the database.

This tree implemenation not just balances the tree, to get insertion/deletion/search to complete
in O(log n) time, but also move recently accessed the data to the root. This way
frequent access to the same data will take O(1) time in next access.

Having said that, still the algorithm will result in O(log n) amortized time complexity.

Comparison between AVL, Red-Black and Splay trees

AVL and red-black trees only balances tree when tree is changed(insertion or deletion)
On contrary, Splay trees balances tree in read-only operation (like search) also.Splay tree does not have to keep extra book-keeping information in the data structure like AVL trees (height) or red-black trees (color)

Red-black trees are also balanced but less balanced than AVL trees.
Red-black trees are used in applications where insertion and deletion are frequent operations.Where as AVL trees are used for quick searching as searching takes O(Log n) because tree is balanced.

Due to insertion and deletion, tree need rebalancing. Hence if application involves
more insertion and deletion operations then AVL trees is not a best choice.
Use red-black trees instead for quick insertion and deletion.

Splay trees are also less balanced than AVl trees. Sometimes splay trees can even become less balanced than red-black trees. Why ? Because due to splaying operations they may result becoming skew trees.
Splay trees are generally used for fast searching (like AVL trees). In both AVL and splay trees, performance remains O(log n) only. But the basic difference is that AVL trees take O(log n) for every search operation whereas but splay trees can take O(1) time for frequently searched items.

Applications of Splay trees

  • Due to their reference of locality property, they can be used in implementing logic behind caches.
  • They can come handy in joining two trees – merging a tree between 2 elements of another tree (O(log M + N))
  • Also they can be used in bifurcating the data into 2 distinguish data sets – cutting a tree
  • Can be used in garbage collectors
  • Querying faster than O(log(N)) for highly biased query distributions
  • Used in network router (aprt from other data structures like Hash tables, trie, etc)

Advantages

  • Amortized O(log n) performance
  • Requires no extra book-keeping info
  • Can handle duplicate data (unlike other trees like AVL, red-black)

Disadvantages

  • Cannot be used in multithreading applications as even read-only(search) operations also alter the tree structure. Allowing multiple threads to search keys may corrupt the trees.

Implementation

Find my code here.

Reference

AVL trees and its applications

Introduction

AVL trees are self balancing binary search trees (BST).
To determine whether the tree is balanced, the height of left and right subtree is checked for each node in the tree.
The BST is considered to be balanced if |H(L) – H(R)| <= 1, where H(L) and H(R) are height of left and right subtree of a node respectively.
In other words, left and right subtrees need to be AVLs, for tree to be AVL.

Height of node in AVL = |H(L) – H(R)| <= 1

Need of AVL tree

Binary search trees have a cool property of segregating and organizing data, so that decisions to be taken at every node (while traverse) are binary. Due to this property, the search time in BSTs is
O(log n).
But BST cannot control the order in which data comes for insertion. Due to this, sometimes BST becomes skew.
Ans as we know that search in BST highly depends on height of the tree. when height is log n, the search will complete in log n time.
Having said that,  AVL trees also cannot control the order in which data comes for insertion, but they can re-arrange data in tree, so the searching times can reduce to O(log n) again. By rearrangement, AVL trees reduce the height of tree to log n

Red-Black trees vs AVL trees

Though this post focus on AVL trees, it is worth mentioning some of differences between AVL trees and Red-Black trees.
AVL trees are used frequently for quick searching as searching takes O(Log n) because tree is balanced.
Where as insertion and deletions are comparatively more tedious and slower as at every insertion and deletion, it requires re-balancing.
Hence, AVL trees are preferred for application, which are search intensive.
For application which involves more insert and delete operations, AVL trees are not a best choice.
For frequent insertion and deletion, red-black trees are used.
Red-black trees  are also balanced trees but comparatively less balanced than AVL trees.
In subsequent posts, I will try to explicate more on red-black trees.

Implementation

In AVL trees, at every decision we query height of the node. Hence AVL data structure also stores height of the node.

Ways the tree Rotates to re-balance

LL and RR rotation

 

1

LL means when new node inserted in left of left subtree – rotate right
RR means when new node inserted in right of right subtree – rotate left

LR and RL rotation

2

 

 LR means when new node inserted in right of left subtree – rotate left and then rotate right
 RL means when new node inserted in left of right subtree – rotate right and then rotate left
Tip: confused about LR and RL, read these in reverse to know the right case
but rotate the order LR is written i..e
For LR – R of L and rotate L then R
Check out my AVL tree C implementation here

Applications

As mentioned above, AVL trees are used for frequent insertion. One of the examples I know of is it is used in Memory management subsystem of linux kernel to search memory regions of processes during preemption.

Reference