Introduction to skip list: An advanced LinkedList

Skip lists are a probabilistic data structure that uses multiple levels of linked lists to provide efficient search and insertion operations. Each node in a skip list has multiple forward pointers, allowing for efficient traversal of nodes that are farther away from the current node.

Skip lists are designed to improve the efficiency of search and insertion operations in large, sorted datasets. They were originally proposed by William Pugh in 1989 as an alternative to balanced binary trees, which can be more complex to implement and maintain.

The time complexity of search, insertion, and deletion in a skip list is O(log n), where n is the number of elements in the list. This is the same as the time complexity of these operations in a balanced binary tree. The space complexity of a skip list is O(n), where n is the number of elements in the list.

Skip lists are efficient because they use a probabilistic approach to determine the number of levels in the list. As a result, the average number of nodes that need to be traversed during a search or insertion operation is relatively small, even for large datasets. This makes skip lists a good choice for situations where efficient search and insertion are important and memory usage is not a concern.

Pseudocode implementation for a skip list

// define the skip list node
class SkipNode:
    value
    forward: array of pointers to next nodes

// define the skip list
class SkipList:
    head: pointer to first node
    level: current maximum level
    p: probability factor

    // initialize the skip list
    def __init__(self, p):
        head = SkipNode(null, array of size 1)
        level = 0
        this.p = p

    // generate a random level for a new node
    def randomLevel(self):
        level = 0
        while random() < p and level < MAX_LEVEL:
            level = level + 1
        return level

    // find the node with the given value
    def find(self, value):
        node = head
        for i in range(level, -1, -1):
            while node.forward[i] is not null and node.forward[i].value < value:
                node = node.forward[i]
        if node.forward[0] is not null and node.forward[0].value == value:
            return node.forward[0]
        else:
            return null

    // insert a new node with the given value
    def insert(self, value):
        update = array of pointers to nodes
        node = head
        for i in range(level, -1, -1):
            while node.forward[i] is not null and node.forward[i].value < value:
                node = node.forward[i]
            update[i] = node
        node = node.forward[0]
        if node is null or node.value != value:
            newLevel = randomLevel()
            if newLevel > level:
                for i in range(level + 1, newLevel + 1):
                    update[i] = head
                level = newLevel
            node = SkipNode(value, array of size newLevel + 1)
            for i in range(0, newLevel + 1):
                node.forward[i] = update[i].forward[i]
                update[i].forward[i] = node

    // delete the node with the given value
    def delete(self, value):
        update = array of pointers to nodes
        node = head
        for i in range(level, -1, -1):
            while node.forward[i] is not null and node.forward[i].value < value:
                node = node.forward[i]
            update[i] = node
        node = node.forward[0]
        if node is not null and node.value == value:
            for i in range(0, level + 1):
                if update[i].forward[i] != node:
                    break
                update[i].forward[i] = node.forward[i]
            while level > 0 and head.forward[level] is null:
                level = level - 1

A skip list is a probabilistic data structure that uses multiple levels of linked lists to provide efficient search and insertion operations. The skip list is implemented using two classes – SkipNode and SkipList. The SkipNode class represents a node in the skip list, with a value attribute to store the value of the node, and a forward array to store pointers to the next nodes at each level. The SkipList class represents the skip list itself, with a head attribute to store a pointer to the first node, a level attribute to store the current maximum level of the skip list, and a p attribute to store the probability factor used to generate new levels.

The __init__ method initializes the skip list by creating the head node with a null value and a forward array of size 1, and setting the level to 0 and the probability factor to p. The randomLevel method generates a random level for a new node based on the probability factor p. It starts with a level of 0 and increments the level by 1 with probability p until it reaches the maximum level MAX_LEVEL.

The find method searches the skip list for a node with the given value. It starts at the head node and traverses the forward array at each level until it reaches a node with a greater value than the target value or a null node. If it finds a node with the target value, it returns that node; otherwise, it returns null.

The insert method inserts a new node with the given value into the skip list. It starts at the head node and traverses the forward array at each level until it finds the appropriate position for the new node. It stores the nodes it traverses in an update array to update their forward pointers. If the new node has a higher level than the current maximum level, it updates the update array to include the head node at the new levels and sets the level to the new level. It then creates the new node with the appropriate level and forward array, and updates the forward pointers of the nodes in the update array to point to the new node.

The delete method deletes the node with the given value from the skip list. It starts at the head node and traverses the forward array at each level until it finds the node with the target value or a null node. It stores the nodes it traverses in an update array to update their forward pointers. If it finds the node with the target value, it updates the forward pointers of the nodes in the update array to skip the node with the target value, effectively deleting it. It then checks if the current maximum level is greater than the actual maximum level of the skip list and adjusts it if necessary. Finally, it returns null if the node with the target value was not found, or the node with the target value if it was found and deleted.

Skip lists can be used in a variety of real-life scenarios, such as:

Database indexing: Skip lists can be used to index large databases, allowing for efficient search and retrieval of data.
Web search engines: Skip lists can be used to build search engine indexes, allowing for efficient search of web pages based on keywords.
Game development: Skip lists can be used to store game data such as high scores, player rankings, and other leaderboard data.
Network routing: Skip lists can be used to efficiently route packets through a network by maintaining a sorted list of available network paths.
File systems: Skip lists can be implemented by maintaining a sorted list of file blocks, allowing for efficient reading and writing of files.
Compiler design: Skip lists can be used in compilers to store information about the source code, such as symbol tables, allowing for efficient lookup during compilation.
Natural language processing: Skip lists can be used to store and search large collections of documents in natural language processing applications, such as search engines and information retrieval systems.

Different types of skip lists

Classic skip list: This is the original skip list proposed by William Pugh in 1989. It uses a fixed probability factor to determine the level of each new node, and has a maximum level that is logarithmic in the number of nodes.
Perfect skip list: This is a type of skip list that has a fixed number of levels, and every node is guaranteed to have exactly the same number of forward pointers at each level. This allows for more efficient search and insertion operations, as well as easier implementation and analysis of the skip list.
Fast skip list: This is a variation of the classic skip list that uses a variable probability factor to improve the efficiency of search and insertion operations. It adjusts the probability factor based on the current size of the skip list and the desired number of levels, allowing for more efficient traversal of the skip list.
Sorted skip list: This is a type of skip list that maintains its nodes in sorted order based on some key value. This can improve the efficiency of search and insertion operations, especially when the data is frequently accessed in sorted order.
Skiplist with Memory Management: This is a variation of the classic skip list where memory allocation is done in chunks of fixed size. This helps in reducing memory fragmentation and improves memory management.
Cache-aware skip list: This is a variant of the skip list that is optimized for use in cache-aware systems. It tries to reduce the number of cache misses by rearranging the nodes to improve data locality.
Concurrent skip list: This is a type of skip list that is designed for use in multi-threaded environments, where multiple threads may access the list concurrently. It uses lock-free algorithms to ensure that concurrent access does not lead to data corruption or inconsistency.
Fractional cascading skip list: This is a type of skip list that is optimized for range queries. It uses a technique called fractional cascading to speed up range queries by reusing pointers from higher-level lists in lower-level lists.
Hierarchical skip list: This is a type of skip list that uses a hierarchical structure to provide efficient access to nodes at different levels of granularity. It consists of multiple skip lists at different levels, with each higher-level list skipping over multiple nodes in the lower-level lists.
Darts and Double-Array-based Skip Lists: This is a type of skip list that is optimized for use in text processing applications. It uses a technique called Darts and Double-Array to compress the skip list and reduce the memory overhead.
Skip graph: This is a type of skip list that extends the skip list concept to a distributed system. It uses a decentralized approach to maintain a distributed index of key-value pairs, allowing for efficient search and insertion operations in a distributed environment.

Conclusion

Skip lists are a versatile data structure that can be used in many different applications. They offer efficient search and insertion operations with a relatively low memory overhead, making them a good choice for situations where memory usage is a concern. However, they may not be the best choice in situations where space efficiency is more important than search and insertion efficiency, as their space complexity is O(n).

Written by Shahid

With 10+ years in software engineering, I specialize in blogging and web development. My skills span front-end, back-end, database design, web security, and SEO. Passionate about crafting helpful digital experiences.

Introduction to skip list: An advanced LinkedList

Written by Shahid

Java LinkedList Class Implementation Guide and Examples

Linked list implementation and operations in Ruby