Iterator Package Documentation
The iterator package provides a unified interface and implementations for traversing key-value data across the Kevo engine. Iterators are a fundamental abstraction used throughout the system for ordered access to data, regardless of where it's stored.
Overview
Iterators in the Kevo engine follow a consistent interface pattern that allows components to access data in a uniform way. Because every iterator exposes the same methods, iterators can be composed into more complex data access patterns while the API stays simple and consistent.
Key responsibilities of the iterator package include:
- Defining a standard iterator interface
- Providing adapter patterns for implementing iterators
- Implementing specialized iterators for different use cases
- Supporting bounded, composite, and hierarchical iteration
Iterator Interface
Core Interface
The core Iterator interface defines the contract that all iterators must follow:
type Iterator interface {
    // Positioning methods
    SeekToFirst()            // Position at the first key
    SeekToLast()             // Position at the last key
    Seek(target []byte) bool // Position at the first key >= target
    Next() bool              // Advance to the next key

    // Access methods
    Key() []byte   // Return the current key
    Value() []byte // Return the current value
    Valid() bool   // Check if the iterator is valid

    // Special methods
    IsTombstone() bool // Check if current entry is a deletion marker
}
This interface is used across all storage layers (MemTable, SSTables, transactions) to provide consistent access to key-value data.
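To make the contract concrete, here is a minimal sketch of a base iterator over a sorted in-memory slice. The names `entry`, `sliceIterator`, and `newSliceIterator` are hypothetical and not part of the Kevo package. Two assumptions are not stated by the interface itself: the `bool` returned by `Seek` and `Next` means "the iterator is valid after the call", and a `nil` value marks a tombstone.

```go
package main

import (
    "bytes"
    "fmt"
    "sort"
)

// Iterator repeats the interface from this section so the sketch compiles on its own.
type Iterator interface {
    SeekToFirst()
    SeekToLast()
    Seek(target []byte) bool
    Next() bool
    Key() []byte
    Value() []byte
    Valid() bool
    IsTombstone() bool
}

// entry is a hypothetical record; a nil value marks a tombstone (assumption).
type entry struct {
    key, value []byte
}

// sliceIterator is a hypothetical iterator over entries sorted by key.
type sliceIterator struct {
    entries []entry
    pos     int // index of the current entry; out of range means invalid
}

// newSliceIterator returns an iterator that is invalid until positioned.
func newSliceIterator(entries []entry) *sliceIterator {
    return &sliceIterator{entries: entries, pos: -1}
}

func (it *sliceIterator) SeekToFirst() { it.pos = 0 }
func (it *sliceIterator) SeekToLast()  { it.pos = len(it.entries) - 1 }

// Seek binary-searches for the first key >= target.
func (it *sliceIterator) Seek(target []byte) bool {
    it.pos = sort.Search(len(it.entries), func(i int) bool {
        return bytes.Compare(it.entries[i].key, target) >= 0
    })
    return it.Valid()
}

// Next advances only when positioned on an entry.
func (it *sliceIterator) Next() bool {
    if it.Valid() {
        it.pos++
    }
    return it.Valid()
}

func (it *sliceIterator) Valid() bool { return it.pos >= 0 && it.pos < len(it.entries) }

func (it *sliceIterator) Key() []byte {
    if !it.Valid() {
        return nil
    }
    return it.entries[it.pos].key
}

func (it *sliceIterator) Value() []byte {
    if !it.Valid() {
        return nil
    }
    return it.entries[it.pos].value
}

func (it *sliceIterator) IsTombstone() bool {
    return it.Valid() && it.entries[it.pos].value == nil
}

// Compile-time check that sliceIterator satisfies Iterator.
var _ Iterator = (*sliceIterator)(nil)

func main() {
    it := newSliceIterator([]entry{
        {[]byte("a"), []byte("1")},
        {[]byte("c"), nil}, // tombstone
        {[]byte("e"), []byte("5")},
    })
    for it.SeekToFirst(); it.Valid(); it.Next() {
        fmt.Printf("%s tombstone=%v\n", it.Key(), it.IsTombstone())
    }
    if it.Seek([]byte("b")) {
        fmt.Printf("Seek(b) landed on %s\n", it.Key())
    }
}
```

Note that `Seek([]byte("b"))` lands on `c`, the first key that is not smaller than the target, so callers that need an exact match still have to compare keys.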
Iterator Types and Patterns
Adapter Pattern
The package provides adapter patterns to simplify implementing the full interface:
- Base Iterators:
  - Implement the core interface directly for specific data structures
  - Examples: SkipList iterators, Block iterators
- Adapter Wrappers:
  - Transform existing iterators to provide additional functionality
  - Examples: Bounded iterators, filtering iterators
Bounded Iterators
Bounded iterators limit the range of keys an iterator will traverse:
- Key Range Limiting:
  - Apply start and end bounds to constrain iteration
  - Skip keys outside the specified range
- Implementation Approach:
  - Wrap an existing iterator
  - Filter out keys outside the desired range
  - Maintain the underlying iterator's properties otherwise
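The wrapping approach above can be sketched as follows. This is not the package's actual bounded iterator: `boundedIterator` and `sliceIterator` are hypothetical names, and treating the start bound as inclusive, the end bound as exclusive, and `nil` as "no bound" are assumptions made for the example.

```go
package main

import (
    "bytes"
    "fmt"
    "sort"
)

// Iterator repeats the interface from the Core Interface section.
type Iterator interface {
    SeekToFirst()
    SeekToLast()
    Seek(target []byte) bool
    Next() bool
    Key() []byte
    Value() []byte
    Valid() bool
    IsTombstone() bool
}

// sliceIterator is a hypothetical stand-in source over sorted keys.
type sliceIterator struct {
    keys []string
    pos  int
}

func (s *sliceIterator) SeekToFirst()      { s.pos = 0 }
func (s *sliceIterator) SeekToLast()       { s.pos = len(s.keys) - 1 }
func (s *sliceIterator) Next() bool        { s.pos++; return s.Valid() }
func (s *sliceIterator) Valid() bool       { return s.pos >= 0 && s.pos < len(s.keys) }
func (s *sliceIterator) Key() []byte       { return []byte(s.keys[s.pos]) }
func (s *sliceIterator) Value() []byte     { return []byte("v") }
func (s *sliceIterator) IsTombstone() bool { return false }
func (s *sliceIterator) Seek(t []byte) bool {
    s.pos = sort.SearchStrings(s.keys, string(t))
    return s.Valid()
}

// boundedIterator hides keys outside [start, end). The inclusive start,
// exclusive end, and nil-means-unbounded conventions are assumptions.
type boundedIterator struct {
    Iterator   // Key, Value, and IsTombstone are delegated unchanged
    start, end []byte
}

func (b *boundedIterator) inRange(k []byte) bool {
    return (b.start == nil || bytes.Compare(k, b.start) >= 0) &&
        (b.end == nil || bytes.Compare(k, b.end) < 0)
}

// Valid is true only while the source sits on a key inside the bounds.
func (b *boundedIterator) Valid() bool {
    return b.Iterator.Valid() && b.inRange(b.Iterator.Key())
}

func (b *boundedIterator) SeekToFirst() {
    if b.start == nil {
        b.Iterator.SeekToFirst()
        return
    }
    b.Iterator.Seek(b.start)
}

// Seek clamps targets below the start bound up to the start bound.
func (b *boundedIterator) Seek(target []byte) bool {
    if b.start != nil && bytes.Compare(target, b.start) < 0 {
        target = b.start
    }
    b.Iterator.Seek(target)
    return b.Valid()
}

func (b *boundedIterator) Next() bool {
    if b.Valid() {
        b.Iterator.Next()
    }
    return b.Valid()
}

// SeekToLast scans forward because the interface has no Prev method.
func (b *boundedIterator) SeekToLast() {
    var last []byte
    found := false
    for b.SeekToFirst(); b.Valid(); b.Iterator.Next() {
        last, found = append(last[:0], b.Key()...), true
    }
    if found {
        b.Iterator.Seek(last)
    }
}

func main() {
    src := &sliceIterator{keys: []string{"user:0500", "user:1000", "user:1500", "user:2000"}}
    it := &boundedIterator{Iterator: src, start: []byte("user:1000"), end: []byte("user:2000")}
    for it.SeekToFirst(); it.Valid(); it.Next() {
        fmt.Printf("%s\n", it.Key())
    }
}
```

Embedding the source iterator keeps the wrapper small: only the positioning methods and `Valid` are overridden, and everything else passes through to the wrapped iterator.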
Composite Iterators
Composite iterators combine multiple source iterators into a single view:
- MergingIterator:
  - Merges multiple iterators into a single sorted stream
  - Handles duplicate keys according to specified policy
- Implementation Details:
  - Maintains a priority queue or similar structure
  - Selects the next appropriate key from all sources
  - Handles edge cases like exhausted sources
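As an illustration of the priority-queue idea, the sketch below merges sorted sources with Go's `container/heap`. It is a simplified stand-in rather than the package's MergingIterator: the `source` interface is a hypothetical subset of Iterator, `sliceSource`, `item`, `minHeap`, and `merge` are made-up names, and the duplicate policy (the source listed first wins) is an assumption.

```go
package main

import (
    "bytes"
    "container/heap"
    "fmt"
)

// source is the subset of the Iterator interface this merge needs.
type source interface {
    SeekToFirst()
    Next() bool
    Valid() bool
    Key() []byte
    Value() []byte
}

// sliceSource is a hypothetical in-memory source of sorted key-value pairs.
type sliceSource struct {
    kvs [][2]string
    pos int
}

func (s *sliceSource) SeekToFirst()  { s.pos = 0 }
func (s *sliceSource) Next() bool    { s.pos++; return s.Valid() }
func (s *sliceSource) Valid() bool   { return s.pos < len(s.kvs) }
func (s *sliceSource) Key() []byte   { return []byte(s.kvs[s.pos][0]) }
func (s *sliceSource) Value() []byte { return []byte(s.kvs[s.pos][1]) }

// item pairs a source with its position in the input list.
type item struct {
    src source
    idx int
}

// minHeap orders items by current key, breaking ties by source index.
type minHeap []item

func (h minHeap) Len() int      { return len(h) }
func (h minHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h minHeap) Less(i, j int) bool {
    if c := bytes.Compare(h[i].src.Key(), h[j].src.Key()); c != 0 {
        return c < 0
    }
    return h[i].idx < h[j].idx
}
func (h *minHeap) Push(x any) { *h = append(*h, x.(item)) }
func (h *minHeap) Pop() any {
    old := *h
    last := old[len(old)-1]
    *h = old[:len(old)-1]
    return last
}

// merge emits each distinct key once, in sorted order. When several sources
// hold the same key, the value from the lowest-indexed source is emitted.
func merge(sources []source, emit func(key, value []byte)) {
    h := &minHeap{}
    for i, s := range sources {
        if s.SeekToFirst(); s.Valid() {
            heap.Push(h, item{src: s, idx: i})
        }
    }
    var last []byte
    first := true
    for h.Len() > 0 {
        top := (*h)[0]
        key := top.src.Key()
        if first || !bytes.Equal(key, last) {
            emit(key, top.src.Value())
            last, first = append(last[:0], key...), false
        }
        if top.src.Next() {
            heap.Fix(h, 0) // source advanced: restore heap order
        } else {
            heap.Pop(h) // source exhausted: drop it from the queue
        }
    }
}

func main() {
    a := &sliceSource{kvs: [][2]string{{"a", "new"}, {"c", "3"}}}
    b := &sliceSource{kvs: [][2]string{{"a", "old"}, {"b", "2"}}}
    merge([]source{a, b}, func(k, v []byte) {
        fmt.Printf("%s=%s\n", k, v)
    })
}
```

With S sources, each step costs O(log S) for the heap adjustment. That is the usual reason to prefer a priority queue over rescanning every source once the number of sources grows.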
Hierarchical Iterators
Hierarchical iterators implement the LSM tree's multi-level view:
- LSM Hierarchy Semantics:
  - Newer sources (e.g., MemTable) take precedence over older sources (e.g., SSTables)
  - Combines multiple levels into a single, consistent view
  - Respects the "newest version wins" rule for duplicate keys
- Source Precedence:
  - Iterators are provided in order from newest to oldest
  - When multiple sources contain the same key, the newer source's value is used
  - Tombstones (deletion markers) hide older values
Implementation Details
Hierarchical Iterator
The HierarchicalIterator is a cornerstone of the storage engine:
- Source Management:
  - Maintains an ordered array of source iterators
  - Sources must be provided in newest-to-oldest order
  - Typically includes MemTable, immutable MemTables, and SSTable iterators
- Key Selection Algorithm:
  - During Seek, Next, etc., examines all valid sources
  - Tracks seen keys to handle duplicates
  - Selects the smallest key that satisfies the operation's constraints
  - For duplicate keys, uses the value from the newest source
- Thread Safety:
  - Mutex protection for concurrent access
  - Safe for concurrent reads, though typically used from one thread
- Memory Efficiency:
  - Lazily fetches values only when needed
  - Doesn't materialize full result set in memory
Key Selection Process
The key selection process is a critical algorithm in hierarchical iterators:
- For SeekToFirst:
  - Position all source iterators at their first key
  - Select the smallest key across all sources, considering duplicates
- For Seek(target):
  - Position all source iterators at the smallest key >= target
  - Select the smallest valid key >= target, considering duplicates
- For Next:
  - Remember the current key
  - Advance source iterators past this key
  - Select the smallest key that is > current key
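The three procedures above can be sketched with a linear scan over the sources. This is an illustrative model, not the actual HierarchicalIterator: `hierarchical`, `source`, and `sliceSource` are hypothetical names, mutex protection is omitted, and a `nil` value is assumed to mark a tombstone.

```go
package main

import (
    "bytes"
    "fmt"
    "sort"
)

// source is the subset of Iterator that the selection logic relies on.
type source interface {
    SeekToFirst()
    Seek(target []byte) bool
    Next() bool
    Valid() bool
    Key() []byte
    Value() []byte
}

// sliceSource is a hypothetical sorted in-memory source.
type sliceSource struct {
    keys []string
    vals [][]byte // nil marks a tombstone (assumption)
    pos  int
}

func (s *sliceSource) SeekToFirst()  { s.pos = 0 }
func (s *sliceSource) Next() bool    { s.pos++; return s.Valid() }
func (s *sliceSource) Valid() bool   { return s.pos < len(s.keys) }
func (s *sliceSource) Key() []byte   { return []byte(s.keys[s.pos]) }
func (s *sliceSource) Value() []byte { return s.vals[s.pos] }
func (s *sliceSource) Seek(t []byte) bool {
    s.pos = sort.SearchStrings(s.keys, string(t))
    return s.Valid()
}

// hierarchical holds sources ordered newest to oldest.
type hierarchical struct {
    sources []source
    cur     int // index of the source holding the current key, or -1
}

// selectSmallest picks the smallest current key. Only a strictly smaller key
// replaces the choice, so on ties the earlier (newer) source wins.
func (h *hierarchical) selectSmallest() {
    h.cur = -1
    for i, s := range h.sources {
        if s.Valid() && (h.cur < 0 || bytes.Compare(s.Key(), h.sources[h.cur].Key()) < 0) {
            h.cur = i
        }
    }
}

func (h *hierarchical) SeekToFirst() {
    for _, s := range h.sources {
        s.SeekToFirst()
    }
    h.selectSmallest()
}

func (h *hierarchical) Seek(target []byte) bool {
    for _, s := range h.sources {
        s.Seek(target)
    }
    h.selectSmallest()
    return h.Valid()
}

// Next remembers the current key, moves every source past it, then reselects.
func (h *hierarchical) Next() bool {
    if !h.Valid() {
        return false
    }
    current := append([]byte(nil), h.Key()...)
    for _, s := range h.sources {
        for s.Valid() && bytes.Compare(s.Key(), current) <= 0 {
            s.Next()
        }
    }
    h.selectSmallest()
    return h.Valid()
}

func (h *hierarchical) Valid() bool       { return h.cur >= 0 }
func (h *hierarchical) Key() []byte       { return h.sources[h.cur].Key() }
func (h *hierarchical) Value() []byte     { return h.sources[h.cur].Value() }
func (h *hierarchical) IsTombstone() bool { return h.Value() == nil }

func main() {
    mem := &sliceSource{keys: []string{"a", "b"}, vals: [][]byte{[]byte("new"), nil}}
    sst := &sliceSource{keys: []string{"a", "b", "c"}, vals: [][]byte{[]byte("old"), []byte("2"), []byte("3")}}
    h := &hierarchical{sources: []source{mem, sst}, cur: -1}
    for h.SeekToFirst(); h.Valid(); h.Next() {
        fmt.Printf("%s tombstone=%v\n", h.Key(), h.IsTombstone())
    }
}
```

Because `Next` moves every source past the current key, no explicit set of seen keys is needed in this sketch; an implementation that advances only one source at a time would have to track duplicates separately.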
Tombstone Handling
Tombstones (deletion markers) are handled specially:
- Detection:
  - Identified by nil values in most iterators
  - Allows distinguishing between deleted keys and non-existent keys
- Impact on Iteration:
  - Tombstones are visible during direct iteration
  - During merging, tombstones from newer sources hide older values
  - This mechanism enables proper deletion semantics in the LSM tree
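The difference between a deleted key and a key that never existed is easiest to see in a point lookup. The sketch below models each source as a Go map, which is purely an illustration device: `layer`, `lookupResult`, and `lookup` are hypothetical names, and the nil-value convention follows the description above.

```go
package main

import "fmt"

// layer is a hypothetical stand-in for one storage source. A key that is
// present with a nil value represents a tombstone.
type layer map[string][]byte

// lookupResult distinguishes the three outcomes of a read.
type lookupResult int

const (
    notFound lookupResult = iota // no layer mentions the key
    deleted                      // the newest mention is a tombstone
    found                        // the newest mention carries a value
)

// resultNames gives each lookupResult a printable label.
var resultNames = [...]string{"not found", "deleted", "found"}

// lookup scans layers from newest to oldest and stops at the first layer that
// mentions the key, so a tombstone in a newer layer hides older values.
func lookup(layers []layer, key string) ([]byte, lookupResult) {
    for _, l := range layers {
        v, ok := l[key]
        if !ok {
            continue // this layer knows nothing about the key
        }
        if v == nil {
            return nil, deleted
        }
        return v, found
    }
    return nil, notFound
}

func main() {
    memTable := layer{"user:1": nil, "user:2": []byte("bob")}
    ssTable := layer{"user:1": []byte("alice"), "user:3": []byte("carol")}
    layers := []layer{memTable, ssTable} // newest first

    for _, k := range []string{"user:1", "user:2", "user:3", "user:4"} {
        v, res := lookup(layers, k)
        fmt.Printf("%s: %s %q\n", k, resultNames[res], v)
    }
}
```

Without the tombstone in the newer layer, the lookup for `user:1` would fall through to the older layer and return a value that was supposed to be deleted.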
Common Usage Patterns
Basic Iterator Usage
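// Use any Iterator implementation (this snippet needs the "bytes" and "fmt" imports)
iter := someSource.NewIterator()

// Iterate through all entries
for iter.SeekToFirst(); iter.Valid(); iter.Next() {
    fmt.Printf("Key: %s, Value: %s\n", iter.Key(), iter.Value())
}

// Or seek to a specific key. Seek positions at the first key >= target,
// so compare keys to confirm an exact match.
target := []byte("target")
if iter.Seek(target) && bytes.Equal(iter.Key(), target) {
    fmt.Printf("Found: %s\n", iter.Value())
}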
Bounded Range Iterator
// Create a bounded iterator
startKey := []byte("user:1000")
endKey := []byte("user:2000")
rangeIter := bounded.NewBoundedIterator(sourceIter, startKey, endKey)

// Iterate through the bounded range
for rangeIter.SeekToFirst(); rangeIter.Valid(); rangeIter.Next() {
    fmt.Printf("Key: %s\n", rangeIter.Key())
}
Hierarchical Multi-Source Iterator
// Create iterators for each source (newest to oldest)
memTableIter := memTable.NewIterator()
sstableIter1 := sstable1.NewIterator()
sstableIter2 := sstable2.NewIterator()

// Combine them into a hierarchical view
sources := []iterator.Iterator{memTableIter, sstableIter1, sstableIter2}
hierarchicalIter := composite.NewHierarchicalIterator(sources)

// Use the combined view
for hierarchicalIter.SeekToFirst(); hierarchicalIter.Valid(); hierarchicalIter.Next() {
    if !hierarchicalIter.IsTombstone() {
        fmt.Printf("%s: %s\n", hierarchicalIter.Key(), hierarchicalIter.Value())
    }
}
Performance Considerations
Time Complexity
Iterator operations have the following complexity characteristics:
- SeekToFirst/SeekToLast:
  - O(S) where S is the number of sources
  - Each source may have its own seek complexity
- Seek(target):
  - O(S * log N) where N is the typical size of each source
  - Binary search within each source, then selection across sources
- Next():
  - Amortized O(S) for typical cases
  - May require advancing multiple sources past duplicates
- Key()/Value()/Valid():
  - O(1) - constant time for accessing current state
Memory Management
Iterator implementations focus on memory efficiency:
- Lazy Evaluation:
  - Values are fetched only when needed
  - No materialization of full result sets
- Buffer Reuse:
  - Key/value buffers are reused where possible
  - Careful copying when needed for correctness
- Source Independence:
  - Each source manages its own memory
  - Composite iterators add minimal overhead
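Buffer reuse has one practical consequence for callers: a slice returned by Key or Value may be overwritten when the iterator advances. The sketch below shows the hazard and the copy that avoids it. `reusingIterator` and `collectKeys` are hypothetical, and whether a particular Kevo iterator reuses its buffers is not stated here, so copying before retaining a slice is the cautious default rather than a documented requirement.

```go
package main

import "fmt"

// reusingIterator is a hypothetical iterator that decodes each key into one
// shared buffer, so the slice returned by Key is overwritten by later calls.
type reusingIterator struct {
    keys []string
    pos  int
    buf  []byte
}

func (it *reusingIterator) Valid() bool { return it.pos < len(it.keys) }
func (it *reusingIterator) Next()       { it.pos++ }

// Key refills the shared buffer and returns it without a fresh allocation
// once the buffer is large enough.
func (it *reusingIterator) Key() []byte {
    it.buf = append(it.buf[:0], it.keys[it.pos]...)
    return it.buf
}

// collectKeys retains every key, so it copies each one out of the shared buffer.
func collectKeys(it *reusingIterator) [][]byte {
    var out [][]byte
    for ; it.Valid(); it.Next() {
        out = append(out, append([]byte(nil), it.Key()...))
    }
    return out
}

func main() {
    // Hazard: holding on to the returned slice across Next.
    it := &reusingIterator{keys: []string{"k1", "k2"}}
    alias := it.Key()
    it.Next()
    it.Key() // refills the shared buffer that alias still points at
    fmt.Printf("aliased slice now reads %q\n", alias)

    // Safe: copy before retaining.
    for _, k := range collectKeys(&reusingIterator{keys: []string{"k1", "k2"}}) {
        fmt.Printf("copied key %q\n", k)
    }
}
```

The copy costs one allocation per retained key, which is why iterators tend to leave that decision to the caller instead of copying on every access.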
Optimizations
Several optimizations improve iterator performance:
- Key Skipping:
  - Skip sources that can't contain the target key
  - Early termination when possible
- Caching:
  - Cache recently accessed values
  - Avoid redundant lookups
- Batched Advancement:
  - Advance multiple levels at once when possible
  - Reduces overall iteration cost
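One common way to implement key skipping is to keep the smallest and largest key of each source and consult that range before touching the source. The sketch below assumes such metadata is available; `tableMeta`, `mayContain`, `canSkipSeek`, and `candidates` are hypothetical names, and the document does not say how Kevo decides which sources to skip.

```go
package main

import (
    "bytes"
    "fmt"
)

// tableMeta is a hypothetical per-source summary holding the smallest and
// largest key stored in that source.
type tableMeta struct {
    name           string
    minKey, maxKey []byte
}

// mayContain reports whether key falls inside the source's key range.
// A true result is only a hint; the source still has to be searched.
func (t tableMeta) mayContain(key []byte) bool {
    return bytes.Compare(key, t.minKey) >= 0 && bytes.Compare(key, t.maxKey) <= 0
}

// canSkipSeek reports whether Seek(target) can ignore the source because
// every key in it is smaller than target.
func (t tableMeta) canSkipSeek(target []byte) bool {
    return bytes.Compare(t.maxKey, target) < 0
}

// candidates returns the sources whose range covers key, preserving the
// newest-to-oldest order of the input.
func candidates(tables []tableMeta, key []byte) []tableMeta {
    var out []tableMeta
    for _, t := range tables {
        if t.mayContain(key) {
            out = append(out, t)
        }
    }
    return out
}

func main() {
    tables := []tableMeta{
        {"memtable", []byte("user:1500"), []byte("user:1900")},
        {"sstable-1", []byte("user:1000"), []byte("user:1200")},
        {"sstable-2", []byte("user:0100"), []byte("user:2000")},
    }
    for _, t := range candidates(tables, []byte("user:1100")) {
        fmt.Println("search", t.name)
    }
}
```

For the key `user:1100`, only `sstable-1` and `sstable-2` need to be searched; the memtable's range starts above the key and is skipped without a lookup.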
Design Principles
Interface Consistency
The iterator design follows several key principles:
- Uniform Interface:
  - All iterators share the same interface
  - Allows seamless substitution and composition
- Explicit State:
  - Iterator state is always explicit
  - Valid() must be checked before accessing data
- Unidirectional Design:
  - Forward-only iteration for simplicity: the interface offers Next but no method for stepping backward, and SeekToLast only positions the iterator
  - Backward iteration would add complexity with little benefit
Composability
The iterators are designed for composition:
- Adapter Pattern:
  - Wrap existing iterators to add functionality
  - Build complex behaviors from simple components
- Delegation:
  - Delegate operations to underlying iterators
  - Apply transformations or filtering as needed
- Transparency:
  - Composite iterators behave like simple iterators
  - Internal complexity is hidden from users
Integration with Storage Layers
The iterator system integrates with all storage layers:
- MemTable Integration:
  - SkipList-based iterators for in-memory data
  - Priority for recent changes
- SSTable Integration:
  - Block-based iterators for persistent data
  - Efficient seeking through index blocks
- Transaction Integration:
  - Combines buffer and engine state
  - Preserves transaction isolation
- Engine Integration:
  - Provides unified view across all components
  - Handles version selection and visibility