# Go Storage Engine Todo List

This document outlines the implementation tasks for the Go Storage Engine, organized by development phases. Follow these guidelines:

- Work on tasks in the order they appear
- Check off exactly one item (x) before moving to the next unchecked item
- Each phase must be completed before starting the next phase
- Test thoroughly before marking an item complete

## Phase A: Foundation

- [x] Set up project structure and Go module
  - [x] Create directory structure following the package layout in PLAN.md
  - [x] Initialize Go module and dependencies
  - [x] Set up testing framework

- [x] Implement config package
  - [x] Define configuration struct with serialization/deserialization
  - [x] Include configurable parameters for durability, compaction, memory usage
  - [x] Create manifest loading/saving functionality
  - [x] Add versioning support for config changes

- [x] Build Write-Ahead Log (WAL)
  - [x] Implement append-only file with atomic operations
  - [x] Add Put/Delete operation encoding
  - [x] Create replay functionality with error recovery
  - [x] Implement both synchronous (default) and batched fsync modes
  - [x] Add checksumming for entries

- [x] Write WAL tests
  - [x] Test durability with simulated crashes
  - [x] Verify replay correctness
  - [x] Benchmark write performance with different sync options
  - [x] Test error handling and recovery

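The WAL items in Phase A (Put/Delete encoding plus per-entry checksums) can be illustrated with a minimal sketch. This checklist does not name the WAL checksum algorithm or the record layout, so the CRC32 choice, the field order, and every identifier below (`encodeRecord`, `decodeRecord`, `opPut`, `opDelete`, `errChecksum`, `errMalformed`) are assumptions made for illustration, not the engine's actual format.

```go
package main

import (
	"encoding/binary"
	"errors"
	"hash/crc32"
)

// Operation tags for WAL entries (hypothetical values).
const (
	opPut    byte = 1
	opDelete byte = 2
)

var (
	// errChecksum reports a record whose stored CRC does not match its payload.
	errChecksum = errors.New("wal: checksum mismatch")
	// errMalformed reports a record whose length fields disagree with its size.
	errMalformed = errors.New("wal: malformed record")
)

// headerSize is crc32 (4) + op (1) + keyLen (4) + valLen (4).
const headerSize = 13

// encodeRecord lays out one entry as:
//
//	crc32 | op | keyLen | valLen | key | value
//
// The CRC covers every byte after the CRC field itself.
func encodeRecord(op byte, key, value []byte) []byte {
	buf := make([]byte, headerSize+len(key)+len(value))
	buf[4] = op
	binary.LittleEndian.PutUint32(buf[5:9], uint32(len(key)))
	binary.LittleEndian.PutUint32(buf[9:13], uint32(len(value)))
	copy(buf[headerSize:], key)
	copy(buf[headerSize+len(key):], value)
	binary.LittleEndian.PutUint32(buf[0:4], crc32.ChecksumIEEE(buf[4:]))
	return buf
}

// decodeRecord validates and parses a record produced by encodeRecord.
// Replay code would typically stop at the first record that fails here.
func decodeRecord(buf []byte) (op byte, key, value []byte, err error) {
	if len(buf) < headerSize {
		return 0, nil, nil, errMalformed
	}
	keyLen := int(binary.LittleEndian.Uint32(buf[5:9]))
	valLen := int(binary.LittleEndian.Uint32(buf[9:13]))
	if len(buf) != headerSize+keyLen+valLen {
		return 0, nil, nil, errMalformed
	}
	if crc32.ChecksumIEEE(buf[4:]) != binary.LittleEndian.Uint32(buf[0:4]) {
		return 0, nil, nil, errChecksum
	}
	return buf[4], buf[headerSize : headerSize+keyLen], buf[headerSize+keyLen:], nil
}
```

Putting the checksum first lets a reader validate the whole record before trusting any field in it; a torn write at the tail of the log then surfaces as `errMalformed` or `errChecksum` rather than as garbage data.
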
## Phase B: In-Memory Layer

- [x] Implement MemTable
  - [x] Create skip list data structure aligned to 64-byte cache lines
  - [x] Add key/value insertion and lookup operations
  - [x] Implement sorted key iteration
  - [x] Add size tracking for flush threshold detection

- [x] Connect WAL replay to MemTable
  - [x] Create recovery logic to rebuild MemTable from WAL
  - [x] Implement consistent state reconstruction during recovery
  - [x] Handle errors during replay with appropriate fallbacks

- [x] Test concurrent read/write scenarios
  - [x] Verify reader isolation during writes
  - [x] Test consistency guarantees with concurrent operations
  - [x] Benchmark read/write performance under load

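The MemTable items in Phase B (skip list, insertion and lookup, sorted iteration, size tracking) can be sketched roughly as follows. This is a simplified single-writer skip list: it makes no attempt at 64-byte cache-line alignment or concurrent access, and all names (`skipList`, `node`, `maxHeight`, `put`, `get`, `keys`) are hypothetical rather than taken from the engine.

```go
package main

import (
	"bytes"
	"math/rand"
)

// maxHeight caps the number of levels (an arbitrary choice for this sketch).
const maxHeight = 12

type node struct {
	key, value []byte
	next       []*node // next[i] is the successor at level i
}

// skipList is a sorted map from key to value. Not safe for concurrent use.
type skipList struct {
	head   *node
	height int
	size   int // bytes of keys and values stored, for flush threshold checks
}

func newSkipList() *skipList {
	return &skipList{head: &node{next: make([]*node, maxHeight)}, height: 1}
}

// randomHeight grows a tower by one level with probability 1/4 per step.
func randomHeight() int {
	h := 1
	for h < maxHeight && rand.Intn(4) == 0 {
		h++
	}
	return h
}

// put inserts key or overwrites its existing value.
func (s *skipList) put(key, value []byte) {
	var prev [maxHeight]*node
	x := s.head
	for i := s.height - 1; i >= 0; i-- {
		for x.next[i] != nil && bytes.Compare(x.next[i].key, key) < 0 {
			x = x.next[i]
		}
		prev[i] = x // last node before key at level i
	}
	if n := x.next[0]; n != nil && bytes.Equal(n.key, key) {
		s.size += len(value) - len(n.value)
		n.value = value
		return
	}
	h := randomHeight()
	for i := s.height; i < h; i++ {
		prev[i] = s.head // new levels start at the head
	}
	if h > s.height {
		s.height = h
	}
	n := &node{key: key, value: value, next: make([]*node, h)}
	for i := 0; i < h; i++ {
		n.next[i] = prev[i].next[i]
		prev[i].next[i] = n
	}
	s.size += len(key) + len(value)
}

// get returns the value stored for key, if any.
func (s *skipList) get(key []byte) ([]byte, bool) {
	x := s.head
	for i := s.height - 1; i >= 0; i-- {
		for x.next[i] != nil && bytes.Compare(x.next[i].key, key) < 0 {
			x = x.next[i]
		}
	}
	if n := x.next[0]; n != nil && bytes.Equal(n.key, key) {
		return n.value, true
	}
	return nil, false
}

// keys walks level 0, which links every node in sorted order.
func (s *skipList) keys() []string {
	var out []string
	for n := s.head.next[0]; n != nil; n = n.next[0] {
		out = append(out, string(n.key))
	}
	return out
}
```

The `size` field is what a flush trigger would compare against a configured threshold; it counts only key and value bytes, so the real memory footprint (node and slice overhead) is higher.
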
## Phase C: Persistent Storage

- [x] Design SSTable format
  - [x] Define 16KB block structure with restart points
  - [x] Create checksumming for blocks (xxHash64)
  - [x] Define index structure with entries every ~64KB
  - [x] Design file footer with metadata (version, timestamp, key count, etc.)

- [x] Implement SSTable writer
  - [x] Add functionality to convert MemTable to blocks
  - [x] Create sparse index generator
  - [x] Implement footer writing with checksums
  - [x] Add atomic file creation for crash safety

- [x] Build SSTable reader
  - [x] Implement block loading with validation
  - [x] Create binary search through index
  - [x] Develop iterator interface for scanning
  - [x] Add error handling for corrupted files

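The sparse index and "binary search through index" items in Phase C can be sketched with the standard library's `sort.Search`. The sketch assumes each index entry stores the first key found at a file offset; the checklist does not say which key an entry records, so that convention and the names `indexEntry` and `findEntry` are hypothetical. Block checksums (xxHash64 is not in the Go standard library) are left out.

```go
package main

import (
	"bytes"
	"sort"
)

// indexEntry marks a position in the SSTable file (hypothetical layout).
type indexEntry struct {
	firstKey []byte // smallest key stored at or after offset
	offset   uint64 // byte offset within the SSTable file
}

// findEntry returns the position of the last index entry whose firstKey is
// less than or equal to key. A reader would start scanning at that entry's
// offset. It returns -1 when key sorts before every entry, meaning the key
// cannot be in this file.
func findEntry(index []indexEntry, key []byte) int {
	// sort.Search finds the first entry whose firstKey is strictly greater
	// than key; the entry just before it is the one that may hold key.
	i := sort.Search(len(index), func(i int) bool {
		return bytes.Compare(index[i].firstKey, key) > 0
	})
	return i - 1
}
```

Because the index is sparse, a hit here only narrows the search to a region of the file; the reader still has to load and scan the data from `offset` to confirm whether the key exists.
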
## Phase D: Basic Engine Integration

- [x] Implement Level 0 flush mechanism
  - [x] Create MemTable to SSTable conversion process
  - [x] Implement file management and naming scheme
  - [x] Add background flush triggering based on size

- [x] Create read path that merges data sources
  - [x] Implement read from current MemTable
  - [x] Add reads from immutable MemTables awaiting flush
  - [x] Create mechanism to read from Level 0 SSTable files
  - [x] Build priority-based lookup across all sources
  - [x] Implement unified iterator interface for all data sources

- [x] Refactoring (to be done after completing Phase D)
  - [x] Create a common iterator interface in the iterator package
  - [x] Rename component-specific iterators (BlockIterator, MemTableIterator, etc.)
  - [x] Update all iterators to implement the common interface directly

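The read-path items in Phase D (current MemTable, immutable MemTables, Level 0 files, priority-based lookup) amount to asking sources from newest to oldest and stopping at the first one that knows the key. A minimal sketch, assuming a hypothetical `source` interface and a map-backed stand-in (`mapSource`) in place of the real MemTable and SSTable readers:

```go
package main

// source is anything that can answer a point lookup (hypothetical interface).
// found=false means the source holds nothing for key; deleted=true means it
// holds a deletion marker for key.
type source interface {
	lookup(key string) (value []byte, deleted, found bool)
}

// mapSource stands in for a MemTable or SSTable reader in this sketch.
// A nil value represents a deletion marker.
type mapSource map[string][]byte

func (m mapSource) lookup(key string) ([]byte, bool, bool) {
	v, ok := m[key]
	if !ok {
		return nil, false, false
	}
	return v, v == nil, true
}

// get consults sources in priority order (newest first). The first source
// that has an entry for key decides the result, so a newer deletion hides
// any older value further down the list.
func get(sources []source, key string) ([]byte, bool) {
	for _, s := range sources {
		v, deleted, found := s.lookup(key)
		if !found {
			continue // this source knows nothing; try the next older one
		}
		if deleted {
			return nil, false
		}
		return v, true
	}
	return nil, false
}
```

The ordering of the slice is the whole policy: passing the active MemTable first, then immutable MemTables, then Level 0 files from newest to oldest gives the expected "latest write wins" behaviour.
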
## Phase E: Compaction

- [x] Implement tiered compaction strategy
  - [x] Create file selection algorithm based on overlap/size
  - [x] Implement merge-sorted reading from input files
  - [x] Add atomic output file generation
  - [x] Create size ratio and file count based triggering

- [x] Handle tombstones and key deletion
  - [x] Implement tombstone markers
  - [x] Create logic for tombstone garbage collection
  - [x] Test deletion correctness across compactions

- [x] Manage file obsolescence and cleanup
  - [x] Implement safe file deletion after compaction
  - [x] Create consistent file tracking
  - [x] Add error handling for cleanup failures

- [x] Build background compaction
  - [x] Implement worker pool for compaction tasks
  - [x] Add rate limiting to prevent I/O saturation
  - [x] Create metrics for monitoring compaction progress
  - [x] Implement priority scheduling for urgent compactions

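The merge-sorted reading and tombstone garbage collection items in Phase E can be illustrated with a two-way merge. This is a sketch under simplifying assumptions: only two inputs, in-memory slices instead of SSTable iterators, and a caller-supplied `dropTombstones` flag instead of real level tracking. The `entry` type and `mergeRuns` function are hypothetical names.

```go
package main

// entry is one record in a key-sorted run (hypothetical type).
type entry struct {
	key       string
	value     string
	tombstone bool // true marks a deletion
}

// mergeRuns merges two key-sorted runs into one. When both runs contain the
// same key, the entry from newer wins and the older one is discarded.
//
// Tombstones are kept unless dropTombstones is set. Dropping them is only
// safe when no older file outside this compaction can still hold the key,
// otherwise the deleted value would reappear.
func mergeRuns(newer, older []entry, dropTombstones bool) []entry {
	var out []entry
	emit := func(e entry) {
		if e.tombstone && dropTombstones {
			return
		}
		out = append(out, e)
	}
	i, j := 0, 0
	for i < len(newer) && j < len(older) {
		switch {
		case newer[i].key < older[j].key:
			emit(newer[i])
			i++
		case newer[i].key > older[j].key:
			emit(older[j])
			j++
		default: // same key: keep the newer entry, skip the older one
			emit(newer[i])
			i++
			j++
		}
	}
	for ; i < len(newer); i++ {
		emit(newer[i])
	}
	for ; j < len(older); j++ {
		emit(older[j])
	}
	return out
}
```

A compaction over more than two files generalises this with a heap of iterators, but the shadowing and tombstone rules stay the same.
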
## Phase F: Basic Atomicity and Features

- [x] Implement merged iterator across all levels
  - [x] Create priority merging iterator
  - [x] Add efficient seeking capabilities
  - [x] Implement proper cleanup for resources

- [x] Implement SQLite-inspired reader-writer concurrency
  - [x] Add reader-writer lock for basic isolation
  - [x] Implement WAL-based reads during active write transactions
  - [x] Design clean API for transaction handling
  - [x] Test concurrent read/write operations

- [x] Implement atomic batch operations
  - [x] Create batch data structure for multiple operations
  - [x] Implement atomic batch commit to WAL
  - [x] Add crash recovery for batches
  - [x] Design extensible interfaces for future transaction support

- [ ] Add basic statistics and metrics
  - [ ] Implement counters for operations
  - [ ] Add timing measurements for critical paths
  - [ ] Create exportable metrics interface
  - [ ] Test accuracy of metrics

## Phase G: Optimization and Benchmarking

- [ ] Develop benchmark suite
  - [ ] Create random/sequential write benchmarks
  - [ ] Implement point read and range scan benchmarks
  - [ ] Add compaction overhead measurements
  - [ ] Build reproducible benchmark harness

- [ ] Optimize critical paths
  - [ ] Profile and identify bottlenecks
  - [ ] Optimize memory usage patterns
  - [ ] Improve cache efficiency in hot paths
  - [ ] Reduce GC pressure for large operations

- [ ] Tune default configuration
  - [ ] Benchmark with different parameters
  - [ ] Determine optimal defaults for general use cases
  - [ ] Document configuration recommendations

## Phase H: Optional Enhancements

- [ ] Add Bloom filters
  - [ ] Implement configurable Bloom filter
  - [ ] Add to SSTable format
  - [ ] Create adaptive sizing based on false positive rates
  - [ ] Benchmark improvement in read performance

- [ ] Create monitoring hooks
  - [ ] Add detailed internal event tracking
  - [ ] Implement exportable metrics
  - [ ] Create health check mechanisms
  - [ ] Add performance alerts

- [ ] Add crash recovery testing
  - [ ] Build fault injection framework
  - [ ] Create randomized crash scenarios
  - [ ] Implement validation for post-recovery state
  - [ ] Test edge cases in recovery

## API Implementation

- [ ] Implement Engine interface
  - [ ] `Put(ctx context.Context, key, value []byte, opts ...WriteOption) error`
  - [ ] `Get(ctx context.Context, key []byte, opts ...ReadOption) ([]byte, error)`
  - [ ] `Delete(ctx context.Context, key []byte, opts ...WriteOption) error`
  - [ ] `Batch(ctx context.Context, ops []Operation, opts ...WriteOption) error`
  - [ ] `NewIterator(opts IteratorOptions) Iterator`
  - [x] `BeginTransaction(readOnly bool) (Transaction, error)`
  - [ ] `Close() error`

- [ ] Implement error types
  - [ ] `ErrIO` - I/O errors with recovery procedures
  - [ ] `ErrCorruption` - Data integrity issues
  - [ ] `ErrConfig` - Configuration errors
  - [ ] `ErrResource` - Resource exhaustion
  - [ ] `ErrConcurrency` - Race conditions
  - [ ] `ErrNotFound` - Key not found

- [ ] Create comprehensive documentation
  - [ ] API usage examples
  - [ ] Configuration guidelines
  - [ ] Performance characteristics
  - [ ] Error handling recommendations
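
One way the `Put`/`Get`/`Delete` signatures and the `ErrNotFound`/`ErrCorruption` error types listed above could fit together is sketched below. It is a toy, not the planned implementation: the option parameters (`WriteOption`, `ReadOption`) are omitted, the store is a plain map, and `kvStore`, `memEngine`, `newMemEngine`, `describe`, and `demo` are hypothetical names. Whether the real error types are sentinel values or structured types is not stated in this list; sentinels are assumed here.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Sentinel errors named after the checklist; callers match with errors.Is.
var (
	ErrNotFound   = errors.New("key not found")
	ErrCorruption = errors.New("data corruption")
)

// kvStore is a trimmed version of the Engine method set (options omitted).
type kvStore interface {
	Put(ctx context.Context, key, value []byte) error
	Get(ctx context.Context, key []byte) ([]byte, error)
	Delete(ctx context.Context, key []byte) error
}

// memEngine is a map-backed stand-in. Not safe for concurrent use.
type memEngine struct {
	data map[string][]byte
}

// Compile-time check that memEngine satisfies kvStore.
var _ kvStore = (*memEngine)(nil)

func newMemEngine() *memEngine {
	return &memEngine{data: make(map[string][]byte)}
}

func (e *memEngine) Put(ctx context.Context, key, value []byte) error {
	if err := ctx.Err(); err != nil {
		return err // honour cancellation before doing any work
	}
	e.data[string(key)] = append([]byte(nil), value...) // copy the value
	return nil
}

func (e *memEngine) Get(ctx context.Context, key []byte) ([]byte, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	v, ok := e.data[string(key)]
	if !ok {
		// Wrap the sentinel so callers get context and can still match it.
		return nil, fmt.Errorf("get %q: %w", key, ErrNotFound)
	}
	return v, nil
}

func (e *memEngine) Delete(ctx context.Context, key []byte) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	delete(e.data, string(key))
	return nil
}

// describe shows how a caller might branch on the error types.
func describe(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrNotFound):
		return "not found"
	case errors.Is(err, ErrCorruption):
		return "corrupt"
	default:
		return "other"
	}
}

// demo runs Put, Get, Delete, Get and reports how each Get ended.
func demo() (first, second string) {
	ctx := context.Background()
	e := newMemEngine()
	_ = e.Put(ctx, []byte("k"), []byte("v"))
	_, err := e.Get(ctx, []byte("k"))
	first = describe(err)
	_ = e.Delete(ctx, []byte("k"))
	_, err = e.Get(ctx, []byte("k"))
	second = describe(err)
	return first, second
}
```

Wrapping with `%w` keeps the error message informative while leaving `errors.Is(err, ErrNotFound)` usable, which is the pattern the "Error handling recommendations" documentation item would likely need to describe.
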