# Go Storage Engine Todo List

This document outlines the implementation tasks for the Go Storage Engine, organized by development phases. Follow these guidelines:

- Work on tasks in the order they appear
- Complete and check off exactly one item (✓) before moving to the next unchecked item
- Finish each phase before starting the next
- Test thoroughly before marking an item complete

## Phase A: Foundation
- [✓] Setup project structure and Go module
  - [✓] Create directory structure following the package layout in PLAN.md
  - [✓] Initialize Go module and dependencies
  - [✓] Set up testing framework
- [✓] Implement config package
  - [✓] Define configuration struct with serialization/deserialization
  - [✓] Include configurable parameters for durability, compaction, memory usage
  - [✓] Create manifest loading/saving functionality
  - [✓] Add versioning support for config changes
- [✓] Build Write-Ahead Log (WAL)
  - [✓] Implement append-only file with atomic operations
  - [✓] Add Put/Delete operation encoding
  - [✓] Create replay functionality with error recovery
  - [✓] Implement both synchronous (default) and batched fsync modes
  - [✓] Add checksumming for entries
- [✓] Write WAL tests
  - [✓] Test durability with simulated crashes
  - [✓] Verify replay correctness
  - [✓] Benchmark write performance with different sync options
  - [✓] Test error handling and recovery
## Phase B: In-Memory Layer
- [✓] Implement MemTable
  - [✓] Create skip list data structure aligned to 64-byte cache lines
  - [✓] Add key/value insertion and lookup operations
  - [✓] Implement sorted key iteration
  - [✓] Add size tracking for flush threshold detection
- [✓] Connect WAL replay to MemTable
  - [✓] Create recovery logic to rebuild MemTable from WAL
  - [✓] Implement consistent snapshot reads during recovery
  - [✓] Handle errors during replay with appropriate fallbacks
- [✓] Test concurrent read/write scenarios
  - [✓] Verify reader isolation during writes
  - [✓] Test snapshot consistency guarantees
  - [✓] Benchmark read/write performance under load
## Phase C: Persistent Storage
- [ ] Design SSTable format
  - [ ] Define 16KB block structure with restart points
  - [ ] Create checksumming for blocks (xxHash64)
  - [ ] Define index structure with entries every ~64KB
  - [ ] Design file footer with metadata (version, timestamp, key count, etc.)
- [ ] Implement SSTable writer
  - [ ] Add functionality to convert MemTable to blocks
  - [ ] Create sparse index generator
  - [ ] Implement footer writing with checksums
  - [ ] Add atomic file creation for crash safety
- [ ] Build SSTable reader
  - [ ] Implement block loading with validation
  - [ ] Create binary search through index
  - [ ] Develop iterator interface for scanning
  - [ ] Add error handling for corrupted files
## Phase D: Basic Engine Integration
- [ ] Implement Level 0 flush mechanism
  - [ ] Create MemTable to SSTable conversion process
  - [ ] Implement file management and naming scheme
  - [ ] Add background flush triggering based on size
- [ ] Create read path that merges data sources
  - [ ] Implement read from current MemTable
  - [ ] Add reads from immutable MemTables awaiting flush
  - [ ] Create mechanism to read from Level 0 SSTable files
  - [ ] Build priority-based lookup across all sources
## Phase E: Compaction
- [ ] Implement tiered compaction strategy
  - [ ] Create file selection algorithm based on overlap/size
  - [ ] Implement merge-sorted reading from input files
  - [ ] Add atomic output file generation
  - [ ] Create size ratio and file count based triggering
- [ ] Handle tombstones and key deletion
  - [ ] Implement tombstone markers
  - [ ] Create logic for tombstone garbage collection
  - [ ] Test deletion correctness across compactions
- [ ] Manage file obsolescence and cleanup
  - [ ] Implement safe file deletion after compaction
  - [ ] Create consistent file tracking
  - [ ] Add error handling for cleanup failures
- [ ] Build background compaction
  - [ ] Implement worker pool for compaction tasks
  - [ ] Add rate limiting to prevent I/O saturation
  - [ ] Create metrics for monitoring compaction progress
  - [ ] Implement priority scheduling for urgent compactions
## Phase F: Basic Atomicity and Features
- [ ] Implement merged iterator across all levels
  - [ ] Create priority merging iterator
  - [ ] Add efficient seeking capabilities
  - [ ] Implement proper cleanup for resources
- [ ] Add snapshot capability
  - [ ] Create point-in-time view mechanism
  - [ ] Implement consistent reads across all data sources
  - [ ] Add resource tracking and cleanup
  - [ ] Test isolation guarantees
- [ ] Implement atomic batch operations
  - [ ] Create batch data structure for multiple operations
  - [ ] Implement atomic batch commit to WAL
  - [ ] Add crash recovery for batches
  - [ ] Design extensible interfaces for future transaction support
- [ ] Add basic statistics and metrics
  - [ ] Implement counters for operations
  - [ ] Add timing measurements for critical paths
  - [ ] Create exportable metrics interface
  - [ ] Test accuracy of metrics
## Phase G: Optimization and Benchmarking
- [ ] Develop benchmark suite
  - [ ] Create random/sequential write benchmarks
  - [ ] Implement point read and range scan benchmarks
  - [ ] Add compaction overhead measurements
  - [ ] Build reproducible benchmark harness
- [ ] Optimize critical paths
  - [ ] Profile and identify bottlenecks
  - [ ] Optimize memory usage patterns
  - [ ] Improve cache efficiency in hot paths
  - [ ] Reduce GC pressure for large operations
- [ ] Tune default configuration
  - [ ] Benchmark with different parameters
  - [ ] Determine optimal defaults for general use cases
  - [ ] Document configuration recommendations
## Phase H: Optional Enhancements
- [ ] Add Bloom filters
  - [ ] Implement configurable Bloom filter
  - [ ] Add to SSTable format
  - [ ] Create adaptive sizing based on false positive rates
  - [ ] Benchmark improvement in read performance
- [ ] Create monitoring hooks
  - [ ] Add detailed internal event tracking
  - [ ] Implement exportable metrics
  - [ ] Create health check mechanisms
  - [ ] Add performance alerts
- [ ] Add crash recovery testing
  - [ ] Build fault injection framework
  - [ ] Create randomized crash scenarios
  - [ ] Implement validation for post-recovery state
  - [ ] Test edge cases in recovery
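
A minimal Bloom filter sketch for the first task group, using double hashing (two base hashes combined to derive k probe positions). The sizing values are illustrative; the "adaptive sizing" task would derive bits-per-key and k from the target false positive rate instead:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bloom is a minimal fixed-size filter.
type Bloom struct {
	bits []uint64
	m    uint64 // number of bits
	k    int    // probes per key
}

func NewBloom(m uint64, k int) *Bloom {
	return &Bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes derives two base hashes from one FNV-1a pass; probe i uses
// h1 + i*h2, so k positions cost two hashes instead of k.
func (b *Bloom) hashes(key []byte) (uint64, uint64) {
	h := fnv.New64a()
	h.Write(key)
	h1 := h.Sum64()
	h.Write([]byte{0xFF}) // extend the stream for a second hash
	return h1, h.Sum64() | 1
}

func (b *Bloom) Add(key []byte) {
	h1, h2 := b.hashes(key)
	for i := 0; i < b.k; i++ {
		bit := (h1 + uint64(i)*h2) % b.m
		b.bits[bit/64] |= 1 << (bit % 64)
	}
}

// MayContain never returns false for an added key; a true may be a
// false positive, which only costs one unnecessary SSTable block read.
func (b *Bloom) MayContain(key []byte) bool {
	h1, h2 := b.hashes(key)
	for i := 0; i < b.k; i++ {
		bit := (h1 + uint64(i)*h2) % b.m
		if b.bits[bit/64]&(1<<(bit%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	f := NewBloom(1024, 4)
	f.Add([]byte("present"))
	fmt.Println(f.MayContain([]byte("present"))) // always true
}
```

For the SSTable integration, the filter's bit array would be serialized into its own block and referenced from the footer, so readers can consult it before touching the index.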
## API Implementation
- [ ] Implement Engine interface
  - [ ] `Put(ctx context.Context, key, value []byte, opts ...WriteOption) error`
  - [ ] `Get(ctx context.Context, key []byte, opts ...ReadOption) ([]byte, error)`
  - [ ] `Delete(ctx context.Context, key []byte, opts ...WriteOption) error`
  - [ ] `Batch(ctx context.Context, ops []Operation, opts ...WriteOption) error`
  - [ ] `NewIterator(opts IteratorOptions) Iterator`
  - [ ] `Snapshot() Snapshot`
  - [ ] `Close() error`
- [ ] Implement error types
  - [ ] `ErrIO` - I/O errors with recovery procedures
  - [ ] `ErrCorruption` - Data integrity issues
  - [ ] `ErrConfig` - Configuration errors
  - [ ] `ErrResource` - Resource exhaustion
  - [ ] `ErrConcurrency` - Race conditions
  - [ ] `ErrNotFound` - Key not found
- [ ] Create comprehensive documentation
  - [ ] API usage examples
  - [ ] Configuration guidelines
  - [ ] Performance characteristics
  - [ ] Error handling recommendations