docs: added idea, plan, and todo docs
commit ee23a47a74

IDEA.md (new file, +52)
@@ -0,0 +1,52 @@
# Go Storage: A Minimalist LSM Storage Engine

## Vision

Build a clean, composable, and educational storage engine in Go that follows Log-Structured Merge Tree (LSM) principles, focusing on simplicity while providing the building blocks needed for higher-level database implementations.

## Goals

### 1. Extreme Simplicity

- Create minimal but complete primitives that can support various database paradigms (KV, relational, graph)
- Prioritize readability and educational value over hyper-optimization
- Use idiomatic Go with clear interfaces and documentation
- Implement a single-writer architecture for simplicity and reduced concurrency complexity

### 2. Durability + Performance

- Implement the LSM architecture pattern: Write-Ahead Log → MemTable → SSTables
- Provide configurable durability guarantees (sync vs. batched fsync)
- Optimize for both point lookups and range scans

### 3. Configurability

- Store all configuration parameters in a versioned, persistent manifest
- Allow tuning of memory usage, compaction behavior, and durability settings
- Support reproducible startup states across restarts

### 4. Composable Primitives

- Design clean interfaces for fundamental operations (reads, writes, snapshots, iteration)
- Enable building of higher-level abstractions (SQL, Gremlin, custom query languages)
- Support both transactional and analytical workloads
- Provide simple atomic write primitives that can be built upon:
  - Leverage read snapshots from immutable LSM structure
  - Support basic atomic batch operations
  - Ensure crash recovery through proper WAL handling
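The atomic write primitive described above could surface as a small batch type that higher layers build on. A minimal sketch (type and method names are illustrative, not a committed API):

```go
package main

import "fmt"

// Reader is a point-lookup view; both the live engine and an
// immutable read snapshot could satisfy it (illustrative sketch).
type Reader interface {
	Get(key []byte) ([]byte, error)
}

// Batch collects writes that must become visible atomically; the
// engine would commit the whole batch to the WAL in one record.
type Batch struct {
	ops []op
}

type op struct {
	del        bool
	key, value []byte
}

func (b *Batch) Put(key, value []byte) { b.ops = append(b.ops, op{key: key, value: value}) }
func (b *Batch) Delete(key []byte)     { b.ops = append(b.ops, op{del: true, key: key}) }
func (b *Batch) Len() int              { return len(b.ops) }

func main() {
	var b Batch
	b.Put([]byte("a"), []byte("1"))
	b.Delete([]byte("b"))
	fmt.Println(b.Len()) // 2
}
```

Because the batch is just buffered operations until commit, a transaction layer can later be layered on top without changing this surface.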

## Target Use Cases

1. **Educational Tool**: Learn and teach storage engine internals
2. **Embedded Storage**: Applications needing local, durable storage with predictable performance
3. **Prototype Foundation**: Base layer for experimenting with novel database designs
4. **Go Ecosystem Component**: Reusable storage layer for Go applications and services

## Non-Goals

1. **Feature Parity with Production Engines**: Not trying to compete with RocksDB, LevelDB, etc.
2. **Multi-Node Distribution**: Focusing on single-node operation
3. **Complex Query Planning**: Leaving higher-level query features to layers built on top

## Success Criteria

1. **Correctness**: Data is never lost or corrupted, even during crashes
2. **Understandability**: Code is clear enough to serve as an educational reference
3. **Performance**: Reasonable throughput and latency for common operations
4. **Extensibility**: Can be built upon to create specialized database engines
PLAN.md (new file, +154)
@@ -0,0 +1,154 @@
# Implementation Plan for Go Storage Engine

## Architecture Overview

```
┌─────────────┐     ┌─────────────┐     ┌─────────────────────────┐
│ Client API  │────▶│  MemTable   │────▶│ Immutable SSTable Files │
└─────────────┘     └─────────────┘     └─────────────────────────┘
       │                   ▲                         ▲
       │                   │                         │
       ▼                   │                         │
┌─────────────┐            │            ┌─────────────────────────┐
│   Write-    │────────────┘            │  Background Compaction  │
│  Ahead Log  │                         │         Process         │
└─────────────┘                         └─────────────────────────┘
       │                                             │
       │                                             │
       ▼                                             ▼
┌─────────────────────────────────────────────────────────────────┐
│                       Persistent Storage                        │
└─────────────────────────────────────────────────────────────────┘
```

## Package Structure

```
go-storage/
├── cmd/
│   └── storage-bench/    # Benchmarking tool
│
├── pkg/
│   ├── config/           # Configuration and manifest
│   ├── wal/              # Write-ahead logging with transaction markers
│   ├── memtable/         # In-memory table implementation
│   ├── sstable/          # SSTable read/write
│   │   ├── block/        # Block format implementation
│   │   └── footer/       # File footer and metadata
│   ├── compaction/       # Compaction strategies
│   ├── iterator/         # Merged iterator implementation
│   ├── transaction/      # Transaction management with Snapshot + WAL
│   │   ├── snapshot/     # Read snapshot implementation
│   │   └── txbuffer/     # Transaction write buffer
│   └── engine/           # Main engine implementation with single-writer architecture
│
└── internal/
    ├── checksum/         # Checksum utilities (xxHash64)
    └── utils/            # Shared internal utilities
```

## Development Phases

### Phase A: Foundation (1-2 weeks)

1. Set up project structure and Go module
2. Implement config package with serialization/deserialization
3. Build basic WAL with:
   - Append operations (Put/Delete)
   - Replay functionality
   - Configurable fsync modes
4. Write comprehensive tests for WAL durability
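One way to frame WAL records for step 3 is a checksum followed by an operation type and length-prefixed key/value. The layout below is a sketch, not a committed format, and it uses the stdlib CRC-32 as a placeholder checksum (length validation beyond the minimum is elided):

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

const (
	opPut    byte = 1
	opDelete byte = 2
)

// encodeRecord frames one WAL entry as:
//   crc32(4) | op(1) | keyLen(4) | key | valLen(4) | val
// The checksum covers everything after the checksum field itself.
func encodeRecord(op byte, key, val []byte) []byte {
	var body bytes.Buffer
	body.WriteByte(op)
	binary.Write(&body, binary.LittleEndian, uint32(len(key)))
	body.Write(key)
	binary.Write(&body, binary.LittleEndian, uint32(len(val)))
	body.Write(val)

	out := make([]byte, 4, 4+body.Len())
	binary.LittleEndian.PutUint32(out, crc32.ChecksumIEEE(body.Bytes()))
	return append(out, body.Bytes()...)
}

// decodeRecord verifies the checksum and unpacks one record; replay
// would stop (or truncate) at the first record that fails this check.
func decodeRecord(rec []byte) (op byte, key, val []byte, err error) {
	if len(rec) < 13 {
		return 0, nil, nil, fmt.Errorf("wal: record too short")
	}
	want := binary.LittleEndian.Uint32(rec[:4])
	if crc32.ChecksumIEEE(rec[4:]) != want {
		return 0, nil, nil, fmt.Errorf("wal: checksum mismatch")
	}
	op = rec[4]
	klen := binary.LittleEndian.Uint32(rec[5:9])
	key = rec[9 : 9+klen]
	vlen := binary.LittleEndian.Uint32(rec[9+klen : 13+klen])
	val = rec[13+klen : 13+klen+vlen]
	return op, key, val, nil
}

func main() {
	rec := encodeRecord(opPut, []byte("k1"), []byte("v1"))
	op, key, val, err := decodeRecord(rec)
	fmt.Println(op, string(key), string(val), err)
}
```

The fsync modes then become a policy choice layered on top: call `File.Sync` after every append (synchronous mode) or after every N records or T milliseconds (batched mode).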
### Phase B: In-Memory Layer (1 week)

1. Implement MemTable with:
   - Skip list data structure
   - Sorted key iteration
   - Size tracking for flush threshold
2. Connect WAL replay to MemTable restore
3. Test concurrent read/write scenarios
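The MemTable's interface surface (insert, lookup, sorted iteration, size tracking) can be pinned down before the skip list exists. The sketch below is a deliberately simple sorted-slice stand-in for the planned skip list, with the same surface; the threshold value is illustrative:

```go
package main

import (
	"fmt"
	"sort"
)

// memTable is a stand-in for the planned skip list: a sorted slice
// offering the same operations (insert, lookup, in-order data,
// approximate size tracking for flush triggering).
type memTable struct {
	keys, vals [][]byte
	size       int // approximate bytes used
}

func (m *memTable) Put(key, val []byte) {
	i := sort.Search(len(m.keys), func(i int) bool { return string(m.keys[i]) >= string(key) })
	if i < len(m.keys) && string(m.keys[i]) == string(key) {
		m.size += len(val) - len(m.vals[i]) // overwrite in place
		m.vals[i] = val
		return
	}
	m.keys = append(m.keys, nil)
	m.vals = append(m.vals, nil)
	copy(m.keys[i+1:], m.keys[i:])
	copy(m.vals[i+1:], m.vals[i:])
	m.keys[i], m.vals[i] = key, val
	m.size += len(key) + len(val)
}

func (m *memTable) Get(key []byte) ([]byte, bool) {
	i := sort.Search(len(m.keys), func(i int) bool { return string(m.keys[i]) >= string(key) })
	if i < len(m.keys) && string(m.keys[i]) == string(key) {
		return m.vals[i], true
	}
	return nil, false
}

// ShouldFlush reports whether the table has grown past the
// configured flush threshold.
func (m *memTable) ShouldFlush(threshold int) bool { return m.size >= threshold }

func main() {
	var m memTable
	m.Put([]byte("b"), []byte("2"))
	m.Put([]byte("a"), []byte("1"))
	v, ok := m.Get([]byte("a"))
	fmt.Println(string(v), ok) // 1 true
}
```

Swapping the sorted slice for the skip list later changes only the internals; WAL replay can target this same interface in step 2.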
### Phase C: Persistent Storage (2 weeks)

1. Design and implement SSTable format:
   - Block-based layout with restart points
   - Checksummed blocks
   - Index and metadata in footer
2. Build SSTable writer:
   - Convert MemTable to blocks
   - Generate sparse index
   - Write footer with checksums
3. Implement SSTable reader:
   - Block loading and validation
   - Binary search through index
   - Iterator interface
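The footer is the anchor of the whole format: the reader loads it first (from the end of the file) to find the index. A fixed-size footer sketch follows; the field set, widths, and magic number are placeholders, and the stdlib CRC-64 stands in for the planned xxHash64:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc64"
)

// footer is an illustrative fixed-size SSTable footer.
type footer struct {
	IndexOffset uint64 // file offset of the sparse index block
	IndexLen    uint64
	KeyCount    uint64
	Magic       uint64
}

const footerMagic = 0x676F2D73746F7267 // placeholder magic number

var crcTable = crc64.MakeTable(crc64.ISO)

// marshal lays the footer out as four little-endian u64s plus a CRC.
func (f footer) marshal() []byte {
	buf := make([]byte, 40)
	binary.LittleEndian.PutUint64(buf[0:], f.IndexOffset)
	binary.LittleEndian.PutUint64(buf[8:], f.IndexLen)
	binary.LittleEndian.PutUint64(buf[16:], f.KeyCount)
	binary.LittleEndian.PutUint64(buf[24:], f.Magic)
	binary.LittleEndian.PutUint64(buf[32:], crc64.Checksum(buf[:32], crcTable))
	return buf
}

// parseFooter validates the checksum and magic before trusting any
// of the offsets, per the "validate at every read" rule.
func parseFooter(buf []byte) (footer, error) {
	if len(buf) != 40 {
		return footer{}, fmt.Errorf("sstable: bad footer size %d", len(buf))
	}
	if crc64.Checksum(buf[:32], crcTable) != binary.LittleEndian.Uint64(buf[32:]) {
		return footer{}, fmt.Errorf("sstable: footer checksum mismatch")
	}
	f := footer{
		IndexOffset: binary.LittleEndian.Uint64(buf[0:]),
		IndexLen:    binary.LittleEndian.Uint64(buf[8:]),
		KeyCount:    binary.LittleEndian.Uint64(buf[16:]),
		Magic:       binary.LittleEndian.Uint64(buf[24:]),
	}
	if f.Magic != footerMagic {
		return footer{}, fmt.Errorf("sstable: bad magic")
	}
	return f, nil
}

func main() {
	f := footer{IndexOffset: 4096, IndexLen: 512, KeyCount: 100, Magic: footerMagic}
	got, err := parseFooter(f.marshal())
	fmt.Println(got.KeyCount, err)
}
```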
### Phase D: Basic Engine Integration (1 week)

1. Implement Level 0 flush mechanism:
   - MemTable to SSTable conversion
   - File management and naming
2. Create read path that merges:
   - Current MemTable
   - Immutable MemTables awaiting flush
   - Level 0 SSTable files
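The merged read path above reduces to one rule: consult sources newest-first and stop at the first hit, treating a tombstone as a definitive "not found". A sketch with map-backed stand-ins for the real sources (the tombstone sentinel is illustrative):

```go
package main

import "fmt"

// source is anything that can answer a point lookup; ok reports
// whether the source holds an entry (possibly a tombstone) for key.
type source interface {
	get(key string) (val string, tombstone, ok bool)
}

const tombstone = "\x00DEL" // illustrative deletion marker

// mapSource is an in-memory stand-in for a MemTable or L0 SSTable.
type mapSource map[string]string

func (m mapSource) get(key string) (string, bool, bool) {
	v, ok := m[key]
	return v, ok && v == tombstone, ok
}

// lookup consults sources newest-first (current MemTable, immutable
// MemTables, then Level 0 files); the first hit wins, and a
// tombstone hides any older version of the key.
func lookup(key string, newestFirst []source) (string, bool) {
	for _, s := range newestFirst {
		if v, del, ok := s.get(key); ok {
			if del {
				return "", false
			}
			return v, true
		}
	}
	return "", false
}

func main() {
	mem := mapSource{"a": tombstone}
	sst := mapSource{"a": "old", "b": "1"}
	v, ok := lookup("b", []source{mem, sst})
	fmt.Println(v, ok) // 1 true
	_, ok = lookup("a", []source{mem, sst})
	fmt.Println(ok) // false: the tombstone hides the older value
}
```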
### Phase E: Compaction (2 weeks)

1. Implement a single, efficient compaction strategy:
   - Simple tiered compaction approach
2. Handle tombstones and key deletion
3. Manage file obsolescence and cleanup
4. Build background compaction scheduling
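The core of step 2 is the merge rule during compaction: for equal keys the newer run wins, and a tombstone may only be discarded when no older data could still hold a shadowed version of the key. A sketch over two in-memory sorted runs (file I/O elided):

```go
package main

import "fmt"

type entry struct {
	key, val  string
	tombstone bool
}

// mergeRuns merges two sorted runs; newer shadows older for equal
// keys, and tombstones are emitted unless dropTombstones is true
// (safe only when compacting into the oldest level).
func mergeRuns(newer, older []entry, dropTombstones bool) []entry {
	var out []entry
	emit := func(e entry) {
		if !(e.tombstone && dropTombstones) {
			out = append(out, e)
		}
	}
	i, j := 0, 0
	for i < len(newer) && j < len(older) {
		switch {
		case newer[i].key < older[j].key:
			emit(newer[i])
			i++
		case newer[i].key > older[j].key:
			emit(older[j])
			j++
		default: // same key: newer wins, older version is discarded
			emit(newer[i])
			i++
			j++
		}
	}
	for ; i < len(newer); i++ {
		emit(newer[i])
	}
	for ; j < len(older); j++ {
		emit(older[j])
	}
	return out
}

func main() {
	newer := []entry{{key: "a", tombstone: true}, {key: "c", val: "3"}}
	older := []entry{{key: "a", val: "1"}, {key: "b", val: "2"}}
	for _, e := range mergeRuns(newer, older, true) {
		fmt.Println(e.key, e.val)
	}
	// "a" was deleted and its tombstone garbage-collected
}
```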
### Phase F: Basic Atomicity and Advanced Features (2-3 weeks)

1. Implement merged iterator across all levels
2. Add snapshot capability for reads:
   - Point-in-time view of the database
   - Consistent reads across MemTable and SSTables
3. Implement simple atomic batch operations:
   - Support atomic multi-key writes
   - Ensure proper crash recovery for batch operations
   - Design interfaces that can be extended for full transactions
4. Add basic statistics and metrics
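The merged iterator in step 1 is typically a min-heap of per-source cursors, where a lower rank (newer source) wins ties on equal keys. A sketch using `container/heap` over string-slice cursors standing in for real level iterators:

```go
package main

import (
	"container/heap"
	"fmt"
)

// cursor walks one sorted source (a MemTable or SSTable level);
// lower rank means newer source, which wins ties on equal keys.
type cursor struct {
	keys []string
	pos  int
	rank int
}

type cursorHeap []*cursor

func (h cursorHeap) Len() int { return len(h) }
func (h cursorHeap) Less(i, j int) bool {
	if h[i].keys[h[i].pos] != h[j].keys[h[j].pos] {
		return h[i].keys[h[i].pos] < h[j].keys[h[j].pos]
	}
	return h[i].rank < h[j].rank
}
func (h cursorHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h *cursorHeap) Push(x any)   { *h = append(*h, x.(*cursor)) }
func (h *cursorHeap) Pop() any {
	old := *h
	c := old[len(old)-1]
	*h = old[:len(old)-1]
	return c
}

// mergedKeys yields each key once, in sorted order, preferring the
// newest source; duplicates from older sources are skipped.
func mergedKeys(sources ...*cursor) []string {
	h := &cursorHeap{}
	for _, c := range sources {
		if len(c.keys) > 0 {
			heap.Push(h, c)
		}
	}
	var out []string
	for h.Len() > 0 {
		c := heap.Pop(h).(*cursor)
		k := c.keys[c.pos]
		if len(out) == 0 || out[len(out)-1] != k {
			out = append(out, k)
		}
		if c.pos++; c.pos < len(c.keys) {
			heap.Push(h, c)
		}
	}
	return out
}

func main() {
	mem := &cursor{keys: []string{"b", "d"}, rank: 0}
	l0 := &cursor{keys: []string{"a", "b", "c"}, rank: 1}
	fmt.Println(mergedKeys(mem, l0)) // [a b c d]
}
```

The same heap, seeded with cursors positioned by a seek, gives the efficient seeking capability listed for this phase.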
### Phase G: Optimization and Benchmarking (1 week)

1. Develop benchmark suite for:
   - Random vs. sequential writes
   - Point reads vs. range scans
   - Compaction overhead and pauses
2. Optimize critical paths based on profiling
3. Tune default configuration parameters

### Phase H: Optional Enhancements (as needed)

1. Add Bloom filters to reduce disk reads
2. Create monitoring hooks and detailed metrics
3. Add crash recovery testing

## Testing Strategy

1. **Unit Tests**: Each component thoroughly tested in isolation
2. **Integration Tests**: End-to-end tests for complete workflows
3. **Property Tests**: Generate randomized operations and verify correctness
4. **Crash Tests**: Simulate crashes and verify recovery
5. **Benchmarks**: Measure performance across different workloads

## Implementation Notes

### Error Handling

- Use descriptive error types and wrap errors with context
- Implement recovery mechanisms for all critical operations
- Validate checksums at every read opportunity

### Concurrency

- Implement single-writer architecture for the main write path
- Allow concurrent readers (snapshots) to proceed without blocking
- Use appropriate synchronization for reader-writer coordination
- Ensure proper isolation between transactions
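One way to realize the single-writer architecture is to funnel all mutations through one goroutine, so writers never contend with each other, while readers coordinate with that one writer via an `RWMutex`. A sketch against a plain map standing in for the MemTable:

```go
package main

import (
	"fmt"
	"sync"
)

// engine sketches the single-writer pattern: every mutation is a
// closure sent to writeCh and applied by exactly one goroutine.
type engine struct {
	mu      sync.RWMutex
	data    map[string]string
	writeCh chan func(map[string]string)
	done    chan struct{}
}

func newEngine() *engine {
	e := &engine{
		data:    make(map[string]string),
		writeCh: make(chan func(map[string]string)),
		done:    make(chan struct{}),
	}
	go func() { // the single writer goroutine
		defer close(e.done)
		for apply := range e.writeCh {
			e.mu.Lock()
			apply(e.data)
			e.mu.Unlock()
		}
	}()
	return e
}

// Put blocks until the writer goroutine has applied the mutation.
func (e *engine) Put(key, val string) {
	ack := make(chan struct{})
	e.writeCh <- func(m map[string]string) { m[key] = val; close(ack) }
	<-ack
}

// Get takes only a read lock, so readers proceed concurrently.
func (e *engine) Get(key string) (string, bool) {
	e.mu.RLock()
	defer e.mu.RUnlock()
	v, ok := e.data[key]
	return v, ok
}

func (e *engine) Close() { close(e.writeCh); <-e.done }

func main() {
	e := newEngine()
	e.Put("a", "1")
	v, ok := e.Get("a")
	fmt.Println(v, ok) // 1 true
	e.Close()
}
```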
### Batch Operation Management

- Use WAL for atomic batch operation durability
- Leverage LSM's natural versioning for snapshots
- Provide simple interfaces that can be built upon for transactions
- Ensure proper crash recovery for batch operations

### Go Idioms

- Follow standard Go project layout
- Use interfaces for component boundaries
- Rely on Go's GC but manage large memory allocations carefully
- Use context for cancellation where appropriate
TODO.md (new file, +198)
@@ -0,0 +1,198 @@
# Go Storage Engine Todo List

This document outlines the implementation tasks for the Go Storage Engine, organized by development phases. Follow these guidelines:

- Work on tasks in the order they appear
- Check off exactly one item (✓) before moving to the next unchecked item
- Each phase must be completed before starting the next phase
- Test thoroughly before marking an item complete

## Phase A: Foundation

- [ ] Set up project structure and Go module
  - [ ] Create directory structure following the package layout in PLAN.md
  - [ ] Initialize Go module and dependencies
  - [ ] Set up testing framework

- [ ] Implement config package
  - [ ] Define configuration struct with serialization/deserialization
  - [ ] Include configurable parameters for durability, compaction, memory usage
  - [ ] Create manifest loading/saving functionality
  - [ ] Add versioning support for config changes
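The versioned manifest could be as simple as a JSON document with a version field that load-time code checks before trusting the rest. A sketch with an illustrative field set (the real manifest would also be saved atomically, e.g. write-temp-then-rename):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Manifest sketches a versioned config record; the fields shown
// here are illustrative and would grow with the engine.
type Manifest struct {
	Version          int    `json:"version"`
	SyncWrites       bool   `json:"sync_writes"`
	MemTableFlushMiB int    `json:"memtable_flush_mib"`
	CompactionStyle  string `json:"compaction_style"`
}

// load decodes a manifest and rejects versions newer than this
// build understands, keeping startup state reproducible.
func load(data []byte, maxVersion int) (Manifest, error) {
	var m Manifest
	if err := json.Unmarshal(data, &m); err != nil {
		return Manifest{}, fmt.Errorf("config: parse manifest: %w", err)
	}
	if m.Version > maxVersion {
		return Manifest{}, fmt.Errorf("config: manifest version %d newer than supported %d", m.Version, maxVersion)
	}
	return m, nil
}

func main() {
	m := Manifest{Version: 1, SyncWrites: true, MemTableFlushMiB: 32, CompactionStyle: "tiered"}
	data, _ := json.Marshal(m)
	got, err := load(data, 1)
	fmt.Println(got.CompactionStyle, err)
}
```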
- [ ] Build Write-Ahead Log (WAL)
  - [ ] Implement append-only file with atomic operations
  - [ ] Add Put/Delete operation encoding
  - [ ] Create replay functionality with error recovery
  - [ ] Implement both synchronous (default) and batched fsync modes
  - [ ] Add checksumming for entries

- [ ] Write WAL tests
  - [ ] Test durability with simulated crashes
  - [ ] Verify replay correctness
  - [ ] Benchmark write performance with different sync options
  - [ ] Test error handling and recovery

## Phase B: In-Memory Layer

- [ ] Implement MemTable
  - [ ] Create skip list data structure aligned to 64-byte cache lines
  - [ ] Add key/value insertion and lookup operations
  - [ ] Implement sorted key iteration
  - [ ] Add size tracking for flush threshold detection

- [ ] Connect WAL replay to MemTable
  - [ ] Create recovery logic to rebuild MemTable from WAL
  - [ ] Implement consistent snapshot reads during recovery
  - [ ] Handle errors during replay with appropriate fallbacks

- [ ] Test concurrent read/write scenarios
  - [ ] Verify reader isolation during writes
  - [ ] Test snapshot consistency guarantees
  - [ ] Benchmark read/write performance under load

## Phase C: Persistent Storage

- [ ] Design SSTable format
  - [ ] Define 16KB block structure with restart points
  - [ ] Create checksumming for blocks (xxHash64)
  - [ ] Define index structure with entries every ~64KB
  - [ ] Design file footer with metadata (version, timestamp, key count, etc.)

- [ ] Implement SSTable writer
  - [ ] Add functionality to convert MemTable to blocks
  - [ ] Create sparse index generator
  - [ ] Implement footer writing with checksums
  - [ ] Add atomic file creation for crash safety

- [ ] Build SSTable reader
  - [ ] Implement block loading with validation
  - [ ] Create binary search through index
  - [ ] Develop iterator interface for scanning
  - [ ] Add error handling for corrupted files

## Phase D: Basic Engine Integration

- [ ] Implement Level 0 flush mechanism
  - [ ] Create MemTable to SSTable conversion process
  - [ ] Implement file management and naming scheme
  - [ ] Add background flush triggering based on size

- [ ] Create read path that merges data sources
  - [ ] Implement read from current MemTable
  - [ ] Add reads from immutable MemTables awaiting flush
  - [ ] Create mechanism to read from Level 0 SSTable files
  - [ ] Build priority-based lookup across all sources

## Phase E: Compaction

- [ ] Implement tiered compaction strategy
  - [ ] Create file selection algorithm based on overlap/size
  - [ ] Implement merge-sorted reading from input files
  - [ ] Add atomic output file generation
  - [ ] Create size-ratio and file-count based triggering

- [ ] Handle tombstones and key deletion
  - [ ] Implement tombstone markers
  - [ ] Create logic for tombstone garbage collection
  - [ ] Test deletion correctness across compactions

- [ ] Manage file obsolescence and cleanup
  - [ ] Implement safe file deletion after compaction
  - [ ] Create consistent file tracking
  - [ ] Add error handling for cleanup failures

- [ ] Build background compaction
  - [ ] Implement worker pool for compaction tasks
  - [ ] Add rate limiting to prevent I/O saturation
  - [ ] Create metrics for monitoring compaction progress
  - [ ] Implement priority scheduling for urgent compactions

## Phase F: Basic Atomicity and Features

- [ ] Implement merged iterator across all levels
  - [ ] Create priority merging iterator
  - [ ] Add efficient seeking capabilities
  - [ ] Implement proper cleanup for resources

- [ ] Add snapshot capability
  - [ ] Create point-in-time view mechanism
  - [ ] Implement consistent reads across all data sources
  - [ ] Add resource tracking and cleanup
  - [ ] Test isolation guarantees

- [ ] Implement atomic batch operations
  - [ ] Create batch data structure for multiple operations
  - [ ] Implement atomic batch commit to WAL
  - [ ] Add crash recovery for batches
  - [ ] Design extensible interfaces for future transaction support

- [ ] Add basic statistics and metrics
  - [ ] Implement counters for operations
  - [ ] Add timing measurements for critical paths
  - [ ] Create exportable metrics interface
  - [ ] Test accuracy of metrics

## Phase G: Optimization and Benchmarking

- [ ] Develop benchmark suite
  - [ ] Create random/sequential write benchmarks
  - [ ] Implement point read and range scan benchmarks
  - [ ] Add compaction overhead measurements
  - [ ] Build reproducible benchmark harness

- [ ] Optimize critical paths
  - [ ] Profile and identify bottlenecks
  - [ ] Optimize memory usage patterns
  - [ ] Improve cache efficiency in hot paths
  - [ ] Reduce GC pressure for large operations

- [ ] Tune default configuration
  - [ ] Benchmark with different parameters
  - [ ] Determine optimal defaults for general use cases
  - [ ] Document configuration recommendations

## Phase H: Optional Enhancements

- [ ] Add Bloom filters
  - [ ] Implement configurable Bloom filter
  - [ ] Add to SSTable format
  - [ ] Create adaptive sizing based on false positive rates
  - [ ] Benchmark improvement in read performance

- [ ] Create monitoring hooks
  - [ ] Add detailed internal event tracking
  - [ ] Implement exportable metrics
  - [ ] Create health check mechanisms
  - [ ] Add performance alerts

- [ ] Add crash recovery testing
  - [ ] Build fault injection framework
  - [ ] Create randomized crash scenarios
  - [ ] Implement validation for post-recovery state
  - [ ] Test edge cases in recovery

## API Implementation

- [ ] Implement Engine interface
  - [ ] `Put(ctx context.Context, key, value []byte, opts ...WriteOption) error`
  - [ ] `Get(ctx context.Context, key []byte, opts ...ReadOption) ([]byte, error)`
  - [ ] `Delete(ctx context.Context, key []byte, opts ...WriteOption) error`
  - [ ] `Batch(ctx context.Context, ops []Operation, opts ...WriteOption) error`
  - [ ] `NewIterator(opts IteratorOptions) Iterator`
  - [ ] `Snapshot() Snapshot`
  - [ ] `Close() error`
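The checklist signatures above assemble into a compilable interface. In the sketch below the interface itself is taken from the checklist verbatim; the option, `Operation`, `Iterator`, and `Snapshot` declarations are illustrative stubs whose real definitions belong to the engine package:

```go
package main

import "context"

// Stub supporting types; shapes here are illustrative only.
type (
	WriteOption     func(*writeConfig)
	ReadOption      func(*readConfig)
	writeConfig     struct{ sync bool }
	readConfig      struct{ verifyChecksums bool }
	Operation       struct{ Key, Value []byte } // nil Value could mean delete
	IteratorOptions struct{ Start, End []byte }
)

type Iterator interface {
	Next() bool
	Key() []byte
	Value() []byte
	Close() error
}

type Snapshot interface {
	Get(ctx context.Context, key []byte, opts ...ReadOption) ([]byte, error)
	Close() error
}

// Engine collects the checklist signatures into one interface.
type Engine interface {
	Put(ctx context.Context, key, value []byte, opts ...WriteOption) error
	Get(ctx context.Context, key []byte, opts ...ReadOption) ([]byte, error)
	Delete(ctx context.Context, key []byte, opts ...WriteOption) error
	Batch(ctx context.Context, ops []Operation, opts ...WriteOption) error
	NewIterator(opts IteratorOptions) Iterator
	Snapshot() Snapshot
	Close() error
}

func main() {}
```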
- [ ] Implement error types
  - [ ] `ErrIO` - I/O errors with recovery procedures
  - [ ] `ErrCorruption` - Data integrity issues
  - [ ] `ErrConfig` - Configuration errors
  - [ ] `ErrResource` - Resource exhaustion
  - [ ] `ErrConcurrency` - Race conditions
  - [ ] `ErrNotFound` - Key not found
- [ ] Create comprehensive documentation
  - [ ] API usage examples
  - [ ] Configuration guidelines
  - [ ] Performance characteristics
  - [ ] Error handling recommendations