kevo/PLAN.md

Implementation Plan for Go Storage Engine

Architecture Overview

┌─────────────┐     ┌─────────────┐     ┌─────────────────────────┐
│ Client API  │────▶│  MemTable   │────▶│ Immutable SSTable Files │
└─────────────┘     └─────────────┘     └─────────────────────────┘
       │                   ▲                         ▲
       │                   │                         │
       ▼                   │                         │
┌─────────────┐            │            ┌─────────────────────────┐
│  Write-     │────────────┘            │ Background Compaction   │
│  Ahead Log  │                         │ Process                 │
└─────────────┘                         └─────────────────────────┘
       │                                            │
       │                                            │
       ▼                                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                       Persistent Storage                         │
└─────────────────────────────────────────────────────────────────┘

Package Structure

go-storage/
├── cmd/
│   └── storage-bench/       # Benchmarking tool
│
├── pkg/
│   ├── config/              # Configuration and manifest
│   ├── wal/                 # Write-ahead logging with transaction markers
│   ├── memtable/            # In-memory table implementation
│   ├── sstable/             # SSTable read/write
│   │   ├── block/           # Block format implementation
│   │   └── footer/          # File footer and metadata
│   ├── compaction/          # Compaction strategies
│   ├── iterator/            # Merged iterator implementation
│   ├── transaction/         # Transaction management with Snapshot + WAL
│   │   ├── snapshot/        # Read snapshot implementation
│   │   └── txbuffer/        # Transaction write buffer
│   └── engine/              # Main engine implementation with single-writer architecture
│
└── internal/
    ├── checksum/            # Checksum utilities (xxHash64)
    └── utils/               # Shared internal utilities

Development Phases

Phase A: Foundation (1-2 weeks)

  1. Set up project structure and Go module
  2. Implement config package with serialization/deserialization
  3. Build basic WAL with:
    • Append operations (Put/Delete)
    • Replay functionality
    • Configurable fsync modes
  4. Write comprehensive tests for WAL durability
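
The append and replay steps above can be sketched as follows. This is a minimal illustration, not the engine's actual format: the record layout, the operation codes, the `maxPayload` bound, and the use of CRC32 are all assumptions made for the example. Fsync modes are left out; with a real file, durability would come from calling `(*os.File).Sync` according to the configured mode.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
	"io"
)

// Hypothetical operation codes; the plan does not fix an encoding.
const (
	opPut    byte = 1
	opDelete byte = 2
)

// maxPayload bounds key+value size so a corrupt length field cannot
// trigger a huge allocation during replay (assumed limit).
const maxPayload = 1 << 20

var errCorrupt = errors.New("wal: corrupt or truncated record")

// Record is one logical log entry.
type Record struct {
	Op         byte
	Key, Value string
}

// appendRecord writes crc32(4) | op(1) | keyLen(4) | valLen(4) | key | value.
// The checksum covers every byte after the crc field.
func appendRecord(w io.Writer, r Record) error {
	buf := make([]byte, 13+len(r.Key)+len(r.Value))
	buf[4] = r.Op
	binary.LittleEndian.PutUint32(buf[5:9], uint32(len(r.Key)))
	binary.LittleEndian.PutUint32(buf[9:13], uint32(len(r.Value)))
	copy(buf[13:], r.Key)
	copy(buf[13+len(r.Key):], r.Value)
	binary.LittleEndian.PutUint32(buf[0:4], crc32.ChecksumIEEE(buf[4:]))
	_, err := w.Write(buf)
	return err
}

// replay calls apply for each valid record in log order. It returns nil at a
// clean end of log and an error wrapping errCorrupt at the first damaged or
// truncated record; records before that point have already been applied.
func replay(r io.Reader, apply func(Record)) error {
	var hdr [13]byte
	for {
		if _, err := io.ReadFull(r, hdr[:]); err != nil {
			if errors.Is(err, io.EOF) {
				return nil // no bytes left: clean end of log
			}
			return fmt.Errorf("%w: header: %v", errCorrupt, err)
		}
		keyLen := binary.LittleEndian.Uint32(hdr[5:9])
		valLen := binary.LittleEndian.Uint32(hdr[9:13])
		if uint64(keyLen)+uint64(valLen) > maxPayload {
			return fmt.Errorf("%w: implausible length", errCorrupt)
		}
		payload := make([]byte, keyLen+valLen)
		if _, err := io.ReadFull(r, payload); err != nil {
			return fmt.Errorf("%w: payload: %v", errCorrupt, err)
		}
		sum := crc32.Update(crc32.ChecksumIEEE(hdr[4:]), crc32.IEEETable, payload)
		if sum != binary.LittleEndian.Uint32(hdr[0:4]) {
			return fmt.Errorf("%w: checksum mismatch", errCorrupt)
		}
		apply(Record{Op: hdr[4], Key: string(payload[:keyLen]), Value: string(payload[keyLen:])})
	}
}

// encode serializes records into one in-memory log.
func encode(recs ...Record) []byte {
	var buf bytes.Buffer
	for _, r := range recs {
		appendRecord(&buf, r) // writes to a bytes.Buffer do not fail
	}
	return buf.Bytes()
}

// rebuild replays a log into a fresh map, mimicking a MemTable restore.
func rebuild(log []byte) (map[string]string, error) {
	state := map[string]string{}
	err := replay(bytes.NewReader(log), func(r Record) {
		if r.Op == opDelete {
			delete(state, r.Key)
		} else {
			state[r.Key] = r.Value
		}
	})
	return state, err
}

func main() {
	log := encode(Record{opPut, "a", "1"}, Record{opPut, "b", "2"}, Record{opDelete, "a", ""})
	state, err := rebuild(log)
	fmt.Println(state, err)
}
```

Stopping at the first damaged record treats a torn tail write as the end of the log, which is one common policy; whether to fail hard instead is a design decision the plan leaves open.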

Phase B: In-Memory Layer (1 week)

  1. Implement MemTable with:
    • Skip list data structure
    • Sorted key iteration
    • Size tracking for flush threshold
  2. Connect WAL replay to MemTable restore
  3. Test concurrent read/write scenarios
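
As a rough sketch of the MemTable described above, the following skip list keeps keys sorted and tracks an approximate byte size for the flush threshold. The names (`MemTable`, `maxLevel`), the promotion probability of 1/4, and the fixed random seed are assumptions for illustration. The sketch is not safe for concurrent use; a real implementation would add synchronization.

```go
package main

import (
	"fmt"
	"math/rand"
)

const maxLevel = 12 // hypothetical cap on tower height

type node struct {
	key, value string
	next       []*node // next[i] is the successor at level i
}

// MemTable is a sorted in-memory map backed by a skip list.
type MemTable struct {
	head *node
	size int // approximate payload bytes, compared against the flush threshold
	rng  *rand.Rand
}

func NewMemTable() *MemTable {
	return &MemTable{
		head: &node{next: make([]*node, maxLevel)},
		rng:  rand.New(rand.NewSource(1)), // fixed seed keeps the example deterministic
	}
}

// findPrev returns, for every level, the last node whose key is < key.
func (m *MemTable) findPrev(key string) [maxLevel]*node {
	var prev [maxLevel]*node
	x := m.head
	for i := maxLevel - 1; i >= 0; i-- {
		for x.next[i] != nil && x.next[i].key < key {
			x = x.next[i]
		}
		prev[i] = x
	}
	return prev
}

// Put inserts key or overwrites its value, updating the size estimate.
func (m *MemTable) Put(key, value string) {
	prev := m.findPrev(key)
	if n := prev[0].next[0]; n != nil && n.key == key {
		m.size += len(value) - len(n.value)
		n.value = value
		return
	}
	height := 1
	for height < maxLevel && m.rng.Intn(4) == 0 {
		height++
	}
	n := &node{key: key, value: value, next: make([]*node, height)}
	for i := 0; i < height; i++ {
		n.next[i] = prev[i].next[i]
		prev[i].next[i] = n
	}
	m.size += len(key) + len(value)
}

// Get returns the value stored for key, if any.
func (m *MemTable) Get(key string) (string, bool) {
	if n := m.findPrev(key)[0].next[0]; n != nil && n.key == key {
		return n.value, true
	}
	return "", false
}

// Keys walks level 0, which links every node in sorted order.
func (m *MemTable) Keys() []string {
	var out []string
	for n := m.head.next[0]; n != nil; n = n.next[0] {
		out = append(out, n.key)
	}
	return out
}

// Size reports the approximate number of key and value bytes held.
func (m *MemTable) Size() int { return m.size }

func main() {
	m := NewMemTable()
	m.Put("c", "3")
	m.Put("a", "1")
	m.Put("b", "2")
	m.Put("a", "10") // overwrite grows the size by one byte
	fmt.Println(m.Keys(), m.Size())
}
```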

Phase C: Persistent Storage (2 weeks)

  1. Design and implement SSTable format:
    • Block-based layout with restart points
    • Checksummed blocks
    • Index and metadata in footer
  2. Build SSTable writer:
    • Convert MemTable to blocks
    • Generate sparse index
    • Write footer with checksums
  3. Implement SSTable reader:
    • Block loading and validation
    • Binary search through index
    • Iterator interface
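
Two pieces of the reader can be illustrated in a few lines: the binary search through a sparse index, and block validation against a trailing checksum. This is a sketch under stated assumptions: the `indexEntry` layout is hypothetical, and CRC32 from the standard library stands in for xxHash64, which the plan names but which is not part of the Go standard library.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
	"sort"
)

// indexEntry describes one data block (hypothetical layout).
type indexEntry struct {
	lastKey string // largest key stored in the block
	offset  int64  // byte offset of the block within the file
}

// findBlock returns the position of the only block that can hold key,
// or -1 if key is larger than every key in the table.
func findBlock(index []indexEntry, key string) int {
	i := sort.Search(len(index), func(i int) bool { return index[i].lastKey >= key })
	if i == len(index) {
		return -1
	}
	return i
}

var errChecksum = errors.New("sstable: block checksum mismatch")

// sealBlock appends a 4-byte CRC32 of payload (stand-in for xxHash64).
func sealBlock(payload []byte) []byte {
	out := make([]byte, len(payload)+4)
	copy(out, payload)
	binary.LittleEndian.PutUint32(out[len(payload):], crc32.ChecksumIEEE(payload))
	return out
}

// openBlock verifies the trailing checksum and returns the payload.
func openBlock(block []byte) ([]byte, error) {
	if len(block) < 4 {
		return nil, errChecksum
	}
	payload := block[:len(block)-4]
	want := binary.LittleEndian.Uint32(block[len(block)-4:])
	if crc32.ChecksumIEEE(payload) != want {
		return nil, errChecksum
	}
	return payload, nil
}

func main() {
	index := []indexEntry{{"d", 0}, {"m", 4096}, {"z", 8192}}
	fmt.Println(findBlock(index, "e")) // key "e" falls in the block ending at "m"

	block := sealBlock([]byte("hello"))
	payload, err := openBlock(block)
	fmt.Println(string(payload), err)

	block[0] ^= 0xff // simulate on-disk corruption
	_, err = openBlock(block)
	fmt.Println(err)
}
```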

Phase D: Basic Engine Integration (1 week)

  1. Implement Level 0 flush mechanism:
    • MemTable to SSTable conversion
    • File management and naming
  2. Create read path that merges:
    • Current MemTable
    • Immutable MemTables awaiting flush
    • Level 0 SSTable files
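
The merged read path for point lookups can be sketched as a newest-first walk over the three kinds of sources listed above. The `source` interface and the map-backed stand-in are hypothetical; the key point is that the first source holding the key decides the result, and that a tombstone counts as a hit that hides older values.

```go
package main

import "fmt"

// entry is a stored value or a deletion marker.
type entry struct {
	value     string
	tombstone bool
}

// source is anything that can answer a point lookup (hypothetical interface).
// The active MemTable, immutable MemTables, and Level 0 files would each
// implement it.
type source interface {
	lookup(key string) (entry, bool)
}

// mapSource is a stand-in used here for all three kinds of source.
type mapSource map[string]entry

func (m mapSource) lookup(key string) (entry, bool) {
	e, ok := m[key]
	return e, ok
}

// get consults sources in newest-first order and stops at the first hit.
// A tombstone is a hit: it hides any older value for the same key.
func get(sources []source, key string) (string, bool) {
	for _, s := range sources {
		if e, ok := s.lookup(key); ok {
			if e.tombstone {
				return "", false
			}
			return e.value, true
		}
	}
	return "", false
}

func main() {
	active := mapSource{"a": entry{value: "new"}}
	immutable := mapSource{"b": entry{tombstone: true}}
	level0 := mapSource{"a": entry{value: "old"}, "b": entry{value: "2"}, "c": entry{value: "3"}}
	sources := []source{active, immutable, level0}

	for _, k := range []string{"a", "b", "c", "d"} {
		v, ok := get(sources, k)
		fmt.Printf("%s -> %q %v\n", k, v, ok)
	}
}
```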

Phase E: Compaction (2 weeks)

  1. Implement a single compaction strategy:
    • Simple tiered compaction approach
  2. Handle tombstones and key deletion
  3. Manage file obsolescence and cleanup
  4. Build background compaction scheduling
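
Tombstone handling during compaction can be illustrated with a small merge over sorted runs. This is a simplified sketch, not the planned tiered strategy itself: it assumes each run is sorted with unique keys, that runs are passed newest first, and that tombstones may only be dropped when no older data for the key can exist outside the inputs (signalled here by the hypothetical `dropTombstones` flag).

```go
package main

import "fmt"

// kv is one entry in a sorted run; tombstone marks a deletion.
type kv struct {
	key, value string
	tombstone  bool
}

// compact merges sorted runs into one sorted run. runs[0] is the newest;
// when several runs hold the same key, the newest version is kept.
// Tombstones are dropped only when dropTombstones is true.
func compact(runs [][]kv, dropTombstones bool) []kv {
	pos := make([]int, len(runs)) // read cursor per run
	var out []kv
	for {
		// Pick the run whose current key is smallest. The strict "<" means
		// that on equal keys the earlier (newer) run stays selected.
		best := -1
		for i, r := range runs {
			if pos[i] >= len(r) {
				continue
			}
			if best == -1 || r[pos[i]].key < runs[best][pos[best]].key {
				best = i
			}
		}
		if best == -1 {
			return out // every run is exhausted
		}
		winner := runs[best][pos[best]]
		// Advance every run positioned on this key, discarding older versions.
		for i, r := range runs {
			if pos[i] < len(r) && r[pos[i]].key == winner.key {
				pos[i]++
			}
		}
		if winner.tombstone && dropTombstones {
			continue
		}
		out = append(out, winner)
	}
}

func main() {
	newer := []kv{{"a", "new", false}, {"c", "", true}}
	older := []kv{{"a", "old", false}, {"b", "2", false}, {"c", "3", false}}
	fmt.Println(compact([][]kv{newer, older}, true))
	fmt.Println(compact([][]kv{newer, older}, false))
}
```

With many input files, a heap-based merge would scale better than the linear scan over runs used here.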

Phase F: Basic Atomicity and Advanced Features (2-3 weeks)

  1. Implement merged iterator across all levels
  2. Add snapshot capability for reads:
    • Point-in-time view of the database
    • Consistent reads across MemTable and SSTables
  3. Implement simple atomic batch operations:
    • Support atomic multi-key writes
    • Ensure proper crash recovery for batch operations
    • Design interfaces that can be extended for full transactions
  4. Add basic statistics and metrics
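
One way to picture the snapshot capability is with per-write sequence numbers: a snapshot is simply the sequence number current at the time it was taken, and a snapshot read returns the newest version at or below that number. The sketch below keeps all versions in a map for brevity; the `Store` type and its methods are hypothetical, and a real engine would find versions across the MemTable and SSTables instead.

```go
package main

import "fmt"

// version is one write to a key, stamped with a global sequence number.
type version struct {
	seq     uint64
	value   string
	deleted bool
}

// Store keeps every version of every key (a stand-in for MemTable + SSTables).
type Store struct {
	seq      uint64
	versions map[string][]version // per key, in increasing seq order
}

func NewStore() *Store { return &Store{versions: map[string][]version{}} }

func (s *Store) write(key, value string, deleted bool) {
	s.seq++
	s.versions[key] = append(s.versions[key], version{s.seq, value, deleted})
}

func (s *Store) Put(key, value string) { s.write(key, value, false) }
func (s *Store) Delete(key string)     { s.write(key, "", true) }

// Snapshot returns a point-in-time marker: the latest sequence number.
func (s *Store) Snapshot() uint64 { return s.seq }

// GetAt returns the newest version of key with seq <= snap.
func (s *Store) GetAt(key string, snap uint64) (string, bool) {
	vs := s.versions[key]
	for i := len(vs) - 1; i >= 0; i-- {
		if vs[i].seq <= snap {
			if vs[i].deleted {
				return "", false
			}
			return vs[i].value, true
		}
	}
	return "", false
}

func main() {
	s := NewStore()
	s.Put("a", "1")
	snap := s.Snapshot()
	s.Put("a", "2")
	s.Delete("a")

	old, okOld := s.GetAt("a", snap)
	cur, okCur := s.GetAt("a", s.Snapshot())
	fmt.Println(old, okOld, cur, okCur)
}
```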

Phase G: Optimization and Benchmarking (1 week)

  1. Develop benchmark suite for:
    • Random vs sequential writes
    • Point reads vs range scans
    • Compaction overhead and pauses
  2. Optimize critical paths based on profiling
  3. Tune default configuration parameters

Phase H: Optional Enhancements (as needed)

  1. Add Bloom filters to reduce disk reads
  2. Create monitoring hooks and detailed metrics
  3. Add crash recovery testing
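
For the Bloom filter item, a minimal sketch follows. It derives all probe positions from two FNV hashes (double hashing), which is an assumption chosen to stay within the standard library; the sizing parameters are placeholders rather than tuned values. A negative answer means the key is definitely absent, so the SSTable read can be skipped; a positive answer may be a false positive.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bloom is a fixed-size Bloom filter.
type Bloom struct {
	bits []uint64
	m    uint64 // number of bits (a multiple of 64)
	k    uint64 // probes per key
}

// NewBloom creates a filter with at least m bits and k probes per key.
func NewBloom(m, k uint64) *Bloom {
	words := (m + 63) / 64
	if words == 0 {
		words = 1 // avoid a zero-sized filter
	}
	return &Bloom{bits: make([]uint64, words), m: words * 64, k: k}
}

// baseHashes returns two hashes of key; probe i uses h1 + i*h2.
func baseHashes(key string) (uint64, uint64) {
	a := fnv.New64a()
	a.Write([]byte(key))
	b := fnv.New64()
	b.Write([]byte(key))
	return a.Sum64(), b.Sum64() | 1 // force h2 odd so probes do not collapse
}

// Add records key in the filter.
func (b *Bloom) Add(key string) {
	h1, h2 := baseHashes(key)
	for i := uint64(0); i < b.k; i++ {
		bit := (h1 + i*h2) % b.m
		b.bits[bit/64] |= uint64(1) << (bit % 64)
	}
}

// MayContain reports false only if key was definitely never added.
func (b *Bloom) MayContain(key string) bool {
	h1, h2 := baseHashes(key)
	for i := uint64(0); i < b.k; i++ {
		bit := (h1 + i*h2) % b.m
		if b.bits[bit/64]&(uint64(1)<<(bit%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	f := NewBloom(1024, 4)
	for _, k := range []string{"apple", "banana", "cherry"} {
		f.Add(k)
	}
	fmt.Println(f.MayContain("banana"))
}
```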

Testing Strategy

  1. Unit Tests: Each component thoroughly tested in isolation
  2. Integration Tests: End-to-end tests for complete workflows
  3. Property Tests: Generate randomized operations and verify correctness
  4. Crash Tests: Simulate crashes and verify recovery
  5. Benchmarks: Measure performance across different workloads
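
The property-test idea can be sketched as a model check: apply the same random operations to the store under test and to a plain map, and compare every read. Everything named here is hypothetical: the `KV` interface, the `sliceStore` stand-in (a sorted-slice store used only so the example runs), and the choice of 20 distinct keys to force frequent overwrites and deletes.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// KV is the minimal surface the check needs (hypothetical interface).
type KV interface {
	Put(key, value string)
	Delete(key string)
	Get(key string) (string, bool)
}

// sliceStore is a stand-in implementation backed by parallel sorted slices.
type sliceStore struct{ keys, vals []string }

func (s *sliceStore) find(key string) (int, bool) {
	i := sort.SearchStrings(s.keys, key)
	return i, i < len(s.keys) && s.keys[i] == key
}

func (s *sliceStore) Put(key, value string) {
	i, ok := s.find(key)
	if ok {
		s.vals[i] = value
		return
	}
	s.keys = append(s.keys, "")
	copy(s.keys[i+1:], s.keys[i:])
	s.keys[i] = key
	s.vals = append(s.vals, "")
	copy(s.vals[i+1:], s.vals[i:])
	s.vals[i] = value
}

func (s *sliceStore) Delete(key string) {
	if i, ok := s.find(key); ok {
		s.keys = append(s.keys[:i], s.keys[i+1:]...)
		s.vals = append(s.vals[:i], s.vals[i+1:]...)
	}
}

func (s *sliceStore) Get(key string) (string, bool) {
	if i, ok := s.find(key); ok {
		return s.vals[i], true
	}
	return "", false
}

// checkAgainstModel runs random Put/Delete/Get operations against store and
// a map model, returning an error at the first read that disagrees.
func checkAgainstModel(store KV, seed int64, steps int) error {
	rng := rand.New(rand.NewSource(seed))
	model := map[string]string{}
	for step := 0; step < steps; step++ {
		key := fmt.Sprintf("k%02d", rng.Intn(20))
		switch rng.Intn(3) {
		case 0:
			val := fmt.Sprintf("v%d", step)
			store.Put(key, val)
			model[key] = val
		case 1:
			store.Delete(key)
			delete(model, key)
		default:
			got, ok := store.Get(key)
			want, wantOK := model[key]
			if ok != wantOK || got != want {
				return fmt.Errorf("seed %d step %d: Get(%q) = (%q, %v), want (%q, %v)",
					seed, step, key, got, ok, want, wantOK)
			}
		}
	}
	return nil
}

func main() {
	for seed := int64(1); seed <= 5; seed++ {
		if err := checkAgainstModel(&sliceStore{}, seed, 1000); err != nil {
			fmt.Println("FAIL:", err)
			return
		}
	}
	fmt.Println("all seeds passed")
}
```

Reporting the seed and step in the error makes a failing sequence reproducible.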

Implementation Notes

Error Handling

  • Use descriptive error types and wrap errors with context
  • Implement recovery mechanisms for all critical operations
  • Validate checksums whenever checksummed data (such as SSTable blocks and footers) is read
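
A short sketch of the first point, using the standard `errors` and `fmt` packages: a sentinel error identifies the failure class, a typed error carries location details, and `%w` adds context at each layer without hiding either. The names `ErrChecksum`, `BlockError`, and the file name `000012.sst` are invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrChecksum is a sentinel that callers can test for with errors.Is.
var ErrChecksum = errors.New("checksum mismatch")

// BlockError is a descriptive error type carrying where the failure occurred.
type BlockError struct {
	File   string
	Offset int64
	Err    error
}

func (e *BlockError) Error() string {
	return fmt.Sprintf("%s: block at offset %d: %v", e.File, e.Offset, e.Err)
}

// Unwrap lets errors.Is and errors.As see through BlockError.
func (e *BlockError) Unwrap() error { return e.Err }

// readBlock stands in for a real block read; corrupt simulates a bad checksum.
func readBlock(file string, offset int64, corrupt bool) error {
	if corrupt {
		return &BlockError{File: file, Offset: offset, Err: ErrChecksum}
	}
	return nil
}

// get wraps the lower-level error with the operation that failed.
func get(key string, corrupt bool) error {
	if err := readBlock("000012.sst", 4096, corrupt); err != nil {
		return fmt.Errorf("get %q: %w", key, err)
	}
	return nil
}

// isChecksumFailure reports whether err was caused by a checksum mismatch.
func isChecksumFailure(err error) bool { return errors.Is(err, ErrChecksum) }

// failedOffset extracts the block offset from a wrapped BlockError, if present.
func failedOffset(err error) (int64, bool) {
	var be *BlockError
	if errors.As(err, &be) {
		return be.Offset, true
	}
	return 0, false
}

func main() {
	err := get("user:1", true)
	fmt.Println(err)
	fmt.Println(isChecksumFailure(err))
	fmt.Println(failedOffset(err))
}
```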

Concurrency

  • Implement single-writer architecture for the main write path
  • Allow concurrent readers (snapshots) to proceed without blocking
  • Use appropriate synchronization for reader-writer coordination
  • Ensure proper isolation between transactions
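
The single-writer idea can be sketched with one goroutine that owns all mutations and a channel that funnels writes to it. This is only an illustration of the pattern: the `Engine` type, the request shape, and the use of a `sync.RWMutex` for readers are assumptions, and the read lock here is a simplification of the non-blocking snapshot reads the plan calls for.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

var errClosed = errors.New("engine: closed")

// writeReq is one write handed to the writer goroutine.
type writeReq struct {
	key, value string
	done       chan error
}

// Engine serializes all writes through a single goroutine.
type Engine struct {
	mu   sync.RWMutex
	data map[string]string
	reqs chan writeReq
	quit chan struct{} // closed when the writer goroutine exits
}

// NewEngine starts the writer goroutine; it stops when ctx is cancelled.
func NewEngine(ctx context.Context) *Engine {
	e := &Engine{
		data: map[string]string{},
		reqs: make(chan writeReq),
		quit: make(chan struct{}),
	}
	go func() {
		defer close(e.quit)
		for {
			select {
			case <-ctx.Done():
				return
			case r := <-e.reqs:
				e.mu.Lock()
				e.data[r.key] = r.value
				e.mu.Unlock()
				r.done <- nil
			}
		}
	}()
	return e
}

// Put hands the write to the writer and waits for it to be applied.
func (e *Engine) Put(key, value string) error {
	r := writeReq{key: key, value: value, done: make(chan error, 1)}
	select {
	case e.reqs <- r:
		return <-r.done // the writer has accepted r and will always reply
	case <-e.quit:
		return errClosed
	}
}

// Get reads concurrently with other readers.
func (e *Engine) Get(key string) (string, bool) {
	e.mu.RLock()
	defer e.mu.RUnlock()
	v, ok := e.data[key]
	return v, ok
}

// demo issues concurrent writes, then shuts the engine down. It returns the
// number of keys found and the error from a write attempted after shutdown.
func demo() (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	e := NewEngine(ctx)
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			e.Put(fmt.Sprintf("k%d", i), "v")
		}(i)
	}
	wg.Wait()
	found := 0
	for i := 0; i < 8; i++ {
		if _, ok := e.Get(fmt.Sprintf("k%d", i)); ok {
			found++
		}
	}
	cancel()
	<-e.quit
	return found, e.Put("late", "v")
}

func main() {
	fmt.Println(demo())
}
```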

Batch Operation Management

  • Use WAL for atomic batch operation durability
  • Leverage LSM's natural versioning for snapshots
  • Provide simple interfaces that can be built upon for transactions
  • Ensure proper crash recovery for batch operations
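
Atomic batch recovery can be sketched with begin and commit markers in the log: during replay, writes are buffered per batch and applied only when the matching commit is seen, so a batch cut short by a crash leaves no trace. The entry kinds and the rule that writes outside a batch are ignored are assumptions made for this example.

```go
package main

import "fmt"

// Hypothetical log entry kinds.
const (
	kindBegin byte = iota + 1
	kindPut
	kindDelete
	kindCommit
)

// logEntry is one decoded WAL entry.
type logEntry struct {
	kind       byte
	key, value string
}

// recoverState replays the log and applies only batches that reached their
// commit marker. A batch without a commit (for example, one interrupted by a
// crash) is discarded as a whole.
func recoverState(log []logEntry) map[string]string {
	state := map[string]string{}
	var pending []logEntry
	inBatch := false
	for _, e := range log {
		switch e.kind {
		case kindBegin:
			// A new begin discards any earlier batch that never committed.
			pending, inBatch = pending[:0], true
		case kindPut, kindDelete:
			if inBatch {
				pending = append(pending, e)
			}
		case kindCommit:
			if !inBatch {
				continue
			}
			for _, p := range pending {
				if p.kind == kindDelete {
					delete(state, p.key)
				} else {
					state[p.key] = p.value
				}
			}
			pending, inBatch = pending[:0], false
		}
	}
	return state
}

func main() {
	log := []logEntry{
		{kindBegin, "", ""},
		{kindPut, "a", "1"},
		{kindPut, "b", "2"},
		{kindCommit, "", ""},
		{kindBegin, "", ""},
		{kindDelete, "a", ""},
		{kindPut, "c", "3"},
		// crash: no commit marker for the second batch
	}
	fmt.Println(recoverState(log))
}
```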

Go Idioms

  • Follow standard Go project layout
  • Use interfaces for component boundaries
  • Rely on Go's GC but manage large memory allocations carefully
  • Use context for cancellation where appropriate