# Implementation Plan for Go Storage Engine

## Architecture Overview

```
┌─────────────┐     ┌─────────────┐     ┌─────────────────────────┐
│ Client API  │────▶│  MemTable   │────▶│ Immutable SSTable Files │
└─────────────┘     └─────────────┘     └─────────────────────────┘
       │                   ▲                         ▲
       │                   │                         │
       ▼                   │                         │
┌─────────────┐            │            ┌─────────────────────────┐
│  Write-     │────────────┘            │ Background Compaction   │
│  Ahead Log  │                         │ Process                 │
└─────────────┘                         └─────────────────────────┘
       │                                            │
       │                                            │
       ▼                                            ▼
┌─────────────────────────────────────────────────────────────────┐
│                       Persistent Storage                         │
└─────────────────────────────────────────────────────────────────┘
```

## Package Structure

```
go-storage/
├── cmd/
│   └── storage-bench/       # Benchmarking tool
│
├── pkg/
│   ├── config/              # Configuration and manifest
│   ├── wal/                 # Write-ahead logging with transaction markers
│   ├── memtable/            # In-memory table implementation
│   ├── sstable/             # SSTable read/write
│   │   ├── block/           # Block format implementation
│   │   └── footer/          # File footer and metadata
│   ├── compaction/          # Compaction strategies
│   ├── iterator/            # Merged iterator implementation
│   ├── transaction/         # Transaction management with WAL and locks
│   │   └── txbuffer/        # Transaction write buffer
│   └── engine/              # Main engine implementation with single-writer architecture
│
└── internal/
    ├── checksum/            # Checksum utilities (xxHash64)
    └── utils/               # Shared internal utilities
```

## Development Phases

### Phase A: Foundation (1-2 weeks)
1. Set up project structure and Go module
2. Implement config package with serialization/deserialization
3. Build basic WAL with:
   - Append operations (Put/Delete)
   - Replay functionality
   - Configurable fsync modes
4. Write tests for WAL durability, covering replay after a crash and after a truncated or corrupted tail record

### Phase B: In-Memory Layer (1 week)
1. Implement MemTable with:
   - Skip list data structure
   - Sorted key iteration
   - Size tracking for flush threshold
2. Connect WAL replay to MemTable restore
3. Test concurrent read/write scenarios
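
To make the MemTable responsibilities concrete, here is a rough sketch of sorted storage with size tracking. It deliberately uses a sorted slice as a stand-in for the skip list named above (inserts are O(n) here), it is not safe for concurrent use, and the `MemTable`, `entry`, and `ShouldFlush` names are assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

type entry struct {
	key, value string
	tombstone  bool
}

// MemTable keeps entries sorted by key. A sorted slice stands in for the
// skip list; only the ordering and size-tracking behavior is illustrated.
type MemTable struct {
	entries []entry
	size    int // approximate bytes of keys and values held
}

// find returns the position where key is or would be, and whether it exists.
func (m *MemTable) find(key string) (int, bool) {
	i := sort.Search(len(m.entries), func(i int) bool { return m.entries[i].key >= key })
	return i, i < len(m.entries) && m.entries[i].key == key
}

// set inserts or overwrites an entry and adjusts the tracked size.
func (m *MemTable) set(e entry) {
	i, ok := m.find(e.key)
	if ok {
		m.size += len(e.value) - len(m.entries[i].value)
		m.entries[i] = e
		return
	}
	m.entries = append(m.entries, entry{})
	copy(m.entries[i+1:], m.entries[i:])
	m.entries[i] = e
	m.size += len(e.key) + len(e.value)
}

func (m *MemTable) Put(key, value string) { m.set(entry{key: key, value: value}) }

// Delete records a tombstone so that older layers are shadowed.
func (m *MemTable) Delete(key string) { m.set(entry{key: key, tombstone: true}) }

// Get reports found=false for both missing keys and tombstones.
func (m *MemTable) Get(key string) (string, bool) {
	i, ok := m.find(key)
	if !ok || m.entries[i].tombstone {
		return "", false
	}
	return m.entries[i].value, true
}

// Keys returns all keys, including tombstoned ones, in sorted order.
func (m *MemTable) Keys() []string {
	keys := make([]string, len(m.entries))
	for i, e := range m.entries {
		keys[i] = e.key
	}
	return keys
}

// ShouldFlush reports whether the tracked size has reached the threshold.
func (m *MemTable) ShouldFlush(threshold int) bool { return m.size >= threshold }

func main() {
	m := &MemTable{}
	m.Put("b", "2")
	m.Put("a", "1")
	m.Delete("b")
	fmt.Println(m.Keys(), m.ShouldFlush(1024))
}
```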

### Phase C: Persistent Storage (2 weeks)
1. Design and implement SSTable format:
   - Block-based layout with restart points
   - Checksummed blocks
   - Index and metadata in footer
2. Build SSTable writer:
   - Convert MemTable to blocks
   - Generate sparse index
   - Write footer with checksums
3. Implement SSTable reader:
   - Block loading and validation
   - Binary search through index
   - Iterator interface
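
The index lookup in the reader can be illustrated with a small sketch. It assumes each sparse index entry stores the last key of a block along with the block's position; the `indexEntry` type and `findBlock` function are hypothetical names, and block loading and checksum validation are omitted.

```go
package main

import (
	"fmt"
	"sort"
)

// indexEntry describes one data block: the last key it holds and where the
// block lives in the file (hypothetical layout).
type indexEntry struct {
	lastKey string
	offset  int64
	length  int64
}

// findBlock returns the position of the only block that could contain key,
// or -1 when key sorts after every key in the table.
func findBlock(index []indexEntry, key string) int {
	i := sort.Search(len(index), func(i int) bool { return index[i].lastKey >= key })
	if i == len(index) {
		return -1
	}
	return i
}

func main() {
	index := []indexEntry{
		{lastKey: "d", offset: 0, length: 4096},
		{lastKey: "m", offset: 4096, length: 4096},
		{lastKey: "t", offset: 8192, length: 2048},
	}
	fmt.Println(findBlock(index, "e"), findBlock(index, "z"))
}
```

Because the index holds one entry per block rather than one per key, a point read needs a binary search over the index followed by a search inside a single block.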

### Phase D: Basic Engine Integration (1 week)
1. Implement Level 0 flush mechanism:
   - MemTable to SSTable conversion
   - File management and naming
2. Create read path that merges:
   - Current MemTable
   - Immutable MemTables awaiting flush
   - Level 0 SSTable files
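
A point lookup over these layers might look like the sketch below, which checks sources from newest to oldest and stops at the first one that knows the key. The `source` interface and the map-backed `layer` type are assumptions made for illustration; a tombstone in a newer layer hides any value in an older one.

```go
package main

import "fmt"

// source is any layer that can answer a point lookup (hypothetical interface).
type source interface {
	// get returns the value, whether the key is deleted in this layer,
	// and whether this layer has any entry for the key.
	get(key string) (value string, deleted, found bool)
}

// layer is a map-backed stand-in for a MemTable or SSTable.
// A nil value marks a tombstone.
type layer map[string]*string

func (l layer) get(key string) (string, bool, bool) {
	v, ok := l[key]
	if !ok {
		return "", false, false
	}
	if v == nil {
		return "", true, true
	}
	return *v, false, true
}

// lookup consults layers from newest to oldest. The first layer that has an
// entry for the key decides the result, including when it is a tombstone.
func lookup(layers []source, key string) (string, bool) {
	for _, l := range layers {
		if v, deleted, found := l.get(key); found {
			if deleted {
				return "", false
			}
			return v, true
		}
	}
	return "", false
}

func val(s string) *string { return &s }

func main() {
	active := layer{"a": val("new")}
	immutable := layer{"b": nil}
	level0 := layer{"a": val("old"), "b": val("stale"), "c": val("3")}
	layers := []source{active, immutable, level0}
	fmt.Println(lookup(layers, "a"))
	fmt.Println(lookup(layers, "b"))
}
```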

### Phase E: Compaction (2 weeks)
1. Implement a single, efficient compaction strategy:
   - Simple tiered compaction approach
2. Handle tombstones and key deletion
3. Manage file obsolescence and cleanup
4. Build background compaction scheduling
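
The core of a compaction pass, including tombstone handling, can be sketched as below. This version collects entries into a map for brevity, whereas a real pass would likely stream through a merged iterator; the `kv` type, the `compact` function, and the `dropTombstones` flag are assumptions. Tombstones are only safe to drop when no older data for the key can exist below the output.

```go
package main

import (
	"fmt"
	"sort"
)

type kv struct {
	key, value string
	tombstone  bool
}

// compact merges runs, ordered newest first, into one sorted run.
// For duplicate keys the entry from the newest run wins. Set dropTombstones
// only when the output is the oldest data in the store.
func compact(runs [][]kv, dropTombstones bool) []kv {
	newest := make(map[string]kv)
	for _, run := range runs {
		for _, e := range run {
			if _, seen := newest[e.key]; !seen {
				newest[e.key] = e
			}
		}
	}
	keys := make([]string, 0, len(newest))
	for k := range newest {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	out := make([]kv, 0, len(keys))
	for _, k := range keys {
		e := newest[k]
		if e.tombstone && dropTombstones {
			continue
		}
		out = append(out, e)
	}
	return out
}

func main() {
	runs := [][]kv{
		{{key: "a", value: "2"}, {key: "b", tombstone: true}},
		{{key: "a", value: "1"}, {key: "b", value: "1"}, {key: "c", value: "1"}},
	}
	fmt.Println(compact(runs, true))
}
```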

### Phase F: Basic Atomicity and Advanced Features (2-3 weeks)
1. Implement merged iterator across all levels
2. Implement SQLite-inspired reader-writer concurrency:
   - Reader-writer lock for basic isolation
   - WAL-based reads for active transactions
   - Support for read operations during writes
3. Implement simple atomic batch operations:
   - Support atomic multi-key writes
   - Ensure proper crash recovery for batch operations
   - Design interfaces that can be extended for full transactions
4. Add basic statistics and metrics
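
The merged iterator from step 1 is commonly built on a min-heap of per-source cursors. The sketch below uses `container/heap` from the standard library and assumes that source 0 is the newest; the `cursor`, `cursorHeap`, and `merge` names are hypothetical, and the result is returned as a slice rather than yielded lazily to keep the example short.

```go
package main

import (
	"container/heap"
	"fmt"
)

type kv struct{ key, value string }

// cursor walks one sorted run; src is its age rank (0 = newest).
type cursor struct {
	run []kv
	pos int
	src int
}

// cursorHeap orders cursors by current key, then by source age.
type cursorHeap []*cursor

func (h cursorHeap) Len() int { return len(h) }
func (h cursorHeap) Less(i, j int) bool {
	a, b := h[i].run[h[i].pos], h[j].run[h[j].pos]
	if a.key != b.key {
		return a.key < b.key
	}
	return h[i].src < h[j].src // newer source first for equal keys
}
func (h cursorHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h *cursorHeap) Push(x any)   { *h = append(*h, x.(*cursor)) }
func (h *cursorHeap) Pop() any {
	old := *h
	c := old[len(old)-1]
	*h = old[:len(old)-1]
	return c
}

// merge returns entries in key order. When several runs hold the same key,
// only the entry from the newest run is kept.
func merge(runs [][]kv) []kv {
	h := &cursorHeap{}
	for i, r := range runs {
		if len(r) > 0 {
			*h = append(*h, &cursor{run: r, src: i})
		}
	}
	heap.Init(h)

	var out []kv
	for h.Len() > 0 {
		c := (*h)[0]
		e := c.run[c.pos]
		if len(out) == 0 || out[len(out)-1].key != e.key {
			out = append(out, e)
		}
		c.pos++
		if c.pos < len(c.run) {
			heap.Fix(h, 0) // cursor advanced; restore heap order
		} else {
			heap.Pop(h) // cursor exhausted; remove it
		}
	}
	return out
}

func main() {
	runs := [][]kv{
		{{"a", "new"}, {"c", "3"}},
		{{"a", "old"}, {"b", "2"}},
	}
	fmt.Println(merge(runs))
}
```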

### Phase G: Optimization and Benchmarking (1 week)
1. Develop benchmark suite for:
   - Random vs sequential writes
   - Point reads vs range scans
   - Compaction overhead and pauses
2. Optimize critical paths based on profiling
3. Tune default configuration parameters

### Phase H: Optional Enhancements (as needed)
1. Add Bloom filters to reduce disk reads
2. Create monitoring hooks and detailed metrics
3. Add crash recovery testing

## Testing Strategy

1. **Unit Tests**: Test each component in isolation
2. **Integration Tests**: Exercise complete workflows end to end
3. **Property Tests**: Generate randomized operations and verify correctness
4. **Crash Tests**: Simulate crashes and verify recovery
5. **Benchmarks**: Measure performance across different workloads

## Implementation Notes

### Error Handling
- Use descriptive error types and wrap errors with context
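- Define a recovery path for each critical operation (WAL append, MemTable flush, compaction) so that a failure leaves the store in a consistent state
- Validate checksums whenever a WAL record or SSTable block is read from disk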

### Concurrency
- Implement a single-writer architecture for the main write path
- Allow concurrent readers through a reader-writer lock
- Use a SQLite-inspired WAL approach for reader-writer coordination:
  - Writers write to the WAL; readers can read from either the main database or the WAL
  - Guard shared state with the reader-writer lock
- Isolate transactions from one another through lock-based exclusion

### Batch Operation Management
- Use the WAL to make batch operations atomic and durable
- Use the reader-writer lock for transactional isolation
- Provide simple interfaces that can be extended to full transactions later
- Recover batch operations after a crash through WAL replay
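
One way to get all-or-nothing batches from WAL replay is to bracket each batch with begin and commit markers and to apply its operations only once the commit marker is seen. The sketch below assumes that scheme; the `walRecord` type, the marker names, and `recoverState` are hypothetical, and records are held in memory instead of being decoded from a file.

```go
package main

import "fmt"

type opType int

// Record kinds; begin and commit are assumed batch markers.
const (
	opBegin opType = iota
	opPut
	opDelete
	opCommit
)

type walRecord struct {
	op         opType
	key, value string
}

// recoverState replays the log. Operations inside a batch are buffered and
// applied only when the commit marker appears, so a batch cut short by a
// crash leaves no partial effects.
func recoverState(log []walRecord) map[string]string {
	state := make(map[string]string)
	apply := func(r walRecord) {
		switch r.op {
		case opPut:
			state[r.key] = r.value
		case opDelete:
			delete(state, r.key)
		}
	}

	var pending []walRecord
	inBatch := false
	for _, r := range log {
		switch r.op {
		case opBegin:
			inBatch, pending = true, pending[:0]
		case opCommit:
			for _, p := range pending {
				apply(p)
			}
			inBatch, pending = false, pending[:0]
		default:
			if inBatch {
				pending = append(pending, r)
			} else {
				apply(r)
			}
		}
	}
	return state // operations after an unmatched begin are discarded
}

func main() {
	log := []walRecord{
		{op: opPut, key: "a", value: "1"},
		{op: opBegin},
		{op: opPut, key: "b", value: "2"},
		{op: opDelete, key: "a"},
		{op: opCommit},
		{op: opBegin},
		{op: opPut, key: "c", value: "3"}, // crash before commit
	}
	fmt.Println(recoverState(log))
}
```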

### Go Idioms
- Follow standard Go project layout
- Use interfaces for component boundaries
- Rely on Go's GC, but reuse large buffers (such as block and MemTable allocations) instead of reallocating them
- Use context for cancellation where appropriate