# Implementation Plan for Go Storage Engine

## Architecture Overview

```
┌─────────────┐     ┌─────────────┐     ┌─────────────────────────┐
│ Client API  │────▶│  MemTable   │────▶│ Immutable SSTable Files │
└─────────────┘     └─────────────┘     └─────────────────────────┘
       │                   ▲                         ▲
       │                   │                         │
       ▼                   │                         │
┌─────────────┐            │            ┌─────────────────────────┐
│   Write-    │────────────┘            │  Background Compaction  │
│  Ahead Log  │                         │         Process         │
└─────────────┘                         └─────────────────────────┘
       │                                             │
       │                                             │
       ▼                                             ▼
┌─────────────────────────────────────────────────────────────────┐
│                       Persistent Storage                        │
└─────────────────────────────────────────────────────────────────┘
```

## Package Structure

```
go-storage/
├── cmd/
│   └── storage-bench/       # Benchmarking tool
│
├── pkg/
│   ├── config/              # Configuration and manifest
│   ├── wal/                 # Write-ahead logging with transaction markers
│   ├── memtable/            # In-memory table implementation
│   ├── sstable/             # SSTable read/write
│   │   ├── block/           # Block format implementation
│   │   └── footer/          # File footer and metadata
│   ├── compaction/          # Compaction strategies
│   ├── iterator/            # Merged iterator implementation
│   ├── transaction/         # Transaction management with Snapshot + WAL
│   │   ├── snapshot/        # Read snapshot implementation
│   │   └── txbuffer/        # Transaction write buffer
│   └── engine/              # Main engine implementation with single-writer architecture
│
└── internal/
    ├── checksum/            # Checksum utilities (xxHash64)
    └── utils/               # Shared internal utilities
```

## Development Phases

### Phase A: Foundation (1-2 weeks)

1. Set up project structure and Go module
2. Implement config package with serialization/deserialization
3. Build basic WAL with:
   - Append operations (Put/Delete)
   - Replay functionality
   - Configurable fsync modes
4. Write comprehensive tests for WAL durability

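The WAL append and replay steps could be sketched roughly as follows. This is a minimal illustration, not the planned format: the record layout (`crc32 | length | op | keyLen | key | value`), the names `Record`, `appendRecord`, `replay`, and `roundTrip`, and the use of the standard library's CRC-32 in place of xxHash64 are all assumptions made to keep the example self-contained. Fsync modes are left out.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"hash/crc32"
	"io"
)

// Hypothetical operation codes; the plan only names Put and Delete.
const (
	opPut    byte = 1
	opDelete byte = 2
)

// Record is one logical WAL entry. Value is empty for deletes.
type Record struct {
	Op    byte
	Key   []byte
	Value []byte
}

var errCorrupt = errors.New("wal: corrupt record")

// appendRecord writes crc32(payload) | len(payload) | payload,
// where payload = op | keyLen | key | value.
func appendRecord(w io.Writer, r Record) error {
	payload := make([]byte, 5+len(r.Key)+len(r.Value))
	payload[0] = r.Op
	binary.LittleEndian.PutUint32(payload[1:5], uint32(len(r.Key)))
	copy(payload[5:], r.Key)
	copy(payload[5+len(r.Key):], r.Value)

	var hdr [8]byte
	binary.LittleEndian.PutUint32(hdr[0:4], crc32.ChecksumIEEE(payload))
	binary.LittleEndian.PutUint32(hdr[4:8], uint32(len(payload)))
	if _, err := w.Write(hdr[:]); err != nil {
		return err
	}
	_, err := w.Write(payload)
	return err
}

func isEOF(err error) bool {
	return err == io.EOF || err == io.ErrUnexpectedEOF
}

// replay calls apply for every intact record. A truncated tail (for example
// a crash mid-append) ends replay quietly; a checksum mismatch is an error.
func replay(r io.Reader, apply func(Record)) error {
	for {
		var hdr [8]byte
		if _, err := io.ReadFull(r, hdr[:]); err != nil {
			if isEOF(err) {
				return nil
			}
			return err
		}
		payload := make([]byte, binary.LittleEndian.Uint32(hdr[4:8]))
		if _, err := io.ReadFull(r, payload); err != nil {
			if isEOF(err) {
				return nil
			}
			return err
		}
		if len(payload) < 5 || crc32.ChecksumIEEE(payload) != binary.LittleEndian.Uint32(hdr[0:4]) {
			return errCorrupt
		}
		klen := binary.LittleEndian.Uint32(payload[1:5])
		if uint32(len(payload)-5) < klen {
			return errCorrupt
		}
		apply(Record{Op: payload[0], Key: payload[5 : 5+klen], Value: payload[5+klen:]})
	}
}

// roundTrip appends recs to an in-memory log and replays them back.
func roundTrip(recs []Record) ([]Record, error) {
	var buf bytes.Buffer
	for _, r := range recs {
		if err := appendRecord(&buf, r); err != nil {
			return nil, err
		}
	}
	var out []Record
	err := replay(&buf, func(r Record) { out = append(out, r) })
	return out, err
}
```

Treating a short read at the tail as a clean end of log is one possible policy; a stricter engine might report it so the caller can truncate the file explicitly.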
### Phase B: In-Memory Layer (1 week)

1. Implement MemTable with:
   - Skip list data structure
   - Sorted key iteration
   - Size tracking for flush threshold
2. Connect WAL replay to MemTable restore
3. Test concurrent read/write scenarios

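As an illustration of the MemTable items above, here is a small single-threaded skip list with sorted iteration and byte-size tracking. The names (`MemTable`, `NewMemTable`, `ShouldFlush`), the string keys and values, the fixed `maxLevel`, and the seeded random source are assumptions for the sketch; it has no locking, so it does not address the concurrent scenarios in step 3.

```go
package main

import "math/rand"

const maxLevel = 12 // assumed cap on tower height

type node struct {
	key, value string
	next       []*node // next[i] is the successor at level i
}

// MemTable is a skip list ordered by key.
type MemTable struct {
	head  *node
	level int // number of levels currently in use
	size  int // bytes of keys plus values, for the flush threshold
	rng   *rand.Rand
}

func NewMemTable(seed int64) *MemTable {
	return &MemTable{
		head:  &node{next: make([]*node, maxLevel)},
		level: 1,
		rng:   rand.New(rand.NewSource(seed)),
	}
}

// randomLevel grows a tower with probability 1/2 per level.
func (m *MemTable) randomLevel() int {
	lvl := 1
	for lvl < maxLevel && m.rng.Intn(2) == 0 {
		lvl++
	}
	return lvl
}

// Put inserts key or overwrites its value.
func (m *MemTable) Put(key, value string) {
	update := make([]*node, maxLevel)
	x := m.head
	for i := m.level - 1; i >= 0; i-- {
		for x.next[i] != nil && x.next[i].key < key {
			x = x.next[i]
		}
		update[i] = x // last node before key at level i
	}
	if n := x.next[0]; n != nil && n.key == key {
		m.size += len(value) - len(n.value)
		n.value = value
		return
	}
	lvl := m.randomLevel()
	for i := m.level; i < lvl; i++ {
		update[i] = m.head
	}
	if lvl > m.level {
		m.level = lvl
	}
	n := &node{key: key, value: value, next: make([]*node, lvl)}
	for i := 0; i < lvl; i++ {
		n.next[i] = update[i].next[i]
		update[i].next[i] = n
	}
	m.size += len(key) + len(value)
}

// Get returns the value stored for key, if any.
func (m *MemTable) Get(key string) (string, bool) {
	x := m.head
	for i := m.level - 1; i >= 0; i-- {
		for x.next[i] != nil && x.next[i].key < key {
			x = x.next[i]
		}
	}
	if n := x.next[0]; n != nil && n.key == key {
		return n.value, true
	}
	return "", false
}

// Keys walks level 0, which links every node in sorted order.
func (m *MemTable) Keys() []string {
	var out []string
	for n := m.head.next[0]; n != nil; n = n.next[0] {
		out = append(out, n.key)
	}
	return out
}

// ShouldFlush reports whether the tracked size reached the threshold.
func (m *MemTable) ShouldFlush(threshold int) bool { return m.size >= threshold }
```

A caller would check `ShouldFlush` after each `Put` and, once it returns true, freeze the table and start a new one.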
### Phase C: Persistent Storage (2 weeks)

1. Design and implement SSTable format:
   - Block-based layout with restart points
   - Checksummed blocks
   - Index and metadata in footer
2. Build SSTable writer:
   - Convert MemTable to blocks
   - Generate sparse index
   - Write footer with checksums
3. Implement SSTable reader:
   - Block loading and validation
   - Binary search through index
   - Iterator interface

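Two of the reader steps above, block validation and binary search through the sparse index, might look like the sketch below. The index layout (one entry per block holding that block's last key), the 4-byte checksum trailer, and the names `indexEntry`, `findBlock`, `sealBlock`, and `openBlock` are assumptions; CRC-32 from the standard library stands in for xxHash64.

```go
package main

import (
	"encoding/binary"
	"errors"
	"hash/crc32"
	"sort"
)

// indexEntry locates one data block and records the last key it holds.
type indexEntry struct {
	LastKey string
	Offset  int64
	Length  int64
}

// findBlock returns the position of the first block whose last key is >= key.
// That is the only block that can contain key. ok is false when key sorts
// after every block in the file.
func findBlock(index []indexEntry, key string) (pos int, ok bool) {
	pos = sort.Search(len(index), func(i int) bool { return index[i].LastKey >= key })
	return pos, pos < len(index)
}

var errBadBlock = errors.New("sstable: block checksum mismatch")

// sealBlock appends a little-endian CRC-32 of data as a 4-byte trailer.
func sealBlock(data []byte) []byte {
	out := make([]byte, len(data)+4)
	copy(out, data)
	binary.LittleEndian.PutUint32(out[len(data):], crc32.ChecksumIEEE(data))
	return out
}

// openBlock verifies the trailer and returns the block contents.
func openBlock(raw []byte) ([]byte, error) {
	if len(raw) < 4 {
		return nil, errBadBlock
	}
	data := raw[:len(raw)-4]
	if crc32.ChecksumIEEE(data) != binary.LittleEndian.Uint32(raw[len(raw)-4:]) {
		return nil, errBadBlock
	}
	return data, nil
}
```

Indexing by last key rather than first key lets a single `sort.Search` pick the candidate block without a follow-up comparison against the next entry.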
### Phase D: Basic Engine Integration (1 week)

1. Implement Level 0 flush mechanism:
   - MemTable to SSTable conversion
   - File management and naming
2. Create read path that merges:
   - Current MemTable
   - Immutable MemTables awaiting flush
   - Level 0 SSTable files

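For point lookups, the merged read path above amounts to consulting sources from newest to oldest and stopping at the first hit. A minimal sketch, assuming a hypothetical `table` interface and a toy `mapTable` in place of real MemTables and SSTable readers:

```go
package main

// entry is either a live value or a tombstone.
type entry struct {
	value   string
	deleted bool
}

// table is anything that can answer a point lookup.
type table interface {
	get(key string) (entry, bool)
}

// mapTable is a toy stand-in for a MemTable or an SSTable reader.
type mapTable map[string]entry

func (t mapTable) get(key string) (entry, bool) {
	e, ok := t[key]
	return e, ok
}

// readPath lists every source, each group ordered newest first.
type readPath struct {
	active    table
	immutable []table
	level0    []table
}

// Get returns the newest version of key. A tombstone in a newer source
// hides any older value.
func (r readPath) Get(key string) (string, bool) {
	sources := append([]table{r.active}, r.immutable...)
	sources = append(sources, r.level0...)
	for _, t := range sources {
		if e, ok := t.get(key); ok {
			if e.deleted {
				return "", false
			}
			return e.value, true
		}
	}
	return "", false
}
```

Level 0 files can overlap in key range, which is why every one of them has to be checked in recency order.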
### Phase E: Compaction (2 weeks)

1. Implement a single, efficient compaction strategy:
   - Simple tiered compaction approach
2. Handle tombstones and key deletion
3. Manage file obsolescence and cleanup
4. Build background compaction scheduling

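The core of a compaction step is a newest-wins merge that may also drop tombstones. The sketch below does this in memory with a map for brevity; a real implementation would more likely stream through a merged iterator. The `kv` type, the `compact` function, and the `dropTombstones` flag are assumptions for illustration.

```go
package main

import "sort"

// kv is one key with either a value or a deletion marker.
type kv struct {
	Key     string
	Value   string
	Deleted bool
}

// compact merges runs into a single sorted run. runs must be ordered newest
// first. For each key only the newest version survives. Tombstones are
// dropped only when dropTombstones is set, which is safe only if no older
// data for these keys exists outside the inputs.
func compact(runs [][]kv, dropTombstones bool) []kv {
	newest := make(map[string]kv)
	for i := len(runs) - 1; i >= 0; i-- { // oldest first, so newer runs overwrite
		for _, e := range runs[i] {
			newest[e.Key] = e
		}
	}
	out := make([]kv, 0, len(newest))
	for _, e := range newest {
		if e.Deleted && dropTombstones {
			continue
		}
		out = append(out, e)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Key < out[j].Key })
	return out
}
```

Keeping tombstones when older files remain outside the compaction is what prevents a deleted key from reappearing on a later read.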
### Phase F: Basic Atomicity and Advanced Features (2-3 weeks)

1. Implement merged iterator across all levels
2. Add snapshot capability for reads:
   - Point-in-time view of the database
   - Consistent reads across MemTable and SSTables
3. Implement simple atomic batch operations:
   - Support atomic multi-key writes
   - Ensure proper crash recovery for batch operations
   - Design interfaces that can be extended for full transactions
4. Add basic statistics and metrics

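One way to get crash-safe atomic batches is to bracket them with begin and commit markers in the WAL and apply a batch during recovery only if its commit marker is present. The sketch below shows that recovery rule over an in-memory slice of entries; the marker kinds and the names `logEntry` and `recoverLog` are assumptions, not the planned record format.

```go
package main

type opKind int

// Hypothetical entry kinds, including the two batch markers.
const (
	kindPut opKind = iota
	kindDelete
	kindBatchBegin
	kindBatchCommit
)

type logEntry struct {
	Kind       opKind
	Key, Value string
}

// recoverLog rebuilds state from a log. Entries between a begin and a commit
// marker are buffered and applied together at commit. A batch with no commit
// marker (a crash mid-batch) is discarded as a whole.
func recoverLog(entries []logEntry) map[string]string {
	state := make(map[string]string)
	var pending []logEntry
	inBatch := false

	apply := func(e logEntry) {
		switch e.Kind {
		case kindPut:
			state[e.Key] = e.Value
		case kindDelete:
			delete(state, e.Key)
		}
	}

	for _, e := range entries {
		switch e.Kind {
		case kindBatchBegin:
			inBatch, pending = true, pending[:0]
		case kindBatchCommit:
			for _, p := range pending {
				apply(p)
			}
			inBatch, pending = false, pending[:0]
		default:
			if inBatch {
				pending = append(pending, e)
			} else {
				apply(e)
			}
		}
	}
	return state
}
```

For example, a log of `put a`, `begin`, `put b`, `delete a`, `commit`, `begin`, `put c` recovers to a state containing only `b`, because the second batch never committed.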
### Phase G: Optimization and Benchmarking (1 week)

1. Develop benchmark suite for:
   - Random vs sequential writes
   - Point reads vs range scans
   - Compaction overhead and pauses
2. Optimize critical paths based on profiling
3. Tune default configuration parameters

### Phase H: Optional Enhancements (as needed)

1. Add Bloom filters to reduce disk reads
2. Create monitoring hooks and detailed metrics
3. Add crash recovery testing

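A Bloom filter lets the reader skip an SSTable when a key is definitely absent. Below is a minimal sketch using double hashing over FNV-1a from the standard library; the type name `Bloom`, the constructor parameters, and the choice of hash are assumptions, and sizing the filter for a target false-positive rate is left out.

```go
package main

import "hash/fnv"

// Bloom is a fixed-size bit set probed k times per key.
type Bloom struct {
	bits []uint64
	m    uint64 // number of bits
	k    uint64 // number of probes per key
}

// NewBloom rounds mBits up to a multiple of 64 (minimum 64).
func NewBloom(mBits, k uint64) *Bloom {
	words := (mBits + 63) / 64
	if words == 0 {
		words = 1
	}
	return &Bloom{bits: make([]uint64, words), m: words * 64, k: k}
}

// hashes derives two 64-bit hashes from one FNV-1a stream. The second is
// forced odd so that successive probes do not collapse to one position.
func hashes(key []byte) (uint64, uint64) {
	h := fnv.New64a()
	h.Write(key)
	h1 := h.Sum64()
	h.Write([]byte{0xff})
	h2 := h.Sum64() | 1
	return h1, h2
}

// Add sets the k probe bits for key.
func (b *Bloom) Add(key []byte) {
	h1, h2 := hashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		b.bits[pos/64] |= uint64(1) << (pos % 64)
	}
}

// MayContain returns false only if key was never added. A true result can
// be a false positive.
func (b *Bloom) MayContain(key []byte) bool {
	h1, h2 := hashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		if b.bits[pos/64]&(uint64(1)<<(pos%64)) == 0 {
			return false
		}
	}
	return true
}
```

In the read path, a `false` from `MayContain` means the file can be skipped without loading any block; a `true` still requires the normal index lookup.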
## Testing Strategy

1. **Unit Tests**: Each component thoroughly tested in isolation
2. **Integration Tests**: End-to-end tests for complete workflows
3. **Property Tests**: Generate randomized operations and verify correctness
4. **Crash Tests**: Simulate crashes and verify recovery
5. **Benchmarks**: Measure performance across different workloads

## Implementation Notes

### Error Handling

- Use descriptive error types and wrap errors with context
- Implement recovery mechanisms for all critical operations
- Validate checksums at every read opportunity

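A descriptive error type combined with wrapping could look like the sketch below. It pairs a sentinel error with a struct that records where the failure happened, so callers can test for the condition with `errors.Is` and recover the details with `errors.As`. The names `ErrChecksumMismatch`, `BlockError`, `checkBlock`, `loadBlock`, and `corruptOffset` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrChecksumMismatch is a sentinel callers can test with errors.Is.
var ErrChecksumMismatch = errors.New("checksum mismatch")

// BlockError adds location context to an underlying error.
type BlockError struct {
	File   string
	Offset int64
	Err    error
}

func (e *BlockError) Error() string {
	return fmt.Sprintf("%s: block at offset %d: %v", e.File, e.Offset, e.Err)
}

// Unwrap lets errors.Is and errors.As see through BlockError.
func (e *BlockError) Unwrap() error { return e.Err }

// checkBlock compares a computed checksum against the stored one.
func checkBlock(file string, offset int64, got, want uint32) error {
	if got != want {
		return &BlockError{File: file, Offset: offset, Err: ErrChecksumMismatch}
	}
	return nil
}

// loadBlock wraps lower-level failures with the operation that hit them.
func loadBlock(file string, offset int64, got, want uint32) error {
	if err := checkBlock(file, offset, got, want); err != nil {
		return fmt.Errorf("load block: %w", err)
	}
	return nil
}

// corruptOffset reports the offset of a corrupt block anywhere in err's chain.
func corruptOffset(err error) (int64, bool) {
	var be *BlockError
	if errors.As(err, &be) && errors.Is(err, ErrChecksumMismatch) {
		return be.Offset, true
	}
	return 0, false
}
```

With a made-up file name, `loadBlock("000001.sst", 4096, 1, 2)` yields the message `load block: 000001.sst: block at offset 4096: checksum mismatch`, and `corruptOffset` on that error returns `4096, true`.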
### Concurrency

- Implement single-writer architecture for the main write path
- Allow concurrent readers (snapshots) to proceed without blocking
- Use appropriate synchronization for reader-writer coordination
- Ensure proper isolation between transactions

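A single-writer design can be sketched with one goroutine that owns all mutations and a channel that serializes write requests. In this simplified version readers share a `sync.RWMutex` with the writer, so they are only held up for the short moment a write is applied; the snapshot-based reads in the plan would aim to avoid even that. `Store`, `writeReq`, and the method names are assumptions.

```go
package main

import "sync"

// writeReq is one queued mutation plus a channel closed once it is applied.
type writeReq struct {
	key, value string
	done       chan struct{}
}

// Store serializes writes through a single goroutine.
type Store struct {
	mu   sync.RWMutex
	data map[string]string
	reqs chan writeReq
	wg   sync.WaitGroup
}

func NewStore() *Store {
	s := &Store{data: make(map[string]string), reqs: make(chan writeReq)}
	s.wg.Add(1)
	go s.writeLoop()
	return s
}

// writeLoop is the only goroutine that mutates data.
func (s *Store) writeLoop() {
	defer s.wg.Done()
	for req := range s.reqs {
		s.mu.Lock()
		s.data[req.key] = req.value
		s.mu.Unlock()
		close(req.done)
	}
}

// Put hands the write to the writer goroutine and waits until it is applied.
func (s *Store) Put(key, value string) {
	req := writeReq{key: key, value: value, done: make(chan struct{})}
	s.reqs <- req
	<-req.done
}

// Get may run concurrently with other readers.
func (s *Store) Get(key string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

// Close stops the writer. Put must not be called after Close.
func (s *Store) Close() {
	close(s.reqs)
	s.wg.Wait()
}
```

Funneling writes through one goroutine also gives a natural place to append to the WAL before touching the MemTable, since ordering is decided in exactly one spot.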
### Batch Operation Management

- Use WAL for atomic batch operation durability
- Leverage LSM's natural versioning for snapshots
- Provide simple interfaces that can be built upon for transactions
- Ensure proper crash recovery for batch operations

### Go Idioms

- Follow standard Go project layout
- Use interfaces for component boundaries
- Rely on Go's GC but manage large memory allocations carefully
- Use context for cancellation where appropriate
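
To make the interface and context points concrete, here is a sketch of a background loop that depends only on a small interface and stops when its context is cancelled. The `Compactor` interface, `runCompaction`, and the `countingCompactor` test double are hypothetical names, not part of the plan.

```go
package main

import (
	"context"
	"time"
)

// Compactor is a hypothetical component boundary: the scheduler needs only
// this one method, not a concrete compaction implementation.
type Compactor interface {
	CompactOnce(ctx context.Context) error
}

// runCompaction calls c.CompactOnce once per interval until ctx is cancelled
// or a round fails. It returns the number of completed rounds.
func runCompaction(ctx context.Context, c Compactor, interval time.Duration) (int, error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	rounds := 0
	for {
		// Check cancellation first so a pending tick cannot win the select.
		if err := ctx.Err(); err != nil {
			return rounds, err
		}
		select {
		case <-ctx.Done():
			return rounds, ctx.Err()
		case <-ticker.C:
			if err := c.CompactOnce(ctx); err != nil {
				return rounds, err
			}
			rounds++
		}
	}
}

// countingCompactor is a test double that cancels the context after limit calls.
type countingCompactor struct {
	calls  int
	limit  int
	cancel context.CancelFunc
}

func (c *countingCompactor) CompactOnce(ctx context.Context) error {
	c.calls++
	if c.calls == c.limit {
		c.cancel()
	}
	return nil
}

// runDemo runs the loop with a 1ms interval until the double cancels it.
func runDemo(limit int) (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	return runCompaction(ctx, &countingCompactor{limit: limit, cancel: cancel}, time.Millisecond)
}
```

Because the loop accepts any `Compactor`, tests can drive it with a double like `countingCompactor` while the engine passes its real strategy.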