# Go Storage: A Minimalist LSM Storage Engine

## Vision

Build a clean, composable, and educational storage engine in Go that follows Log-Structured Merge Tree (LSM) principles, focusing on simplicity while providing the building blocks needed for higher-level database implementations.

## Goals

### 1. Extreme Simplicity
- Create minimal but complete primitives that can support various database paradigms (KV, relational, graph)
- Prioritize readability and educational value over hyper-optimization
- Use idiomatic Go with clear interfaces and documentation
- Implement a single-writer architecture for simplicity and reduced concurrency complexity

### 2. Durability + Performance
- Implement the LSM architecture pattern: Write-Ahead Log → MemTable → SSTables
- Provide configurable durability guarantees (fsync on every write vs. batched fsync)
- Optimize for both point lookups and range scans
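The write path and the two durability modes listed above can be sketched in a few dozen lines. This is a minimal in-memory illustration, not this project's API: `Engine`, `Log`, `SyncMode`, `memLog`, the `key=value` record format, and the flush threshold are all hypothetical names and choices made for the example. A real engine would write the log and the sorted runs to disk.

```go
package main

import (
	"fmt"
	"sort"
)

// SyncMode selects how often the log is forced to stable storage.
type SyncMode int

const (
	SyncEveryWrite SyncMode = iota // sync after each record
	SyncBatched                    // sync once every batchSize records
)

// Log is the minimal write-ahead log surface this sketch needs (hypothetical).
type Log interface {
	Append(record []byte) error
	Sync() error
}

// memLog is an in-memory stand-in for a log file; it records and counts calls.
type memLog struct {
	records [][]byte
	syncs   int
}

func (l *memLog) Append(rec []byte) error { l.records = append(l.records, rec); return nil }
func (l *memLog) Sync() error             { l.syncs++; return nil }

type entry struct{ key, value string }

// Engine wires the three stages together: log, MemTable, sorted runs.
type Engine struct {
	log       Log
	mode      SyncMode
	batchSize int // records per sync in SyncBatched mode
	unsynced  int
	memLimit  int // flush the MemTable once it holds this many keys
	mem       map[string]string
	tables    [][]entry // oldest first; each run is sorted by key and never modified
}

func NewEngine(log Log, mode SyncMode, batchSize, memLimit int) *Engine {
	return &Engine{log: log, mode: mode, batchSize: batchSize, memLimit: memLimit, mem: map[string]string{}}
}

// Put appends to the log first, then applies the write to the MemTable.
func (e *Engine) Put(key, value string) error {
	// Placeholder record format; a real log would use a framed, checksummed encoding.
	if err := e.log.Append([]byte(key + "=" + value)); err != nil {
		return fmt.Errorf("wal append: %w", err)
	}
	e.unsynced++
	if e.mode == SyncEveryWrite || e.unsynced >= e.batchSize {
		if err := e.log.Sync(); err != nil {
			return fmt.Errorf("wal sync: %w", err)
		}
		e.unsynced = 0
	}
	e.mem[key] = value
	if len(e.mem) >= e.memLimit {
		e.flush()
	}
	return nil
}

// flush freezes the MemTable into a sorted run, standing in for an SSTable.
func (e *Engine) flush() {
	run := make([]entry, 0, len(e.mem))
	for k, v := range e.mem {
		run = append(run, entry{k, v})
	}
	sort.Slice(run, func(i, j int) bool { return run[i].key < run[j].key })
	e.tables = append(e.tables, run)
	e.mem = map[string]string{}
}

// Get checks the MemTable, then the runs from newest to oldest.
func (e *Engine) Get(key string) (string, bool) {
	if v, ok := e.mem[key]; ok {
		return v, true
	}
	for i := len(e.tables) - 1; i >= 0; i-- {
		run := e.tables[i]
		j := sort.Search(len(run), func(j int) bool { return run[j].key >= key })
		if j < len(run) && run[j].key == key {
			return run[j].value, true
		}
	}
	return "", false
}
```

In batched mode, records appended since the last sync can be lost in a crash, which is the trade-off a configurable durability setting exposes. Sorted runs also keep both access patterns cheap: point lookups use binary search, and range scans read keys in order.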

### 3. Configurability
- Store all configuration parameters in a versioned, persistent manifest
- Allow tuning of memory usage, compaction behavior, and durability settings
- Support reproducible startup states across restarts

### 4. Composable Primitives
- Design clean interfaces for fundamental operations (reads, writes, snapshots, iteration)
- Enable building higher-level abstractions (SQL, Gremlin, custom query languages)
- Support both transactional and analytical workloads
- Provide simple atomic write primitives that can be built upon:
  - Leverage read snapshots from the immutable LSM structure
  - Support basic atomic batch operations
  - Ensure crash recovery through proper WAL handling
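To make the relationship between atomic batches and read snapshots concrete, here is a toy in-memory sketch. It assumes a single writer and swaps in a fresh map on every commit to imitate the immutability of LSM structures. The `Store`, `Batch`, and `Snapshot` types are hypothetical names for this example, and the code is not safe for concurrent use as written.

```go
package main

import "sort"

type op struct {
	key, value string
	tombstone  bool // true marks a delete
}

// Batch collects writes that must become visible together.
type Batch struct{ ops []op }

func (b *Batch) Put(key, value string) { b.ops = append(b.ops, op{key: key, value: value}) }
func (b *Batch) Delete(key string)     { b.ops = append(b.ops, op{key: key, tombstone: true}) }

// Snapshot is a read-only view that never observes later commits.
type Snapshot struct{ data map[string]string }

func (s Snapshot) Get(key string) (string, bool) {
	v, ok := s.data[key]
	return v, ok
}

// Scan returns the keys in [start, end) in sorted order.
func (s Snapshot) Scan(start, end string) []string {
	var keys []string
	for k := range s.data {
		if k >= start && k < end {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys
}

// Store is a toy single-writer store. Each Apply builds a new map and swaps it
// in, so snapshots taken earlier keep pointing at the old, untouched map.
type Store struct{ current map[string]string }

func NewStore() *Store { return &Store{current: map[string]string{}} }

// Snapshot captures the current state without copying it.
func (s *Store) Snapshot() Snapshot { return Snapshot{data: s.current} }

// Apply makes every operation in b visible in a single step.
func (s *Store) Apply(b *Batch) {
	next := make(map[string]string, len(s.current)+len(b.ops))
	for k, v := range s.current {
		next[k] = v
	}
	for _, o := range b.ops {
		if o.tombstone {
			delete(next, o.key)
		} else {
			next[o.key] = o.value
		}
	}
	s.current = next
}
```

Copying the whole map per commit is far too expensive for real use. It is only a stand-in for the way immutable on-disk tables let a snapshot keep reading old data while new writes land elsewhere.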

## Target Use Cases

1. **Educational Tool**: Learn and teach storage engine internals
2. **Embedded Storage**: Applications needing local, durable storage with predictable performance
3. **Prototype Foundation**: Base layer for experimenting with novel database designs
4. **Go Ecosystem Component**: Reusable storage layer for Go applications and services

## Non-Goals

1. **Feature Parity with Production Engines**: Not trying to compete with RocksDB, LevelDB, etc.
2. **Multi-Node Distribution**: Focusing on single-node operation
3. **Complex Query Planning**: Leaving higher-level query features to layers built on top

## Success Criteria

1. **Correctness**: Data is never lost or corrupted, even during crashes
2. **Understandability**: Code is clear enough to serve as an educational reference
3. **Performance**: Reasonable throughput and latency for common operations
4. **Extensibility**: Can be built upon to create specialized database engines