DNArchive simulates DNA-inspired archival storage. It encodes binary data into constrained DNA base sequences and decodes them back into the original data with integrity verification.
The project emphasizes correctness, modular design, and extensibility rather than biological accuracy.
- Bit-to-DNA base mapping (2-bit bijection)
- Byte-level DNA encoding and decoding
- One’s-complement checksum for data integrity
- DNA sequence constraint validation
- Versioned packet format abstraction
- Runtime CLI sanity checks
- Unit tests using pytest
- Explore DNA-based data encoding concepts
- Practice systems-style design (formats, packets, validation)
- Build a clean, testable, extensible codebase
- Emphasize correctness and separation of concerns
Core encoding, decoding, packet formatting, and integrity checks are implemented and tested.
The project is stable at the component level, with planned expansion for error modeling, redundancy, and higher-level simulations.
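As an illustration of the encode/decode path described above, here is a minimal sketch of a 2-bit bijection and byte-level round trip. The mapping table (`00→A`, `01→C`, `10→G`, `11→T`), the most-significant-pair-first ordering, and the names `encode_bytes` and `decode_bases` are assumptions made for this example; the project's canonical mapping and function names may differ.

```python
# Hypothetical 2-bit mapping; the project's canonical table may differ.
BITS_TO_BASE = {0b00: "A", 0b01: "C", 0b10: "G", 0b11: "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_bytes(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BITS_TO_BASE[(byte >> shift) & 0b11])
    return "".join(bases)


def decode_bases(seq: str) -> bytes:
    """Decode a base sequence back into bytes (inverse of encode_bytes)."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4 bases")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            # An unknown base raises KeyError here.
            byte = (byte << 2) | BASE_TO_BITS[base]
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    payload = b"hello"
    sequence = encode_bytes(payload)
    print(sequence)
    print(decode_bases(sequence) == payload)
```

Because the mapping is a bijection on 2-bit values, every byte string has exactly one base sequence of four times its length, and decoding recovers the input exactly.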
- archive.codec.mapping: canonical bit/base mapping
- archive.codec.encoder: bytes to DNA encoding
- archive.codec.decoder: DNA to bytes decoding
- archive.codec.constraints: sequence validation
- archive.codec.checksum: integrity computation and verification
- archive.codec.format: packet structure and versioning
- archive.cli: runtime demonstrations and sanity checks
- tests/: unit tests mirroring implementation modules
- scripts/: development helper scripts
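To make the roles of the checksum and constraints modules more concrete, the sketch below shows one way such checks could look. Everything in it is an assumption for illustration: the 8-bit checksum width, the homopolymer limit of 3, the 40-60% GC window, and all function names are hypothetical and are not taken from the project's code.

```python
def ones_complement_checksum(data: bytes) -> int:
    """8-bit one's-complement checksum (width is an assumption)."""
    total = 0
    for byte in data:
        total += byte
        # End-around carry: fold any overflow back into the low 8 bits.
        total = (total & 0xFF) + (total >> 8)
    return ~total & 0xFF


def verify_checksum(data: bytes, checksum: int) -> bool:
    """Return True if the checksum matches the data."""
    return ones_complement_checksum(data) == checksum


def max_homopolymer_run(seq: str) -> int:
    """Length of the longest run of a single repeated base."""
    longest = current = 0
    previous = None
    for base in seq:
        current = current + 1 if base == previous else 1
        longest = max(longest, current)
        previous = base
    return longest


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def satisfies_constraints(seq: str, max_run: int = 3,
                          gc_min: float = 0.4, gc_max: float = 0.6) -> bool:
    """Check alphabet, run length, and GC content (hypothetical limits)."""
    if any(base not in "ACGT" for base in seq):
        return False
    return (max_homopolymer_run(seq) <= max_run
            and gc_min <= gc_fraction(seq) <= gc_max)


if __name__ == "__main__":
    data = b"\x01\x02"
    print(hex(ones_complement_checksum(data)))
    print(satisfies_constraints("ACGTACGT"))
    print(satisfies_constraints("AAAAACGT"))
```

Keeping the checksum and the constraint check as separate pure functions mirrors the separation of concerns in the module list: either can be unit-tested or swapped out without touching the encoder or decoder.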
./scripts/run_tests.sh
./scripts/run_checks.sh