A pipeline toolkit for processing, aligning, and annotating ancient DNA (aDNA) sequences from Pleistocene and Holocene specimens. It covers damage correction, sub-1× assembly, and haplotype calling, is built for degraded fossil material, and focuses on Canis lineages and North American megafauna.
Ancient specimens present every challenge modern pipelines weren't designed for. Lacuna handles each one at the source rather than working around it downstream.
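C-to-T misincorporations at fragment ends (mirrored as G-to-A at the opposite end in double-stranded libraries) are the defining signature of aDNA. Standard callers treat this as real variation and call thousands of false SNPs. Lacuna models the damage pattern per sample and corrects for it on the aligned reads, before any variants are called.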
mapDamage2 · per-sample model
Pleistocene specimens routinely yield reads of 40–80 bp. Adapter trimming and alignment parameters are tuned for these size distributions, not modern sequencing assumptions.
AdapterRemoval2 · BWA-MEM
Most fossil samples can't be sequenced deeply. ANGSD's genotype likelihood methods are statistically valid at 0.1–0.5× and don't require minimum depth cutoffs that would discard the majority of the data.
ANGSD · GL model
mtDNA is abundant relative to nuclear DNA in ancient samples. Lacuna assigns haplogroups against reference panels built for Canis lineages and the active megafauna programs at Rewild Genomics.
mtDNA · haplogroup DB
AdapterRemoval2 trims adapters and collapses paired-end reads. Minimum length and quality thresholds are set for ancient fragment size distributions.
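As a rough illustration of the trimming step, the sketch below assembles an AdapterRemoval v2 command line in Python. The flags are standard AdapterRemoval options, but the function name, the file names, and the threshold values (a 25 bp minimum length and a quality cutoff of 20) are assumptions made for this example, not Lacuna's actual settings.

```python
from typing import List


def adapterremoval_command(
    fastq_r1: str,
    fastq_r2: str,
    out_prefix: str,
    min_length: int = 25,   # assumed value, not Lacuna's setting
    min_quality: int = 20,  # assumed value, not Lacuna's setting
) -> List[str]:
    """Build an argv list for paired-end trimming and collapsing."""
    return [
        "AdapterRemoval",
        "--file1", fastq_r1,
        "--file2", fastq_r2,
        "--basename", out_prefix,
        "--collapse",        # merge overlapping read pairs into one read
        "--trimns",          # trim ambiguous bases (N) at read ends
        "--trimqualities",   # trim low-quality bases at read ends
        "--minquality", str(min_quality),
        "--minlength", str(min_length),
    ]


# Hypothetical file names, shown only to make the example runnable.
cmd = adapterremoval_command("sample_R1.fastq.gz", "sample_R2.fastq.gz", "out/sample")
print(" ".join(cmd))
```

Returning a list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting.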
BWA-MEM with aDNA-appropriate parameters aligns fragmented reads. Duplicate marking accounts for the high PCR duplication expected in ancient libraries.
mapDamage2 fits a statistical model to terminal deamination. Damaged read ends can then be handled by soft-clipping or by base quality rescaling; both options are exposed in the job configuration interface.
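To make the terminal deamination signal concrete, here is a minimal sketch that tallies how often a reference C is read as T at each position from the 5' end of a read. It is a plain frequency count, not mapDamage2's statistical model, and the function name and toy alignments are invented for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def c_to_t_by_position(
    pairs: Iterable[Tuple[str, str]], max_pos: int = 10
) -> List[float]:
    """Fraction of reference C sites read as T at each 5' position.

    `pairs` holds (reference, read) strings of equal length, assumed to be
    aligned without gaps and oriented 5' to 3' along the read.
    """
    c_sites = Counter()  # reference C seen at position i
    c_to_t = Counter()   # of those, how many reads show T
    for ref, read in pairs:
        for i, (r, q) in enumerate(zip(ref[:max_pos], read[:max_pos])):
            if r == "C":
                c_sites[i] += 1
                if q == "T":
                    c_to_t[i] += 1
    return [c_to_t[i] / c_sites[i] if c_sites[i] else 0.0 for i in range(max_pos)]


# Toy alignments (invented): mismatches are concentrated at position 0.
toy = [
    ("CACGT", "TACGT"),
    ("CCAGT", "TCAGT"),
    ("CAGCT", "CAGCT"),
    ("CGTCA", "TGTCA"),
]
freqs = c_to_t_by_position(toy, max_pos=5)
print(freqs)
```

In real data the elevated rate decays over the first several bases of the read rather than stopping after one position; that decay is the pattern a damage model is fitted to.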
ANGSD computes genotype likelihoods from low-depth data without minimum coverage cutoffs. Samples at 0.1–0.5× still yield usable calls.
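The idea behind working from likelihoods instead of hard calls can be sketched in a few lines. The example below computes the likelihood of each diploid genotype at one site from the observed bases and their Phred qualities, using a simple independent-read error model. It is an illustration of the concept, not ANGSD's implementation, and the function name is an assumption.

```python
import math
from itertools import combinations_with_replacement
from typing import Dict, List, Tuple

BASES = "ACGT"


def genotype_log_likelihoods(reads: List[Tuple[str, int]]) -> Dict[str, float]:
    """Log10 likelihood of each of the 10 diploid genotypes at one site.

    `reads` is a list of (base, phred_quality) observations. Reads are
    treated as independent, with errors spread evenly over the other 3 bases.
    """
    result = {}
    for a1, a2 in combinations_with_replacement(BASES, 2):
        log_l = 0.0
        for base, qual in reads:
            err = 10 ** (-qual / 10)
            # P(observed base | allele) for each of the two alleles
            p1 = 1 - err if base == a1 else err / 3
            p2 = 1 - err if base == a2 else err / 3
            log_l += math.log10(0.5 * p1 + 0.5 * p2)
        result[a1 + a2] = log_l
    return result


# One Q30 read showing "A": too little to call a genotype outright, but the
# likelihoods still rank AA above AC above CC.
gl = genotype_log_likelihoods([("A", 30)])
print({g: round(v, 3) for g, v in gl.items()})
```

With a single read the heterozygote AC remains plausible, and downstream analyses can carry that uncertainty forward instead of discarding the site for failing a depth cutoff.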
mtDNA haplotypes are called from genotype likelihood output and assigned haplogroups against reference panels for the target lineage.
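One simple way to picture haplogroup assignment is to score each haplogroup by the share of its defining variants found in the sample. The sketch below does that starting from a set of already-called variants. The panel, haplogroup names, and positions are placeholders, and the scoring rule is an assumption for illustration rather than Lacuna's method.

```python
from typing import Dict, Set, Tuple

Variant = Tuple[int, str]  # (mtDNA position, derived base)

# Hypothetical panel: names and defining variants are placeholders.
PANEL: Dict[str, Set[Variant]] = {
    "HgA": {(100, "T"), (250, "G"), (400, "C")},
    "HgB": {(100, "T"), (900, "A")},
    "HgC": {(1200, "G"), (1500, "T")},
}


def assign_haplogroup(
    observed: Set[Variant], panel: Dict[str, Set[Variant]]
) -> Tuple[str, float]:
    """Return the haplogroup with the highest share of defining variants observed."""
    scores = {
        name: len(defining & observed) / len(defining)
        for name, defining in panel.items()
    }
    best = max(scores, key=scores.get)  # ties resolve to the first panel entry
    return best, scores[best]


sample_variants = {(100, "T"), (250, "G"), (900, "A")}
best, score = assign_haplogroup(sample_variants, PANEL)
print(best, score)
```

A production assignment would also need to handle missing positions and damage-induced false variants, which this sketch ignores.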
Every run produces a JSON manifest, per-stage logs, damage model plots, and alignment summaries. Intermediate files are retained and inspectable.
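For a sense of what a run manifest could contain, here is a minimal sketch that assembles one and serializes it to JSON. Every field name, stage name, and path is hypothetical; the actual manifest schema is not described here.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, List


def build_manifest(sample_id: str, stages: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Assemble a run manifest; every field name here is hypothetical."""
    return {
        "sample_id": sample_id,
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "stages": stages,
    }


manifest = build_manifest(
    "example_sample",
    [
        {"name": "trim", "log": "logs/trim.log", "outputs": ["trimmed.fastq.gz"]},
        {"name": "align", "log": "logs/align.log", "outputs": ["aligned.bam"]},
    ],
)
text = json.dumps(manifest, indent=2)
print(text)
```

Recording each stage's log and outputs in one file is what lets a run be audited or repeated without reading through the pipeline code.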
Lacuna runs as a local FastAPI application with a React frontend. The embed below is the real UI. It connects to a backend when one is running locally — without it, the interface still loads but jobs won't process.
What you see below is the current development interface — actively being built. Styling, features, and workflows will change before release. We're showing it now because we believe in building in the open, not because it's ready.
Requires a running backend (uvicorn server:app --reload) to submit and track jobs · frontend loads independently
Lacuna is a Rewild Genomics research tool. The code is public, the methods are documented, and the results are reproducible. Ancient DNA analysis shouldn't require a proprietary platform.
Lacuna is being designed for researchers working with Pleistocene and Holocene material. If you have fossil samples and a question Lacuna might help answer, reach out — we're looking for datasets to test the pipeline against before release.
andrew@rewildgenomics.com · response within 48 hours