module CurationTool
Overview
High-level curation workflow functions used by the curation_tool binary.
Each method accepts a GritJiraIssue instance (conventionally named y)
and performs one step of the curation pipeline. Methods are mixed into the
calling context via include CurationTool.
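The mixin pattern described above can be sketched as follows. This is a minimal illustration, not the tool's actual code: `CurationToolSketch` and `IssueSketch` are hypothetical stand-ins for `CurationTool` and `GritJiraIssue`, and the `working_dir` accessor is an assumption.

```crystal
# Stand-in module so the sketch compiles without the real library.
module CurationToolSketch
  # Mirrors the documented signature shape: takes the issue, returns Nil.
  def setup_tol(y) : Nil
    puts "setup_tol called for #{y.working_dir}"
  end
end

# Hypothetical stand-in for GritJiraIssue; `working_dir` is an assumed accessor.
record IssueSketch, working_dir : String

# Including the module at the top level makes its methods callable directly.
include CurationToolSketch

y = IssueSketch.new("/tmp/curation-example")
setup_tol(y)
```

Because the methods take the issue as an argument rather than reading it from instance state, the same module can be included into any context that has an issue object to hand.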
Defined in:
lib/curation_tool.cr
Constant Summary
- VERSION = "v1.2.0"
Instance Method Summary
- #build_release(y) : Nil
  Builds the curated assembly FASTA from the most recent PretextView AGP file and, if a .bed decontamination file is set, removes remaining contamination intervals and regenerates the pretext map.
- #copy_qc(y) : Nil
  Copies curated assembly files and the pretext map into the curated directory for downstream QC.
- #setup_local(y) : Nil
  Sets up a local workstation directory for manual curation.
- #setup_tol(y) : Nil
  Initialises the HPC working directory and decompresses the assembly FASTA.
Instance Method Detail
#build_release(y) : Nil
Builds the curated assembly FASTA from the most recent PretextView AGP file
and, if a .bed decontamination file is set, removes remaining contamination
intervals and regenerates the pretext map.
Steps:
- Finds the latest *.agp_1 file in the working directory.
- Runs pretext-to-asm to produce the curated FASTA.
- If a .bed decon file is present, runs remove_contamination_bed via LSF (bsub -K) for each haplotype (merged) or for the single primary FASTA.
  - For merged assemblies the hap2 decon file is derived by substituting hap1→hap2 (with a fallback for partially phased naming).
- Submits curationpretext.sh to regenerate the pretext map (hap1 only for merged assemblies).
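The haplotype handling in the decontamination step can be sketched as below. The helper name `decon_beds`, its parameters, and the example file names are hypothetical. The fallback for partially phased naming is not specified here, so the sketch leaves it out.

```crystal
# Hypothetical helper: list the decon BED file(s) to process.
# Assumes the hap1 path contains the literal text "hap1"; only the first
# occurrence is replaced, and the partially phased fallback is not covered.
def decon_beds(hap1_bed : String, merged : Bool) : Array(String)
  # Non-merged assemblies use the single decon file as-is.
  return [hap1_bed] unless merged

  # Merged assemblies process hap1 first, then the derived hap2 file.
  [hap1_bed, hap1_bed.sub("hap1", "hap2")]
end

puts decon_beds("sample.hap1.contamination.bed", merged: true)
```

Each returned path would then correspond to one `bsub -K` submission of `remove_contamination_bed`.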
#copy_qc(y) : Nil
Copies curated assembly files and the pretext map into the curated directory for downstream QC.
For merged assemblies, empty placeholder files are also created for the
haplotig FASTA and hap2 chromosome list to satisfy pipeline expectations.
The most recently modified *normal.pretext file from the curationpretext
output directory is selected and renamed to the canonical curated path under
pretext_dir.
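A minimal sketch of the pretext map selection described above, assuming "most recently modified" means the file's modification time. The helper name `promote_latest_pretext` and both parameters are hypothetical; the real method takes these locations from the issue object.

```crystal
# Hypothetical helper: move the newest *normal.pretext file in `output_dir`
# to `curated_path`. Raises if no candidate exists.
def promote_latest_pretext(output_dir : String, curated_path : String) : Nil
  candidates = Dir.glob(File.join(output_dir, "*normal.pretext"))

  # Pick the candidate with the latest modification time (nil if none).
  newest = candidates.max_by? { |path| File.info(path).modification_time }
  raise "no *normal.pretext file found in #{output_dir}" unless newest

  File.rename(newest, curated_path)
end
```

`File.rename` is used here because the text says the file is renamed; `File.copy` would be the alternative if the original map has to stay in the output directory.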
#setup_local(y) : Nil
Sets up a local workstation directory for manual curation.
Creates a subdirectory named after the ToL ID in the current directory,
touches a notes file, and copies all matching pretext maps from the
tol server via scp.
#setup_tol(y) : Nil
Initialises the HPC working directory and decompresses the assembly FASTA.
Creates working_dir on disk, then decompresses and concatenates the appropriate
decontaminated FASTA file(s) into original.fa:
- Haploid/primary+haplotigs: single decontaminated.fa.gz derived from the decon file path.
- Merged (hap1/hap2, maternal/paternal, or primary/haplotigs): both haplotype FASTA files are concatenated in order.
Raises if scaffolds.tpf already exists in the working directory (guards
against accidentally overwriting an in-progress curation).
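The guard and the concatenation can be sketched as follows. This is an illustration under assumptions: the helper name `build_original_fasta` and its parameters are hypothetical, the inputs are assumed to be gzip-compressed, and the guard is checked before anything is written (the text does not state the order).

```crystal
require "compress/gzip"

# Hypothetical helper mirroring the documented behaviour of setup_tol.
def build_original_fasta(working_dir : String, gz_inputs : Array(String)) : Nil
  # Guard: refuse to touch a directory that already holds an in-progress curation.
  if File.exists?(File.join(working_dir, "scaffolds.tpf"))
    raise "scaffolds.tpf already exists in #{working_dir}"
  end

  Dir.mkdir_p(working_dir)

  # Decompress each input in order and append it to original.fa.
  File.open(File.join(working_dir, "original.fa"), "w") do |dest|
    gz_inputs.each do |gz_path|
      Compress::Gzip::Reader.open(gz_path) do |gz|
        IO.copy(gz, dest)
      end
    end
  end
end
```

For a merged assembly the array would hold both haplotype files in the order they should appear in original.fa; for a haploid assembly it would hold the single decontaminated.fa.gz.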