Feature/improved_genomeID_resolve #315

Open: wants to merge 46 commits into base: dev

liannette
Contributor

This pull request updates the genome ID resolving process in PODP mode.

It builds upon pull request #312 and introduces the following improvements:

  • If no valid RefSeq assembly is available when resolving the genome ID, resolution falls back to a GenBank assembly accession. This allows antiSMASH jobs to run for genomes that have a GenBank accession but no RefSeq accession.
  • When a genome record has a RefSeq ID, the ID is checked against NCBI to confirm that it is valid and to determine whether a newer version exists.
  • The genome status JSON is updated after each attempt to retrieve BGC data, whether successful or not, instead of only after all genomes have been processed. This preserves progress if the process is interrupted midway.
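
For readers who have not looked at the diff, the fallback order and the per-attempt status write described above can be sketched roughly as follows. This is a minimal illustration, not the code in this pull request: `resolve_accession`, `process_genomes`, the `is_valid` and `fetch_bgc` callables, and the status file layout are all hypothetical names chosen for the example, and `is_valid` merely stands in for the NCBI check.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Iterable, Optional


def resolve_accession(
    refseq_id: Optional[str],
    genbank_id: Optional[str],
    is_valid: Callable[[str], bool],
) -> Optional[str]:
    """Prefer a valid RefSeq accession; otherwise fall back to GenBank.

    `is_valid` is a hypothetical stand-in for the NCBI validity check.
    """
    for candidate in (refseq_id, genbank_id):
        if candidate and is_valid(candidate):
            return candidate
    return None


def process_genomes(
    genome_ids: Iterable[str],
    fetch_bgc: Callable[[str], None],
    status_file: Path,
) -> Dict[str, str]:
    """Try each genome and persist the status after every attempt."""
    status: Dict[str, str] = {}
    for genome_id in genome_ids:
        try:
            fetch_bgc(genome_id)  # hypothetical retrieval step
            status[genome_id] = "success"
        except Exception as exc:
            status[genome_id] = f"failed: {exc}"
        # Write to a temporary file and rename it, so an interruption
        # during the write does not leave a half-written status file.
        tmp = status_file.with_suffix(".tmp")
        tmp.write_text(json.dumps(status, indent=2))
        tmp.replace(status_file)
    return status
```

Writing the file inside the loop costs one small write per genome, which is the trade-off this pull request accepts in exchange for keeping progress when a run is interrupted.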

liannette and others added 30 commits March 18, 2025 17:58
…ieval into distinct functions for improved clarity
liannette changed the title from Feature/updated_genbank_resolve to Feature/improved_genomeID_resolve on Mar 21, 2025