Request for improved pbmm2 error message around zipped references #709

Open
mrvollger opened this issue Aug 20, 2024 · 8 comments

Comments

@mrvollger

Operating system
redhat

Package name

pbmm2 1.13.0
Using:
  pbmm2    : 1.13.0 (commit v1.13.0-2-gbcd99f5)
  pbbam    : 2.4.99 (commit v2.4.0-23-g59248fe)
  pbcopper : 2.3.99 (commit v2.3.0-28-ga9b1ffa)
  boost    : 1.81
  htslib   : 1.17
  minimap2 : 2.26
  zlib     : 1.2.13

Describe the bug
When given a gzipped reference, pbmm2 complains about the format of the input reads instead of the reference.

Error message

pbmm2 align ERROR: Could not determine read input type(s). Please do not mix data types, such as BAM+FASTQ. File of files may only contain BAMs or datasets.

To Reproduce

pbmm2 align ref.fa.gz ../data/hap-alns/GM12878.PacBio.H1.GRCh38.bam example.bam
>|> 20240820 22:31:55.970 -|- WARN -|- operator() -|- 0x7f11a49d9f80|| -|- Input is aligned reads. Only primary alignments will be respected to allow idempotence!
>|> 20240820 22:31:55.970 -|- FATAL -|- CheckPositionalArgs -|- 0x7f11a49d9f80|| -|- pbmm2 align ERROR: Could not determine read input type(s). Please do not mix data types, such as BAM+FASTQ. File of files may only contain BAMs or datasets.

Expected behavior
I know that gzipped references are not supported by pbmm2, but it took me quite a while to discover that this was the problem, because the error sent me looking for issues with the input reads rather than the reference. Alternatively, support for gzipped references would be great!

As a side note, it would be nice if pbmm2 allowed the .fna extension for references, which is sometimes the extension you get when downloading from NCBI, e.g.:

GCA_000001405.15_GRCh38_no_alt_analysis_set.fna

Thanks,
Mitchell

@mrvollger changed the title from "Confusing pbmm2 error message" to "Request for improved pbmm2 error message around zipped references" on Aug 20, 2024

@armintoepfer
Member

We might consider adding it, but at this point you are the first one to ask in more than 5 years.

@ASLeonard

It seems like an uncommon use case, but especially since minimap2 can take gzipped references, this does (negligibly) complicate porting over to pbmm2.
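
For anyone porting from minimap2 in the meantime, here is a minimal workaround sketch: decompress the reference to a plain FASTA first, then pass that file to pbmm2. The helper name `decompress_reference` is hypothetical and not part of pbmm2, and the sketch assumes the reference is an ordinary gzip file whose name ends in `.gz`.

```python
import gzip
import shutil
from pathlib import Path


def decompress_reference(gz_path, out_dir=None):
    """Write an uncompressed copy of a gzipped FASTA and return its path.

    Hypothetical helper for illustration; not part of pbmm2.
    """
    src = Path(gz_path)
    if src.suffix != ".gz":
        return src  # no .gz suffix: assume it is already uncompressed

    # Drop only the trailing ".gz", e.g. ref.fa.gz -> ref.fa
    dest_dir = Path(out_dir) if out_dir is not None else src.parent
    dest = dest_dir / src.with_suffix("").name

    # Stream the data so large references are not loaded into memory at once
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest
```

The returned path can then take the place of `ref.fa.gz` in the `pbmm2 align` command from the report.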

@mrvollger
Author

I would love gzip compatibility, but I wanted to clarify that my main issue is the error message you get when you use a gzipped reference:

pbmm2 align ERROR: Could not determine read input type(s). Please do not mix data types, such as BAM+FASTQ. File of files may only contain BAMs or datasets.

This error wrongly points at the format of the reads when the actual problem is the reference.
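
To illustrate the kind of check that would produce a reference-specific message, here is a minimal sketch of a wrapper-side pre-flight test. It is not how pbmm2 works internally: the function name `check_reference` and the wording of the error are assumptions made for illustration. It relies only on the fact that gzip streams begin with the two bytes `0x1f 0x8b`.

```python
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def check_reference(path):
    """Raise a reference-specific error if the file looks gzip-compressed.

    Hypothetical pre-flight check for a wrapper script; not part of pbmm2.
    """
    with open(path, "rb") as handle:
        if handle.read(2) == GZIP_MAGIC:
            raise ValueError(
                f"Reference '{path}' appears to be gzip-compressed; "
                "decompress it before aligning."
            )
    return Path(path)
```

Checking the file contents rather than the `.gz` suffix also catches compressed references that have been renamed.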

@armintoepfer
Member

That error message should be fixed in the latest version, which we'll release soon.

@mrvollger
Author

Awesome, thanks @armintoepfer

@armintoepfer
Member

Can you give it a try again?

@mrvollger
Author

@armintoepfer should I be trying 1.16? Because I am still getting the same error msg:

 pbmm2 --version && pbmm2 align tmp.fa.gz ~/tmp.bam tmp.out.bam
pbmm2 1.16.0

Using:
  pbmm2    : 1.16.0 (commit v1.16.0)
  pbbam    : 2.7.0 (commit v2.7.0)
  pbcopper : 2.6.0 (commit v2.6.0)
  boost    : 1.81
  htslib   : 1.17
  minimap2 : 2.26
  zlib     : 1.2.13
>|> 20241114 19:01:20.294 -|- WARN -|- operator() -|- 0x7f7266e0ff80|| -|- Input is aligned reads. Only primary alignments will be respected to allow idempotence!
>|> 20241114 19:01:20.295 -|- FATAL -|- CheckPositionalArgs -|- 0x7f7266e0ff80|| -|- pbmm2 align ERROR: Could not open or determine read input type(s). Please do not mix data types, such as BAM+FASTQ. File of files may only contain BAMs or datasets.

@armintoepfer
Member

Yes. I thought our fix would have fixed it as a side effect. Will file an actual issue. Stay tuned, ty
