my proposed changes #185

Open
wants to merge 1 commit into
base: feature/cells_argument

5 changes: 2 additions & 3 deletions .travis.yml
@@ -19,11 +19,10 @@ install:
- conda config --add channels defaults
- conda config --add channels conda-forge
- conda config --add channels bioconda
- conda install -c bioconda -c conda-forge snakemake
- - conda create -q -n snakemake snakemake>=5.3.1 python=3.6
+ - conda create -q -n snakemake 'snakemake>=5.3.1' 'python>=3.10'
Author

Requires Python >= 3.10 for the `| None` syntax to work.

Owner

I believe this is only for documentation updates, not the code itself.

Author

Aaah. But then, which Python is used? The script said `python3.6 setup.py install`...

Owner

I updated it on the branch to 3.11.
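
For context on the version question, here is a minimal sketch, assuming the `| None` mentioned above refers to the PEP 604 union syntax in type hints. The names `describe` and `MaybePath` are hypothetical and do not come from this repository.

```python
import sys
from typing import Optional


def describe(path: Optional[str] = None) -> str:
    """Optional[str] is accepted by every supported Python 3 version."""
    return "no file" if path is None else path


if sys.version_info >= (3, 10):
    # Evaluated at runtime: on Python 3.9 and older this expression raises
    # TypeError, because type objects do not implement the | operator there.
    MaybePath = str | None
else:
    MaybePath = Optional[str]

print(describe())            # no file
print(describe("tags.csv"))  # tags.csv
```

If the union only ever appears inside annotations, `from __future__ import annotations` defers their evaluation and would let the same spelling parse on older interpreters; whether that applies here depends on how the code base uses the hints.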


script:
- - python3.6 setup.py install
+ - pip install .
- pytest

after_success:
2 changes: 2 additions & 0 deletions cite_seq_count/__main__.py
@@ -94,6 +94,8 @@ def main():
print("Skipping cell barcode correction")
bcs_corrected = 0

+ ###### HERE IT STOPS WORKING ##########

# Create sparse matrices for reads results
read_results_matrix = processing.generate_sparse_matrices(
final_results=final_results,
15 changes: 8 additions & 7 deletions cite_seq_count/preprocessing.py
@@ -57,7 +57,7 @@ def parse_barcode_reference(
barcode_pattern = rf"^[ATGC]{{{barcode_length}}}"

header = barcodes_pl.columns
- set_dif = set(required_header) - set(header)
+ set_dif = set([required_header]) - set(header)
Author

Without the brackets, `set()` was splitting `required_header` into individual letters.

Owner

Keep that one in, I'll merge once the PR is ready.
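
The behavior described in this thread can be reproduced in isolation. This is a small sketch with made-up values, assuming `required_header` is a single column name passed as a plain string:

```python
required_header = "whitelist"  # a single column name, passed as a string
header = ["whitelist"]         # columns actually present in the file

# set() iterates over its argument, so a string becomes a set of characters.
missing_wrong = set(required_header) - set(header)

# Wrapping the string in a list (or using a set literal) keeps it whole.
missing_right = set([required_header]) - set(header)
missing_literal = {required_header} - set(header)

print(sorted(missing_wrong))  # ['e', 'h', 'i', 'l', 's', 't', 'w']
print(missing_right)          # set()
print(missing_literal)        # set()
```

With the unwrapped string, the check reports seven "missing" single-letter columns even though the header is present. The set literal `{required_header}` is an equivalent, slightly shorter spelling of the fix.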

if len(set_dif) != 0:
set_diff_string = ",".join(list(set_dif))
raise SystemExit(f"The header is missing {set_diff_string}. Exiting")
@@ -74,7 +74,7 @@ def parse_barcode_reference(

else:
barcodes_pl = barcodes_pl.with_columns(
- reference=pl.col(REFERENCE_COLUMN).str.strip_chars(STRIP_CHARS),
+ reference=pl.col(required_header).str.strip_chars(STRIP_CHARS),
Author

`barcodes_pl` has no 'reference' column (at least for the whitelist barcodes).

Owner

I'm changing the behavior here a little bit. I think I'm going to use "reference" even in the whitelist input. It just makes the logic a bit cleaner.

)

check_sequence_pattern(
@@ -110,7 +110,7 @@ def parse_tags_csv(file_name: str) -> pl.DataFrame:
TTCCGCCTCTCTTTG,Hashtag_3

Args:
- file_name (str): file path as a tring
+ file_name (str): file path as a string

Returns:
pl.DataFrame: polars dataframe with the csv content
@@ -381,16 +381,17 @@ def get_barcode_subset(
enable_barcode_correction = True
if barcode_whitelist:
barcode_subset = parse_barcode_reference(
- filename=expected_barcodes,
+ filename=barcode_whitelist,
barcode_length=(chemistry.cell_barcode_end - chemistry.cell_barcode_start),
required_header=WHITELIST_COLUMN,
)
+ n_barcodes = len(barcode_subset) # ???
else:
- n_barcodes = barcode_whitelist
+ n_barcodes = expected_barcodes
if barcode_reference is not None:
barcode_subset = (
barcodes_df.filter(
- pl.col(BARCODE_COLUMN).str.is_in(
+ pl.col(BARCODE_COLUMN).is_in(
barcode_reference[REFERENCE_COLUMN]
)
)
@@ -399,7 +400,7 @@ def get_barcode_subset(
.sort("count", descending=True)
.head(n_barcodes * 1.2)
.drop("count")
- .rename({SEQUENCE_COLUMN: WHITELIST_COLUMN})
+ .rename({BARCODE_COLUMN: WHITELIST_COLUMN})
)
else:
raw_barcodes_dict = (
3 changes: 2 additions & 1 deletion setup.py
@@ -33,7 +33,8 @@
"pyyaml==6.0",
"pooch==1.6.0",
"six==1.16.0",
+ "polars"
],
- python_requires=">=3.8",
+ python_requires=">=3.10",
package_data={"report_template": ["templates/*.json"]},
)
Binary file not shown.
1 change: 0 additions & 1 deletion tests/test_data/matrix/.~lock.test_matrix.csv#

This file was deleted.

5 changes: 5 additions & 0 deletions tests/test_data/tags/pass/correct_3.csv
@@ -0,0 +1,5 @@
sequence,feature_name
CGTACGTAGCCTAGC,test1
CGTAGCTCGAAAAAA,test2
CGTCGAAGCTGAACG,test3
CGTCGTAGCTGATCG,test4
3 changes: 3 additions & 0 deletions tests/test_data/whitelist.csv
@@ -0,0 +1,3 @@
whitelist
TACATATTCTTTACTG
TAGAGGGAAGTCAAGC
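
To show how a whitelist file like this one relates to the header and pattern checks in `parse_barcode_reference`, here is a standard-library sketch. It is not the project's implementation, which uses polars. `read_whitelist` is a hypothetical helper, and matching the whole barcode with `fullmatch` is an assumption made for this illustration.

```python
import csv
import io
import re

WHITELIST_CSV = """whitelist
TACATATTCTTTACTG
TAGAGGGAAGTCAAGC
"""


def read_whitelist(text: str, required_header: str, barcode_length: int) -> list[str]:
    """Hypothetical helper: return the barcodes or raise ValueError."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    # Same idea as the set difference in the diff, with the name kept whole.
    missing = {required_header} - set(header)
    if missing:
        raise ValueError(f"The header is missing {','.join(sorted(missing))}")
    # Only A, T, G, C are allowed, repeated exactly barcode_length times.
    pattern = re.compile(rf"[ATGC]{{{barcode_length}}}")
    barcodes = [row[required_header].strip() for row in reader]
    bad = [bc for bc in barcodes if not pattern.fullmatch(bc)]
    if bad:
        raise ValueError(f"Invalid barcodes: {bad}")
    return barcodes


barcodes = read_whitelist(WHITELIST_CSV, "whitelist", 16)
print(barcodes)  # ['TACATATTCTTTACTG', 'TAGAGGGAAGTCAAGC']
```

Calling the helper with a column name that is absent from the file, for example `"reference"`, raises `ValueError` with the missing column named in the message.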