Commit
Merge branch 'main' into about
nathanweeks authored Dec 14, 2023
2 parents 3fa01a7 + e2c1713 commit da79d3d
Showing 23 changed files with 91 additions and 76 deletions.
41 changes: 18 additions & 23 deletions .github/workflows/gh-pages.yml
@@ -28,36 +28,31 @@ jobs:
runs-on: ubuntu-latest
needs: build
steps:
-- uses: ruby/setup-ruby@v1
-with:
-ruby-version: '3.2' # Not needed with a .ruby-version file
-bundler-cache: true # runs 'bundle install' and caches installed gems automatically
-cache-version: 0 # https://github.com/ruby/setup-ruby#dealing-with-a-corrupted-cache
-- name: install htmlproofer
-run: gem install html-proofer -v 5.0.8
-# Cache HTMLProofer external URL results for 30 days
-# https://github.com/gjtorikian/html-proofer/#caching-with-continuous-integration
-- uses: actions/cache/restore@v3
-id: restore-cache-htmlproofer
-with:
-path: tmp/.htmlproofer
-key: ${{ runner.os }}-html-proofer
+# Cache lychee external URL results for 30 days
 - name: Download site
 uses: actions/download-artifact@v3
 with:
 name: github-pages
 - run: tar -xf artifact.tar && rm artifact.tar
-- name: run htmlproofer
-# keep --swap-urls in sync with mkdocs.yml site_url
-run: |
-htmlproofer --ignore-missing-alt --cache '{"timeframe": {"external": "30d"}}' --swap-urls /informatics-website/:/
-# https://github.com/actions/cache/blob/main/save/README.md#always-save-cache
-- uses: actions/cache/save@v3
-id: save-cache-htmlproofer
+# https://github.com/lycheeverse/lychee-action#utilising-the-cache-feature
+- name: Restore lychee cache
+id: restore-cache
+uses: actions/cache/restore@v3
+with:
+path: .lycheecache
+key: cache-lychee-${{ github.sha }}
+restore-keys: cache-lychee-
+- name: Run lychee
+uses: lycheeverse/lychee-action@v1.8.0
+with:
+args: "--base . --cache --max-cache-age 30d --require-https --timeout 5 --exclude 'fonts.gstatic.com' --exclude 'www.microsoft.com/en-us/microsoft-365/onedrive/online-cloud-storage' --exclude-path 404.html -- './**/*.html' './**/*.css'"
+fail: true
+- name: Save lychee cache
+uses: actions/cache/save@v3
 if: always()
 with:
-path: tmp/.htmlproofer
-key: ${{ runner.os }}-html-proofer
+path: .lycheecache
+key: ${{ steps.restore-cache.outputs.cache-primary-key }}

deploy:
if: github.ref == format('refs/heads/{0}', github.event.repository.default_branch)
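
The new lychee step passes `--require-https`, and most of the remaining hunks in this commit swap `http://` links for `https://`. As a rough illustration of how such links could be listed before the checker runs, here is a minimal Python sketch. The helper name `find_insecure_links` and the sample text are assumptions made for illustration; this is not part of the commit and does not reproduce lychee's own logic.

```python
import re

# Hypothetical helper, not part of this commit or of lychee: list plain-http
# URLs in a blob of text so they can be reviewed before the link checker runs.
HTTP_URL = re.compile(r"http://[^\s\"'<>)\]]+")


def find_insecure_links(text):
    """Return (line_number, url) pairs for every http:// URL found in text."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in HTTP_URL.finditer(line):
            hits.append((number, match.group(0)))
    return hits


# Illustrative input only; the two URLs are taken from hunks in this commit.
sample = (
    '<a href="http://www.htslib.org/doc/samtools.html">SAMtools</a>\n'
    '<a href="https://genome.ucsc.edu/FAQ/FAQformat#format1">UCSC</a>\n'
)

for number, url in find_insecure_links(sample):
    print(f"line {number}: {url}")
```

A pre-check like this only reports candidates; whether an `https://` equivalent actually exists still has to be confirmed per link, which is what the link checker is for.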
2 changes: 1 addition & 1 deletion data/resources/resources-primary.json
@@ -84,7 +84,7 @@
"id": "9",
"name": "Harvard library data services",
"link": "https://hlrdm.library.harvard.edu/",
-"tags": ["data sciecne", "harvard"],
+"tags": ["data science", "harvard"],
"date-added": "11-28-2023",
"status" : "active",
"description": ""
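
The hunk above fixes a misspelled tag by hand. One possible way to catch such near-duplicate tags automatically is sketched below in Python. The function `suspicious_tags`, the similarity threshold, and the sample records are all assumptions for illustration; the real file's layout beyond the fields visible in this hunk is not known.

```python
from collections import Counter
from difflib import SequenceMatcher


def suspicious_tags(resources, threshold=0.8):
    """Return (rare_tag, common_tag) pairs that look like misspellings.

    A tag is flagged when it is used less often than another tag that it
    closely resembles (similarity ratio at or above the threshold).
    """
    counts = Counter(tag for item in resources for tag in item.get("tags", []))
    pairs = []
    for rare in counts:
        for common in counts:
            if rare == common or counts[rare] >= counts[common]:
                continue
            if SequenceMatcher(None, rare, common).ratio() >= threshold:
                pairs.append((rare, common))
    return sorted(pairs)


# Illustrative records only; the ids and tags are made up apart from the
# misspelling shown in the hunk above.
resources = [
    {"id": "1", "tags": ["data science", "harvard"]},
    {"id": "2", "tags": ["data science"]},
    {"id": "9", "tags": ["data sciecne", "harvard"]},
]

print(suspicious_tags(resources))
```

The threshold is a judgment call: set too low it will pair unrelated short tags, set too high it will miss transpositions like the one fixed here.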
1 change: 1 addition & 0 deletions data/resources/tag-csv/.gitignore
@@ -0,0 +1 @@
+*
48 changes: 47 additions & 1 deletion docs/research/index.md
@@ -1 +1,47 @@
-# Info about our research
+# Research
+
+Research in the Bioinformatics Group is primarily driven by collaborative projects with faculty on campus, and by grant-funded internal projects. However, we also conduct some best practices and methodological research to fill gaps in current knowledge and solve challenging bioinformatics problems. Here, we outline some recent examples of these kinds of project.
+
+We also develop methods, which can generally be found on our software page.
+
+## Collaborations
+
+### Assembling Difficult and Repetitive Genomes
+
+(Davis lab; Hopkins lab; Giribet lab)
+
+### Genomic signatures of sexual reproduction in bdelloid rotifers
+
+(Meselson - genetics paper)
+
+### Phylogenetic methods to detect rate shifts in conserved elements
+
+(PhyloAcc work)
+
+### Methods for the analysis of single-cell RNA-seq data
+
+(D4ALT; Hierfit; scclusteval)
+
+## Best practices and methodological research
+
+### Genome annotation
+
+(Adam's stuff)
+
+### snpArcher
+
+(pop gen work)
+
+### RNA-seq analysis
+
+(Adam's papers on various things and general ideas about how to move forward with automated analysis of RNA-seq data)
+
+## Grant funded research
+
+### Convergent evolution
+
+(including: nectar, brood parasite, ratite/NIH enhancer work)
+
+### Comparative population genetics
+
+(codfish)
2 changes: 1 addition & 1 deletion docs/resources/Workshops/R/R-workshop-2023-Part3.md
@@ -74,7 +74,7 @@ Furthermore, while these basic plots are easy to generate, more complex plots, s

## Introduction to ggplot

-[ggplot](https://ggplot2.tidyverse.org/) is a **package** (library of code with various functions) that is part of the **tidyverse**. It uses a somewhat standardized 'grammar of graphics' ([book](http://id.lib.harvard.edu/alma/990138412570203941/catalog); [paper](https://hollis.harvard.edu/permalink/f/1mdq5o5/TN_cdi_informaworld_taylorfrancis_310_1198_jcgs_2009_07098)) in its syntax to make almost every aspect of a plot customizable. Using ggplot it is easy to make reproducible scientific figures that look nice and are easily understandable. With ggplot, I rarely need to tweak my figures outside of R.
+[ggplot](https://ggplot2.tidyverse.org/) is a **package** (library of code with various functions) that is part of the **tidyverse**. It uses a somewhat standardized 'grammar of graphics' ([book](https://id.lib.harvard.edu/alma/990138412570203941/catalog); [paper](https://hollis.harvard.edu/permalink/f/1mdq5o5/TN_cdi_informaworld_taylorfrancis_310_1198_jcgs_2009_07098)) in its syntax to make almost every aspect of a plot customizable. Using ggplot it is easy to make reproducible scientific figures that look nice and are easily understandable. With ggplot, I rarely need to tweak my figures outside of R.

This workshop is also heavily influenced by the book [Fundamentals of Data Visualization](https://hollis.harvard.edu/permalink/f/1s5nto6/01HVD_ALMA512300276480003941) by [Claus Wilke](https://clauswilke.com/).

2 changes: 1 addition & 1 deletion docs/resources/Workshops/R/css/rmd_header.html
@@ -1 +1 @@
-<link type="text/css" rel="stylesheet" href="css/rmd.css" media="screen,projection" />
+<link type="text/css" rel="stylesheet" href="rmd.css" media="screen,projection" />
2 changes: 1 addition & 1 deletion docs/resources/Workshops/R/end.html
@@ -61,7 +61,7 @@
<div class="row" id="img-row">
<div class="col-7-24" id="margin"></div>
<div class="col-10-24" id="img-col">
-<img id="res-img" src="img/pooh-meme.png">
+<img id="res-img" src="../Unix/img/pooh-meme.png">
<center><span class="fig-caption">Source unknown.</span></center>
</div>
<div class="col-7-24" id="margin"></div>
26 changes: 0 additions & 26 deletions docs/resources/Workshops/R/index-old.md

This file was deleted.

2 changes: 1 addition & 1 deletion docs/resources/Workshops/R/index.html
@@ -116,7 +116,7 @@ <h1>Welcome to the Fall 2023 <a href="https://informatics.fas.harvard.edu/" targ
<p>
<b><a href="https://gwct.github.io/" target="_blank">Gregg Thomas</a></b>: A bioinformatics scientist in the
<a href="https://informatics.fas.harvard.edu/" target="_blank">FAS Informatics group</a> at Harvard University and recent postdoc
-at the <a href="http://www.thegoodlab.org/" target="_blank">University of Montana</a> where he studied the phylogenetics and comparative
+at the <a href="https://www.thegoodlab.org/" target="_blank">University of Montana</a> where he studied the phylogenetics and comparative
genomics of the mouse and rat radiation. He got his PhD at <a href="https://hahnlab.sitehost.iu.edu/" target="_blank">Indiana University</a> where
he worked on comparative genomics of arthropods, mutation rate evolution in primates, and convergent evolution using comparative genomics. In general,
Gregg uses and develops computational methods to study molecular evolution and phylogenetics to determine what forces drive divergence and
4 changes: 2 additions & 2 deletions docs/resources/Workshops/R/links.html
@@ -126,7 +126,7 @@ <h4>Find here a list of links to relevant resources.</h4>
</ul>

<ul id="links-list">
-<li><a href="http://id.lib.harvard.edu/alma/990138412570203941/catalog" target="_blank">Grammar of Graphics book (Wilkinson 1999)</a></li>
+<li><a href="https://id.lib.harvard.edu/alma/990138412570203941/catalog" target="_blank">Grammar of Graphics book (Wilkinson 1999)</a></li>
</ul>

<ul id="links-list">
@@ -160,7 +160,7 @@ <h4>Find here a list of links to relevant resources.</h4>
<div class="col-20-24" id="section-content">

<ul id="links-list">
-<li><a href="https://royalsocietypublishing.org/doi/10.1098/rspb.2019.0364" target="_blank">Bird collision dataset (Winger et al. 2019)</a></li>
+<li><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/pmc6501673/" target="_blank">Bird collision dataset (Winger et al. 2019)</a></li>
</ul>

<ul id="links-list">
2 changes: 1 addition & 1 deletion docs/resources/Workshops/R/start.html
@@ -99,7 +99,7 @@ <h3>Note that R and RStudio are different programs and need to be installed sepa

<h2>Installing R</h2>

-<h3>Please follow the instructions on <a href="http://ftp.ussg.iu.edu/CRAN/" target="_blank">this page to install R</a>.</h3>
+<h3>Please follow the instructions on <a href="https://cloud.r-project.org/" target="_blank">this page to install R</a>.</h3>

<p>
To begin, click the link for your operating system to bring up a list of files to download.
4 changes: 2 additions & 2 deletions docs/resources/Workshops/Unix/Biotips-workshop-2023-Day1.md
@@ -29,7 +29,7 @@ pre code {

Welcome to the first day of the [FAS Informatics](https://informatics.fas.harvard.edu/) [Bioinformatics Tips & Tricks workshop](https://harvardinformatics.github.io/workshops/2023-fall/biotips/)!

-If you're viewing this file on the website, you are viewing the final, formatted version of the workshop. The workshop itself will take place in the RStudio program and you will *edit and execute the code in this file*. Please download the raw file [here](https://harvardinformatics.github.io/workshops/2023-fall/biotips/Biotips-workshop-2023_Day1-student.Rmd)
+If you're viewing this file on the website, you are viewing the final, formatted version of the workshop. The workshop itself will take place in the RStudio program and you will *edit and execute the code in this file*. Please download the raw file [here](https://harvardinformatics.github.io/workshops/2023-fall/biotips/Biotips-workshop-2023-Day1-student.Rmd)

This is the workshop file that should be opened in RStudio. This is an RMarkdown file, meaning that both formatted text and code blocks can be added to it, and the code blocks can be executed from the RStudio interface. RStudio also has an easy to access **Terminal** tab which is how one would normally execute **Unix** commands. For this workshop, we will be going through this document and copy-pasting code blocks within it to run in the terminal to demonstrate some basic concepts. We will also be doing exercises directly in the terminal panel of RStudio. Once you've got the correct command, you can paste it into the RMarkdown document to keep a record.

@@ -641,7 +641,7 @@ Let's take a look at a SAM file. We could use the typical bash commands like `ca

## SAMtools

-[SAMtools](http://www.htslib.org/doc/samtools.html) is a suite of programs that are extremely useful for processing mapped reads and for downstream analysis. As stated above, SAM/BAM files from different programs are (mostly) interchangeable, so `samtools` will work with a file SAM/BAM file no matter what program produced it. It has a ton of functions (which you can check out on the [manual page](http://www.htslib.org/doc/samtools.html)), but we will go through several of the most common uses.
+[SAMtools](https://www.htslib.org/doc/samtools.html) is a suite of programs that are extremely useful for processing mapped reads and for downstream analysis. As stated above, SAM/BAM files from different programs are (mostly) interchangeable, so `samtools` will work with a file SAM/BAM file no matter what program produced it. It has a ton of functions (which you can check out on the [manual page](https://www.htslib.org/doc/samtools.html)), but we will go through several of the most common uses.

### samtools view

Original file line number Diff line number Diff line change
@@ -145,7 +145,7 @@ In addition to this optional fourth column for an ID, **bed** files have several
For more information on bed files and these extra columns, visit the following links:

 - [Description of bed files from bedtools](https://bedtools.readthedocs.io/en/latest/content/general-usage.html)
-- [Description of bed files from UCSC](http://genome.ucsc.edu/FAQ/FAQformat#format1)
+- [Description of bed files from UCSC](https://genome.ucsc.edu/FAQ/FAQformat#format1)

# Summarizing SVs from the command line

4 changes: 2 additions & 2 deletions docs/resources/Workshops/Unix/Biotips-workshop-2023-Day2.md
@@ -26,7 +26,7 @@ pre code {

Welcome to the second day of the [FAS Informatics](https://informatics.fas.harvard.edu/) [Bioinformatics Tips & Tricks workshop](https://harvardinformatics.github.io/workshops/2023-fall/biotips/)!

-If you're viewing this file on the website, you are viewing the final, formatted version of the workshop. The workshop itself will take place in the RStudio program and you will *edit and execute the code in this file*. Please download the raw file [here](https://harvardinformatics.github.io/workshops/2023-fall/biotips/Biotips-workshop-2023_Day2-student.Rmd)
+If you're viewing this file on the website, you are viewing the final, formatted version of the workshop. The workshop itself will take place in the RStudio program and you will *edit and execute the code in this file*. Please download the raw file [here](https://harvardinformatics.github.io/workshops/2023-fall/biotips/Biotips-workshop-2023-Day2-student.Rmd)

Today we're going to continue our tour and explanation of common genomics file formats and their associated tools by talking about interval files, that is files which indicate regions of a genome (.bed files, .gff files).

@@ -149,7 +149,7 @@ In addition to this optional fourth column for an ID, **bed** files have several
For more information on bed files and these extra columns, visit the following links:

 - [Description of bed files from bedtools](https://bedtools.readthedocs.io/en/latest/content/general-usage.html)
-- [Description of bed files from UCSC](http://genome.ucsc.edu/FAQ/FAQformat#format1)
+- [Description of bed files from UCSC](https://genome.ucsc.edu/FAQ/FAQformat#format1)

# Summarizing SVs from the command line

6 changes: 3 additions & 3 deletions docs/resources/Workshops/Unix/Biotips-workshop-2023-Day3.md
@@ -10,7 +10,7 @@ output:

Welcome to the third day of the [FAS Informatics](https://informatics.fas.harvard.edu/) [Bioinformatics Tips and Tricks Workshop](https://harvardinformatics.github.io/workshops/2023-fall/biotips/)!

-If you're viewing this file on the website, you are viewing the final, formatted version of the workshop. The workshop itself will take place in the RStudio program and you will *edit and execute the code in this file*. Please download the raw file [here](https://harvardinformatics.github.io/workshops/2023-fall/biotips/Biotips-workshop-2023_Day3-student.Rmd)
+If you're viewing this file on the website, you are viewing the final, formatted version of the workshop. The workshop itself will take place in the RStudio program and you will *edit and execute the code in this file*. Please download the raw file [here](https://harvardinformatics.github.io/workshops/2023-fall/biotips/Biotips-workshop-2023-Day3-student.Rmd)

Today we're going to continue our tour and explanation of common genomics file formats and their associated tools, starting with GFF files, which are typically used to store gene annotations. We'll then talk about VCF files, which are used to store variants.

@@ -54,7 +54,7 @@ The format for encoding information about genic regions (commonly called a **gen
For more detailed information on **GFF** files, see the following links:

 - [UCSC description (GFF2)](https://genome.ucsc.edu/FAQ/FAQformat.html#format3)
-- [Ensembl description (GFF2)](http://www.ensembl.org/info/website/upload/gff.html?redirect=no)
+- [Ensembl description (GFF2)](https://www.ensembl.org/info/website/upload/gff.html?redirect=no)
 - [NCBI description (GFF3)](https://www.ncbi.nlm.nih.gov/datasets/docs/v2/reference-docs/file-formats/annotation-files/about-ncbi-gff3/)
 - [GFF3 Specification](https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md)

@@ -682,4 +682,4 @@ If everything is working, you should get the same answer as before. But now we c

This is better, but it still requires a lot of typing if we want to run this on 10 files. If, for example, we wanted to compute SNP density separately for each chromosome, or for different interval types (e.g., genes, introns, exons), we'd have to type out each bed file separately.

-We can get around this by using loops, which we'll cover next time
+We can get around this by using loops, which we'll cover next time
Original file line number Diff line number Diff line change
@@ -364,7 +364,7 @@ <h4 class="date">November 17, 2023</h4>
<p>If you’re viewing this file on the website, you are viewing the
final, formatted version of the workshop. The workshop itself will take
place in the RStudio program and you will <em>edit the file while
-executing code in the terminal</em>. Please download the raw file <a href="https://harvardinformatics.github.io/workshops/2023-spring/biotips/Biotips-workshop-2023_Day4-student.Rmd">here</a></p>
+executing code in the terminal</em>. Please download the raw file <a href="https://harvardinformatics.github.io/workshops/2023-spring/biotips/Biotips-workshop-2023-Day4-student.Rmd">here</a></p>
<p>Today you’ll learn more about how to write scripts, control the
behavior of your scripts using loops and conditional statements, and
more!</p>