ChIP-seq

The goal of ChIP-seq is to determine the locations in the genome that are associated with a protein factor.

ChIP-seq experiment steps:

Chromatin ImmunoPrecipitation (ChIP)
Protein-DNA crosslinking with formaldehyde (for TF)
Fragment the chromatin using sonication (TF) or micrococcal nuclease (MNase) digestion (histone)
Specific factor-targeting antibody
Immunoprecipitation
DNA purification
PCR amplification (~150 bp)
High-throughput sequencing (Illumina: sequences only the ends of the DNA fragments)

History of the development of ChIP-seq technology

UV crosslinking (1984): protein-DNA interactions can be captured
Crosslinking + immunoprecipitation (1993): an antibody is used to pull down the DNA-protein complex
ChIP-chip (2000): genome-wide microarray method, limited to pre-designed probes
Unbiased chromosomal coverage by tiling array (2004)
ChIP-seq (2007)

Today, ChIP-seq has become the predominant method for profiling chromatin epigenomes.

ChIP-seq data analysis

The analysis aims to achieve the following goals:

Where in the genome do these sequence reads come from? This is accomplished using sequence alignment after quality control.
What does the enrichment of sequences mean? This is accomplished using peak calling.
What can we learn from these data? This requires further downstream analysis and integration.

Here is a brief outline of steps that are required to achieve those goals:

Sequencing quality assessment using FastQC. If the per-base quality scores fail, either redo the experiment or trim the reads.
ChIP-seq read mapping: align each sequence read in the FASTQ file to the reference genome with an aligner such as Bowtie or BWA (Burrows–Wheeler Aligner); usually only reads that map to a unique/best location in the genome are kept.
Redundancy control: reads that map to exactly the same location are treated as likely artifacts (for example, PCR duplicates) rather than independent fragments.

Non-redundant rate: the ratio of the number of non-redundant reads to the number of mapped reads
PBC (PCR Bottleneck Coefficient): the ratio of the number of locations with exactly 1 read mapped to the number of locations with at least 1 read mapped
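The two ratios above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes each mapped read is summarized as a hypothetical `(chrom, strand, position)` tuple, and that "non-redundant reads" means one read kept per distinct location. The function name `redundancy_metrics` is invented for this example.

```python
from collections import Counter

def redundancy_metrics(reads):
    """Return (non-redundant rate, PBC) for an iterable of mapped reads.

    Each read is a (chrom, strand, position) tuple; this representation
    is an assumption made for the sketch.
    """
    counts = Counter(reads)                  # reads per distinct location
    n_mapped = sum(counts.values())          # all mapped reads
    n_locations = len(counts)                # locations with >= 1 read
    n_single = sum(1 for c in counts.values() if c == 1)  # exactly 1 read
    non_redundant_rate = n_locations / n_mapped
    pbc = n_single / n_locations
    return non_redundant_rate, pbc

example_reads = [
    ("chr1", "+", 100), ("chr1", "+", 100), ("chr1", "+", 100),
    ("chr1", "-", 250),
    ("chr2", "+", 40), ("chr2", "+", 40),
    ("chr2", "-", 90),
]
nrf, pbc = redundancy_metrics(example_reads)
print(f"non-redundant rate = {nrf:.3f}, PBC = {pbc:.3f}")
```

In this toy input, 7 mapped reads fall on 4 distinct locations, 2 of which carry a single read. Values close to 1 for both ratios indicate a complex library with little duplication.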

DNA fragment size estimation:

peak model (MACS) for TF
cross-correlation (SICER) for any ChIP-seq (input): calculate the correlation between two signals as a function of the displacement between them; auto-correlation is the cross-correlation of a signal with itself
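A generic sketch of the cross-correlation idea follows. It is not the SICER or MACS code. It assumes the two signals are per-base counts of read 5' ends on the plus and minus strands, and that the displacement with the highest Pearson correlation is a reasonable estimate of the fragment size. The names `pearson`, `cross_correlation` and `estimate_shift` are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def cross_correlation(plus, minus, max_shift):
    """Correlation between plus[i] and minus[i + d] for d = 0..max_shift."""
    result = {}
    for d in range(max_shift + 1):
        n = len(plus) - d
        result[d] = pearson(plus[:n], minus[d:d + n])
    return result

def estimate_shift(plus, minus, max_shift):
    """Displacement with the highest strand cross-correlation."""
    cc = cross_correlation(plus, minus, max_shift)
    return max(cc, key=cc.get)

# Toy data: minus-strand signal is the plus-strand signal displaced by 30 bp.
genome_len = 200
plus = [0] * genome_len
minus = [0] * genome_len
for pos, n_tags in [(40, 3), (41, 5), (42, 2), (120, 4), (121, 1)]:
    plus[pos] = n_tags
    minus[pos + 30] = n_tags

estimated_shift = estimate_shift(plus, minus, max_shift=60)
print(estimated_shift)
```

Calling `cross_correlation(plus, plus, max_shift)` gives the auto-correlation of the plus-strand signal, which peaks at displacement 0.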

Retrieve DNA fragments

Full length retrieval (MACS)
Partial retrieval (sharpen the signal)
Point retrieval (SICER)

Pile up: Signal map generation
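The three retrieval modes and the pile-up step can be illustrated as follows. This is a sketch under stated assumptions, not the MACS or SICER implementation: reads are hypothetical `(five_prime_position, strand)` pairs on a single chromosome, intervals are half-open, "point" means the fragment center, and "partial" means a window of `partial_width` bases around that center (a parameter invented for this example).

```python
def retrieve_fragment(pos, strand, frag_size, mode="full", partial_width=None):
    """Return the half-open interval [start, end) contributed by one read."""
    # Extend from the read's 5' end in the direction of its strand.
    if strand == "+":
        start, end = pos, pos + frag_size
    else:
        start, end = pos - frag_size + 1, pos + 1
    if mode == "full":
        return start, end
    center = (start + end) // 2
    if mode == "point":
        return center, center + 1
    if mode == "partial":
        lo = center - partial_width // 2
        return lo, lo + partial_width
    raise ValueError(f"unknown mode: {mode}")

def pile_up(reads, frag_size, genome_len, mode="full", partial_width=None):
    """Per-base signal: number of retrieved fragments covering each position."""
    diff = [0] * (genome_len + 1)            # difference array
    for pos, strand in reads:
        start, end = retrieve_fragment(pos, strand, frag_size, mode, partial_width)
        start, end = max(start, 0), min(end, genome_len)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    signal, running = [], 0
    for delta in diff[:genome_len]:
        running += delta
        signal.append(running)
    return signal

# Two reads from opposite ends of the same 6 bp fragment covering [2, 8).
toy_reads = [(2, "+"), (7, "-")]
full_signal = pile_up(toy_reads, frag_size=6, genome_len=12, mode="full")
point_signal = pile_up(toy_reads, frag_size=6, genome_len=12, mode="point")
partial_signal = pile_up(toy_reads, frag_size=6, genome_len=12,
                         mode="partial", partial_width=2)
print(full_signal)
print(partial_signal)
print(point_signal)
```

Narrower retrieval concentrates the same reads on fewer bases, which is why partial and point retrieval give a sharper signal map than full-length retrieval.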

ChIP-seq: Study design

Background Control: Input or IgG

Input chromatin: sonicated/digested chromatin without immunoprecipitation
IgG: non-specific immunoprecipitation

Study Control:

Control exp sample: ChIP + input
Treated exp sample: ChIP + input

ChIP-seq: Peak calling

Goal: Identify regions in the genome enriched for sequence reads:

Compared to genomic background
Compared to input control

MACS: Model-based Analysis of ChIP-Seq

Read distribution along the genome:

Poisson distribution ($\lambda_{BG}$ = total tags / genome size)
Negative binomial distribution (MACS2)
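To make the Poisson model concrete, the sketch below computes the probability of seeing at least `k` tags in a window when tags are scattered uniformly at rate $\lambda_{BG}$. Scaling $\lambda_{BG}$ by the window width to get an expected count is an assumption of this example, the numbers are illustrative, and `poisson_upper_tail` is a hypothetical helper rather than a MACS function. It assumes `lam > 0` and moderate values of `lam`.

```python
import math

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), summing the upper tail directly."""
    if k <= 0:
        return 1.0
    # Start from the probability mass at k, computed in log space.
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = 0.0
    i = k
    while True:
        total += term
        i += 1
        term *= lam / i                      # pmf(i) = pmf(i - 1) * lam / i
        # Stop once past the mode and the terms no longer change the sum.
        if i > lam and term <= total * 1e-17:
            break
    return min(total, 1.0)

# Illustrative numbers only.
total_tags = 1_000_000
genome_size = 100_000_000
window = 200                                  # bp

lambda_bg = total_tags / genome_size          # tags per bp
expected = lambda_bg * window                 # expected tags per window
p_value = poisson_upper_tail(10, expected)    # observed 10 tags in the window
print(f"expected = {expected:.2f}, p-value = {p_value:.3g}")
```

Summing the upper tail directly, instead of computing one minus the lower tail, avoids losing precision when the p-value is very small.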

ChIP-seq shows local biases in the genome

Chromatin and sequencing bias
200-300 bp control windows contain too few tags for a stable estimate, but wider surrounding windows can be used
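One way to "look further", described in the MACS paper, is to estimate the background rate from the control sample in progressively wider windows around a candidate region (1 kb, 5 kb, 10 kb) and keep the largest estimate, so that local bias raises the bar for calling a peak. The sketch below illustrates that idea on a single chromosome. The function name, argument layout and toy data are assumptions for this example, not the MACS code.

```python
def local_lambda(control_positions, center, genome_size, peak_width,
                 window_sizes=(1000, 5000, 10000)):
    """Expected control tags in a peak-sized region, using the most
    conservative (largest) of the genome-wide and local estimates."""
    lam_bg = len(control_positions) / genome_size * peak_width
    candidates = [lam_bg]
    for w in window_sizes:
        lo, hi = center - w // 2, center + w // 2
        n = sum(1 for p in control_positions if lo <= p < hi)
        candidates.append(n / w * peak_width)  # rescale to the peak width
    return max(candidates)

# Toy control sample: one tag every 10 kb plus a biased cluster near 500,000.
genome_size = 1_000_000
control = list(range(0, genome_size, 10_000))
control += [500_000 + i for i in range(-25, 25)]

lam_biased = local_lambda(control, center=500_000,
                          genome_size=genome_size, peak_width=200)
lam_plain = local_lambda(control, center=255_000,
                         genome_size=genome_size, peak_width=200)
print(lam_biased, lam_plain)
```

Near the biased cluster the 1 kb window dominates and the expected count is far above the genome-wide rate. Away from it the local windows are nearly empty, so the genome-wide rate is used.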

Benjamini-Hochberg (B-H) adjustment to control the FDR: p-value → q-value
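The p-value to q-value conversion can be sketched with the standard Benjamini-Hochberg step-up procedure. This is a generic textbook version written for illustration. How a particular peak caller derives its q-values may differ, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return B-H adjusted p-values (q-values) in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues

peak_pvalues = [0.01, 0.04, 0.03, 0.005]
peak_qvalues = benjamini_hochberg(peak_pvalues)
print(peak_qvalues)
```

Each p-value is scaled by `m / rank`, where `rank` is its position in ascending order. The running minimum guarantees that a peak with a smaller p-value never gets a larger q-value than a peak with a larger one.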

Data Visualization

bedGraph to bigWig
macs2 output data
IGV

Quality Control

FRiP (Fraction of Reads in Peaks) score: 1-10% is normal for a TF
Number of peaks
- Number of peaks with high fold-enrichment, e.g., 5, 10, ...
- 2000
Sequence conservation
Fraction of peaks within regulatory regions: 80%
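The FRiP score can be computed directly from read positions and called peaks. The sketch below assumes reads are reduced to a hypothetical `(chrom, position)` pair, peaks are half-open `(chrom, start, end)` intervals, and peaks on the same chromosome do not overlap. The function name `frip` is invented for this example.

```python
import bisect

def frip(read_positions, peaks):
    """Fraction of reads whose position falls inside any peak."""
    by_chrom = {}
    for chrom, start, end in peaks:
        by_chrom.setdefault(chrom, []).append((start, end))
    for intervals in by_chrom.values():
        intervals.sort()
    in_peaks = 0
    for chrom, pos in read_positions:
        intervals = by_chrom.get(chrom, [])
        # Index of the last peak whose start is <= pos (or -1 if none).
        idx = bisect.bisect_right(intervals, (pos, float("inf"))) - 1
        if idx >= 0 and intervals[idx][0] <= pos < intervals[idx][1]:
            in_peaks += 1
    return in_peaks / len(read_positions)

toy_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
toy_reads = [
    ("chr1", 150), ("chr1", 199), ("chr1", 200), ("chr1", 550),
    ("chr2", 10), ("chr2", 60), ("chr3", 5), ("chr1", 99),
]
frip_score = frip(toy_reads, toy_peaks)
print(f"FRiP = {frip_score:.2f}")
```

Sorting the peaks once and using binary search keeps the lookup fast when there are millions of reads. Overlapping peaks would need to be merged first for this lookup to be correct.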

Biological interpretation: ChIP-seq captures a snapshot of binding patterns from a cell population

TF intrinsic property
Binding activity
Cellular heterogeneity
