RGT and its tools can be used to analyze ChIP-Seq data from transcription factor. We will use data from a biological study measuring histone modifications and the transcription factor PU.1 during dendritic cell differentiation . Figure below depicts all steps of the pipeline analyzing PU.1 ChIP-Seq data. First, a ChIP-Seq experiment produces reads around interaction sites of a protein of interest. After quality control, reads are aligned to the genome (step 1). This low level analysis can be performed with any general read mapper, such as bwa. This tutorial will not cover low level analysis steps.
Next, we use a peak caller to find regions in the genome, which are likely to be bound by the PU.1. A peak caller receives read alignments as input. Peak callers usually generate an intermediary representation of the read alignments called “genomic signal”. Next, they perform a statistical test to detect regions with higher reads than in other genomic regions. These candidate peaks represents the regions with DNA-protein interaction. In this tutorial, we will show how you can implement your own peak caller using functionality provided by RGT with very few lines (steps 2-3).
Usually several high level analyses are performed on candidate peaks. A typical analysis is to evaluate how other genomic data is associated to your ChIP-Seq peaks. We will show here how you can create line plots with RGT-Viz to explore association between PU.1 peaks and histone modifications. These line plots display average “genomic signals” of other ChIP-Seq experiments around the predicted PU.1 peaks (step 5). For ChIP-Seq experiments of transcription factor, it is relevant to find motifs of the transcription factor and its potential co-binding factors (step 4). We will show here how to use the Motif Analysis tool for this task.