ATAC-seq data preprocessing

In this step, we process scATAC-seq data (or bulk ATAC-seq data) to obtain the accessible promoter/enhancer DNA sequence. We can get the active proximal promoter/enhancer genome sequences by picking up the ATAC-seq peaks that exist around the transcription starting site (TSS). Distal cis-regulatory elements can be picked up using Cicero . Cicero analyzes scATAC-seq data to calculate a co-accessible score between peaks. We can identify cis-regulatory elements using Cicero’s co-access score and TSS information.

If you have bulk ATAC-seq data instead of scATAC-data, we’ll get only the proximal promoter/enhancer genome sequences.

A. Extract TF binding information from scATAC-seq data

If you have scATAC-seq data, you can get information on the distal cis-regulatory elements. This step uses Cicero and does not use celloracle. You need to get co-accessibility table in this analysis. Although we provide an example notebook here, you can analyze your data with Cicero in a different way if you are familiar with Cicero. If you have a question about Cicero, please read the documentation of Cicero for the detailed usage.

scATAC-seq analysis with Cicero and Monocle3

The jupyter notebook files and data used in this tutorial are available here .

R notebook

TSS annotation

The jupyter notebook files and data used in this tutorial are available here .

Python notebook

B. Extract TF binding information from bulk ATAC-seq data or Chip-seq data

Bulk DNA-seq data can be used to get the accessible promoter/enhancer sequences.

The jupyter notebook files and data used in this tutorial are available here .

Python notebook