Human Chromatin Accessibility During Development


Tissue Data - Seurat Objects

Each object contains both the gene-by-cell matrix used for cell type annotation (RNA assay) and the peak-by-cell matrix used for downstream analyses (peaks assay). Cell metadata includes:

Sampled Data

To compare cell types across organs, up to 800 cells were randomly sampled per cell type per tissue; where fewer than 800 cells of a given cell type were present in a given tissue, all cells were taken. The Seurat object contains a peak-by-cell matrix for 86,685 cells and 1,001,437 peaks (z-score filtered master list) in the peaks assay.
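The sampling rule above can be sketched in a few lines of Python. This is only an illustration of "up to 800 per cell type per tissue", not the pipeline that produced the object; the function name, the tuple layout and the fixed seed are all assumptions.

```python
import random
from collections import defaultdict


def sample_cells(cells, max_per_group=800, seed=0):
    """Return cell IDs, keeping at most max_per_group per (tissue, cell type).

    cells: iterable of (cell_id, tissue, cell_type) tuples (hypothetical layout).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    groups = defaultdict(list)
    for cell_id, tissue, cell_type in cells:
        groups[(tissue, cell_type)].append(cell_id)

    sampled = []
    for key in sorted(groups):
        ids = groups[key]
        if len(ids) <= max_per_group:
            # Fewer cells than the cap (or exactly the cap): take all of them.
            sampled.extend(ids)
        else:
            # Otherwise draw the cap without replacement.
            sampled.extend(rng.sample(ids, max_per_group))
    return sampled
```

A group with 1,000 cells contributes 800 IDs, while a group with 5 cells contributes all 5.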

Cell Metadata

Cell metadata for all cells, including tissue of origin, donor ID, estimated gestational age, sex, experimental batch, total number of reads, total number of deduplicated reads, total number of deduplicated reads that fall into peaks, total number of deduplicated reads that fall into a 2 kb window centered on a TSS, fraction of reads in peaks (frip), fraction of reads in the 2 kb window centered on a TSS (frit), number of peaks that contain at least one read (nFeature_peaks), Louvain cluster ID, UMAP coordinates from the per-tissue (not combined) UMAP visualizations, and cell type annotation.
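As a rough illustration of how the two fraction columns relate to the read counts, the sketch below derives frip and frit for one cell. It assumes the denominator is the total number of deduplicated reads, which the description does not state explicitly; the function and argument names are hypothetical.

```python
def qc_fractions(dedup_reads, dedup_reads_in_peaks, dedup_reads_in_tss):
    """Return frip and frit for one cell.

    Assumption: both fractions are taken over deduplicated reads.
    """
    if dedup_reads <= 0:
        raise ValueError("dedup_reads must be positive")
    return {
        "frip": dedup_reads_in_peaks / dedup_reads,
        "frit": dedup_reads_in_tss / dedup_reads,
    }
```

For example, a cell with 1,000 deduplicated reads, 400 of them in peaks and 150 in TSS windows, would get frip = 0.4 and frit = 0.15 under this assumption.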

Masterlist of peaks/regions with motif occurrences

For each region within a merged set of 1.05 M accessibility peaks, the chromosomal location, peak width, and motif occurrences are provided. Motif occurrences are reported for motifs in the JASPAR vertebrate motif database at a p-value threshold of 1e-7.

Specificity scores

Specificity scores calculated for each region/cell type pair using Jensen-Shannon divergence. A higher score indicates a peak that is more specific to a given cell type.
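The exact formula is not given here, so the following is a sketch of one common Jensen-Shannon formulation rather than the method used for this file. It normalizes a region's accessibility across cell types into a distribution, compares it with an idealized distribution in which the region is accessible in a single cell type, and reports 1 minus the square root of the divergence. All function names are hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy in bits; zero entries contribute nothing."""
    return -sum(x * math.log2(x) for x in dist if x > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so the value lies in [0, 1])."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2


def specificity_scores(accessibility):
    """One score per cell type for a single region.

    accessibility: non-negative accessibility values, one per cell type.
    Assumed scoring: 1 - sqrt(JSD(p, one-hot distribution for the cell type)).
    """
    total = sum(accessibility)
    if total <= 0:
        raise ValueError("region has no accessibility in any cell type")
    p = [x / total for x in accessibility]
    scores = []
    for i in range(len(p)):
        ideal = [1.0 if j == i else 0.0 for j in range(len(p))]
        jsd = max(0.0, js_divergence(p, ideal))  # guard against tiny negatives
        scores.append(1 - math.sqrt(jsd))
    return scores
```

Under this formulation, a region accessible in only one cell type scores 1 for that cell type and 0 for the others, which matches the statement that higher scores indicate more cell-type-specific peaks.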

Motif enrichment across cell type

For each of the 579 motifs from the JASPAR vertebrate database, the enrichment in accessible sites in each of the 54 main cell types was determined using a linear regression model. The fold-change of the mean motif occurrence in sites of a given cell type relative to the rest of the dataset and the matching Benjamini-Hochberg-adjusted p-values are reported for each motif-cell type pair.
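To make the two reported quantities concrete, the sketch below computes a fold-change from two means and applies the standard Benjamini-Hochberg step-up adjustment to a list of p-values. It does not reproduce the linear regression model itself; the function names are hypothetical.

```python
def fold_change(mean_in_cell_type, mean_in_rest):
    """Ratio of mean motif occurrence in one cell type's sites to the rest."""
    if mean_in_rest <= 0:
        raise ValueError("mean_in_rest must be positive")
    return mean_in_cell_type / mean_in_rest


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted
```

The adjustment is applied across a whole set of tests at once. Whether the file was adjusted per motif, per cell type, or over all motif-cell type pairs is not stated here.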

Cicero co-accessibility scores by cell type

Comma-separated table of Cicero co-accessibility scores greater than 0.1, generated for each of 101 cell type/tissue pairs. The first two columns are the hg19 coordinates of the two tested sites. Each remaining column gives the co-accessibility scores for one of the cell types. NA values indicate either that the pair of sites was not tested because of insufficient depth or that the co-accessibility value was less than 0.1.
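A table laid out this way can be read with Python's standard csv module, as in the sketch below. It assumes the file has a header row naming the score columns and that missing values are written literally as NA; the column names and coordinate format in the usage example are invented for illustration.

```python
import csv
import io


def read_coaccessibility(handle):
    """Yield (site1, site2, scores) for each row of the table.

    scores maps each score-column name to a float, or None where the file has NA.
    Assumes a header row; the first two columns are the two site coordinates.
    """
    reader = csv.reader(handle)
    header = next(reader)
    score_columns = header[2:]
    for row in reader:
        scores = {
            name: (None if value == "NA" else float(value))
            for name, value in zip(score_columns, row[2:])
        }
        yield row[0], row[1], scores


# Hypothetical two-column example; real column names will differ.
example = "site1,site2,typeA,typeB\nchr1_100_200,chr1_5000_5100,0.25,NA\n"
rows = list(read_coaccessibility(io.StringIO(example)))
```

Because NA covers both untested pairs and scores below 0.1, a None value here should not be read as evidence that two sites are not co-accessible.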

Cell-type specific accessibility as bigwig files

Fragment endpoints were extended 100 bp in each direction; reads were then summed across all cells in a cell type and normalized to the total number of cells in that cell type.
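The signal computation described above can be approximated as follows. This sketch treats each fragment endpoint as the center of a 200 bp half-open window, counts overlapping windows per base, and divides by the number of cells. The interval convention and function name are assumptions, and a real pipeline would write bigwig output with a dedicated tool rather than a Python dictionary.

```python
from collections import Counter


def endpoint_coverage(fragments, n_cells, extend=100):
    """Per-base normalized coverage on a single chromosome.

    fragments: (start, end) pairs pooled from all cells of one cell type.
    Each endpoint is widened to [endpoint - extend, endpoint + extend).
    """
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    coverage = Counter()
    for start, end in fragments:
        for endpoint in (start, end):
            for pos in range(max(0, endpoint - extend), endpoint + extend):
                coverage[pos] += 1
    # Normalize the summed signal to the number of cells in the cell type.
    return {pos: count / n_cells for pos, count in coverage.items()}
```

With a single fragment spanning 1000 to 1150 and two cells, positions covered by both extended endpoints get a value of 1.0 and positions covered by one get 0.5.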