DWEN: A novel method for accurate estimation of cell type compositions from bulk data samples


Advances in single-cell RNA sequencing (scRNA-seq) technologies have allowed us to study the heterogeneity of cell populations. The cell type composition of a tissue can vary greatly between hosts and reflects the condition of the host from which the sample was collected. However, the high sequencing cost and the lack of fresh tissues make single-cell approaches less appealing. In many cases, it is practically impossible to generate single-cell data for a large number of subjects, which makes it challenging to monitor changes in cell type composition across diseases. Here we introduce a novel approach, named Deconvolution using Weighted Elastic Net (DWEN), that allows researchers to accurately estimate cell type compositions from bulk data samples without the need to generate single-cell data. It also allows bulk data collected from rare conditions to be re-analyzed to extract more in-depth, cell-type-level insights. The approach consists of two modules. The first module constructs the cell type signature matrix from single-cell data, while the second module estimates the cell type compositions of the input bulk samples. In an extensive analysis using 20 datasets generated from scRNA-seq data of different human tissues, we demonstrate that DWEN outperforms current state-of-the-art methods in estimating the cell type compositions of bulk samples.
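To make the two-module idea concrete, the following is a minimal sketch under stated assumptions, not the DWEN implementation. It assumes the signature matrix is the per-cell-type mean expression, that the weights are per-gene weights on the squared error, and that proportions are obtained from a non-negative elastic net fit rescaled to sum to one. The function names `build_signature` and `deconvolve`, their parameters, and the toy data are hypothetical and chosen for illustration only.

```python
def build_signature(cells, labels):
    """Module 1 sketch (assumption): mean expression of each gene per cell type."""
    types = sorted(set(labels))
    n_genes = len(cells[0])
    signature = []
    for t in types:
        members = [c for c, lab in zip(cells, labels) if lab == t]
        signature.append([sum(c[g] for c in members) / len(members)
                          for g in range(n_genes)])
    return types, signature  # signature[k][g]: gene g in cell type k


def deconvolve(bulk, signature, weights=None, alpha=0.0, l1_ratio=0.5, n_iter=500):
    """Module 2 sketch (assumption): non-negative weighted elastic net.

    Minimises 0.5 * sum_g w_g * (bulk_g - sum_k beta_k * signature[k][g])**2
              + alpha * l1_ratio * sum_k beta_k
              + 0.5 * alpha * (1 - l1_ratio) * sum_k beta_k**2
    subject to beta_k >= 0, using cyclic coordinate descent.
    """
    n_genes, n_types = len(bulk), len(signature)
    w = weights if weights is not None else [1.0] * n_genes
    l1, l2 = alpha * l1_ratio, alpha * (1.0 - l1_ratio)
    beta = [0.0] * n_types
    for _ in range(n_iter):
        for k in range(n_types):
            rho = z = 0.0
            for g in range(n_genes):
                # Prediction from all cell types except k
                others = sum(beta[j] * signature[j][g]
                             for j in range(n_types) if j != k)
                rho += w[g] * signature[k][g] * (bulk[g] - others)
                z += w[g] * signature[k][g] ** 2
            denom = z + l2
            # Soft-threshold, clipped at zero to keep coefficients non-negative
            beta[k] = max(0.0, rho - l1) / denom if denom > 0 else 0.0
    total = sum(beta)
    # Rescale coefficients so they read as proportions
    return [b / total for b in beta] if total > 0 else beta


# Toy data: 4 cells, 3 genes, 2 cell types (all values made up)
cells = [[12.0, 0.0, 4.0], [8.0, 0.0, 6.0], [0.0, 9.0, 5.0], [0.0, 7.0, 5.0]]
labels = ["A", "A", "B", "B"]
types, signature = build_signature(cells, labels)

# Synthetic bulk sample mixed as 30% type A and 70% type B
bulk = [0.3 * a + 0.7 * b for a, b in zip(signature[0], signature[1])]
props = deconvolve(bulk, signature)
print(dict(zip(types, (round(p, 3) for p in props))))
```

With `alpha=0.0` the fit reduces to weighted non-negative least squares, so a noise-free mixture is recovered closely; a positive `alpha` shrinks the coefficients and trades some bias for stability on noisy data.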

The 13th International Conference on Knowledge and Systems Engineering (KSE 2021)
Duc Tran
Bioinformatics Scientist