THetA: Tumor Heterogeneity Analysis


THetA (_T_umor _Het_erogeneity _A_nalysis) is an algorithm that estimates tumor purity and clonal/subclonal copy number aberrations directly from high-throughput DNA sequencing data. The latest release is called THetA2 and includes a number of improvements over previous versions.


You can download the latest version of THetA (now called THetA2) from the THetA GitHub project.


For support, please see the THetA Google Group.


THetA2 is described in the following publication:

L. Oesper, G. Satas, and B.J. Raphael. (2014) Quantifying Tumor Heterogeneity in Whole-Genome and Whole-Exome Sequencing Data. Bioinformatics 30 (24):3532-3540. [Publisher Link]

THetA is described in the following publications:

L. Oesper, A. Mahmoody, and B.J. Raphael. (2013) Inferring Intra-Tumor Heterogeneity from High-Throughput DNA Sequencing Data. (Abstract) Proceedings of the 17th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2013), LNCS 7821, Pages 171-172. [PDF]

L. Oesper, A. Mahmoody, and B.J. Raphael. (2013) THetA: Inferring intra-tumor heterogeneity from high-throughput DNA sequencing data. Genome Biology. 14:R80. [Publisher Link | Supplemental Material]

The THetA2 paper includes a likelihood model for B-allele frequency (BAF), which may be used to distinguish between multiple reconstructions returned using read depth alone. The latest release includes code to compute this likelihood model. The following are support files which may be helpful when computing this likelihood model for observed BAFs.

Simulated data used in the THetA2 paper can be found here.
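
As a rough illustration of why BAF can separate reconstructions that read depth alone cannot, here is a minimal Python sketch. It is not the THetA2 likelihood code and does not reproduce the model from the paper. It assumes a single tumor population mixed with normal cells, a known minor-allele copy number for the segment, and a simple Gaussian noise model with a fixed standard deviation. The function names, the example purities and copy numbers, and the observed values are all hypothetical.

```python
import math

def expected_baf(purity, minor_cn, total_cn):
    """Expected minor-allele fraction at a heterozygous germline SNP.

    Assumes a two-population mixture: normal cells carry 1 B copy out of 2,
    tumor cells carry minor_cn B copies out of total_cn.
    """
    b_copies = (1.0 - purity) * 1 + purity * minor_cn
    all_copies = (1.0 - purity) * 2 + purity * total_cn
    return b_copies / all_copies

def mean_copy_number(purity, total_cn):
    """Average copy number of the mixture, which is what read depth reflects."""
    return (1.0 - purity) * 2 + purity * total_cn

def gaussian_log_likelihood(observed, mean, sigma):
    """Log-likelihood of the observed BAFs under N(mean, sigma^2) (assumed noise model)."""
    const = -0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return sum(const - (x - mean) ** 2 / (2.0 * sigma ** 2) for x in observed)

# Two hypothetical reconstructions with the same mean copy number (2.6),
# so read depth alone cannot tell them apart.
candidates = {
    "A": {"purity": 0.6, "minor_cn": 1, "total_cn": 3},
    "B": {"purity": 0.3, "minor_cn": 2, "total_cn": 4},
}

# Hypothetical observed minor-allele fractions for SNPs in the segment.
observed_bafs = [0.37, 0.39, 0.40, 0.38]
sigma = 0.03  # assumed noise level, not taken from the paper

scores = {
    name: gaussian_log_likelihood(observed_bafs, expected_baf(**c), sigma)
    for name, c in candidates.items()
}
best = max(scores, key=scores.get)
print(scores, "preferred:", best)
```

Both candidates predict the same read depth for the segment, but they predict different BAFs (about 0.385 for A and 0.5 for B), so the observed values favor one over the other. For real data, use the likelihood code shipped with the latest release.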

Previous versions

These versions are kept for archival purposes only. We strongly recommend downloading the latest version from the link above.