Modules
========

Each of the following modules implements one of the key functions in the Multi-Dendrix pipeline. Each can be run as a standalone program, or imported and used as part of a larger pipeline. The Multi-Dendrix package has the following structure.

Multi-Dendrix
+++++++++++++

This is the main module that implements the integer linear program (ILP) described in the Multi-Dendrix paper. It includes the functions for loading mutation data and interfacing with CPLEX to calculate the optimal collection of gene sets for given parameters.

Requirements:

* IBM's CPLEX Python module.

.. currentmodule:: multi_dendrix

.. autosummary::
    :toctree: module_docs/multi_dendrix

    ILP
    A
    W
    load_mutation_data
    load_mutation_data_w_cutoff
    white_and_blacklisting

(Sub)type-specific mutations
++++++++++++++++++++++++++++
(**multi_dendrix.subtypes**)

This module analyzes mutation data for genes (or mutation classes) that are targeted in certain (sub)types more than others. It uses Fisher's exact test to assess statistical significance (described in :func:`multi_dendrix.subtypes.subtype_specificity`).

Requirements:

* Either SciPy v0.11 or the fisher 0.1.4 module.

.. currentmodule:: multi_dendrix.subtypes

.. autosummary::
    :toctree: module_docs/subtype_specific_genes

    subtype_analysis
    subtype_specificity
    ty_contingency_table
    keep_significant
    load_patient2ty_file

Permute mutation data
++++++++++++++++++++++
(**multi_dendrix.permute.mutation_data**)

This module includes functions for permuting mutation data. Permuted mutation matrices are used to calculate the statistical significance of collections identified by Multi-Dendrix. Mutation data is first represented as a bipartite graph, where edges represent the mutation of a particular gene (or mutation class) in a particular patient. The method for permuting the data is described in :func:`permute_mutation_data.permute_mutation_data`.
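
As a rough illustration of the bipartite representation described above, the following sketch shuffles gene-patient edges by repeated double edge swaps, which presumably is the idea behind ``bipartite_double_edge_swap``. It is not the package's implementation: the function name ``permute_edges``, its parameters, and the toy data are hypothetical, and it uses plain Python rather than NetworkX. Each accepted swap exchanges the patients of two edges, so every gene and every patient keeps its original number of mutations.

```python
import random
from collections import Counter


def permute_edges(edges, num_swaps, seed=None, max_tries=100000):
    """Shuffle (gene, patient) edges while preserving all node degrees.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    edge_list = list(dict.fromkeys(edges))  # drop duplicate edges, keep order
    edge_set = set(edge_list)
    swaps = tries = 0
    while swaps < num_swaps and tries < max_tries and len(edge_list) >= 2:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        (g1, p1), (g2, p2) = edge_list[i], edge_list[j]
        # Proposed swap: replace g1-p1, g2-p2 with g1-p2, g2-p1.
        # Reject it if it changes nothing or duplicates an existing edge.
        if g1 == g2 or p1 == p2:
            continue
        if (g1, p2) in edge_set or (g2, p1) in edge_set:
            continue
        edge_set -= {(g1, p1), (g2, p2)}
        edge_set |= {(g1, p2), (g2, p1)}
        edge_list[i], edge_list[j] = (g1, p2), (g2, p1)
        swaps += 1
    return edge_list


# Toy mutation data: each pair means "gene is mutated in patient".
edges = [("TP53", "s1"), ("TP53", "s2"), ("KRAS", "s2"),
         ("KRAS", "s3"), ("EGFR", "s1"), ("PTEN", "s4")]
permuted = permute_edges(edges, num_swaps=10, seed=0)

# Every gene and every patient keeps its original number of mutations.
assert Counter(g for g, _ in permuted) == Counter(g for g, _ in edges)
assert Counter(p for _, p in permuted) == Counter(p for _, p in edges)
```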
Note that the mutation data provided as input to this module should be restricted to only the genes and patients used by the Multi-Dendrix algorithm *after processing*.

Requirements:

* NetworkX.

.. currentmodule:: multi_dendrix.permute.mutation_data

.. autosummary::
    :toctree: module_docs/permute_mutation_data

    permute_mutation_data
    construct_mutation_graph
    bipartite_double_edge_swap
    graph_to_mutation_data

Matrix permutation test
++++++++++++++++++++++++
(**multi_dendrix.evaluate.matrix**)

This module contains the functions for performing the matrix permutation (statistical significance) test. The matrix permutation test uses an empirical distribution of mutation data (generated by the permute mutation data module above) to evaluate the statistical significance of a collection of gene sets identified by Multi-Dendrix.

Note that the matrix permutation test *does not filter genes or patients* (e.g. with the :func:`multi_dendrix.white_and_blacklisting` function). The mutation data provided to the :func:`permute_mutation_data.permute_mutation_data` function should already be restricted to the genes / patients used as input to Multi-Dendrix.

Requirements:

* IBM's CPLEX Python module, for repeated calls of :func:`multi_dendrix.multi_dendrix`.

.. currentmodule:: multi_dendrix.evaluate.matrix

.. autosummary::
    :toctree: module_docs/matrix_permutation_test

    matrix_permutation_test
    load_w_prime
    load_permuted_matrices

Permute protein-protein interaction network
+++++++++++++++++++++++++++++++++++++++++++
(**multi_dendrix.permute.ppi_network**)

This module permutes a protein-protein interaction (PPI) network while retaining its degree distribution. Permuted PPI networks are used to empirically determine the enrichment of collections of gene sets (and of individual gene sets) for protein interactions, as described in the *Direct Interactions Test* below.

Requirements:

* NetworkX.

.. currentmodule:: multi_dendrix.permute.ppi_network

.. autosummary::
    :toctree: module_docs/permute_ppi_network

    permute_network
    load_network

Direct interactions test
++++++++++++++++++++++++
(**multi_dendrix.evaluate.network**)

This module implements the *Direct Interactions Test* as described in the Multi-Dendrix paper. For a given collection of gene sets, the *Direct Interactions Test* assesses its enrichment for interactions *within* gene sets in iRefIndex compared to an empirical distribution of permuted iRefIndex networks. Optionally, the average pairwise distance can be used as the test statistic instead of the number of interactions.

Requirements:

* NetworkX.

.. currentmodule:: multi_dendrix.evaluate.network

.. autosummary::
    :toctree: module_docs/network_tests

    evaluate_collection
    direct_interactions_test
    direct_interactions_stat
    count_interactions
    interact
    eval_gene_sets_by_interactions
    num_interactions_in_gene_set
    avg_pair_dist_test
    avg_pair_dist_ratio
    eval_gene_sets_by_dist
    dist
    sum_dist
    avg_pair_dist_of_gene_set
    remove_name_annotation
    load_collection

Core modules
++++++++++++++++++++++++
(**multi_dendrix.core_modules**)

This module identifies the "core modules" from a set of Multi-Dendrix runs. In other words, given multiple collections of gene sets identified by Multi-Dendrix, this module outputs sets of genes that appear together in more than *S* parameter settings (default: *S* = 1).

Requirements:

* NetworkX.

.. currentmodule:: multi_dendrix.core_modules

.. autosummary::
    :toctree: module_docs/core_modules

    extract

Output functions
+++++++++++++++++
(**multi_dendrix.output**)

This module contains many of the output functions used as part of the :doc:`/pipeline`. Its purpose is to convert Multi-Dendrix results into text and HTML output. In the future, I hope to add functions for rendering mutation matrices as SVGs.

Requirements:

* NetworkX.
* GraphViz.