multi_dendrix.load_mutation_data

multi_dendrix.load_mutation_data(file_loc, patient_whitelist=None, gene_whitelist=None)[source]

Loads the mutation data in the given file.

Parameters:
  • file_loc (string) – Location of mutation data file.
  • patient_whitelist (dictionary) – Maps patient IDs to whether they should be included in the analyzed mutation data.
  • gene_whitelist (dictionary) – Maps genes to whether they should be included in the analyzed mutation data.
Return type: Tuple

Returns:
  • m (int) - number of genes.
  • n (int) - number of patients.
  • genes (list) - genes in the mutation data.
  • patients (list) - patients in the mutation data.
  • mutation2patients (dictionary) - mapping of genes to the patients in which they are mutated.
  • patient2mutations (dictionary) - mapping of patients to the genes they have mutated.
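
As a rough illustration of how the returned tuple might be consumed, the sketch below unpacks a hard-coded tuple with the six-element shape listed above and derives a per-gene mutation frequency. The literal values and the names `frequency` and `consistent` are assumptions made for this sketch only; in practice the tuple would come from load_mutation_data(file_loc).

```python
# Hard-coded stand-in for the value returned by load_mutation_data(file_loc).
mutation_data = (
    5, 2,
    ["G1", "G2", "G3", "G4", "G5"],
    ["TCGA-01", "TCGA-02"],
    {"G1": ["TCGA-01", "TCGA-02"], "G2": ["TCGA-02"], "G3": ["TCGA-01"],
     "G4": ["TCGA-02"], "G5": ["TCGA-01"]},
    {"TCGA-01": ["G1", "G3", "G5"], "TCGA-02": ["G2", "G1", "G4"]},
)

# The two leading counts are not needed here, so they are discarded.
_, _, genes, patients, mutation2patients, patient2mutations = mutation_data

# Fraction of patients in which each gene is mutated.
frequency = {g: len(mutation2patients[g]) / len(patients) for g in genes}

# Sanity check: both dictionaries should describe the same gene-patient pairs.
pairs_by_gene = {(g, p) for g, ps in mutation2patients.items() for p in ps}
pairs_by_patient = {(g, p) for p, gs in patient2mutations.items() for g in gs}
consistent = pairs_by_gene == pairs_by_patient
```

The two dictionaries are inverse views of the same data, so a check like `consistent` can catch a hand-edited or partially filtered result.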
Example:
A view into example data:
>>> file_loc = 'test.m2'
>>> open(file_loc).readlines()
["TCGA-01\tG1\tG3\tG5\n", "TCGA-02\tG2\tG1\tG4\n"]
Mutation data with no whitelisting:
>>> load_mutation_data(file_loc)
(5, 2, ["G1", "G2", "G3", "G4", "G5"], ["TCGA-01", "TCGA-02"],
    {"G1" : ["TCGA-01", "TCGA-02"], "G2" : ["TCGA-02"], "G3" : ["TCGA-01"],
    "G4" : ["TCGA-02"], "G5" : ["TCGA-01"]},
    {"TCGA-01" : ["G1", "G3", "G5"], "TCGA-02" : ["G2", "G1", "G4"]})
Mutation data with patient whitelisting only:
>>> patient_wlst = {"TCGA-01" : False, "TCGA-02" : True}
>>> load_mutation_data(file_loc, patient_wlst)
(3, 1, ["G1", "G2", "G4"], ["TCGA-02"],
    {"G1" : ["TCGA-02"], "G2" : ["TCGA-02"], "G4" : ["TCGA-02"]},
    {"TCGA-02" : ["G2", "G1", "G4"]})
Mutation data with patient and gene whitelisting:
>>> gene_wlst = {"G1" : True, "G2" : True, "G4" : False}
>>> load_mutation_data(file_loc, patient_wlst, gene_wlst)
(2, 1, ["G1", "G2"], ["TCGA-02"], {"G1" : ["TCGA-02"], "G2" : ["TCGA-02"]},
    {"TCGA-02" : ["G2", "G1"]})
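
The parsing and whitelisting behaviour shown in the examples can be approximated with the minimal sketch below. It is not the Multi-Dendrix source: the function name `load_mutation_data_sketch` is hypothetical, the two leading counts follow the order shown in the example output (genes, then patients), and the sketch assumes that once a whitelist is supplied, any ID that is absent from it or mapped to False is excluded.

```python
import os
import tempfile


def load_mutation_data_sketch(file_loc, patient_whitelist=None, gene_whitelist=None):
    """Illustrative parser for tab-separated 'patient<TAB>gene...' lines."""
    mutation2patients, patient2mutations = {}, {}
    with open(file_loc) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            patient, mutations = fields[0], fields[1:]
            if not patient:
                continue  # skip blank lines
            # Assumption: absent or False whitelist entries are excluded.
            if patient_whitelist is not None and not patient_whitelist.get(patient, False):
                continue
            if gene_whitelist is not None:
                mutations = [g for g in mutations if gene_whitelist.get(g, False)]
            patient2mutations[patient] = mutations
            for gene in mutations:
                mutation2patients.setdefault(gene, []).append(patient)
    genes = sorted(mutation2patients)
    patients = sorted(patient2mutations)
    return len(genes), len(patients), genes, patients, mutation2patients, patient2mutations


# Recreate the two-line example file in a temporary directory and load it.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "test.m2")
    with open(demo_path, "w") as handle:
        handle.write("TCGA-01\tG1\tG3\tG5\nTCGA-02\tG2\tG1\tG4\n")
    full = load_mutation_data_sketch(demo_path)
    filtered = load_mutation_data_sketch(
        demo_path,
        {"TCGA-01": False, "TCGA-02": True},
        {"G1": True, "G2": True, "G4": False},
    )
```

Sorting the gene and patient lists is a choice made here to keep the output deterministic; the per-patient gene lists keep the order in which genes appear in the file.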

See also: white_and_blacklisting(), load_mutation_data_w_cutoff()
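
Both whitelist parameters are dictionaries that map an ID to a boolean. When the IDs to keep are available as a plain list, a small helper can build such a dictionary. The helper name `make_whitelist` is hypothetical and is not part of Multi-Dendrix.

```python
def make_whitelist(all_ids, keep_ids):
    """Map every ID in all_ids to True if it is in keep_ids, else False."""
    keep = set(keep_ids)
    return {item: item in keep for item in all_ids}


patient_wlst = make_whitelist(["TCGA-01", "TCGA-02"], ["TCGA-02"])
gene_wlst = make_whitelist(["G1", "G2", "G4"], ["G1", "G2"])
```

The resulting dictionaries have the same shape as `patient_wlst` and `gene_wlst` in the examples above and could be passed as the second and third arguments.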