multi_dendrix.load_mutation_data_w_cutoff

multi_dendrix.load_mutation_data_w_cutoff(file_loc, patient_wlst=None, gene_wlist=None, cutoff=0)

Loads the mutation data in the given file, restricting to genes with a given mutation frequency.

Parameters:
  • file_loc (string) – Location of mutation data file.
  • patient_wlst (dictionary) – Maps patient IDs to whether they should be included in the analyzed mutation data.
  • gene_wlist (dictionary) – Maps genes to whether they should be included in the analyzed mutation data.
  • cutoff (int) – Minimum mutation frequency (the number of patients in which the gene is mutated) a gene must have to be included in the analyzed mutation data. The default of 0 keeps every gene.
Returns: Mutation data tuple (see load_mutation_data()).

Example:
A view into the example data:
>>> file_loc = 'test.m2'
>>> open(file_loc).readlines() # view of the data
["TCGA-01\tG1\tG3\tG5\n", "TCGA-02\tG2\tG1\tG4\n"]
Mutation data with no cutoff:
>>> load_mutation_data_w_cutoff(file_loc, cutoff=0) # mutation data with no cutoff
(5, 2, ["G1", "G2", "G3", "G4", "G5"], ["TCGA-01", "TCGA-02"],
    {"G1" : ["TCGA-01", "TCGA-02"], "G2" : ["TCGA-02"], "G3" : ["TCGA-01"],
    "G4" : ["TCGA-02"], "G5" : ["TCGA-01"]},
    {"TCGA-01" : ["G1", "G3", "G5"], "TCGA-02" : ["G2", "G1", "G4"]})
Mutation data with a cutoff of 2:
>>> load_mutation_data_w_cutoff(file_loc, cutoff=2)
(1, 2, ["G1"], ["TCGA-01", "TCGA-02"], {"G1" : ["TCGA-01", "TCGA-02"]},
    {"TCGA-01" : ["G1"], "TCGA-02" : ["G1"]})
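
To make the effect of cutoff concrete, the following is a minimal sketch of the filtering step, not the library's actual implementation. The function name cutoff_filter_sketch is hypothetical, and the sketch makes several assumptions: the input is the tab-separated format shown above (patient ID followed by mutated genes), genes and patients are returned in sorted order, whitelists are ignored, and patients are kept even if none of their genes pass the cutoff.

```python
from collections import defaultdict


def cutoff_filter_sketch(lines, cutoff=0):
    """Hypothetical illustration of frequency-based gene filtering.

    `lines` is an iterable of tab-separated strings: a patient ID
    followed by the genes mutated in that patient.
    """
    patient_to_genes = {}
    gene_to_patients = defaultdict(list)

    # Parse each row into the two lookup tables.
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        patient, genes = fields[0], fields[1:]
        patient_to_genes[patient] = genes
        for gene in genes:
            gene_to_patients[gene].append(patient)

    # Keep only genes mutated in at least `cutoff` patients.
    kept = {g: ps for g, ps in gene_to_patients.items() if len(ps) >= cutoff}

    # Drop the filtered-out genes from each patient's list
    # (assumption: patients themselves are never removed).
    patient_to_genes = {
        p: [g for g in gs if g in kept] for p, gs in patient_to_genes.items()
    }

    genes = sorted(kept)
    patients = sorted(patient_to_genes)
    return len(genes), len(patients), genes, patients, kept, patient_to_genes
```

Calling cutoff_filter_sketch on the two example rows with cutoff=2 retains only G1, because it is the only gene mutated in both patients.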

See also: white_and_blacklisting(), load_mutation_data()