methylcheck.qc_signal_intensity

methylcheck.qc_signal_intensity(data_containers=None, path=None, meth=None, unmeth=None, poobah=None, palette=None, noob=True, silent=False, verbose=False, plot=True, cutoff_line=True, bad_sample_cutoff=11.5, return_fig=False)

Suggests sample outliers based on methylated and unmethylated signal intensity.

path
Path to CSV files processed using methylprep. Each per-sample file has “noob_meth” and “noob_unmeth” columns that this function can use. The noob option controls whether the noob-corrected or the uncorrected values are used.
data_containers
Output from the methylprep.run_pipeline() command when run in a script or notebook. You can also recreate the list of data containers using methylcheck.load(<filepath>, 'meth').
(meth and unmeth)
If you chose process --all, you can load the raw intensities like this and pass them in:
    meth = pd.read_pickle('meth_values.pkl')
    unmeth = pd.read_pickle('unmeth_values.pkl')
This will run the fastest.
(meth and unmeth and poobah)

if poobah=None (default): does nothing.
if poobah=False: suppresses this color.
if poobah=dataframe: color-codes samples according to percent probe failure range, but only if you pass in meth and unmeth dataframes too, not a data_containers object.
if poobah=True: looks for poobah_values.pkl in the path provided.

cutoff_line: True draws the cutoff line (a diagonal line on the plot); False omits it.
bad_sample_cutoff (default 11.5): sets the cutoff for determining good vs bad samples, based on signal intensities of the meth and unmeth fluorescence channels. The value 10.5 was borrowed from minfi’s internal defaults.
noob: use noob-corrected meth/unmeth values.
verbose: additional messages.
plot: if True (default), shows a plot. If False, this function returns the median values per sample of meth and unmeth probes.
return_fig (default False): if True and plot is True, returns a figure object instead of showing the plot.
compare: if the processed data contains both noob and uncorrected values, plots both in different colors.
palette: if using poobah to color-code, you can specify a Seaborn palette to use.
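As a usage illustration, the fastest route described above (passing meth and unmeth dataframes directly) might be wrapped as follows. This is a minimal sketch: the helper name run_signal_intensity_qc is hypothetical, the pickle file names are the ones mentioned in the parameter notes, and only keyword arguments from the signature shown at the top are used.

```python
def run_signal_intensity_qc(meth_path="meth_values.pkl",
                            unmeth_path="unmeth_values.pkl",
                            cutoff=11.5):
    """Hypothetical helper: load raw intensities and run the QC without plotting."""
    # Imported lazily so that defining this helper does not itself
    # require pandas or methylcheck to be installed.
    import pandas as pd
    import methylcheck

    # Raw intensity tables, as saved when processing with the --all option.
    meth = pd.read_pickle(meth_path)
    unmeth = pd.read_pickle(unmeth_path)

    # plot=False asks for data back instead of a displayed figure.
    return methylcheck.qc_signal_intensity(
        meth=meth,
        unmeth=unmeth,
        bad_sample_cutoff=cutoff,
        plot=False,
    )
```

Calling run_signal_intensity_qc() from the directory holding the two pickle files would then hand back the function's result for inspection, without opening a plot window.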

Returns:
A dictionary of data about good/bad samples based on signal intensity.
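To make the good/bad distinction concrete, here is a small sketch of a minfi-style rule: take the log2 of each sample's median methylated and unmethylated intensity, average the two, and flag the sample when that average falls below bad_sample_cutoff. That this matches methylcheck's internal calculation is an assumption, not something the text confirms; the function name flag_bad_samples, the mMed/uMed keys, and the toy intensities are all illustrative.

```python
import math
import statistics


def flag_bad_samples(meth, unmeth, bad_sample_cutoff=11.5):
    """Illustrative only. meth/unmeth map sample name -> list of probe intensities."""
    summary = {}
    for sample in meth:
        # log2 of the per-sample median intensity in each channel
        m_med = math.log2(statistics.median(meth[sample]))
        u_med = math.log2(statistics.median(unmeth[sample]))
        # Assumed rule: a sample is "bad" when the mean of the two
        # log2 medians falls below the cutoff.
        is_bad = (m_med + u_med) / 2 < bad_sample_cutoff
        summary[sample] = {"mMed": m_med, "uMed": u_med, "bad": is_bad}
    return summary


# Toy data: s1 has strong signal in both channels, s2 has weak signal.
meth = {"s1": [4096, 8192, 16384], "s2": [256, 512, 1024]}
unmeth = {"s1": [2048, 4096, 8192], "s2": [128, 256, 512]}
summary = flag_bad_samples(meth, unmeth)
```

With these toy values, s1 averages 12.5 on the log2 scale and passes the 11.5 cutoff, while s2 averages 8.5 and is flagged.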
TODO:
doesn’t return both types of data if using compare and not plotting
doesn’t give good error message for compare