methylcheck.beta_density_plot

methylcheck.beta_density_plot(df, verbose=False, save=False, silent=False, reduce=0.1, plot_title=None, ymax=None, return_fig=False, full_range=False, highlight_samples=None, figsize=(12, 9), show_labels=None, filename='beta.png')

Plots the distribution of beta values for each sample in a batch, with each sample drawn as a separate line. The y-axis is an arbitrary scale, similar to a histogram of the number of probes that have a given beta value. The x-axis shows beta values (0 to 1) for each sample.

Input (df):
  • a dataframe with probes in rows and sample_ids in columns.
  • to get a dataframe in this format, use methylprep.consolidate_values_for_sheet(), which returns a matrix of beta values for a batch of samples (by default).
Returns:
None (but if return_fig is True, returns the figure object instead of showing plot)
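
As a minimal sketch of the expected input, the following builds a synthetic dataframe with probes in rows and sample IDs in columns. The helper names make_demo_betas and plot_betas, the probe and sample labels, and the synthetic beta-distributed values are assumptions for illustration only; real data would normally come from methylprep. The call inside plot_betas uses only arguments listed in the signature above.

```python
import numpy as np
import pandas as pd


def make_demo_betas(n_probes=1000, n_samples=4, seed=0):
    """Build a synthetic (probes x samples) matrix of beta values (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Half of the probes are mostly unmethylated, half mostly methylated,
    # which gives the bimodal shape typical of beta-value density plots.
    low = rng.beta(2, 12, size=(n_probes // 2, n_samples))
    high = rng.beta(12, 2, size=(n_probes - n_probes // 2, n_samples))
    values = np.vstack([low, high])
    probes = [f"cg{i:08d}" for i in range(n_probes)]      # hypothetical probe names
    samples = [f"sample_{j}" for j in range(n_samples)]   # hypothetical sample IDs
    return pd.DataFrame(values, index=probes, columns=samples)


def plot_betas(df):
    """Return the figure object instead of showing the plot."""
    # Imported lazily so the data-building part runs without methylcheck installed.
    import methylcheck
    return methylcheck.beta_density_plot(df, silent=True, return_fig=True)


df = make_demo_betas()  # probes in rows, sample IDs in columns
```

With methylcheck installed, plot_betas(df) should return the figure object rather than None, because return_fig is set to True.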
Parameters:
verbose:
display extra messages
save:
if True, saves a copy of the plot as a png file.
silent:
if True, eliminates all messages (useful for automation and scripting)
reduce:
when working with large datasets where you don’t need publication-quality “exact” plots, supply a float between 0 and 1 to sample the probe data for plotting. We recommend 0.1, which plots 10% of the 450k or 860k probes and doesn’t distort the distribution much. Values below 0.001 (860 probes out of 860k) will show some sampling distortion. Using 0.1 speeds up plotting of large batches roughly 10-fold.
ymax (None):
If defined, the upper limit of the y-axis will not exceed this value. The y-range can still be smaller if the plotted values fall below it.
full_range: (False)
if True, x-axis will be auto-scaled, instead of fixed in the 0-to-1.0 range.
return_fig: (False)
if True, returns figure object instead of showing plot.
highlight_samples:
a string or list of df column names that, if provided, will highlight the named sample(s) in bold blue in the returned plot. All other samples in df will be grayed out. Useful for QC reports.
figsize:
tuple of (width, height); defaults to (12, 9) if omitted.

show_labels: By default, sample names appear in a legend if there are fewer than 30 samples; otherwise the legend is omitted. Use this to force the legend on or off.
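
To illustrate why a reduce value of 0.1 distorts the distribution very little, the sketch below subsamples 10% of the probes from a synthetic dataframe and compares per-sample means before and after. It uses pandas' DataFrame.sample as a stand-in; this is an assumption about what probe subsampling looks like, not a statement of how methylcheck implements reduce internally. The data, the names betas, fraction, and subset, and the random seeds are all illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n_probes = 10_000


def synthetic_sample():
    # Bimodal beta values: half mostly unmethylated, half mostly methylated.
    return np.concatenate(
        [rng.beta(2, 12, n_probes // 2), rng.beta(12, 2, n_probes // 2)]
    )


betas = pd.DataFrame(
    {"sample_A": synthetic_sample(), "sample_B": synthetic_sample()},
    index=[f"cg{i:08d}" for i in range(n_probes)],  # hypothetical probe names
)

fraction = 0.1  # plays the role of the reduce parameter in this sketch
subset = betas.sample(frac=fraction, random_state=0)  # keep 10% of the probes (rows)

# Largest change in any sample's mean beta value caused by subsampling.
max_shift = (betas.mean() - subset.mean()).abs().max()
print(f"probes kept: {len(subset)} of {len(betas)}; largest mean shift: {max_shift:.4f}")
```

The same comparison can be repeated with much smaller fractions to see the sampling distortion grow as fewer probes are kept.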

Note:
if the sample_ids are not unique, the function makes them unique for the purpose of plotting.
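
If you would rather control the renaming yourself before plotting, one simple approach is to append a counter to each repeated ID. This is a hedged sketch only: make_unique is a hypothetical helper, and the suffix scheme is an assumption that does not reflect how methylcheck renames duplicates internally.

```python
from collections import Counter


def make_unique(names):
    """Append _1, _2, ... to every name that occurs more than once."""
    totals = Counter(names)   # how often each name occurs overall
    seen = Counter()          # how many copies of each name we have emitted so far
    unique = []
    for name in names:
        if totals[name] > 1:
            seen[name] += 1
            unique.append(f"{name}_{seen[name]}")
        else:
            unique.append(name)
    # Caveat: this does not guard against a pre-existing name such as "S1_1".
    return unique


sample_ids = ["S1", "S2", "S1"]      # illustrative IDs with one duplicate
deduped = make_unique(sample_ids)    # ["S1_1", "S2", "S1_2"]
```

The resulting list can be assigned to df.columns before the dataframe is passed to the plotting function.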