Analyze module

Collection of functions needed to analyze the Markov chains.

This module also defines the class Information, which stores useful quantities and shortens argument passing between functions.

Note

Some of the methods used in this module are directly adapted from the CosmoPmc code from Kilbinger et al.

analyze.analyze(command_line)[source]

Main function, does the entire analysis.

It calls in turn all the other routines from this module. To keep the argument list of each function to a reasonable size, an Information instance is used. This instance is initialized in this function, then extended by the other routines.

analyze.prepare(files, info)[source]

Scan the whole input folder, and include all chains in it.

Since you can decide to analyze some file(s) or a complete folder, this function first needs to distinguish between the two cases.

Warning

If someday you change the way the chains are named, remember to change this routine too, because it assumes that chain names contain a double underscore.

Note

Only files ending in .txt will be selected, to keep compatibility with the CosmoMC format.
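
Taken together, the warning and note above amount to a simple filename filter. A minimal sketch of that convention (the function name looks_like_chain is made up for illustration, not part of the module):

```python
import os

def looks_like_chain(path):
    """Heuristic matching the conventions above: a chain file name is
    assumed to contain a double underscore and end in .txt."""
    name = os.path.basename(path)
    return '__' in name and name.endswith('.txt')

print(looks_like_chain('chains/2023-01-01_10000__1.txt'))  # True
print(looks_like_chain('chains/log.param'))                # False
```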

Note

New in version 2.0.0: if you ask to analyze a Nested Sampling sub-folder (i.e. one whose name ends in NS, in capital letters), the analyze module will translate the Nested Sampling output into standard Monte Python chains, then stop. You can then run with the --info flag on the whole folder. This procedure is only needed if the Nested Sampling run was killed before completion; it is not necessary for a complete run.

Parameters:
  • files (list) – list, possibly of a single element, containing the files to analyze. This can be a single file, or the encompassing folder.
  • info (Information instance) – Used to store the result
analyze.convergence(info)[source]

Compute convergence for the desired chains, using Gelman-Rubin diagnostic

Chains have been stored in the info instance of Information. Note that the G-R diagnostic can also be computed for a single chain, although it will most probably give absurd results; in that case, the chain is split into three sub-chains.
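
The Gelman-Rubin idea behind this routine can be sketched as follows. This is a simplified, single-parameter, equal-weight version for illustration only; the module's actual implementation works with weighted, multi-parameter chains:

```python
import numpy as np

def gelman_rubin(chains):
    """R - 1 convergence statistic for a list of 1-d parameter chains
    (simplified sketch: equal weights, single parameter)."""
    chains = [np.asarray(c, dtype=float) for c in chains]
    n = min(len(c) for c in chains)
    chains = [c[:n] for c in chains]              # truncate to common length
    means = np.array([c.mean() for c in chains])
    W = np.mean([c.var(ddof=1) for c in chains])  # within-chain variance
    B = n * means.var(ddof=1)                     # between-chain variance
    var_hat = (n - 1) / n * W + B / n             # pooled variance estimate
    return var_hat / W - 1                        # small when converged

# As the docstring notes, a single chain can be split into sub-chains:
chain = np.random.default_rng(0).normal(size=3000)
print(gelman_rubin(np.array_split(chain, 3)))  # close to 0
```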

analyze.compute_posterior(information_instances)[source]

Compute the marginalized posterior distributions, and optionally plot them

Parameters:information_instances (list) – list of Information objects, initialised on the given folders or list of files in input. For each of these instances, plot the 1d and 2d posterior distributions, depending on the flags stored in the instances, coming from command-line arguments or read from a file.
analyze.ctr_level(histogram2d, lvl, infinite=False)[source]

Extract the contours for the 2d plots (Karim Benabed)
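
The idea behind contour-level extraction can be sketched as follows: sort the bin heights, accumulate mass from the highest bin down, and keep the heights at which the requested fractions of the total mass are enclosed. This is a sketch of the idea, not the module's actual implementation:

```python
import numpy as np

def ctr_level_sketch(hist, levels=(0.68, 0.95)):
    """Bin heights whose super-level sets enclose the given mass fractions,
    returned in increasing order so they could feed contourf directly."""
    flat = np.sort(hist.ravel())[::-1]      # bin heights, highest first
    cum = np.cumsum(flat) / flat.sum()      # enclosed mass fraction
    heights = [float(flat[np.searchsorted(cum, lvl)]) for lvl in levels]
    return sorted(heights)

print(ctr_level_sketch(np.array([[4., 3.], [2., 1.]])))  # [1.0, 3.0]
```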

analyze.minimum_credible_intervals(info)[source]

Extract minimum credible intervals (method from Jan Haman)

analyze.write_h(info_file, indices, name, string, quantity, modifiers=None)[source]

Write one horizontal line of output

analyze.cubic_interpolation(info, hist, bincenters)[source]

Small routine to accommodate the absence of the interpolate module
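
The fallback logic can be sketched like this, degrading gracefully when scipy.interpolate is missing. The function name smooth_posterior and its signature are illustrative assumptions, not the module's actual code:

```python
import numpy as np

def smooth_posterior(hist, bincenters, num=100):
    """Cubic-spline refinement of a 1d histogram when scipy.interpolate
    is available; otherwise return the raw histogram unchanged."""
    try:
        from scipy.interpolate import interp1d
    except ImportError:
        return bincenters, hist              # graceful fallback
    fine = np.linspace(bincenters[0], bincenters[-1], num)
    return fine, interp1d(bincenters, hist, kind='cubic')(fine)
```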

analyze.write_histogram(hist_file_name, x_centers, hist)[source]

Store the posterior distribution to a file

analyze.read_histogram(histogram_path)[source]

Recover a stored 1d posterior

analyze.write_histogram_2d(hist_file_name, x_centers, y_centers, extent, hist)[source]

Store the histogram information to a file, to plot it later

analyze.read_histogram_2d(histogram_path)[source]

Read the histogram information that was stored in a file.

To use it, call something like this:

x_centers, y_centers, extent, hist = read_histogram_2d(path)
fig, ax = plt.subplots()
ax.contourf(
    y_centers, x_centers, hist, extent=extent,
    levels=ctr_level(hist, [0.68, 0.95]),
    zorder=5, cmap=plt.cm.autumn_r)
plt.show()
analyze.clean_conversion(module_name, tag, folder)[source]

Execute the "convert" methods from the different sampling algorithms

Returns True if something was done, False otherwise

analyze.separate_files(files)[source]

Separate the input files by folder

Given all the input arguments to the command-line files entry, separate them into a list of lists, grouping them by folder. The number of identified folders determines the number of Information instances to create.
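
The grouping described above can be sketched with a plain dictionary keyed on the containing folder (the file names below are made up for illustration):

```python
import os
from collections import defaultdict

def group_by_folder(files):
    """Group file paths into one list per containing folder; each group
    would later seed one Information instance."""
    groups = defaultdict(list)
    for path in files:
        groups[os.path.dirname(path)].append(path)
    return list(groups.values())

files = ['run_a/2023__1.txt', 'run_a/2023__2.txt', 'run_b/2023__1.txt']
print(group_by_folder(files))
# [['run_a/2023__1.txt', 'run_a/2023__2.txt'], ['run_b/2023__1.txt']]
```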

analyze.recover_folder_and_files(files)[source]

Distinguish the cases when analyze is called with files or with a folder

Note that this takes place chronologically after the function separate_files.

analyze.extract_array(line)[source]

Return the array on the RHS of the line

>>> extract_array("toto = ['one', 'two']\n")
['one', 'two']
>>> extract_array('toto = ["one", 0.2]\n')
['one', 0.2]
analyze.extract_dict(line)[source]

Return the key and value of the dictionary element contained in line

>>> extract_dict("something['toto'] = [0, 1, 2, -2, 'cosmo']")
('toto', [0, 1, 2, -2, 'cosmo'])
analyze.extract_parameter_names(info)[source]

Read the log.param file, and store the parameter names in the Information instance

analyze.find_maximum_of_likelihood(info)[source]

Find the global maximum of likelihood

min_minus_lkl first collects the best likelihood of each file, and is then replaced by the overall best value. This way, the global maximum likelihood is used as a reference, rather than each chain's own maximum.
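
The collect-then-reduce step can be sketched as follows. The column layout is an assumption: a CosmoMC-style chain with the multiplicity in column 0 and -log(likelihood) in column 1, and the chain values are made up for illustration:

```python
import numpy as np

# Hypothetical chains: column 0 is the multiplicity, column 1 is -log(L)
chains = [np.array([[1.0, 5.2], [2.0, 4.8]]),
          np.array([[1.0, 4.5], [1.0, 6.0]])]

# One best value per file first, then the single global reference
min_minus_lkl = [chain[:, 1].min() for chain in chains]
min_minus_lkl = min(min_minus_lkl)
print(min_minus_lkl)  # 4.5
```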

analyze.remove_bad_points(info)[source]

Create an array with all the points from the chains, after removing the non-Markovian points, the burn-in, and a fixed fraction
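
One of the cleaning steps mentioned above, dropping a fixed leading fraction of a chain, can be sketched in a few lines (the function name is illustrative; non-Markovian point removal is more involved than this):

```python
import numpy as np

def remove_leading_fraction(chain, fraction=0.3):
    """Drop a fixed leading fraction of a chain (e.g. as burn-in)."""
    start = int(fraction * len(chain))
    return chain[start:]

chain = np.arange(10)
print(remove_leading_fraction(chain, 0.3))  # [3 4 5 6 7 8 9]
```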

analyze.compute_mean(mean, spam, total)[source]
analyze.compute_variance(var, mean, spam, total)[source]
analyze.compute_covariance_matrix(info)[source]
analyze.adjust_ticks(param, information_instances)[source]
analyze.store_contour_coordinates(info, name1, name2, contours)[source]

Store the coordinates of the contours for the pair of parameters name1 and name2 to a file

analyze.iscomment(s)[source]

Define what we call a comment in MontePython chain files
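
A plausible reading of that convention, hedged as a sketch rather than the module's actual code: a line is a comment when it starts with '#'.

```python
def iscomment_sketch(s):
    """Treat a line as a comment when it starts with '#'
    (an assumption about the chain-file convention)."""
    return s.lstrip().startswith('#')

print(iscomment_sketch('# Cosmological parameters'))  # True
print(iscomment_sketch('1  4.8  0.022'))              # False
```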

class analyze.Information(command_line, other=None)[source]

Bases: object

Hold all information for analyzing runs

The following initialization creates the three tables that can be customized in an extra plot_file (see parser_mp).

Parameters:command_line (Namespace) – it contains the initialised command line arguments
has_interpolate_module = False
cm = [(0.0, 0.0, 0.0, 1.0), (0.30235, 0.15039, 0.74804, 1.0), (0.99843, 0.25392, 0.14765, 1.0), (0.9, 0.75353, 0.10941, 1.0)]
cmaps = [...]
alphas = [1.0, 0.8, 0.6, 0.4]
to_change = None

Dictionary whose keys are the old parameter names, and values are the new ones. For instance {'beta_plus_lambda':'beta+lambda'}

to_plot = None

Array of names of parameters to plot. If left empty, all will be plotted.

Warning

If you changed a parameter name with to_change, you need to give the new name to this array

new_scales = None

Dictionary that redefines some scales. The keys will be the parameter name, and the value its scale.

remap_parameters(spam)[source]

Perform substitutions of parameters for analyzing

Note

For arbitrary combinations of parameters, the prior will not necessarily be flat.

define_ticks()[source]
write_information_files()[source]
write_h_info()[source]
write_v_info()[source]

Write vertical info file

write_tex()[source]

Write a tex table containing the main results