# Data module

class data.Data(command_line, path)

Bases: object

Store all relevant data to communicate between the different modules.

The Data class holds the cosmological information, the parameters from the MCMC run, and the information coming from the likelihoods. It is a wide collection of information, with in particular two main dictionaries: cosmo_arguments and mcmc_parameters.

It defines several useful methods. The following ones are called just once, at initialization:

On the other hand, two methods, check_for_slow_step() and update_cosmo_arguments(), are called at every step.

Finally, the convenient method get_mcmc_parameters() will be called in many places, to return the proper list of desired parameters.

It has a number of different attributes, and the more important ones are listed here:

Note

The experiments attribute is extracted from the parameter file, and contains the list of likelihoods to use

Note

The path argument will be used in case it is a first run, and hence a new folder is created. If starting from an existing folder, this dictionary will be compared with the one extracted from the log.param, and the latter will be used, with a warning to the user.

Warning

New in version 2.0.0: you can specify an oversampling of the nuisance parameters, to speed up runs with likelihoods that have many of them. To do so, add a new field to the parameter file, data.over_sampling = [1, ...], whose first element is 1, followed by the oversampling factor of each desired likelihood. This array must have as many entries as there are blocks (1 for the cosmological parameters, plus 1 for each likelihood with varying nuisance parameters). You need to call the code with the flag -j fast for it to be used.
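The size constraint described above can be sketched as a small check. This is only an illustration: the helper `check_over_sampling` and the likelihood names are hypothetical and are not part of Monte Python.

```python
def check_over_sampling(over_sampling, likelihoods_with_nuisance):
    """Validate an over-sampling list (hypothetical helper, for illustration)."""
    # One block for the cosmological parameters, plus one block per
    # likelihood that has varying nuisance parameters.
    n_blocks = 1 + len(likelihoods_with_nuisance)
    if len(over_sampling) != n_blocks:
        raise ValueError(
            "expected %d entries, got %d" % (n_blocks, len(over_sampling)))
    if over_sampling[0] != 1:
        raise ValueError("the first entry (cosmological block) must be 1")
    return True


# In the parameter file this would read: data.over_sampling = [1, 4, 3]
over_sampling = [1, 4, 3]
is_valid = check_over_sampling(
    over_sampling, ["likelihood_a", "likelihood_b"])
```

With two likelihoods carrying varying nuisance parameters, the list needs exactly three entries.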

To create an instance of this class, one must feed the following parameters and keyword arguments:

Parameters:

- command_line (NameSpace) – NameSpace containing the input from the parser_mp. It stores the input parameter file, the jumping methods, the output folder, etc... Most of the information extracted from the command_file will be transformed into Data attributes, whenever it felt meaningful to do so.
- path (dict) – Contains a dictionary of important local paths. It is used here to find the cosmological module location.

boundary_loglike = None

Define the boundary loglike, the value used to mark a loglike that is out of bounds. If a point in the parameter space is assigned this value, it will be automatically rejected, hence increasing the multiplicity of the last accepted point.

cosmo_arguments = None

Simple dictionary that will serve as a communication interface with the cosmological code. It contains all the parameters for the code that will not be set to their default values. It is updated from mcmc_parameters.

Return type: dict
mcmc_parameters = None

Ordered dictionary of dictionaries that contains everything needed by the mcmc module for the MCMC procedure. Every parameter name is a key whose value is a dictionary containing the initial configuration, role, status, last accepted point and current point.

Return type: ordereddict
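As a rough picture of this layout, the sketch below builds a two-entry ordered dictionary by hand. The field names follow the ones documented for the Parameter class; the parameter names and numerical values are illustrative assumptions, and the real entries are Parameter instances rather than plain dictionaries.

```python
from collections import OrderedDict

# Hand-built stand-in for data.mcmc_parameters (illustrative values only).
mcmc_parameters = OrderedDict()
mcmc_parameters["omega_b"] = {
    "initial": [2.249, 1.8, 3.0, 0.016],  # mean, minimum, maximum, 1-sigma
    "scale": 0.01,
    "role": "cosmo",
    "status": "varying",
    "last_accepted": 2.249,
    "current": 2.251,
}
mcmc_parameters["amplitude"] = {
    "initial": [1.0, 0.0, 2.0, 0.1],
    "scale": 1,
    "role": "nuisance",
    "status": "varying",
    "last_accepted": 1.0,
    "current": 1.05,
}

# Iteration order is the insertion order, i.e. the parameter file order.
parameter_names = list(mcmc_parameters)
```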
NS_arguments = None

Dictionary containing the parameters needed by the PyMultiNest sampler. It is filled just before the run of the sampler. Those parameters not defined will be set to the default value of PyMultiNest.

Return type: dict
over_sampling = None

List storing the respective over sampling of the parameters. The first entry, applied to the cosmological parameters, will always be 1. Setting it to anything else would simply rescale the whole process. If not specified otherwise in the parameter file, all other numbers will be set to 1 as well.

Return type: list
need_cosmo_update = None

Added in version 1.1.1. Stores whether the cosmological block of parameters changed from one step to the next. See group_parameters_in_blocks()

Return type: bool
log_flag = None

Stores whether the likelihood data files need to be written to the log.param file. Initially False.

Return type: bool
fill_mcmc_parameters()

Initializes the ordered dictionary mcmc_parameters from the input parameter file.

It uses read_file(), and initializes instances of parameter to actually fill in mcmc_parameters.

initialise_likelihoods(experiments)

Given an array of experiments, return an ordered dict of instances

Note

In the __init__ method, experiments is naturally self.experiments, but it is useful to keep it as a parameter, for the case of importance sampling.

read_file(param, structure, field='', separate=False)

Execute all lines concerning the Data class from a parameter file

All lines starting with data. will be replaced by self., so the current instance of the class will contain all the information.

Note

An rstrip() was added at the end of each line, because of an unexplained bug on some systems that produced nonexistent characters at the end of the line. It should now work.

Note

A safeguard should be added to protect against obvious attacks.

Parameters:

- param (str) – Name of the parameter file
- structure (str) – Name of the class entries we want to execute (mainly, data, or any other likelihood)

Keyword Arguments:

- field (str) – If nothing is specified, this routine will execute all the lines corresponding to the structure parameters. If you specify a specific field, like path, only this field will be read and executed.
- separate (bool) – If this flag is set to True, a container class will be created for the structure field, so instead of appending to the namespace of the data instance, it will append to a sub-namespace named in the same way that the desired structure. This is used to extract custom values from the likelihoods, allowing to specify values for the likelihood directly in the parameter file.

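The replace-and-execute mechanism can be sketched as follows. This is a simplified stand-in under stated assumptions, not the actual implementation: the class `DataSketch` and its method `read_lines` are hypothetical, the lines are passed as a list instead of being read from a file, and the field and separate options are left out.

```python
class DataSketch(object):
    """Stand-in for the Data class (hypothetical, for illustration only)."""

    def read_lines(self, lines, structure="data"):
        prefix = structure + "."
        for line in lines:
            # Drop trailing whitespace and stray end-of-line characters.
            line = line.rstrip()
            if line.startswith(prefix):
                # "data.x = 1" becomes "self.x = 1" and is executed as-is,
                # so the attribute ends up on the current instance.
                statement = line.replace(prefix, "self.", 1)
                exec(statement, {}, {"self": self})


sketch = DataSketch()
sketch.read_lines([
    "# a comment line, ignored",
    "data.experiments = ['fake_likelihood']",
    "data.over_sampling = [1, 4]",
    "fake_likelihood.option = 3",
])
```

Because every matching line is executed, a parameter file from an untrusted source can run arbitrary code, which is the concern raised in the note above.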
group_parameters_in_blocks()

Regroup mcmc parameters by blocks of same speed

This method divides all varying parameters from mcmc_parameters into as many categories as there are likelihoods, plus one (the slow block of cosmological parameters).

It creates the attribute block_parameters, to be used in the module mcmc.

Note

It does not compute by any mean the real speed of each parameter, instead, every parameter belonging to the same likelihood will be considered as fast as its neighbour.

Warning

It assumes that the nuisance parameters are already written sequentially, and grouped together (not necessarily in the order described in experiments). If you mix up the different nuisance parameters in the .param file, this routine will not work as intended. It also assumes that the cosmological parameters are written at the beginning of the file.
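To see why the ordering matters, here is a minimal sketch that groups consecutive parameters sharing the same owner. The helper `group_in_blocks`, the (name, owner) pairs and the likelihood names are assumptions made for illustration; the real method works on mcmc_parameters and stores its result in block_parameters, whose exact layout is not reproduced here.

```python
def group_in_blocks(varying):
    """Group consecutive (name, owner) pairs by owner.

    `owner` is "cosmo" or the name of the likelihood that a nuisance
    parameter belongs to. Hypothetical helper, for illustration only.
    """
    blocks = []  # list of [owner, [parameter names]]
    for name, owner in varying:
        if blocks and blocks[-1][0] == owner:
            # Same owner as the previous parameter: same block.
            blocks[-1][1].append(name)
        else:
            # Owner changed: open a new block.
            blocks.append([owner, [name]])
    return blocks


varying = [
    ("omega_b", "cosmo"), ("h", "cosmo"),
    ("A_1", "lkl_a"), ("A_2", "lkl_a"),
    ("B_1", "lkl_b"),
]
blocks = group_in_blocks(varying)
block_sizes = [len(names) for _, names in blocks]
```

In this sketch, interleaving the nuisance parameters of two likelihoods would open a new block at every change of owner, which illustrates the kind of failure the warning describes.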

assign_over_sampling_indices()

Create the list of varied parameters given the oversampling
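One plausible reading of this step is sketched below: each varying parameter index is repeated as many times as the oversampling factor of its block. The function `over_sampling_indices` and its block_sizes argument are hypothetical; the actual method may build or store the list differently.

```python
def over_sampling_indices(block_sizes, over_sampling):
    """Repeat each parameter index by the over-sampling of its block.

    Hypothetical helper: `block_sizes` gives the number of varying
    parameters in each block, in parameter-file order.
    """
    indices = []
    start = 0
    for size, factor in zip(block_sizes, over_sampling):
        for index in range(start, start + size):
            indices.extend([index] * factor)
        start += size
    return indices


# Two cosmological parameters, then two nuisance blocks of sizes 2 and 1.
indices = over_sampling_indices([2, 2, 1], [1, 2, 3])
```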

read_version(param_file)

Extract version and subversion from an existing log.param

get_mcmc_parameters(table_of_strings)

Returns an ordered array of parameter names filtered by table_of_strings.

Parameters: table_of_strings (list) – List of strings that the role and status of a parameter must match.

For instance,

    >>> data.get_mcmc_parameters(['varying'])
    ['omega_b', 'h', 'amplitude', 'other']

will return a list of all the varying parameters, both cosmological and nuisance ones (derived parameters being fixed, they won't be part of this list). Instead,

    >>> data.get_mcmc_parameters(['nuisance', 'varying'])
    ['amplitude', 'other']

will only return the nuisance parameters that are being varied.

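The filtering rule can be sketched with a standalone function. This is an illustration under assumptions, not the method itself: `get_parameters` is a hypothetical helper that takes the dictionary explicitly, and the parameter entries carry only the role and status fields.

```python
from collections import OrderedDict


def get_parameters(mcmc_parameters, table_of_strings):
    """Return names whose role/status match every requested string."""
    names = []
    for name, fields in mcmc_parameters.items():
        tags = (fields["role"], fields["status"])
        # Keep the parameter only if all requested strings are matched.
        if all(string in tags for string in table_of_strings):
            names.append(name)
    return names


parameters = OrderedDict([
    ("omega_b", {"role": "cosmo", "status": "varying"}),
    ("h", {"role": "cosmo", "status": "varying"}),
    ("amplitude", {"role": "nuisance", "status": "varying"}),
    ("other", {"role": "nuisance", "status": "varying"}),
    ("sigma8", {"role": "derived", "status": "fixed"}),
])

all_varying = get_parameters(parameters, ["varying"])
varying_nuisance = get_parameters(parameters, ["nuisance", "varying"])
```

The order of the returned names follows the order of the dictionary, which is why an ordered dictionary is used.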
check_for_slow_step(new_step)

Check whether the values of the cosmological parameters were changed and, if not, skip the computation of the cosmology.
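A minimal sketch of this check compares the proposed values with the current ones, looking only at cosmological parameters. The function name and its explicit arguments are assumptions made for illustration; the real method reads these from the Data instance and records the result in need_cosmo_update.

```python
def cosmo_step_changed(varying_names, roles, current, new_step):
    """Return True if any cosmological parameter differs in new_step.

    Hypothetical helper: `new_step` lists proposed values in the same
    order as `varying_names`.
    """
    for name, new_value in zip(varying_names, new_step):
        if roles[name] == "cosmo" and current[name] != new_value:
            return True
    return False


varying_names = ["omega_b", "h", "amplitude"]
roles = {"omega_b": "cosmo", "h": "cosmo", "amplitude": "nuisance"}
current = {"omega_b": 2.249, "h": 0.7, "amplitude": 1.0}

# Only a nuisance parameter moves: the cosmology can be reused.
fast_needs_update = cosmo_step_changed(
    varying_names, roles, current, [2.249, 0.7, 1.1])
# A cosmological parameter moves: the cosmology must be recomputed.
slow_needs_update = cosmo_step_changed(
    varying_names, roles, current, [2.250, 0.7, 1.0])
```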

update_cosmo_arguments()

Put in cosmo_arguments the current values of mcmc_parameters

This method is called at every step in the Markov chain, to update the dictionary. In the Markov chain, the scale is not remembered, so one has to apply it before giving it to the cosmological code.

Note

When you want to define new parameters in the Markov chain that do not have a one-to-one correspondence to a cosmological name, you can redefine their behaviour here. You will find several such examples in the source.

Note

For complex CLASS parameters that expect a string of numbers separated by commas, you can now use the name of the argument, for instance m_ncdm, then append a double underscore and a number. So if you run with two cosmological parameters, m_ncdm__1 and m_ncdm__2, this function will automatically concatenate the two and feed the result to CLASS as m_ncdm. You still have to make sure that the other variables are properly set, like N_ncdm to 2 in this example.
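The scaling and the double-underscore concatenation can be sketched together. This is a simplified stand-in: `build_cosmo_arguments` is a hypothetical helper, the separator and number formatting are assumptions, and the special cases handled in the real method are left out.

```python
import re
from collections import OrderedDict


def build_cosmo_arguments(mcmc_parameters):
    """Build a cosmo_arguments-like dict (hypothetical helper)."""
    arguments = {}
    grouped = {}
    for name, fields in mcmc_parameters.items():
        if fields["role"] != "cosmo":
            continue
        # The chain stores unscaled values, so the scale is applied here.
        value = fields["current"] * fields["scale"]
        match = re.match(r"^(.+)__(\d+)$", name)
        if match:
            # Collect name__N entries under their common base name.
            base, position = match.group(1), int(match.group(2))
            grouped.setdefault(base, []).append((position, value))
        else:
            arguments[name] = value
    for base, items in grouped.items():
        # Join the values in index order into a comma-separated string.
        arguments[base] = ", ".join(str(v) for _, v in sorted(items))
    return arguments


parameters = OrderedDict([
    ("omega_b", {"role": "cosmo", "current": 2.249, "scale": 0.01}),
    ("m_ncdm__1", {"role": "cosmo", "current": 0.06, "scale": 1}),
    ("m_ncdm__2", {"role": "cosmo", "current": 0.1, "scale": 1}),
    ("amplitude", {"role": "nuisance", "current": 1.0, "scale": 1}),
])
cosmo_arguments = build_cosmo_arguments(parameters)
```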

static folder_is_initialised(folder)

Static method to call for checking if a folder was already initialised

This method can be used to speed up the mpi initialisation in run. If a process finds that the folder is already a proper Monte Python one, it sends directly a ‘go’ signal to its next in line.

Warning

This method assumes that the last lines of the log.param are the path indication. If this ever changes, adjust this method accordingly.

__cmp__(other)

Redefinition of the ‘compare’ method for two instances of this class.

It decides which basic operations to perform when the code asks whether two instances are the same (in case you want to launch a new chain in an existing folder, with your own parameter file). Cosmological code versions are also compared, but a mismatch only triggers a warning and does not fail the comparison.

__call__(ctx)

Interface layer with CosmoHammer

Store quantities in the context, to be accessed by the Cosmo Module and each of the likelihoods.

Parameters: ctx (context) – Contains several dictionaries storing data and cosmological information
class data.Parameter(array, key)

Bases: dict

Store all important fields, and define a few convenience methods

This class replaces the old function defined in the Data class, called from_input_to_mcmc_parameters. The translation is now done inside the Parameter class, which interprets the array given as an input inside the parameter file, and returns a dictionary having all relevant fields initialized.

Warning

This used to be an ordered dictionary, for no evident reason. It is now an ordinary dictionary again. If this breaks anything, the change will be reverted.

At the end of this initialization, every field but one is filled for the specified parameter, be it fixed or varying. The missing field is the ‘last_accepted’ one, that will be filled in the module mcmc.

Note

The syntax of the parameter files is defined here. If one wants to change it, the corresponding changes should be made in this class.

The other fields are

Variables:

- initial (array) – Initial array of input values defined in the parameter file. Contains (in this order) mean, minimum, maximum, 1-sigma. If the min/max values (TO CHECK proposal density boundaries) are unimportant/unconstrained, use None or -1 (without a period!)
- scale (float) – 5th entry of the initial array in the parameter file, defines the factor with which to multiply the values defined in initial to give the real value.
- role (str) – 6th entry of the initial array, can be cosmo, nuisance or derived. A derived parameter will not be considered as varying, but will be instead recovered from the cosmological code for each point in the parameter space.
- prior (Prior) – defined through the optional 7th entry of the initial array, can be omitted or set to flat (same), or set to gaussian. An instance of the prior defined in prior will be initialized and set to this value.
- tex_name (str) – A tentative tex version of the name, provided by the function io_mp.get_tex_name().
- status (str) – Depending on the 1-sigma value in the initial array, it will be set to fixed or varying (resp. zero and non-zero)
- current (float) – Stores the value at the current point in parameter space (not allowed initially)
- value (list) – Array read from the parameter file
- key (str) – Name of the parameter
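The way the input array maps onto these fields can be sketched with a plain function. This is a simplification under stated assumptions: `parse_parameter` is a hypothetical helper, the prior is kept as a string rather than a Prior instance, and tex_name and current are not filled.

```python
def parse_parameter(array, key):
    """Split a parameter-file array into named fields (illustration only)."""
    mean, minimum, maximum, sigma, scale, role = array[:6]
    # The 7th entry is optional; a flat prior is assumed when it is absent.
    prior = array[6] if len(array) > 6 else "flat"
    return {
        "key": key,
        "initial": [mean, minimum, maximum, sigma],
        "scale": scale,
        "role": role,
        "prior": prior,
        # A zero 1-sigma means the parameter is held fixed.
        "status": "fixed" if sigma == 0 else "varying",
    }


omega_b = parse_parameter([2.249, 1.8, 3.0, 0.016, 0.01, "cosmo"], "omega_b")
z_reio = parse_parameter([1, None, None, 0, 1, "derived"], "z_reio")
```

The second call shows a derived parameter: its 1-sigma entry is zero, so it is marked as fixed, and None stands for an unconstrained boundary.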