Data module
-
class
data.
Data
(command_line, path) Bases:
object
Store all relevant data to communicate between the different modules.
The Data class holds the cosmological information, the parameters from the MCMC run, and the information coming from the likelihoods. It is a wide collection of information, with in particular two main dictionaries: cosmo_arguments and mcmc_parameters.
It defines several useful methods. The following ones are called just once, at initialization:
On the other hand, the following two functions are called at every step.
Finally, the convenient method
get_mcmc_parameters()
will be called in many places, to return the proper list of desired parameters.
It has a number of different attributes, and the more important ones are listed here:
Note
The experiments attribute is extracted from the parameter file, and contains the list of likelihoods to use
Note
The path argument will be used in case it is a first run, and hence a new folder is created. If starting from an existing folder, this dictionary will be compared with the one extracted from the log.param, and will use the latter while warning the user.
Warning
New in version 2.0.0: you can now specify an oversampling of the nuisance parameters, to speed up runs with likelihoods that have many of them. Specify a new field in the parameter file, data.over_sampling = [1, ...], that contains a 1 as its first element, followed by the oversampling factor of the desired likelihoods. This array must have as many entries as there are blocks (1 for the cosmological parameters, plus 1 for each likelihood with varying nuisance parameters). You need to call the code with the flag -j fast for it to be used.
To create an instance of this class, one must feed the following parameters and keyword arguments:
Parameters: - command_line (NameSpace) –
NameSpace containing the input from the
parser_mp
. It stores the input parameter file, the jumping methods, the output folder, etc. Most of the information extracted from the command_file will be transformed into Data
attributes, whenever it felt meaningful to do so.
- path (dict) – Contains a dictionary of important local paths. It is used here to find the cosmological module location.
-
boundary_loglike
= None
Define the boundary loglike, the value used to flag a loglike that is out of bounds. If a point in the parameter space is assigned this value, it will be automatically rejected, hence increasing the multiplicity of the last accepted point.
-
cosmo_arguments
= None
Simple dictionary that will serve as a communication interface with the cosmological code. It contains all the parameters for the code that will not be set to their default values. It is updated from
mcmc_parameters
.
Return type: dict
-
mcmc_parameters
= None
Ordered dictionary of dictionaries. It contains everything needed by the
mcmc
module for the MCMC procedure. Every parameter name will be the key of a dictionary, containing the initial configuration, role, status, last accepted point and current point.
Return type: ordereddict
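As a rough illustration of the layout described above, the following sketch builds a small ordered dictionary of dictionaries by hand. The parameter names, numerical values and the exact set of inner keys are assumptions chosen for the example, not values taken from the Monte Python source.

```python
from collections import OrderedDict

# Hypothetical content: one inner dictionary per parameter, keyed by its name.
mcmc_parameters = OrderedDict()
mcmc_parameters['omega_b'] = {
    'initial': [2.249, 1.8, 3.0, 0.016],  # mean, minimum, maximum, 1-sigma
    'scale': 0.01,
    'role': 'cosmo',
    'status': 'varying',
    'last_accepted': 2.249,
    'current': 2.249,
}
mcmc_parameters['amplitude'] = {
    'initial': [1.0, 0.0, 2.0, 0.1],
    'scale': 1,
    'role': 'nuisance',
    'status': 'varying',
    'last_accepted': 1.0,
    'current': 1.0,
}

# The insertion order is preserved, so parameters keep the order of the file.
names = list(mcmc_parameters)
```

Keeping the outer dictionary ordered matters because other methods rely on the order in which parameters were written in the parameter file.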
-
NS_arguments
= None
Dictionary containing the parameters needed by the PyMultiNest sampler. It is filled just before the run of the sampler. Parameters left undefined will take the default values of PyMultiNest.
Return type: dict
-
over_sampling
= None
List storing the respective oversampling of the parameters. The first entry, applied to the cosmological parameters, will always be 1. Setting it to anything else would simply rescale the whole process. If not specified otherwise in the parameter file, all other numbers will be set to 1 as well.
Return type: list
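The size rule for this list (one entry for the cosmological block, plus one per likelihood with varying nuisance parameters) can be sketched as follows. The helper name `resolve_over_sampling` and its arguments are hypothetical and only illustrate the rule; they are not part of the module.

```python
def resolve_over_sampling(over_sampling, n_nuisance_blocks):
    """Return one oversampling factor per block (hypothetical helper).

    n_nuisance_blocks is the assumed number of likelihoods that have
    varying nuisance parameters.
    """
    n_blocks = 1 + n_nuisance_blocks  # 1 for the cosmological block
    if over_sampling is None:
        # Nothing specified in the parameter file: every block gets 1.
        return [1] * n_blocks
    if len(over_sampling) != n_blocks:
        raise ValueError(
            "over_sampling has %d entries, expected %d"
            % (len(over_sampling), n_blocks))
    return list(over_sampling)


factors = resolve_over_sampling([1, 4], n_nuisance_blocks=1)
defaults = resolve_over_sampling(None, n_nuisance_blocks=2)
```

In a parameter file, the first call would correspond to a line such as data.over_sampling = [1, 4].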
-
need_cosmo_update
= None
Added in version 1.1.1. It stores whether the cosmological block of parameters changed from one step to the next. See
group_parameters_in_blocks()
Return type: bool
-
log_flag
= None
Stores whether or not the likelihood data files need to be written down in the log.param file. Initially False.
Return type: bool
-
fill_mcmc_parameters
() Initializes the ordered dictionary
mcmc_parameters
from the input parameter file.
It uses
read_file()
, and initializes instances of
Parameter
to actually fill in
mcmc_parameters
.
-
initialise_likelihoods
(experiments) Given an array of experiments, return an ordered dict of likelihood instances.
Note
In the __init__ method, experiments is naturally self.experiments, but it is useful to keep it as a parameter, for the case of importance sampling.
-
read_file
(param, structure, field='', separate=False) Execute all lines concerning the Data class from a parameter file.
All lines starting with data. will be replaced by self., so the current instance of the class will contain all the information.
Note
An rstrip() was added at the end, because of an unexplained bug on some systems that produced nonexistent characters at the end of the line. It should now work.
Note
A safeguard should be added to protect from obvious attacks, since the lines of the parameter file are executed.
Parameters: - param (str) – Name of the parameter file
- structure (str) – Name of the class entries we want to execute (mainly, data, or any other likelihood)
Keyword Arguments: - field (str) – If nothing is specified, this routine will execute all the lines corresponding to the structure parameters. If you specify a specific field, like path, only this field will be read and executed.
- separate (bool) – If this flag is set to True, a container class will be created for the structure field, so instead of appending to the namespace of the data instance, it will append to a sub-namespace named after the desired structure. This is used to extract custom values from the likelihoods, making it possible to specify values for the likelihood directly in the parameter file.
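The replacement of data. by self. described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the `Container` class and the `read_lines` helper are hypothetical, the lines come from a list rather than a file, and the field and separate options are left out.

```python
class Container(object):
    """Hypothetical stand-in for the Data instance."""


def read_lines(lines, target, structure='data'):
    """Execute every line starting with '<structure>.' on target."""
    prefix = structure + '.'
    for line in lines:
        # Strip trailing whitespace and newlines, as the rstrip() note suggests.
        line = line.strip()
        if not line.startswith(prefix):
            continue
        # 'data.x = 1' becomes 'self.x = 1', with self bound to target.
        statement = 'self.' + line[len(prefix):]
        exec(statement, {}, {'self': target})


target = Container()
read_lines(
    [
        "data.experiments = ['fake_planck']",
        "# a comment line is ignored",
        "data.over_sampling = [1, 4]\n",
        "other.value = 3",
    ],
    target,
)
```

Because each matching line is passed to exec, anything written in the parameter file runs as Python code, which is why the note about adding a safeguard applies.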
-
group_parameters_in_blocks
() Regroup mcmc parameters by blocks of same speed.
This method divides all varying parameters from
mcmc_parameters
into as many categories as there are likelihoods, plus one (the slow block of cosmological parameters).
It creates the attribute
block_parameters
, to be used in the module
mcmc
.
Note
It does not compute by any means the real speed of each parameter. Instead, every parameter belonging to the same likelihood will be considered as fast as its neighbour.
Warning
It assumes that the nuisance parameters are already written sequentially, and grouped together (not necessarily in the order described in
experiments
). If you mix up the different nuisance parameters in the .param file, this routine will not work as intended. It also assumes that the cosmological parameters are written at the beginning of the file.
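The ordering assumption in the warning above can be illustrated with a minimal sketch that groups consecutive parameters by owner. The function `group_in_blocks`, the (name, owner) pairs and the likelihood names are hypothetical; the actual content of block_parameters in the module may be stored differently.

```python
from itertools import groupby


def group_in_blocks(varying):
    """Group consecutive (name, owner) pairs sharing the same owner."""
    return [
        [name for name, _ in group]
        for _, group in groupby(varying, key=lambda item: item[1])
    ]


# Cosmological parameters first, then nuisance parameters grouped by likelihood.
ordered = [
    ('omega_b', 'cosmo'), ('h', 'cosmo'),
    ('amplitude', 'lkl_a'), ('other', 'lkl_b'),
]
blocks = group_in_blocks(ordered)

# Interleaving the nuisance parameters of two likelihoods splits their blocks.
mixed = [
    ('omega_b', 'cosmo'),
    ('a1', 'lkl_a'), ('b1', 'lkl_b'), ('a2', 'lkl_a'),
]
mixed_blocks = group_in_blocks(mixed)
```

With the ordered input there are three blocks, one slow and two fast. With the interleaved input the same two likelihoods produce three fast blocks, which shows why sequential grouping in the .param file matters for a routine that only looks at neighbours.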
-
get_mcmc_parameters
(table_of_strings) Returns an ordered array of parameter names filtered by table_of_strings.
Parameters: table_of_strings (list) – List of strings whose role and status must be matched by a parameter. For instance,
>>> data.get_mcmc_parameters(['varying'])
['omega_b', 'h', 'amplitude', 'other']
will return a list of all the varying parameters, both cosmological and nuisance ones (derived parameters being fixed, they won't be part of this list). Instead,
>>> data.get_mcmc_parameters(['nuisance', 'varying'])
['amplitude', 'other']
will only return the nuisance parameters that are being varied.
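A minimal sketch of this kind of filtering, assuming each parameter stores its role and status as plain strings, could look as follows. The function `filter_parameters` and the sample dictionary are hypothetical and only reproduce the behaviour shown in the two examples.

```python
from collections import OrderedDict


def filter_parameters(mcmc_parameters, table_of_strings):
    """Return the names whose role or status matches every given string."""
    return [
        name
        for name, par in mcmc_parameters.items()
        if all(s in (par['role'], par['status']) for s in table_of_strings)
    ]


# Hypothetical parameter set, in the order of the parameter file.
sample = OrderedDict([
    ('omega_b', {'role': 'cosmo', 'status': 'varying'}),
    ('h', {'role': 'cosmo', 'status': 'varying'}),
    ('amplitude', {'role': 'nuisance', 'status': 'varying'}),
    ('other', {'role': 'nuisance', 'status': 'varying'}),
    ('calibration', {'role': 'nuisance', 'status': 'fixed'}),
    ('sigma8', {'role': 'derived', 'status': 'fixed'}),
])

varying = filter_parameters(sample, ['varying'])
varying_nuisance = filter_parameters(sample, ['nuisance', 'varying'])
```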
-
check_for_slow_step
(new_step) Check whether the values of the cosmological parameters were changed, and if not, skip the computation of the cosmology.
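The idea behind this check can be sketched as a comparison restricted to the cosmological parameters. The function name `cosmology_needs_update` and its arguments are assumptions made for the illustration; the real method receives new_step and works on the instance attributes.

```python
def cosmology_needs_update(previous, proposed, cosmo_names):
    """Return True if any cosmological parameter differs between two steps."""
    return any(previous[name] != proposed[name] for name in cosmo_names)


cosmo_names = ['omega_b', 'h']
previous = {'omega_b': 2.249, 'h': 0.7, 'amplitude': 1.0}

# Only a nuisance parameter moved: the cosmology can be reused.
fast_step = {'omega_b': 2.249, 'h': 0.7, 'amplitude': 1.1}
# A cosmological parameter moved: the cosmology must be recomputed.
slow_step = {'omega_b': 2.249, 'h': 0.71, 'amplitude': 1.0}

skip_allowed = not cosmology_needs_update(previous, fast_step, cosmo_names)
recompute = cosmology_needs_update(previous, slow_step, cosmo_names)
```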
-
update_cosmo_arguments
() Put in
cosmo_arguments
the current values of mcmc_parameters.
This method is called at every step in the Markov chain, to update the dictionary. The Markov chain stores unscaled values, so the scale has to be applied before the values are passed to the cosmological code.
Note
When you want to define new parameters in the Markov chain that do not have a one-to-one correspondence to a cosmological name, you can redefine their behaviour here. You will find several such examples in the source.
Note
For complex CLASS parameters, that expect a string of numbers separated with commas, you can now use the name of the argument, for instance
m_ncdm
, then append a double underscore and a number. So if you run with two cosmological parameters,
m_ncdm__1
and
m_ncdm__2
, this function will automatically concatenate the two and feed CLASS
m_ncdm
. You still have to make sure that the other variables are properly set, like
N_ncdm
to 2, in this example.
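The two behaviours described above, applying the scale and concatenating parameters that share a double-underscore suffix, can be sketched as follows. The function `build_cosmo_arguments`, the sample values and the exact string format of the concatenated entry are assumptions made for the illustration; the formatting used by the real method may differ.

```python
import re
from collections import OrderedDict


def build_cosmo_arguments(mcmc_parameters):
    """Build the dictionary handed to the cosmological code (sketch)."""
    cosmo_arguments = {}
    grouped = {}
    for name, par in mcmc_parameters.items():
        if par['role'] != 'cosmo':
            continue
        # The chain stores unscaled values, so apply the scale here.
        value = par['current'] * par['scale']
        match = re.match(r'^(.+)__(\d+)$', name)
        if match:
            # 'm_ncdm__2' contributes entry number 2 of 'm_ncdm'.
            index = int(match.group(2))
            grouped.setdefault(match.group(1), {})[index] = value
        else:
            cosmo_arguments[name] = value
    for base, values in grouped.items():
        # Assumed format: comma-separated values ordered by their index.
        cosmo_arguments[base] = ', '.join(
            str(values[index]) for index in sorted(values))
    return cosmo_arguments


sample = OrderedDict([
    ('h', {'role': 'cosmo', 'current': 2.0, 'scale': 0.5}),
    ('m_ncdm__1', {'role': 'cosmo', 'current': 0.5, 'scale': 1}),
    ('m_ncdm__2', {'role': 'cosmo', 'current': 0.25, 'scale': 1}),
    ('amplitude', {'role': 'nuisance', 'current': 1.0, 'scale': 1}),
])
arguments = build_cosmo_arguments(sample)
```

In this sketch the nuisance parameter is not passed on, and N_ncdm would still have to be set separately, as the note says.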
-
static
folder_is_initialised
(folder) Static method to call for checking if a folder was already initialised.
This method can be used to speed up the MPI initialisation in
run
. If a process finds that the folder is already a proper Monte Python one, it directly sends a ‘go’ signal to the next process in line.
Warning
This method assumes that the last lines of the log.param are the path indication. If this ever changes, adjust this method accordingly.
-
__cmp__
(other) Redefinition of the ‘compare’ method for two instances of this class.
It will decide which basic operations to perform when the code asks whether two instances are the same (in case you want to launch a new chain in an existing folder, with your own parameter file). Cosmological code versions are also compared, but a mismatch only raises a warning and does not fail the comparison.
-
class
data.
Parameter
(array, key) Bases:
dict
Store all important fields, and define a few convenience methods
This class replaces the old function defined in the Data class, called from_input_to_mcmc_parameters. The translation is now done inside the Parameter class, which interprets the array given as an input inside the parameter file, and returns a dictionary having all relevant fields initialized.
Warning
This used to be an ordered dictionary, for no evident reason. It is now an ordinary dictionary again. If this breaks anything, the change will be reverted.
At the end of this initialization, every field but one is filled for the specified parameter, be it fixed or varying. The missing field is the ‘last_accepted’ one, that will be filled in the module
mcmc
.
Note
The syntax of the parameter files is defined here. If one wants to change it, the changes should be made in this class.
The other fields are
Variables: - initial (array) – Initial array of input values defined in the parameter file. Contains (in this order) mean, minimum, maximum, 1-sigma. If the min/max values (TO CHECK proposal density boundaries) are unimportant/unconstrained, use None or -1 (without a period!)
- scale (float) – 5th entry of the initial array in the parameter file, defines the factor with which to multiply the values defined in initial to give the real value.
- role (str) – 6th entry of the initial array, can be cosmo, nuisance or derived. A derived parameter will not be considered as varying, but will be instead recovered from the cosmological code for each point in the parameter space.
- prior (
Prior
) – Defined through the optional 7th entry of the initial array. It can be omitted or set to flat (equivalent), or set to gaussian. An instance of the
prior
defined in
prior
will be initialized and set to this value.
- tex_name (str) – A tentative TeX version of the name, provided by the function
io_mp.get_tex_name()
.
- status (str) – Depending on the 1-sigma value in the initial array, it will be set to fixed or varying (resp. zero and non-zero)
- current (float) – Stores the value at the current point in parameter space (not allowed initially)
Parameters: - value (list) – Array read from the parameter file
- key (str) – Name of the parameter
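To make the field layout above concrete, here is a simplified sketch of how an input array could be turned into such a dictionary. The function `parse_parameter` is hypothetical: it stores the prior as a plain string rather than a Prior instance, and it leaves out tex_name, current and last_accepted, which depend on other parts of the code.

```python
def parse_parameter(array, key):
    """Turn a parameter-file array into a dictionary of fields (sketch)."""
    sigma = array[3]
    return {
        'name': key,                      # kept only for this illustration
        'initial': list(array[:4]),       # mean, minimum, maximum, 1-sigma
        'scale': array[4],                # 5th entry
        'role': array[5],                 # 6th entry: cosmo, nuisance, derived
        # Optional 7th entry; an omitted prior is treated as flat.
        'prior': array[6] if len(array) > 6 else 'flat',
        # A zero 1-sigma marks the parameter as fixed.
        'status': 'fixed' if sigma == 0 else 'varying',
    }


omega_b = parse_parameter([2.249, None, None, 0.016, 0.01, 'cosmo'], 'omega_b')
calibration = parse_parameter([1.0, 0.9, 1.1, 0, 1, 'nuisance', 'gaussian'],
                              'calibration')
```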