margarine Functions

class margarine.maf.MAF(theta, **kwargs)[source]

This class is used to train, load and call instances of a bijector built from a series of autoregressive neural networks.

Parameters:

theta: numpy array or anesthetic.samples
The samples from the probability distribution that we require the MAF to learn. This can either be a numpy array or an anesthetic NestedSamples or MCMCSamples object.

kwargs:

weights: numpy array / default=np.ones(len(theta))
The weights associated with the samples above. If an anesthetic NestedSamples or MCMCSamples object is passed the code draws the weights from this.
number_networks: int / default = 6
The bijector is built by chaining a series of autoregressive neural networks together and this parameter is used to determine how many networks there are in the chain.
learning_rate: float / default = 1e-3
The learning rate determines the ‘step size’ of the optimization algorithm used to train the MAF. Its value can affect the quality of emulation.
hidden_layers: list / default = [50, 50]
The number of layers and number of nodes in each hidden layer for each neural network. The default is two hidden layers with 50 nodes each and each network in the chain has the same hidden layer structure.
activation_func: string / default = ‘tanh’
The choice of activation function. It must be an activation function keyword recognisable by TensorFlow. The default is ‘tanh’, the hyperbolic tangent activation function.
theta_max: numpy array
The true upper limits of the priors used to generate the samples that we want the MAF to learn.
theta_min: numpy array
As above but the true lower limits of the priors.
parameters: list of strings
A list of the relevant parameters to train on. Only needed if theta is an anesthetic samples object. If not provided, all parameters will be used.
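
For example, a MAF with non-default settings might be initialised as follows. This is a minimal sketch; the file paths, parameter dimensions and kwarg values are illustrative only.

import numpy as np
from margarine.maf import MAF

theta = np.loadtxt('path/to/samples.txt')
weights = np.loadtxt('path/to/weights.txt')

# known prior limits for an illustrative two parameter problem
theta_min = np.array([0.0, -1.0])
theta_max = np.array([1.0, 1.0])

bij = MAF(theta, weights=weights, number_networks=4,
          hidden_layers=[100, 100],
          theta_min=theta_min, theta_max=theta_max)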

Attributes:

A list of some key attributes accessible to the user.

theta_max: numpy array
The true upper limits of the priors used to generate the samples that we want the MAF to learn. If theta_max is not supplied as a kwarg, then this is an approximate estimate.
theta_min: numpy array
As above but for the true lower limits of the priors. If theta_min is not supplied as a kwarg, then this is an approximate estimate.
loss_history: list
This list contains the value of the loss function at each epoch during training.

Methods

__call__(u)

This function is used when calling the MAF class to transform samples from the unit hypercube to samples on the MAF.

gen_mades()

Generating the masked autoregressive flow.

load(filename)

This function can be used to load a saved MAF.

log_like(params, logevidence[, prior_de])

This function should return the log-likelihood for a given set of parameters.

log_prob(params)

Function to calculate the log-probability for a given MAF and set of parameters.

sample([length])

This function is used to generate samples on the MAF via the MAF __call__ function.

save(filename)

This function can be used to save an instance of a trained MAF as a pickled class so that it can be loaded and used in different scripts.

train([epochs, early_stop, loss_type])

This function is called to train the MAF once it has been initialised.

__call__(u)[source]

This function is used when calling the MAF class to transform samples from the unit hypercube to samples on the MAF.

Parameters:

u: numpy array
Samples on the uniform hypercube.
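
For example, assuming bij is a trained MAF describing a 5-dimensional distribution (illustrative dimensionality), samples on the MAF can be generated with

import numpy as np

# samples on the unit hypercube
u = np.random.uniform(0, 1, size=(1000, 5))
samples = bij(u)
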
gen_mades()[source]

Generating the masked autoregressive flow.

classmethod load(filename)[source]

This function can be used to load a saved MAF. For example

from margarine.maf import MAF

file = 'path/to/pickled/MAF.pkl'
bij = MAF.load(file)

Parameters:

filename: string
Path to the saved MAF.
log_like(params, logevidence, prior_de=None)[source]

This function should return the log-likelihood for a given set of parameters.

It requires the logevidence from the original nested sampling run in order to do this and in the case that the prior is non-uniform a trained prior density estimator should be provided.

Parameters:

params: numpy array
The set of samples for which to calculate the log probability.
logevidence: float
Should be the log-evidence from the full nested sampling run with nuisance parameters.
prior_de: margarine.maf.MAF / default=None
If the prior is non-uniform then a trained prior density estimator should be provided. Otherwise the prior is assumed to be uniform and the prior probability is calculated analytically from the minimum and maximum values of the parameters.
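
For example, assuming bij is a trained MAF with uniform priors, the log-likelihood might be evaluated as follows. The value of logZ is illustrative and should be the log-evidence from the original nested sampling run.

logZ = -120.5  # illustrative log-evidence
params = bij.sample(1000)
logL = bij.log_like(params, logZ)
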
log_prob(params)[source]

Function to calculate the log-probability for a given MAF and set of parameters.

While the density estimator has its own built in log probability function, a correction has to be applied for the transformation of variables that is used to improve accuracy when learning. The correction is implemented here.

Parameters:

params: numpy array
The set of samples for which to calculate the log probability.
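
For example, assuming bij is a trained MAF and theta is the array of samples it was trained on

log_probs = bij.log_prob(theta)
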
sample(length=1000)[source]

This function is used to generate samples on the MAF via the MAF __call__ function.

Kwargs:

length: int / default=1000
This should be an integer and is used to determine how many samples are generated when calling the MAF.
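
For example, assuming bij is a trained MAF

samples = bij.sample(5000)

generates 5000 samples from the learned distribution.
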
save(filename)[source]

This function can be used to save an instance of a trained MAF as a pickled class so that it can be loaded and used in different scripts.

Parameters:

filename: string
Path in which to save the pickled MAF.
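
For example, a trained MAF bij could be saved with

bij.save('path/to/pickled/MAF.pkl')
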
train(epochs=100, early_stop=False, loss_type='sum')[source]

This function is called to train the MAF once it has been initialised. It calls the _training() function.

from margarine.maf import MAF

bij = MAF(theta, weights=weights)
bij.train()

Kwargs:

epochs: int / default = 100
The number of iterations to train the neural networks for.
early_stop: boolean / default = False
Determines whether to implement an early stopping algorithm or to train for the set number of epochs. If set to True then the algorithm will stop training when the test loss has not improved for 2% of the requested epochs. At this point margarine will roll back to the best model and return this to the user.
loss_type: string / default = ‘sum’
Determines whether to use the sum or mean of the weighted log probabilities to calculate the loss function.
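
For example, training with early stopping and the mean loss might look like the following (the 'mean' keyword is an assumption based on the loss_type description above)

bij = MAF(theta, weights=weights)
bij.train(epochs=500, early_stop=True, loss_type='mean')
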
class margarine.kde.KDE(theta, **kwargs)[source]

This class is used to generate a KDE given a weighted set of samples, generate samples from that KDE, transform samples on the hypercube into samples on the KDE and save and load the KDE model.

Parameters:

theta: numpy array or anesthetic.samples
The samples from the probability distribution that we require the KDE to learn. This can either be a numpy array or an anesthetic NestedSamples or MCMCSamples object.

kwargs:

weights: numpy array / default=np.ones(len(theta))
The weights associated with the samples above. If an anesthetic NestedSamples or MCMCSamples object is passed the code draws the weights from this.
bw_method: str, scalar or callable
The bandwidth method for the KDE, passed to scipy.stats.gaussian_kde.
theta_max: numpy array
The true upper limits of the priors used to generate the samples that we want the KDE to learn.
theta_min: numpy array
As above but the true lower limits of the priors.
parameters: list of strings
A list of the relevant parameters to train on. Only needed if theta is an anesthetic samples object. If not provided, all parameters will be used.

Attributes:

A list of some key attributes accessible to the user.

kde: Instance of scipy.stats.gaussian_kde
Once the class has been initialised with a set of samples and their corresponding weights we can generate the KDE using the following code

from margarine.kde import KDE
import numpy as np

theta = np.loadtxt('path/to/samples.txt')
weights = np.loadtxt('path/to/weights.txt')

KDE_class = KDE(theta, weights=weights)
KDE_class.generate_kde()

This is analogous to training a normalising flow (e.g. the MAF class). Once the KDE is generated it can be accessed via KDE_class.kde. Initialisation of the class and generation of the KDE are kept separate to allow models to be saved and loaded effectively.

theta_max: numpy array
The true upper limits of the priors used to generate the samples that we want the KDE to learn. If theta_max is not supplied as a kwarg, then this is an approximate estimate.
theta_min: numpy array
As above but for the true lower limits of the priors. If theta_min is not supplied as a kwarg, then this is an approximate estimate.

Methods

__call__(u)

This function is used when calling the KDE class to transform samples from the unit hypercube to samples on the KDE.

generate_kde()

Function normalises the input data into a standard normal parameter space and then generates a weighted KDE.

load(filename)

This function can be used to load a saved KDE.

log_like(params, logevidence[, prior, ...])

This function should return the log-likelihood for a given set of parameters.

log_prob(params)

Function to calculate the log-probability for a given KDE and set of parameters.

sample([length])

Function can be used to generate samples from the KDE.

save(filename)

Function can be used to save an initialised version of the KDE class and its associated generated KDE.

__call__(u)[source]

This function is used when calling the KDE class to transform samples from the unit hypercube to samples on the KDE.

Parameters:

u: numpy array
Samples on the uniform hypercube.
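
For example, using the KDE_class generated above for an illustrative two parameter problem

import numpy as np

u = np.random.uniform(0, 1, size=(1000, 2))
samples = KDE_class(u)
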
generate_kde()[source]

Function normalises the input data into a standard normal parameter space and then generates a weighted KDE.

classmethod load(filename)[source]

This function can be used to load a saved KDE. For example

from margarine.kde import KDE

file = 'path/to/pickled/bijector.pkl'
KDE_class = KDE.load(file)

Parameters:

filename: string
Path to the saved KDE.
log_like(params, logevidence, prior=None, prior_weights=None)[source]

This function should return the log-likelihood for a given set of parameters.

It requires the logevidence from the original nested sampling run in order to do this and in the case that the prior is non-uniform prior samples should be provided.

Parameters:

params: numpy array
The set of samples for which to calculate the log probability.
logevidence: float
Should be the log-evidence from the full nested sampling run with nuisance parameters.
prior: numpy array/default=None
An array of prior samples corresponding to the prior. Default assumption is that the prior is uniform which is required if you want to combine likelihoods from different experiments/data sets. In this case samples and prior samples should be reweighted prior to any training.
log_prob(params)[source]

Function to calculate the log-probability for a given KDE and set of parameters.

While the density estimator has its own built in log probability function, a correction has to be applied for the transformation of variables that is used to improve accuracy when learning. The correction is implemented here.

Parameters:

params: numpy array
The set of samples for which to calculate the log probability.
sample(length=1000)[source]

Function can be used to generate samples from the KDE. It is much faster than the __call__ function but does not transform samples from the hypercube onto the KDE. It is however useful if we want to generate a large number of samples that can then be used to calculate the marginal statistics.

Kwargs:

length: int / default=1000
This should be an integer and is used to determine how many samples are generated when calling the bijector.
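
For example, using the KDE_class generated above

samples = KDE_class.sample(10000)
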
save(filename)[source]

Function can be used to save an initialised version of the KDE class and its associated generated KDE.

Parameters:

filename: string
Path in which to save the pickled KDE.
class margarine.clustered.clusterMAF(theta, **kwargs)[source]

This class is used to train, load and call a piecewise normalising flow built from a set of masked autoregressive flows. The class is essentially a wrapper around the MAF class with some additional clustering functionality. This class has all the same functionality as the MAF class and can be used in the same way.

Parameters:

theta: numpy array or anesthetic.samples
The samples from the probability distribution that we require the MAF to learn. This can either be a numpy array or an anesthetic NestedSamples or MCMCSamples object.

kwargs:

weights: numpy array / default=np.ones(len(theta))
The weights associated with the samples above. If an anesthetic NestedSamples or MCMCSamples object is passed the code draws the weights from this.
number_networks: int / default = 6
The bijector is built by chaining a series of autoregressive neural networks together and this parameter is used to determine how many networks there are in the chain.
learning_rate: float / default = 1e-3
The learning rate determines the ‘step size’ of the optimization algorithm used to train the MAF. Its value can affect the quality of emulation.
hidden_layers: list / default = [50, 50]
The number of layers and number of nodes in each hidden layer for each neural network. The default is two hidden layers with 50 nodes each and each network in the chain has the same hidden layer structure.
activation_func: string / default = ‘tanh’
The choice of activation function. It must be an activation function keyword recognisable by TensorFlow. The default is ‘tanh’, the hyperbolic tangent activation function.
cluster_labels: list / default = None
If clustering has been performed externally to margarine you can provide a list of labels for the samples theta. The labels should be integers from 0 to k-1 corresponding to the cluster that each sample is in.
cluster_number: int / default = None
If clustering has been performed externally to margarine you need to provide the number of clusters, k, alongside the cluster labels.
parameters: list of strings
A list of the relevant parameters to train on. Only needed if theta is an anesthetic samples object. If not provided, all parameters will be used.
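
For example, if the samples have already been clustered externally, the labels and number of clusters can be passed directly. This is a minimal sketch; the use of sklearn's KMeans and the file paths are illustrative only.

import numpy as np
from sklearn.cluster import KMeans
from margarine.clustered import clusterMAF

theta = np.loadtxt('path/to/samples.txt')
weights = np.loadtxt('path/to/weights.txt')

# illustrative external clustering with k=3
kmeans = KMeans(n_clusters=3, random_state=0).fit(theta)
flow = clusterMAF(theta, weights=weights,
                  cluster_labels=kmeans.labels_.tolist(),
                  cluster_number=3)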

Methods

__call__(u[, seed])

This function is used when calling the clusterMAF class to transform samples from the unit hypercube to samples on the clusterMAF.

load(filename)

This function can be used to load a saved MAF.

log_like(params, logevidence[, prior_de])

This function should return the log-likelihood for a given set of parameters.

log_prob(params)

Function to calculate the log-probability for a given clusterMAF and set of parameters.

sample([length])

This function is used to generate samples on the clusterMAF via the clusterMAF __call__ function.

save(filename)

This function can be used to save an instance of a trained clusterMAF as a pickled class so that it can be loaded and used in different scripts.

train([epochs, early_stop, loss_type])

This function is called to train the clusterMAF once it has been initialised.

__call__(u, seed=1420)[source]

This function is used when calling the clusterMAF class to transform samples from the unit hypercube to samples on the clusterMAF.

Parameters:

u: numpy array
Samples on the uniform hypercube.
seed: int / default=1420
Set the seed for the cluster choice.
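
For example, assuming flow is a trained clusterMAF describing a 5-dimensional distribution (illustrative dimensionality)

import numpy as np

u = np.random.uniform(0, 1, size=(1000, 5))
samples = flow(u, seed=42)
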
classmethod load(filename)[source]

This function can be used to load a saved MAF. For example

from margarine.clustered import clusterMAF

file = 'path/to/pickled/MAF.pkl'
bij = clusterMAF.load(file)

Parameters:

filename: string
Path to the saved MAF.
log_like(params, logevidence, prior_de=None)[source]

This function should return the log-likelihood for a given set of parameters.

It requires the logevidence from the original nested sampling run in order to do this and in the case that the prior is non-uniform a trained prior density estimator should be provided.

Parameters:

params: numpy array
The set of samples for which to calculate the log probability.
logevidence: float
Should be the log-evidence from the full nested sampling run with nuisance parameters.
prior_de: margarine.maf.MAF / default=None
If the prior is non-uniform then a trained prior density estimator should be provided. Otherwise the prior is assumed to be uniform and the prior probability is calculated analytically from the minimum and maximum values of the parameters.
log_prob(params)[source]

Function to calculate the log-probability for a given clusterMAF and set of parameters.

While each density estimator has its own built in log probability function, a correction has to be applied for the transformation of variables that is used to improve accuracy when learning and we have to sum probabilities over the series of flows. The correction and the sum are implemented here.

Parameters:

params: numpy array
The set of samples for which to calculate the log probability.
sample(length=1000)[source]

This function is used to generate samples on the clusterMAF via the clusterMAF __call__ function.

Kwargs:

length: int / default=1000
This should be an integer and is used to determine how many samples are generated when calling the clusterMAF.
save(filename)[source]

This function can be used to save an instance of a trained clusterMAF as a pickled class so that it can be loaded and used in different scripts.

Parameters:

filename: string
Path in which to save the pickled MAF.
train(epochs=100, early_stop=False, loss_type='sum')[source]

This function is called to train the clusterMAF once it has been initialised. It calls the train() function for each flow.

from margarine.clustered import clusterMAF

bij = clusterMAF(theta, weights=weights)
bij.train()

Kwargs:

epochs: int / default = 100
The number of iterations to train the neural networks for.
early_stop: boolean / default = False
Determines whether to implement an early stopping algorithm or to train for the set number of epochs. If set to True then the algorithm will stop training when the test loss has not improved for 2% of the requested epochs. At this point margarine will roll back to the best model and return this to the user.
loss_type: string / default = ‘sum’
Determines whether to use the sum or mean of the weighted log probabilities to calculate the loss function.
class margarine.marginal_stats.calculate(de, **kwargs)[source]

This class, once initialised with a trained MAF or KDE and samples, can be used to calculate marginal KL divergences and Bayesian dimensionalities.

Parameters:

de: instance of MAF class or KDE class
This should be a loaded and trained instance of a MAF, clusterMAF or KDE. Bijectors can be loaded like so

from margarine.maf import MAF
from margarine.kde import KDE
from margarine.clustered import clusterMAF

file = '/trained_maf.pkl'
maf = MAF.load(file)

file = '/trained_kde.pkl'
kde = KDE.load(file)

file = '/trained_clustered_maf.pkl'
clustered_maf = clusterMAF.load(file)
samples: numpy array
This should be the output of the bijector when called to generate a set of samples from the replicated probability distribution, e.g. after loading a trained MAF we would pass

import numpy as np

u = np.random.uniform(0, 1, size=(10000, 5))
samples = maf(u)

Kwargs:

prior_de: instance of MAF class, clusterMAF class or KDE class
This should be a loaded and trained instance of a MAF, clusterMAF or KDE for the prior. If not provided, a uniform prior will be used.
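
Putting this together, the class might be initialised as follows. This is a minimal sketch; passing the samples through a samples keyword argument is an assumption based on the parameter list above.

from margarine.marginal_stats import calculate

stats_class = calculate(maf, samples=samples)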

Methods

integrate(loglikelihood, prior_pdf[, ...])

Importance sampling integration of a likelihood function

statistics()

Calculate marginal Bayesian KL divergence and dimensionality with approximate errors.

integrate(loglikelihood, prior_pdf, batch_size=1000, sample_size=10000, logzero=-1e+30)[source]

Importance sampling integration of a likelihood function

args:
loglikelihood: function
A function that takes a numpy array of samples and returns the loglikelihood of each sample.
prior_pdf: function
A function that takes a numpy array of samples and returns the prior logpdf of each sample.
batch_size: int
The number of samples to draw at each iteration.
sample_size: int
The number of samples to draw in total.
logzero: float
The definition of zero for the loglikelihood function.

returns:
stats: dict
Dictionary containing useful statistics
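
For example, a toy Gaussian likelihood with a uniform prior on the unit hypercube could be integrated as follows. This is a minimal sketch; the likelihood, prior and sample_size are illustrative, and stats_class is an initialised instance of calculate as above.

import numpy as np
from scipy.stats import multivariate_normal

# illustrative 5-dimensional Gaussian likelihood centred in the unit hypercube
gauss = multivariate_normal(mean=np.full(5, 0.5), cov=0.01 * np.eye(5))

def loglikelihood(x):
    # log-likelihood of each sample in x
    return gauss.logpdf(x)

def prior_pdf(x):
    # log-pdf of a uniform prior on the unit hypercube
    return np.zeros(len(x))

stats = stats_class.integrate(loglikelihood, prior_pdf, sample_size=5000)
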
statistics()[source]

Calculate marginal Bayesian KL divergence and dimensionality with approximate errors.
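
For example, continuing from the initialisation above

stats = stats_class.statistics()

returns the marginal KL divergence and Bayesian dimensionality, with approximate errors, for the supplied density estimator.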