neurotools.stats.information module

Routines concerning information theory

class neurotools.stats.information.discrete[source]

Bases: object

Collected methods for calculating biased information-theoretic quantities directly from discrete (categorical) probability distributions. These have never been fully tested.

classmethod DKL(P, Q, axis=None)[source]

Compute KL divergence D(P||Q) between discrete distributions P and Q.

Parameters:
  • P (np.array) – Vector of probabilities

  • Q (np.array) – Vector of probabilities

Returns:

DKL – KL divergence from P to Q

Return type:

float
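
Example (a minimal usage sketch; the distributions are hypothetical):

>>> import numpy as np
>>> from neurotools.stats.information import discrete
>>> P = np.array([0.1, 0.2, 0.3, 0.4])   # hypothetical distribution
>>> Q = np.full(4, 0.25)                 # uniform reference distribution
>>> dkl = discrete.DKL(P, Q)             # D(P||Q) in nats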

classmethod H(p, axis=None)[source]

Sample entropy \(-\sum_i p_i \ln p_i\), in nats.

Parameters:

p (array-like numeric) – List of frequencies or counts

Returns:

Shannon entropy of discrete distribution with observed counts

Return type:

float
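
Example (a minimal usage sketch with hypothetical counts):

>>> import numpy as np
>>> from neurotools.stats.information import discrete
>>> counts = np.array([10, 5, 1])   # observed counts for three categories
>>> h = discrete.H(counts)          # Shannon entropy in nats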

classmethod H_samples(samples)[source]

Calculate the sample entropy \(-\sum_i p_i \ln p_i\) (in nats) from a list of samples.

Parameters:

samples (array-like) – 1D array-like iterable of samples. Samples can be of any type, but must be hashable.

Returns:

Shannon entropy of samples

Return type:

float
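
Example (a minimal usage sketch; the samples here are strings, but any hashable type works):

>>> from neurotools.stats.information import discrete
>>> samples = ['a', 'a', 'b', 'c', 'a', 'b']
>>> h = discrete.H_samples(samples)   # entropy of the observed samples, in nats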

classmethod Hcond(p, axis=None)[source]

Conditional entropy \(H_{y|x}\).

Parameters:
  • p (array-like numeric) – List of frequencies or counts

  • axis (tuple) – List of axes corresponding to y in \(H_{y|x}\). Remaining axes presumed to correspond to x.

Returns:

Shannon entropy of discrete distribution with observed counts

Return type:

float
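
Example (a minimal usage sketch with a hypothetical 2D count table; axis=(0,) marks axis 0 as y, leaving axis 1 as x):

>>> import numpy as np
>>> from neurotools.stats.information import discrete
>>> counts = np.array([[4, 1],
...                    [2, 3]])                  # axis 0 indexes y, axis 1 indexes x
>>> h_cond = discrete.Hcond(counts, axis=(0,))   # H(y|x) in nats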

classmethod I(p, axes1=None, axes2=None)[source]

Mutual information from a discrete PDF

Parameters:
  • p (np.ndarray) – Array of probabilities or counts, at least two-dimensional.

  • axes1 (tuple) – List of axes corresponding to the first set of variables.

  • axes2 (tuple) – List of axes corresponding to the second set of variables.
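
Example (a minimal usage sketch with a hypothetical 2×2 joint count table):

>>> import numpy as np
>>> from neurotools.stats.information import discrete
>>> p = np.array([[10, 2],
...               [3, 8]])                       # joint counts for two binary variables
>>> mi = discrete.I(p, axes1=(0,), axes2=(1,))   # mutual information in nats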

classmethod shuffle(p, axes=(0, 1))[source]

Replace the joint density for the variables in axes with the product of their marginals, all conditioned on any remaining variables not included in axes.

classmethod Ishuff(p, axes=(0, 1))[source]

The shuffled information between (neuron₁, neuron₂, stimulus), in nats.

Parameters:

p (np.ndarray) – a 3D array of joint probabilities. The first two axes should index neuronal responses. The third axis should index the extrinsic covariate.
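
Example (a minimal usage sketch with a random, hypothetical joint distribution; deltaIshuff, deltaI, syn, and syn2 below expect the same 3D layout):

>>> import numpy as np
>>> from neurotools.stats.information import discrete
>>> rng = np.random.default_rng(0)
>>> p = rng.random((3, 3, 4))      # axes 0, 1: neuronal responses; axis 2: stimulus
>>> p /= p.sum()                   # normalize to a joint probability distribution
>>> i_shuff = discrete.Ishuff(p)   # shuffled information, in nats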

classmethod deltaIshuff(p, axes=(0, 1))[source]

Mutual information between (neuron₁, neuron₂) and (stimulus), relative to a conditionally-shuffled baseline where P(neuron₁, neuron₂) is replaced by P(neuron₁) P(neuron₂), in nats.

Naïve calculation of Latham and Nirenberg equation (2).

classmethod deltaI(p, axes=(0, 1))[source]

Mutual information between (neuron₁, neuron₂) and (stimulus), minus the sum of the mutual informations I(neuron₁; stimulus) and I(neuron₂; stimulus).

Naïve calculation of Latham and Nirenberg equation (5) (cf. Brenner et al., 2000; Machens et al., 2001; Schneidman et al., 2003).

classmethod deltaInoise(p, axes=(0, 1))[source]

\(I_{r_1,r_2;s} - I^{\text{shuffle}}_{r_1,r_2;s}\).

Naïve calculation of Schneidman, Bialek, Berry (2003) equation (14).

classmethod deltaIsig(p, axes=(0, 1))[source]

\(I_{r_1,r_2;s} - I^{\text{shuffle}}_{r_1,r_2;s}\).

Naïve calculation of Schneidman, Bialek, Berry (2003) equation (15).

classmethod syn(p, axes=(0, 1))[source]

Mutual information between (neuron₁, neuron₂) and (stimulus), minus the sum of the mutual informations I(neuron₁; stimulus) and I(neuron₂; stimulus).

Naïve calculation of Latham and Nirenberg equation (4) (cf. Brenner et al., 2000; Machens et al., 2001; Schneidman et al., 2003).

classmethod syn2(p)[source]

Naïve calculation of Schneidman, Bialek, Berry (2003) equation (11).

This should match discrete.syn.

Unlike discrete.syn, the axes here are fixed: p should be 3D, with axes (0, 1) indexing the neurons and axis 2 the stimulus.

neurotools.stats.information.poisson_entropy_nats(l)[source]

Approximate entropy of a Poisson distribution in nats

Parameters:

l (positive float) – Poisson rate parameter
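
Example (a minimal usage sketch; the rate value is arbitrary):

>>> from neurotools.stats.information import poisson_entropy_nats
>>> h = poisson_entropy_nats(5.0)   # approximate entropy (nats) of a Poisson with rate 5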

neurotools.stats.information.betapr(k, N)[source]

Bayesian estimation of the rate p in Bernoulli trials. This returns the posterior median for p given k positive examples from N trials, using the Jeffreys prior.

Parameters:
  • k (int or np.int32 array) – Number of observations for each state

  • N (positive int) – N>k total observations

Returns:

p – Estimated probability, or probability per bin if k is an np.int32 array. Probabilities are normalized to sum to 1.

Return type:

float or np.float32
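
Example (a minimal usage sketch for the scalar case):

>>> from neurotools.stats.information import betapr
>>> p_hat = betapr(3, 10)   # posterior median of p for k=3 positives in N=10 trials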

neurotools.stats.information.beta_regularized_histogram_mutual_information(x, y, nb=4, eps=1e-09, plo=2.5, phi=97.5, nshuffle=1000, nbootstrap=1000)[source]

A quick and dirty mutual information estimator.

The result will depend on the bin size, but a quick shuffle control provides a useful chance level.

Parameters:
  • x (iterable<number>) – Samples for first variable x

  • y (iterable<number>) – Samples for second variable y

  • nb (positive int; default 4) – Number of bins. I suggest 3-10.

  • eps (small positive float; default 1e-9)

  • plo (number ∈(0,100); default 2.5)

  • phi (number ∈(0,100); default 97.5)

Returns:

  • Ixy (float) – Histogram-based MI estimate Ixy = Hx + Hy - Hxy

  • Idelta (float) – Shuffle-adjusted MI Idelta = np.median(Hx+Hy) - Hxy

  • pvalue (float) – p-value for significant MI from the shuffle test

  • lo (float) – The bootstrap plo percentile of Idelta

  • hi (float) – The bootstrap phi percentile of Idelta
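
Example (a minimal usage sketch with synthetic data; the five return values are unpacked in the order documented above):

>>> import numpy as np
>>> from neurotools.stats.information import (
...     beta_regularized_histogram_mutual_information)
>>> rng = np.random.default_rng(0)
>>> x = rng.normal(size=500)
>>> y = x + rng.normal(size=500)   # y is correlated with x
>>> Ixy, Idelta, pvalue, lo, hi = (
...     beta_regularized_histogram_mutual_information(x, y, nb=4))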

class neurotools.stats.information.JointCategoricalDistribution(counts, joined, kept, states, nstate)[source]

Bases: NamedTuple

counts: ndarray

Alias for field number 0

joined: ndarray

Alias for field number 1

kept: ndarray

Alias for field number 2

states: ndarray

Alias for field number 3

nstate: ndarray

Alias for field number 4

neurotools.stats.information.joint(*args, nstates=None, remove_empty=False)[source]

Convert a list of samples from several categorical variables into a single, new categorical variable.

This drops marginal states not present in any variable (which may not be what you want)
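
Example (a minimal usage sketch with two hypothetical categorical sample vectors; that the result is the JointCategoricalDistribution named tuple documented above is an assumption):

>>> import numpy as np
>>> from neurotools.stats.information import joint
>>> a = np.array([0, 1, 1, 2, 0])
>>> b = np.array([1, 1, 0, 0, 1])
>>> result = joint(a, b)      # assumed: JointCategoricalDistribution named tuple
>>> counts = result.counts    # counts field of the named tuple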

class neurotools.stats.information.dirichlet_density[source]

Bases: object

Model empirical (sampled) densities of categorical variables using a Dirichlet prior.

The default prior is α = 1/k, where k is the number of states. This makes things behave nicer under marginalization. Specify bias=0.5 if you want Jeffreys prior.

These estimates are biased.

classmethod joint_model(*args, bias=None)[source]

Build a Dirichlet model of the joint categorical distribution.

classmethod p(*samples, bias=0.5)[source]

Expected probability

classmethod lnp(*samples, bias=0.5)[source]

Expected log-probability

classmethod plnp(*samples, bias=None)[source]

Expected p*ln(p)

classmethod H(*samples, bias=None)[source]

Expected entropy \(\langle -p \ln p \rangle\)

classmethod I(a, b, bias=None)[source]

Mutual information

classmethod redundancy(x1, x2, y, bias=None)[source]

For (x1,x2,y) calculate I(x1,y) + I(x2,y) - I(joint(x1,x2),y).

Positive: redundant; zero: independent; negative: synergistic.
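
Example (a minimal usage sketch; passing raw categorical sample arrays to I and redundancy is an assumption based on the class description and the *samples signatures above):

>>> import numpy as np
>>> from neurotools.stats.information import dirichlet_density
>>> rng = np.random.default_rng(0)
>>> x1 = rng.integers(0, 3, size=200)            # hypothetical categorical samples
>>> x2 = rng.integers(0, 3, size=200)
>>> y = (x1 + x2) % 3                            # depends on both x1 and x2
>>> mi = dirichlet_density.I(x1, y)              # regularized mutual information estimate
>>> r = dirichlet_density.redundancy(x1, x2, y)  # negative values suggest synergy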

classmethod foo(x1, x2, y)[source]