neurotools.stats.information module

Routines concerning information theory

class neurotools.stats.information.discrete[source]

Bases: object

Collected methods for calculating biased information-theoretic quantities directly from discrete (categorical) probability distributions. These have never been fully tested.

classmethod DKL(P, Q, axis=None)[source]

Compute KL divergence D(P||Q) between discrete distributions P and Q.

Parameters:
  • P (np.array) – Vector of probabilities

  • Q (np.array) – Vector of probabilities

Returns:

DKL – KL divergence from P to Q

Return type:

float
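
Example (a minimal usage sketch; the distributions are hypothetical):

>>> import numpy as np
>>> from neurotools.stats.information import discrete
>>> P = np.array([0.1, 0.2, 0.3, 0.4])   # hypothetical distribution
>>> Q = np.full(4, 0.25)                 # uniform reference distribution
>>> dkl = discrete.DKL(P, Q)             # D(P||Q) in nats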

classmethod H(p, axis=None)[source]

Sample entropy \(-\sum_i p_i \ln p_i\), in nats.

Parameters:

p (array-like numeric) – List of frequencies or counts

Returns:

Shannon entropy of discrete distribution with observed counts

Return type:

float
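
Example (a minimal usage sketch with hypothetical counts):

>>> import numpy as np
>>> from neurotools.stats.information import discrete
>>> counts = np.array([10, 5, 1])   # observed counts for three categories
>>> h = discrete.H(counts)          # Shannon entropy in nats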

classmethod H_samples(samples)[source]

Calculate the sample entropy \(-\sum_i p_i \ln p_i\) (in nats) from a list of samples.

Parameters:

samples (array-like) – 1D array-like iterable of samples. Samples can be of any type, but must be hashable.

Returns:

Shannon entropy of samples

Return type:

float
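
Example (a minimal usage sketch; the samples here are strings, but any hashable type works):

>>> from neurotools.stats.information import discrete
>>> samples = ['a', 'a', 'b', 'c', 'a', 'b']
>>> h = discrete.H_samples(samples)   # entropy of the observed samples, in nats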

classmethod Hcond(p, axis=None)[source]

Conditional entropy \(H_{y|x}\).

Parameters:
  • p (array-like numeric) – List of frequencies or counts

  • axis (tuple) – List of axes corresponding to y in \(H_{y|x}\). Remaining axes presumed to correspond to x.

Returns:

Shannon entropy of discrete distribution with observed counts

Return type:

float
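
Example (a minimal usage sketch with a hypothetical 2D count table; axis=(0,) marks axis 0 as y, leaving axis 1 as x):

>>> import numpy as np
>>> from neurotools.stats.information import discrete
>>> counts = np.array([[4, 1],
...                    [2, 3]])                  # axis 0 indexes y, axis 1 indexes x
>>> h_cond = discrete.Hcond(counts, axis=(0,))   # H(y|x) in nats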

classmethod I(p, axes1=None, axes2=None)[source]

Mutual information from a discrete PDF

Parameters:
  • p (np.ndarray) – Array of probabilities or counts, at least two-dimensional.

  • axes1 (tuple) – List of axes corresponding to the first set of variables.

  • axes2 (tuple) – List of axes corresponding to the second set of variables.
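
Example (a minimal usage sketch with a hypothetical 2×2 joint count table):

>>> import numpy as np
>>> from neurotools.stats.information import discrete
>>> p = np.array([[10, 2],
...               [3, 8]])                       # joint counts for two binary variables
>>> mi = discrete.I(p, axes1=(0,), axes2=(1,))   # mutual information in nats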

classmethod shuffle(p, axes=(0, 1))[source]

Replace the joint density for the variables in axes with the product of their marginals, all conditioned on any remaining variables not included in axes.

classmethod Ishuff(p, axes=(0, 1))[source]

The shuffled information between (neuron₁, neuron₂, stimulus), in nats.

Parameters:

p (np.ndarray) – a 3D array of joint probabilities. The first two axes should index neuronal responses. The third axis should index the extrinsic covariate.
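
Example (a minimal usage sketch with a random, hypothetical joint distribution; deltaIshuff, deltaI, syn, and syn2 below expect the same 3D layout):

>>> import numpy as np
>>> from neurotools.stats.information import discrete
>>> rng = np.random.default_rng(0)
>>> p = rng.random((3, 3, 4))      # axes 0, 1: neuronal responses; axis 2: stimulus
>>> p /= p.sum()                   # normalize to a joint probability distribution
>>> i_shuff = discrete.Ishuff(p)   # shuffled information, in nats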

classmethod deltaIshuff(p, axes=(0, 1))[source]

Mutual information between (neuron₁, neuron₂) and (stimulus), relative to a conditionally-shuffled baseline where P(neuron₁, neuron₂) is replaced by P(neuron₁) P(neuron₂), in nats.

Naïve calculation of Latham and Nirenberg equation (2).

classmethod deltaI(p, axes=(0, 1))[source]

Mutual information between (neuron₁, neuron₂) and (stimulus), minus the sum of the mutual informations I(neuron₁; stimulus) and I(neuron₂; stimulus).

Naïve calculation of Latham and Nirenberg equation (5) (cf. Brenner et al., 2000; Machens et al., 2001; Schneidman et al., 2003).

classmethod deltaInoise(p, axes=(0, 1))[source]

\(I_{r_1,r_2;s} - I^{\text{shuffle}}_{r_1,r_2;s}\).

Naïve calculation of Schneidman, Bialek, Berry (2003) equation (14).

classmethod deltaIsig(p, axes=(0, 1))[source]

\(I_{r_1,r_2;s} - I^{\text{shuffle}}_{r_1,r_2;s}\).

Naïve calculation of Schneidman, Bialek, Berry (2003) equation (15).

classmethod syn(p, axes=(0, 1))[source]

Mutual information between (neuron₁, neuron₂) and (stimulus), minus the sum of the mutual informations I(neuron₁; stimulus) and I(neuron₂; stimulus).

Naïve calculation of Latham and Nirenberg equation (4) (cf. Brenner et al., 2000; Machens et al., 2001; Schneidman et al., 2003).

classmethod syn2(p)[source]

Naïve calculation of Schneidman, Bialek, Berry (2003) equation (11).

This should match discrete.syn.

Unlike discrete.syn, the axes here are fixed: p should be 3D, with axes (0, 1) indexing the neurons and axis 2 the stimulus.

neurotools.stats.information.poisson_entropy_nats(l)[source]

Approximate entropy of a Poisson distribution in nats

Parameters:

l (positive float) – Poisson rate parameter
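
Example (a minimal usage sketch; the rate value is arbitrary):

>>> from neurotools.stats.information import poisson_entropy_nats
>>> h = poisson_entropy_nats(5.0)   # approximate entropy (nats) of a Poisson with rate 5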

neurotools.stats.information.betapr(k, N)[source]

Bayesian estimation of the rate p in Bernoulli trials. This returns the posterior median for p given k positive examples from N trials, using the Jeffreys prior.

Parameters:
  • k (int or np.int32 array) – Number of observations for each state

  • N (positive int) – N>k total observations

Returns:

p – Estimated probability, or probability per bin if k is an np.int32 array. Probabilities are normalized to sum to 1.

Return type:

float or np.float32
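
Example (a minimal usage sketch for the scalar case):

>>> from neurotools.stats.information import betapr
>>> p_hat = betapr(3, 10)   # posterior median of p for k=3 positives in N=10 trials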

neurotools.stats.information.beta_regularized_histogram_mutual_information(x, y, nb=4, eps=1e-09, plo=2.5, phi=97.5, nshuffle=1000, nbootstrap=1000)[source]

A quick and dirty mutual information estimator.

The result will depend on the bin size, but a quick shuffle control provides a useful chance level.

Parameters:
  • x (iterable<number>) – Samples for first variable x

  • y (iterable<number>) – Samples for second variable y

  • nb (positive int; default 4) – Number of bins. I suggest 3-10.

  • eps (small positive float; default 1e-9)

  • plo (number ∈(0,100); default 2.5)

  • phi (number ∈(0,100); default 97.5)

Returns:

  • Ixy (float) – Histogram-based MI estimate Ixy = Hx + Hy - Hxy

  • Idelta (float) – Shuffle-adjusted MI Idelta = np.median(Hx+Hy) - Hxy

  • pvalue (float) – p-value for significant MI from the shuffle test

  • lo (float) – The bootstrap plo percentile of Idelta

  • hi (float) – The bootstrap phi percentile of Idelta
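
Example (a minimal usage sketch with synthetic data; the five return values are unpacked in the order documented above):

>>> import numpy as np
>>> from neurotools.stats.information import (
...     beta_regularized_histogram_mutual_information)
>>> rng = np.random.default_rng(0)
>>> x = rng.normal(size=500)
>>> y = x + rng.normal(size=500)   # y is correlated with x
>>> Ixy, Idelta, pvalue, lo, hi = (
...     beta_regularized_histogram_mutual_information(x, y, nb=4))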

class neurotools.stats.information.JointCategoricalDistribution(counts, joined, kept, states, nstate)[source]

Bases: NamedTuple

counts: ndarray

Alias for field number 0

joined: ndarray

Alias for field number 1

kept: ndarray

Alias for field number 2

states: ndarray

Alias for field number 3

nstate: ndarray

Alias for field number 4

neurotools.stats.information.joint(*args, nstates=None, remove_empty=False)[source]

Convert a list of samples from several categorical variables into a single, new categorical variable.

This drops marginal states not present in any variable (which may not be what you want)
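
Example (a minimal usage sketch with two hypothetical categorical sample vectors; that the result is the JointCategoricalDistribution named tuple documented above is an assumption):

>>> import numpy as np
>>> from neurotools.stats.information import joint
>>> a = np.array([0, 1, 1, 2, 0])
>>> b = np.array([1, 1, 0, 0, 1])
>>> result = joint(a, b)      # assumed: JointCategoricalDistribution named tuple
>>> counts = result.counts    # counts field of the named tuple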

class neurotools.stats.information.dirichlet_density[source]

Bases: object

Model empirical (sampled) densities of categorical variables using a Dirichlet prior.

The default prior is α = 1/k, where k is the number of states. This makes things behave nicer under marginalization. Specify bias=0.5 if you want Jeffreys prior.

These estimates are biased.

classmethod joint_model(*args, bias=None)[source]

Build a Dirichlet model of the joint categorical distribution.

classmethod p(*samples, bias=0.5)[source]

Expected probability

classmethod lnp(*samples, bias=0.5)[source]

Expected log-probability

classmethod plnp(*samples, bias=None)[source]

Expected p*ln(p)

classmethod H(*samples, bias=None)[source]

Expected entropy \(\langle -p \ln p \rangle\)

classmethod I(a, b, bias=None)[source]

Mutual information

classmethod redundancy(x1, x2, y, bias=None)[source]

For (x1,x2,y) calculate I(x1,y) + I(x2,y) - I(joint(x1,x2),y).

Positive: redundant; zero: independent; negative: synergistic.
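
Example (a minimal usage sketch; passing raw categorical sample arrays to I and redundancy is an assumption based on the class description and the *samples signatures above):

>>> import numpy as np
>>> from neurotools.stats.information import dirichlet_density
>>> rng = np.random.default_rng(0)
>>> x1 = rng.integers(0, 3, size=200)            # hypothetical categorical samples
>>> x2 = rng.integers(0, 3, size=200)
>>> y = (x1 + x2) % 3                            # depends on both x1 and x2
>>> mi = dirichlet_density.I(x1, y)              # regularized mutual information estimate
>>> r = dirichlet_density.redundancy(x1, x2, y)  # negative values suggest synergy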

classmethod foo(x1, x2, y)[source]