Distributions#
Phase-type distributions. The Coalescent class serves as an entry point for accessing all other distributions.
- class Coalescent(n: int | Dict[str, int] | List[int] | LineageConfig, model: CoalescentModel = None, demography: Demography = None, loci: int | LocusConfig = 1, recombination_rate: float = None, start_time: float = 0, end_time: float = None)[source]#
Bases:
AbstractCoalescent,SerializableCoalescent distribution.
- __init__(n: int | Dict[str, int] | List[int] | LineageConfig, model: CoalescentModel = None, demography: Demography = None, loci: int | LocusConfig = 1, recombination_rate: float = None, start_time: float = 0, end_time: float = None)[source]#
Create object.
- Parameters:
n (
Union[int,Dict[str,int],List[int],LineageConfig]) – Number of lineages. Either a single integer if only one population, or a list of integers or dictionary with population names as keys and number of lineages as values for multiple populations. Alternatively, aLineageConfigobject can be passed.model (
CoalescentModel) – Coalescent model. Default is the standard coalescent.demography (
Demography) – Demography.loci (
int|LocusConfig) – Number of loci or locus configuration.recombination_rate (
float) – Recombination rate.start_time (
float) – Time when to start accumulating moments. By default, this is 0.end_time (
float) – Time when to end the accumulating moments. IfNone, the end time is taken to be the time of almost sure absorption. Note that unnecessarily long end times can lead to numerical errors.
-
start_time:
float# Time when to start accumulating moments
- property lineage_counting_state_space: LineageCountingStateSpace#
The lineage-counting state space.
- property block_counting_state_space: BlockCountingStateSpace#
The block-counting state space.
- property tree_height: TreeHeightDistribution#
Tree height distribution.
- property total_branch_length: PhaseTypeDistribution#
Total branch length distribution.
- property sfs: UnfoldedSFSDistribution#
Unfolded site-frequency spectrum distribution.
- property fsfs: FoldedSFSDistribution#
Folded site-frequency spectrum distribution.
- moment(k: int = 1, rewards: Sequence[Reward] = None, start_time: float = None, end_time: float = None, center: bool = True, permute: bool = True)[source]#
Get the kth (non-central) moment using the specified rewards and state space.
- Parameters:
k (
int) – The order of the momentrewards (
Sequence[Reward]) – Sequence of k rewards. By default, tree height rewards are used.start_time (
float) – Time when to start accumulation of moments. By default, the start time specified when initializing the distribution.end_time (
float) – Time when to end accumulation of moments. By default, either the end time specified when initializing the distribution or the time until almost sure absorption.center (
bool) – Whether to center the moment.permute (
bool) – For cross-moments, whether to average over all permutations of rewards. Default isTrue, which will provide the correct cross-moment. If set toFalse, the cross-moment will be conditioned on the order of rewards.
- Return type:
float- Returns:
The kth moment
- accumulate(k: int, end_times: Iterable[float], rewards: Sequence[Reward] = None, center: bool = True, permute: bool = True)[source]#
Accumulate moments at different times.
- Parameters:
k (
int) – The order of the moment.end_times (
Iterable[float]) – Times when to evaluate the moment. By default, 200 evenly spaced values between 0 and the 99th percentile.rewards (
Sequence[Reward]) – Sequence of k rewards. By default, the reward of the underlying distribution.center (
bool) – Whether to center the moment around the mean.permute (
bool) – For cross-moments, whether to average over all permutations of rewards. Default isTrue, which will provide the correct cross-moment. If set toFalse, the cross-moment will be conditioned on the order of rewards.
- Return type:
ndarray- Returns:
Accumulation of moments.
- plot_accumulation(k: int = 1, end_times: Iterable[float] = None, rewards: Sequence[Reward] = None, center: bool = True, permute: bool = True, ax: plt.Axes = None, show: bool = True, file: str = None, clear: bool = False, label: str = None, title: str = None)[source]#
Plot the accumulation of moments.
- Parameters:
k (
int) – The order of the moment.end_times (
Iterable[float]) – Times when to evaluate the moment. By default, 200 evenly spaced values between 0 and the 99th percentile.rewards (
Sequence[Reward]) – Sequence of k rewards. By default, the reward of the underlying distribution.center (
bool) – Whether to center the moment around the mean.permute (
bool) – For cross-moments, whether to average over all permutations of rewards. Default isTrue, which will provide the correct cross-moment. If set toFalse, the cross-moment will be conditioned on the order of rewards.ax (plt.Axes) – Axes to plot on.
show (
bool) – Whether to show the plot.file (
str) – File to save the plot to.clear (
bool) – Whether to clear the plot before plotting.label (
str) – Label for the plot.title (
str) – Title of the plot.
- Return type:
plt.Axes
- Returns:
Axes.
- to_json()[source]#
Serialize to JSON. Drop cache before serializing.
- Return type:
str- Returns:
JSON string.
- to_msprime(num_replicates: int = 10000, n_threads: int = 10, parallelize: bool = True, record_migration: bool = False, simulate_mutations: bool = False, mutation_rate: float = None, seed: int = None)[source]#
Convert to msprime coalescent.
- Parameters:
num_replicates (
int) – Number of replicates.n_threads (
int) – Number of threads.parallelize (
bool) – Whether to parallelize.record_migration (
bool) – Whether to record migrations which is necessary to calculate statistics per deme.simulate_mutations (
bool) – Whether to simulate mutations.mutation_rate (
float) – Mutation rate.seed (
int) – Random seed.
- Return type:
MsprimeCoalescent- Returns:
msprime coalescent.
- classmethod from_file(file: str, classes=None)#
Load object from file.
- Parameters:
classes – Classes to be used for unserialization
file (
str) – File to load from
- Return type:
Self
- classmethod from_json(json: str, classes=None)#
Unserialize object.
- Parameters:
classes – Classes to be used for deserialization.
json (
str) – JSON string
- Return type:
Self
- to_file(file: str)#
Save object to file (JSON).
- Parameters:
file (
str) – File path.
-
lineage_config:
LineageConfig# Population configuration
-
locus_config:
LocusConfig# Locus configuration
-
model:
CoalescentModel# Coalescent model
-
demography:
Demography# Demography
-
end_time:
float# End time
- class PhaseTypeDistribution(state_space: StateSpace, tree_height: TreeHeightDistribution, demography: Demography = None, reward: Reward = None)[source]#
Bases:
MomentAwareDistributionPhase-type distribution for a piecewise time-homogeneous process.
- __init__(state_space: StateSpace, tree_height: TreeHeightDistribution, demography: Demography = None, reward: Reward = None)[source]#
Initialize the distribution.
- Parameters:
state_space (
StateSpace) – The state space.tree_height (
TreeHeightDistribution) – The tree height distribution.demography (
Demography) – The demography.reward (
Reward) – The reward. By default, the tree height reward.
-
lineage_config:
LineageConfig# Population configuration
-
locus_config:
LocusConfig# Locus configuration
-
state_space:
StateSpace# State space
-
demography:
Demography# Demography
-
tree_height:
TreeHeightDistribution# Tree height distribution
- property demes: MarginalDemeDistributions#
Marginal distributions over each deme.
- property loci: MarginalLocusDistributions#
Marginal distributions over each locus.
- moment(k: int, rewards: Sequence[Reward] = None, start_time: float = None, end_time: float = None, center: bool = True, permute: bool = True)[source]#
Get the kth (non-central) (cross-)moment.
- Parameters:
k (
int) – The order of the moment.rewards (
Sequence[Reward]) – Iterable of k rewards. By default, the reward of the underlying distribution.start_time (
float) – Time when to start accumulation of moments. By default, the start time specified when initializing the distribution.end_time (
float) – Time when to end accumulation of moments. By default, either the end time specified when initializing the distribution or the time until almost sure absorption.center (
bool) – Whether to center the moment around the mean.permute (
bool) – For cross-moments, whether to average over all permutations of rewards. Default isTrue, which will provide the correct cross-moment. If set toFalse, the cross-moment will be conditioned on the order of rewards.
- Return type:
float- Returns:
The kth moment
- accumulate(k: int, end_times: Iterable[float], rewards: Sequence[Reward] = None, center: bool = True, permute: bool = True)[source]#
Evaluate the kth moment at different end times.
- Parameters:
k (
int) – The order of the moment.end_times (
Iterable[float]) – List of ends times or end time when to evaluate the moment.rewards (
Sequence[Reward]) – Sequence of k rewards. By default, the reward of the underlying distribution.center (
bool) – Whether to center the moment around the mean.permute (
bool) – For cross-moments, whether to average over all permutations of rewards. Default isTrue, which will provide the correct cross-moment. If set toFalse, the cross-moment will be conditioned on the order of rewards.
- Return type:
ndarray- Returns:
The moment accumulated at the specified times or time.
- plot_accumulation(k: int = 1, end_times: Iterable[float] = None, rewards: Sequence[Reward] = None, center: bool = True, permute: bool = True, ax: plt.Axes = None, show: bool = True, file: str = None, clear: bool = True, label: str = None, title: str = None)[source]#
Plot accumulation of (non-central) moments at different times.
Note
This is different from a CDF, as it shows the accumulation of moments rather than the probability of having reached absorption at a certain time.
- Parameters:
k (
int) – The order of the moment.end_times (
Iterable[float]) – Times when to evaluate the moment. By default, 200 evenly spaced values between 0 and the 99th percentile.rewards (
Sequence[Reward]) – Sequence of k rewards. By default, the reward of the underlying distribution.center (
bool) – Whether to center the moment around the mean.permute (
bool) – For cross-moments, whether to average over all permutations of rewards. Default isTrue, which will provide the correct cross-moment. If set toFalse, the cross-moment will be conditioned on the order of rewards.ax (plt.Axes) – The axes to plot on.
show (
bool) – Whether to show the plot.file (
str) – File to save the plot to.clear (
bool) – Whether to clear the plot before plotting.label (
str) – Label for the plot.title (
str) – Title of the plot.
- Return type:
plt.Axes
- Returns:
Axes.
- touch(**kwargs)#
Touch all cached properties.
- Parameters:
kwargs – Additional keyword arguments.
- class TreeHeightDistribution(state_space: LineageCountingStateSpace, demography: Demography = None, start_time: float = 0, end_time: float = None)[source]#
Bases:
PhaseTypeDistribution,DensityAwareDistributionPhase-type distribution for a piecewise time-homogeneous process that allows the computation of the density function. This is currently only possible with default rewards.
-
max_epochs:
int= 10000# Maximum number of epochs to consider when determining time to almost sure absorption.
-
max_iter:
int= 20# Maximum number of time we double the end time when determining time to almost sure absorption.
-
p_absorption:
float= 0.999999999999999# Probability of almost sure absorption.
- __init__(state_space: LineageCountingStateSpace, demography: Demography = None, start_time: float = 0, end_time: float = None)[source]#
Initialize the distribution.
- Parameters:
state_space (
LineageCountingStateSpace) – The state space.demography (
Demography) – The demography.start_time (
float) – Time when to start accumulating moments.end_time (
float) – Time when to end accumulation of moments. By default, the time until almost sure absorption.
-
state_space:
LineageCountingStateSpace# State space
-
start_time:
float# Start time
-
end_time:
float|None# End time
- cdf(t: float | Sequence[float])[source]#
Cumulative distribution function.
- Parameters:
t (
Union[float,Sequence[float]]) – Value or values to evaluate the CDF at.- Return type:
float|ndarray- Returns:
Cumulative probability
- Raises:
NotImplementedError – if rewards are not default
- quantile(q: float, expansion_factor: float = 2, precision: float = 1e-05, max_iter: int = 1000)[source]#
Find the specified quantile of a CDF using an adaptive bisection method.
- Parameters:
q (
float) – The desired quantile (between 0 and 1).expansion_factor (
float) – Factor by which to multiply the upper bound that does not yet contain the quantile.precision (
float) – Precision for quantile, i.e.F(b) - F(a) < precision.max_iter (
int) – Maximum number of iterations.
- Returns:
The quantile.
- pdf(t: float | Sequence[float], dx: float = None)[source]#
Density function. We use numerical differentiation of the CDF to calculate the density. This provides good results as the CDF is exact and continuous.
- Parameters:
t (
Union[float,Sequence[float]]) – Value or values to evaluate the density function at.dx (
float) – Step size for numerical differentiation. By default, the 99th percentile divided by 1e10.
- Return type:
float|ndarray- Returns:
Density
- property t_max: float#
Time until which computations are performed. This is either the end time specified when initializing the distribution or the time until almost sure absorption.
- accumulate(k: int, end_times: Iterable[float], rewards: Sequence[Reward] = None, center: bool = True, permute: bool = True)#
Evaluate the kth moment at different end times.
- Parameters:
k (
int) – The order of the moment.end_times (
Iterable[float]) – List of ends times or end time when to evaluate the moment.rewards (
Sequence[Reward]) – Sequence of k rewards. By default, the reward of the underlying distribution.center (
bool) – Whether to center the moment around the mean.permute (
bool) – For cross-moments, whether to average over all permutations of rewards. Default isTrue, which will provide the correct cross-moment. If set toFalse, the cross-moment will be conditioned on the order of rewards.
- Return type:
ndarray- Returns:
The moment accumulated at the specified times or time.
- property demes: MarginalDemeDistributions#
Marginal distributions over each deme.
- property loci: MarginalLocusDistributions#
Marginal distributions over each locus.
- moment(k: int, rewards: Sequence[Reward] = None, start_time: float = None, end_time: float = None, center: bool = True, permute: bool = True)#
Get the kth (non-central) (cross-)moment.
- Parameters:
k (
int) – The order of the moment.rewards (
Sequence[Reward]) – Iterable of k rewards. By default, the reward of the underlying distribution.start_time (
float) – Time when to start accumulation of moments. By default, the start time specified when initializing the distribution.end_time (
float) – Time when to end accumulation of moments. By default, either the end time specified when initializing the distribution or the time until almost sure absorption.center (
bool) – Whether to center the moment around the mean.permute (
bool) – For cross-moments, whether to average over all permutations of rewards. Default isTrue, which will provide the correct cross-moment. If set toFalse, the cross-moment will be conditioned on the order of rewards.
- Return type:
float- Returns:
The kth moment
- plot_accumulation(k: int = 1, end_times: Iterable[float] = None, rewards: Sequence[Reward] = None, center: bool = True, permute: bool = True, ax: plt.Axes = None, show: bool = True, file: str = None, clear: bool = True, label: str = None, title: str = None)#
Plot accumulation of (non-central) moments at different times.
Note
This is different from a CDF, as it shows the accumulation of moments rather than the probability of having reached absorption at a certain time.
- Parameters:
k (
int) – The order of the moment.end_times (
Iterable[float]) – Times when to evaluate the moment. By default, 200 evenly spaced values between 0 and the 99th percentile.rewards (
Sequence[Reward]) – Sequence of k rewards. By default, the reward of the underlying distribution.center (
bool) – Whether to center the moment around the mean.permute (
bool) – For cross-moments, whether to average over all permutations of rewards. Default isTrue, which will provide the correct cross-moment. If set toFalse, the cross-moment will be conditioned on the order of rewards.ax (plt.Axes) – The axes to plot on.
show (
bool) – Whether to show the plot.file (
str) – File to save the plot to.clear (
bool) – Whether to clear the plot before plotting.label (
str) – Label for the plot.title (
str) – Title of the plot.
- Return type:
plt.Axes
- Returns:
Axes.
- plot_cdf(ax: plt.Axes = None, t: ndarray = None, show: bool = True, file: str = None, clear: bool = True, label: str = None, title: str = 'Tree height CDF')#
Plot cumulative distribution function.
- Parameters:
ax (plt.Axes) – Axes to plot on.
t (
ndarray) – Values to evaluate the CDF at. By default, 200 evenly spaced values between 0 and the 99th percentile.show (
bool) – Whether to show the plot.file (
str) – File to save the plot to.clear (
bool) – Whether to clear the plot before plotting.label (
str) – Label for the plot.title (
str) – Title of the plot.
- Return type:
plt.Axes
- Returns:
Axes.
- plot_pdf(ax: plt.Axes = None, t: ndarray = None, show: bool = True, file: str = None, clear: bool = True, label: str = None, title: str = 'Tree height PDF', dx: float = None)#
Plot density function.
- Parameters:
ax (plt.Axes) – The axes to plot on.
t (
ndarray) – Values to evaluate the density function at. By default, 200 evenly spaced values between 0 and the 99th percentile.show (
bool) – Whether to show the plot.file (
str) – File to save the plot to.clear (
bool) – Whether to clear the plot before plotting.label (
str) – Label for the plot.title (
str) – Title of the plot.dx (
float) – Step size for numerical differentiation. By default, the 99th percentile divided by 1e10.
- Return type:
plt.Axes
- Returns:
Axes.
- touch(**kwargs)#
Touch all cached properties.
- Parameters:
kwargs – Additional keyword arguments.
-
lineage_config:
LineageConfig# Population configuration
-
locus_config:
LocusConfig# Locus configuration
-
demography:
Demography# Demography
-
tree_height:
TreeHeightDistribution# Tree height distribution
-
max_epochs:
- class FoldedSFSDistribution(state_space: BlockCountingStateSpace, tree_height: TreeHeightDistribution, demography: Demography, reward: Reward = None)[source]#
Bases:
SFSDistributionFolded site-frequency spectrum distribution.
- __init__(state_space: BlockCountingStateSpace, tree_height: TreeHeightDistribution, demography: Demography, reward: Reward = None)#
Initialize the distribution.
- Parameters:
state_space (
BlockCountingStateSpace) – Block-counting state space.tree_height (
TreeHeightDistribution) – The tree height distribution.demography (
Demography) – The demography.reward (
Reward) – The reward to multiply the SFS reward with. By default, the unit reward is used, which has no effect.
- accumulate(k: int, end_times: Iterable[float], rewards: Sequence[Reward] = None, center: bool = True, permute: bool = True)#
Evaluate the kth (non-central) moments for site-frequency spectrum at different end times.
- Parameters:
k (
int) – The order of the moment.end_times (
Iterable[float]) – Times or time when to evaluate the moment.rewards (
Sequence[Reward]) – Sequence of k rewards. By default, the reward of the underlying distribution.center (
bool) – Whether to center the moment around the mean.permute (
bool) – For cross-moments, whether to average over all permutations of rewards. Default isTrue, which will provide the correct cross-moment. If set toFalse, the cross-moment will be conditioned on the order of rewards.
- Return type:
ndarray- Returns:
Array of moments accumulated at the specified times, one for each site-frequency count.
- property demes: MarginalDemeDistributions#
Marginal distributions over each deme.
- get_accumulation(k: int, i: int, end_times: Iterable[float] | float, rewards: Sequence[SFSReward] = None, center: bool = True, permute: bool = True)#
Get accumulation of moments for the ith site-frequency count.
- Parameters:
k (
int) – The order of the momenti (
int) – The ith site-frequency count.end_times (
Union[Iterable[float],float]) – Times or time when to evaluate the moment.rewards (
Sequence[SFSReward]) – Sequence of k rewards.center (
bool) – Whether to center the moment around the mean.permute (
bool) – For cross-moments, whether to average over all permutations of rewards. Default isTrue, which will provide the correct cross-moment. If set toFalse, the cross-moment will be conditioned on the order of rewards.
- Return type:
ndarray|float- Returns:
The kth SFS (cross)-moment accumulations at the ith site-frequency count
- get_corr(i: int, j: int)#
Get the correlation coefficient between the ith and jth site-frequency.
- Parameters:
i (
int) – The ith frequency countj (
int) – The jth frequency count
- Return type:
float- Returns:
Correlation coefficient
- get_cov(i: int, j: int)#
Get the covariance between the ith and jth site-frequency.
- Parameters:
i (
int) – The ith frequency countj (
int) – The jth frequency count
- Return type:
float- Returns:
covariance
- get_mutation_config(config: Sequence[int], theta: float)#
Get the probabilities of observing the given mutational configurations according to the infinite sites model.
Note
This currently only works for a single epoch, i.e. a time-homogeneous demography, and recombination is not supported.
- Parameters:
config (
Sequence[int]) – The mutational configuration. A sequence of integers of length n - 1 for unfolded configurations and n // 2 for folded configurations, where n is the number of lineages. Each element in the sequence is an integer representing the number of mutations at each frequency count starting from 1. For example, the unfolded configuration [2, 1, 0] represents two singleton, one doubleton and zero tripleton mutations for a sample size of 4 lineages. Similarly, the folded configuration [2, 1] represents two singleton or tripleton and one doubleton mutation for the same number of lineages.theta (
float) – The mutation rate.
- Return type:
float- Returns:
The probability of observing the given mutational configuration.
- get_mutation_configs(theta: float)#
An iterator over the probabilities of observing mutational configurations according to the infinite sites model. The order of the mutational configurations generated ascends in the number of mutations observed. See
get_mutation_config()for more information on mutational configurations.Note
This currently only works for a single epoch, i.e. a time-homogeneous demography, and recombination is not supported. Also note that the number of configurations is infinite, so this iterator will never stop. However, depending on the mutation rate, the probability of observing configurations of higher mutation counts will decrease over time. You can keep track of the generated probability mass by checking the
generated_massattribute, which is reset every time this method is called. A good approach is thus to keep generating configurations until the generated mass is above a certain threshold. More complex demographic models and larger sample sizes increase the number of configurations and higher mutation rates, the number of generated configurations necessary to reach a certain mass.Code example:
coal = pg.Coalescent(n=5) it = coal.sfs.get_mutation_configs(theta=1) # continue until generated mass is above 0.8 samples = list(pg.takewhile_inclusive(lambda _: coal.sfs.generated_mass < 0.8, it))
- Parameters:
theta (
float) – The mutation rate.- Return type:
Iterator[Tuple[Tuple[float,...],float]]- Returns:
An iterator over the probabilities of observing mutational configurations.
- property loci: MarginalLocusDistributions#
Marginal distributions over each locus.
- moment(k: int, rewards: Sequence[SFSReward] = None, start_time: float = None, end_time: float = None, center: bool = True, permute: bool = True)#
Get the kth moments of the site-frequency spectrum.
- Parameters:
k (
int) – The order of the momentrewards (
Sequence[SFSReward]) – Sequence of k rewardsstart_time (
float) – Time when to start accumulation of moments. By default, the start time specified when initializing the distribution.end_time (
float) – Time when to end accumulation of moments. By default, either the end time specified when initializing the distribution or the time until almost sure absorption.center (
bool) – Whether to center the moment around the mean.permute (
bool) – For cross-moments, whether to average over all permutations of rewards. Default isTrue, which will provide the correct cross-moment. If set toFalse, the cross-moment will be conditioned on the order of rewards.
- Return type:
- Returns:
A site-frequency spectrum of kth order moments.
- plot_accumulation(k: int = 1, end_times: Iterable[float] = None, rewards: Sequence[Reward] = None, center: bool = True, permute: bool = True, ax: plt.Axes = None, show: bool = True, file: str = None, clear: bool = True, label: str = None, title: str = None)#
Plot accumulation of (non-central) SFS moments at different times.
Note
This is different from a CDF, as it shows the accumulation of moments rather than the probability of having reached absorption at a certain time.
- Parameters:
k (
int) – The order of the moment.end_times (
Iterable[float]) – Times when to evaluate the moment. By default, 200 evenly spaced values between 0 and the 99th percentile.rewards (
Sequence[Reward]) – Sequence of k rewards. By default, the reward of the underlying distribution.center (
bool) – Whether to center the moment around the mean.permute (
bool) – For cross-moments, whether to average over all permutations of rewards. Default isTrue, which will provide the correct cross-moment. If set toFalse, the cross-moment will be conditioned on the order of rewards.ax (plt.Axes) – The axes to plot on.
show (
bool) – Whether to show the plot.file (
str) – File to save the plot to.clear (
bool) – Whether to clear the plot before plotting.label (
str) – Label for the plot.title (
str) – Title of the plot.
- Return type:
plt.Axes
- Returns:
Axes.
- touch(**kwargs)#
Touch all cached properties.
- Parameters:
kwargs – Additional keyword arguments.
- generated_mass#
Generated probability mass by iterator returned from
get_mutation_configs().
-
lineage_config:
LineageConfig# Population configuration
-
locus_config:
LocusConfig# Locus configuration
-
state_space:
StateSpace# State space
-
demography:
Demography# Demography
-
tree_height:
TreeHeightDistribution# Tree height distribution
- class UnfoldedSFSDistribution(state_space: BlockCountingStateSpace, tree_height: TreeHeightDistribution, demography: Demography, reward: Reward = None)[source]#
Bases:
SFSDistributionUnfolded site-frequency spectrum distribution.
- __init__(state_space: BlockCountingStateSpace, tree_height: TreeHeightDistribution, demography: Demography, reward: Reward = None)#
Initialize the distribution.
- Parameters:
state_space (
BlockCountingStateSpace) – Block-counting state space.tree_height (
TreeHeightDistribution) – The tree height distribution.demography (
Demography) – The demography.reward (
Reward) – The reward to multiply the SFS reward with. By default, the unit reward is used, which has no effect.
- accumulate(k: int, end_times: Iterable[float], rewards: Sequence[Reward] = None, center: bool = True, permute: bool = True)#
Evaluate the kth (non-central) moments for site-frequency spectrum at different end times.
- Parameters:
k (
int) – The order of the moment.end_times (
Iterable[float]) – Times or time when to evaluate the moment.rewards (
Sequence[Reward]) – Sequence of k rewards. By default, the reward of the underlying distribution.center (
bool) – Whether to center the moment around the mean.permute (
bool) – For cross-moments, whether to average over all permutations of rewards. Default isTrue, which will provide the correct cross-moment. If set toFalse, the cross-moment will be conditioned on the order of rewards.
- Return type:
ndarray- Returns:
Array of moments accumulated at the specified times, one for each site-frequency count.
- property demes: MarginalDemeDistributions#
Marginal distributions over each deme.
- get_accumulation(k: int, i: int, end_times: Iterable[float] | float, rewards: Sequence[SFSReward] = None, center: bool = True, permute: bool = True)#
Get accumulation of moments for the ith site-frequency count.
- Parameters:
k (
int) – The order of the momenti (
int) – The ith site-frequency count.end_times (
Union[Iterable[float],float]) – Times or time when to evaluate the moment.rewards (
Sequence[SFSReward]) – Sequence of k rewards.center (
bool) – Whether to center the moment around the mean.permute (
bool) – For cross-moments, whether to average over all permutations of rewards. Default isTrue, which will provide the correct cross-moment. If set toFalse, the cross-moment will be conditioned on the order of rewards.
- Return type:
ndarray|float- Returns:
The kth SFS (cross)-moment accumulations at the ith site-frequency count
- get_corr(i: int, j: int)#
Get the correlation coefficient between the ith and jth site-frequency.
- Parameters:
i (
int) – The ith frequency countj (
int) – The jth frequency count
- Return type:
float- Returns:
Correlation coefficient
- get_cov(i: int, j: int)#
Get the covariance between the ith and jth site-frequency.
- Parameters:
i (
int) – The ith frequency countj (
int) – The jth frequency count
- Return type:
float- Returns:
covariance
- get_mutation_config(config: Sequence[int], theta: float)#
Get the probabilities of observing the given mutational configurations according to the infinite sites model.
Note
This currently only works for a single epoch, i.e. a time-homogeneous demography, and recombination is not supported.
- Parameters:
config (
Sequence[int]) – The mutational configuration. A sequence of integers of length n - 1 for unfolded configurations and n // 2 for folded configurations, where n is the number of lineages. Each element in the sequence is an integer representing the number of mutations at each frequency count starting from 1. For example, the unfolded configuration [2, 1, 0] represents two singleton, one doubleton and zero tripleton mutations for a sample size of 4 lineages. Similarly, the folded configuration [2, 1] represents two singleton or tripleton and one doubleton mutation for the same number of lineages.theta (
float) – The mutation rate.
- Return type:
float- Returns:
The probability of observing the given mutational configuration.
- get_mutation_configs(theta: float)#
An iterator over the probabilities of observing mutational configurations according to the infinite sites model. The order of the mutational configurations generated ascends in the number of mutations observed. See
get_mutation_config()for more information on mutational configurations.Note
This currently only works for a single epoch, i.e. a time-homogeneous demography, and recombination is not supported. Also note that the number of configurations is infinite, so this iterator will never stop. However, depending on the mutation rate, the probability of observing configurations of higher mutation counts will decrease over time. You can keep track of the generated probability mass by checking the
generated_massattribute, which is reset every time this method is called. A good approach is thus to keep generating configurations until the generated mass is above a certain threshold. More complex demographic models and larger sample sizes increase the number of configurations and higher mutation rates, the number of generated configurations necessary to reach a certain mass.Code example:
coal = pg.Coalescent(n=5) it = coal.sfs.get_mutation_configs(theta=1) # continue until generated mass is above 0.8 samples = list(pg.takewhile_inclusive(lambda _: coal.sfs.generated_mass < 0.8, it))
- Parameters:
theta (
float) – The mutation rate.- Return type:
Iterator[Tuple[Tuple[float,...],float]]- Returns:
An iterator over the probabilities of observing mutational configurations.
- property loci: MarginalLocusDistributions#
Marginal distributions over each locus.
- moment(k: int, rewards: Sequence[SFSReward] = None, start_time: float = None, end_time: float = None, center: bool = True, permute: bool = True)#
Get the kth moments of the site-frequency spectrum.
- Parameters:
k (
int) – The order of the momentrewards (
Sequence[SFSReward]) – Sequence of k rewardsstart_time (
float) – Time when to start accumulation of moments. By default, the start time specified when initializing the distribution.end_time (
float) – Time when to end accumulation of moments. By default, either the end time specified when initializing the distribution or the time until almost sure absorption.center (
bool) – Whether to center the moment around the mean.permute (
bool) – For cross-moments, whether to average over all permutations of rewards. Default isTrue, which will provide the correct cross-moment. If set toFalse, the cross-moment will be conditioned on the order of rewards.
- Return type:
- Returns:
A site-frequency spectrum of kth order moments.
- plot_accumulation(k: int = 1, end_times: Iterable[float] = None, rewards: Sequence[Reward] = None, center: bool = True, permute: bool = True, ax: plt.Axes = None, show: bool = True, file: str = None, clear: bool = True, label: str = None, title: str = None)#
Plot accumulation of (non-central) SFS moments at different times.
Note
This is different from a CDF, as it shows the accumulation of moments rather than the probability of having reached absorption at a certain time.
- Parameters:
k (
int) – The order of the moment.end_times (
Iterable[float]) – Times when to evaluate the moment. By default, 200 evenly spaced values between 0 and the 99th percentile.rewards (
Sequence[Reward]) – Sequence of k rewards. By default, the reward of the underlying distribution.center (
bool) – Whether to center the moment around the mean.permute (
bool) – For cross-moments, whether to average over all permutations of rewards. Default isTrue, which will provide the correct cross-moment. If set toFalse, the cross-moment will be conditioned on the order of rewards.ax (plt.Axes) – The axes to plot on.
show (
bool) – Whether to show the plot.file (
str) – File to save the plot to.clear (
bool) – Whether to clear the plot before plotting.label (
str) – Label for the plot.title (
str) – Title of the plot.
- Return type:
plt.Axes
- Returns:
Axes.
- touch(**kwargs)#
Touch all cached properties.
- Parameters:
kwargs – Additional keyword arguments.
- generated_mass#
Generated probability mass by iterator returned from
get_mutation_configs().
-
lineage_config:
LineageConfig# Population configuration
-
locus_config:
LocusConfig# Locus configuration
-
state_space:
StateSpace# State space
-
demography:
Demography# Demography
-
tree_height:
TreeHeightDistribution# Tree height distribution
- class MarginalLocusDistributions(dist: PhaseTypeDistribution)[source]#
Bases:
MarginalDistributionsMarginal locus distributions.
- __init__(dist: PhaseTypeDistribution)[source]#
Initialize the distributions.
- Parameters:
dist (
PhaseTypeDistribution) – The distribution.
- property loci: MarginalLocusDistributions#
Distributions marginalized over loci.
- get_cov(locus1: int, locus2: int)[source]#
Get the covariance between two loci.
- Parameters:
locus1 (
int) – The first locus.locus2 (
int) – The second locus.
- Return type:
float- Returns:
The covariance.
- property cov: ndarray#
Covariance matrix across loci.
- get_corr(locus1: int, locus2: int)[source]#
Get the correlation coefficient between two loci.
- Parameters:
locus1 (
int) – The first locus.locus2 (
int) – The second locus.
- Return type:
float- Returns:
The correlation coefficient.
- property corr: ndarray#
Correlation matrix across loci.
- get(k[, d]) D[k] if k in D, else d. d defaults to None.#
- items() a set-like object providing a view on D's items#
- keys() a set-like object providing a view on D's keys#
- values() an object providing a view on D's values#
- class MarginalDemeDistributions(dist: PhaseTypeDistribution)[source]#
Bases:
MarginalDistributionsMarginal deme distributions.
- __init__(dist: PhaseTypeDistribution)[source]#
Initialize the distributions.
- Parameters:
dist (
PhaseTypeDistribution) – The distribution.
- property demes: MarginalDemeDistributions#
Distributions marginalized over demes.
- get_cov(pop1: str, pop2: str)[source]#
Get the covariance between two demes.
- Parameters:
pop1 (
str) – The first deme.pop2 (
str) – The second deme.
- Return type:
float- Returns:
The covariance.
- property cov: ndarray#
Covariance matrix across demes.
- get_corr(pop1: str, pop2: str)[source]#
Get the correlation coefficient between two demes.
- Parameters:
pop1 (
str) – The first deme.pop2 (
str) – The second deme.
- Return type:
float- Returns:
The correlation coefficient.
- property corr: ndarray#
Correlation matrix across demes.
- get(k[, d]) D[k] if k in D, else d. d defaults to None.#
- items() a set-like object providing a view on D's items#
- keys() a set-like object providing a view on D's keys#
- values() an object providing a view on D's values#