Reference

Contents

Index

IdentityByDescentDispersal.composite_loglikelihood_constant_densityFunction
composite_loglikelihood_constant_density(D::Real, sigma::Real, df::DataFrame, contig_lengths::AbstractArray{<:Real}, chromosomal_edges::Bool=true, diploid::Bool=true) -> Real

Computes the composite log-likelihood of the observed IBD blocks under a model with constant population density.

  • D: Effective population density (diploid individuals per unit area).
  • sigma: Root mean square dispersal distance per generation.
  • df: DataFrame containing the observed IBD blocks in the format returned by preprocess_dataset.
  • contig_lengths: Array of contig lengths in Morgans.

Optionally:

  • chromosomal_edges: Whether to account for chromosomal edge effects.
  • diploid: Whether to account for diploidy.
source
IdentityByDescentDispersal.composite_loglikelihood_customFunction
composite_loglikelihood_custom(De::Function, parameters::AbstractArray, sigma::Real, df::DataFrame, contig_lengths::AbstractArray{<:Real}, chromosomal_edges::Bool=true, diploid::Bool=true) -> Real

Computes the composite log-likelihood of the observed IBD blocks under a model with constant population density.

  • De is a user-defined function that takes time t and parameters and returns the effective population density at time t.
  • parameters is a user-defined array of parameters that the function De depends on.
  • sigma: Root mean square dispersal distance per generation.
  • df: DataFrame containing the observed IBD blocks in the format returned by preprocess_dataset.
  • contig_lengths: Array of contig lengths in Morgans.

Optionally:

  • chromosomal_edges: Whether to account for chromosomal edge effects.
  • diploid: Whether to account for diploidy.
source
IdentityByDescentDispersal.composite_loglikelihood_power_densityFunction
composite_loglikelihood_power_density(D::Real, beta::Real, sigma::Real, df::DataFrame, contig_lengths::AbstractArray{<:Real}, chromosomal_edges::Bool=true, diploid::Bool=true) -> Real

Computes the composite log-likelihood of the observed IBD blocks under a model with constant population density.

  • D: Effective population density (diploid individuals per unit area).
  • beta is the power of the density function,
  • sigma: Root mean square dispersal distance per generation.
  • df: DataFrame containing the observed IBD blocks in the format returned by preprocess_dataset.
  • contig_lengths: Array of contig lengths in Morgans.

Optionally:

  • chromosomal_edges: Whether to account for chromosomal edge effects.
  • diploid: Whether to account for diploidy.
source
IdentityByDescentDispersal.default_ibd_binsMethod
default_ibd_bins()

Returns the default bins used in Ringbauer et. al. for identity-by-descent (IBD) block analysis, as well as the minimum length of IBD blocks to consider in Morgans.

Ringbauer, H., Coop, G. & Barton, N.H. Genetics 205, 1335–1351 (2017).

source
IdentityByDescentDispersal.expected_ibd_blocks_constant_densityFunction
expected_ibd_blocks_constant_density(r::Real, D::Real, sigma::Real, L::Real, G::Real, chromosomal_edges::Bool=true, diploid::Bool=true) -> Real

Computes the expected density of identity-by-descent (IBD) blocks of length L for a model with constant population density. This function returns the expected number of IBD blocks per pair of individuals and per unit of block length.

\[\mathbb{E}[N_L] = \int_0^\infty \mathbb{E}[N_L^t] \,dt = \frac{G}{8\pi D \sigma^2} \left(\frac{r}{\sqrt{L} \sigma}\right)^2 K_2\left(\frac{\sqrt{2L} \, r}{\sigma}\right)\]

where:

  • r is the geographic distance between samples,
  • D is the effective population density (diploid individuals per unit area),
  • sigma is the root mean square dispersal distance per generation,
  • L is the length of the IBD block (in Morgans),
  • G is the total map length of the genome (in Morgans),
  • K₂ is the modified Bessel function of the second kind of order 2.

If chromosomal_edges is true (the default), we account for chromosomal edge effects. If diploid is true, we multiply by a factor of 4 to account for the fact that each individual has two copies of each chromosome. For more details, see Appendix B.

There is a function overload that accepts a vector of G values and returns the aggregated expected number of IBD blocks.

Reference: Ringbauer, H., Coop, G., & Barton, N. H. (2017). Genetics, 205(3), 1335–1351.

source
IdentityByDescentDispersal.expected_ibd_blocks_customFunction
expected_ibd_blocks_custom(r::Real, De::Function, parameters::AbstractArray, sigma::Real, L::Real, G::Real, chromosomal_edges::Bool = true, diploid::Bool = true)

Computes the expected density of identity-by-descent (IBD) blocks of length L for a model where the effective population density is given by a custom function De(t, parameters). This function returns the expected number of IBD blocks per pair of individuals and per unit of block length.

\[\mathbb{E}[N_L] = \int_0^\infty \mathbb{E}[N_L^t] \,dt = \int_0^\infty G 4t^2 \exp(-2Lt) \cdot \Phi(t) \,dt\]

The integral is computed numerically using Gaussian-Legendre quadrature rules with the QuadGK package.

where:

  • r is the geographic distance between samples,
  • De is a user-defined function that takes time t and a parameters and returns the effective population density at time t.
  • parameters is a user-defined array of parameters that the function De depends on.
  • sigma is the root mean square dispersal distance per generation,
  • L is the length of the IBD block (in Morgans),
  • G is the total map length of the genome (in Morgans),
  • K₂ is the modified Bessel function of the second kind of order 2.

If chromosomal_edges is true (the default), we account for chromosomal edge effects. If diploid is true, we multiply by a factor of 4 to account for the fact that each individual has two copies of each chromosome. For more details, see Appendix B.

Reference: Ringbauer, H., Coop, G., & Barton, N. H. (2017). Genetics, 205(3), 1335–1351.

source
IdentityByDescentDispersal.expected_ibd_blocks_power_densityFunction
expected_ibd_blocks_power_density(r::Real, D::Real, beta::Real, sigma::Real, L::Real, G::Real, chromosomal_edges::Bool = true, diploid::Bool = true)

Computes the expected density of identity-by-descent (IBD) blocks of length L for a model with a power population density in the form of $D(t) = D_0t^{-eta}$ This function returns the expected number of IBD blocks per pair of individuals and per unit of block length.

\[\mathbb{E}[N_L] = \int_0^\infty \mathbb{E}[N_L^t] \,dt = 2^{\frac{-3\beta}{2}-3} \frac{G}{\pi D \sigma^2} \left(\frac{r}{\sqrt{L} \sigma}\right)^{(2+\beta)} K_{2+\beta}\left(\frac{\sqrt{2L} \, r}{\sigma}\right)\]

where:

  • r is the geographic distance between samples,
  • D is the effective population density (diploid individuals per unit area),
  • beta is the power of the density function,
  • sigma is the root mean square dispersal distance per generation,
  • L is the length of the IBD block (in Morgans),
  • G is the total map length of the genome (in Morgans),
  • K₂ is the modified Bessel function of the second kind of order 2.

If chromosomal_edges is true (the default), we account for chromosomal edge effects. If diploid is true, we multiply by a factor of 4 to account for the fact that each individual has two copies of each chromosome. For more details, see Appendix B.

Reference: Ringbauer, H., Coop, G., & Barton, N. H. (2017). Genetics, 205(3), 1335–1351.

source
IdentityByDescentDispersal.preprocess_datasetMethod
preprocess_dataset(ibd_blocks::DataFrame, dist_matrix::AbstractMatrix, bins::AbstractVector, min_length::Real)

Preprocesses the input data for identity-by-descent (IBD) block analysis.

  • ibd_blocks: DataFrame containing IBD blocks with columns ID1, ID2, and span. The ID1 and ID2 columns should contain the IDs of the individuals involved in the IBD block, and the span column should contain the length of the IBD block in Morgans.
  • individual_distances: DataFrame containing distances between individuals with columns ID1, ID2, and distance. The ID1 and ID2 columns should contain the IDs of the individuals involved in the distance. Notice that the units of the estimated density and dispersal rate will match the units of the distances provided.
  • bins: A vector of right bins for the IBD blocks.
  • min_length: Minimum length of IBD blocks to consider in Morgans.

It returns a DataFrame in "long" format with the following columns:

  • DISTANCE: The pairwise distance between individuals.
  • IBD_LEFT: The left bound of the IBD length bin.
  • IBD_RIGHT: The right bound of the IBD length bin.
  • IBD_MID: The center of the IBD length bin.
  • NR_PAIRS: The number of unique individual pairs within that distance.
  • COUNT: The number of IBD blocks observed in the corresponding bin.
  • DISTANCE_INDEX: The index of the distance bin.
  • IBD_INDEX: The index of the IBD length bin.
source
IdentityByDescentDispersal.probability_coalescenceMethod
probability_coalescence(t::Real, r::Real, De::Function, sigma::Real)

Computes the probability $\phi(t)$ that two homologous loci coalesce $t$ generations ago according to

\[\phi(t) = \frac{1}{2D_e(t)} \frac{1}{4\pi t \sigma^2} \exp(\frac{-r^2}{4 t \sigma^2})\]

Ringbauer, H., Coop, G. & Barton, N.H. Genetics 205, 1335–1351 (2017).

source