vectorose.stats#

Statistical analyses.

Statistical tests, analyses and routines for analysing the directional data used to construct the VectoRose plots.

Warning

For many of the statistical operations defined here, a set of unit vectors is required in order for interpretation to be possible. To produce a set of unit vectors, the function util.normalise_vectors() can be called.

The vectors passed to these functions should not contain spatial location coordinates.
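As a rough illustration of the unit-vector requirement above, the following NumPy sketch scales each row of an (n, d) array to unit length. The helper `normalise_to_unit_length` is hypothetical and is not the library's util.normalise_vectors(); in particular, dropping zero-length vectors is an assumption made for this example.

```python
import numpy as np


def normalise_to_unit_length(vectors: np.ndarray) -> np.ndarray:
    """Scale each row of an (n, d) array to unit length (hypothetical helper)."""
    norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
    # Assumption: zero-length vectors have no direction, so they are dropped.
    keep = norms[:, 0] > 0
    return vectors[keep] / norms[keep]


vectors = np.array([[3.0, 0.0, 4.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0]])
unit_vectors = normalise_to_unit_length(vectors)
```

Here the third (zero) vector is removed and the remaining two rows each have length one.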

Notes

These statistical tests are largely derived from the work by Fisher, Lewis and Embleton. [1]

References

Classes#

`HypothesisResult`	Results for hypothesis testing.
`OrientationMatrixParameters`	Orientation matrix parameters.
`FisherVonMisesParameters`	Parameters for a Fisher-von Mises distribution.

Functions#

`compute_resultant_vector`(→ numpy.ndarray)	Compute the resultant vector for a set of orientations.
`compute_orientation_matrix`(→ numpy.ndarray)	Compute the orientation matrix for a set of vectors.
`compute_orientation_matrix_eigs`(→ NamedTuple)	Compute the eigenvectors and eigenvalues of the orientation matrix.
`compute_orientation_matrix_parameters`(...)	Compute Woodcock's orientation matrix parameters.
`uniform_vs_unimodal_test`(→ HypothesisResult)	Uniformity vs. unimodality test.
`_compute_sum_of_arc_lengths`(→ float)	Compute the sum of arc lengths from vectors to a specified vector.
`compute_median_direction`(→ numpy.ndarray)	Compute the median direction for a unimodal distribution.
`compute_elliptical_confidence_cone_points`(...)	Compute ellipse points on the surface of a sphere.
`compute_confidence_cone_for_median`(→ numpy.ndarray)	Compute elliptical confidence cone for the median orientation.
`_kappa_equation`(→ float)	Equation which is satisfied by the concentration parameter.
`compute_mean_unit_direction`(→ numpy.ndarray)	Compute the mean direction as a unit vector.
`estimate_concentration_parameter`(→ float)	Estimate the concentration parameter.
`compute_confidence_cone_radius`(→ float)	Compute confidence cone radius for mean direction estimate.
`fit_fisher_vonmises_distribution`(...)	Fit a Fisher-von Mises spherical distribution to a vector field.
`compute_magnitude_orientation_correlation`(...)	Compute the correlation between the magnitude and orientation.

Module Contents#

class vectorose.stats.HypothesisResult[source]#

Results for hypothesis testing.

can_reject_null_hypothesis: bool#: Indicate whether the null hypothesis can be rejected.

p_value: float#: Computed p-value for the test.

test_significance: float#: Significance level used for the test.

vectorose.stats.compute_resultant_vector(vector_field: numpy.ndarray, compute_mean_resultant: bool = True) → numpy.ndarray[source]#

Compute the resultant vector for a set of orientations.

Compute the resultant vector from a set of orientations or directions. This vector is computed as the sum of all constituent vectors.

Parameters:

vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.
compute_mean_resultant – Indicate whether the mean resultant should be returned instead of the non-normalised resultant vector.

Returns:

numpy.ndarray – Array of shape (d,) containing the resultant vector. If compute_mean_resultant is True, then this is the mean resultant vector.

Notes

This implementation is based on the description in chapter 3 of Fisher, Lewis and Embleton’s book [1] on statistics on the sphere.
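The quantities described above can be sketched with plain NumPy. This is an illustration of the definitions (sum of vectors, and that sum divided by the number of vectors), not the library's implementation; the variable names are chosen for this example only.

```python
import numpy as np

# Four unit vectors in an (n, d) array.
unit_vectors = np.array(
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
)

# Flatten any leading spatial axes so that components lie along the last axis.
flat = unit_vectors.reshape(-1, unit_vectors.shape[-1])

resultant = flat.sum(axis=0)                    # non-normalised resultant vector
mean_resultant = resultant / len(flat)          # mean resultant vector
resultant_length = np.linalg.norm(resultant)    # resultant length R
mean_direction = resultant / resultant_length   # mean direction as a unit vector
```

For unit vectors, the mean resultant length R / n lies between 0 and 1, with values near 1 indicating tightly clustered directions.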

vectorose.stats.compute_orientation_matrix(vectors: numpy.ndarray) → numpy.ndarray[source]#

Compute the orientation matrix for a set of vectors.

Compute the orientation matrix for a set of vectors, as described in Fisher, Lewis and Embleton. [1] In this d × d matrix, entry (i, j) is the sum, over all vectors, of the product of components i and j.

Parameters:

vectors – Array of shape (n, d) where n is the number of vectors and d is the number of dimensions.

Returns:

numpy.ndarray – Array of shape (d, d) corresponding to the orientation matrix.
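A minimal NumPy sketch of the sum of component products: for an (n, d) array, the transposed array multiplied by the array itself gives the d × d matrix. Whether the library additionally divides by n is not stated here, so this sketch leaves the matrix unnormalised.

```python
import numpy as np

vectors = np.array(
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
)

# Entry (i, j) is the sum over all vectors of component i times component j.
orientation_matrix = vectors.T @ vectors
```

The result is symmetric, and for unit vectors its trace equals the number of vectors.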

vectorose.stats.compute_orientation_matrix_eigs(vector_field: numpy.ndarray) → NamedTuple[source]#

Compute the eigenvectors and eigenvalues of the orientation matrix.

Compute the eigen-decomposition of the orientation matrix. This function computes the matrix and then performs eigenvector calculation.

Parameters:

vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.

Returns:

eigenvectors (numpy.ndarray) – Eigenvectors of the orientation matrix.
eigenvalues (numpy.ndarray) – Eigenvalues of the orientation matrix.

Notes

Equivalent to calling compute_orientation_matrix() and then using NumPy to compute the eigenvectors and eigenvalues.
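The two-step process in the note above might look like the following sketch. It uses `numpy.linalg.eigh`, which suits symmetric matrices and returns eigenvalues in ascending order; the ordering and the exact NumPy routine used by the library are assumptions here.

```python
import numpy as np

vectors = np.array(
    [[1.0, 0.2, 0.0], [0.9, -0.1, 0.1], [0.8, 0.1, -0.2], [1.0, 0.0, 0.3]]
)
vectors = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)

# Step 1: orientation matrix as the sum of component products.
orientation_matrix = vectors.T @ vectors

# Step 2: eigen-decomposition of the symmetric matrix.
# Column k of `eigenvectors` corresponds to eigenvalues[k].
eigenvalues, eigenvectors = np.linalg.eigh(orientation_matrix)
```

Because the vectors are clustered around the first axis, the largest eigenvalue dominates and its eigenvector is close to that axis.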

class vectorose.stats.OrientationMatrixParameters[source]#

Bases: NamedTuple

Orientation matrix parameters.

These parameters were first described by Woodcock. [2]

shape_parameter: float#: Shape parameter, also known as gamma.

strength_parameter: float#: Strength parameter, also known as zeta.

vectorose.stats.compute_orientation_matrix_parameters(eigs: numpy.ndarray) → OrientationMatrixParameters[source]#

Compute Woodcock’s orientation matrix parameters.

Compute the shape and strength parameters based on the orientation matrix, using the process first described by Woodcock [2] and using the notation presented by Fisher, Lewis and Embleton. [1]

Parameters:

eigs – The eigenvalues of the orientation matrix.

Returns:

OrientationMatrixParameters – The distribution parameters computed from the orientation matrix eigenvalues.

Notes

See section 3.4 of Fisher, Lewis and Embleton [1] for computational and notational details. For the original description, see Woodcock. [2]
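As a sketch of the parameters, assuming the standard definitions with eigenvalues ordered τ1 ≤ τ2 ≤ τ3: the shape parameter is γ = ln(τ3/τ2) / ln(τ2/τ1) and the strength parameter is ζ = ln(τ3/τ1). The function name `woodcock_parameters` is hypothetical and the library's handling of edge cases is not described here.

```python
import math


def woodcock_parameters(eigenvalues):
    """Return (shape, strength) from three positive eigenvalues (hypothetical helper)."""
    t1, t2, t3 = sorted(eigenvalues)
    # Undefined when t1 == t2 (division by zero) or when any eigenvalue is zero.
    shape = math.log(t3 / t2) / math.log(t2 / t1)
    strength = math.log(t3 / t1)
    return shape, strength


shape, strength = woodcock_parameters([1.0, 2.0, 4.0])
```

Both parameters depend only on eigenvalue ratios, so the result is the same whether or not the eigenvalues are divided by the number of vectors.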

vectorose.stats.uniform_vs_unimodal_test(vector_field: numpy.ndarray, significance_level: float = 0.05) → HypothesisResult[source]#

Uniformity vs. unimodality test.

Apply a test to determine if a distribution is uniform or unimodal, as described in section 5.3.1(i) of Fisher, Lewis and Embleton. [1]

Parameters:

vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.
significance_level – Type I error value for the statistical test, default 0.05.

Returns:

HypothesisResult – Results of the hypothesis testing, indicating whether the null hypothesis of uniformity can be rejected, as well as the computed p-value.

Notes

In this function, the null hypothesis considers the orientations to be uniformly distributed on the surface of a sphere. The alternative hypothesis states that the data are not uniform, and are instead unimodal.

This implementation assumes the large sample size scenario. The resultant length \(R\) is computed, and then the test statistic \(3R^2 / n\) is calculated and compared with the critical value of a chi-squared distribution with 3 degrees of freedom at the chosen significance level. If the test statistic is greater than this critical value, then we can reject the null hypothesis in favour of the alternative hypothesis.

References

See [1], section 5.3.1(i).
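The large-sample procedure in the notes above can be sketched as follows. To avoid extra dependencies, this example uses the closed-form survival function of a chi-squared variable with 3 degrees of freedom; the helper names are hypothetical, and how the library evaluates the chi-squared distribution is not stated here.

```python
import math

import numpy as np


def chi2_sf_3dof(x: float) -> float:
    """Upper-tail probability of a chi-squared variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


def uniformity_test_sketch(unit_vectors: np.ndarray, significance_level: float = 0.05):
    """Return (statistic, p_value, can_reject) for the 3R^2 / n test."""
    n = len(unit_vectors)
    resultant_length = np.linalg.norm(unit_vectors.sum(axis=0))
    statistic = float(3 * resultant_length**2 / n)
    p_value = chi2_sf_3dof(statistic)
    return statistic, p_value, p_value < significance_level


# Perfectly balanced directions: the resultant vanishes.
balanced = np.array(
    [[1.0, 0, 0], [-1.0, 0, 0], [0, 1.0, 0], [0, -1.0, 0], [0, 0, 1.0], [0, 0, -1.0]]
)
# Twenty identical directions: an extreme unimodal sample.
clustered = np.tile([0.0, 0.0, 1.0], (20, 1))

stat_b, p_b, reject_b = uniformity_test_sketch(balanced)
stat_c, p_c, reject_c = uniformity_test_sketch(clustered)
```

Comparing the p-value with the significance level is equivalent to comparing the statistic with the chi-squared critical value. The balanced sample gives a statistic of zero, so uniformity is not rejected, while the clustered sample gives a statistic of 60 and a very small p-value.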

vectorose.stats._compute_sum_of_arc_lengths(new_vector: numpy.ndarray, vectors: numpy.ndarray) → float[source]#

Compute the sum of arc lengths from vectors to a specified vector.

See section 5.3.1(ii) in [1]. This function is used to estimate the spherical median of a sample of vectors.

Parameters:

new_vector – The vector under consideration.
vectors – Cartesian components of a set of vectors.

Returns:

float – The sum of arc lengths from all vectors to the specified vector.
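A minimal sketch of this quantity, assuming unit vectors: the arc length between two unit vectors is the arccosine of their dot product, and the spherical median is the direction that minimises the sum of these arc lengths. The function name is illustrative, and the minimisation step itself is omitted.

```python
import numpy as np


def sum_of_arc_lengths(candidate: np.ndarray, unit_vectors: np.ndarray) -> float:
    """Sum of great-circle distances from each unit vector to `candidate`."""
    # Clip to guard against dot products slightly outside [-1, 1] due to rounding.
    cosines = np.clip(unit_vectors @ candidate, -1.0, 1.0)
    return float(np.arccos(cosines).sum())


unit_vectors = np.eye(3)  # the three coordinate axes
on_axis = sum_of_arc_lengths(np.array([1.0, 0.0, 0.0]), unit_vectors)
central = sum_of_arc_lengths(np.ones(3) / np.sqrt(3.0), unit_vectors)
```

The central candidate yields a smaller sum than the on-axis candidate, so it is the better estimate of the median direction for this sample.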

vectorose.stats.compute_median_direction(vector_field: numpy.ndarray) → numpy.ndarray[source]#

Compute the median direction for a unimodal distribution.

Using the method described by Fisher, Lewis and Embleton [1], compute the median direction for a unimodal directional distribution.

Parameters:

vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.

Returns:

numpy.ndarray – Cartesian coordinates for the estimate of the median direction.

Warning

The vectors must be normalised to unit length before computing these statistics.

vectorose.stats.compute_elliptical_confidence_cone_points(w_matrix: numpy.ndarray, constant: float, h_matrix: numpy.ndarray = np.eye(3), number_of_points: int = 36) → numpy.ndarray[source]#

Compute ellipse points on the surface of a sphere.

Following the procedure described in section 3.2.5 of Fisher, Lewis and Embleton, [1] compute the coordinates of an ellipse on the surface of a unit sphere.

Parameters:

w_matrix – Matrix whose eigenvectors and eigenvalues define the ellipse.
constant – Value on the right hand side of the ellipse equation.
h_matrix – Frame matrix used to compute the ellipse, by default the identity matrix.
number_of_points – Number of points to compute.

Returns:

numpy.ndarray – Array containing number_of_points + 1 rows of 3D cartesian points which lie on the computed ellipse. The extra point is added to ensure that the ellipse is complete.

vectorose.stats.compute_confidence_cone_for_median(vector_field: numpy.ndarray, median_direction: numpy.ndarray | None = None, significance_level: float = 0.05, number_of_points: int = 36) → numpy.ndarray[source]#

Compute elliptical confidence cone for the median orientation.

Compute the matrix defining the elliptical confidence cone, as described by Fisher, Lewis and Embleton. [1]

Parameters:

vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.
median_direction – The median direction vector in a NumPy array. If None, then the median direction is computed in this function.
significance_level – The acceptable type I error used to define the confidence cone. Based on repeated sampling, the true population median should fall within the computed ellipse (1 - significance_level) * 100 percent of the time.
number_of_points – Number of ellipse points to compute for the confidence cone.

Returns:

numpy.ndarray – Points on the elliptical confidence cone.

Warning

Requires a large sample size.

vectorose.stats._kappa_equation(k: float, mean_resultant_length: float) → float[source]#

Equation which is satisfied by the concentration parameter.

See Fisher, Lewis and Embleton, [1] section 5.3.2(iv).

Parameters:

k – The concentration parameter of the Fisher-von Mises distribution.
mean_resultant_length – Mean resultant length of the sampled vectors.

Returns:

float – Value of the equation at k. This value is zero when k is the maximum likelihood estimate of the concentration parameter.

vectorose.stats.compute_mean_unit_direction(vector_field: numpy.ndarray, mean_resultant_vector: numpy.ndarray | None = None) → numpy.ndarray[source]#

Compute the mean direction as a unit vector.

Parameters:

vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.
mean_resultant_vector – Optional mean resultant vector. If provided, this vector is normalised. Otherwise, this vector is computed.

Returns:

numpy.ndarray – Unit vector containing the cartesian coordinates of the mean direction.

Warning

Per Fisher, Lewis and Embleton, [1] the sample mean direction corresponds to the maximum likelihood estimate of the mean direction for a Fisher-von Mises distribution. This may not hold for other distributions.

vectorose.stats.estimate_concentration_parameter(vector_field: numpy.ndarray, mean_resultant_vector: numpy.ndarray | None = None, initial_guess: float = 0.5) → float[source]#

Estimate the concentration parameter.

Using the maximum likelihood estimator presented in section 5.3.2(iv) of Fisher, Lewis and Embleton, [1] estimate the concentration parameter of the provided vector field, assuming that the orientations follow a Fisher-von Mises distribution.

Parameters:

vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.
mean_resultant_vector – The mean resultant vector, in cartesian coordinates. If not provided, it will be computed in this function.
initial_guess – Initial guess for the concentration parameter.

Returns:

float – The maximum likelihood estimate of the concentration parameter.

Warning

The orientations provided are assumed to be distributed following a Fisher-von Mises distribution. The result is meaningless if the data are obtained from a different underlying distribution.

This estimator is biased. See Fisher, Lewis and Embleton for alternative unbiased estimators.

Notes

As described by Fisher, Lewis and Embleton, [1] the maximum likelihood estimate of the Fisher-von Mises concentration parameter \(\kappa\) is obtained by solving:

\[\coth(\kappa) - 1/\kappa = R / n\]

where \(\coth\) is the hyperbolic cotangent, \(R\) is the resultant length and \(n\) is the number of vectors.
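To illustrate how the equation above can be solved numerically, the following sketch uses bisection on the residual \(\coth(\kappa) - 1/\kappa - R/n\). Bisection is swapped in here to keep the example dependency-free; the `initial_guess` parameter suggests the library uses a different root-finding approach, and the function names and bracket limits below are assumptions.

```python
import math


def kappa_residual(kappa: float, mean_resultant_length: float) -> float:
    """Left side minus right side of coth(kappa) - 1/kappa = R / n."""
    return 1.0 / math.tanh(kappa) - 1.0 / kappa - mean_resultant_length


def solve_kappa(mean_resultant_length: float, lower: float = 1e-6,
                upper: float = 500.0, tolerance: float = 1e-10) -> float:
    """Bisection solve, valid while the root lies between `lower` and `upper`."""
    # coth(kappa) - 1/kappa increases with kappa, so the residual changes sign once.
    for _ in range(200):
        middle = 0.5 * (lower + upper)
        if kappa_residual(middle, mean_resultant_length) > 0:
            upper = middle
        else:
            lower = middle
        if upper - lower < tolerance:
            break
    return 0.5 * (lower + upper)


# Round trip: build R / n from a known kappa, then recover that kappa.
true_kappa = 5.0
mean_resultant_length = 1.0 / math.tanh(true_kappa) - 1.0 / true_kappa
kappa_estimate = solve_kappa(mean_resultant_length)
```

A mean resultant length close to 1 corresponds to a large concentration parameter, and a value close to 0 corresponds to a nearly uniform sample.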

vectorose.stats.compute_confidence_cone_radius(vector_field: numpy.ndarray, kappa_estimate: float | None = None, confidence_level: float = 0.01, use_degrees: bool = False) → float[source]#

Compute confidence cone radius for mean direction estimate.

Determine the confidence cone radius around the estimated mean direction for a specified significance level for a Fisher-von Mises distribution. See the description in section 5.3.2(iv) of Fisher, Lewis and Embleton. [1]

Parameters:

vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.
kappa_estimate – Optional estimate of kappa. If not provided, then an estimate is computed to determine which estimation approach to use.
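confidence_level – Significance level of the confidence cone for the mean direction estimate, so that the cone has confidence (1 - confidence_level) * 100 percent. Despite the parameter name, the default of 0.01 and the description above indicate a significance level.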
use_degrees – Indicate whether the angular radius should be converted to degrees.

Returns:

float – Angular radius of the confidence cone for the specified significance level, measured as an arc length along the unit sphere. The value is in radians unless use_degrees is True.

Warning

This function is only valid on orientations obtained by processes with an underlying Fisher-von Mises distribution. The results cannot be interpreted for data generated by other processes.
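As one possible illustration, the sketch below uses the classical Fisher expression for the semi-angle \(q\) of the confidence cone, \(\cos q = 1 - \frac{n - R}{R}\left[(1/\alpha)^{1/(n-1)} - 1\right]\). The description above indicates that the library selects between estimation approaches based on kappa; which approaches it uses is not stated here, so this formula and the function name are assumptions.

```python
import math


def fisher_confidence_cone_radius(n: int, resultant_length: float,
                                  alpha: float = 0.05,
                                  use_degrees: bool = False) -> float:
    """Semi-angle of the (1 - alpha) confidence cone around the mean direction."""
    cos_radius = 1.0 - ((n - resultant_length) / resultant_length) * (
        (1.0 / alpha) ** (1.0 / (n - 1)) - 1.0
    )
    # Clamp before arccos to stay inside the valid domain.
    radius = math.acos(max(-1.0, min(1.0, cos_radius)))
    return math.degrees(radius) if use_degrees else radius


radius_95 = fisher_confidence_cone_radius(n=10, resultant_length=9.5, alpha=0.05)
radius_99 = fisher_confidence_cone_radius(n=10, resultant_length=9.5, alpha=0.01)
radius_95_deg = fisher_confidence_cone_radius(10, 9.5, alpha=0.05, use_degrees=True)
```

A smaller significance level gives a wider cone, and the cone narrows as the resultant length approaches the number of vectors.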

class vectorose.stats.FisherVonMisesParameters[source]#

Parameters for a Fisher-von Mises distribution.

mu: numpy.ndarray#: Mean direction in cartesian coordinates, of shape (3,).

kappa: float#: Concentration parameter.

vectorose.stats.fit_fisher_vonmises_distribution(vector_field: numpy.ndarray) → FisherVonMisesParameters[source]#

Fit a Fisher-von Mises spherical distribution to a vector field.

Parameters:

vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.

Returns:

FisherVonMisesParameters – Parameters necessary to construct a Fisher-von Mises distribution that fits the vector field.

Warning

The vectors must be normalised to unit length before computing these statistics.