vectorose.stats#

Statistical analyses.

Statistical tests, analyses and routines for analysing the directional data used to construct the VectoRose plots.

Warning

Many of the statistical operations defined here can only be interpreted when they are applied to unit vectors. To produce a set of unit vectors, call util.normalise_vectors().

The vectors passed to these functions should not contain spatial location coordinates.
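As a minimal sketch of what the unit-vector requirement means in practice, the snippet below rescales each row of an (n, d) array to unit length with plain NumPy. The helper name normalise_rows is hypothetical and is not part of vectorose; util.normalise_vectors() may handle zero-length vectors differently.

```python
import numpy as np


def normalise_rows(vectors: np.ndarray) -> np.ndarray:
    """Scale each row of an (n, d) array to unit length (hypothetical helper)."""
    norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
    # Zero-length vectors carry no direction, so this sketch drops them.
    keep = norms[:, 0] > 0
    return vectors[keep] / norms[keep]


vectors = np.array([[3.0, 0.0, 4.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0]])
unit_vectors = normalise_rows(vectors)
```

Every remaining row has length one, which is the form assumed by most of the routines in this module.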

Notes

These statistical tests are largely derived from the work by Fisher, Lewis and Embleton. [1]

References

Classes#

HypothesisResult

Results for hypothesis testing.

OrientationMatrixParameters

Orientation matrix parameters.

FisherVonMisesParameters

Parameters for a Fisher-von Mises distribution.

Functions#

compute_resultant_vector(→ numpy.ndarray)

Compute the resultant vector for a set of orientations.

compute_orientation_matrix(→ numpy.ndarray)

Compute the orientation matrix for a set of vectors.

compute_orientation_matrix_eigs(→ NamedTuple)

Compute the eigenvectors and eigenvalues of the orientation matrix.

compute_orientation_matrix_parameters(...)

Compute Woodcock's orientation matrix parameters.

uniform_vs_unimodal_test(→ HypothesisResult)

Uniformity vs. unimodality test.

_compute_sum_of_arc_lengths(→ float)

Compute the sum of arc lengths from vectors to a specified vector.

compute_median_direction(→ numpy.ndarray)

Compute the median direction for a unimodal distribution.

compute_elliptical_confidence_cone_points(→ numpy.ndarray)

Compute ellipse points on the surface of a sphere.

compute_confidence_cone_for_median(→ numpy.ndarray)

Compute elliptical confidence cone for the median orientation.

_kappa_equation(→ float)

Equation which is satisfied by the concentration parameter.

compute_mean_unit_direction(→ numpy.ndarray)

Compute the mean direction as a unit vector.

estimate_concentration_parameter(→ float)

Estimate the concentration parameter.

compute_confidence_cone_radius(→ float)

Compute confidence cone radius for mean direction estimate.

fit_fisher_vonmises_distribution(...)

Fit a Fisher-von Mises spherical distribution to a vector field.

compute_magnitude_orientation_correlation(...)

Compute the correlation between the magnitude and orientation.

Module Contents#

class vectorose.stats.HypothesisResult[source]#

Results for hypothesis testing.

can_reject_null_hypothesis: bool#

Indicate whether the null hypothesis can be rejected.

p_value: float#

Computed p-value for the test.

test_significance: float#

Significance level used for the test.

vectorose.stats.compute_resultant_vector(vector_field: numpy.ndarray, compute_mean_resultant: bool = True) numpy.ndarray[source]#

Compute the resultant vector for a set of orientations.

Compute the resultant vector from a set of orientations or directions. This vector is computed as the sum of all constituent vectors.

Parameters:
  • vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.

  • compute_mean_resultant – Indicate whether the mean resultant should be returned instead of the non-normalised resultant vector.

Returns:

numpy.ndarray – Array of shape (d,) containing the resultant vector. If compute_mean_resultant is True, then this is the mean resultant vector.

Notes

This implementation is based on the description in chapter 3 of Fisher, Lewis and Embleton’s book [1] on statistics on the sphere.
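The computation described above can be sketched in a few lines of NumPy. This is an illustration under the assumption that the components lie along the last axis; the name resultant_vector is hypothetical and the library's own implementation may differ in detail.

```python
import numpy as np


def resultant_vector(vector_field: np.ndarray, mean: bool = True) -> np.ndarray:
    """Sum all vectors; optionally divide by their count (illustrative only)."""
    # Flatten any spatial axes so that each row holds one vector.
    flat = vector_field.reshape(-1, vector_field.shape[-1])
    total = flat.sum(axis=0)
    return total / len(flat) if mean else total


unit_vectors = np.array(
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
)
total = resultant_vector(unit_vectors, mean=False)
mean_resultant = resultant_vector(unit_vectors, mean=True)

# The same vectors arranged on a 2 x 2 grid give the same answer.
gridded = resultant_vector(unit_vectors.reshape(2, 2, 3), mean=False)
```

For unit vectors, the length of the mean resultant lies between 0 and 1 and grows as the vectors become more tightly aligned.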

vectorose.stats.compute_orientation_matrix(vectors: numpy.ndarray) numpy.ndarray[source]#

Compute the orientation matrix for a set of vectors.

Compute the orientation matrix for a set of vectors, as described in Fisher, Lewis and Embleton. [1] This d × d matrix contains the sums, taken over all vectors, of the pairwise products of the vector components.

Parameters:

vectors – Array of shape (n, d) where n is the number of vectors and d is the number of dimensions.

Returns:

numpy.ndarray – Array of shape (d, d) corresponding to the orientation matrix.
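One way to read this definition: the orientation matrix is the sum of the outer products of each vector with itself, which for an (n, d) array equals the matrix product of its transpose with itself. The snippet below is a sketch of that identity, not the library's code.

```python
import numpy as np

vectors = np.array(
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
)

# Compact form: entry (i, j) is the sum over vectors of component i times component j.
orientation_matrix = vectors.T @ vectors

# Equivalent explicit form, summing one outer product per vector.
explicit = sum(np.outer(v, v) for v in vectors)
```

For unit vectors the trace of this matrix equals the number of vectors, which is a quick sanity check on the input.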

vectorose.stats.compute_orientation_matrix_eigs(vector_field: numpy.ndarray) NamedTuple[source]#

Compute the eigenvectors and eigenvalues of the orientation matrix.

Compute the eigen-decomposition of the orientation matrix. This function computes the matrix and then performs eigenvector calculation.

Parameters:

vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.

Returns:

  • eigenvectors (numpy.ndarray) – Eigenvectors of the orientation matrix.

  • eigenvalues (numpy.ndarray) – Eigenvalues of the orientation matrix.

Notes

Equivalent to calling compute_orientation_matrix() and then using NumPy to compute the eigenvectors and eigenvalues.
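A sketch of that two-step process is shown below. It assumes numpy.linalg.eigh, which suits symmetric matrices and returns eigenvalues in ascending order; the NumPy routine and the ordering actually used by the library are not stated here.

```python
import numpy as np

# Five unit vectors clustered symmetrically around the z-axis.
vectors = np.array(
    [
        [0.0, 0.0, 1.0],
        [0.6, 0.0, 0.8],
        [-0.6, 0.0, 0.8],
        [0.0, 0.6, 0.8],
        [0.0, -0.6, 0.8],
    ]
)

orientation_matrix = vectors.T @ vectors
# eigh returns ascending eigenvalues and the eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(orientation_matrix)
principal_axis = eigenvectors[:, -1]
```

The eigenvector paired with the largest eigenvalue points along the dominant axis of the sample, up to sign.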

class vectorose.stats.OrientationMatrixParameters[source]#

Bases: NamedTuple

Orientation matrix parameters.

These parameters were first described by Woodcock. [2]

shape_parameter: float#

Shape parameter, also known as gamma.

strength_parameter: float#

Strength parameter, also known as zeta.

vectorose.stats.compute_orientation_matrix_parameters(eigs: numpy.ndarray) OrientationMatrixParameters[source]#

Compute Woodcock’s orientation matrix parameters.

Compute the shape and strength parameters based on the orientation matrix, using the process first described by Woodcock [2] and using the notation presented by Fisher, Lewis and Embleton. [1]

Parameters:

eigs – The eigenvalues of the orientation matrix.

Returns:

OrientationMatrixParameters – The distribution parameters computed from the orientation matrix eigenvalues.

Notes

See section 3.4 of Fisher, Lewis and Embleton [1] for computational and notational details. For the original description, see Woodcock. [2]
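As a hedged sketch, the parameters are commonly written in terms of the sorted eigenvalues τ1 ≤ τ2 ≤ τ3 as γ = ln(τ3/τ2) / ln(τ2/τ1) and ζ = ln(τ3/τ1). The function name and the sorting step below are assumptions for illustration; the library may expect the eigenvalues in a particular order.

```python
import numpy as np


def woodcock_parameters(eigenvalues):
    """Return (shape, strength) from three eigenvalues (illustrative only)."""
    # Assumed convention: tau1 <= tau2 <= tau3.
    tau1, tau2, tau3 = np.sort(np.asarray(eigenvalues, dtype=float))
    shape = np.log(tau3 / tau2) / np.log(tau2 / tau1)
    strength = np.log(tau3 / tau1)
    return float(shape), float(strength)


gamma, zeta = woodcock_parameters([0.1, 0.7, 0.2])
```

Both parameters depend only on eigenvalue ratios, so it does not matter whether the eigenvalues are normalised by the sample size. The shape parameter is undefined when the two smallest eigenvalues are equal.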

vectorose.stats.uniform_vs_unimodal_test(vector_field: numpy.ndarray, significance_level: float = 0.05) HypothesisResult[source]#

Uniformity vs. unimodality test.

Apply a test to determine if a distribution is uniform or unimodal, as described in section 5.3.1(i) of Fisher, Lewis and Embleton. [1]

Parameters:
  • vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.

  • significance_level – Type I error value for the statistical test, default 0.05.

Returns:

HypothesisResult – Results of the hypothesis testing, indicating whether the null hypothesis of uniformity can be rejected, as well as the computed p-value.

Notes

In this function, the null hypothesis considers the orientations to be uniformly distributed on the surface of a sphere. The alternative hypothesis states that the data are not uniform, and are instead unimodal.

This implementation assumes the large sample size scenario. The resultant length R is computed, and the test statistic 3R^2 / n is then compared with the critical value of a chi-squared distribution with 3 degrees of freedom at the chosen significance level. If the test statistic exceeds this critical value, we can reject the null hypothesis in favour of the alternative hypothesis.

References

See [1], section 5.3.1(i).
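The large-sample procedure in the Notes can be sketched as follows. The function name is hypothetical, and comparing the p-value with the significance level is assumed to be equivalent to the critical-value comparison described above.

```python
import numpy as np
from scipy.stats import chi2


def uniformity_test(unit_vectors, significance_level=0.05):
    """Return (statistic, p_value, reject_null) for the large-sample test."""
    n = len(unit_vectors)
    resultant_length = np.linalg.norm(unit_vectors.sum(axis=0))
    statistic = 3.0 * resultant_length**2 / n
    # Upper-tail probability of a chi-squared variable with 3 degrees of freedom.
    p_value = float(chi2.sf(statistic, df=3))
    return float(statistic), p_value, bool(p_value < significance_level)


# 30 vectors clustered around the z-axis.
cluster = np.tile(
    np.array(
        [
            [0.0, 0.0, 1.0],
            [0.6, 0.0, 0.8],
            [-0.6, 0.0, 0.8],
            [0.0, 0.6, 0.8],
            [0.0, -0.6, 0.8],
        ]
    ),
    (6, 1),
)
# 30 vectors spread evenly over the six coordinate directions.
spread = np.tile(np.vstack([np.eye(3), -np.eye(3)]), (5, 1))

stat_cluster, p_cluster, reject_cluster = uniformity_test(cluster)
stat_spread, p_spread, reject_spread = uniformity_test(spread)
```

The clustered sample has a long resultant and leads to rejection of uniformity, while the evenly spread sample has a zero resultant and does not.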

vectorose.stats._compute_sum_of_arc_lengths(new_vector: numpy.ndarray, vectors: numpy.ndarray) float[source]#

Compute the sum of arc lengths from vectors to a specified vector.

See section 5.3.1(ii) in [1]. This function is used to estimate the spherical median of a sample of vectors.

Parameters:
  • new_vector – The vector under consideration.

  • vectors – Cartesian components of a set of vectors.

Returns:

float – The sum of arc lengths from all vectors to the specified vector.
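On a unit sphere, the arc length between two unit vectors is the angle between them, so the quantity can be sketched as a sum of arccosines of dot products. The function below is an illustration with a hypothetical name and assumes all inputs are unit vectors.

```python
import numpy as np


def sum_of_arc_lengths(candidate: np.ndarray, unit_vectors: np.ndarray) -> float:
    """Sum the great-circle distances from each row to the candidate vector."""
    # Clip to guard against dot products that drift just outside [-1, 1].
    cosines = np.clip(unit_vectors @ candidate, -1.0, 1.0)
    return float(np.arccos(cosines).sum())


candidate = np.array([0.0, 0.0, 1.0])
unit_vectors = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 0.0, -1.0]])
total_arc_length = sum_of_arc_lengths(candidate, unit_vectors)
```

Here the three distances are 0, π/2 and π, so the total is 3π/2.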

vectorose.stats.compute_median_direction(vector_field: numpy.ndarray) numpy.ndarray[source]#

Compute the median direction for a unimodal distribution.

Using the method described by Fisher, Lewis and Embleton [1], compute the median direction for a unimodal directional distribution.

Parameters:

vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.

Returns:

numpy.ndarray – Cartesian coordinates for the estimate of the median direction.

Warning

The vectors must be normalised to unit length before computing these statistics.
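Since the spherical median is the direction that minimises the sum of arc lengths to the sample, one possible sketch is a numerical minimisation over the two spherical angles, started from the mean direction. The Nelder-Mead optimiser and all names below are assumptions made for illustration; the library's own optimisation strategy is not described here.

```python
import numpy as np
from scipy.optimize import minimize


def to_cartesian(angles):
    """Convert (polar angle, azimuth) to a unit vector."""
    theta, phi = angles
    return np.array(
        [np.sin(theta) * np.cos(phi), np.sin(theta) * np.sin(phi), np.cos(theta)]
    )


def spherical_median(unit_vectors):
    """Minimise the summed arc length to the sample (illustrative only)."""

    def cost(angles):
        cosines = np.clip(unit_vectors @ to_cartesian(angles), -1.0, 1.0)
        return np.arccos(cosines).sum()

    # Start the search from the mean direction of the sample.
    mean_direction = unit_vectors.sum(axis=0)
    mean_direction = mean_direction / np.linalg.norm(mean_direction)
    start = [
        np.arccos(mean_direction[2]),
        np.arctan2(mean_direction[1], mean_direction[0]),
    ]
    result = minimize(cost, start, method="Nelder-Mead")
    return to_cartesian(result.x)


# Five unit vectors clustered symmetrically around the x-axis.
unit_vectors = np.array(
    [
        [1.0, 0.0, 0.0],
        [0.8, 0.6, 0.0],
        [0.8, -0.6, 0.0],
        [0.8, 0.0, 0.6],
        [0.8, 0.0, -0.6],
    ]
)
median_direction = spherical_median(unit_vectors)
```

The angle parameterisation is singular at the poles, so a production implementation would likely need a more careful treatment for samples centred near the z-axis.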

vectorose.stats.compute_elliptical_confidence_cone_points(w_matrix: numpy.ndarray, constant: float, h_matrix: numpy.ndarray = np.eye(3), number_of_points: int = 36) numpy.ndarray[source]#

Compute ellipse points on the surface of a sphere.

Following the procedure described in section 3.2.5 of Fisher, Lewis and Embleton, [1] compute the coordinates of an ellipse on the surface of a unit sphere.

Parameters:
  • w_matrix – Matrix whose eigenvectors and eigenvalues define the ellipse.

  • constant – Value on the right hand side of the ellipse equation.

  • h_matrix – Frame matrix used to compute the ellipse, by default the identity matrix.

  • number_of_points – Number of points to compute.

Returns:

numpy.ndarray – Array containing number_of_points + 1 rows of 3D cartesian points which lie on the computed ellipse. The extra point is added to ensure that the ellipse is complete.

vectorose.stats.compute_confidence_cone_for_median(vector_field: numpy.ndarray, median_direction: numpy.ndarray | None = None, significance_level: float = 0.05, number_of_points: int = 36) numpy.ndarray[source]#

Compute elliptical confidence cone for the median orientation.

Compute the matrix defining the elliptical confidence cone, as described by Fisher, Lewis and Embleton. [1]

Parameters:
  • vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.

  • median_direction – The median direction vector in a NumPy array. If None, then the median direction is computed in this function.

  • significance_level – The acceptable type I error used to define the confidence cone. Based on repeated sampling, the true population median should fall within the computed ellipse (1 - significance_level) * 100 percent of the time.

  • number_of_points – Number of ellipse points to compute for the confidence cone.

Returns:

numpy.ndarray – Points on the elliptical confidence cone.

Warning

Requires a large sample size.

vectorose.stats._kappa_equation(k: float, mean_resultant_length: float) float[source]#

Equation which is satisfied by the concentration parameter.

See Fisher, Lewis and Embleton, [1] section 5.3.2(iv).

Parameters:
  • k – The concentration parameter of the Fisher-von Mises distribution.

  • mean_resultant_length – Mean resultant length of the sampled vectors.

Returns:

float – Value of the equation, which is zero when k satisfies it.

vectorose.stats.compute_mean_unit_direction(vector_field: numpy.ndarray, mean_resultant_vector: numpy.ndarray | None = None) numpy.ndarray[source]#

Compute the mean direction as a unit vector.

Parameters:
  • vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.

  • mean_resultant_vector – Optional mean resultant vector. If provided, this vector is normalised. Otherwise, this vector is computed.

Returns:

numpy.ndarray – Unit vector containing the cartesian coordinates of the mean direction.

Warning

Per Fisher, Lewis and Embleton, [1] the sample mean direction corresponds to the maximum likelihood estimate for a Fisher-von Mises distribution. This may not hold for other distributions.

vectorose.stats.estimate_concentration_parameter(vector_field: numpy.ndarray, mean_resultant_vector: numpy.ndarray | None = None, initial_guess: float = 0.5) float[source]#

Estimate the concentration parameter.

Using the maximum likelihood estimator presented in section 5.3.2(iv) of Fisher, Lewis and Embleton, [1] estimate the concentration parameter of the provided vector field, assuming that the orientations follow a Fisher-von Mises distribution.

Parameters:
  • vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.

  • mean_resultant_vector – The mean resultant vector, in cartesian coordinates. If not provided, it will be computed in this function.

  • initial_guess – Initial guess for the concentration parameter.

Returns:

float – The maximum likelihood estimate of the concentration parameter.

Warning

The orientations provided are assumed to be distributed following a Fisher-von Mises distribution. The result is meaningless if the data are obtained from a different underlying distribution.

This estimator is biased. See Fisher, Lewis and Embleton for alternative unbiased estimators.

Notes

As described by Fisher, Lewis and Embleton, [1] the maximum likelihood estimate of the Fisher-von Mises concentration parameter \(\kappa\) is obtained by solving:

\[\coth(\kappa) - 1/\kappa = R / n\]

where \(\coth\) is the hyperbolic cotangent, \(R\) is the resultant length and \(n\) is the number of vectors.
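The equation above has no closed-form solution, so it is solved numerically. The sketch below uses bracketed root finding with scipy.optimize.brentq rather than a solver seeded by an initial guess; that choice, the bracket limits and the function names are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import brentq


def kappa_residual(kappa: float, mean_resultant_length: float) -> float:
    """coth(kappa) - 1/kappa - R/n, which is zero at the estimate."""
    return 1.0 / np.tanh(kappa) - 1.0 / kappa - mean_resultant_length


def estimate_kappa(mean_resultant_length: float) -> float:
    """Solve the likelihood equation on an assumed bracket of (1e-6, 1e3)."""
    return brentq(kappa_residual, 1e-6, 1e3, args=(mean_resultant_length,))


# Mean resultant length R / n of a fairly concentrated sample.
kappa_hat = estimate_kappa(0.84)
```

The left-hand side increases from 0 towards 1 as κ grows, so a mean resultant length very close to 0 or 1 falls outside this bracket and would need wider limits.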

vectorose.stats.compute_confidence_cone_radius(vector_field: numpy.ndarray, kappa_estimate: float | None = None, confidence_level: float = 0.01, use_degrees: bool = False) float[source]#

Compute confidence cone radius for mean direction estimate.

Determine the confidence cone radius around the estimated mean direction for a specified significance level for a Fisher-von Mises distribution. See the description in section 5.3.2 (iv) of Fisher, Lewis and Embleton. [1]

Parameters:
  • vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.

  • kappa_estimate – Optional estimate of kappa. If not provided, then an estimate is computed to determine which estimation approach to use.

  • confidence_level – Desired confidence level for the mean direction estimate.

  • use_degrees – Indicate whether the angular radius should be converted to degrees.

Returns:

float – Arc length along the sphere of the confidence cone for the specified significance level.

Warning

This function is only valid on orientations obtained by processes with an underlying Fisher-von Mises distribution. The results cannot be interpreted for data generated by other processes.
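For orientation, a commonly quoted closed-form expression for this radius, usually attributed to Fisher and intended for reasonably concentrated samples, is cos(θ) = 1 − ((n − R) / R) ((1/α)^(1/(n − 1)) − 1). The sketch below implements only that expression; whether the library uses exactly this form, or switches to another approach for small kappa, is not confirmed here. The parameter name alpha is an assumption.

```python
import numpy as np


def fisher_cone_radius(unit_vectors, alpha=0.05, use_degrees=False):
    """Semi-vertical angle of the confidence cone (illustrative only)."""
    n = len(unit_vectors)
    resultant_length = np.linalg.norm(unit_vectors.sum(axis=0))
    cos_radius = 1.0 - ((n - resultant_length) / resultant_length) * (
        (1.0 / alpha) ** (1.0 / (n - 1)) - 1.0
    )
    radius = float(np.arccos(np.clip(cos_radius, -1.0, 1.0)))
    return float(np.degrees(radius)) if use_degrees else radius


# 30 vectors clustered around the z-axis.
cluster = np.tile(
    np.array(
        [
            [0.0, 0.0, 1.0],
            [0.6, 0.0, 0.8],
            [-0.6, 0.0, 0.8],
            [0.0, 0.6, 0.8],
            [0.0, -0.6, 0.8],
        ]
    ),
    (6, 1),
)
radius_95 = fisher_cone_radius(cluster, alpha=0.05)
radius_99 = fisher_cone_radius(cluster, alpha=0.01)
radius_95_degrees = fisher_cone_radius(cluster, alpha=0.05, use_degrees=True)
```

A smaller alpha demands more confidence and therefore yields a wider cone.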

class vectorose.stats.FisherVonMisesParameters[source]#

Parameters for a Fisher-von Mises distribution.

mu: numpy.ndarray#

Mean direction in cartesian coordinates, of shape (3, ).

kappa: float#

Concentration parameter.

vectorose.stats.fit_fisher_vonmises_distribution(vector_field: numpy.ndarray) FisherVonMisesParameters[source]#

Fit a Fisher-von Mises spherical distribution to a vector field.

Parameters:

vector_field – The vector field to consider, represented as either an array of shape (n, d) or an n+1-dimensional array containing the components at their spatial locations, with the components present along the last axis.

Returns:

FisherVonMisesParameters – Parameters necessary to construct a Fisher-von Mises distribution that fits the vector field.

Warning

The vectors must be normalised to unit length before computing these statistics.

See also

scipy.stats.vonmises_fisher.fit

Function used to perform the fitting.

vectorose.stats.compute_magnitude_orientation_correlation(vectors: numpy.ndarray, significance_level: float = 0.05) Tuple[float, HypothesisResult][source]#

Compute the correlation between the magnitude and orientation.

Following the procedure outlined in section 8.2.4 in Fisher, Lewis and Embleton, [1] compute the correlation between the magnitude and orientation of a set of non-unit vectors.

Parameters:
  • vectors – Array of shape (n, 3) containing the vectors to analyse. These should not be unit vectors.

  • significance_level – The test significance to compare the computed p-value against.

Returns:

  • correlation_coefficient (float) – Biased estimate of the correlation coefficient.

  • hypothesis_result (HypothesisResult) – Result of the hypothesis test to determine if the magnitude and orientation are correlated.

Warning

This implementation assumes that a large sample is used (i.e., n > 25).

Notes

The correlation coefficient is computed using the deviations from the mean of each variable. The jackknife approach has not yet been implemented.

In this statistical test, the null hypothesis is that the magnitude and orientation are not correlated. If the test statistic exceeds the chi-squared critical value at the desired significance level (equivalently, if the computed p-value is below significance_level), we reject this null hypothesis.

The current implementation modifies the description in Fisher, Lewis and Embleton [1] by performing array operations.