vectorose.stats
===============

.. py:module:: vectorose.stats

.. autoapi-nested-parse::

   Statistical analyses.

   Statistical tests, analyses and routines for analysing the directional data used to construct the VectoRose plots.

   .. warning::

      For many of the statistical operations defined here, a set of *unit vectors* is required in order for interpretation to be possible. To produce a set of unit vectors, call :func:`.util.normalise_vectors`. The vectors passed to these functions should not contain spatial location coordinates.

   .. rubric:: Notes

   These statistical tests are largely derived from the work by Fisher, Lewis and Embleton. [#fisher-lewis-embleton]_

   .. rubric:: References

   .. [#fisher-lewis-embleton] Fisher, N. I., Lewis, T., & Embleton, B. J. J. (1993). Statistical analysis of spherical data ([New ed.], 1. paperback ed). Cambridge Univ. Press.

   .. [#woodcock-1977] Woodcock, N. H. (1977). Specification of fabric shapes using an eigenvalue method. Geological Society of America Bulletin, 88(9), 1231. https://doi.org/10.1130/0016-7606(1977)88<1231:SOFSUA>2.0.CO;2


Classes
-------

.. autoapisummary::

   vectorose.stats.HypothesisResult
   vectorose.stats.OrientationMatrixParameters
   vectorose.stats.FisherVonMisesParameters


Functions
---------

.. autoapisummary::

   vectorose.stats.compute_resultant_vector
   vectorose.stats.compute_orientation_matrix
   vectorose.stats.compute_orientation_matrix_eigs
   vectorose.stats.compute_orientation_matrix_parameters
   vectorose.stats.uniform_vs_unimodal_test
   vectorose.stats._compute_sum_of_arc_lengths
   vectorose.stats.compute_median_direction
   vectorose.stats.compute_elliptical_confidence_cone_points
   vectorose.stats.compute_confidence_cone_for_median
   vectorose.stats._kappa_equation
   vectorose.stats.compute_mean_unit_direction
   vectorose.stats.estimate_concentration_parameter
   vectorose.stats.compute_confidence_cone_radius
   vectorose.stats.fit_fisher_vonmises_distribution
   vectorose.stats.compute_magnitude_orientation_correlation


Module Contents
---------------

.. py:class:: HypothesisResult

   Results for hypothesis testing.

   .. py:attribute:: can_reject_null_hypothesis
      :type: bool

      Indicate whether the null hypothesis can be rejected.

   .. py:attribute:: p_value
      :type: float

      Computed p-value for the test.

   .. py:attribute:: test_significance
      :type: float

      Significance level used for the test.


.. py:function:: compute_resultant_vector(vector_field: numpy.ndarray, compute_mean_resultant: bool = True) -> numpy.ndarray

   Compute the resultant vector for a set of orientations.

   Compute the resultant vector from a set of orientations or directions. This vector is computed as the sum of all constituent vectors.

   :param vector_field: The vector field to consider, represented as either an array of shape ``(n, d)`` or an ``n+1``-dimensional array containing the components at their spatial locations, with the components present along the *last* axis.
   :param compute_mean_resultant: Indicate whether the mean resultant should be returned instead of the non-normalised resultant vector.

   :returns: :class:`numpy.ndarray` -- Array of shape ``(d,)`` containing the resultant vector. If `compute_mean_resultant` is `True`, then this is the mean resultant vector.

   .. rubric:: Notes

   This implementation is based on the description in chapter 3 of Fisher, Lewis and Embleton's book [#fisher-lewis-embleton]_ on statistics on the sphere.


.. py:function:: compute_orientation_matrix(vectors: numpy.ndarray) -> numpy.ndarray

   Compute the orientation matrix for a set of vectors.

   Compute the orientation matrix for a set of vectors, as described in Fisher, Lewis and Embleton. [#fisher-lewis-embleton]_ This ``d * d`` matrix contains the sum of the pairwise products of the vector components.

   :param vectors: Array of shape ``(n, d)`` where ``n`` is the number of vectors and ``d`` is the number of dimensions.

   :returns: :class:`numpy.ndarray` -- Array of shape ``(d, d)`` corresponding to the orientation matrix.


.. py:function:: compute_orientation_matrix_eigs(vector_field: numpy.ndarray) -> NamedTuple

   Compute the eigenvectors and eigenvalues of the orientation matrix.

   Compute the eigen-decomposition of the orientation matrix. This function computes the matrix and then performs the eigenvector calculation.

   :param vector_field: The vector field to consider, represented as either an array of shape ``(n, d)`` or an ``n+1``-dimensional array containing the components at their spatial locations, with the components present along the *last* axis.

   :returns: * **eigenvectors** (:class:`numpy.ndarray`) -- Eigenvectors of the orientation matrix.
             * **eigenvalues** (:class:`numpy.ndarray`) -- Eigenvalues of the orientation matrix.

   .. rubric:: Notes

   Equivalent to calling :func:`compute_orientation_matrix` and then using NumPy to compute the eigenvectors and eigenvalues.


.. py:class:: OrientationMatrixParameters

   Bases: :py:obj:`NamedTuple`

   Orientation matrix parameters.

   These parameters were first described by Woodcock. [#woodcock-1977]_

   .. py:attribute:: shape_parameter
      :type: float

      Shape parameter, also known as gamma.

   .. py:attribute:: strength_parameter
      :type: float

      Strength parameter, also known as zeta.


.. py:function:: compute_orientation_matrix_parameters(eigs: numpy.ndarray) -> OrientationMatrixParameters

   Compute Woodcock's orientation matrix parameters.

   Compute the shape and strength parameters based on the orientation matrix, using the process first described by Woodcock [#woodcock-1977]_ and using the notation presented by Fisher, Lewis and Embleton. [#fisher-lewis-embleton]_

   :param eigs: The eigenvalues of the orientation matrix.

   :returns: :class:`OrientationMatrixParameters` -- The distribution parameters computed from the orientation matrix eigenvalues.

   .. rubric:: Notes

   See section 3.4 of Fisher, Lewis and Embleton [#fisher-lewis-embleton]_ for computational and notational details. For the original description, see Woodcock. [#woodcock-1977]_


.. py:function:: uniform_vs_unimodal_test(vector_field: numpy.ndarray, significance_level: float = 0.05) -> HypothesisResult

   Uniformity vs. unimodality test.

   Apply a test to determine if a distribution is uniform or unimodal, as described in section 5.3.1(i) of Fisher, Lewis and Embleton. [#fisher-lewis-embleton]_

   :param vector_field: The vector field to consider, represented as either an array of shape ``(n, d)`` or an ``n+1``-dimensional array containing the components at their spatial locations, with the components present along the *last* axis.
   :param significance_level: Type I error value for the statistical test, default 0.05.

   :returns: :class:`HypothesisResult` -- Results of the hypothesis testing, indicating whether the null hypothesis of uniformity can be rejected, as well as the computed p-value.

   .. rubric:: Notes

   In this function, the null hypothesis considers the orientations to be uniformly distributed on the surface of a sphere. The alternative hypothesis states that the data are not uniform, and are instead unimodal.

   This implementation assumes the large sample size scenario. The resultant length ``R`` is computed, and then the test statistic ``3R^2 / n`` is calculated and compared with a chi-squared variable with 3 degrees of freedom. If the test statistic is greater than the chi-squared value, then we can reject the null hypothesis in favour of the alternative hypothesis.

   .. rubric:: References

   See [#fisher-lewis-embleton]_, section 5.3.1(i).


.. py:function:: _compute_sum_of_arc_lengths(new_vector: numpy.ndarray, vectors: numpy.ndarray) -> float

   Compute the sum of arc lengths from vectors to a specified vector.

   See section 5.3.1(ii) in [#fisher-lewis-embleton]_. This function is used to estimate the spherical median of a sample of vectors.

   :param new_vector: The vector under consideration.
   :param vectors: Cartesian components of a set of vectors.

   :returns: :class:`float` -- The sum of arc lengths from all vectors to the specified vector.


.. py:function:: compute_median_direction(vector_field: numpy.ndarray) -> numpy.ndarray

   Compute the median direction for a unimodal distribution.

   Using the method described by Fisher, Lewis and Embleton [#fisher-lewis-embleton]_, compute the median direction for a unimodal directional distribution.

   :param vector_field: The vector field to consider, represented as either an array of shape ``(n, d)`` or an ``n+1``-dimensional array containing the components at their spatial locations, with the components present along the *last* axis.

   :returns: :class:`numpy.ndarray` -- Cartesian coordinates for the estimate of the median direction.

   .. warning::

      The vectors must be normalised to unit length before computing these statistics.


.. py:function:: compute_elliptical_confidence_cone_points(w_matrix: numpy.ndarray, constant: float, h_matrix: numpy.ndarray = np.eye(3), number_of_points: int = 36) -> numpy.ndarray

   Compute ellipse points on the surface of a sphere.

   Following the procedure described in section 3.2.5 of Fisher, Lewis and Embleton, [#fisher-lewis-embleton]_ compute the coordinates of an ellipse on the surface of a unit sphere.

   :param w_matrix: Matrix whose eigenvectors and eigenvalues define the ellipse.
   :param constant: Value on the right-hand side of the ellipse equation.
   :param h_matrix: Frame matrix used to compute the ellipse, by default the identity matrix.
   :param number_of_points: Number of points to compute.

   :returns: :class:`numpy.ndarray` -- Array containing `number_of_points + 1` rows of 3D Cartesian points which lie on the computed ellipse. The extra point is added to ensure that the ellipse is complete.


.. py:function:: compute_confidence_cone_for_median(vector_field: numpy.ndarray, median_direction: Optional[numpy.ndarray] = None, significance_level: float = 0.05, number_of_points: int = 36) -> numpy.ndarray

   Compute elliptical confidence cone for the median orientation.

   Compute the matrix defining the elliptical confidence cone, as described by Fisher, Lewis and Embleton. [#fisher-lewis-embleton]_

   :param vector_field: The vector field to consider, represented as either an array of shape ``(n, d)`` or an ``n+1``-dimensional array containing the components at their spatial locations, with the components present along the *last* axis.
   :param median_direction: The median direction vector in a NumPy array. If `None`, then the median direction is computed in this function.
   :param significance_level: The acceptable type I error used to define the confidence cone. Based on repeated sampling, the true population median should fall within the computed ellipse `(1 - significance_level) * 100` percent of the time.
   :param number_of_points: Number of ellipse points to compute for the confidence cone.

   :returns: :class:`numpy.ndarray` -- Points on the elliptical confidence cone.

   .. warning::

      Requires a large sample size.


.. py:function:: _kappa_equation(k: float, mean_resultant_length: float) -> float

   Equation which is satisfied by the concentration parameter.

   See Fisher, Lewis and Embleton, [#fisher-lewis-embleton]_ section 5.3.2(iv).

   :param k: The concentration parameter of the Fisher-von Mises distribution.
   :param mean_resultant_length: Mean resultant length of the sampled vectors.

   :returns: :class:`float` -- Value of the equation. Should be zero.


.. py:function:: compute_mean_unit_direction(vector_field: numpy.ndarray, mean_resultant_vector: Optional[numpy.ndarray] = None) -> numpy.ndarray

   Compute the mean direction as a unit vector.

   :param vector_field: The vector field to consider, represented as either an array of shape ``(n, d)`` or an ``n+1``-dimensional array containing the components at their spatial locations, with the components present along the *last* axis.
   :param mean_resultant_vector: Optional mean resultant vector. If provided, this vector is normalised. Otherwise, this vector is computed.

   :returns: :class:`numpy.ndarray` -- Unit vector containing the Cartesian coordinates of the mean direction.

   .. warning::

      Per Fisher, Lewis and Embleton, [#fisher-lewis-embleton]_ the sample mean direction corresponds to the maximum likelihood estimate of the mean direction for a Fisher-von Mises distribution. This may not hold for other distributions.


.. py:function:: estimate_concentration_parameter(vector_field: numpy.ndarray, mean_resultant_vector: Optional[numpy.ndarray] = None, initial_guess: float = 0.5) -> float

   Estimate the concentration parameter.

   Using the maximum likelihood estimator presented in section 5.3.2(iv) of Fisher, Lewis and Embleton, [#fisher-lewis-embleton]_ estimate the concentration parameter of the provided vector field, assuming that the orientations follow a Fisher-von Mises distribution.

   :param vector_field: The vector field to consider, represented as either an array of shape ``(n, d)`` or an ``n+1``-dimensional array containing the components at their spatial locations, with the components present along the *last* axis.
   :param mean_resultant_vector: The mean resultant vector, in Cartesian coordinates. If not provided, it will be computed in this function.
   :param initial_guess: Initial guess for the concentration parameter.

   :returns: :class:`float` -- The maximum likelihood estimate of the concentration parameter.

   .. warning::

      The orientations provided are assumed to be distributed following a Fisher-von Mises distribution. The result is meaningless if the data are obtained from a different underlying distribution.

      This estimator is biased. See Fisher, Lewis and Embleton for alternative unbiased estimators.

   .. rubric:: Notes

   As described by Fisher, Lewis and Embleton, [#fisher-lewis-embleton]_ the maximum likelihood estimate of the Fisher-von Mises concentration parameter :math:`\kappa` is obtained by solving:

   .. math::

      \coth(\kappa) - 1/\kappa = R / n

   where :math:`\coth` is the hyperbolic cotangent, :math:`R` is the resultant length and :math:`n` is the number of vectors.


.. py:function:: compute_confidence_cone_radius(vector_field: numpy.ndarray, kappa_estimate: Optional[float] = None, confidence_level: float = 0.01, use_degrees: bool = False) -> float

   Compute confidence cone radius for mean direction estimate.

   Determine the confidence cone radius around the estimated mean direction for a specified significance level for a Fisher-von Mises distribution. See the description in section 5.3.2(iv) of Fisher, Lewis and Embleton. [#fisher-lewis-embleton]_

   :param vector_field: The vector field to consider, represented as either an array of shape ``(n, d)`` or an ``n+1``-dimensional array containing the components at their spatial locations, with the components present along the *last* axis.
   :param kappa_estimate: Optional estimate of kappa. If not provided, then an estimate is computed to determine which estimation approach to use.
   :param confidence_level: Desired confidence level for the mean direction estimate.
   :param use_degrees: Indicate whether the angular radius should be converted to degrees.

   :returns: :class:`float` -- Arc length along the sphere of the confidence cone for the specified significance level.

   .. warning::

      This function is only valid on orientations obtained by processes with an underlying Fisher-von Mises distribution. The results cannot be interpreted for data generated by other processes.


.. py:class:: FisherVonMisesParameters

   Parameters for a Fisher-von Mises distribution.

   .. py:attribute:: mu
      :type: numpy.ndarray

      Mean direction in Cartesian coordinates, of shape ``(3, )``.

   .. py:attribute:: kappa
      :type: float

      Concentration parameter.


.. py:function:: fit_fisher_vonmises_distribution(vector_field: numpy.ndarray) -> FisherVonMisesParameters

   Fit a Fisher-von Mises spherical distribution to a vector field.

   :param vector_field: The vector field to consider, represented as either an array of shape ``(n, d)`` or an ``n+1``-dimensional array containing the components at their spatial locations, with the components present along the *last* axis.

   :returns: :class:`FisherVonMisesParameters` -- Parameters necessary to construct a Fisher-von Mises distribution that fits the vector field.

   .. warning::

      The vectors must be normalised to unit length before computing these statistics.

   .. seealso::

      :obj:`scipy.stats.vonmises_fisher.fit`
          Function used to perform the fitting.


.. py:function:: compute_magnitude_orientation_correlation(vectors: numpy.ndarray, significance_level: float = 0.05) -> Tuple[float, HypothesisResult]

   Compute the correlation between the magnitude and orientation.

   Following the procedure outlined in section 8.2.4 in Fisher, Lewis and Embleton, [#fisher-lewis-embleton]_ compute the correlation between the magnitude and orientation of a set of **non-unit** vectors.

   :param vectors: Array of shape ``(n, 3)`` containing the vectors to analyse. These should **not** be unit vectors.
   :param significance_level: The test significance to compare the computed p-value against.

   :returns: * **correlation_coefficient** (:class:`float`) -- Biased estimate of the correlation coefficient.
             * **hypothesis_result** (:class:`HypothesisResult`) -- Result of the hypothesis test to determine if the magnitude and orientation are correlated.

   .. warning::

      This implementation assumes that a **large sample** is used (i.e., ``n`` > 25).

   .. rubric:: Notes

   The correlation coefficient is computed using the deviations from the mean of each variable. The jackknife approach has not yet been implemented.

   In this statistical test, the **null hypothesis** is that the magnitude and orientation are **not** correlated. If the test statistic is below the chi-squared value at the desired significance level, we reject this null hypothesis.

   The current implementation modifies the description in Fisher, Lewis and Embleton [#fisher-lewis-embleton]_ by performing array operations.
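
Two of the computations described in this module are stated explicitly enough to sketch independently: the large-sample uniformity statistic ``3R^2 / n``, compared against a chi-squared variable with 3 degrees of freedom, and the concentration equation ``coth(kappa) - 1/kappa = R / n``. The following is a minimal illustration using only NumPy and SciPy. It is not the ``vectorose`` implementation; the helper names (``resultant_length``, ``uniformity_test``, ``solve_kappa``), the synthetic sample and the root-bracketing interval are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import chi2


def resultant_length(unit_vectors):
    """Length R of the vector sum of an (n, 3) array of unit vectors."""
    return float(np.linalg.norm(unit_vectors.sum(axis=0)))


def uniformity_test(unit_vectors, significance_level=0.05):
    """Large-sample test of uniformity against a unimodal alternative.

    Returns the statistic 3 R^2 / n, its p-value under a chi-squared
    distribution with 3 degrees of freedom, and the reject decision.
    """
    n = unit_vectors.shape[0]
    statistic = 3.0 * resultant_length(unit_vectors) ** 2 / n
    p_value = float(chi2.sf(statistic, df=3))
    return statistic, p_value, p_value < significance_level


def solve_kappa(unit_vectors):
    """Solve coth(kappa) - 1/kappa = R / n by root bracketing.

    The bracket [1e-6, 1e3] is an assumption of this sketch; brentq
    raises ValueError if R / n falls outside the range it covers.
    """
    n = unit_vectors.shape[0]
    r_bar = resultant_length(unit_vectors) / n

    def equation(k):
        return 1.0 / np.tanh(k) - 1.0 / k - r_bar

    return brentq(equation, 1e-6, 1e3)


# Synthetic sample: directions scattered around the +z axis, then
# normalised to unit length as the module-level warning requires.
rng = np.random.default_rng(0)
raw = np.array([0.0, 0.0, 1.0]) + 0.2 * rng.normal(size=(500, 3))
clustered = raw / np.linalg.norm(raw, axis=1, keepdims=True)

statistic, p_value, rejected = uniformity_test(clustered)
kappa = solve_kappa(clustered)
print(f"statistic={statistic:.1f}, p={p_value:.3g}, reject={rejected}")
print(f"kappa estimate={kappa:.2f}")
```

A bracketing root finder is used here because it needs no initial guess; the ``initial_guess`` parameter of :func:`estimate_concentration_parameter` suggests the library takes a different route.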