Manifolds.jl: An Extensible Julia Framework for Data Analysis on Manifolds

We present the Julia package Manifolds.jl, providing a fast and easy-to-use library of Riemannian manifolds and Lie groups. This package enables working with data defined on a Riemannian manifold, such as the circle, the sphere, symmetric positive definite matrices, or one of the models for hyperbolic spaces. We introduce a common interface, available in ManifoldsBase.jl, with which new manifolds, applications, and algorithms can be implemented. We demonstrate the utility of Manifolds.jl using Bézier splines, an optimization task on manifolds, and principal component analysis on nonlinear data. In a benchmark, Manifolds.jl outperforms all comparable packages for low-dimensional manifolds in speed; over Python and Matlab packages, the improvement is often several orders of magnitude, while over C/C++ packages, the improvement is two-fold. For high-dimensional manifolds, it outperforms all packages except for Tensorflow-Riemopt, which is specifically tailored for high-dimensional manifolds.


Introduction
In many applications, measured data appears in non-linear spaces.In such spaces, common data analysis operations like the computation of a mean are not directly possible, since for example there is no addition operation on such data.However, many of these spaces have the advantage of being Riemannian manifolds, that is, they are smooth manifolds equipped with a Riemannian metric, i. e. with a locally varying inner product do Carmo (1992); Absil et al. (2008); Boumal (2023).Such manifolds appear in many applications.For example, in Interferometric Synthetic Aperture Radar (InSAR) Bürgmann et al. (2000), the measured data are angles, and in diffusion tensor magnetic resonance imaging (DT-MRI) Fletcher and Joshi (2007-02); Dryden et al. (2009); Ghosh et al. (2008), diffusion at each voxel is described by a symmetric positive definite matrix.The special orthogonal group of rotations is used in analysis of data from orientation sensors (Marins et al., 2001).The sphere is used in various applications of directional statistics such as hydrology (Chen et al., 2013).The discipline of computational anatomy (Grenander and Miller, 1998;Cruz-Orive et al., 1985) extensively uses geometric methods, including Large Deformation Diffeomorphic Metric Mapping (Miller, 2004) with applications to medical data analysis (Pennec et al., 2019;Tulik et al., 2019).In image analysis, Grassmann and Stiefel manifolds are used for pose estimation by Turaga et al. (2011).Various manifolds with Fisher-Rao-like metrics are also used for shape analysis by, for example, Srivastava et al. (2011), Baran (2018), and Rzecki and Baran (2021).
Other commonly encountered spaces include spheres or the Stiefel manifold of matrices with orthonormal columns.
Using tools of differential geometry, many operations can be generalized to manifolds by using only generic functions like the inner product, its induced norm, geodesics, the exponential, or the logarithmic map.Closed-form expressions for the generic functions do not always exist, but efficient implementations to approximate them are still often possible.Recent advances continue to resolve numerical challenges with efficiently implementing these operations.Our library offers a wide range of such implementations, which makes performant data analysis for manifold-valued data more accessible.
One example of an operation that can be realized with these generic functions is the mean, which can be rephrased as the point in the space that (globally) has the minimum sum of squared distances from the given data points.This definition can even be generalized to metric spaces where this minimizer is called the Fréchet mean.Local minimizers may also exist on a Riemannian manifold; these are the Riemannian centers of mass (Grove and Karcher, 1973;Karcher, 1977), sometimes also called Karcher means.
A main necessity for a wider adoption of geometric methods is a common, extensible, and efficient collection of operations on a large variety of manifolds.One of the first toolboxes for optimization on manifolds was the Matlab package Manopt (Boumal et al., 2014).The manifolds within this package are objects, where a fixed metric is assumed but not explicitly mentioned.Implementing a second metric for the same manifold requires a new Matlab object.A manifold can have an exponential map.While most have at least one retraction implemented, there is no generic way to choose the exponential map or one of the retractions while calling a solver.Several libraries followed, such as the C++ library ROPTLIB (Huang et al., 2018), which does provide a short user manual with introductory examples and an interface overview but lacks comprehensive documentation.The seven major libraries for differential geometry in Python are Pymanopt (Townsend et al., 2016), Geomstats (Miolane et al., 2020), Geoopt (Kochurov et al., 2020), TheanoGeometry (Kühnel and Sommer, 2017), Jax Geometry (Kühnel and Sommer, 2017), Tensorflow RiemOpt (Smirnov, 2021), and McTorch (Meghwanshi et al., 2018), each with their own implementations of a sometimes limited set of Riemannian manifolds.Geomstats implements a metric interface, such that different metrics on one manifold can be easily distinguished.Some libraries are more focused on specific tasks.For example, MVIRT (Bergmann, 2017) provides a variational model formulation for manifold-valued image processing.This toolbox introduces implementations of a cyclic proximal point algorithm (Bačák, 2014) applied to first-and second-order total variation (Bačák et al., 2016), as well as total generalized variation (Bergmann et al., 2018).MVIRT also provides the parallel Douglas-Rachford algorithm (Bergmann et al., 2016) for large-scale, nonsmooth optimization.
The above packages all follow the same basic approach: they generically implement their statistical or optimization algorithms for an arbitrary Riemannian manifold and provide a certain set of Riemannian manifolds that implement the necessary properties for these generic implementations to work.The speed of an algorithm then depends on efficiency of both the abstract implementation of the algorithm as well as of employed operations on the specific manifold considered in an application.
Here we present the Julia package Manifolds.jl,together with its accompanying interface package ManifoldsBase.jl.They provide an easily extensible interface to implement methods on a generic Riemannian manifold.Moreover, they implement a comprehensive, well-documented library of efficient operations on common manifolds.
The remainder of this paper is organized as follows: Section 2 provides basic definitions of concepts from differential geometry and illustrates them using the unit sphere.Next, Section 3 describes the general design of Manifolds.jl.Section 4 compares features of this library to what other libraries offer.A few examples of usage of Manifolds.jlare given in Section 5. Section 6 presents performance benchmarks of Manifolds.jland other libraries on a set of manifolds for selected operations.Finally, conclusions are contained in Section 7.

Manifolds
In this section we introduce the necessary terms to work on Riemannian manifolds.We refer the reader to the textbooks do Carmo (1992); Lee (2012); Absil et al. (2008); Boumal (2023).After introducing manifolds and related terms and definitions, the concepts are illustrated for the unit sphere in R 3 and the manifold of symmetric positive definite matrices.

General definitions
We denote by M a d-dimensional manifold.A manifold M is defined using an atlas A consisting of charts φ : U → R d that introduce bijections from subsets U ⊂ M of M to open subsets of R d Absil et al. (2008).One way to work on manifolds is to work in charts.Often, however, it is more computationally convenient or efficient to represent points on a manifold using coordinates of an embedding, instead of using a chart and working with intrinsic methods.Intrinsic here refers to the idea that the methods and tools used are "well defined regardless of the embedding space" (Boumal, 2023, p. 3).
At every point p ∈ M we denote the tangent space by T p M and the tangent vectors by X p , Y p ∈ T p M, where we leave out the index when the point p is clear from context.We denote the Riemannian metric by ⟨•, •⟩ p : T p M × T p M → R. The metric also introduces a norm for tangent vectors X ∈ T p M, which we denote by ∥X∥ p .Note that a smooth manifold can be equipped with different metrics.
For p, q ∈ M and X ∈ T p M we denote by γ p,X : I → M the geodesic starting in γ p,X (0) = p with γp,X (0) = X, where I is the maximal interval over which the geodesic can be defined (Lee,Cor. 4.28) and O p = {X ∈ T p M : γ p,X is defined on an interval containing [0, 1]}.A shortest geodesic denoted by γ p;q (t) starts at p at time t = 0, reaches γ p;q (1) = q at t = 1, and is length-minimizing.Note that the shortest geodesic might not be unique.
We further denote by exp p : T p M → M the exponential map.We define the injectivity radius inj(p) as the supremum over all radii r > 0 such that exp p is a diffeomorphism on the open ball B(p, r) = {X ∈ T p M : ∥X∥ p < r}.We denote by U p = B(p, inj(p)) and introduce U p = exp p (U ) ⊂ M. By definition exp p Up : U p → U p is a diffeomorphism and hence the inverse of exp p Up exists, called the logarithmic map.Formally the logarithmic map is defined as (Boumal, 2023, Def. 10.20) In some cases, the exponential and logarithmic maps can be derived in closed form, but otherwise they are expensive to evaluate.Therefore, one often uses a retraction retr p : T p M → M that approximates the exponential map to (at least) first order (Boumal, 2023, Def. 3.47).When the retraction can be inverted, i. e. from p and q we obtain X, this inverse retraction retr −1 p : q = retr p X → X can be used to approximate the logarithmic map.While the exponential map, and hence also the logarithmic map, depend on the metric, a retraction is defined by properties of the retraction along a curve c(t) = retr p (tX) for X ∈ T p M and its derivative c ′ (t) and hence is independent of the metric.
A tangent space can be seen as "the directions to walk in" from a point p.Let q be a second point on M such that the shortest geodesic is unique.For a tangent vector X ∈ T p M at p we denote the parallel transport to q along the shortest geodesic (assuming it is unique) by PT q←p X.Again, since this map might not be given in closed form or might be expensive to compute, it is often approximated with methods called vector transports.
Computationally, points p ∈ M and tangent vectors X ∈ T p M are usually represented by arrays, i. e. vectors or matrices.This is realized usually by embedding a manifold into some larger space.A smooth function i : M → R m whose differential Di(p) has rank d for every p ∈ M is called an immersion.The space R m is then called the embedding space of M, and M is an embedded manifold.
A manifold M is called a Lie group when we have a group operation • : M × M → M that is smooth and whose inverse map p → p −1 is also smooth.

Example: the two-dimensional unit sphere
The unit sphere in the three-dimensional Euclidean space, denoted S 2 , is an intuitive example of a Riemannian manifold.Points p ∈ S 2 on the unit sphere have unit norm ∥p∥ 2 = 1.Adding (or subtracting) two of these points yields a point with a non-unit norm, which therefore does not belong to S 2 .However, there is an intuitive way of reinterpreting addition and subtraction that makes an analogous operation on the sphere possible.
We can interpret the difference y − x between two points x, y ∈ R 3 as a vector starting from x and pointing to y. Multiplying such a vector by a scalar and adding it to x defines a set of points on a straight line passing through x and y, i. e. t → x + t(y − x).This line segment joining x and y is the shortest connecting curve on R 3 called a shortest geodesic γ x;y (t) on R 3 .On S 2 , the shortest geodesic is no longer a line segment but is instead a shortest great arc passing through points p, q ∈ S 2 .For any two non-opposing points, the intersection of their common plane through the origin with the sphere yields two geodesics, one shorter than the other.We call this shorter great arc the shortest geodesic.For two opposing points p, −p ∈ S 2 , all great arcs that join the two points are half-circles; they are therefore joined by infinitely many shortest geodesics γ p;−p .We can measure the distance d S 2 (p, q) between the points p and q as the length of a shortest geodesic, depicted as the blue curve in Figure 1.
To conceptualize tangent vectors, consider the set of vectors orthogonal to a point p ∈ S 2 , that is We obtain an actual plane tangent to the sphere if we consider all points p + X, X ∈ T p S 2 , which is tangent to the sphere and "touches" the sphere at p (at X = 0).For every X ∈ T p S 2 there exists a great arc c(t), which is a geodesic (acceleration free curve when "measuring" in the tangent spaces it passes through) that runs through c(0) = p with derivative ċ(0) = X.
When parametrising the great arc with constant speed ∥ ċ(t)∥ 2 = ∥X∥ 2 we reach some point c(1) = q ∈ S 2 after time t = 1.This point is q = exp p X.As long as the norm of X is less than π, the inverse function mapping q to X = log p q is unique as well.With a start point p and an initial velocity X, the great arc c(t) can also be determined if both end points c(0) = p and c(1) = q, p ̸ = −q are given.The plane containing the origin, p, and q is uniquely determined, and its intersection with the sphere S 2 forms a circle.The shorter arc segment connecting p and q on this circle is the image of the curve c(t).Finally, there exists a unique constant speed parametrisation ∥ ċ(t)∥ 2 = const that reaches q = c(1) after time 1.Then X = ċ(0) = log p q.If q = −p, the intersection plane is not unique, and any tangent vector X ∈ T p S 2 with ∥X∥ 2 = π is a solution of the above construction.Vectors tangent to the sphere at different points cannot be added or subtracted.First, they need to be transported to the same point.This is performed using parallel transport.Transport of a vector X tangent at p to a point q is denoted PT q←p X.This operation results in a new vector tangent to the sphere at q.Note that transporting a vector X tangent at p between a series of points -for example from p to q, from q to r, and from r back to pmay not result in the same vector X.
These operations can be extended to all manifolds as described in Section 2.1.

Example: two metrics on the symmetric positive-definite matrices
A second, more elaborate example is the set of n × n symmetric positive definite (SPD) matrices P(n).The set consists of all matrices p ∈ R n×n that are symmetric (p = p T ) and positive definite, that is, for any 0 ̸ = x ∈ R n , it holds that x T px > 0, which equivalently means that all eigenvalues of p are strictly positive.
Figure 1: Unit sphere in R 3 with two points p, q.Tangent space at p is labelled T p S 2 , geodesic between p and q is the blue arc γ(p, q), and the vector X is the tangent vector from T p S 2 that points towards q.Y is another vector from T p S 2 orthogonal to X.All tangent vectors are blue.Parallel transports of X and Y from p to q are shown as vector PT q←p X and PT q←p Y , respectively.
The "directions to walk in", or to be precise, the tangent space T p P(n) is the set of symmetric matrices X = X T .Note that for an arbitrary symmetric matrix q = p + tX where t > 0, q might only be SPD for very small values of t.Similar to the sphere, addition might mean leaving the manifold.
There are several metrics we can introduce on the tangent spaces T p P(n).In the following we compare two and their effect on the exponential map, or informally said "how to move into a direction X".One possible metric is the affine invariant metric, given by Pennec et al. ( 2019) With this metric, the exponential map is where Exp is the matrix exponential.The simplest case of n = 1 yields the manifold of positive numbers, whose inner product simplifies to ⟨X, Y ⟩ p,AI = XY p 2 and whose exponential is exp p X = pe X/p .Hence, non-positive numbers can never be reached.The main computational effort for this metric is the computation of square roots and exponentials.
Another possibility is to use the Bures-Wasserstein metric, which is given by Bhatia et al. ( 2019) where q = L p (X) denotes the Lyapunov operator, which solves pq+qp = X.The exponential map then reads where for both operations the computational effort is mainly to compute the Lyapunov operator.

Structure of the library
Manifolds.jl 1 is a registered Julia package.Its most recent version, currently 0.8.61, can be easily installed typing using Pkg; Pkg.add("Manifolds") in the Julia REPL and activated by typing using Manifolds afterwards.
The library of manifolds Manifolds.jl is based on the interface ManifoldsBase.jl 2 , whose current version is 0.14.5.This interface defines a common base for Riemannian manifolds in Julia.It hence serves two purposes.On one hand it can be used to implement additional manifolds like the ones in Manifolds.jl.On the other hand it can be used to define arbitrary functionality on arbitrary manifolds without depending on or loading the whole library of manifolds.
Both the library and the interface use GitHub Actions to automatically generate their documentation as well as run tests, following the philosophy of continuous integration.These tests are run on Linux and Mac OS with the latest (Julia 1.9), long-term-support (Julia 1.6), and nightly versions of Julia.For pull requests (PRs) and for commits on the master branch a format test is run to ensure that all source code follows the formatting rules using JuliaFormatter.jl 3 Using the JuliaRegistrator GitHub application and TagBot GitHub action, new releases are automatically registered to the registry and new GitHub release entries as well as documentation page generation is triggered.
Currently, the code coverage, i. e. percentage of code lines executed during testing, is 99.86 % for ManifoldsBase.jl and 98.93 % for Manifolds.jl,respectively.

Manifolds and functions thereon
A Riemannian manifold in Manifolds.jl is defined by introducing a subtype of AbstractManifold{F}, where F is the field the manifold is based on: the real numbers R, complex numbers C, or quaternions H.For example, creates the sphere S 2 ⊂ R 3 from 2.2, the space of complex-valued 2 × 3 matrices C 2×3 , and the set of 3 × 3 symmetric positive definite matrices P(3) from 2.3, respectively.
The interface provides a unified way to work with data defined on arbitrary manifolds, including those listed above.As a first example, manifold_dimension(M) returns the dimension of the manifold.The two functions is_point(M, p) and is_vector(M, p, X) can be used to check whether points or tangent vectors are valid.The last two have an optional positional parameter to activate throwing an error for invalid points or tangent vectors, which is false by default.Hence calling is_point(M, p, true) returns true if p is a point on M and otherwise throws an error describing why p is not a point on M.
For functions that return points or tangent vectors, interface functions have both in-place and allocating versions.By default, the allocating version allocates an output and then calls the in-place version.For example, for the exponential map, the function exp!(M, q, p, X) stores exp p X in the variable q.When the in-place variant is implemented for a new manifold, the allocating version exp(M, p, X) is automatically available.Calling exp(M, p, X) first allocates memory and then calls the in-place function mentioned above.The same holds for geodesic(M, p, X), which returns the function γ p,X .To evaluate the geodesic at a certain t one can call geodesic(M, p, X, t).In both cases, in-place variants are available as well.The same principle of in-place and allocating variants is also realized for log(M, p, q) and log!(M, X, p, q), which should return a deterministic result if the logarithmic map is not unique as for opposite points on the sphere.With the latter, the former is available with a default implementation as well as shortest_geodesic(M, p, q) (and its in-place variant) again with t as an optional further parameter, similar to geodesic.While all these default implementations are available, they can be overwritten if for example the shortest geodesic γ p;q can be computed more efficiently than by calling exp p (t log p q). Finally, the parallel transport PT q←p X is implemented as parallel_transport_to(M, p, X, q) and parallel_transport_to!(M, Y, p, X, q) in place of Y, respectively.
When using SymmetricPositiveDefinite(n), the affine-invariant metric is used by default.

Representation of points and vectors
Usually points p ∈ M and tangent vectors X ∈ T p M are not explicitly typed, and only the manifolds type is specified for dispatch.Most manifolds implicitly assume that both points and tangent vectors are instances of AbstractArray or at least have several functions like addition defined.This enables a great deal of flexibility in types to use for these parameters in a function, such as the statically sized arrays from StaticArrays.jl,which for small dimensions are stack-allocated.In two situations the points and tangent vectors are typed.If the representation requires more than one array, as is the case for the fixed-rank matrices, then a type is used to enforce this.Since the interface functions are not typed, features like the allocating variant of a function being available as soon as the in-place one is implemented still work.Another reason for using types is to distinguish different representations of points and tangent vectors on one manifold.For example, for the hyperbolic space H n , three representations are implemented: the hyperboloid model, the Poincaré ball model, and the Poincaré half-space model.These are distinguished by using different types for points and tangent vectors, such as PoincareBallPoint and PoincareBallTVector for points and tangent vectors in the Poincaré ball model.Using (plain) arrays on H n is equivalent to using HyperboloidPoint and HyperboloidTVector.

Retractions and vector transports
The exponential map exp p , the logarithmic map log p , and parallel transport PT q←p are computationally expensive or not given in closed form for some manifolds.For these cases one can specify subtypes of AbstractRetractionMethod, AbstractInverseRetractionMethod, or AbstractVectorTransportMethod and create objects of these subtypes, respectively denoted r, s, and v in the following example.The corresponding functions are called by retract(M, p, X, r), inverse_retract(M, p, q, s), and vector_transport_to(M, p, X, q, v).These methods also have in-place variants.On the sphere S 2 , for example, r = ProjectionRetraction() computes the retraction given by projecting p + X onto the sphere.
These three methods can be used to implement generic algorithms on manifolds.Since the exponential map might not be given in closed form, the interface provides the function default_retraction_method(M) to obtain the retraction that is most often used on a certain manifold.If the exponential map is available (and can be computed efficiently), this method returns ExponentialRetraction(), which lets retract fall back to using exp.This way algorithms can easily be modified to use different available retractions.The same scheme applies for inverse retractions and vector transports.

Features on manifolds using a decorator pattern
As mentioned above, the manifold M3 = SymmetricPositiveDefinite(3) implicitly uses the affine invariant metric.To change that, the interface ManifoldsBase.jl provides a traitbased decorator pattern.To decorate the manifold with the Bures-Wasserstein metric, we can declare This acts like a decorator for the manifold in the following sense: all methods that depend on the metric, like exp and log mentioned above or inner(M3b, p, X, Y) and norm(M3b, p, X), are implemented differently for this manifold.Other methods like manifold_dimension or is_point that do not depend on the metric are "passed down" to M3 automatically.
Traits can be seen as a list of properties "attached" to a manifold.For example, M2 = Euclidean(2, 3, C) has the property IsDefaultMetric(EuclideanMetric()), that is, using the manifold itself is completely equivalent to explicitly equipping the manifold with the same metric M2b = MetricManifold(M2, EuclideanMetric()).This formulation allows a manifold to be implemented with an implicit default metric without needing to directly implement a metric type or use MetricManifold.But one can still implement a second metric and decorate the manifold with it.
Similarly, a default embedding can be implicitly assumed, for example by implementing get_embedding(M), embed(M, p) and project(M, p).An embedding of M into a different manifold N can be represented by EmbeddedManifold(M, N) similarly to the metric above.For example, for the sphere M1 = Sphere(2) we have that get_embedding(M1) re-turns Euclidean(3), and embed(M1, p) is the identity function since points are represented by unit vectors.The projection project(M1, p) normalises p to norm 1.Also, for a Lie group, this implicit approach can be realized by implementing compose(M, p, q).The explicit decorator is the GroupManifold(M, op), where op is the Lie group operation.
As a final example of a decorator, we mention the ValidationManifold.For all default functions mentioned above, this manifold adds checks (is_point and is_vector) to their input and output values.While this somewhat reduces performance, using M1b = ValidationManifold(M1) provides an easy way to verify that none of the operations receive invalid inputs or produce invalid outputs.Here the IsExplicitDecorator property is set; that is, if no new method is implemented for the validation manifold M1b, the method for M1 is automatically called.
When implementing a new manifold, the properties or traits are set using the function active_traits.For example, for the sphere this is defined as function active_traits(f, ::AbstractSphere, args...) return merge_traits(IsIsometricEmbeddedManifold(), IsDefaultMetric( EuclideanMetric())) end which finally illustrates that setting IsIsometricEmbeddedManifold() specifies that the metric from the embedding is used and does not need to be implemented.Here the metric ⟨X, Y ⟩ p is computed when calling inner(M1, p, X, Y) as inner(get_embedding(M1), embed(M1, p), embed(M1, It does not need its own implementation.Since both embedding points and embedding tangent vectors are just the identity function for the sphere, this does not even introduce any overhead.Further, since all the information required for these dispatches is given by the type of the manifold, this dispatch-logic can be determined at compile time.It therefore does not yield any runtime overhead.
Note that the extent of support varies significantly across libraries.For example Manifolds.jlcurrently implements five different metrics (affine-invariant, Bures-Wasserstein, Log-Euclidean, Log-Cholesky, and generalized Bures-Wasserstein) on the manifold of symmetric positive definite matrices, Geomstats only the first three ones and a Power-Euclidean one, Manopt only the first two, and Pymanopt only the first.
On the other hand, Manifolds.jlfocuses solely on implementing Riemannian manifolds, metrics, Lie groups, and functions thereon.Geomstats further provides several statistical tools, and Manopt and Pymanopt further focus on optimisation algorithms, such as a gradient descent scheme.For optimisation on manifolds in Julia, we refer to Manopt.jl(Bergmann, 2022), which, like Manopt and Pymanopt, focuses also on nonsmooth optimi- Lie groups and actions

✓ ✓ ✓
Logarithmic maps (inverse retractions) sation algorithms on manifolds.For that reason, Table 1 compares features provided by the libraries that focus on implementations of manifolds but not statistics and/or optimisation.All packages but ROPTLIB provide support for several Automatic Differentiation (AD) backends to compute Euclidean gradients and Hessians of functions defined in the embedding of a manifold and to convert them to Riemannian variants afterwards.

Examples
This section demonstrates a few examples of composing generic operations from Manifolds.jlwith the Julia ecosystem (Bezanson et al., 2017).The first example defines a Bézier spline on a generic manifold to demonstrate how to easily define new functions on manifolds using ManifoldsBase.jl.Then, Manopt.jl 4 (Bergmann, 2022) is used as an example of a complex optimization library built on top of Manifolds.jl.Further, it is demonstrated how Manifolds.jlcan be easily combined with independently developed libraries to implement fast, geometry-aware algorithms.
Note that all examples can be split into two components.First, a manifold and data on the manifold are constructed.Second, the algorithm is executed on this manifold.Since all methods are implemented in a generic way, switching to data from another manifold just requires to change the first part of introducing the manifold and loading/generating corresponding data.
Each of the examples illustrates one feature original to Manifolds.jl.While all of them are implemented agnostic of a specific manifold at hand, using a recursion by dispatch is unique to Julia.An optimisation problem can be set up in three lines, defining the cost and the gradient as inline functions.The last example illustrates the generic implementation of 4. Current version 0.3.39,available at https://manoptjl.org bases of tangent spaces.These can even be abstract in the sense that no tangent space basis vectors are stored and they are only allocated if needed.

Implementing functions on manifolds
Functions on manifolds dispatch on the manifold itself, which is included as the first argument.As an example, we implement a Bézier curve on an arbitrary manifold.Bézier curves are used to model curves and surfaces in computer graphics.Their shape is determined by so called control points.The curve starts in the first control point and ends in the last.For more details on their use on manifolds, see for example (Absil et al., 2016;Gousenbourger et al., 2018;Bergmann and Gousenbourger, 2018).
A Bézier curve can be defined using the De Casteljau algorithm.Let us first consider the classical Euclidean case: given a degree n and n + 1 (control) points x 0 , . . ., x n ∈ R n , the Bézier curve b n (t; x 0 , . . ., x n ) is defined recursively as A Bézier curve b 1 (t; x 0 , x 1 ) of order 1 is just a line.To evaluate a Bézier curve b 2 (t; x 0 , x 1 , x 2 ) at a point t, we would need three line evaluations: On a Riemannian manifold M we obtain a Bézier curve by evaluating shortest geodesics γ(t; p, q) connecting p, q ∈ M instead of line segments.
The implementation for an arbitrary Riemannian manifold is given by the following code and returns the Bézier curve evaluated at t. function bezier(M::AbstractManifold, t, pts::NTuple) p = bezier(M, t, pts[1:(end -1)]) q = bezier(M, t, pts[2:end]) return shortest_geodesic(M, p, q, t) end function bezier(M::AbstractManifold, t, pts::NTuple{2}) For four points on the sphere given by and an evaluation of b 3 at t = 2 3 all shortest geodesics involved in the evaluation are shown in Figure 2.This figure has been adapted from (Bergmann and Gousenbourger, 2018, Figure 9).

Optimization using Manopt.jl
Now we consider the generalization of the mean to Riemannian manifolds.The mean of some (Euclidean) data x 1 , . . ., x N ∈ R n can be computed as One can interpret this formula as the solution of the first-order optimality condition of the optimization problem That is, x is the point y that minimizes the sum of squared distances, or equivalently, the variance.This interpretation can be generalized to manifolds, but usually no closed form for the minimizer x exists.Still, replacing the squared distance with the squared Riemannian distance, the gradient can easily be computed to perform a gradient descent algorithm with a suitable line search strategy, cf. for example (Absil et al., 2008, Chapter 4).For given points p 1 , . . ., p N ∈ M the cost function and the gradient read This algorithm gradient_descent is implemented in Manopt.jl.Furthermore, a library of cost functions, gradients, differentials, adjoint differentials, as well as proximal maps is available in the package as well.They are implemented on arbitrary manifolds based only on ManifoldsBase.jl and can hence be used to design an optimization problem even independent of the manifold.One example that we also require here is the already mentioned gradient of the (exponentiated) distance function distance(M,p,q)ˆs with respect to q, where s=2 is the default.This function is called grad_distance(M, p, q, s) in Manopt.jl.
Note that since Manopt.jl is based on the lightweight interface ManifoldsBase.jl, we need to load Manifolds.jlonly to have our manifold of interest available.

Tangent space PCA
Thanks to the high composability of the Julia ecosystem and the design of the Manifolds.jllibrary, it is very easy to compose independent libraries to achieve new functionality.For example, tangent space principal component analysis (PCA) described by Fletcher et al. (2004) can be split into 1) computing the Riemannian centers of mass and coordinates of tangent vectors using Manifolds.jland 2) calculation of PCA vectors using MultivariateStats.jl.Using a d-dimensional manifold, tangent vectors might be stored in a format different from the d-dimensional vector.The interface ManifoldsBase.jl provides c = get_coordinates(M, p, X, B) to obtain a d-dimensional representation and X = get_vector(M, p, x, B) to reconstruct the tangent space format.For example tangent vectors from T p S 2 are stored as vectors X orthogonal to p (in R 3 ) and they have a representation in 2 coordinates.Both these functions require a basis B, which can be a full set of vectors in a CachedBasis or a memory-efficient type to represent a deterministic function of a DefaultOrthonormalBasis to not store the basis explicitly.The whole procedure is demonstrated by the following code.# compute their mean m = mean(M, pts) # compute logarithmic maps of points at their mean vectors = map(p -> log(M, m, p), pts) # choose a basis of the tangent space at point m basis = DefaultOrthonormalBasis() # compute coordinates of tangent vectors in the selected basis coords = map(X -> get_coordinates(M, m, X, basis), vectors) # flatten the array of coordinates to a matrix coords_red = reduce(hcat, coords) z = zeros(manifold_dimension(M)) # compute the first principal vector assuming zero mean model = fit(PCA, coords_red; maxoutdim=1, mean=z) # get the tangent vector corresponding to the first principal vector X = get_vector (M, m, reconstruct(model, [1.0]), basis)
The second set of manifolds are the special orthogonal group SO(n), the symmetric positive definite matrices P(n) with the affine invariant metric, and for a high dimensional test the power manifold (P(n)) 128×128 , where in all three cases we used n = 2, 3, 4, 8, 16, 32.
Here again, ROPTLIB only provides an exponential map for the last two manifolds.The power manifold as well as a distance on SO(n) is not available in Geomstats and GeoOpt.

General setup
A computer running Linux Mint 20 Ulyana with Intel(R) Core(TM) i7-9700KF CPU @ 3.60GHz CPU and 32 GB of RAM was used to benchmark all libraries.In particular, no limitations were imposed on the number of threads.
The benchmarking was performed using points generated randomly by Manifolds.jl.Each benchmarked operation was performed using 100 pairs of these points 10 N times in a loop written in the language native to the library.The number N was chosen to be large enough to ensure that the entire benchmarking loop takes at least one tenth of a second to reduce the influence of system clock inaccuracy.Results of operations were accumulated in an array, which incurs a small performance penalty but prevents the compiler from optimizing out an otherwise unused computation.Details of benchmarks specific to particular libraries and the settings used are given in Appendix B.

Discussion
The results for the first experiment are shown in Figure 3. On all three manifolds, Manifolds.jlperforms the fastest for the smaller dimensions n = 2, 3, 4, 8, 16, and for example in the sphere for n = 3 ROPTLIB performs the exponential map (bottom center bar plot) only a factor of 2 slower.This is likely due to the more sophisticated memory management and use of specialized linear algebra packages in Manifolds.jl.For larger dimensions, e. g. for the manifold M = R 2 10 , Pymanopt computes the logarithmic map about 1/6 faster.Similarly for n = 32, ROPTLIB computes the exponential map on the sphere S n about 33% faster than Manifolds.jland on R n about four times faster.For very high dimensions, RiemOpt with its TensorFlow backend is faster by a factor of up to 20 on most functions, and just for the hyperbolic manifold, our distance function is a factor of 4 faster.For such large dimensions, RiemOpt benefits from advanced automatic transformations available in the Tensorflow backend that cannot yet be easily replicated in Julia.
For the second experiment, the results are shown in shown in Figure 4. Compared to ROPTLIB, we see the same effect: often the Julia package is a factor of at least 2 faster.Only for the exponential map on P(n) ROPTLIB is a factor of 2 faster, probably again due to different linear algebra library or memory management.
Concerning RiemOpt, again, only for the very high-dimensional power manifold, this Tensorflow based package is usually a factor of approximately 10 faster.This starts already for the case n = 3, where Manifolds.jl is a factor of 2 slower for exponential and logarithmic maps but a factor of 5 faster for distance; all other packages are at least a factor of 4-5 slower, while Manopt seems to be the slowest by several orders of magnitude, probably due to the use of cell arrays in Matlab.
For the highest dimension n = 32 and the power manifold, both Pymanopt and Manopt reach the same order of magnitude as Manifolds.jland Riemopt, where the latter is still a factor of 5 faster.
Manifolds.jl is the fastest package overall -on par with ROPTLIB wherever it provides an implementation -and only significantly surpassed by Riemopt for some large-scale manifolds of dimensions of more than 2 16 .
We further compared the accuracy of the methods run in the benchmark.We first computed reference results using Manifolds.jlwith BigFloat as number type and both GenericLinearAlgebra.jl and GenericSchur.jlfor the more precise linear algebra routines.We then performed the accuracy test against these exact results for all above benchmarked functions and packages.The full results are given in Appendix A.
Overall, Manifolds.jllies among the best performing ones here as well.Only for the symmetric positive definite matrices P(n) -more precisely the logarithmic and exponential map thereon -our library is slightly less accurate than others.The PyTorch backend seems to only reach a significantly lower accuracy, apparently due to an internal conversion to 32-bit floating point representation.

Conclusions
Manifolds.jlprovides a comprehensive API and set of features for working with points on manifolds.It also offers one of the largest collections of implementations of manifolds.Using this API, Manopt.jlprovides a library of algorithms for optimization on manifolds, and ManifoldDiffEq.jloffers a variety of solvers for ordinary differential equations.As a result, all manifolds defined with ManifoldsBase.jl are compatible with both libraries and can directly be used therein.
One prominent advantage of Manifolds.jlover similar packages is that, by using Julia, the "two-language-barrier" is overcome while keeping the code concise and expressive.Implementations in Manifolds.jlwere demonstrated to be superior in performance to those in libraries with interfaces in Python and Matlab, where only for large scale manifolds Riemopt outperforms Manifolds.jlslightly.The library is approximately twice as fast as ROPTLIB, though that is probably due to the linear algebra library.
However, this high performance did not require sacrificing clarity of the implementations themselves, the ease of use of a dynamic programming language, or computational accuracy.
Future work on Manifolds.jlwill improve integration of Julia's automatic differentiation packages, add support for GPU acceleration, and expand the set of available manifolds and operations.Other plans include support for more statistical and machine learning methods.

Figure 2 :
Figure 2: Illustration of the De Casteljau algorithm to evaluate a Bézier curve on the Sphere S 2 .The initial points and their shortest geodesics (blue) yield three points (cyan) and two points (green) during the recursion, and their connecting geodesic the point on the Bézier curve (orange).

Figure 4 :
Figure 4: Benchmark of distance (left), exponential maps (middle), and logarithmic maps (right) on the manifolds of rotations (top), manifold of symmetric positive definite matrices (middle), and its power manifold (bottom).The different colors correspond to the different software packages, where for Geomstats additionally the different line styles refer to different backends.

Figure 6 :
Figure 6: Mean absolute error of distance (left), exponential maps (middle), and logarithmic maps (right) on the manifolds of rotations (top), manifold of symmetric positive definite matrices (middle), and its power manifold (bottom).The different colors correspond to the different software packages, where for Geomstats additionally the different line styles refer to different backends.

Table 1 :
Comparison of manifolds available in related libraries.A manifold is marked as available if at least a retraction is implemented using a generic interface.The fields R , C , H are meant in increasing order, indicating that a manifold is not available.a this manifold is available for arrays with any number of indices.b only the two-dimensional manifold is available.

Table 2 :
Comparison of features available in related libraries.