You're organising a conference of operations research analysts from all over the world, but their time is very valuable and they only agree to meet if you minimise the average distance they need to travel (even if they have to have it on a boat in the middle of the ocean). Where do you put the conference?

Let's model the world as a unit sphere in 3 dimensional space, and have the N people at cartesian coordinates \( \{ p_i\}_{i=1}^{N}\). Then the point c of minimum average distance, the spherical centroid (or spherical Fréchet mean), is given by the forumula:

\[ c = k \sum_{i=1}^{N} \frac{p_i}{\sqrt{1 - (c \cdot p_i)^2}} \]

Where k is a normalising constant so that c lies on the unit sphere. This is like the geometric centroid of the points, but weighted based on the similarity of the centroid to those points.

The equation doesn't solve for c since it's on both sides of the equation, but can be evaluated iteratively. Start with a reasonable c (even though it's not on the sphere, c = 0 actually works perfectly). Then iteratively update c by using your current estimate of c on the right hand side of the equation. Stop when this converges to a centroid.

Context

I found a few discussions of this problem but no actual solution. This question was inspired by a Notion Parrallax blog post. In there he suggests a few methods but he doesn't actually solve it. There's a GIS Stackexchange question on Calculating a Spherical Polygon Centroid. One of the answers references the paper Spherical Averages and Applications to Spherical Splines and Interpolation, by Buss and Fillmore, ACM Transactions on Graphics 20, 95126 (2001). However the paper gives a really complex derivation of what is essentially gradient descent (and they forget to multiply by a small step size) and I found it hard to follow. But this is very similar to finding the centroid of cosine similarity and we can take the same approach.

Solution

The extrema of average distance can be found using the method of Lagrange multipliers. The total distance to a point c is given by \( d(c) = \sum_{i=1}^{N} \arccos\left( c \cdot p_i \right) \), and this is extremised at the same points that the average distance is (since they just differ by a constant factor of N). Then the lagrangian that constrains c to the unit sphere is given by \( L(c, \lambda) = \sum_{i=1}^{N} \arccos\left( c \cdot p_i \right) + \lambda (1 - c \cdot c) \).

Taking the partial derivatives (using arccos derivative) we get (where \( p_{ij} \) is a clumsy notation meaning the jth coordinate of the ith point):

\[ \frac{\partial L}{\partial C_j}(c, \lambda) = - \sum_{i=1}^{N} \frac{p_{ij}}{\sqrt{1 - (c \cdot p_i)^2}} - 2 \lambda c_j \]

\[ \frac{\partial L}{\partial \lambda}(c, \lambda) = 1 - c \cdot c \]

The extrema occur where these are 0, and in particular where \( c = k \sum_{i=1}^{N} \frac{p_i}{\sqrt{1 - (c \cdot p_i)^2}} \) where k is a constant so that the second normalisation constraint holds.

What's not obvious to me is when this point will be the minimum, and when calculating it iteratively will converge. There are going to be issues when the points are symmetric; for example for two antipodal points the entire great circle on the plane orthogonal to them will be minimum, and the formula blows up. But if we slightly perturb the points then in general there should be a unique minimum distance point. However there could also be a maximum distance point (since the sphere is bounded) and that would also satisfy the equation. I would expect if we start close to the minimum then the iterative process will converge to the minimum, and that the projection of the geometric average (what you get if you set c = 0 initially) will be almost always close to the minimum. I don't have proof on this, but a few computations suggests that the approach can work well in practice.

To see a concrete implementation of this in code, have a look at the post calculating centroid on a sphere.