A Statistical Analysis of XX vs XY Chromosomes
Advantage in Olympic Boxing
Introduction
Now that all the 2024 Olympic boxing matches are done and the gold
medals have been handed out, the data can be usefully analyzed to get
an idea of how much advantage the possession of XY chromosomes may
confer over XX chromosomes in boxing or similar strength-oriented
sports.
First off, let’s acknowledge that the specifics of this
situation are still being debated. The details of the circumstances
are rather murky and the cases are deeply embroiled in controversies
about gender and sex. But below is a quick sketch of what seems to
be the issue, based on the claims that have been made by the various
sides of the debate. There are probably other nuances, but this is
the general idea.
The two boxers in question were born and raised as girls and think
of themselves as such. However, the general consensus seems to be
that they have XY chromosomes, but underwent female development in
the womb, rather than male development, as would almost always be the
case. These situations do arise in a small percentage of pregnancies
(probably on the order of one in ten thousand, but these numbers are
also disputed).
This condition results in the outward forms of female biology, but
the possession of XY genetics results in post-puberty development
that would normally be associated with males. These XY triggered
hormonal changes are assumed to have given these women substantial
physical advantages over XX chromosome women. Those advantages are
mostly related to more and stronger muscles as well as denser bone
structure, similar to what men develop. Both of those factors would
be expected to be significant advantages in a sport like boxing.
Analysis 1
Each of the two XY women boxers faced four XX opponents in their
respective weight divisions, with each winning all four of their
matches. A bit of probability theory along with some simulation data
can thereby be used to estimate the minimum advantage that the XY
chromosomes confer.
This is a pretty interesting natural experiment, as the sorting of
participants into weight categories basically controls for the factor
of greater overall muscle mass that is usually the case when women
and men have physical confrontations. It is also useful that there
are two women boxers that are said to have this XY chromosome
condition. That makes the result more robust, as it reduces the
probability that some other cause was producing the wins (it is
unlikely that both boxers would share some unknown cause).
An initial analysis, based on very basic probability theory,
follows:
- Assume that the probability of XY Boxer 1 winning a single match
  is P1.
- Calculate the probability of Boxer 1 winning 4 consecutive
  matches, which is P1All = P1*P1*P1*P1 (this treats the matches as
  independent of one another).
- Do the same for Boxer 2, calling that single-match probability P2.
- Calculate the probability of Boxer 2 winning 4 straight:
  P2All = P2*P2*P2*P2.
- Calculate the probability of Boxer 1 AND Boxer 2 each winning
  their respective categories in 4 straight wins:
  P1_2_All = P1All * P2All.
- Repeat the exercise, using a different assumed set of
  probabilities.
- Create a table and graph with the results for the entire set of
  assumptions.
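The steps above can be sketched in a few lines of Python. This is a minimal sketch, not the tool actually used to produce the table: the function name `sweep_probability` and the list `assumed` are illustrative, both boxers are assumed to share the same single-match probability, and the matches are treated as independent.

```python
def sweep_probability(p_single, n_matches=4):
    """Probability of winning n_matches in a row, assuming independent
    matches that each have win probability p_single."""
    return p_single ** n_matches

# Assumed single-match win probabilities (illustrative values).
assumed = [0.50, 0.60, 0.70, 0.80, 0.90, 0.95]

rows = []
for p in assumed:
    p1_all = sweep_probability(p)   # Boxer 1 wins 4 straight
    p2_all = sweep_probability(p)   # Boxer 2 wins 4 straight (same p assumed)
    p1_2_all = p1_all * p2_all      # both boxers sweep: 8 wins in total
    rows.append((p, p1_all, p1_2_all))

for p, one, both in rows:
    print(f"{p:6.1%}  one boxer: {one:6.1%}  both boxers: {both:6.1%}")
```

Giving the two boxers different probabilities only requires passing a different value to the second `sweep_probability` call.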
Below are the results in graph and table form. Here are some
comments:
- If we assume that the probability of an XY boxer winning any given
  match is 50% (i.e. the XY factor has no effect, positive or
  negative), then the likelihood of both boxers with the condition
  winning all of their matches, simply by chance, is low, at about
  1/256 or 0.4%. This is similar to the case of two people flipping
  fair coins at the same time and each getting 4 heads in a row. It
  isn’t impossible, but you would have serious doubts about the
  coins.
- We might think of the complement of the above. If XY chromosomes
  confer no advantage, we would expect at least 1 loss in the 8
  matches, with better than 99% confidence in our expectation.
- To use statistical jargon, if we had a null hypothesis that XY
  chromosomes confer no advantage, we would have to reject the null
  hypothesis at the 99% confidence level (the one-sided p-value is
  about 0.004). That would mean accepting the alternative hypothesis,
  that the XY chromosomes do confer an advantage.
- If we assumed that the probability of an XY boxer winning any
  given match against an XX boxer is 80%, then we would be much less
  surprised that these boxers both took home gold. Each boxer would
  sweep her 4 matches about 41% of the time, and both would do so
  about 17% of the time, or roughly once in every six Olympics if the
  experiment were repeated. At 90%, the double sweep would occur a
  bit less than half the time.
- As the assumed probability goes up, we would be less and less
  surprised by the two gold medal result.
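The lower-bound reasoning in these comments can be made concrete. If a double sweep (8 wins) is to have probability of at least some threshold alpha, the single-match probability p must satisfy p^8 >= alpha, so p >= alpha^(1/8). The sketch below computes that bound; the function name and the threshold values are illustrative assumptions, and it again assumes independent matches with one shared probability.

```python
def min_single_match_probability(alpha, n_matches=8):
    """Smallest single-match win probability p such that winning all
    n_matches independent matches has probability at least alpha."""
    return alpha ** (1.0 / n_matches)

# Illustrative "surprise" thresholds; which one to use is a judgment call.
for alpha in (0.01, 0.05, 0.10):
    p_min = min_single_match_probability(alpha)
    print(f"alpha = {alpha:.2f}: single-match probability of at least {p_min:.3f}")
```

With the conventional 5% threshold, the bound comes out at a single-match win probability of roughly 0.69. Values below that make the observed 8-0 record hard to attribute to chance.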
Probability of Winning All 4 Matches, Given a Single Match Probability

| Boxer 1: Win 1 Match | Boxer 1: Win All 4 Matches | Boxer 2: Win 1 Match | Boxer 2: Win All 4 Matches |
| 50.0% | 6.3% | 50.0% | 6.3% |
| 60.0% | 13.0% | 60.0% | 13.0% |
| 70.0% | 24.0% | 70.0% | 24.0% |
| 80.0% | 41.0% | 80.0% | 41.0% |
| 90.0% | 65.6% | 90.0% | 65.6% |
| 95.0% | 81.5% | 95.0% | 81.5% |

Probability of Boxer 1 AND Boxer 2 Winning All 4 Matches

| Win 1 Match | Both Win All 4 Matches |
| 50.0% | 0.4% |
| 60.0% | 1.7% |
| 70.0% | 5.8% |
| 80.0% | 16.8% |
| 90.0% | 43.0% |
| 95.0% | 66.3% |

[Graph: Single Boxer (4 matches)]
[Graph: Both Boxers (8 matches)]

Analysis 2
Another way of analyzing the data is to use Monte Carlo
simulations. This method takes certain probability assumptions,
generates random data with those assumptions and gives results for
that run of the simulation. That process can be repeated for a large
number of runs (in a computer) to see the distribution of likely
outcomes. The actual outcome can then be compared to the various
sets of assumptions (scenarios) to see which best fits the actual
data.
In this case:
- Matches were created between XY female boxers and XX boxers, 4 per
  XY boxer, 2 such boxers per trial (this would be equivalent to an
  “Olympic Tournament”, such as in 2024, but focusing only on the
  two categories with XY female participants).
- The XY and XX boxers were given power scores on a variable that
  might be thought of as muscle power, in some arbitrary units. The
  scores were assigned from a simulated normal distribution.
- To start, the mean for both XY and XX opponents was set to 100,
  with the standard deviation set to 10. The computer created a
  random number drawn from a normal distribution with these
  statistical characteristics.
- The winner of each match was the participant with the higher
  score.
- The XY vs XX status of the winner was recorded.
- The number of XY winners for each trial was counted.
- The process was repeated for a new set of scores, varying the mean
  of the XY group for each run.
- The difference between XY and XX mean muscle score is referred to
  as “XY Boost”.
- The count of XY winners for each simulated ‘year’ (trial) was put
  into a table and graph, showing the distribution of XY win counts
  over a 200 ‘year’ run of the model.
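The procedure above can be sketched with Python's standard library. This is not the original model code: the function names, the seed, and the choice to draw fresh scores for both participants in every match are assumptions made for illustration (the description does not say whether a boxer keeps one score for all her matches).

```python
import random
from collections import Counter

def simulate_tournament(xy_boost, rng, mean=100.0, sd=10.0,
                        boxers=2, matches_per_boxer=4):
    """Return the number of XY wins (0 to 8) in one simulated 'year'."""
    xy_wins = 0
    for _ in range(boxers * matches_per_boxer):
        # Assumption: new, independent scores are drawn for every match.
        xy_score = rng.gauss(mean + xy_boost, sd)
        xx_score = rng.gauss(mean, sd)
        if xy_score > xx_score:   # higher muscle power score wins
            xy_wins += 1
    return xy_wins

def run_model(xy_boost, years=200, seed=0):
    """Distribution of XY win counts over a run of simulated 'years'."""
    rng = random.Random(seed)
    return Counter(simulate_tournament(xy_boost, rng) for _ in range(years))

# One run per XY Boost value, 200 'years' each.
for boost in (0, 10, 20, 30, 40):
    counts = run_model(boost)
    sweep_share = counts[8] / sum(counts.values())
    print(f"XY Boost {boost:2d}: 8-0 sweeps in {sweep_share:.1%} of years")
```

Because each run has only 200 simulated years, the sweep percentages will wobble by a few points from seed to seed. Raising `years` smooths them out.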
Below are the results in graph and table form (including the
appendix). Here are some comments:
- In Run 1, the XY and XX women are given equal scores for the
  latent muscle power variable. The result is a nice, roughly
  bell-shaped histogram, where the XY and XX participants were about
  equally likely to win matches. An 8 match sweep only occurred in 1%
  of the model runs. This is in line with what one would expect when
  running a model with these equal parameters (means = 100, SDs =
  10), where the theoretical sweep rate is about 0.4%.
- In Run 2, the scores for the latent muscle power variable for the
  XY group are assigned to be 10% higher than the XX group. The
  histogram of XY winners is now shifted to the higher range. Sweeps
  of 8 XY wins are rather rare (about 10% of the time), but they do
  occur. This is what one would expect when running a model with
  scores somewhat tilted towards the XY group (means = 110 XY and 100
  XX, SDs = 10).
- In Run 3, the scores for the latent muscle power variable for the
  XY group are assigned to be 20% higher than the XX group. The
  histogram is now shifted very prominently towards higher numbers of
  wins for the XY group (7s and 8s), and is very skewed. A full sweep
  of 8 wins for the XY group now occurs in about half of the
  simulations.
- In Run 4, the scores for the latent muscle power variable for the
  XY group are assigned to be 30% higher than the XX group. The win
  count is now very skewed towards the XY group, with an 8 match
  sweep occurring about 85% of the time.
- In Run 5, the scores for the latent muscle power variable for the
  XY group are assigned to be 40% higher than the XX group. The win
  count is now very skewed towards the XY group, with an 8 match
  sweep occurring 99% of the time, almost a sure thing.
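The sweep rates in these runs can be cross-checked without simulation. Assume, as an interpretation of the model rather than something stated outright, that every match uses independent scores with SD 10. The difference between the XY and XX scores is then normal with mean equal to the XY Boost and SD 10*sqrt(2), so the single-match XY win probability is Phi(boost / (10*sqrt(2))). The function names below are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def xy_win_probability(xy_boost, sd=10.0):
    """P(XY score > XX score) for independent normal scores with equal SD.
    The score difference has mean xy_boost and SD sd * sqrt(2)."""
    return NormalDist().cdf(xy_boost / (sd * sqrt(2)))

def sweep_probability(xy_boost, n_matches=8, sd=10.0):
    """Probability that the XY boxers win all n_matches independent matches."""
    return xy_win_probability(xy_boost, sd) ** n_matches

for boost in (0, 10, 20, 30, 40):
    p = xy_win_probability(boost)
    print(f"XY Boost {boost:2d}: single match {p:.3f}, "
          f"8-0 sweep {sweep_probability(boost):.3f}")
```

Under these assumptions the theoretical sweep rates come out at roughly 0.4%, 11%, 52%, 87% and 98% for boosts of 0 through 40. Those are close to the simulated figures, with the gaps being about what sampling noise in a 200 'year' run would produce.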
Conclusion
This data does support the idea that
the XY women (or whatever term is more appropriate in this situation)
did have a significant advantage over the XX women in the Olympic
Boxing competition. It would be useful to have more data and more
transparent data. While it is technically possible to have a lucky streak
of 8 wins in these matches, it is highly unlikely. It seems pretty
clear that something systematic was contributing to their success.
Or, to use a vernacular phrase - “that’s how the smart money
bets.”
The size of the effect seems likely
to be somewhere between 20% and 30% (in terms of the simulated muscle
power score), though it could be more. With
this data one can only establish a lower bound. Research on this
subject that compares male muscle strength to female generally gives
males a 50% edge (i.e. females are about two-thirds as strong, pound
for pound). This data could support a claim of a 50% advantage to XY
females, but that seems unlikely, given the likely complicated
biological and social factors at work that could reduce the “XY”
effect.
Note that as
someone who analyzes data (professionally and away from the job), I
am making no ethical or sociological claims. I am just laying out the
facts, as they seem to be, in this case.
-------------------------------------------------------------------------------------------
Here are some links to biologically based research on this subject:

Gender Differences in Strength
According to the Journal of Exercise Physiology, women generally produce about two-thirds the amount of total strength and applied force that men produce. Women are also physically built so that they generally carry two-thirds as much muscle mass as men.
This proves that there is, in fact, a difference in strength, that men are typically stronger, and that most of the difference is based on body size and muscle cross-sectional area alone.
It can be seen, however, that women tend to match the strength of men more closely in lower body muscles than in upper body muscles. For example, squats and lunges come easier to women than push-ups or pull-ups.
https://www.livestrong.com/article/509536-muscular-strength-in-women-compared-to-men/

Journal of Exercise Physiology online
Volume 19 Number 5
Male Relative Muscle Strength Exceeds Females for Bench Press and Back Squat
Estêvão Rios Monteiro (1), Amanda Fernandes Brown (1), Leonardo Bigio (1), Alexandre Palma (1), Luiz Gustavo dos Santos (1), Mark Tyler Cavanaugh (2), David George Behm (2), Victor Gonçalves Corrêa Neto (1, 3)
(1) School of Physical Education and Sports, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil; (2) School of Human Kinetics, Memorial University of Newfoundland, Canada; (3) Gama e Souza University, Rio de Janeiro, Brazil
RESULTS
Men were shown to have significantly greater relative force with both CP (P<0.001; 157.1%) and BS (P<0.001; 67.1%) when compared to women.
https://www.asep.org/asep/asep/JEPonlineOCTOBER2016_Monteiro_Bigio.pdf
----------------------------------------------------------------------------------------------------
Appendix
Run 1 – XY and XX Have Equal Muscle Power Scores
Run 2 – XY has 10% Higher Muscle Power Scores than XX
Run 3 – XY has 20% Higher Muscle Power Scores than XX
Run 4 – XY has 30% Higher Muscle Power Scores than XX
Run 5 – XY has 40% Higher Muscle Power Scores than XX