A Statistical Analysis of XX vs XY Chromosomes Advantage in Olympic Boxing
Introduction
Now that all the 2024 Olympic boxing matches are done and the gold medals have been handed out, the data can be usefully analyzed to get an idea of how much advantage the possession of XY chromosomes may have over XX chromosomes in boxing or similar strength-oriented sports.
First off, let’s acknowledge that the specifics of this situation are still being debated. The details of the circumstances are rather murky and the cases are deeply embroiled in controversies about gender and sex. But below is a quick sketch of what seems to be the issue, based on the claims that have been made by the various sides of the debate. There are probably other nuances, but this is the general idea.
The two boxers in question were born and raised as girls and think of themselves as such. However, the general consensus seems to be that they have XY chromosomes, but underwent female development in the womb, rather than male development, as would almost always be the case. These situations do arise in a small percentage of pregnancies (probably on the order of one in ten thousand, but these numbers are also disputed).
This condition results in the outward forms of female biology, but the possession of XY genetics results in post-puberty development that would normally be associated with males. These XY triggered hormonal changes are assumed to have given these women substantial physical advantages over XX chromosome women. Those advantages are mostly related to more and stronger muscles as well as denser bone structure, similar to what men develop. Both of those factors would be expected to be significant advantages in a sport like boxing.
Analysis 1
Each of the two XY women boxers faced four XX opponents in their respective weight divisions, with each winning all four of their matches. A bit of probability theory along with some simulation data can thereby be used to estimate the minimum advantage that the XY chromosomes confer.
This is a pretty interesting natural experiment, as the sorting of participants into weight categories basically controls for the factor of greater overall muscle mass that is usually the case when women and men have physical confrontations. It is also useful that there are two women boxers that are said to have this XY chromosome condition. That makes the result more robust, as it reduces the probability that some other cause was producing the wins (it is unlikely that both boxers would share some unknown cause).
An initial analysis, based on very basic probability theory, follows:
-
Assume that the probability of XY Boxer 1 winning a single match is P1.
-
Calculate the probability of winning 4 consecutive matches, which is P1All = P1*P1*P1*P1.
-
Do the same for Boxer 2, calling the single-match win probability P2.
-
Calculate the probability of Boxer 2 winning 4 straight, which is P2All = P2*P2*P2*P2.
-
Calculate the probability of Boxer1 AND Boxer2 winning their respective categories, in 4 straight wins. P1_2_All = P1All * P2All
-
Repeat the exercise, using a different assumed set of probabilities.
-
Create a table and graph with the results for the entire set of assumptions.
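The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the original calculation: the function name `sweep_probabilities` is made up for this example, both boxers are given the same assumed single-match probability in each row, and the matches are assumed to be independent of one another.

```python
# Single-match win probabilities to try (assumed values, one scenario per row).
single_match_probs = [0.50, 0.60, 0.70, 0.80, 0.90, 0.95]
MATCHES_PER_BOXER = 4


def sweep_probabilities(p1, p2, n_matches=MATCHES_PER_BOXER):
    """Return (P1All, P2All, P1_2_All), assuming independent matches."""
    p1_all = p1 ** n_matches          # Boxer 1 wins every match
    p2_all = p2 ** n_matches          # Boxer 2 wins every match
    return p1_all, p2_all, p1_all * p2_all


# One row per scenario: (single-match probability, P1All, P2All, P1_2_All).
rows = [(p, *sweep_probabilities(p, p)) for p in single_match_probs]

print(f"{'Win 1':>7} {'Boxer 1 x4':>11} {'Boxer 2 x4':>11} {'Both x8':>9}")
for p, p1_all, p2_all, both in rows:
    print(f"{p:>7.1%} {p1_all:>11.1%} {p2_all:>11.1%} {both:>9.1%}")
```

Passing different values for `p1` and `p2` would cover the case where the two boxers are not assumed to be equally favored.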
Below are the results in graph and table form. Here are some comments:
-
If we assume that the prior probability of an XY boxer winning any given match is 50% (i.e. the XY factor has no effect, positive or negative), then the likelihood of both boxers with the condition winning all of their matches, simply by chance, is low, at about 1/256 or 0.4%. This is similar to the case of two people flipping fair coins at the same time and each getting 4 heads in a row. It isn’t impossible, but you would have serious doubts about the coins.
-
We might think of the converse of the above. If XY chromosomes confer no advantage, we would expect at least 1 loss, with 99% confidence in our expectation.
-
To use statistical jargon, if we had a null hypothesis that XY chromosomes confer no advantage, we would have to reject the null hypothesis at the 99% confidence level (the chance of an 8 match sweep under the null is about 0.4%). That would mean favoring the alternative hypothesis, that the XY chromosomes do confer an advantage. The size of that advantage is a separate question.
-
If we assumed that the prior probability of an XY boxer winning any given match against an XX boxer is 80%, then we would be much less surprised that these boxers both took home gold. In fact, we would expect each boxer to sweep all four matches about 41% of the time, and both to do so about 17% of the time, if the experiment were to be repeated in another Olympics.
-
As the prior probability goes up, we would be less and less surprised by the two gold medal result.
Probability of Winning All 4 Matches, Given a Single Match Probability

| Boxer 1: Win 1 Match | Boxer 1: Win All 4 Matches | Boxer 2: Win 1 Match | Boxer 2: Win All 4 Matches |
| 50.0% | 6.3% | 50.0% | 6.3% |
| 60.0% | 13.0% | 60.0% | 13.0% |
| 70.0% | 24.0% | 70.0% | 24.0% |
| 80.0% | 41.0% | 80.0% | 41.0% |
| 90.0% | 65.6% | 90.0% | 65.6% |
| 95.0% | 81.5% | 95.0% | 81.5% |

Prob of Boxer 1 AND Boxer 2 Winning All 4 Matches

| Win 1 Match | Win All 4 Matches |
| 50.0% | 0.4% |
| 60.0% | 1.7% |
| 70.0% | 5.8% |
| 80.0% | 16.8% |
| 90.0% | 43.0% |
| 95.0% | 66.3% |

[Graph series: Single Boxer (4 matches), Both Boxers (8 matches)]
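The same arithmetic can be run in reverse to put a number on the minimum advantage: for a chosen significance level, find the smallest single-match win probability under which an 8 match sweep would still happen at least that often. The sketch below is an illustration, not part of the original analysis. It assumes independent matches and the same win probability for both boxers, and the function name `min_single_match_prob` is hypothetical. For an all-wins outcome this is the usual one-sided exact lower bound on a binomial proportion.

```python
def min_single_match_prob(alpha, n_wins=8):
    """Smallest p for which p ** n_wins >= alpha (one-sided lower bound)."""
    return alpha ** (1.0 / n_wins)


for alpha in (0.05, 0.01):
    p_min = min_single_match_prob(alpha)
    print(f"alpha = {alpha:.2f}: single-match win probability of at least {p_min:.3f}")
```

At the 5% level this gives a lower bound of roughly 0.69 for the single-match win probability, and at the 1% level roughly 0.56. Both are above the 50% "no effect" value, and neither says anything about an upper bound.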
Analysis 2
Another way of analyzing the data is to use Monte Carlo simulations. This method takes certain probability assumptions, generates random data with those assumptions and gives results for that run of the simulation. That process can be repeated for a large number of runs (in a computer) to see the distribution of likely outcomes. The actual outcome can then be compared against the various sets of assumptions (scenarios) to see which best fits the actual data.
In this case:
-
Matches were created between XY female boxers and XX boxers, 4 per XY boxer, 2 such boxers per trial (this would be equivalent to an “Olympic Tournament”, such as in 2024, but focussing only on the two categories with XY female participants).
-
The XY and XX boxers were given power scores on a variable that might be thought of as muscle power, in some arbitrary units. The scores were assigned from a simulated normal distribution.
-
To start, the mean for both XY and XX opponents was set to 100, with the standard deviation set to 10. The computer created a random number drawn from a normal distribution with these statistical characteristics.
-
The winner of each match was the participant with the higher score.
-
The XY vs XX status of the winner was recorded.
-
The number of XY winners for each trial was counted.
-
The process was repeated for a new set of scores, varying the mean of the XY group for each run.
-
The difference between XY and XX mean muscle score is referred to as “XY Boost”.
-
The count of XY winners for each year was put into a table and graph, showing the distribution of XY win counts over a 200 ‘year’ run of the model.
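The procedure above can be sketched as follows. This is a minimal reconstruction under stated assumptions, not the code actually used for the runs: it assumes that both fighters draw a fresh, independent score for every match, that the XX mean is fixed at 100 with a standard deviation of 10 for both groups, and that the names `simulate_tournament` and `run_model` are made up for the example.

```python
import random
from collections import Counter

MATCHES_PER_BOXER = 4
XY_BOXERS = 2
XX_MEAN = 100.0
SD = 10.0


def simulate_tournament(xy_boost, rng):
    """Return the number of XY wins (0 to 8) in one simulated tournament."""
    xy_wins = 0
    for _ in range(XY_BOXERS * MATCHES_PER_BOXER):
        # Assumption: each fighter gets a new independent score per match.
        xy_score = rng.gauss(XX_MEAN + xy_boost, SD)
        xx_score = rng.gauss(XX_MEAN, SD)
        if xy_score > xx_score:
            xy_wins += 1
    return xy_wins


def run_model(xy_boost, n_tournaments=200, seed=0):
    """Tally XY win counts over n_tournaments simulated 'years'."""
    rng = random.Random(seed)
    return Counter(simulate_tournament(xy_boost, rng) for _ in range(n_tournaments))


for boost in (0, 10, 20, 30, 40):
    counts = run_model(boost)
    sweep_share = counts[8] / sum(counts.values())
    print(f"XY Boost {boost:>2}: share of 8-win sweeps = {sweep_share:.1%}")
```

With only 200 simulated years per run, the share of sweeps will wobble from seed to seed, so raising `n_tournaments` gives smoother histograms at the cost of a longer run.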
Below are the results in graph and table form (including the appendix). Here are some comments:
-
In Run 1, the XY and XX women are given equal scores for the latent muscle power variable. The result is a nice, roughly bell-shaped histogram, where the XY and XX participants were about equally likely to win matches. An 8 match sweep occurred in only 1% of the model runs. This is in line with what one would expect when running a model with these equal parameters (means = 100, SDs = 10).
-
In Run 2, the scores for the latent muscle power variable for the XY group are assigned to be 10% higher than the XX group. The histogram of XY winners is now shifted to the higher range. Sweeps of 8 XY wins are rather rare (about 10% of the time), but they do occur. This is in line with what one would expect when running a model with scores somewhat tilted towards the XY group (means = 110 XY and 100 XX, SDs = 10).
-
In Run 3, the scores for the latent muscle power variable for the XY group are assigned to be 20% higher than the XX group. The histogram is now shifted very prominently towards higher numbers of wins for the XY group (7s and 8s), and is very skewed. A full sweep of 8 wins for the XY group now occurs in about half of the simulations.
-
In Run 4, the scores for the latent muscle power variable for the XY group are assigned to be 30% higher than the XX group. The win count is now very skewed towards the XY group, with an 8 match sweep occurring about 85% of the time.
-
In Run 5, the scores for the latent muscle power variable for the XY group are assigned to be 40% higher than the XX group. The win count is now very skewed towards the XY group, with an 8 match sweep occurring 99% of the time, almost a sure thing.
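The simulated sweep rates can be sanity-checked with a closed-form calculation. The sketch below assumes the same setup as the runs (independent normal scores per match, equal standard deviations of 10) and uses a hypothetical helper name, `single_match_win_prob`. Under those assumptions the score difference is itself normal, with mean equal to the XY Boost and standard deviation 10 times the square root of 2, so the single-match win probability is a normal CDF value and the sweep probability is that value raised to the 8th power.

```python
import math


def single_match_win_prob(xy_boost, sd=10.0):
    """P(XY score > XX score) for independent normal scores with equal SDs."""
    # The score difference is normal with mean xy_boost and SD sd * sqrt(2).
    z = xy_boost / (sd * math.sqrt(2.0))
    # Standard normal CDF written in terms of the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for boost in (0, 10, 20, 30, 40):
    p = single_match_win_prob(boost)
    print(f"XY Boost {boost:>2}: single match = {p:.3f}, 8 match sweep = {p ** 8:.3f}")
```

Under these assumptions the sweep probabilities come out at roughly 0.4%, 11%, 52%, 87% and 98% for the five runs, close to the simulated shares described above. The remaining gaps are what one would expect from sampling noise in a 200 year run.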
Conclusion
This data does support the idea that the XY women (or whatever term is more appropriate in this situation) did have a significant advantage over the XX women in the Olympic Boxing competition. It would be useful to have more data and more transparent data. While it is technically possible to have a lucky streak of 8 wins in these matches, it is highly unlikely. It seems pretty clear that something systematic was contributing to their success. Or, to use a vernacular phrase - “that’s how the smart money bets.”
The size of the effect seems likely to be somewhere between 20% and 30%, though it could be more. With this data one can only establish a lower bound. Research on this subject that compares male muscle strength to female generally gives males a 50% edge (i.e. females are about two-thirds as strong, pound for pound). This data could support a claim of 50% advantage to XY females, but that seems unlikely, given the likely complicated biological and social factors at work that could reduce the “XY” effect.
Note that as someone who analyzes data (professionally and away from the job), I am making no ethical or sociological claims. I am just laying out the facts, as they seem to be, in this case.
-------------------------------------------------------------------------------------------
Here are some links to biologically based research on this subject:
Gender Differences in Strength
According to the Journal of Exercise Physiology, women generally produce about two-thirds the amount of total strength and applied force that men produce. Women are also physically built so that they generally carry two-thirds as much muscle mass as men.
This proves that there is, in fact, a difference in strength, that men are typically stronger, and that most of the difference is based on body size and muscle cross-sectional area alone.
It can be seen, however, that women tend to match the strength of men more closely in lower body muscles than in upper body muscles. For example, squats and lunges come easier to women than push-ups or pull-ups.
https://www.livestrong.com/article/509536-muscular-strength-in-women-compared-to-men/
Journal of Exercise Physiology online, Volume 19 Number 5
Male Relative Muscle Strength Exceeds Females for Bench Press and Back Squat
Estêvão Rios Monteiro (1), Amanda Fernandes Brown (1), Leonardo Bigio (1), Alexandre Palma (1), Luiz Gustavo dos Santos (1), Mark Tyler Cavanaugh (2), David George Behm (2), Victor Gonçalves Corrêa Neto (1, 3)
(1) School of Physical Education and Sports, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil; (2) School of Human Kinetics, Memorial University of Newfoundland, Canada; (3) Gama e Souza University, Rio de Janeiro, Brazil
RESULTS
Men were shown to have significantly greater relative force with both CP (P<0.001; 157.1%) and BS (P<0.001; 67.1%) when compared to women.
https://www.asep.org/asep/asep/JEPonlineOCTOBER2016_Monteiro_Bigio.pdf
----------------------------------------------------------------------------------------------------
Appendix
Run 1 – XY and XX Have Equal Muscle Power Scores
Run 2 – XY has 10% Higher Muscle Power Scores than XX
Run 3 – XY has 20% Higher Muscle Power Scores than XX
Run 4 – XY has 30% Higher Muscle Power Scores than XX
Run 5 – XY has 40% Higher Muscle Power Scores than XX