Are Athletes Great for Longer?

Tom Brady just won the Super Bowl again.

Previously, you could have attributed his streak to the Patriots as a team. But then he left for the Buccaneers, who had not won a Super Bowl since 2002, and immediately won again. He pulled Rob Gronkowski, his former Patriots teammate, out of retirement, so maybe it’s the two of them together, but still.

It’s even more insane when you remember that Tom Brady’s first Super Bowl win was in 2002. Was he some rookie back then, nominally on the team but not really contributing to the victory? Not at all: just as in 2021, he was named MVP.

That makes for a 19-year reign, spanning 10 Super Bowl appearances, 7 wins, and 5 MVP awards.

It’s not just football. In other sports, we have:

  • Roger Federer: First won Wimbledon in 2003, last won in 2017, and still ranked #5 in the world in 2021.
  • Serena Williams: First won Wimbledon in 2002, last won in 2016, currently #11.
  • Magnus Carlsen: First ranked #1 in 2010, still ranked #1 in 2021.
  • Tiger Woods: First won Masters in 1997, last won in 2019. Ranked #1 for 683 weeks.

Is this weird? It should be. It’s not just that a 43 year old man is the best athlete in a full-contact sport, it’s that athletes are getting much better across the board, but still can’t supplant last decade’s champions.

Consider the progression in men’s marathon times:

Even more dramatic, here’s the progression in men’s 100 meter times:

It makes sense that humanity is getting better at sports. Presumably, we’re improving at sports nutrition, medicine, and coaching, but also just have more humans and better talent selection. Over time, each generation should be better than the last.

So why can’t anyone beat Tom Brady?

Maybe he really is just uniquely good, but you couldn’t say the same of Federer in tennis. Rafael Nadal has also been among the world’s best from 2005 (first French Open win) until today (ranked #2, won the French Open again in 2020). Djokovic, currently ranked #1, first achieved the ranking in 2011, and first won a major Open (the Australian) in 2008.

First, it’s worth asking if this is actually a recent phenomenon. Taking a look at the list of Wimbledon champions, Federer is #1 of all time with victories spanning 14 years. William Renshaw is in second with victories spanning just 8 years, followed by Pete Sampras with victories spanning 7 years. After that, it’s Djokovic again across 8 years.

The crazy thing is, it seems entirely possible that Federer could keep winning. He made it to the Wimbledon finals in 2019 where he had a close game (the longest final singles match ever) against Djokovic, and was playing well in 2020 until stepping back to recover from an injury.

What about women’s tennis? Here’s the list of Wimbledon champions. Serena Williams does not have the most wins ever, but her victories span 14 years. In contrast, the top players are Martina Navratilova (12 years), Helen Wills Moody (11 years), Dorothea Lambert Chambers (11 years), and Steffi Graf (8 years). There’s one exception: Blanche Bingley won from 1886 to 1900. Not to take away from her accomplishments, but since the championship had just begun in 1884, it’s fair to assume there just wasn’t as much competition. At the very least, Williams is unmatched in modern history.

Is the continued dominance of Brady, Williams and Federer proof that we’re living in the greatest era of history, or a sign that we’re no longer improving?

In football, other players don’t even come close. Brady has MVPs spanning 19 years, compared to 8 for Joe Montana, and 4 for Eli Manning. But the award is a bit arbitrary, and not an objective measure of individual achievement.

In golf, Tiger Woods’ Masters Tournament victories span 22 years, compared to 23 for Jack Nicklaus. No other player comes close.

Finally, here’s the timeline of #1 ranked players in chess:

Carlsen’s 10-year streak is impressive, but doesn’t quite match Kasparov’s 21 years. Still, Carlsen has time. Kasparov was 43 when his reign ended and Carlsen is just 30 today.

But still, the fact that any of this is happening at all is strange to me, given how much better humans have gotten at running. Maybe the advances in sports medicine lengthen athletes’ careers more than they help new players excel.

Or maybe it’s all just selection bias since I only looked at the athletes that came to mind.

In conclusion: In men’s tennis and football, current players have unmatched longevity, winning the top award over longer time spans than their predecessors. In women’s tennis, it’s a slim margin. In chess, Carlsen may continue to dominate and beat Kasparov’s record. In golf, Tiger Woods is one year behind the historical longest streak, but still stands a chance of winning again. More work is needed to see if this phenomenon holds true in other athletic competitions.


Simon M says:

This is one of my favourite topics, and I enjoyed your post, there’s some more ideas which I’ve pondered for a while which you didn’t touch on:

1/ Increasing economic rewards. (You earn far more as a top professional now, so staying in an extra year, or ten is much more worthwhile)

2/ Catching the wave at the “right time”. If the tools for longevity (training methods, drugs, whatever) are a recent phenomenon and there is some advantage to incumbency, then you would expect the current crop of top players to last longer but also hold off competitors for longer so even talented ones will have a shorter window where they are top of the sport.

3/ In soccer, Messi and Ronaldo have been dominating the game for a v. long time.

These are all good points. The second presents a particularly interesting dynamic. Using the toy model:

  1. Athletes get better over time due to experience, but more and more slowly
  2. Athletes get worse over time due to aging, and this accelerates

A really interesting implication of improved “tools for longevity” is that there will never be another Tom Brady. The next greatest QB might be 30 by the time he outpaces Brady. Since 43 year old Brady is so good, the next 21 year old Brady won’t have a chance to win Super Bowl MVP.
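To make the toy model concrete, here is a minimal sketch in Python. The functional forms are my own assumptions, not anything measured: experience is modeled as a logarithm (gains that keep slowing), aging as a quadratic penalty (losses that keep accelerating), and the starting age, aging rates, and "elite" threshold are all hypothetical numbers picked for illustration.

```python
import math

START_AGE = 21  # hypothetical age at which a career begins

def skill(age, aging_rate):
    """Toy skill level: experience gains minus aging losses."""
    years = max(age - START_AGE, 0)
    experience = math.log1p(years)       # improves, but more and more slowly
    aging = aging_rate * years ** 2      # declines, and the decline accelerates
    return experience - aging

def peak_age(aging_rate, ages=range(START_AGE, 51)):
    """Age at which the toy skill curve is highest."""
    return max(ages, key=lambda age: skill(age, aging_rate))

def elite_years(aging_rate, threshold=1.8, ages=range(START_AGE, 51)):
    """Number of seasons spent at or above an (arbitrary) elite threshold."""
    return sum(1 for age in ages if skill(age, aging_rate) >= threshold)

# Halving the aging rate stands in for better "tools for longevity".
for label, rate in [("older tools", 0.004), ("better tools", 0.002)]:
    print(label, "peak age:", peak_age(rate), "elite seasons:", elite_years(rate))
```

Under these assumptions, slowing the aging term pushes the peak later and stretches the number of seasons spent above the threshold, which is the dynamic that would keep a 21 year old from displacing an incumbent. Different functional forms could easily change the picture.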


Daniel Filan mentions over email that Go players are holding the #1 spot for less and less time. There’s a pretty wild chart:

Shin Jinseo comes out of nowhere and blows everyone else away, attaining the highest rating of all time by a healthy margin. XKCD provides a similar chart for chess:

Notably, the Go players are also super young.

Why does this blog take so long to write?

I write full-time. That’s not to say I’m at my desk 40 hours a week, just that I don’t have anything else going on.

In January, I published a post on average every other day, then didn’t publish for a week until today.

The latest post is good, but does it represent a week’s output? Why does everything take so long? Am I even trying?

There are a couple answers:

  • I only publish ~50% of the posts I write
  • Every post I publish has to be rewritten at least once

It doesn’t take too long to hash out a quick draft, but then I try to get feedback, make the post more compelling and easier to read, and fact check my claims. That all takes longer than the original writing process.

Not everyone is this way. Scott Alexander and Byrne Hobart (#1 and #2 in Substack’s Technology section) have claimed they write more or less stream-of-thought. Maybe it’s a magical gift, but maybe it’s because they’ve both been writing for 10 years.

Feedback is tough as well. I try to send every post to at least one person before publication to have some level of editorial accountability. Often, it’s Alexey Guzey, known for his contrarian takes and brutal criticism. This is great for readers since it improves my writing, but also means that rather than nitpicking grammar, feedback is likely to expose a foundational flaw in my reasoning, resulting in lengthy rewrites.

In other cases, feedback takes time because I want to consult the people in question. Before publishing the case against StatNews, I tried to get in contact twice. Before publishing against Lambda School, I tried to get in touch with their Chief of Staff. I also sought out feedback from Adam Marblestone and Ashish Arora before commenting on their respective publications.

Getting feedback from an actual expert is really tough. They have spent, in some cases, entire careers thinking about the topic I tried to understand in a couple days. I am often embarrassed, try really hard to not say anything wrong, and take their feedback very seriously. This all takes time.

Finally, I am not an expert in anything I write about, so there is a learning curve spanning days or weeks. Especially for posts that criticize the original source, I am very scared of making a bold condemnation of someone else’s work, and then realizing that I’m totally wrong.

On that note, I also write about a broad variety of topics, which means I’m more or less starting from scratch each time. It wouldn’t take me very long to write another post about Lambda School since I already have the required context, but I’m not that interested in writing it, and I don’t think you would want to read it. I read their recent report and have attempted to stay updated in case it turns out that I was wrong. My impression is that they are still misleading students, and I don’t have much more to say.

This blog is an attempt to learn in public in real time, which means writing about things I don’t already understand. Additionally, the deeper I go into any one vein, the more context each reader needs to understand what I’m talking about, and the less likely it is they’ll have read all relevant previous posts. Because blog readership is growing quickly, I don’t assume the median reader is familiar with my previous work.

You may worry that this means never going deep enough to get anywhere interesting. So why is this blog even worth reading?

In short, I have different incentives and opportunities. Ashish Arora is an academic, and his only way to express a professional opinion is to publish a paper. He’s not going to make the point I did about Bell Labs, even though he’s more than intellectually capable of doing so. For most of the world, blogging is still a very weird niche.

And on the other end of the spectrum, you have popular writers trying to build their following. They grow their audience by having an opinion and making bold, easy to follow proclamations of belief. Of course, there are exceptions, but you might be surprised how few. Taking blogging seriously correlates with taking Twitter seriously, which means I’m part of a vanishingly small population that does the former without the latter.

Notes on Adam Marblestone’s Focused Research Organizations

After arguing that industrial research labs won’t return, I was hopeful for a new mechanism to reignite transformative research, but had no idea what it would actually be.

A couple hours later, Nintil told me about Focused Research Organizations.

In short: FROs promise to pair government funding with startup agility, and a mandate to pursue high-impact pre-commercial research. There isn’t a lot of literature out on them yet, but it’s a compelling proposal. [0]

Going into this, it’s worth noting that Adam explicitly says FROs are not a replacement for existing institutions, but they might be good at the margin. Specifically, the proposed budget is $1 billion over 5-7 years, around 0.3% of all federal R&D funding.

As with everything, I have a lot of criticism, but let me start by saying that I’m excited! We need new ideas, and actually experimenting with funding models will help us advance much faster than musing in a vacuum. The existing systems are filled with their own flaws, and I’m not attempting a cost-benefit analysis, so nothing critical I say should be taken to mean that I am net negative.

But first, what does Adam actually envision?

Summary of the FRO Proposal

The bulk of Adam’s thinking is laid out in the FRO Whitepaper. There’s also some discussion in the Idea Machines Podcast, and a bit more in a talk he gave with Nintil a couple months ago.

The core proposal: “Startup-like organizations, but pursuing pure science outcomes with no market.” In another venue, Adam describes them as “a special purpose organization to pursue a defined problem over a finite period of time.”

Or expressed negatively: FROs target projects that cannot be addressed by any existing organization or funding mechanism. The university system is massive, receiving around $40 billion in federal funding each year. Last time, I argued that this was one of the principal reasons behind the death and continued absence of industrial research labs. Why pay to innovate in-house when there’s $40 billion in research happening for free?

Reasoning backwards from other people’s first principles is precisely the way to get around this. Adam understands the university system deeply, knows what its mechanisms cannot produce, and aims to directly target the research that can’t currently happen.

You can think about this through analogy to startups. A small team of 2 founders with $125k in funding could never reasonably compete with Google head-on. The first step for many pitch decks is to explain why, if the idea is actually good, it hasn’t already been done. [1]

So where do universities and corporate labs systematically fail? Adam argues it’s projects which “require levels of coordinated engineering or system-building inaccessible to academia” and “benefit society broadly in ways that industry cannot rapidly monetize”. Stated elsewhere, he says the fundamental tension is between academic settings which incentivize individual achievement over team coordination, and industry organizations which are better at teamwork, but don’t aim to produce public goods.

It’s worth being really clear about what this last claim requires. It’s not that Adam is condemning startups or capitalism, or that he believes industry never produces public goods. The argument is merely that there are some projects, at the margin, which:

  1. Require teamwork and cross-disciplinarity
  2. Only make sense if you intrinsically value the resulting public good

In this sense, rather than the abstract basic/translational/applied trichotomy, Adam talks about projects that are “pre-commercial” or don’t have a clear path to monetization in the short-term. [2]

That’s the narrative about why FROs ought to exist in the presence of existing organizations. But how do they actually work?

The federal government (or philanthropist) commits $1 billion over 5-7 years, paid out to 16 projects, each with a focused research charter. These ought to be time-bound, and there should be a clearly defined end goal that results if the project is successful. The whitepaper also notes that each should be led top-down by a CEO in the style of startups, rather than by committee or by a decentralized collective of loosely affiliated researchers.

The initial line sums it up pretty well. An FRO is a government-funded organization, run like a startup, focused on pre-commercial science, operating over a finite time horizon.

Commentary

Startups are great, but we should understand them ecologically: the result of a precarious balance held in place by a surrounding ecosystem. Swap out one part, and you don’t get “startups but for X”, you get a hot mess. [3]

Adam proposes taking startups, and making the following modifications:

  • Instead of a profit motive, FROs follow a scientific charter
  • Instead of pivoting as they run the idea maze, FROs have a specific predetermined goal
  • Instead of market signals and user validation, work is guided top-down by a CEO
  • Instead of VCs who compete for opportunities and bet against each other, the only funding agent is the federal government

I worry that any one of these substitutions would be fatal, and it’s not clear that they collectively bring us to a new stable equilibrium.

While there are some legends of startups that worked hermetically for a year or two, it’s hard to think of tech companies that were pre-commercial for 5-7 years, and maintained an attachment to reality. The famous anecdotes all turned out to be vaporware or massively over-hyped (Theranos, Magic Leap).

Another concern is with leadership. Startups are able to move quickly by vesting their CEO with dictatorial power [4], but they’re ultimately beholden to metrics. There is centralized power and a cult of personality, but their authority still stems from each employee’s faith in growth, validated against a steady exponential growth in valuation, revenue and/or user base.

It’s possible FROs could supplant the entire process with a rigorously predefined scientific charter. In one podcast, Adam mentions literally using “characters sequenced” as a measure of progress for the human genome project. The whitepaper states that in general, “FROs should [be] driven by quantitative metrics and/or concrete design goals”.

This also addresses a key concern over any transformative research: measuring impact. If a result is novel, there’s no point of comparison by definition, and no way of justifying the costs relative to an expected baseline. Even if the work does end up being impactful, it might be 20 years down the line. Running the entire program around a focused charter and quantifiable outputs alleviates some of these concerns. We still won’t know if the chosen projects were the right ones, but we’ll at least find out if the program succeeded on its own merits.

If executed correctly, this would provide accountability for the CEO as well, in the same way that growth metrics provide accountability to founders. With a sufficiently well-defined goal, there’s little ambiguity over how well the company is being run.

Finally, I worry about risk. Though the Startup Ecosystem works perfectly well, we have to distinguish between the broader trends and individual firms. At the micro level, each startup is dysfunctional and overwhelmingly likely to fail. Part of this stems from market dynamics, but it’s also the nature of a scrappy ambitious project with a centralized authority and thus a single point of failure. The exact failure rate depends on your reference population, but a commonly cited estimate is 90%.

If this holds true for FROs, a 16 project portfolio would still have a 19% chance of total failure. Trial and error is the nature of scientific discovery, but the greater harm might be a vast chilling effect on future experimental models.

Doubling the scale of FROs to 32 would get us down to a 3% chance of total failure, and going up 10x to 160 projects would get us down to a mere 0.0000047%.
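The arithmetic behind these figures is short enough to check directly. The sketch below assumes every project fails independently with the same 90% probability, which is a simplification I’m adding to reproduce the numbers; the function name is just illustrative.

```python
def chance_all_fail(projects, failure_rate=0.9):
    """Probability that every project fails, assuming independent failures."""
    return failure_rate ** projects

# Portfolio sizes discussed above: the proposal, double it, and ten times it.
for n in (16, 32, 160):
    print(f"{n} projects: {chance_all_fail(n):.7%} chance of total failure")
```

Independence is doing a lot of work here. If failures were positively correlated (say, because every FRO shares the same funder or the same untested management model), the chance of the whole portfolio failing would tend to be higher than this estimate.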

The expected value doesn’t change, but that doesn’t matter if, like startups, FROs are a hits-based enterprise. In venture capital, one big win pays for the entire portfolio. In FROs, one human genome project [5] or [connectome](https://en.wikipedia.org/wiki/Connectome) could justify the entire budget.

Even then, the whole proposal would amount to just 3% of federal R&D, and only for the brief 5-7 year trial period. [6]

In summary: FROs are promising, and the program should be 10 times bigger. [7]


Thanks to Adam Marblestone for reviewing a draft of this post.

For the original proposal, see his whitepaper with Sam Rodriques.


Footnotes
[0] Adam mentions over email that these could also be philanthropy-funded. In the Idea Machines podcast, he and Ben express some concern over salary-restrictions in government-run projects. Taking Patrick Collison’s notes on the importance of compensation, it’s worth wondering if FROs are even possible with government funding.

Having said that, ARPA-E does give grants to private companies who (I assume) can set their own salaries. I’m not sure exactly where the line is, or if the distinction is between government-funded versus government-run, or if there’s any room for exceptions.

[1] Learning where tech giants systematically fail is an underappreciated reason to get a corporate job for a few years before setting out on your own.

[2] The short-termism of Venture Capital is overstated. You can find an arbitrary number of companies passed up for investment because they lacked a path to monetization, but there are compelling counter examples that suggest VCs are willing to make long term bets so long as the payoff is sufficiently large. Consider Magic Leap, which was founded in 2010 and didn’t have a commercial product until 2018. It’s now considered a failure, but that makes an even stronger case of VC long-termism. They’re willing to fund speculative, long-term, unproven pre-commercial technologies, even when (empirically) there is significant risk of failure.

How does this happen? One answer is that VCs are not actually long-termist, they just get tricked into making long-term decisions. Perhaps Magic Leap told investors in 2010 that they would launch in 2015, then just kept moving the goalposts, and took advantage of sunk cost. Or they just raised from different investors, claiming each year to be under 5 years away from commercialization.

A priori, is there any reason to think VCs even have a short-time horizon? Of course they want returns eventually, but the short-term goal is to have a strong enough track record to raise another fund. That might mean having your portfolio companies hit commercial milestones, but it could just as easily mean that they were able to raise money from other VCs at increasingly high valuations.

[3] Despite the frequent comparisons to startups, the whitepaper makes it seem like FROs are actually much more like National Labs. In a 13 point comparison table, FROs differ substantially from everything else, but have only two points of contrast with National Labs:

  • “Exists as an autonomous organization mobilized in a rapid agile fashion”
  • “Provides strong support for post-project transition to commercialization”
  • FROs aren’t permanent, and don’t provide a clear career path

It’s worth taking a minute to understand what the National Labs are. There are 17, each administered by a different entity, some run by universities, others by industry. It’s all under the Department of Energy, but the majority of their funding (55%) is for weapons research, mostly around the US nuclear arsenal. Overall, it’s $12 billion total in annual budget, which is two orders of magnitude larger than the total proposed FRO budget.

It’s not really accurate to say that an FRO is a National Lab, but more agile and with support for post-project commercialization. It’s more like a centi-lab with startup characteristics.

With that in mind, it feels like an easier way to pitch this whole proposal would be: National Labs, but for the life sciences, and as a small proof of concept.

[4] Abuses exist, but are largely moderated by at-will agreements and an active market for startup employees.

[5] Adam brings up the Human Genome Project a few times as an example of a past project that could have been successful as an FRO, but the whole line of argument is a bit confusing. FROs are supposedly for funding projects that couldn’t happen otherwise. Suggesting that they might have funded something that already did historically happen seems to weaken this argument.

[6] This is factually accurate, but it’s not a great line of reasoning. Lots of things would be “only x%” of some federal budget line item. It’s easy to say “The US government could invest $20B into climate justice, for just 0.1% of its total budget!”

[7] To be clear: each organization should still be the size Adam proposes ($25-$75 million over 5-7 years).