I built an explorer of 25+ years of New York Times coverage – 1.5B words and 2.2M articles

Below the Fold: “I used New York Times API/archive data to build an explorer of the paper’s coverage over the last 25+ years: 1.5 billion words across 2.2 million articles by about 26,000 reporters. You can use it to look at:

which reporters covered which beats
who shared bylines with whom
article frequency and length
headline-word frequency over time
section comparisons
U.S. and global coverage patterns

A few things that jump out at me:

to the surprise of no one, Maggie Haberman dominates recent byline counts
Trump dominates headlines compared with other recent presidents, even when out of office
Iowa surges every four years
China coverage peaked around 2014
India looks relatively under-covered on a per-capita basis

I began this in Python a couple of years ago during the Lede Program at Columbia Journalism School and revived it recently, using Claude Code for much of the grunt work. Any errors are mine. Let me know what you think!” Explorer: https://tedalcorn.github.io/nyt/
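
For anyone curious how a view like "headline-word frequency over time" might be computed, here is a minimal Python sketch. It is not the author's code. The sample records and the field names `headline` and `pub_date` are assumptions made for illustration, and real archive data would need its own parsing and cleanup.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample records. The field names and date format are
# assumptions for illustration, not the explorer's actual schema.
SAMPLE_DOCS = [
    {"headline": "Iowa Caucuses Draw Record Crowds", "pub_date": "2016-02-01"},
    {"headline": "Candidates Flock to Iowa as Iowa Vote Nears", "pub_date": "2016-01-15"},
    {"headline": "China Trade Talks Resume", "pub_date": "2014-06-03"},
]

# Lowercase alphabetic tokens, allowing apostrophes (e.g. "trump's").
WORD_RE = re.compile(r"[a-z']+")


def headline_word_counts_by_year(docs):
    """Return {year: Counter(word -> number of headlines containing it)}."""
    counts = defaultdict(Counter)
    for doc in docs:
        year = int(doc["pub_date"][:4])  # assumes the date starts with YYYY
        # Use a set so a word repeated within one headline counts once.
        words = set(WORD_RE.findall(doc["headline"].lower()))
        counts[year].update(words)
    return counts


if __name__ == "__main__":
    by_year = headline_word_counts_by_year(SAMPLE_DOCS)
    for year in sorted(by_year):
        print(year, by_year[year]["iowa"])
```

Counting headlines that contain a word, rather than raw occurrences, keeps one repetitive headline from inflating a year's total. Dividing by the number of headlines per year would turn these counts into shares that are comparable across years.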

Posted in: Internet, Knowledge Management
