USA Baby Names Analysis
Are baby names a reflection of our culture?
Every year, a few million parents pick names for their babies. Each baby name serves as a small vote about gender, individuality, sound, identity. What patterns can we find and what do they say about us?
2024 (latest year)
| # | Girls' name | Births |
|---|---|---|
| 1 | Olivia | 14.7k |
| 2 | Emma | 13.5k |
| 3 | Amelia | 12.7k |
| 4 | Charlotte | 12.6k |
| 5 | Mia | 12.1k |
| 6 | Sophia | 12.1k |
| 7 | Isabella | 10.8k |
| 8 | Evelyn | 9.1k |
| 9 | Ava | 8.7k |
| 10 | Sofia | 8.1k |
| # | Boys' name | Births |
|---|---|---|
| 1 | Liam | 22.2k |
| 2 | Noah | 20.3k |
| 3 | Oliver | 15.3k |
| 4 | Theodore | 12.0k |
| 5 | James | 11.8k |
| 6 | Henry | 11.5k |
| 7 | Mateo | 11.3k |
| 8 | Elijah | 11.2k |
| 9 | Lucas | 10.7k |
| 10 | William | 10.6k |
| # | Gender-neutral name | F+M births | Lean (M–F) |
|---|---|---|---|
| 1 | Avery | 7.0k | |
| 2 | Riley | 6.2k | |
| 3 | Parker | 6.1k | |
| 4 | Rowan | 5.8k | |
| 5 | River | 4.6k | |
| 6 | Charlie | 4.2k | |
| 7 | Sawyer | 3.8k | |
| 8 | Eden | 3.8k | |
| 9 | Tatum | 3.3k | |
| 10 | Emerson | 3.2k | |
Gen Alpha (2013–2024)
| # | Girls' name | Births |
|---|---|---|
| 1 | Olivia | 215K |
| 2 | Emma | 211K |
| 3 | Sophia | 178K |
| 4 | Isabella | 165K |
| 5 | Ava | 165K |
| 6 | Mia | 152K |
| 7 | Charlotte | 148K |
| 8 | Amelia | 138K |
| 9 | Evelyn | 114K |
| 10 | Harper | 111K |
| # | Boys' name | Births |
|---|---|---|
| 1 | Liam | 237K |
| 2 | Noah | 228K |
| 3 | William | 166K |
| 4 | James | 160K |
| 5 | Oliver | 156K |
| 6 | Elijah | 156K |
| 7 | Benjamin | 151K |
| 8 | Mason | 145K |
| 9 | Lucas | 143K |
| 10 | Jacob | 141K |
| # | Gender-neutral name | F+M births | Lean (M–F) |
|---|---|---|---|
| 1 | Avery | 115K | |
| 2 | Riley | 86.2k | |
| 3 | Parker | 75.3k | |
| 4 | Sawyer | 56.2k | |
| 5 | Rowan | 49.5k | |
| 6 | Peyton | 48.6k | |
| 7 | Blake | 47.0k | |
| 8 | Quinn | 46.8k | |
| 9 | Charlie | 45.3k | |
| 10 | Hayden | 43.9k | |
Gen Z (1997–2012)
| # | Girls' name | Births |
|---|---|---|
| 1 | Emily | 344K |
| 2 | Madison | 281K |
| 3 | Emma | 268K |
| 4 | Hannah | 240K |
| 5 | Olivia | 239K |
| 6 | Abigail | 224K |
| 7 | Isabella | 223K |
| 8 | Samantha | 216K |
| 9 | Elizabeth | 210K |
| 10 | Ashley | 208K |
| # | Boys' name | Births |
|---|---|---|
| 1 | Jacob | 441K |
| 2 | Michael | 409K |
| 3 | Joshua | 358K |
| 4 | Matthew | 357K |
| 5 | Christopher | 324K |
| 6 | Daniel | 319K |
| 7 | Andrew | 316K |
| 8 | William | 307K |
| 9 | Joseph | 304K |
| 10 | Anthony | 294K |
| # | Gender-neutral name | F+M births | Lean (M–F) |
|---|---|---|---|
| 1 | Alexis | 248K | |
| 2 | Jordan | 236K | |
| 3 | Taylor | 203K | |
| 4 | Angel | 170K | |
| 5 | Jayden | 159K | |
| 6 | Riley | 125K | |
| 7 | Avery | 92.9k | |
| 8 | Addison | 86.7k | |
| 9 | Hayden | 85.4k | |
| 10 | Peyton | 81.6k | |
Millennials (1981–1996)
| # | Girls' name | Births |
|---|---|---|
| 1 | Jessica | 683K |
| 2 | Ashley | 588K |
| 3 | Jennifer | 497K |
| 4 | Amanda | 492K |
| 5 | Sarah | 412K |
| 6 | Stephanie | 324K |
| 7 | Elizabeth | 306K |
| 8 | Brittany | 301K |
| 9 | Nicole | 297K |
| 10 | Emily | 282K |
| # | Boys' name | Births |
|---|---|---|
| 1 | Michael | 949K |
| 2 | Christopher | 784K |
| 3 | Matthew | 680K |
| 4 | Joshua | 606K |
| 5 | David | 532K |
| 6 | Daniel | 518K |
| 7 | James | 503K |
| 8 | Andrew | 466K |
| 9 | John | 464K |
| 10 | Joseph | 460K |
| # | Gender-neutral name | F+M births | Lean (M–F) |
|---|---|---|---|
| 1 | Jordan | 213K | |
| 2 | Taylor | 194K | |
| 3 | Casey | 101K | |
| 4 | Morgan | 98.0k | |
| 5 | Shelby | 66.0k | |
| 6 | Angel | 63.9k | |
| 7 | Madison | 46.8k | |
| 8 | Dominique | 46.4k | |
| 9 | Jaime | 39.3k | |
| 10 | Dakota | 39.0k | |
Gen X (1965–1980)
| # | Girls' name | Births |
|---|---|---|
| 1 | Jennifer | 752K |
| 2 | Lisa | 509K |
| 3 | Michelle | 410K |
| 4 | Kimberly | 407K |
| 5 | Melissa | 373K |
| 6 | Amy | 368K |
| 7 | Angela | 342K |
| 8 | Mary | 268K |
| 9 | Heather | 249K |
| 10 | Elizabeth | 241K |
| # | Boys' name | Births |
|---|---|---|
| 1 | Michael | 1.19M |
| 2 | David | 816K |
| 3 | James | 799K |
| 4 | John | 756K |
| 5 | Robert | 727K |
| 6 | Christopher | 674K |
| 7 | Jason | 547K |
| 8 | William | 500K |
| 9 | Brian | 493K |
| 10 | Joseph | 408K |
| # | Gender-neutral name | F+M births | Lean (M–F) |
|---|---|---|---|
| 1 | Tracy | 192K | |
| 2 | Shannon | 183K | |
| 3 | Shawn | 166K | |
| 4 | Jamie | 138K | |
| 5 | Terry | 98.2k | |
| 6 | Dana | 95.9k | |
| 7 | Leslie | 89.4k | |
| 8 | Lee | 51.3k | |
| 9 | Jody | 43.4k | |
| 10 | Jaime | 43.1k | |
Boomers (1946–1964)
| # | Girls' name | Births |
|---|---|---|
| 1 | Mary | 1.13M |
| 2 | Linda | 1.06M |
| 3 | Patricia | 789K |
| 4 | Susan | 750K |
| 5 | Barbara | 632K |
| 6 | Karen | 587K |
| 7 | Deborah | 582K |
| 8 | Nancy | 498K |
| 9 | Donna | 495K |
| 10 | Sandra | 492K |
| # | Boys' name | Births |
|---|---|---|
| 1 | James | 1.57M |
| 2 | Robert | 1.53M |
| 3 | John | 1.53M |
| 4 | Michael | 1.46M |
| 5 | David | 1.40M |
| 6 | William | 1.07M |
| 7 | Richard | 960K |
| 8 | Thomas | 810K |
| 9 | Mark | 684K |
| 10 | Charles | 658K |
| # | Gender-neutral name | F+M births | Lean (M–F) |
|---|---|---|---|
| 1 | Terry | 323K | |
| 2 | Robin | 185K | |
| 3 | Kim | 141K | |
| 4 | Lynn | 140K | |
| 5 | Leslie | 119K | |
| 6 | Kelly | 99.2k | |
| 7 | Lee | 90.1k | |
| 8 | Tracy | 74.6k | |
| 9 | Jackie | 74.0k | |
| 10 | Dana | 68.8k | |
Silent Generation (1928–1945)
| # | Girls' name | Births |
|---|---|---|
| 1 | Mary | 1.07M |
| 2 | Barbara | 569K |
| 3 | Betty | 498K |
| 4 | Patricia | 468K |
| 5 | Shirley | 361K |
| 6 | Dorothy | 361K |
| 7 | Carol | 294K |
| 8 | Margaret | 293K |
| 9 | Nancy | 288K |
| 10 | Joan | 270K |
| # | Boys' name | Births |
|---|---|---|
| 1 | Robert | 1.12M |
| 2 | James | 1.09M |
| 3 | John | 974K |
| 4 | William | 820K |
| 5 | Richard | 647K |
| 6 | Charles | 568K |
| 7 | Donald | 476K |
| 8 | Thomas | 395K |
| 9 | David | 392K |
| 10 | George | 360K |
| # | Gender-neutral name | F+M births | Lean (M–F) |
|---|---|---|---|
| 1 | Willie | 159K | |
| 2 | Marion | 72.2k | |
| 3 | Billie | 54.4k | |
| 4 | Jimmie | 53.9k | |
| 5 | Jessie | 49.8k | |
| 6 | Terry | 48.0k | |
| 7 | Johnnie | 45.6k | |
| 8 | Gail | 45.4k | |
| 9 | Jackie | 44.0k | |
| 10 | Bobbie | 41.8k | |
Pre-Silent (1880–1927)
| # | Girls' name | Births |
|---|---|---|
| 1 | Mary | 1.43M |
| 2 | Helen | 613K |
| 3 | Dorothy | 563K |
| 4 | Margaret | 512K |
| 5 | Ruth | 447K |
| 6 | Anna | 371K |
| 7 | Elizabeth | 334K |
| 8 | Mildred | 302K |
| 9 | Frances | 279K |
| 10 | Betty | 268K |
| # | Boys' name | Births |
|---|---|---|
| 1 | John | 1.09M |
| 2 | William | 945K |
| 3 | James | 852K |
| 4 | Robert | 781K |
| 5 | Charles | 530K |
| 6 | George | 526K |
| 7 | Joseph | 470K |
| 8 | Edward | 357K |
| 9 | Frank | 330K |
| 10 | Thomas | 287K |
| # | Gender-neutral name | F+M births | Lean (M–F) |
|---|---|---|---|
| 1 | Willie | 218K | |
| 2 | Marion | 133K | |
| 3 | Francis | 117K | |
| 4 | Jessie | 116K | |
| 5 | Shirley | 88.5k | |
| 6 | Lee | 58.7k | |
| 7 | Cecil | 46.3k | |
| 8 | Johnnie | 45.0k | |
| 9 | Ollie | 33.8k | |
| 10 | Dale | 28.0k | |
What these tables hint at
Findings
- The Mary dynasty. Mary was the #1 girl's name for three consecutive generations — Pre-Silent, Silent, and Boomer — roughly 1880 through 1964. Linda nearly caught her during the Boomer era (1.06M to Mary's 1.13M). By Gen X she'd slipped to #8; she doesn't appear in any later top 10.
- Male #1s churned more at the generational level, but Michael held two. John → Robert → James → Michael (Gen X and Millennial, ~2.1M Michaels across 32 years) → Jacob → Liam. Six distinct male #1s across seven generational windows, versus five for girls.
- Dominance has collapsed since the Boomers. The #1 girl's name fell from 1.43M (Mary, Pre-Silent) to 215K (Olivia, Gen Alpha). The boy side rose from 1.09M (John, Pre-Silent) to a 1.57M peak (James, Boomers) before falling to 237K (Liam, Gen Alpha). The era of dynastic names is over.
- The gender-neutral roster turns over every generation. Willie/Marion/Jessie/Johnnie → Terry/Robin/Kim/Lynn → Tracy/Shannon/Jamie → Jordan/Taylor/Casey → Alexis/Jordan/Avery → Avery/Riley/Parker. Only Terry held a top-10 GN spot across three consecutive generations.
- GN volume peaked mid-century, not today. Terry topped the Boomer GN list at 323K. Today's Avery tops Gen Alpha GN at 115K, about a third of Terry's run. Cross-gender naming is more common today, but the individual hits are smaller because diversification has flattened the head.
Name concentration
In 1880, the top 10 boys' names covered 44% of all male births. By 2024, that's 8%. For girls, 25% down to 7%. The top 100 used to cover roughly 80% of births (77% for girls, 80% for boys). Now it's about a third (31% for girls, 38% for boys).

Findings
- Top-10 share fell sharply: Female 24.6% → 7.1%; male 44.2% → 8.0%.
- Top 100 tells the same story: 76.5% → 31.0% for females; 80.1% → 38.0% for males.
- Male naming was more concentrated but converged: The 1880s male top-10 share (44%) was nearly double the female (25%). By 2024 they've nearly met (8.0% vs. 7.1%).
- Steepest decline post-1960: Gradual 1880–1960, then a sharper break coinciding with the cultural shifts toward individualistic naming.
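The top-10 and top-100 shares above are a simple ratio, and a minimal sketch makes the definition concrete. The function name `top_n_share` and the toy counts are hypothetical; the real inputs would be one year's per-name birth counts for one sex.

```python
def top_n_share(counts: dict[str, int], n: int) -> float:
    """Fraction of all births covered by the n most common names."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sorted(counts.values(), reverse=True)[:n]
    return sum(top) / total


# Hypothetical toy counts, not real figures.
toy_year = {"Mary": 50, "Anna": 30, "Emma": 10, "Ida": 6, "Cora": 4}
print(f"top-2 share: {top_n_share(toy_year, 2):.1%}")  # top-2 share: 80.0%
```

Running this per year and per sex would give the kind of curve the findings describe.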
Name diversity: Shannon entropy
I used Shannon entropy to quantify how spread out name choices are. Female name entropy went from 7.6 bits in 1880 to 11.1 bits in 2023. That 3.5-bit gain is roughly an 11x increase in effective variety ($2^{3.5} \approx 11.3$). Female names have always been more diverse than male, but the gap is closing.

How entropy is calculated, and what it means
For a given year, let $p_i$ be the proportion of births with name $i$. Shannon entropy is:
$$H = -\sum_{i=1}^{k} p_i \log_2 p_i$$
measured in bits. Intuitively, $H$ is the average number of yes/no questions you'd need to guess a random baby's name. The effective number of names — the number of equally-popular names that would produce the same $H$ — is $2^H$. An entropy of 10.5 bits ≈ 1,448 effective names.
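As a quick check on the formula, here is a minimal sketch of the entropy and effective-names calculation. The function names are my own, and the input is assumed to be a list of per-name birth counts for one year.

```python
import math


def shannon_entropy_bits(counts):
    """H = -sum(p_i * log2(p_i)) over names with at least one birth."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)


def effective_names(entropy_bits):
    """Number of equally popular names that would give the same entropy."""
    return 2 ** entropy_bits


# Eight equally popular names carry exactly 3 bits.
print(shannon_entropy_bits([100] * 8))        # 3.0
# The 10.5-bit figure quoted above.
print(round(effective_names(10.5)))           # 1448
# Variety multiplier implied by a 7.6 -> 11.1 bit change.
print(round(effective_names(11.1 - 7.6), 1))  # 11.3
```

The multiplier between two years depends only on the difference in bits, which is why a 3.5-bit gain means about 11 times the effective variety regardless of the starting point.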
Findings
- Diversity grew sharply: Female 7.6 → 11.1 bits; male 6.9 → 10.5. Each gain of about 3.5 bits ≈ 11–12× effective variety.
- Female names have always been more diverse, but the gap narrowed slightly, from 0.7 bits to 0.6.
- Unique-name count diverges from entropy after 1980 — the 20k+ unique female names today include many used only a handful of times.
- A brief 1950s plateau: The baby-boom era briefly concentrated choice around Linda, Mary, Barbara / James, Robert, Michael.
Name half-life
Mary held #1 for girls for roughly 80 years. Michael held it for boys for about 40. Today, names cycle through the top 10 much faster. In the 1880s, top-10 names took 32 to 45 years to drop to half their peak share. By the 1990s, 15 to 20.

Findings
- Male top names have always lasted longer: In the 1880s, male top-10 half-life was ~45 years vs. ~32 for females.
- Both declining: By the 1990s, male ~19 years, female ~15.
- Steepest female drop: 1920s–30s, from ~33 to ~20 years in two decades — a post-WWI cultural shift toward novelty.
- The gap is narrowing: 13-year male-female gap in the 1880s → 4 years by the 1990s.
- Chart cuts off at the 1990s because newer names haven't finished decaying yet.
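One way to read "half-life" is the number of years between a name's peak share and the first later year in which its share is at or below half that peak. The sketch below assumes that reading; the function name and the toy series are hypothetical.

```python
def half_life_years(share_by_year: dict[int, float]):
    """Years from the peak until share first falls to half the peak.

    Returns None when the name has not halved within the data, which is
    why recent names cannot be scored yet.
    """
    peak_year = max(share_by_year, key=share_by_year.get)
    half = share_by_year[peak_year] / 2
    for year in sorted(y for y in share_by_year if y > peak_year):
        if share_by_year[year] <= half:
            return year - peak_year
    return None


# Hypothetical shares in percent, sampled once per decade.
faded = {1900: 2.0, 1910: 4.0, 1920: 3.0, 1930: 2.0, 1940: 1.0}
rising = {2000: 1.0, 2010: 2.0}
print(half_life_years(faded))   # 20
print(half_life_years(rising))  # None
```

Averaging this over each decade's top 10 would give a per-decade figure comparable to the ones quoted above.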
Average name length
Female names got longer, peaked in 1990 at 6.4 characters (Stephanie, Christina, Jennifer, Samantha), then shortened again. Male names are remarkably stable. The Pearson correlation between male and female name length across 145 years is r = 0.954. The same cultural forces move both at once.

Findings
- Female names got longer, then shorter: 5.4 chars (1880) → 6.4 peak (~1990) → 5.9 by 2024.
- Male names are stable: 5.5–6.0 chars across the whole period, mild 1990–2000 peak from Christopher/Alexander/Nicholas.
- Strong correlation: r = 0.954 between male and female length year-by-year. Year-over-year changes correlate at r = 0.706.
- Crossover around 1910: Female names have been longer than male ever since.
- What drove the 1990 peak: Brittany, Ashley, Jessica, Samantha surged as Amy, Lisa, Tracy collapsed. What reversed it: Christopher fell from 2.6% to 0.3% of male births, replaced by Liam, Noah, Leo.
Trigram trends: the sound of a decade
Each decade has a distinct sonic fingerprint. "Mar" (Mary, Margaret, Martha) dominated 1880s–1920s. "Nif" (Jennifer) spiked in the 1970s–80s. "Liv" (Olivia) rises today. The top trigram's share fell from ~3% in the 1880s to under 1% now — no single sound pattern dominates modern naming the way "mar" once did.
Findings
- Character trigrams (3-letter substrings) capture the sound of each era more precisely than starting letters alone.
- Trigram dominance collapsed alongside name-level concentration — no more "mar".
- Male trigrams are stickier — slower, broader waves vs. the sharper crashes of the female chart.
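A minimal sketch of how a trigram's share could be computed. Counting each trigram once per name and weighting by births are assumptions on my part, and the toy counts are hypothetical.

```python
from collections import Counter


def trigrams(name: str) -> list[str]:
    """All contiguous 3-letter substrings of a name, lowercased."""
    s = name.lower()
    return [s[i:i + 3] for i in range(len(s) - 2)]


def trigram_shares(births_by_name: dict[str, int]) -> dict[str, float]:
    """Share of births whose name contains each trigram."""
    total = sum(births_by_name.values())
    weights = Counter()
    for name, births in births_by_name.items():
        for tri in set(trigrams(name)):  # count a trigram once per name
            weights[tri] += births
    return {tri: w / total for tri, w in weights.items()}


toy = {"Mary": 60, "Martha": 30, "Olivia": 10}  # hypothetical counts
shares = trigram_shares(toy)
print(trigrams("Mary"))  # ['mar', 'ary']
print(shares["mar"])     # 0.9
```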
Gender-neutral names
Using a strict definition — minority gender ≥20% of usage in a 10-year rolling window, with ≥100 minority births — 1,653 names have ever qualified as gender-neutral. Their combined share of births rose ~8× since 1880, from ~0.7% to ~5.5% today. The post-1980 acceleration is real, not just a vibe.

Findings
- Strict definition matters. Loose definitions catch noise — Liam, given to ~22K boys and ~500 girls in 2024 (~2% female share), is not meaningfully gender-neutral. The 20% minority-share threshold demands substantive cross-gender use; the 100-count floor rules out one-off statistical noise.
- An early-1900s male spike (~3.3% in 1910) driven almost entirely by Willie — hugely popular and substantively cross-gendered.
- Flat 1920–1980 plateau at ~2% for both genders, then a strong post-1980 acceleration to ~5–5.5% by the early 2020s.
- Diversity expansion drove the post-1980 explosion. As parents picked from a wider pool, more names simultaneously cleared the substantive cross-gender bar.
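The strict definition can be written as a small predicate. This is a sketch under assumptions: I treat the 10-year window as trailing and apply the 100-birth floor to the minority gender's total over that window. The single-year checks below reuse the Taylor and Liam figures quoted in this article, with `window=1` purely for illustration.

```python
def is_gender_neutral(female_by_year, male_by_year, end_year,
                      window=10, min_share=0.20, min_count=100):
    """True if the minority gender clears both the share and count bars."""
    years = range(end_year - window + 1, end_year + 1)
    f = sum(female_by_year.get(y, 0) for y in years)
    m = sum(male_by_year.get(y, 0) for y in years)
    total = f + m
    if total == 0:
        return False
    minority = min(f, m)
    return minority >= min_count and minority / total >= min_share


# Taylor 1992: 14,952 F + 8,239 M, a ~36% minority share.
print(is_gender_neutral({1992: 14952}, {1992: 8239}, 1992, window=1))  # True
# Liam 2024: roughly 500 F vs 22,000 M, a ~2% minority share.
print(is_gender_neutral({2024: 500}, {2024: 22000}, 2024, window=1))   # False
```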
The all-time gender-neutral champions
Willie in 1910 is the all-time winner: ~0.45% of all girls and ~1.49% of all boys carried it in a single year, substantively cross-gendered at the height of its popularity. The only modern names that come close are Taylor in 1992 and Jordan in 1997. Willie occupies 7 of the top 10 single-year slots overall.
Findings
Top by single-year minimum prevalence (highest and most evenly balanced):
- Willie (1910): 1,796 F + 2,897 M — 0.45% F / 1.49% M
- Willie (1900): 1,351 F + 2,113 M — 0.45% F / 1.40% M
- Willie (1909): 1,549 F + 2,175 M — 0.45% F / 1.33% M
- Taylor (1992): 14,952 F + 8,239 M — 0.81% F / 0.41% M
- Jordan (1997): 7,166 F + 14,761 M — 0.41% F / 0.78% M
Willie's dominance reflects how concentrated naming was in the early 20th century: a single hit name could simultaneously be top-tier and substantively cross-gendered. The modern era is more diffuse — Taylor and Jordan are the cleanest contemporary analogues, but neither reaches Willie's combined prevalence.
Stability of gender-neutral names
Once a name clears the gender-neutral bar, does it stay there? The distribution is bimodal. ~39% are "stable neutrals" (Jessie, Jamie, Casey — they settled in). ~25% are "brief crossers" (Aidan, Flynn, Juno — they brushed the boundary and retreated). The remaining ~36% oscillate. The M→F flip (Madison being the textbook example) is the famous arc — but F→M flips happen too; they just play out more slowly.

Findings
- Stable neutrals (top row). Jessie has been gender-neutral for 122 of the last 145 years, bobbing in the 30–80% female range. Jamie (since 1910) and Casey (since 1968) show the same pattern — modest drift, never an exit.
- M → F flippers (second row). Madison passed through the cross-gender band in just 3 years (the Splash effect). Lauren and Lindsay show the same shape on different timescales.
- F → M flippers (third row). Rarer and quieter. Lavon and Robbie drifted gradually from ~90% female to male-dominant over decades. Samar is the modern example.
- Brief crossers (bottom row). Aidan, Flynn, Juno briefly bumped above the 20% threshold for a year or two, then receded — they never flipped, just brushed against the band.
- Pre-GN history is nearly universal: only 9 of 1,653 names were gender-neutral from their first year of meaningful use. Parker had 130 years of male-dominant use before crossing into GN territory.
Ending bigrams as gender signal
The last two letters of a name carry an outsized share of its gender identity. Mutual information between ending bigram and gender has dropped from 0.64 bits in 1880 to 0.47 bits today — a 27% decline. Endings like -la, -ia, -na stay near-100% female; -rt, -hn, -rd stay near-100% male. The blurring is happening in the middle: -ey, -ce, -ah.

Findings
- MI declined 27%: 0.64 bits (1880) → 0.47 bits (2024).
- But strong polarities persist: -la, -ia, -na, -da, -sa remain near-100% female; -rt, -hn, -rd, -es near-100% male.
- Big shifts in the middle: -yn went from 10% female to 80%. -ee and -ah similarly shifted toward female.
- 1950s–60s bump corresponds to the baby boom's traditional, strongly-gendered naming (Linda/Barbara vs. James/Robert).
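Mutual information here measures how many bits knowing the last two letters tells you about gender. Because gender is binary, it cannot exceed 1 bit. The sketch below assumes birth-weighted records of `(name, sex, births)`; that record layout and the function name are my own.

```python
import math
from collections import defaultdict


def ending_gender_mi(records):
    """Mutual information (bits) between ending bigram and sex."""
    joint = defaultdict(float)
    for name, sex, births in records:
        joint[(name[-2:].lower(), sex)] += births
    total = sum(joint.values())

    p_end = defaultdict(float)
    p_sex = defaultdict(float)
    for (end, sex), w in joint.items():
        p_end[end] += w / total
        p_sex[sex] += w / total

    mi = 0.0
    for (end, sex), w in joint.items():
        p = w / total
        mi += p * math.log2(p / (p_end[end] * p_sex[sex]))
    return mi


# Endings that fully determine sex: the maximum of 1 bit.
print(ending_gender_mi([("Anna", "F", 50), ("John", "M", 50)]))    # 1.0
# An ending split evenly between sexes: no information.
print(ending_gender_mi([("Riley", "F", 50), ("Riley", "M", 50)]))  # 0.0
```

Against that 0-to-1 scale, 0.64 bits in 1880 means endings were a strong gender cue, and 0.47 bits means they still are, only less so.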
Cross-gender sound transfer
Ending sounds migrate between genders. -yn rose from 5% female to 85% female (Evelyn, Jocelyn, Carolyn, Brooklyn shifted across). -ee, -ah, -gh followed similar S-curves. The transfer goes both ways: -ry, -az, -us drifted toward male use. Once an ending crosses ~30% usage by the other gender, it tends to accelerate rather than stabilize.

Findings
- Female migrations: -yn, -ee, -ah, -gh all S-curved toward female dominance.
- Transfer is bidirectional: Endings migrating toward male roughly match those migrating toward female.
- S-curve suggests tipping-point dynamics: Once an ending "sounds" like a gender to enough people, the remaining cross-gender names feel incongruent and get replaced.
Phonotactic complexity
Female names have always been more "liquid" — more vowels, smoother consonant-vowel alternation. Male names tolerate more consonant clusters (Chr-ist-opher, Str-ong). That's changing too. Male phonotactics are converging toward female: consonant clusters have dropped from 22% to 14%, vowel sequences (VV) have nearly doubled since 1980.

Findings
- CV dominates: ~42–46% for females, ~36–39% for males — the Ma-ry, Jo-hn rhythm.
- Female names are more liquid: Higher CV, lower CC share — a persistent pattern.
- Consonant clusters declining: CC fell 16% → 12% (female), 22% → 14% (male).
- Vowel sequences rising: VV share has nearly doubled since 1980 (Mia, Ava, Aria, Noah, Liam, Isaiah).
- Male convergence: Male phonotactics are drifting toward the female pattern — smoother, more vowel-rich.
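The CV, CC and VV shares can be approximated at the letter level. This sketch assumes spelling rather than pronunciation, treats `y` as a consonant, and weights adjacent letter pairs by births; all three are my assumptions, not a statement of how the figures above were produced.

```python
from collections import Counter

VOWELS = set("aeiou")  # assumption: 'y' counts as a consonant


def cv_pattern(name: str) -> str:
    """Map each letter to C (consonant) or V (vowel)."""
    return "".join("V" if ch in VOWELS else "C"
                   for ch in name.lower() if ch.isalpha())


def pattern_shares(births_by_name: dict[str, int]) -> dict[str, float]:
    """Birth-weighted share of each adjacent pair type: CV, VC, CC, VV."""
    counts = Counter()
    for name, births in births_by_name.items():
        pat = cv_pattern(name)
        for i in range(len(pat) - 1):
            counts[pat[i:i + 2]] += births
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


print(cv_pattern("Christopher"))    # CCCVCCVCCVC
print(cv_pattern("Mia"))            # CVV
print(pattern_shares({"Mia": 10}))  # {'CV': 0.5, 'VV': 0.5}
```

Christopher contributes five CC pairs and no VV pair, while Mia contributes one VV pair and no CC pair, which is the shift the findings describe.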
Starting-letter frequency
J dominated boys' names for most of the 20th century (James, John, Joseph; later Jason, Justin, Joshua). A-names surged for both genders after 1960. M has been the quiet workhorse for girls for 145 years — Mary, Margaret, Madison, Mia, Mila. K for girls boomed and busted from the 1960s to the 1980s.

Findings
- J dominance in male names: Peaked mid-century, declined since the 1990s.
- The rise of A: Surging since 1960s–70s for both sexes; dominant for girls today.
- M's endurance for girls: Consistently large share across the whole period.
- K's boom and bust: Karen, Kimberly, Kelly, Kristen surged 1960s–80s, then faded.
- Male naming is more concentrated — fewer letters command large shares at any time.
Starting-letter correlation
Some letters rise and fall together; others are antipodes. For females, the old Mary/Margaret/J-name era runs in strong negative correlation with the A- and K-name waves that replaced it. Male names show a cleaner "old vs. new" structure: J, R, W move inversely to A, B, E.

Findings
- Strong negative correlations reveal generational shifts — as one letter-block faded, another rose to replace it.
- Male names show a clear "old vs. new" block structure.
- Persistently niche letters: Q, U, X, Z show weak correlations with everything — they've never been popular enough to participate in large-scale shifts.
Top-10 name race
Mary held the #1 female spot for roughly 80 years, from the 1880s through the early 1960s. John and Robert dominated for boys; Michael held #1 from the 1950s to the 1990s. Mid-century top names peaked at large counts: Michael hit ~92,000 births in a year. Today, Liam tops out at ~22,000. Even proportionally, modern #1 names are much less dominant.
Findings
- Mary held #1 longest: ~80 years for girls.
- Male #1s rotate more slowly than female #1s year to year, though today's #1s reach far lower absolute counts.
- Modern churn is rapid: Emma, Olivia, Liam cycle in and out faster than classic names.
- Raw counts dropped even as total births held: Michael's ~92K peak vs. Liam's ~22K.
Phonetic similarity networks
Names cluster by sound. Force-directed graphs of the top 75 all-time names per gender show dense cores connecting phonetically related names across generations — Mary, Marie, Martha, Margaret, Carol, Karen, Catherine pass popularity back and forth in the same sonic neighborhood. Isolated nodes (Kimberly, Rebecca, Patricia, Richard, Albert) don't sound much like any other top name.
Findings
- Edges connect names whose Metaphone encodings are within edit distance 2. Node size = total all-time births.
- Dense cores = sound families recycled across generations.
- Male graphs are tighter: Popular male names draw from a smaller phonetic repertoire.
- Isolated nodes are phonetically distinctive: Kimberly, Rebecca, Virginia, Patricia (F); Christian, Richard, Scott, Albert (M).
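The edge rule can be sketched with a plain Levenshtein distance over phonetic codes. The codes in the example are illustrative placeholders that I typed in by hand, not the output of any particular Metaphone library, and `similarity_edges` is a hypothetical helper.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def similarity_edges(codes: dict[str, str], max_distance: int = 2):
    """Pairs of names whose phonetic codes are within max_distance."""
    return [(a, b) for a, b in combinations(sorted(codes), 2)
            if edit_distance(codes[a], codes[b]) <= max_distance]


# Placeholder codes for illustration only.
codes = {"Mary": "MR", "Martha": "MR0", "Kimberly": "KMBRL"}
print(similarity_edges(codes))  # [('Martha', 'Mary')]
```

In this toy graph Kimberly ends up with no edges, which is how an isolated node arises.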
Trigram contagion
When a trigram appears in a hit name, it spreads to brand-new names. Ashley dominated the 1980s–90s. Then came Paisley, Kinsley, Brinley, Hadley. Jayden spawned Kayden, Brayden, Hayden, Zayden. In the 1900s, a new top-100 trigram spawned ~10 new names over the next decade. By the 2000s, that hit 71 for girls. A 7x increase.

Findings
- Female contagion rose 7×: ~10 new names per new top-100 trigram (1900s) → ~71 (2000s).
- Male contagion rose by a similar factor from a lower base: ~3 → ~20.
- Much of the diversity is remix: Unique-name growth tracks systematic phonetic recombination, not random invention.
Trigram entropy vs. name entropy
Name entropy has grown much faster than phonetic entropy. The gap between them is the "creative spelling" gap — Caitlin, Kaitlyn, Katelyn; Brian, Bryan, Bryon. Before 1960, the two tracked in parallel. After 1960, name entropy accelerated away. The modern boom in distinct names is largely orthographic, not phonetic.

Findings
- A clear hierarchy today: name entropy > phoneme-sequence entropy > trigram entropy.
- The crossover matters: Name entropy overtook phoneme entropy around 1960 — the start of the "creative spelling" era.
- Pre-1960, trigram entropy was actually higher than name entropy for females — a few names (Mary, Dorothy, Helen) dominated but contained diverse trigrams.
Beginning vs. ending innovation
Both beginning and ending novel-trigram rates plummeted from 10–18% in the 1880s to under 1% by the 1920s and have stayed near zero since. The trigram space filled up fast. Modern creativity is recombination, not invention. Beginnings are consistently slightly more innovative than endings — endings are more conserved because they carry gender signal.

Findings
- Trigram space filled up fast: Novelty rates dropped to near-zero by the 1920s–30s.
- Beginnings slightly more innovative than endings, every era — endings carry gender/phonetic identity.
- Modern near-zero novelty + exploding name count is the signature of remix.
Soundex phonetic groups
104,819 names compress down to ~13,000 Soundex codes — an 8:1 collapse. The average name shares its phonetic code with 7 others. Most of naming variety is orthographic, not phonetic. The largest group (S050) contains Saaim, Saam, Saamia and ~800 siblings built on the S + vowel + nasal pattern.

Findings
- Heavy right skew: Most Soundex codes map to 1–3 distinct names; a few map to hundreds.
- Top groups share simple phonetic frames: S050, J050, K050, M020 — common consonant-vowel templates.
- 104,819 names → ~13,000 codes: Most "new" names sit on top of an existing phonetic slot.
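Codes like S050 and J050 keep a `0` for vowels, whereas standard American Soundex drops vowels and would give S500 and J500. The sketch below is therefore a guess at a vowel-preserving variant that happens to reproduce the codes quoted here; it is an assumption, not a confirmed description of the encoder used.

```python
from collections import defaultdict

_DIGITS = {}
for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")]:
    for ch in letters:
        _DIGITS[ch] = digit


def soundex_variant(name: str) -> str:
    """Soundex-like code where vowels, h, w, and y are kept as '0'."""
    s = "".join(ch for ch in name.lower() if ch.isalpha())
    if not s:
        return ""
    digits = []
    prev = _DIGITS.get(s[0], "0")
    for ch in s[1:]:
        d = _DIGITS.get(ch, "0")
        if d != prev:  # collapse runs of the same digit
            digits.append(d)
        prev = d
    return (s[0].upper() + "".join(digits) + "000")[:4]


groups = defaultdict(list)
for n in ["Saaim", "Saam", "Saamia", "John", "Jon"]:
    groups[soundex_variant(n)].append(n)
print(dict(groups))
# {'S050': ['Saaim', 'Saam', 'Saamia'], 'J050': ['John', 'Jon']}
```

Grouping every name by such a code and counting group sizes would yield the skewed distribution described in the findings.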
Full phoneme analysis
Mapping names to the CMU Pronouncing Dictionary covers only 15,269 names, but those names account for 90% of all births. The rest are rare creative spellings. Phoneme-identical groups are huge: John/Jon (5.4M births), Steven/Stephen (2.2M), Brian/Bryan/Bryon/Brion/Bryen (1.6M), Sarah/Sara/Cera (1.5M). These quantify exactly how much "diversity" is pure spelling variation.


Findings
- Four layers of diversity: name > phoneme-sequence > trigram > phoneme.
- Phoneme-sequence entropy crossed below name entropy around 1960 — the quantitative signature of the creative-spelling era.
- Phoneme trends: "AH" (schwa) and "N" dominate female names; "R" declined (Mary/Margaret/Barbara fading); "L" rose (Olivia, Emily, Ella). For males, "N" rose steadily, tracking Aiden/Jayden/Ethan.
- Phoneme-identical groups are huge: John/Jon = 5.4M births. Same sound, different spelling.
Bass diffusion modeling
The Bass model decomposes adoption into two forces: innovation ($p$, external influence like media) and imitation ($q$, social contagion). Across 175 fit names, $q \gg p$ — parents copy from their social network more than from TV. The imitation coefficient peaked mid-century and has declined — modern naming is less socially contagious, consistent with a more fragmented, individualistic culture.


How the Bass model works
Cumulative adoption:
$$F(t) = \frac{1 - e^{-(p+q)t}}{1 + (q/p) \cdot e^{-(p+q)t}}$$
$p$ controls how quickly first adopters appear (media-driven early awareness). $q$ controls how steeply the curve accelerates (word-of-mouth amplification). A high $q/p$ ratio means the name spread through social networks; a high $p$ means a single external spark drove it.
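To see how $p$ and $q$ shape the curve, here is a minimal sketch of the cumulative formula above, plus the standard Bass result that adoption peaks at $t^* = \ln(q/p)/(p+q)$ when $q > p$. The parameter values are hypothetical, chosen so that $q/p = 20$.

```python
import math


def bass_cumulative(t: float, p: float, q: float) -> float:
    """Cumulative adoption F(t) for innovation p and imitation q."""
    decay = math.exp(-(p + q) * t)
    return (1 - decay) / (1 + (q / p) * decay)


def bass_peak_time(p: float, q: float) -> float:
    """Time of fastest adoption; only meaningful when q > p."""
    return math.log(q / p) / (p + q)


p, q = 0.01, 0.20  # hypothetical coefficients, q/p = 20
for t in (0, 5, 10, 20, 40):
    print(t, round(bass_cumulative(t, p, q), 3))
print("peak year offset:", round(bass_peak_time(p, q), 1))
```

With a high $q/p$ the curve starts slowly and then accelerates into an S-shape. Raising $p$ instead pulls early adoption forward and flattens the S.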
Findings
- Imitation dominates: Median $q/p \approx$ 10–20 across 175 fit names.
- Peak in 1940s–60s: The conformist era, when Michael/Jennifer/Linda spread through tight social networks.
- $p$ stayed flat: External media influence has been roughly constant.
- $q$ declined post-1960: Individualist culture = weaker social contagion.
- Male and female diffusion patterns are remarkably similar — same social dynamics, different specific names.
Cultural event fingerprints
Madison was near-zero before the 1984 film Splash, where a mermaid picks it as her human name in a scene played for laughs. By 2001 it was the #2 girl's name in America. Culture removes names too: Katrina dropped after the hurricane, Isis after the terrorist group, Alexa after Amazon, Karen during the meme. Drops are often steeper than rises.

Findings
- Madison is the clearest cultural fingerprint — from near-zero to #2 female name after Splash (1984).
- Drops can be steeper than rises: Katrina (hurricane, 2005), Isis (terrorist group, 2014–15), Alexa (Amazon, 2015), Karen (meme, 2020).
- Algorithmic spike detection found historical events: Dewey +6.5× in 1898 (Admiral Dewey), Grover +5.8× in 1884 (Cleveland elected), Woodrow +8× in 1912 (Wilson), Marlene +8.9× in 1931 (Dietrich's Hollywood debut). Elections and military heroes were the movies of their era.
- Modern spikes are numerous but smaller — a more diverse naming pool absorbs them.
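The spike detector is not specified here, so the following is only one plausible sketch: flag a year when a name's count is at least `min_ratio` times the previous year's, subject to a minimum count. The thresholds, the function name and the toy counts (picked to mimic a 6.5× jump) are all hypothetical.

```python
def find_spikes(counts_by_year: dict[int, int],
                min_ratio: float = 3.0, min_count: int = 50):
    """Return (year, ratio) pairs where births jumped year over year."""
    spikes = []
    years = sorted(counts_by_year)
    for prev, curr in zip(years, years[1:]):
        before, after = counts_by_year[prev], counts_by_year[curr]
        if (curr - prev == 1 and before > 0
                and after >= min_count and after / before >= min_ratio):
            spikes.append((curr, after / before))
    return spikes


# Hypothetical counts, not real data.
toy_counts = {1896: 20, 1897: 22, 1898: 143, 1899: 120}
print(find_spikes(toy_counts))  # [(1898, 6.5)]
```

The minimum-count floor matters because very rare names can triple from 2 births to 6 without any cultural event behind it.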
Regional variation
Each US region has its own naming signature. Northeast: Rachel, Esther, Maeve, Sara — heritage names. South: Landyn, Ryleigh, Kingston, Messiah — creative spellings and aspirational names. Midwest: Beckett, Emmett, Graham, Lincoln — surnames as first names. West: Emiliano, Ximena, Santiago, Camila — Hispanic/Latino influence. The West is consistently the most diverse region; the South the most concentrated.



Findings
- West is most diverse, South is most concentrated — ranking stable across 1910–2024.
- Interior states peak before coastal states — average lead of 1.9 years across 80 recent hits. 44 peaked interior-first, 3 coast-first, 33 tied.
- Regional signatures (2015–2024 overrepresentation):
  - Northeast: Rachel, Esther, Maeve, Sara, Nicholas, Sienna (heritage and European-influenced).
  - South: Landyn, Hunter, Ryleigh, Kingston, Khloe, Messiah, Bryson (creative spellings, aspirational, religious).
  - Midwest: Beckett, Emmett, Graham, Ryker, Griffin, Bennett, Lincoln (surname-as-first-name).
  - West: Emiliano, Ximena, Damian, Emilio, Jesus, Santiago, Natalia, Camila (Hispanic/Latino influence).
Geographic contagion
Names spread faster than ever. The median time for a name to go from 5 states to 30 has fallen from 10–12 years in the 1960s to 3–4 years today. There's a small neighbor effect (0.4 years earlier in adjacent states), but it's dwarfed by the speed compression. Modern names aren't really spreading geographically — they're appearing simultaneously nationwide via media.

Findings
- Spread speed compressed 3×: 10–12 years (1960s) → 3–4 years (2010s) for the 5→30-state transition.
- Small but real neighbor effect: Neighboring states adopt 0.4 years earlier than non-neighbors.
- Small states are trend incubators, not trend setters: A few babies with a novel name register as a high share of a small state, making early adoption visible. The actual trend mechanism is national.
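The 5→30-state metric can be sketched as the gap between the year a name reaches its 5th state and the year it reaches its 30th. What counts as "reaching" a state is not defined here, so the input is assumed to be a precomputed first-adoption year per state; the function name and toy data are hypothetical.

```python
def spread_years(first_year_by_state: dict[str, int],
                 start_states: int = 5, end_states: int = 30):
    """Years between reaching start_states and end_states states.

    Returns None if the name never reached end_states states.
    """
    years = sorted(first_year_by_state.values())
    if len(years) < end_states:
        return None
    return years[end_states - 1] - years[start_states - 1]


# Hypothetical name that adds ten states per year starting in 2000.
toy_spread = {f"state_{i}": 2000 + i // 10 for i in range(40)}
print(spread_years(toy_spread))  # 2
```

Taking the median of this value over all names that debut in a given decade would give figures comparable to the 10–12 and 3–4 year numbers above.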