What I discovered in 200 years of word frequencies
Google released a 500-billion-word corpus of phrase appearances in English and other languages for research. It's a fantastic tool for tracking word and phrase frequencies over time.
For example, here are some things I found:
- The resurgence of fundamentalist religion starting around 2000 is real. See God. Want a broader data set of religious words? Here you go.
- I've been somewhat skeptical of books like the Fourth Turning and generational theory, but "rebellious" shows peaks every 40 years or so. I had a conversation with some friends a while back about how the Fourth Turning people could take advantage of quantitative analysis. I admit it's hard to pick a good word to track, as many have other uses that obscure the data we care about, or go in and out of fashion in a way that dwarfs generational effects.
- We were bringing sexy back (or bringing it in for the first time)... until someone wrote a song about it. Yes you, Justin Timberlake, have ruined sexy. No... THIS is how to bring a word back. I'm actually impressed that Myst is mentioned more now than it was in the late 1990s... perhaps once it enters the cultural consciousness it becomes more widely referenced as it is incorporated into our collective knowledge. Or perhaps this corpus is mostly books, and Myst was mainly talked about in magazines at first.
- America's ascendance in the publishing industry and the English language, as told by colour vs color.
- How hipsters preceded hippies but were soon dwarfed by them
- "Cool" words like groovy have an initial peak and then sometimes rebound later. (It's hard to find cool words that don't have other meanings... like, well, "cool".)
- Racial slur for a black person. One peak around the Civil War, another one during 1930-1950 (why??), and another one during the Civil Rights era.
- Google crushes googol (though google was oddly popular around 1900 for some reason)
- Hope and fear are shockingly correlated. I suppose they must be used together a lot. Also in the antagonism wars, love crushes hate but lose beats win.
- Wars make people think about the future.
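If you want to put a number on something like the hope/fear observation instead of eyeballing two curves, a Pearson correlation over the yearly frequencies is one simple way to do it. Here is a minimal sketch in plain Python. The `pearson` helper and the two short series are my own illustration: the numbers are made-up placeholders, not values from the corpus, and you would swap in the real per-year frequencies for each word.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Placeholder yearly frequencies (percent of all words). These are invented
# for illustration only and are NOT real corpus values.
hope = [0.030, 0.028, 0.025, 0.022, 0.020]
fear = [0.015, 0.014, 0.0125, 0.011, 0.010]

r = pearson(hope, fear)
print(f"correlation between hope and fear: {r:.3f}")
```

A value near 1 means the two words rise and fall together, near -1 means they move in opposite directions, and near 0 means no linear relationship. One caveat: two words that both simply decline over 200 years will look highly correlated even if they have nothing to do with each other, so it is probably worth comparing year-over-year changes as well as the raw curves.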
Yup, large datasets are my porn.