Is language evolution grinding to a halt?: Exploring the life and death of words in English fiction

E. A. Pechenick, C. M. Danforth, and P. S. Dodds

Times cited: 3


The Google Books corpus contains millions of books in a variety of languages. Because of its sheer volume and free availability, it is a treasure trove for linguistic research. In previous work, we found that the unfiltered English data sets from both the 2009 and 2012 versions of the corpus are heavily saturated with scientific literature, as is the 2009 version of the English Fiction data set. Fortunately, the 2012 version of English Fiction is consistent with fiction and shows promise as an indicator of the evolution of the English language as used by the general public. In this paper, we first critique a method used by the authors of an earlier work to determine the birth and death rates of words in a given linguistic data set. We show that this earlier method produces an artificial surge in the death rate at the end of the observed period. To avoid this boundary effect in our own analysis of asymmetries in language dynamics, we examine the volume of word flux across various relative frequency thresholds for the 2012 English Fiction data set. We then use the contributions of the words crossing these thresholds to the Jensen-Shannon divergence between consecutive decades to resolve the major driving factors behind the flux.
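As a rough illustration of the two quantities named in the abstract, the sketch below computes per-word contributions to the Jensen-Shannon divergence between two relative-frequency distributions and lists the words that cross a relative frequency threshold between them. It is not the authors' code: the function names, the base-2 logarithm, the threshold value, and the toy frequencies are all assumptions made for illustration, and the paper's exact definitions may differ.

```python
import math


def jsd_contributions(p, q):
    """Per-word contributions (in bits) to the Jensen-Shannon divergence.

    p and q map words to relative frequencies; each is assumed to sum to 1.
    Hypothetical helper, not taken from the paper.
    """
    contributions = {}
    for word in set(p) | set(q):
        pi = p.get(word, 0.0)
        qi = q.get(word, 0.0)
        mi = 0.5 * (pi + qi)  # mixture distribution M = (P + Q) / 2
        term = 0.0
        if pi > 0:
            term += 0.5 * pi * math.log2(pi / mi)
        if qi > 0:
            term += 0.5 * qi * math.log2(qi / mi)
        contributions[word] = term
    return contributions


def threshold_flux(p, q, threshold):
    """Words crossing a relative frequency threshold from p to q.

    Returns (up, down): words rising to or above the threshold, and words
    falling below it. Hypothetical helper, not taken from the paper.
    """
    up = sorted(w for w in q if p.get(w, 0.0) < threshold <= q[w])
    down = sorted(w for w in p if q.get(w, 0.0) < threshold <= p[w])
    return up, down


# Toy relative frequencies for two consecutive "decades" (invented values).
decade_a = {"the": 0.5, "thou": 0.3, "train": 0.2}
decade_b = {"the": 0.5, "train": 0.3, "phone": 0.2}

contributions = jsd_contributions(decade_a, decade_b)
total_jsd = sum(contributions.values())
up, down = threshold_flux(decade_a, decade_b, threshold=0.25)

# Rank the threshold-crossing words by how much each adds to the divergence.
ranked = sorted(up + down, key=lambda w: contributions[w], reverse=True)
```

In this toy case, words whose frequency is unchanged contribute nothing to the divergence, while words that appear or disappear contribute the most, which is why ranking the threshold-crossing words by contribution can point to the main drivers of the flux.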


  author = 	 {Pechenick, Eitan A. and Danforth, Christopher M. and Dodds, Peter Sheridan},
  title = 	 {Is language evolution grinding to a halt?: Exploring the life and death of words in {E}nglish fiction},
  year = 	 {2015},
  key = 	 {language,culture,evolution},
