Quantifying the Information Flow of Long Narratives: A Case Study of Jane Austen’s Works

Authors

  • Tianyi Zhang, School of International Studies, Zhejiang University, Hangzhou, Zhejiang, China
  • Junying Liang, School of International Studies, Zhejiang University, Hangzhou, Zhejiang, China

Keywords:

information flow, entropy, narrative dynamics, Jane Austen, large language model

Abstract

This study applies digital humanities methods to analyze the information flow in Jane Austen's novels using the GPT-2 XL model. Entropy, a measure of unpredictability, is used to quantify how the narrative engages readers over time. By calculating the entropy of each sentence, the study reveals distinct patterns of information gain across Austen's novels, reflecting the ebb and flow of reader surprise. Peaks in entropy correspond to narrative climaxes, while declines indicate more predictable plot developments. The findings suggest that digital tools can offer fresh insights for literary analysis by highlighting the interplay between predictability and surprise in narrative structure. This exploratory approach complements traditional literary studies and opens new avenues for understanding reader engagement with classic texts.
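
The sentence-level measure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula or pipeline, so everything here is an assumption: the sketch scores each sentence by its mean surprisal in bits per token (a common estimate of per-sentence entropy), and it uses a toy add-one smoothed unigram model in place of GPT-2 XL so that it runs with the standard library alone. The names `unigram_model`, `sentence_surprisal`, and `information_flow` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def unigram_model(corpus_tokens):
    """Build an add-one smoothed unigram probability function.

    Toy stand-in for a language model; the study itself uses GPT-2 XL.
    """
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserved for unseen tokens

    def prob(token):
        return (counts[token] + 1) / (total + vocab)

    return prob


def sentence_surprisal(tokens, prob):
    """Mean surprisal in bits per token: -(1/N) * sum(log2 p(token))."""
    if not tokens:
        return 0.0
    return -sum(math.log2(prob(t)) for t in tokens) / len(tokens)


def information_flow(sentences, prob):
    """Score each sentence in order, giving one value per sentence."""
    return [sentence_surprisal(s.lower().split(), prob) for s in sentences]


# Illustrative sentences only (not from Austen's novels).
sentences = [
    "the family was at home",
    "the family was at home",
    "a stranger arrived unexpectedly",
]
prob = unigram_model(" ".join(sentences).lower().split())
flow = information_flow(sentences, prob)

for sentence, bits in zip(sentences, flow):
    print(f"{bits:.3f}  {sentence}")
```

In this toy setting the repeated sentence scores lower than the sentence made of rarer words, which mirrors the abstract's reading of peaks as surprise and declines as predictability. With a contextual model such as GPT-2 XL, `prob` would instead depend on the preceding text rather than on word frequency alone.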

Published

2024-07-27

How to Cite

Zhang, T., & Liang, J. (2024). Quantifying the Information Flow of Long Narratives: A Case Study of Jane Austin’s Works. Journal of Theory and Practice in Humanities and Social Sciences, 1(3), 7–12. Retrieved from https://woodyinternational.com/index.php/jtphss/article/view/28