References

Abeysooriya, Mandhri, Megan Soria, Mary Sravya Kasu, and Mark Ziemann. 2021. “Gene Name Errors: Lessons Not Learned.” PLOS Computational Biology 17 (7): 1–13. https://doi.org/10.1371/journal.pcbi.1008984.
Acemoglu, Daron, Simon Johnson, and James A Robinson. 2001. “The Colonial Origins of Comparative Development: An Empirical Investigation.” American Economic Review 91 (5): 1369–1401.
Akerlof, George A. 1978. “The Market for ‘Lemons’: Quality Uncertainty and the Market Mechanism.” In Uncertainty in Economics, 235–51. Elsevier.
Alexander, Monica. 2019a. “Reproducibility in Demographic Research.” https://www.monicaalexander.com/posts/2019-10-20-reproducibility/.
———. 2019b. “Analyzing Name Changes After Marriage Using a Non-Representative Survey,” August. https://www.monicaalexander.com/posts/2019-08-07-mrp/.
Alexander, Rohan, and Monica Alexander. 2020. “The Increased Effect of Elections and Changing Prime Ministers on Topics Discussed in the Australian Federal Parliament Between 1901 and 2018.” https://rohanalexander.com/pdfs/AlexanderAlexander-EffectofElectionsandPrimeMinisters.pdf.
Alexander, Rohan, Samantha-Jo Caetano, Haoluan Chen, Michael Chong, Annie Collins, Shirley Deng, Isaac Ehrlich, et al. 2021. “An Introduction to DoSStoolkit.” http://arxiv.org/abs/2105.09347.
Allaire, JJ, Rich Iannone, Alison Presmanes Hill, and Yihui Xie. 2021. Distill: ’R Markdown’ Format for Scientific and Technical Writing.
Allen, Eric J, Patricia M Dechow, Devin G Pope, and George Wu. 2017. “Reference-Dependent Preferences: Evidence from Marathon Runners.” Management Science 63 (6): 1657–72.
Allen, Jeff. 2021. plumberDeploy: Plumber Deployment. https://CRAN.R-project.org/package=plumberDeploy.
Alsan, Marcella, and Marianne Wanamaker. 2018. “Tuskegee and the Health of Black Men.” The Quarterly Journal of Economics 133 (1): 407–55.
Arel-Bundock, Vincent. 2021. Modelsummary: Summary Tables and Plots for Statistical Models and Data: Beautiful, Customizable, and Publication-Ready. https://CRAN.R-project.org/package=modelsummary.
Aschwanden, Christie. 2020. Artificial Intelligence Makes Bad Medicine Even Worse. https://www.wired.com/story/artificial-intelligence-makes-bad-medicine-even-worse/.
Athey, Susan, and Guido W Imbens. 2017. “The State of Applied Econometrics: Causality and Policy Evaluation.” Journal of Economic Perspectives 31 (2): 3–32.
Au, Randy. 2020. “Data Cleaning IS Analysis, Not Grunt Work.” Counting Stuff. https://counting.substack.com/p/data-cleaning-is-analysis-not-grunt.
Bandy, Jack, and Nicholas Vincent. 2021. “Addressing ‘Documentation Debt’ in Machine Learning Research: A Retrospective Datasheet for BookCorpus.” http://arxiv.org/abs/2105.05241.
Barrett, Malcolm. 2021a. Data Science as an Atomic Habit. https://malco.io/2021/01/04/data-science-as-an-atomic-habit/.
———. 2021b. Ggdag: Analyze and Create Elegant Directed Acyclic Graphs. https://CRAN.R-project.org/package=ggdag.
Bastian, Hilda. 2020. “A Timeline of the Oxford-AstraZeneca Covid-19 Vaccine Trials.” http://hildabastian.net/index.php/100.
Bates, Douglas, Martin Mächler, Ben Bolker, and Steve Walker. 2015. “Fitting Linear Mixed-Effects Models Using lme4.” Journal of Statistical Software 67 (1): 1–48. https://doi.org/10.18637/jss.v067.i01.
Baumer, Benjamin, Daniel Kaplan, and Nicholas Horton. 2021. Modern Data Science with R. 2nd ed. CRC Press.
Bensinger, Greg. 2020. Google Redraws the Borders on Maps Depending on Who’s Looking. Washington Post.
Berkson, Joseph. 1946. “Limitations of the Application of Fourfold Table Analysis to Hospital Data.” Biometrics Bulletin 2 (3): 47–53. http://www.jstor.org/stable/3002000.
Blair, Graeme, Jasper Cooper, Alexander Coppock, and Macartan Humphreys. 2019. “Declaring and Diagnosing Research Designs.” American Political Science Review 113: 838–59. https://declaredesign.org/paper.pdf.
Blei, David M. 2012. “Probabilistic Topic Models.” Communications of the ACM 55 (4): 77–84.
Blei, David M, and John D Lafferty. 2009. “Topic Models.” In Text Mining, 101–24. Chapman & Hall/CRC.
Blei, David M, Andrew Y Ng, and Michael I Jordan. 2003. “Latent Dirichlet Allocation.” Journal of Machine Learning Research 3 (Jan): 993–1022.
Bloom, Howard, Andrew Bell, and Kayla Reiman. 2020. “Using Data from Randomized Trials to Assess the Likely Generalizability of Educational Treatment-Effect Estimates from Regression Discontinuity Designs.” Journal of Research on Educational Effectiveness, 1–30. https://doi.org/10.1080/19345747.2019.1634169.
Boland, Philip J. 1984. “A Biographical Glimpse of William Sealy Gosset.” The American Statistician 38 (3): 179–83.
Bowley, Arthur Lyon. 1901. Elements of Statistics. P. S. King.
Brandt, Allan M. 1978. “Racism and Research: The Case of the Tuskegee Syphilis Study.” Hastings Center Report, 21–29.
Bronte, Charlotte. 1847. Jane Eyre. https://www.gutenberg.org/files/1260/1260-h/1260-h.htm.
Brook, Robert H, John E Ware, William H Rogers, Emmett B Keeler, Allyson Ross Davies, Cathy D Sherbourne, George A Goldberg, Kathleen N Lohr, Patricia Camp, and Joseph P Newhouse. 1984. “The Effect of Coinsurance on the Health of Adults: Results from the RAND Health Insurance Experiment.”
Bryan, Jennifer, and Jim Hester. 2020. What They Forgot to Teach You about R. https://rstats.wtf/index.html.
Bryan, Jennifer, Jim Hester, David Robinson, and Hadley Wickham. 2019. Reprex: Prepare Reproducible Example Code via the Clipboard. https://CRAN.R-project.org/package=reprex.
Bryan, Jenny. 2020. Happy Git and GitHub for the useR. https://happygitwithr.com.
Buhr, Ray. 2017. Using R as a Production Machine Learning Language (Part I). https://raybuhr.github.io/blog/posts/making-predictions-over-http/.
Cambon, Jesse, and Christopher Belanger. 2021. “Tidygeocoder: Geocoding Made Easy.” Zenodo. https://doi.org/10.5281/zenodo.3981510.
Carle, Eric. 1969. The Very Hungry Caterpillar. World Publishing Company.
Carroll, Lewis. 1865. Alice’s Adventures in Wonderland. Macmillan.
Chamberlain, Scott, Hadley Wickham, and Winston Chang. 2021. Analogsea: Interface to ’Digital Ocean’.
Chang, Winston, Joe Cheng, JJ Allaire, Carson Sievert, Barret Schloerke, Yihui Xie, Jeff Allen, Jonathan McPherson, Alan Dipert, and Barbara Borges. 2021. Shiny: Web Application Framework for R. https://CRAN.R-project.org/package=shiny.
Chen, Wei, Xilu Chen, Chang-Tai Hsieh, and Zheng Song. 2019. “A Forensic Examination of China’s National Accounts.” National Bureau of Economic Research.
Cheng, Joe, Bhaskar Karambelkar, and Yihui Xie. 2021. Leaflet: Create Interactive Web Maps with the JavaScript ’Leaflet’ Library. https://CRAN.R-project.org/package=leaflet.
Cohn, Alain. 2019. “Data and Code for: Civic Honesty Around the Globe.” Harvard Dataverse. https://doi.org/10.7910/DVN/YKBODN.
Cohn, Alain, Michel André Maréchal, David Tannenbaum, and Christian Lukas Zünd. 2019a. “Civic Honesty Around the Globe.” Science 365 (6448): 70–73.
———. 2019b. “Supplementary Materials for: Civic Honesty Around the Globe.” Science 365 (6448): 70–73.
Cohn, Nate. 2016. We Gave Four Good Pollsters the Same Raw Data. They Had Four Different Results.
Cook, Dianne, Nancy Reid, and Emi Tanaka. 2021. “The Foundation Is Available for Thinking about Data Visualization Inferentially.” Harvard Data Science Review, July. https://doi.org/10.1162/99608f92.8453435d.
Cooksey, Brian. 2014. “An Introduction to APIs.” Zapier. https://zapier.com/learn/apis/.
Cooley, David. 2020. Mapdeck: Interactive Maps Using ’Mapbox GL JS’ and ’Deck.gl’. https://CRAN.R-project.org/package=mapdeck.
Cox, Murray. 2021. “Inside Airbnb - Toronto Data.” http://insideairbnb.com/get-the-data.html.
Craiu, Radu V. 2019. “The Hiring Gambit: In Search of the Twofer Data Scientist.” Harvard Data Science Review 1 (1).
Crawford, Kate. 2021. Atlas of AI. Yale University Press.
Cunningham, Scott. 2021. Causal Inference: The Mixtape. Yale University Press.
D’Ignazio, Catherine, and Lauren F Klein. 2020. Data Feminism. MIT Press.
Dagan, Noa, Noam Barda, Eldad Kepten, Oren Miron, Shay Perchik, Mark A Katz, Miguel A Hernán, Marc Lipsitch, Ben Reis, and Ran D Balicer. 2021. “BNT162b2 mRNA Covid-19 Vaccine in a Nationwide Mass Vaccination Setting.” New England Journal of Medicine.
Dahly, Darren. 2020. A Brief History of Medical Statistics and Its Impact on Reproducibility. https://statsepi.substack.com/p/a-brief-history-of-medical-statistics.
Darling, William M. 2011. “A Theoretical and Practical Implementation Tutorial on Topic Modeling and Gibbs Sampling.” In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 642–47.
DeWitt, Helen. 2000. The Last Samurai. Talk Miramax Books.
Edgeworth, Francis Ysidro. 1885. “Methods of Statistics.” Journal of the Statistical Society of London, 181–217.
Euller, Roald, Stephen H Long, and M Susan Marquis. 1997. “Data Cleaning Procedures for the 1993 Robert Wood Johnson Foundation Employer Health Insurance Survey.” RAND CORP SANTA MONICA CA.
Farrugia, Patricia, Bradley A Petrisor, Forough Farrokhyar, and Mohit Bhandari. 2010. “Research Questions, Hypotheses and Objectives.” Canadian Journal of Surgery 53 (4): 278.
Finkelstein, Amy, Sarah Taubman, Bill Wright, Mira Bernstein, Jonathan Gruber, Joseph P Newhouse, Heidi Allen, Katherine Baicker, and Oregon Health Study Group. 2012. “The Oregon Health Insurance Experiment: Evidence from the First Year.” The Quarterly Journal of Economics 127 (3): 1057–1106.
Firke, Sam. 2020. Janitor: Simple Tools for Examining and Cleaning Dirty Data. https://CRAN.R-project.org/package=janitor.
Fisher, Ronald. 1935. The Design of Experiments. Oliver & Boyd.
Fitts, Alexis Sobel. 2014. “The King of Content: How Upworthy Aims to Alter the Web, and Could End up Altering the World.” Columbia Journalism Review. https://archives.cjr.org/feature/the_king_of_content.php.
Foer, Joshua. 2011. Moonwalking with Einstein. Penguin Books.
Forster, E M. 1927. Aspects of the Novel. Edward Arnold.
Franklin, Laura R. 2005. “Exploratory Experiments.” Philosophy of Science 72 (5): 888–99.
Friedman, Jerome H., Robert Tibshirani, and Trevor Hastie. 2009. The Elements of Statistical Learning. Springer.
Gagolewski, Marek. 2020. R Package Stringi: Character String Processing Facilities. http://www.gagolewski.com/software/stringi/.
Galef, Julia. 2020. “Episode 248: Are Democrats Being Irrational? (David Shor).” Rationally Speaking, December. http://rationallyspeakingpodcast.org/248-are-democrats-being-irrational-david-shor/.
Gao, Yuxiang, Lauren Kennedy, Daniel Simpson, Andrew Gelman, and others. 2021. “Improving Multilevel Regression and Poststratification with Structured Priors.” Bayesian Analysis.
Gebru, Timnit, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford. 2020. “Datasheets for Datasets.” http://arxiv.org/abs/1803.09010.
Gelfand, Sharla. 2020. Opendatatoronto: Access the City of Toronto Open Data Portal. https://CRAN.R-project.org/package=opendatatoronto.
Gelman, Andrew. 2016. “What Has Happened down Here Is the Winds Have Changed.” https://statmodeling.stat.columbia.edu/2016/09/21/what-has-happened-down-here-is-the-winds-have-changed/.
———. 2020. “Statistical Models of Election Outcomes.” YouTube, August. https://youtu.be/7gjDnrbLQ4k.
Gelman, Andrew, John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Donald Rubin. 2014. Bayesian Data Analysis. 3rd ed. CRC Press.
Gelman, Andrew, and Jennifer Hill. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models.
Gelman, Andrew, Jennifer Hill, and Aki Vehtari. 2020. Regression and Other Stories. Cambridge University Press.
Gelman, Andrew, and Eric Loken. 2013. “The Garden of Forking Paths: Why Multiple Comparisons Can Be a Problem, Even When There Is No ‘Fishing Expedition’ or ‘p-Hacking’ and the Research Hypothesis Was Posited Ahead of Time.” Department of Statistics, Columbia University 348.
Gelman, Andrew, and Aki Vehtari. 2020. “What Are the Most Important Statistical Ideas of the Past 50 Years?” arXiv Preprint arXiv:2012.00174.
Gerber, Alan, and Donald Green. 2012. Field Experiments: Design, Analysis, and Interpretation. W W Norton.
Gertler, Paul J, Sebastian Martinez, Patrick Premand, Laura B Rawlings, and Christel MJ Vermeersch. 2016. Impact Evaluation in Practice. The World Bank.
Ghitza, Yair, and Andrew Gelman. 2020. “Voter Registration Databases and MRP: Toward the Use of Large-Scale Databases in Public Opinion Research.” Political Analysis 28 (4): 507–31.
Goodrich, Ben, Jonah Gabry, Imad Ali, and Sam Brilleman. 2020. “Rstanarm: Bayesian Applied Regression Modeling via Stan.” https://mc-stan.org/rstanarm.
Greenland, Sander, Stephen J Senn, Kenneth J Rothman, John B Carlin, Charles Poole, Steven N Goodman, and Douglas G Altman. 2016. “Statistical Tests, p Values, Confidence Intervals, and Power: A Guide to Misinterpretations.” European Journal of Epidemiology 31 (4): 337–50.
Griffiths, Thomas, and Mark Steyvers. 2004. “Finding Scientific Topics.” PNAS 101: 5228–35.
Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times Made Easy with lubridate.” Journal of Statistical Software 40 (3): 1–25. http://www.jstatsoft.org/v40/i03/.
Grün, Bettina, and Kurt Hornik. 2011. “topicmodels: An R Package for Fitting Topic Models.” Journal of Statistical Software 40 (13): 1–30. https://doi.org/10.18637/jss.v040.i13.
Halberstam, David. 1972. The Best and the Brightest. Random House.
Hamming, Richard W. 1996. The Art of Doing Science and Engineering. Stripe Press.
Hanretty, Chris. 2020. “An Introduction to Multilevel Regression and Post-Stratification for Estimating Constituency Opinion.” Political Studies Review 18 (4): 630–45.
Hao, Karen. 2019. “This Is How AI Bias Really Happens—and Why It’s So Hard to Fix.” MIT Technology Review.
Hart, Edmund M, Pauline Barmby, David LeBauer, François Michonneau, Sarah Mount, Patrick Mulrooney, Timothée Poisot, Kara H Woo, Naupaka B Zimmerman, and Jeffrey W Hollister. 2016. “Ten Simple Rules for Digital Data Storage.” Public Library of Science San Francisco, CA USA.
Hastie, Trevor J, and Robert J Tibshirani. 1990. Generalized Additive Models. Vol. 43. CRC press.
Healy, Kieran. 2018. Data Visualization. Princeton University Press.
Henry, Lionel, and Hadley Wickham. 2020. Purrr: Functional Programming Tools. https://CRAN.R-project.org/package=purrr.
Hernán, Miguel A, and James M Robins. 2020. What If. CRC Press.
Hugh-Jones, David. 2020. Huxtable: Easily Create and Style Tables for LaTeX, HTML and Other Formats. https://CRAN.R-project.org/package=huxtable.
Hulley, Stephen B. 2007. Designing Clinical Research. Lippincott Williams & Wilkins.
Hullman, Jessica, and Andrew Gelman. 2021. “Designing for Interactive Exploratory Data Analysis Requires Theories of Graphical Inference.” Harvard Data Science Review, July. https://doi.org/10.1162/99608f92.3ab8a587.
Huntington-Klein, Nick. 2021. The Effect: An Introduction to Research Design and Causality. Chapman & Hall.
Huntington-Klein, Nick, Andreu Arenas, Emily Beam, Marco Bertoni, Jeffrey R Bloem, Pralhad Burli, Naibin Chen, et al. 2021. “The Influence of Hidden Researcher Decisions in Applied Microeconomics.” Economic Inquiry.
Huntington-Klein, Nick, Andreu Arenas, Emily Beam, Marco Bertoni, Jeffrey Bloem, Pralhad H Burli, Naibin Chen, et al. 2020. “The Influence of Hidden Researcher Decisions in Applied Microeconomics.”
Iannone, Richard. 2020. DiagrammeR: Graph/Network Visualization. https://CRAN.R-project.org/package=DiagrammeR.
Iannone, Richard, Joe Cheng, and Barret Schloerke. 2020. Gt: Easily Create Presentation-Ready Display Tables. https://CRAN.R-project.org/package=gt.
Igelström, Erik. 2020. “Causal Graphs in R with DiagrammeR.” https://www.erikigelstrom.com/articles/causal-graphs-in-r-with-diagrammer/.
Ilyas, Ihab F, and Xu Chu. 2019. Data Cleaning. Morgan & Claypool.
Imai, Kosuke. 2017. Quantitative Social Science. Princeton University Press.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2017. An Introduction to Statistical Learning with Applications in R.
Johnson, Alicia A., Miles Ott, and Mine Dogucu. 2022. Bayes Rules! An Introduction to Bayesian Modeling with R. CRC Press.
Jones, Arnold HM. 1953. “Census Records of the Later Roman Empire.” The Journal of Roman Studies 43 (1-2): 49–64.
Jordan, Michael I. 2019. “Artificial Intelligence—the Revolution Hasn’t Happened Yet.” Harvard Data Science Review 1 (1). https://doi.org/10.1162/99608f92.f06c6e61.
Kahle, David, and Hadley Wickham. 2013. “Ggmap: Spatial Visualization with Ggplot2.” The R Journal 5 (1): 144–61. http://journal.r-project.org/archive/2013-1/kahle-wickham.pdf.
Kahneman, Daniel, Olivier Sibony, and Cass Sunstein. 2021. Noise: A Flaw in Human Judgment. William Collins.
Kastellec, Jonathan, Jeffrey Lax, and Justin Phillips. 2016. “Estimating State Public Opinion with Multi-Level Regression and Poststratification Using R.”
Kay, Matthew. 2020. tidybayes: Tidy Data and Geoms for Bayesian Models. https://doi.org/10.5281/zenodo.1308151.
Kearney, Michael W. 2019. “Rtweet: Collecting and Analyzing Twitter Data.” Journal of Open Source Software 4 (42): 1829. https://doi.org/10.21105/joss.01829.
Kennedy, Lauren, and Jonah Gabry. 2020. “MRP with Rstanarm.” https://mc-stan.org/rstanarm/articles/mrp.html.
Kennedy, Lauren, and Andrew Gelman. 2020. “Know Your Population and Know Your Model: Using Model-Based Regression and Poststratification to Generalize Findings Beyond the Observed Sample.” http://arxiv.org/abs/1906.11323.
Kennedy, Lauren, Katharine Khanna, Daniel Simpson, and Andrew Gelman. 2020. “Using Sex and Gender in Survey Adjustment.” http://arxiv.org/abs/2009.14401.
Kenny, Christopher T., Shiro Kuriwaki, Cory McCartan, Evan Rosenman, Tyler Simko, and Kosuke Imai. 2021. “The Impact of the U.S. Census Disclosure Avoidance System on Redistricting and Voting Rights Analysis.” http://arxiv.org/abs/2105.14197.
Keyes, Os. 2019. “Counting the Countless.” Real Life. https://reallifemag.com/counting-the-countless/.
Kimmerer, Robin Wall. 2012. Braiding Sweetgrass. Milkweed Editions.
King, Stephen. 2000. On Writing: A Memoir of the Craft. Scribner.
Kohavi, Ron, Diane Tang, and Ya Xu. 2020. Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing. Cambridge University Press.
Kross, Sean. 2021. Postcards: Create Beautiful, Simple Personal Websites. https://CRAN.R-project.org/package=postcards.
Kuhn, Max, and Hadley Wickham. 2020. Tidymodels: A Collection of Packages for Modeling and Machine Learning Using Tidyverse Principles. https://www.tidymodels.org.
Kuriwaki, Shiro. 2020. “Defining Custom Functions in R.” Vimeo, February. https://vimeo.com/388825332.
Lauderdale, Benjamin E, Delia Bailey, Jack Blumenau, and Douglas Rivers. 2020. “Model-Based Pre-Election Polling for National and Sub-National Outcomes in the US and UK.” International Journal of Forecasting 36 (2): 399–413.
Lazear, Edward P. 2000. “Economic Imperialism.” The Quarterly Journal of Economics 115 (1): 99–146.
Lee, Benjamin D. 2018. “Ten Simple Rules for Documenting Scientific Software.” Public Library of Science San Francisco, CA USA.
Leek, Jeff, and Roger D. Peng. 2020. “Advanced Data Science 2020.” http://jtleek.com/ads2020/index.html.
Leeper, Thomas J. 2018. Tabulizer: Bindings for Tabula PDF Table Extractor Library.
Levitt, Steven D. 1997. “Using Electoral Cycles in Police Hiring to Estimate the Effect of Police on Crime.” The American Economic Review 87 (3).
———. 2002. “Using Electoral Cycles in Police Hiring to Estimate the Effects of Police on Crime: Reply.” American Economic Review 92 (4): 1244–50.
Lin, Sarah, Ibraheem Ali, and Greg Wilson. 2020. “Ten Quick Tips for Making Things Findable.” PLoS Computational Biology 16 (12): e1008469.
Locke, Steph, and Lucy D’Agostino McGowan. 2018. datasauRus: Datasets from the Datasaurus Dozen. https://CRAN.R-project.org/package=datasauRus.
Lohr, Sharon L. 2019. Sampling: Design and Analysis. CRC Press.
Lopp, Sean. 2017. “R for Enterprise: Understanding R’s Startup.” R Views. https://rviews.rstudio.com/2017/04/19/r-for-enterprise-understanding-r-s-startup/.
Luebke, David Martin, and Sybil Milton. 1994. “Locating the Victim: An Overview of Census-Taking, Tabulation Technology, and Persecution in Nazi Germany.” IEEE Annals of the History of Computing 16 (3): 25.
Lumley, Thomas. 2020. “Survey: Analysis of Complex Survey Samples.”
Lüdecke, Daniel, Dominique Makowski, Philip Waggoner, and Indrajeet Patil. 2020. “Performance: Assessment of Regression Models Performance.” CRAN. https://doi.org/10.5281/zenodo.3952174.
MacDorman, Marian F, and Eugene Declercq. 2018. “The Failure of United States Maternal Mortality Reporting and Its Impact on Women’s Lives.” Birth (Berkeley, Calif.) 45 (2): 105.
Martinez, Luis R. 2019. “How Much Should We Trust the Dictator’s GDP Growth Estimates?” Available at SSRN 3093296.
Matias, J. Nathan, Kevin Munger, Marianne Aubin Le Quere, and Charles Ebersole. 2019. “The Upworthy Research Archive.” https://upworthy.natematias.com.
Mattson, Greggor. 2017. “Artificial Intelligence Discovers Gayface. Sigh.” https://greggormattson.com/2017/09/09/artificial-intelligence-discovers-gayface/amp/.
McCrary, Justin. 2002. “Using Electoral Cycles in Police Hiring to Estimate the Effect of Police on Crime: Comment.” American Economic Review 92 (4): 1236–43.
McElreath, Richard. 2020. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. CRC Press.
McQuire, Scott. 2019. “One Map to Rule Them All? Google Maps as Digital Technical Object.” Communication and the Public 4 (2): 150–65.
Meng, Xiao-Li. 2018. “Statistical Paradises and Paradoxes in Big Data (I): Law of Large Populations, Big Data Paradox, and the 2016 US Presidential Election.” The Annals of Applied Statistics 12 (2): 685–726.
Michener, William K. 2015. “Ten Simple Rules for Creating a Good Data Management Plan.” PLoS Computational Biology 11 (10): e1004525.
Mitchell, Margaret, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. 2019. “Model Cards for Model Reporting.” Proceedings of the Conference on Fairness, Accountability, and Transparency, January. https://doi.org/10.1145/3287560.3287596.
Mock, Thomas. 2020. Building a Blog with Distill.
Müller, Kirill. 2017a. Here: A Simpler Way to Find Your Files. https://CRAN.R-project.org/package=here.
———. 2017b. Here: A Simpler Way to Find Your Files. https://CRAN.R-project.org/package=here.
Nelder, John Ashworth, and Robert WM Wedderburn. 1972. “Generalized Linear Models.” Journal of the Royal Statistical Society: Series A (General) 135 (3): 370–84.
Obermeyer, Ziad, Brian Powers, Christine Vogeli, and Sendhil Mullainathan. 2019. “Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations.” Science 366 (6464): 447–53.
Ooms, Jeroen. 2019a. Pdftools: Text Extraction, Rendering and Converting of PDF Documents. https://CRAN.R-project.org/package=pdftools.
———. 2019b. Pdftools: Text Extraction, Rendering and Converting of PDF Documents. https://CRAN.R-project.org/package=pdftools.
———. 2019c. Tesseract: Open Source OCR Engine. https://CRAN.R-project.org/package=tesseract.
Oostrom, Tamar. 2021. “Funding of Clinical Trials and Reported Drug Efficacy.” https://drive.google.com/file/d/1EQLCH0ns99IxYBkxPNbagcZtGgE9a8MQ/view.
Oreopoulos, Philip, and Uros Petronijevic. 2018. “Student Coaching: How Far Can Technology Go?” Journal of Human Resources 53 (2): 299–329. https://doi.org/10.3368/jhr.53.2.1216-8439R.
Oxford-AstraZeneca. 2020. “AZD1222 Vaccine Met Primary Efficacy Endpoint in Preventing COVID-19.” https://www.astrazeneca.com/media-centre/press-releases/2020/azd1222hlr.html.
Pavlik, Kaylin. 2019. “Understanding + Classifying Genres Using Spotify Audio Features.” https://www.kaylinpavlik.com/classifying-songs-genres/.
Pedersen, Thomas Lin. 2020. Patchwork: The Composer of Plots. https://CRAN.R-project.org/package=patchwork.
Pitman, Jim. 1993. Probability.
Presmanes Hill, Alison. 2021a. M-F-E-O: postcards + distill. https://alison.rbind.io/post/2020-12-22-postcards-distill/.
———. 2021b. Up & Running with Blogdown in 2021.
R Core Team. 2020. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Register, Yim. 2020. “Data Science Ethics in 6 Minutes.” https://youtu.be/mA4gypAiRYU.
Rilke, Rainer Maria. 1929. Letters to a Young Poet.
Robinson, David, Alex Hayes, and Simon Couch. 2020. Broom: Convert Statistical Objects into Tidy Tibbles. https://CRAN.R-project.org/package=broom.
Robinson, Emily, and Jacqueline Nolis. 2020. Build a Career in Data Science. https://livebook.manning.com/book/build-a-career-in-data-science?origin=product-look-inside.
Rockoff, Hugh. 2019. “On the Controversies Behind the Origins of the Federal Economic Statistics.” Journal of Economic Perspectives 33 (1): 147–64.
Ruggles, Steven, Catherine Fitch, Diana Magnuson, and Jonathan Schroeder. 2019. “Differential Privacy and Census Data: Implications for Social and Economic Research.” In AEA Papers and Proceedings, 109:403–8.
Salganik, Matthew. 2018. Bit by Bit: Social Research in the Digital Age. Princeton University Press.
Schloerke, Barret, JJ Allaire, Barbara Borges, and Garrick Aden-Buie. 2021. Learnr: Interactive Tutorials for R.
Schloerke, Barret, and Jeff Allen. 2021. Plumber: An API Generator for R. https://CRAN.R-project.org/package=plumber.
Shetty, Shravya, and Daniel Tse. 2020. Using AI to Improve Breast Cancer Screening. https://blog.google/technology/health/improving-breast-cancer-screening/.
Si, Yajuan. 2020. “On the Use of Auxiliary Variables in Multilevel Regression and Poststratification.” http://arxiv.org/abs/2011.00360.
Silberzahn, Raphael, Eric L Uhlmann, Daniel P Martin, Pasquale Anselmi, Frederik Aust, Eli Awtrey, Štěpán Bahník, et al. 2018. “Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results.” Advances in Methods and Practices in Psychological Science 1 (3): 337–56.
Silge, Julia. 2018. Text Classification with Tidy Data Principles. https://juliasilge.com/blog/tidy-text-classification/.
Simpson, Dan. 2017. “It Seemed to Me That Most Destruction Was Being Done by Those Who Could Not Choose Between the Two.” https://statmodeling.stat.columbia.edu/2017/09/12/seemed-destruction-done-not-choose-two/.
Simpson, Edward H. 1951. “The Interpretation of Interaction in Contingency Tables.” Journal of the Royal Statistical Society: Series B (Methodological) 13 (2): 238–41.
Sjoberg, Daniel D., Michael Curry, Margie Hannum, Joseph Larmarange, Karissa Whiting, and Emily C. Zabor. 2021. Gtsummary: Presentation-Ready Data Summary and Analytic Result Tables. https://CRAN.R-project.org/package=gtsummary.
Slowikowski, Kamil. 2021. Ggrepel: Automatically Position Non-Overlapping Text Labels with ’Ggplot2’. https://CRAN.R-project.org/package=ggrepel.
Staniak, Mateusz, and Przemyslaw Biecek. 2019. “The Landscape of R Packages for Automated Exploratory Data Analysis.” arXiv Preprint arXiv:1904.02101.
Statistics Canada. 2017. “Guide to the Census of Population, 2016.” Statistics Canada. https://www12.statcan.gc.ca/census-recensement/2016/ref/98-304/98-304-x2016001-eng.pdf.
———. 2020. “Sex at Birth and Gender: Technical Report on Changes for the 2021 Census.” Statistics Canada. https://www12.statcan.gc.ca/census-recensement/2021/ref/98-20-0002/982000022020002-eng.pdf.
Steyvers, Mark, and Tom Griffiths. 2006. “Probabilistic Topic Models.” In Latent Semantic Analysis: A Road to Meaning, edited by T. Landauer, D McNamara, S. Dennis, and W. Kintsch.
Stock, James H, and Francesco Trebbi. 2003. “Retrospectives: Who Invented Instrumental Variable Regression?” Journal of Economic Perspectives 17 (3): 177–94.
Suriyakumar, Vinith M., Nicolas Papernot, Anna Goldenberg, and Marzyeh Ghassemi. 2020. “Chasing Your Long Tails: Differentially Private Prediction in Health Care Settings.” http://arxiv.org/abs/2010.06667.
Taback, Nathan. 2020. Design of Experiments and Observational Studies. https://scidesign.github.io/designbook/.
Taddy, Matt. 2019. Business Data Science. McGraw Hill.
Thompson, Charlie, Josiah Parry, Donal Phipps, and Tom Wolff. 2020. Spotifyr: R Wrapper for the ’Spotify’ Web API. http://github.com/charlie86/spotifyr.
Thornhill, John. 2021. “Lunch with the FT: Mathematician Hannah Fry.” Financial Times.
Tierney, Nicholas. 2017. “Visdat: Visualising Whole Data Frames.” JOSS 2 (16): 355. https://doi.org/10.21105/joss.00355.
Tukey, John W. 1962. “The Future of Data Analysis.” The Annals of Mathematical Statistics 33 (1): 1–67.
Van den Broeck, Jan, Solveig Argeseanu Cunningham, Roger Eeckels, and Kobus Herbst. 2005. “Data Cleaning: Detecting, Diagnosing, and Editing Data Abnormalities.” PLoS Medicine 2 (10): e267.
von Bergmann, Jens, Dmitry Shkolnik, and Aaron Jacobs. 2021. Cancensus: R Package to Access, Retrieve, and Work with Canadian Census Data and Geography. https://mountainmath.github.io/cancensus/.
Wang, Wei, David Rothschild, Sharad Goel, and Andrew Gelman. 2015. “Forecasting Elections with Non-Representative Polls.” International Journal of Forecasting 31 (3): 980–91.
Ware, James. 1989. “Investigating Therapies of Potentially Great Benefit: ECMO.” Statistical Science, no. 4: 298–306.
Wasserman, Larry. 2005. All of Statistics. Springer.
Weissgerber, Tracey L, Natasa M Milic, Stacey J Winham, and Vesna D Garovic. 2015. “Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm.” PLoS Biology 13 (4): e1002128.
Whitby, Andrew. 2020. The Sum of the People. Basic Books.
WHO. 2019. “Trends in Maternal Mortality 2000 to 2017: Estimates by WHO, UNICEF, UNFPA, World Bank Group and the United Nations Population Division.” https://www.who.int/reproductivehealth/publications/maternal-mortality-2000-2017/en/.
Wickham, Hadley. 2010. “A Layered Grammar of Graphics.” Journal of Computational and Graphical Statistics 19 (1): 3–28.
———. 2014. “Tidy Data.” Journal of Statistical Software 59 (1): 1–23.
———. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.
———. 2017. Tidyverse: Easily Install and Load the ’Tidyverse’. https://CRAN.R-project.org/package=tidyverse.
———. 2019a. Advanced R. CRC Press.
———. 2019b. Babynames: US Baby Names 1880-2017. https://CRAN.R-project.org/package=babynames.
———. 2019c. Httr: Tools for Working with URLs and HTTP. https://CRAN.R-project.org/package=httr.
———. 2019d. Rvest: Easily Harvest (Scrape) Web Pages. https://CRAN.R-project.org/package=rvest.
———. 2019e. Stringr: Simple, Consistent Wrappers for Common String Operations. https://CRAN.R-project.org/package=stringr.
———. 2020a. Forcats: Tools for Working with Categorical Variables (Factors). https://CRAN.R-project.org/package=forcats.
———. 2020b. Tidyverse. https://www.tidyverse.org/.
———. 2021. Tidyr: Tidy Messy Data. https://CRAN.R-project.org/package=tidyr.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019a. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
———, et al. 2019b. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
Wickham, Hadley, and Jennifer Bryan. 2020. Usethis: Automate Package and Project Setup. https://CRAN.R-project.org/package=usethis.
Wickham, Hadley, Romain François, Lionel Henry, and Kirill Müller. 2020. Dplyr: A Grammar of Data Manipulation. https://CRAN.R-project.org/package=dplyr.
Wickham, Hadley, and Garrett Grolemund. 2017. R for Data Science. https://r4ds.had.co.nz/.
Wickham, Hadley, Jim Hester, and Winston Chang. 2020. Devtools: Tools to Make Developing R Packages Easier. https://CRAN.R-project.org/package=devtools.
Wickham, Hadley, and Evan Miller. 2020. Haven: Import and Export ’SPSS’, ’Stata’ and ’SAS’ Files. https://CRAN.R-project.org/package=haven.
Wiessner, Polly W. 2014. “Embers of Society: Firelight Talk Among the Ju/’Hoansi Bushmen.” Proceedings of the National Academy of Sciences 111 (39): 14027–35.
Wilkinson, Mark D, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, et al. 2016. “The FAIR Guiding Principles for Scientific Data Management and Stewardship.” Scientific Data 3 (1): 1–9.
Wilson, Greg. 2021. Building Software Together. CRC Books.
Wilson, Greg, Jennifer Bryan, Karen Cranston, Justin Kitzes, Lex Nederbragt, and Tracy K Teal. 2017. “Good Enough Practices in Scientific Computing.” PLOS Computational Biology 13 (6): 1–20. https://doi.org/10.1371/journal.pcbi.1005510.
Wright, Philip G. 1928. The Tariff on Animal and Vegetable Oils. Macmillan Company.
Wu, Changbao, and Mary E Thompson. 2020. Sampling Theory and Practice. Springer.
Xie, Yihui, Christophe Dervieux, and Alison Presmanes Hill. 2021. Blogdown: Create Blogs and Websites with R Markdown. https://github.com/rstudio/blogdown.
Xie, Yihui, Amber Thomas, and Alison Presmanes Hill. 2021. Blogdown: Creating Websites with R Markdown.
Zhu, Hao. 2020. kableExtra: Construct Complex Table with ’Kable’ and Pipe Syntax. https://CRAN.R-project.org/package=kableExtra.
Zook, Matthew, Solon Barocas, Danah Boyd, Kate Crawford, Emily Keller, Seeta Peña Gangadharan, Alyssa Goodman, et al. 2017. “Ten Simple Rules for Responsible Big Data Research.” Public Library of Science San Francisco, CA USA.