The limits of prediction: Some thoughts about AI
I never planned to study machine learning, AI, linear algebra (I’m not very good), or experimental methods (my 22-year-old self would detest that I know the utility of a Latin square). When I graduated high school, my plan was to be a professional water polo player (hahaha!). Even though I’m a millennial, in college, I held firmly to the GenX mentality: anti-corporate and anti-sellout. My worst fear about a job was wearing a suit and taking orders from a manager. When I applied to graduate school, I planned to be the best Willa Cather scholar in the world. No joke. The first line of my application was: “I want to be the best Willa Cather scholar in the world (too much?)”.
I got here a little by accident, which I’ve written about formally in a publication about web scraping. Web scraping in 2011 and 2012 wasn’t on most people’s minds, the way it is now with the emergence of AI writers that use scraped data. I got into scraping because I needed to solve a problem. In the summer of 2012, I was studying a Facebook group about politics. The group was interesting to me for a few reasons, including that it was supposed to be a space of critical debate rather than political bickering (Hello Habermas!). More to the point of my dissertation, I scraped the group to better understand trends within it. I ended up with 5600 posts, which was more data than I knew what to do with. No one on my dissertation committee ever asked about the data, the technique, or anything. I had some beautiful tables, if I may say so. From there, I continued scraping and I continued learning about techniques that would allow me to scale up my analysis…necessarily learning about quantitative methods along the way. (It helps to be married to an engineer, who is skeptical of non-data-related research and has helped guide me. I also live with Reviewer #2.)
The quantitative literacy I’ve acquired has taught me two things. First, it’s made me skeptical of making broad claims from a small amount of data. A few case studies can’t be used to justify policies. Second, I now see a relationship between quantitative and qualitative paradigms, something that I don’t see valued in a corporate world fixated on metrics. I see qualitative processes in quantitative approaches. Corporations and businesses want metrics in order to predict profit. That’s a very narrow (and, I’d argue, mistaken) understanding of metrics and statistics. Metrics, statistics, and predictive analytics are where the applications of machine learning (“AI”) are really put into practice.
To broaden what I mean here: as I get deeper into statistics and experimental methods, I have begun to understand that statistics are a way of understanding possibilities at different scales than a human being intuits them. If I argue that a pair of words (the building blocks of AI-generated sentences) has a likelihood of being related based on the contexts in which the words appear, I am not making a definite statement about those words (examples: dog and cat, dog and ball, cat and string). I’m not proclaiming, “these words are related,” the way I might say “a dad has a child.” I’m saying they could be. The relationship between the words might hold in some circumstances but not in others. It’s not a black-and-white answer, and that’s the beauty of predictive analytics when not applied in a corporate, business setting. Predictive analytics are a way of understanding the uncertainty of the world—in tendencies, not certainties.
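To make that concrete, here is a minimal sketch of the kind of calculation I’m describing: pointwise mutual information (PMI), one standard way to score whether a word pair co-occurs more often than chance would predict. The tiny corpus and the specific word pairs are my own invented illustration, not data from any study; the point is that the score expresses a tendency, not a verdict.

```python
from collections import Counter
from itertools import combinations
import math

# A hypothetical toy corpus; each sentence is one "context."
corpus = [
    "the dog chased the cat",
    "the dog fetched the ball",
    "the cat chased the string",
    "the dog and the cat slept",
    "the child threw the ball",
]

word_counts = Counter()
pair_counts = Counter()
for sentence in corpus:
    words = set(sentence.split())
    word_counts.update(words)
    pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))

n = len(corpus)

def pmi(w1, w2):
    """Pointwise mutual information: log of how much more often the pair
    co-occurs than independence would predict. Positive = associated,
    negative = less associated than chance. A tendency, not a certainty."""
    p_pair = pair_counts[frozenset((w1, w2))] / n
    p1, p2 = word_counts[w1] / n, word_counts[w2] / n
    return math.log2(p_pair / (p1 * p2)) if p_pair > 0 else float("-inf")

for pair in [("dog", "cat"), ("dog", "ball"), ("cat", "string")]:
    print(pair, round(pmi(*pair), 2))  # scores vary with the contexts sampled
```

Change the corpus and the scores change with it: the “relationship” between dog and ball lives in the contexts, not in the words themselves.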
This approach has given me a sense of humility. When I first started scraping, I had this naïve belief that more data would inevitably, incontrovertibly lead to better studies. I attribute this partly to my excitement and partly to the zeitgeist of the “big data” hype. (We are in the midst of an “AI” hype that mimics the “big data” hype of 2014-2019.) But as I fiddled with more data, I realized that more data also means more noise and ambiguity.
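One way to see why more data can mean more noise, as a toy simulation of my own construction (not drawn from any particular study): generate pure noise and watch the strongest apparent correlation with an unrelated outcome grow as the number of candidate variables grows.

```python
import random

random.seed(42)  # reproducible toy example

def corr(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# One "outcome" and 500 candidate "predictors," all pure noise:
# none has any real relationship to the outcome.
outcome = [random.gauss(0, 1) for _ in range(30)]
predictors = [[random.gauss(0, 1) for _ in range(30)] for _ in range(500)]

def best_spurious(k):
    """Strongest apparent correlation among the first k noise predictors."""
    return max(abs(corr(p, outcome)) for p in predictors[:k])

for k in (5, 50, 500):
    print(k, round(best_spurious(k), 2))  # the "best" correlation climbs with k
```

The more variables you scan, the stronger the best purely coincidental pattern looks, which is exactly why piling up data without a question in mind breeds ambiguity rather than clarity.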
This brings me to a weird point: there are very few predictions we can make about AI writing programs (GenAI, but I hate that term). There’s no certainty. We really have no idea what the effects of AI writing are, not because we don’t have enough studies but because there are so many empirical studies, often with contradictory results. To me, that makes sense: the who, what, when, where, and how of these studies matters. I’ve found academic studies showing that humans can accurately identify AI-written essays. I’ve found studies showing that humans cannot distinguish AI-written from human-written essays. The difference is in the details: the first study is about Old English poetry whereas the latter is about physics essays (the human essays were collected prior to GPT). The what and the who matter.
The details matter, the design of the study matters, the chosen metrics matter—all of that is qualitative, revealing the relationship between quantitative and qualitative paradigms.