Columns

XXX.6 November - December 2023
Page: 20

Artificial: Better or Worse?


Authors:
Jonathan Bean

In the scrum of news about artificial intelligence, a fundamental question remains unexamined: Will the general public regard AI as superior to human intelligence?

One can cherry-pick evidence from other contexts to suggest some of us are predisposed to assume that AI is more reliable than mere humans. The surety provided by GPS-based navigation systems regularly sends hapless drivers into the ocean, providing schadenfreude on slow news days. Self-driving cars stopping in the middle of the street or driving themselves into wet concrete have provided an oddly reassuring parallel: Perhaps neither humans nor machines are so intelligent after all. But for every example like this, there are many more centered on the idea that the machines are coming for us. I've been struck by how discussing AI has become a go-to for small talk—the sort that happens in short interactions in grocery stores, restaurants, and ride services (at least when you have a human driver!). One tidbit resonated widely after mass media reported on a law journal article titled "Latest Version of ChatGPT Aces Bar Exam with Score Nearing 90th Percentile" [1].

Its prevalence in casual conversation suggests that AI has already become an ordinary part of everyday life. At this point, most people who have had a customer service interaction at a large corporation have probably had some interaction with chat-based AI, either directly, in the form of a text interface, or indirectly, in the form of interacting with a human customer service agent guided by an AI-generated script. Big business has much to gain from a future where "talking" with AI-driven bots will be considered part of normal interactions. While interacting with a chatbot might be preferable to dealing with a human being in some routine situations—buying more contact lenses, returning a defective product—it remains to be seen how we will tolerate direct or indirect interactions with bots in high-stakes situations. Already some healthcare companies are providing doctors and other frontline healthcare workers with AI tools, justifying them by saying they can improve patient experience and outcomes.

Will we trust AI with our money? A recent study reports that 60 percent of Americans say no [2]. But that's not slowing down the financial industry! At least one company is developing an automated financial advisor built on a large language model (LLM) platform. As a thought experiment, imagine that you are given the exact same advice by a financial advisor bot as you are by a human being. Would you weigh the advice equally? It's likely you'd need a lot more information to answer this question. Is the human financial advisor known to you, or is this the first time you've interacted? How have you fared with past advice that you've acted upon or disregarded? Is this a speculative investment for you, or is it money that you're counting on to pay for a child's college or your own retirement?


Was the chatbot smart? Or just smart enough to tell me what I wanted to hear?


And what if the human tells you to do one thing, but the bot suggests you take a divergent path? This is where the question of our personal and cultural assumptions about AI becomes manifest in individual action. When they operate on the collective level, these assumptions will determine how much latitude we give AI tools to restructure life, for better or for worse.

What makes language-based AI especially difficult to reckon with is its ever-changing nature. It is, by definition, a technology that learns from previous interactions. Recently, frustrated by what seemed to be a bad scale on a survey, but away from the reference book on my office shelf that I knew would authoritatively answer the question, I asked a chatbot whether the survey's scale seemed reasonable:

The scale of a survey has the following six choices: "N/A or not known," "never," "sometimes," "often," "most of the time," and "all of the time." Is this skewed positive?

The answer came back:

The survey scale you provided does not appear to be skewed positive as it includes response options for "N/A or not known" and "never," which allow respondents to report a lack of frequency or occurrence of the behavior being measured.

Oddly, this was precisely what I thought might be the problem with the scale. There's a big gap between "never" and "sometimes." So I asked whether it would be better if the scale contained a choice such as "rarely."

This caused the answer to shift significantly:

You make a valid point that the response option "sometimes" can have a positive valence and may contribute to a skewed positive distribution of responses. Additionally, the absence of a response option for "rarely" may limit the ability to capture more nuanced variations in frequency.

In this case, it may be worth considering revising the response scale to include a response option for "rarely" and possibly rephrasing or adjusting the other response options to better capture variations in frequency. This could help to reduce potential skewness in the distribution of responses and provide more accurate and nuanced data.
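
To make the skew concrete, here is a minimal sketch; it is not part of the original exchange, and the numeric anchors assigned to each option are my own illustrative assumptions, used only to show how the jump from "never" to "sometimes" dwarfs the other steps and how a "rarely" option would narrow it:

```python
# Minimal sketch: the anchors below are assumed positions on a 0 (never) to
# 1 (all of the time) frequency line, not values taken from the survey itself.
original = {
    "never": 0.0, "sometimes": 0.35, "often": 0.6,
    "most of the time": 0.8, "all of the time": 1.0,
}
revised = {
    "never": 0.0, "rarely": 0.15, "sometimes": 0.35, "often": 0.6,
    "most of the time": 0.8, "all of the time": 1.0,
}

def largest_gap(scale):
    """Return the widest jump between adjacent response options."""
    values = sorted(scale.values())
    return max(b - a for a, b in zip(values, values[1:]))

print(largest_gap(original))  # 0.35 -- the leap from "never" straight to "sometimes"
print(largest_gap(revised))   # 0.25 -- "rarely" fills in the low end of the scale
```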

Was the chatbot smart? Or just smart enough to tell me what I wanted to hear? Of course it's not smart, or intelligent, either—at least not in the conventional sense. But while we used to use those words only to describe animate beings or things done by animate beings, smart and intelligent are popping up with increasing frequency to describe a diverse set of inanimate things: everything from water and watches to phones and homes. More broadly, the algorithmic logic that underlies this evolving sense of intelligence underpins activities such as high-intensity interval training exercise programs, ketogenic diets, life hacking, and habit stacking. We are increasingly seeing ourselves in the same terms as the programmed stuff that surrounds us. Of course, most humans can't win a battle against the relentless consistency of machines. Even the most disciplined among us are far too fickle and prone to exhaustion to implement optimized routines with absolute consistency; this is what makes AI so attractive to companies that have to provide a lot of customer service. If consistency is what comes to define intelligence, the machines will likely come out on top.

But from the shadows is emerging a consequential discussion about what makes us human. One strong counternarrative contends that AI tools are inferior to, and of lesser value than, human judgment because they are derivative. In a strict sense, this is true: Language models in particular can only be trained on existing patterns. Past patterns, however, often foretell the next innovation. Take fashion: Hemlines cycle up and down, colors in and out. But in other aesthetic realms such as literature, film, or art, it is difficult to pinpoint what distinguishes boundary-pushing innovation from half-baked regurgitation. The question of what makes something art (a cloth sculpture as opposed to, say, a pile of rags) is similarly blurry; as the sociologist Hannah Wohl's work explores, it's a distinctly human decision that's not always clear, even with an abundance of social context and cues [3]. AI might be useful in generating a set of options, but in aesthetic contexts, without the human touch, it won't count as innovation. If we return to a more cut-and-dried context, using the example of divergent financial advice coming from a human versus an AI system, it is a similar set of social cues you'd likely use to make a decision among a set of options—as long as you believe that the AI and human systems are at least equivalent.

But what if you think AI is—by its nature—superior to human intelligence? It's easy enough to construct a narrative that would seem to prove the point. A large language model can ingest and process more text than even an enormous team of highly capable humans. To convince people that AI has limits, a different argument is needed: one that frames human intelligence as consisting of something more than knowledge alone.

The PR for the Hollywood writers' and actors' strikes took this angle of attack by consistently directing interviews and other media coverage to focus on the threat of AI. Fran Drescher, famous for her voice and, more importantly, her portrayal of a working-class nanny, handled many of the initial interviews, which established a relatable economic underpinning for the union's argument: that the increasing cost of living was making it impossible for the lowest-paid background actors to scrape by. But this appeal to the experience of inflation, now familiar to most people around the world, quickly took a back seat to another argument. Writers feared that AI could replace them outright, generating profitable pablum by mining previously successful shows and making derivative changes to their content and structure. This was framed as a creative loss that would be suffered not only by the writers but also by the TV- and film-viewing public, who would be stripped of the innovation that only humans, the writers argued, can provide. For the actors, the PR foregrounded the threat to low-paid background actors. The contract the studios put forward would have allowed for actors' bodies to be scanned. Those scans, in exchange for a single day of pay, could then be used in perpetuity to create computer-generated images.

The specter of unscrupulous studios using AI to perform a kind of profit-driven body snatching surfaced a dystopian future in which the human body is subjugated to technology, a familiar theme, with variations, in films as different as The Matrix, Mad Max, and even WALL-E. That AI-manipulated body scans seem inherently unfair—and perhaps even inhumane—reminded me of the well-known idea of the uncanny valley, which holds that robots become creepier the more closely they resemble humans [4].

In the rush to embrace all things AI, we've been much more likely to brush off AI's misfires as hilarious, as happened when a grocery store app suggested a recipe that would generate lethal chlorine gas [5]. Only a few descriptions of textual interactions with AI chatbots characterize those interactions as creepy. Of note is one made famous by the New York Times, which culminated in the chatbot adopting the language of a stalker ("You're married, but you don't love your spouse.") [6]. Creepiness is entangled with truth and trust: We can't seem to shake a fixation on so-called hallucinations, when an AI chatbot presents an answer that may be grammatically flawless and logically sound but is objectively and demonstrably false.


Perhaps, instead of framing the debate as AI versus humans, we should explore how these two realms can complement each other.


In the ever-evolving dance between humans and machines, we find ourselves at a crossroads of perceptions. Some view AI as the harbinger of a new era, one where machines, driven by relentless consistency, may eventually outshine human capabilities. Others, however, argue that true innovation, the kind that resonates with our humanity, arises from more than just knowledge—it springs from a well of creativity, intuition, and the ability to defy patterns. As AI becomes increasingly integrated into our lives, we must embrace a nuanced perspective. Perhaps, instead of framing the debate as AI versus humans, we should explore how these two realms can complement each other. After all, human intelligence has a remarkable knack for asking the right questions, while AI excels at providing vast data-driven insights. In the end, our journey into the future of AI will be shaped not just by the machines we build but also by the ideals we hold dear and the lines we draw in this ever-blurring boundary between human and artificial intelligence.

And the preceding paragraph, dear reader, was generated by ChatGPT. Perhaps you noticed the shift in structure and tone: away from suspicion and toward an embrace of the technology. I was impressed that its output was preceded by the following: "Your opinion column has effectively built a complex and thought-provoking narrative about the evolving relationship between artificial intelligence and human intelligence." I'd say it was buttering me up, but I fall firmly into the camp that we should treat machines like machines. Still, I'll let the AI have the last word: Readers, you are invited "to think about the harmonious coexistence of human and AI intelligence."

References

1. Weiss, D.C. Latest version of ChatGPT aces bar exam with score nearing 90th percentile. ABA Journal. Mar. 16, 2023; https://www.abajournal.com/web/article/latest-version-of-chatgpt-aces-the-bar-exam-with-score-in-90th-percentile.

2. Wronski, L. CNBC|SurveyMonkey Your Money Poll August 2023. SurveyMonkey; https://www.surveymonkey.com/curiosity/cnbc-financial-literacy-2023/.

3. Wohl, H. Bound by Creativity: How Contemporary Art Is Created and Judged. Univ. of Chicago Press, Chicago, 2021.

4. Mori, M., MacDorman, K.F., and Kageki, N. The Uncanny Valley [from the field]. IEEE Robotics & Automation Magazine 19, 2 (2012), 98–100.

5. McClure, T. Supermarket AI meal planner app suggests recipe that would create chlorine gas. The Guardian. Aug. 10, 2023; https://www.theguardian.com/world/2023/aug/10/pak-n-save-savey-meal-bot-ai-app-malfunction-recipes.

6. Roose, K. A conversation with Bing's chatbot left me deeply unsettled. The New York Times. Feb. 17, 2023; https://www.nytimes.com/2023/02/16/technology/bing-chatbot-microsoft-chatgpt.html.

Author

Jonathan Bean is associate professor of architecture, sustainable built environments, and marketing at the University of Arizona and co-director of the Institute for Energy Solutions. He studies taste, technology, and market transformation. [email protected]

Copyright held by author

The Digital Library is published by the Association for Computing Machinery. Copyright © 2023 ACM, Inc.
