Aug 22 2024

AI Humor

It’s been less than two years (November 2022) since ChatGPT launched. In some ways these new large language model (LLM) artificial intelligence (AI) applications have been on the steep part of the improvement curve. And yet, they are still LLMs with the same limitations. Over the last two years I have frequently used ChatGPT and other AI applications, and I often give them tasks just to see how they are doing.

For a quick review, LLMs are AIs trained on vast amounts of information from the internet. They essentially predict the next word chunk in order to build natural-sounding responses to queries. Their responses therefore represent a sort of zeitgeist of the internet, building on what is already out there. Responses are therefore necessarily derivative, but can contain unique combinations of information. This has led to a so-far endless debate about how truly creative LLMs can be, or whether they are just stealing and regurgitating content from human creators.
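To make that “predict the next chunk” idea concrete, here is a toy sketch in Python. It is not how a real LLM works internally (a real model uses a neural network over sub-word tokens and billions of parameters); the tiny probability table is invented purely for illustration, but the greedy pick-the-most-likely-next-word loop is the basic shape of the process.

# Toy illustration of next-word prediction. A real LLM uses a neural
# network over sub-word tokens; this invented bigram table just shows
# the "pick the most probable next chunk" loop in miniature.

# Made-up probabilities: for each word, the likely words that follow it.
bigram_probs = {
    "why": {"do": 0.6, "did": 0.4},
    "do": {"biologists": 1.0},
    "biologists": {"look": 1.0},
    "look": {"forward": 1.0},
    "forward": {"to": 1.0},
    "to": {"casual": 1.0},
    "casual": {"fridays": 1.0},
}

def generate(prompt_word, max_words=8):
    words = [prompt_word]
    while len(words) < max_words and words[-1] in bigram_probs:
        options = bigram_probs[words[-1]]
        # Greedy decoding: always take the most probable next word.
        next_word = max(options, key=options.get)
        words.append(next_word)
    return " ".join(words)

print(generate("why"))  # -> "why do biologists look forward to casual fridays"

The point of the sketch is that nothing in the loop “understands” biologists or Fridays; it only follows the statistics of what usually comes next, which is why the output tends toward the most well-worn phrasing.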

What I am finding is that LLMs are getting better at doing what they do, but have not broken out of the limitations of this regurgitation model. Here is a good example from the New York Times – an author (Curtis Sittenfeld) wrote a short story based on the same prompt given to ChatGPT, and both were published to see if readers could tell the difference. For me, it was obvious right away which story was the AI’s. The author’s story was interesting and engaging. ChatGPT’s story bored me before the end of the first paragraph. It was soulless and mechanical. It reminded me of a bad story written by a freshman in high school. It got the job done, and used some tired and predictable literary devices, but it failed to engage the reader and lacked any sense of taking the reader on an emotional journey.

This reinforced for me what I suspected from my own interactions – LLMs are getting better at being LLMs, but have not broken out of their fundamental limitations.

Several people I know have observed that a good test for LLMs will be whether AI can tell a funny joke. This is the subject of this BBC article, which put that very idea to the test, with similar results. Here are some tests of my own, using ChatGPT 4. With the prompt, “Tell me a funny joke about scientists,” I got: “Why do biologists look forward to casual Fridays? Because they’re allowed to wear genes to work!” We are at cringey dad-joke level. I pushed it for something funnier and more creative and got: “Why did the physicist stop dating the biologist? Every time they got close, the biologist kept saying, ‘I need my space!’” That was even worse, plus it mixed up the punchline – the physicist, not the biologist, is the one who should need space.

You get the same results with poetry or storytelling – tired, predictable, formulaic.

For now, I don’t think creative producers have much to worry about, at least not at the high end. Where these creative AIs are useful is at the low to medium end of quality, where predictable and “just gets the job done” is adequate. AI can help generate ideas, or it can produce low-quality stuff for personal use. You will notice I sometimes use AI art for my blogs. The advantage is they are copyright free, and they can be exactly what I need. I don’t need high quality. For my purposes, getting the job done is perfect.

The “downside” is that the world will be flooded with mediocre art. I think we can survive this, however. It doesn’t really worry me. It may even push some people to get better than the mediocre level of AI, thereby raising the bar. I wonder if, going forward, AIs will push humans to be better in general, or if it will just make us lazy (probably both).

What ChatGPT and other LLMs are really good at is not being creative but doing predictable tedious tasks. For me personally, I use ChatGPT to augment my Google searches (beyond the AI results that Google already builds in). There are times when I am just not getting the results I want from Google. The results are overwhelmed with either recent news, commercial content, or pseudoscience. But a carefully worded prompt in ChatGPT can get me quickly to exactly the content I want and direct me to the references I need. There is still room for improvement here, but the results so far are great.

I also know people who use ChatGPT to draw up legal documents. These are supposed to be technical and boring, and copying boilerplate is actually what you want. I wouldn’t rely on this for anything serious, but if you need basic contracts for everyday work, it’s fine. I also know people who use LLMs to write computer code, and report that it can save them hours of work.

What these two examples also reflect is that LLMs can be used as expert helpers. A computer programmer can use LLM-generated code that they can understand, test, and tweak. A lawyer can use an LLM to generate a legal document that they can also read and correct.
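As a hypothetical sketch of that workflow (the function below simply stands in for whatever code an LLM might produce; it is not from any particular tool), a programmer can wrap the generated code in quick tests before trusting it, and tweak it if anything fails:

# Hypothetical example of checking LLM-generated code before using it.
# Suppose the LLM was asked for a function that normalizes email addresses.
def normalize_email(address: str) -> str:
    # (Pretend this body came from the LLM.)
    return address.strip().lower()

# The programmer, who understands what the code should do, writes quick
# checks to confirm it behaves as expected.
assert normalize_email("  Jane.Doe@Example.COM ") == "jane.doe@example.com"
assert normalize_email("bob@site.org") == "bob@site.org"
print("generated code passed the checks")

The expertise is still doing the real work – deciding what the code should do and verifying that it does it – which is exactly the point about LLMs being tools in expert hands rather than replacements for them.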

The bottom line of all this is that I do not think that LLMs are at the point where they can replace either experts or creative content producers. However, they do have three very useful applications: They can be used to generate mediocre content for fun and personal use. They can be used to improve and automate tedious tasks. And they can be used as tools in the hands of experts to improve their efficiency and quality. They are also getting better at these three types of tasks.

But they have not gotten any closer, in my opinion, to replacing true human creativity.

The really interesting question is this – will LLMs get to the point where they produce creative output at a human level, or will something fundamentally different be necessary? I am leaning toward the latter. The deeper question is – what is true creativity? Can an algorithm produce content that does not seem algorithmic? Will more complex algorithms suffice, meaning that getting beyond a certain critical point of complexity will produce the illusion of true creativity? Or are the creative processes happening in the human brain beyond the LLM approach?

So far I have not seen any evidence to convince me that LLMs are capable of true creativity. But they remain technically very useful. Perhaps it’s not fair to judge them by what they are worst at; better to simply use them for what they are best at.
