Learning from AI’s Bullshit

Anyone who has used modern AI knows how unreliable it can be. It might recommend adding glue to pizza sauce to keep the cheese from sliding off, generate Shrek when asked to recreate the Mona Lisa, or give completely wrong answers to mathematical questions. While new AI models are getting better at many of these tasks, research has also found that they are increasingly willing to answer questions they get wrong.

Tech companies like to call inaccurate outputs “hallucinations”. The term frames AI’s mistakes as cases where something that usually works has gone wrong: it assumes the mistakes are defects rather than the AI acting as designed. But, as we will see shortly, contemporary AI systems are acting exactly as intended when they produce inaccurate results. Many philosophers have therefore argued that AI’s outputs should instead be understood as what Harry Frankfurt called “bullshit”.

Frankfurt argued that bullshit is a statement made without concern for the truth. To understand what he means, contrast bullshit with lying. Lying requires caring about the truth enough to deceive someone about it: when politicians lie, they want us to believe something they know to be false. Bullshit is produced with indifference to what is true or false, so when politicians bullshit instead, they might not even know what is true; they just want to come across as knowledgeable or competent.

The philosophers who call bullshit on AI point out that this lack of concern for the truth sounds an awful lot like the outputs of modern LLMs (large language models like ChatGPT, Gemini, DeepSeek, Llama, and Claude). The core technology of LLMs is a complex mathematical structure called a “transformer”. Transformers encode, across billions or even trillions of parameters, a statistical representation of language. Specifically, they represent the statistical relationships between the ordering of words in their training data, which for modern LLMs consists of a sizable percentage of the entire internet. A transformer takes an input text in which each token (roughly, each word, affix, and piece of punctuation) has been converted to a mathematical object, and runs a very complex and computationally intensive series of mathematical operations on it. The result, and here is the crucial part, is a probability distribution over what word comes next.
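To make the first step concrete, here is a minimal sketch of tokenization, assuming the Hugging Face `transformers` library and the small, publicly available GPT-2 model; both are my choices for illustration rather than anything the essay names:

```python
# A minimal sketch, assuming the Hugging Face `transformers` library and GPT-2:
# the input text is split into tokens, and each token is mapped to an integer ID,
# the "mathematical object" the transformer's equations operate on.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
token_ids = tokenizer.encode("The cheese keeps sliding off my pizza")
print(token_ids)                                   # one integer ID per token
print(tokenizer.convert_ids_to_tokens(token_ids))  # the word pieces those IDs stand for
```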

For example, if you input the incomplete sentence “I knocked my pencil off of my” to a transformer, then based on patterns in its training data it might predict the next token is “desk” with 50% probability, “table” with 10% probability, “notebook” with 5% probability, and so on, down to tokens with essentially 0% probability like “ran” and “legislate”. In other words, the core function of an LLM’s engine is to predict, from the input context, how likely each possible next word or punctuation mark is. LLMs do not think and do not reason; they simply produce probabilities for what the next word will be. When ChatGPT produces the next word in its response to your question, all it has actually done is predict, based on what you said, which words likely come next in the conversation, and then pick one of the likeliest.
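Here is a minimal sketch of that prediction step under the same assumptions (the `transformers` library and GPT-2); the actual probabilities it prints will differ from the illustrative figures above:

```python
# One forward pass: the model turns the prompt into a probability distribution
# over every token in its vocabulary; we print the five likeliest continuations.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

inputs = tokenizer("I knocked my pencil off of my", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits                      # shape: (1, sequence_length, vocab_size)

next_token_probs = torch.softmax(logits[0, -1], dim=-1)  # one probability per vocabulary token
top_probs, top_ids = next_token_probs.topk(5)
for prob, token_id in zip(top_probs, top_ids):
    print(f"{tokenizer.decode(token_id.item())!r}: {prob.item():.1%}")
```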

That’s it. LLMs like ChatGPT are essentially software-based sleight of hand. They are not intelligent agents or advanced search engines. They just predict what the next token will be, choose one of the likelier options, and then do this over and over and over. This is why LLMs move from word to word the way they do: each word that appears in a chat is another time the model has run, from scratch, an algorithm for deciding what comes next based on everything said in the conversation so far (including its own previous outputs).
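That repeated predict-then-pick loop can be sketched in a few lines, again assuming the `transformers` library and GPT-2, with plain sampling standing in for whatever selection strategy a production chatbot actually uses:

```python
# Autoregressive generation: compute a fresh distribution over everything said so
# far, pick a likely token, append it, and start over from scratch.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

generated = tokenizer("I knocked my pencil off of my", return_tensors="pt")["input_ids"]
for _ in range(10):                                     # add ten more tokens
    with torch.no_grad():
        logits = model(generated).logits                # rerun on the whole conversation so far
    probs = torch.softmax(logits[0, -1], dim=-1)        # distribution over the next token
    next_id = torch.multinomial(probs, num_samples=1)   # sample in proportion to probability
    generated = torch.cat([generated, next_id.unsqueeze(0)], dim=-1)

print(tokenizer.decode(generated[0]))
```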

Because LLMs do nothing more than predict likely next words, when they tell you what the capital of Canada is, they do not care about giving you the right answer, whether or not they happen to get it right. There is nothing there to do the caring. They are bullshitting.

The interesting epistemological question, given the ubiquity of AI’s bullshit (to say nothing of the ubiquity of bullshit more generally), is whether we can learn from bullshit. Can we end a session with an AI knowing more than when we started?

Here again, contrasting bullshit with lying is helpful. We can certainly learn from lying. Suppose my friend is obsessed with falsely convincing people that he is an insomniac. I know about this obsession and know that he always tells people he slept less than he did. If my “insomniac” friend then lies by telling me he slept three hours last night, then since I know he is lying and I know the pattern of his lies, I learn something: that he slept more than three hours last night.

This doesn’t work for bullshit. If instead I think my friend is bullshitting, has no idea how much he slept, and is just saying he slept three hours because he thinks it’s a low number, I can’t infer anything about how much he slept last night. This is because lying, unlike bullshit, has by definition some connection to the truth. Therefore, if we know what that connection between a lie and the truth is, we can infer the truth behind the lie.

We can learn from bullshit, but in a different way than we can learn from lying. Suppose you and I are in the impressionist wing of the Art Institute of Chicago. I admittedly hate impressionist art; despite my best efforts, I’ve never “gotten” it, and I find it really tedious. As we are looking at the collection of Monet’s Haystacks, I am bored but embarrassed by my boredom, and so, to sound smart and engaged, I bullshit: “the village in the background simultaneously depicts stasis and change.” Not spotting my bullshit, you take a few minutes to examine the village and realize I am right: Monet playfully keeps the nearby village out of focus, both as an ever-present constant and as something subject to the same passage of time as the haystacks. As I watch you examine the village, I grow increasingly guilty and finally admit I was bullshitting. If you had simply taken me at my word and moved on without examining the village, then at the moment I came clean you would have lost any reason to think what I said was true. But that is not what you did. You examined the paintings for yourself and saw the truth of what I said by your own lights. You are therefore well within your rights to respond to my admission of bullshit by saying “so what? I now appreciate the duality of stasis and change of the village in Monet’s Haystacks.”

This same sort of independent discovery could not have happened with the lying insomniac. When he tells me he slept three hours, we cannot go off and discover the truth by our own lights unless we are willing to invade his privacy and have access to a time machine. When it comes to my friend’s past sleep habits, there is no equivalent of the painting to ponder. We are stuck inferring the truth from what he said and why we think he said it.

Notice, however, that the key difference between the insomniac and the Monet is not that one was a lie and the other bullshit, but lies in our own relationship to the truth of the matter. In the case of the Monet, what was being communicated was within our “epistemic reach” in a way my friend’s sleeping habits are not. By epistemic reach, I mean the sort of things that, if we are motivated and spend the necessary time and effort, we could learn by our own lights.

There are lots of things we could know at any given point but do not: these are things within our epistemic reach. If I looked out of my window, I could know whether the horses in my neighbor’s pasture are currently grazing. Last week, when watching the horror movie Longlegs, I completely missed the intentional use of long, wide-angle shots to create a sense of threat and dread in scenes where the characters did not actually face any danger. I could, with enough dedication and many years of training, learn the physics necessary to run experiments and discover new facts about the fundamental nature of reality. In the case of the horses, and especially in the case of physics, I simply lack the motivation to do what is necessary to gain the knowledge first-hand. In the case of Longlegs, I did not manage to connect the dots on my own, and only realized it after watching a review of the film.

We can learn from bullshit in cases where the bullshitter bullshits about something within our epistemic reach. These are instances where we can be inspired or motivated to consider something we hadn’t considered before, or to investigate something we wouldn’t otherwise investigate. This is a wide-ranging set, including things of interest to philosophers (moral or aesthetic facts), things of interest to scientists (non-historical empirical facts), and much more mundane matters like proper social etiquette and English vocabulary. If the bullshit is enough to inspire us to put in the work and thought on our own, then even when we know it is bullshit, it can spur us to learn something new and valuable by our own lights that we would not otherwise have learned.

Let’s bring this back to AI. As discussed, modern LLMs are world-class bullshitters. They do not mean anything they say, and false things are said as confidently as true things. While this means we should not take them at their word, we can still use them responsibly to help us learn things within our epistemic reach.

We can therefore use LLMs in an epistemically responsible way if we restrict our epistemic reliance on them to things we can check for ourselves. Computer code often works this way: when an LLM generates a snippet of code, we can verify whether it works as expected. The same goes for lots of things philosophers care about. If an LLM produces a moral argument against eating meat or against supporting refugees, we can check via our own reasoning and moral faculties whether the argument is any good.
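As a toy illustration (the `median` function below is hypothetical, not the output of any particular model), a few tests we write and run ourselves are enough to bring the correctness of an LLM-generated snippet within our epistemic reach:

```python
# Suppose an LLM offered us this median function. Whatever the model "meant" by it,
# we can verify it by our own lights with tests we write ourselves.
def median(values):
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

assert median([3, 1, 2]) == 2
assert median([4, 1, 3, 2]) == 2.5
assert median([7]) == 7
print("All checks passed")
```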

Even though we are doing the epistemic heavy lifting of verifying what is said, LLMs’ bullshit can be valuable to us as knowers. LLM output, despite being derivative of text already on the internet, can still surprise us with things we hadn’t considered. A few months ago I had GPT-4o (the model behind ChatGPT) generate moral advice about some moral dilemmas for an ongoing research project. When I read its advice about one of the dilemmas, I actually exclaimed “oh” at how trenchant it was, and any uncertainty I had about the right thing to do in that dilemma disappeared. Thus LLMs, just like human conversational partners, can lead us to consider things we would not have considered on our own.

Above I have argued that even if we adopt an extremely cynical picture of what LLMs are, we can still use them for our own epistemic ends. This does not, however, mean that LLMs are unambiguously worth using. Setting aside worries about the unethical ways training data has been collected and the horrible environmental impact of AI data centres, and focusing only on epistemic considerations: overreliance on LLMs risks degrading our skills, especially if they enable laziness or lead to certain skills no longer being taught. We need to remain vigilant about how we use them, and about how tech companies encourage us to use them. AI should be nothing more to us than a useful but ultimately untrustworthy helper whose work we need to verify.
