On June 8, 2024, Michael Townsen Hicks, James Humphries, and Joe Slater published an article with a telling title: "ChatGPT is bullshit". The article is publicly available at this address. It is only ten pages long, so I encourage everyone to read it in full. The authors propose changing the terminology used to describe situations in which ChatGPT or other large language models generate untrue statements. Such output is commonly referred to as hallucination, when it should be called bullshit, just like everything else these tools generate.
"Bullshit" is not a colloquial term here. Its definition was proposed by the American philosopher Harry Gordon Frankfurt and is now in common use: bullshit is any statement whose author is indifferent to whether it is true. Based on this definition, Hicks, Humphries, and Slater analyze whether the statements generated by ChatGPT constitute "soft bullshit" (statements made without the intention of misleading the recipient) or "hard bullshit" (statements made with the intention of misleading the recipient). In the course of their argument, the authors show that the definition of "soft bullshit" is satisfied in every case, and that there is much to suggest ChatGPT is also a "hard bullshitter".
Unfortunately, everything indicates that the way large language models are constructed means that they cannot stop bullshitting: they are trained to produce plausible-sounding text, token by token, with no internal representation of whether that text is true. This, in turn, means that their applicability to report generation is extremely limited, if not non-existent. Every piece of text and every number in every table must be verified before such a report can be submitted to an auditor and then published. You have probably had to verify material prepared by someone else: an intern, a junior colleague, or someone from the company's management board. It is tedious work, often taking more time than writing the entire text yourself. It can only be shortened when we have a high level of trust in the author ("Please take a look at my text on employee policy reporting; it should be fine, but I'm not sure I've given the correct references to the paragraphs of ESRS S1."). We will not gain such trust in a tool whose primary occupation is producing bullshit.
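To make that architectural point concrete, here is a minimal, purely illustrative sketch of how an autoregressive language model picks its next token. The vocabulary, logit values, and temperature below are invented for the example; real models score tens of thousands of tokens with a learned neural network. The point is what is absent: nothing in the sampling step consults the truth of the resulting sentence, only its learned plausibility.

```python
import numpy as np

# Toy stand-in vocabulary; a real model has tens of thousands of tokens.
vocab = ["revenue", "rose", "fell", "by", "7%", "12%", "."]

def sample_next(logits, temperature=0.8, rng=np.random.default_rng(0)):
    # Convert plausibility scores into a probability distribution and
    # sample from it. Note that "truth" never appears anywhere in this
    # computation -- tokens are ranked only by how likely they are to
    # follow the context in the training data.
    probs = np.exp(logits / temperature)
    probs /= probs.sum()
    return rng.choice(len(logits), p=probs)

# Invented logits for the token after "revenue rose by": the model finds
# "7%" and "12%" almost equally plausible, although at most one is true.
logits = np.array([0.1, 0.2, 0.1, 0.3, 2.5, 2.4, 0.5])
print(vocab[sample_next(logits)])
```

Under these (made-up) scores, the model prints "7%" or "12%" with nearly equal probability; both continuations are fluent, and the generation loop has no mechanism for preferring the true one.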
Everything seems to indicate that we will not be able to rely on AI in reporting for a long time to come, at least until large language models are replaced by tools built on completely different foundations. For now, we will have to continue using HI, Human Intelligence. And that is actually a comforting thought. 😊