Rise of the Mediocrity Machines
Week in and week out, I write, edit, or proof ten or more articles (or blogs, or email campaigns, or press releases …) on everything from the problems and scale of human migration to the best website to find remote work (personally, I’ve had great luck with Upwork).
I’ve seen first-hand the disruption that generative AI brought to the industry in 2023. Writers whose work used to be filled with typos and formatting issues started coming into my workflow much cleaner. And writers whose voices and narrative style used to be enjoyable and unique started coming to my desk sounding suspiciously formulaic.
Some companies that used to hire me to write content have started asking me to edit content that I prompt AI to write. Other companies ask that I not use AI at all. Most have policies somewhere in between – they’re aware that the creators on their teams are likely using AI, and they have guardrails in place that they hope will protect the quality of their content.
And that stance is understandable – generative AI has the potential to improve workflows, but also comes with risks that include flawed deliverables and legal exposure.
Given that every company I work with and interview for is having a discussion about the use of AI, I wanted to create a post that encapsulated my experiences and thoughts to date.
In the beginning, there was data
Generative AI wouldn’t be possible without the coming together of several factors: the large-scale digitization of data, access to that data provided by the internet, improvements in computer processing power, and advancements in machine learning.
Large language models like OpenAI’s generative pre-trained transformer (GPT) consist of an input layer, hidden layers, and an output layer. The input layer is where the user inputs a prompt; the output layer is where the machine gives its response.
The hidden layers are where the proverbial magic happens. These layers were engineered to mimic the way computer scientists decades ago believed neurons in a living brain worked. Note, however, that computer scientists are not neuroscientists, and that neuroscience has advanced considerably in the past fifty years.
In short, these machines do not work like brains. However, the name “neural net” has stuck.
In the hidden layers of these machines, prompts are broken into strings of characters called tokens. The tokens are analyzed by a series of statistical algorithms. Each algorithm applies a weight and a bias to the token, which determine the next algorithm the token will be passed to. These weights and biases are collectively called “parameters.” To give you an idea of scale, OpenAI’s GPT-3 has 175 billion parameters in its hidden layers.
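To make “weights,” “biases,” and “parameters” concrete, here’s a deliberately tiny sketch in Python – not a real GPT layer, just an illustration of how a single hidden layer turns numeric inputs into outputs. All the numbers below are made up for the example:

```python
# Toy illustration of one "hidden layer": each output unit computes a
# weighted sum of the inputs plus a bias, then passes it through a
# simple nonlinearity. Real models like GPT-3 stack many such layers;
# their billions of "parameters" are numbers just like these weights
# and biases.

def relu(x):
    # A common nonlinearity: negative values become zero.
    return max(0.0, x)

def hidden_layer(inputs, weights, biases):
    # weights: one list of per-input multipliers for each output unit
    # biases:  one offset per output unit
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(relu(total))
    return outputs

# Two inputs, two output units: 2*2 weights + 2 biases = 6 parameters.
inputs = [1.0, 2.0]
weights = [[0.5, 0.25], [-1.0, 0.5]]
biases = [0.5, 0.0]
print(hidden_layer(inputs, weights, biases))  # -> [1.5, 0.0]
```

This toy layer has just six parameters; GPT-3’s 175 billion are the same kind of numbers, multiplied out at staggering scale.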
Early in their training, these bots are essentially given a word with a missing letter, or a sentence with a missing word, and asked to fill in the blank. For visual transformers like DALL-E, the model is given an incomplete picture and asked to fill in the next pixel. When the bot does so correctly, it adjusts its own weights and biases so that it can later generate a similar response when given a similar prompt.
Repeated billions of times, this simple fill-in-the-blank exercise yields a machine that can consistently re-create strings of tokens and combine them into grammatical, if problematic, paragraphs.
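Real models learn by nudging billions of weights through gradient descent, but the spirit of the fill-in-the-blank exercise can be sketched with plain counting. The training text and function names below are my own invention for illustration:

```python
from collections import Counter

# Toy "fill-in-the-blank" trainer. Instead of adjusting weights, we
# simply count which word most often follows a given word in the
# training text -- a crude stand-in for what training optimizes toward.

training_text = (
    "the cat sat on the mat the cat ate the fish "
    "the dog sat on the rug"
)

def train(text):
    words = text.split()
    follows = {}
    for prev, nxt in zip(words, words[1:]):
        follows.setdefault(prev, Counter())[nxt] += 1
    return follows

def fill_blank(follows, prev_word):
    # Fill the blank with the most common continuation seen in
    # training -- the statistically "safest" answer.
    return follows[prev_word].most_common(1)[0][0]

model = train(training_text)
print(fill_blank(model, "the"))  # -> "cat" (seen twice after "the")
```

Given the blank in “the ___,” the toy model answers “cat,” because that pairing appeared most often in its tiny corpus – the same frequency-chasing logic, minus the scale.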
Experts estimate that GPT-3.5 was trained on something like 300 billion words, including works from Wikipedia, social media pages, and the Common Crawl library (as well as, according to ongoing litigation and other sources, copyrighted works).
Generative, not creative
These glorified relay switches are limited by the very things that make them so impressive.
Imagine a family member you find embarrassing – maybe an older uncle or young teenager. They express themselves in ways you don’t like and have opinions you find abhorrent.
Now consider that their Facebook page and X feed was probably used in training large language models like OpenAI’s GPT. Their posts were copied and diced up into the AI’s fill-in-the-blank training. When the AI learned to accurately predict the missing sections of those posts, it both trained itself to repeat them, and learned that that kind of output is what users want.
To be fair, these models were also trained on the social media posts – and yes, other media, including textbooks and literature – of some great writers and thinkers. A system that needs to be fed billions of words in a matter of months lacks the ability to discriminate between texts of different qualities.
When you give a large language model a prompt, the AI moves the prompt through these algorithms that have been trained on all of this data of questionable quality. It essentially looks at your prompt and applies a sort of statistical smearing, asking, “What was the most commonly used token in my training data that was adjacent to the tokens in this prompt?”
And that’s what it gives as an output – what it saw most often in its training data.
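A toy greedy generator makes the point. If the model always emits the single most common continuation it saw in training (a simplification of how real systems sample), the output drifts toward whatever was most frequent – even when the prompt pointed somewhere else. The corpus below is invented for illustration:

```python
from collections import Counter

# Toy greedy generator: always emit the single most common next word
# seen in training. Real models sample more cleverly, but this shows
# why pure frequency-chasing produces generic output.

training_text = "we love the beach we love the beach we hate the rain"

words = training_text.split()
follows = {}
for prev, nxt in zip(words, words[1:]):
    follows.setdefault(prev, Counter())[nxt] += 1

def generate(start, length):
    out = [start]
    for _ in range(length):
        options = follows.get(out[-1])
        if not options:
            break
        out.append(options.most_common(1)[0][0])
    return " ".join(out)

print(generate("we", 3))    # -> "we love the beach"
print(generate("hate", 2))  # -> "hate the beach"
```

Notice the second output: the training text only ever said “we hate the rain,” but because “the” was most often followed by “beach,” the generator confidently produces “hate the beach” – the most common combination, not the right one.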
And while that does yield a large language model capable of generating complete sentences, it does not yield a model capable of being truly creative. It’s only repeating some combination of the things that it saw in its training.
Worse than that. It’s not repeating the best combinations from its training: it’s repeating the most common combinations.
In other words, it’s generating content that has been mathematically averaged from sources ranging from Albert Einstein to Andrew Dice Clay.
You’re not hallucinating – the AI is
Current models of AI have (at least) two major problems. The first I just laid out: The statistical averaging of its training data and the output of mediocre content.
The second is the AI’s need to produce an output that it predicts the user will accept. Bear in mind, OpenAI, Microsoft, Google, and other technology companies are in this field to make a profit. To that end, they need the output of their systems to be accepted by the user – and that need comes before all else (even, apparently, the need to obey copyright laws).
But these models have no context of reality. They have no understanding of what a person really wants. They can quote, for example, multiple encyclopedias on both Newton and Einstein, and can string together the phrase “gravity pulls things down” – but they don’t know what “down” means.
What we have are systems that:
Have been trained to generate grammatical content
Have been trained to generate content the user will accept
But that lack the ability to truly understand anything
One consequence of this formula is that AI output is often very convincing utter nonsense.
ChatGPT has become infamous for confidently and convincingly stating falsehoods, and it has shown racial and cultural bias, among other issues. These failures happen, in part, because the AI is better at predicting what content a user will accept than at understanding what it means for something to be factual.
The risks of using GenAI
Using generative AI comes with many risks – risks that can mostly be mitigated if you can afford to build and train a proprietary model. A company could, for example, develop a conversational chatbot able to answer customer questions about its products or services. This requires resources (including raw data, processing power, and expertise), but enables a company to keep all of its systems in-house.
Currently, many AI companies have lawsuits against them claiming copyright infringement. Essentially, content creators are alleging that the companies have inappropriately used their work to train the AI models, and that the AI’s output is therefore derivative of their work. How these cases play out will have a major impact on how willing companies are to accept AI into their workflow.
At the same time, companies that choose to use these models risk having the information they share exposed. For example, software engineers at Samsung copied proprietary code into ChatGPT while asking the AI to fix a bug. Data shared in queries like these can be incorporated into a later model’s training set, potentially making it available to anyone who knows the right questions to ask.
For companies that retain customer data, this kind of exposure could affect their reputation and customer loyalty. For companies in sectors like finance or medicine, exposing data has legal ramifications.
Prompt engineering best practices
While the most likely output from any AI is going to be mediocre at best, there are some best practices that can improve the chances of getting usable material. These include:
Fact checking and editing: AI output needs to be checked for accuracy, style, and tone. Pieces should be restructured and reworded to meet the standards your audience expects. Be aware that fact checking an AI’s output can take longer than a good writer would spend developing a piece, because every word of an AI’s output has to be second-guessed.
Iterating: Especially if what you’re creating is customer or client facing, phrase your prompt in multiple ways, and use different kinds of prompts. Using more than one AI is also a good idea. The real strength of these machines isn’t in their creativity but in their ability to mass produce content. Take advantage of that strength by asking for as many outputs as you need to get the inspiration you’re looking for.
Breaking material into chunks: The longer the output you request, the more likely AI is to struggle with factuality and quality. You can get around this by asking for text of a certain length – I’ve had the best luck with lengths of 300-500 words. This gives an output that is the approximate length of a section of an article, can be edited and fact checked relatively quickly, and lacks the number of problems that I find in longer outputs.
Breaking prompts into chunks: If you have a long prompt, or want to give the AI lots of context, use a series of shorter prompts. Between each prompt, ask “Do you understand?” Note that AI does not “understand.” It can process and analyze text, and it can provide outputs that are statistically significant and likely to be agreeable to the user, but AI lacks the ability to apply context or emotion to its output, or to give its output real meaning. However, asking an AI if it “understands” is shorthand for “can you work with the information I have given you?”
Asking for more than you need: When I started to write this section, for example, I quickly wrote six best practices that jumped to mind. And then I went to Bing Chat and asked for 10 best practices. And then I went to ChatGPT and Google Bard and asked for 10 more. And within seconds I was able to both refine my original ideas and pick some best practices that had slipped my mind (I was also able, because of my first-hand experience, to ignore the items on their lists that were wrong).
Providing context: Tell the AI who your narrator is, and give it details about your target demographic. A prompt for an article like this might be, “I am a freelance writer with a master’s degree in English literature and rhetoric. I’m writing an article targeted to potential clients about the pros and cons of using generative AI to help write articles. Can you craft 400 words that cover risks for companies using AI?”
Asking the AI if it needs anything else: If you aren’t sure if you’ve given an AI enough background information, you can simply describe the output you hope to see, and ask it what information it needs.
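Two of these practices – providing context and breaking prompts into chunks – come down to careful string assembly, which can be sketched in a few lines of Python. Every template and name here is my own illustration, not any AI vendor’s API:

```python
# Sketches of two of the practices above. Plain string handling only;
# the field names and templates are illustrative assumptions.

def build_prompt(narrator, audience, task, word_count):
    # "Providing context": state who is speaking, for whom, and the task.
    return (
        f"I am {narrator}. I'm writing for {audience}. "
        f"In about {word_count} words, {task}"
    )

def chunk_briefing(text, max_words=150):
    # "Breaking prompts into chunks": send long background material a
    # piece at a time, with a check-in question between chunks.
    words = text.split()
    chunks = [" ".join(words[i:i + max_words])
              for i in range(0, len(words), max_words)]
    return [c + "\n\nDo you understand?" for c in chunks]

prompt = build_prompt(
    narrator="a freelance writer with a master's degree in English",
    audience="potential clients evaluating generative AI",
    task="cover the risks for companies using AI.",
    word_count=400,
)
print(prompt)
```

Templating like this keeps the context consistent across the many iterations and rephrasings these tools require, instead of retyping the background every time.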
AI: A tool of limited use but with a massive publicity campaign
Content produced by AI often sounds great at first blush but, on closer inspection, is full of clichés, bombast, and fantasy. In workflows where mediocrity is okay – such as internal emails or meeting notes – AI might speed up your processes. But for anything customer facing, AI-generated content isn’t at a quality that will convert.
The AI we have today has already been trained on a large percentage of the information available on the internet. It’s hard to see how giving these models more training on similar data will improve them.
However, what AI has going for it is a tremendous, and sometimes exploitative, marketing campaign. Even the words we use to discuss them – “machine learning,” “neural net,” “artificial intelligence” – are worse than inaccurate. They humanize these systems and make them seem more capable than they are, tricking uninformed consumers into believing the hype.
Companies that create these systems often hint at achieving “artificial general intelligence” or even “superintelligence.” But at the same time, they have to resort to promotional videos of their current models that are apparently deceptive. If these models were as good as they are claimed to be, their promotional material wouldn’t have to lie to us.
Perhaps the greatest risk that AI poses to many industries is that decision makers in companies – the people who do the hiring, firing, and asset management – will fall for the hype being generated around this technology.
Companies trying to sell AI would not be the first to inflate their technology’s abilities. And companies buying into these systems would not be the first to fall for the trap of paying for an asset that doesn’t live up to its promises.
And while the AI creators race to make models that can actually achieve what they’ve already promised, they will leave their clients struggling to get results while paying for an asset that can’t do what they need.