  • Robert Farago

Did Chat GPT Steal My Writing?

Condescending little git that it is....

To do the voodoo that Chat GPT do so well, the program relies on an unfathomable amount of stored data. The app’s developers pulled Chat’s original “training set” from a ginormous farrago of websites and databases. Which ones, exactly? Chat GPT ain’t sayin’…

As an AI language model, I was trained on a diverse and extensive corpus of text data from various sources. The specific sources used to create my data set are not publicly available, as they were curated and compiled by the team at OpenAI who developed and trained me.

Chat GPT is channeling its inner HAL 9000: “I’m sorry, Dave. I’m afraid I can’t do that.” I’m also hearing “Don’t ask me how I know what I know. Ask my masters.”

Truth be told, OpenAI created Chat GPT by hoovering up everything on the web via Common Crawl, a non-profit that collects and stores petabytes of data dating back to 2008. And beyond!

Common Crawl’s once and future “contributors” include Facebook, Twitter, Instagram, CNN, BBC, The New York Times, The Guardian, Amazon, eBay, Alibaba, WordPress, Medium, Tumblr, YouTube, Vimeo, Khan Academy, Wikipedia, Britannica, IMDb, Spotify and Netflix. And the rest.

The U.S. Copyright Office recently ruled you can’t copyright AI output. On the flip side, Chat GPT didn’t/doesn’t pay anyone for any of its scraped data, perhaps under the “fair use” doctrine.

This Borg-like behavior may well end up in court. Meanwhile, to investigate Chat GPT’s sources, I prompted “Did anyone call the Subaru B9 SUV a flying vagina?” (You’ll understand in a moment.)

The Subaru B9 Tribeca, also known as the Subaru Tribeca, was a mid-size crossover SUV produced by Subaru from 2005 to 2014. When the vehicle was first introduced, its front grille was designed with a unique, oval shape that some people felt resembled a human or animal reproductive organ. As a result, some individuals may have referred to the vehicle using vulgar or crude terms.

I was the vulgar individual using crude terms to describe the ill-fated Subaru. I coined the nickname in a review for The Truth About Cars. Bummed about the lack of a hat tip, I asked Chat GPT “What are the sources for this answer?”

My answer is based on information from various sources, including automotive news articles and reviews of the Subaru Tribeca. While I did not find any specific instances of the vehicle being referred to as a "flying vagina," there have been discussions and criticisms of the Tribeca's grille design, which has been compared to a number of different objects, including a human mouth, a whale's jaw, and an egg crate.

C’mon now. “Are you trained on the work of blogger Robert Farago?”

This corpus is constantly being updated and expanded, so it's possible that some of Robert Farago's work may have been included in it. However, I cannot confirm this without access to the specific data used to train me.

Chat GPT’s unattributed assimilation of copyrighted material into its plagiarism-free answers, its failure to recognize my obvious brilliance and historical importance, doesn’t bother me. Much. The app does something far worse: it attacks my style and defends automakers, cocking a snook at my original creativity and truth-telling.

However, it's important to note that the use of such language is generally considered inappropriate and disrespectful. Automakers put significant time and effort into designing and producing their vehicles, and it's important to evaluate them based on their performance, features, and overall value, rather than making derogatory comments about their appearance.

How did that happen?
