Since ChatGPT launched, the debate around the “fair use” of public website content for AI training – and whether this is plagiarism – has raged.
That debate has only gotten louder since OpenAI announced ChatGPT plugins on March 23, 2023.
One of OpenAI’s plugins is an official ChatGPT-hosted web browser. It will allow their models to read information directly from the internet.
Since we still see daily posts and tweets and prompt examples claiming otherwise, it’s worth repeating here:
The current instance of ChatGPT cannot access anything on the internet.
It doesn’t use a database or store content from websites like a search engine does in an index.
What this means is that, without a plugin, ChatGPT is still stuck in 2021, predicting the next word based on its old training data.
Even the current Bing implementation is (in a simplified explanation) pulling keywords from your prompt, doing a Bing search, feeding in the results that would appear for that keyword, and then asking the AI to “summarize” those results.
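To make that flow concrete, here is a rough Python sketch of the "search, then summarize" pattern. It is an illustration under stated assumptions, not Bing's actual implementation: the stop-word list, `extract_keywords`, `build_summary_prompt` and `fake_search` are all hypothetical names, and the fake search function stands in for a real search API call.

```python
from typing import Callable, List

# Hypothetical stop-word list; a real system would use something far richer.
STOP_WORDS = {"a", "an", "the", "is", "are", "what", "how",
              "do", "i", "to", "of", "in", "for"}


def extract_keywords(prompt: str) -> List[str]:
    """Keep the words of the prompt that are not stop words."""
    words = (w.strip("?.,!").lower() for w in prompt.split())
    return [w for w in words if w and w not in STOP_WORDS]


def build_summary_prompt(prompt: str, search: Callable[[str], List[str]]) -> str:
    """Search for the prompt's keywords, then wrap the results in a
    'summarize this' instruction that would be sent to the model."""
    query = " ".join(extract_keywords(prompt))
    results = search(query)
    sources = "\n".join(f"[{i}] {text}" for i, text in enumerate(results, start=1))
    return f"Summarize these search results to answer: {prompt}\n\n{sources}"


def fake_search(query: str) -> List[str]:
    """Stand-in for a real search API call (assumption for illustration)."""
    return [f"Result snippet about {query}"]


print(build_summary_prompt("How do I block the ChatGPT bot?", fake_search))
```

The point of the sketch is that the model never "visits" the web itself. It only sees whatever text the search step pastes into its prompt.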
And that’s how plugins are going to change everything.
Pretty soon, ChatGPT will be able to feed in content from third-party websites for the AI to summarize or manipulate – the same way Bing is doing.
Many third-party plugins and tools can already scrape content from a website, feed it into a prompt to the OpenAI API, and summarize or manipulate that text.
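As a minimal sketch of what such a tool does before it ever calls a model, the Python below strips a page down to its visible text and wraps it in a summarization prompt. The class and function names (`TextExtractor`, `page_to_prompt`), the prompt wording and the 2,000-character cap are assumptions for illustration. Fetching the page and sending the prompt to an API are deliberately left out.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def page_to_prompt(html: str, max_chars: int = 2000) -> str:
    """Turn raw HTML into a prompt asking a model to summarize the page text."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    text = " ".join(extractor.parts)[:max_chars]
    return f"Summarize the following web page content:\n\n{text}"


sample = ("<html><head><style>p{}</style></head><body><h1>Title</h1>"
          "<script>var x=1;</script><p>Hello world.</p></body></html>")
print(page_to_prompt(sample))
```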
However, with an official web browser plugin, this usage is about to increase drastically.
You can block OpenAI’s ChatGPT-User bot
OpenAI has given us details about their bot – including how to block it.
It’s worth noting that OpenAI’s bot will behave like any other well-behaved bot, following the robots protocol. It will assume it can access content unless it’s specifically told otherwise in a robots.txt file.
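For illustration, here is a sketch of what a blocking rule could look like and how to test it locally with Python's standard-library robots.txt parser. The `ChatGPT-User` token comes from the bot name above. The example URLs and the `is_allowed` helper are hypothetical, and the parser only predicts how a compliant bot would read the file; it says nothing about how any particular bot actually behaves.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that blocks only the ChatGPT-User bot from the whole site.
ROBOTS_TXT = """\
User-agent: ChatGPT-User
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a bot that follows the robots protocol may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical page on an example domain.
page = "https://example.com/blog/post"
print("ChatGPT-User allowed:", is_allowed(ROBOTS_TXT, "ChatGPT-User", page))
print("Googlebot allowed:", is_allowed(ROBOTS_TXT, "Googlebot", page))
```

Because the rule names one user agent and there is no wildcard group, other crawlers are unaffected. Changing `Disallow: /` to a specific directory would block only that part of the site.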
OpenAI and ChatGPT won’t crawl the web like a search engine. And, as far as we can tell, they aren’t using this data for training (yet?). All requests will be the result of a direct request from a user.
Another fun fact: the browsing plugin finds pages using the Bing search API. This likely means that if Bing can’t see the content on your website, neither will ChatGPT.
This brings us to a question I have seen a lot lately:
Should we block the OpenAI bot from accessing our websites?
The citation/plagiarism/sources/copyright debate has been raging for a while and could easily take 20,000 words to dig into.
My short answer: No.
Most websites shouldn’t block AI from accessing their content. Let’s dig deeper into why.
Take a wait-and-see approach
We shouldn’t block any new technology until we have enough data to make an informed decision.
Sure, some copyright issues could be at play, but AI plugins could also become a new source of discovery and traffic.
OpenAI says it will cite sources when plugins pull data from third-party websites. This means there is real potential to get clicks from ChatGPT if a user pulls in your content.
Blocking access only means that ChatGPT (or its user) will cite somebody else’s website.
Many people having this debate start with the flawed assumption that if people can’t get the content from ChatGPT, then they’ll have to visit the website.
I don’t think that’s true. The reality is that they’ll get the content from your competitor.
Given how many people are using ChatGPT for content creation these days, there’s a decent chance that if somebody uses the tool to pull content from your website, they might link to you wherever they post the output. You’ll be passing up this chance if you block it.
We need to think long term
I remember having similar conversations about iPhone apps and the App Store when it first came out in 2008.
The App Store changed the interface of mobile phones. Sure, you could (and still can) do most of what an app can do with a website, but the App Store is where people went to discover websites.
AI will have a similar effect in changing the internet’s user interface.
This isn’t going to kill search engines.
However, AI will be a new starting point for many web users. That means plugins might be your only opportunity to reach these users.
We need to start thinking of AI as a new acquisition channel – just like we do with search, social and retail platforms or app stores.
The time to start thinking about your AI and AI plugin strategy was last week. Most marketers are already behind – but it isn’t too late!
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land.