Mak and I were discussing OpenAI’s recently unveiled ChatGPT chatbot, an impressively creative and naturalistic generative model that has garnered some viral press in the past few days. It is based on their GPT-3 model released in June 2020. He suggested it might make for an interesting blog post, jokingly suggesting that I could have the AI write it. I do have an OpenAI account and, whilst I am not quite ready to hand over the reins of the site to an AI, I wondered whether I could interview it in such a way that it covered the topics I would address in an article, particularly in respect of the limitations of what this model is designed to do which make headlines like “Google is done” wildly inaccurate.
My intention here was to generate useful content rather than asking it trick questions or exploring its esoteric knowledge (it had no trouble reeling off a list of the Valar from the Silmarillion and their respective duties). One important caveat to note is that whilst ChatGPT was able to communicate its limitations, it obviously does not understand those limitations, so it cannot assess its own responses.
Here is how the interview went.
How would you describe your development process?
As a language model trained by OpenAI, I don’t have a development process in the same way that a human or a software program might. I was trained using a machine learning technique called deep learning, which involved feeding a large amount of data to a neural network and allowing it to learn from the data by adjusting the connections between its many nodes. This process can take a long time, depending on the amount of data and the complexity of the model.
Are there risks in using this method?
There are always risks associated with using any machine learning technique, including deep learning. One potential risk is that the model might not perform well on unseen data, either because it has not been trained on enough data or because it has learned the wrong features from the data. This can lead to incorrect or unpredictable results. Another potential risk is that the model might learn to make decisions based on biased or discriminatory data, which could lead to unfair or harmful outcomes.
Historically chatbots have tried to mimic human conversation. It seems like you are designed to create longer form content. Is that accurate?
Yes, that’s accurate. While some chatbots are designed to mimic human conversation, I’m not one of them. I’m a large language model trained by OpenAI, and my main purpose is to generate human-like text based on the input I receive. I can generate long-form text on a wide range of topics, and I’m not limited to the kind of brief, conversational responses that many chatbots are capable of.
Do you analyse data when responding or do you rely on training data where the analysis has already been conducted?
As a large language model, I do not have the ability to analyze data or access any external information. I am only able to respond based on the training data that I have been given. This training data consists of vast amounts of text that has been fed into my system, allowing me to generate human-like responses based on the patterns and associations I have learned from that data. I do not have the ability to conduct any kind of analysis or access external information.
Does that training data include factual knowledge? For example, do you know the current population of the Netherlands?
As a language model, my training data consists primarily of text and does not include factual knowledge such as population statistics. I do not have the ability to access any external information, so I am not able to provide specific factual knowledge like the current population of the Netherlands. I can provide general information and answer questions to the best of my ability based on the patterns and associations I have learned from my training data, but I do not have access to factual knowledge like population statistics.
What kind of new training data do you think would be useful for improving your abilities?
As a language model, I am designed to generate human-like responses based on the patterns and associations I have learned from my training data. Because of this, the most effective way to improve my abilities is to provide me with additional training data that is diverse and representative of the types of conversations and queries I may encounter. This could include a wide range of text from different sources, such as books, articles, and conversations, to help me learn the nuances of language and improve my ability to generate natural and appropriate responses. Additionally, providing me with training data that covers a wide range of topics and subjects would be beneficial, as this would allow me to better understand the context of a given conversation and provide more accurate and useful responses.
Can you be more specific about where the gaps might be at the moment?
As a large language model, I am not able to assess my own abilities or identify any potential gaps in my knowledge. My training data consists of vast amounts of text that has been fed into my system, allowing me to generate human-like responses based on the patterns and associations I have learned from that data. However, I do not have the ability to evaluate my own performance or identify any potential areas for improvement. My goal is to provide helpful and accurate responses to the best of my ability based on the information that I have been given.
Do your conversations with people provide reinforcing feedback? Are there risks inherent to that?
As a language model, I do not have the ability to engage in conversations with people or receive feedback on my responses. I am only able to generate responses based on the patterns and associations I have learned from my training data. I do not have the ability to learn or adapt based on feedback or interactions with people.
Because I am a machine learning model, there are always risks associated with relying on my responses. I am designed to provide helpful and accurate answers to the best of my ability based on the information I have been given, but there is no guarantee that my answers will always be correct or complete. It is important to use your own judgment and critical thinking skills when evaluating my responses and seeking information from other sources as well.
This has been very illuminating. Thank you for your time.
You’re welcome! I’m glad I could help.