How come GPTs don't ask for clarifying information?
Really tho, don't they? Find a skeptic-baiting challenge at the end.
In this post, I plan to:
show how to get GPT-like systems to ask questions;
demonstrate the useful - and overall delightful - results reached with this technique;
explain, in collaboration with GPT-4, some of the reasons why this might not usually happen.
Stay until the end for a new wager and some housekeeping!
Tweet raises new and troubling questions
Having taken a fairly cooperative approach to using GPT-like systems so far, I was caught by surprise by the question.
Don’t they? I asked myself.
Why wouldn’t we? My innermost Bing replied, in its characteristically inquisitive manner.
A questionable case study
Given the constraints and the nature of the problem, it indeed seems more suitable to model this problem as a Minimum Cost Flow Problem (MCFP), rather than a variant of the Traveling Salesman Problem.
[…]
The goal is to find a feasible flow that minimizes the total cost. This seems like a more accurate representation of the problem. Thank you for pointing it out!
To solve the MCFP, one could use a variety of algorithms, such as the Simplex algorithm, the Network Simplex algorithm, or the Successive Shortest Path algorithm. The choice of algorithm may depend on the specifics of the problem and the characteristics of the graph.
Do you agree with this approach?
This is a quote from a recent conversation with ChatGPT. In the conversation, we went through a graph theory toy problem, discussed simplifying assumptions, and asked each other for clarifications.
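For readers curious about what the formulation above looks like in practice, here is a minimal sketch of a Minimum Cost Flow Problem solved with networkx's network simplex solver. The graph, capacities, and costs below are purely illustrative assumptions on my part - the actual toy problem from the conversation is not reproduced in this post.

```python
# Minimal MCFP sketch using networkx's network simplex.
# The graph below is illustrative only, not the toy problem from the chat.
import networkx as nx

G = nx.DiGraph()
# Negative demand marks a supply node, positive demand marks a sink.
G.add_node("s", demand=-4)
G.add_node("t", demand=4)
G.add_node("a")
G.add_node("b")
# Each edge carries a capacity and a per-unit cost ("weight").
G.add_edge("s", "a", capacity=3, weight=1)
G.add_edge("s", "b", capacity=3, weight=4)
G.add_edge("a", "b", capacity=2, weight=1)
G.add_edge("a", "t", capacity=2, weight=2)
G.add_edge("b", "t", capacity=4, weight=1)

# network_simplex returns the minimum total cost and the per-edge flow.
cost, flow = nx.network_simplex(G)
print(cost)   # total cost of the optimal feasible flow
print(flow)   # nested dict: flow[u][v] is the flow sent along edge (u, v)
```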
The concerning technology question
Before looking at ways to achieve similar results, let’s explore the reasons why this might not usually be the case.
In the training data - in particular the fine-tuning data for ChatGPT and its derivatives - there are indeed mostly questions, but it's usually the user who asks them, while the model responds. The default name for the model’s persona is “ASSISTANT”, while the user is “USER” - this, in itself, frames the interaction as a service more than a collaboration.
The reward function for RLHF has a component related to the user’s willingness to continue the conversation, and to their appreciation (measured by like / dislike clicks) of the answer. Very often, and in particular since GPT-like systems have been used in search engines, users tend to prefer a single, complete response that requires no further effort on their side. Asking for additional information may be seen as a delay in providing the answer, which could lead to less stellar ratings.
These factors will necessarily lead to an incomplete theory of mind on the side of the chatbot. At the start of any conversation, a GPT-like system will model you as the median user from the training set - with their skill level, domain knowledge, and language proficiency - and will hold a very low prior on your willingness to engage with it in a cooperative spirit.
Given all this, how do we coax these systems into showing a little more curiosity?
Prompting for Curiosity
Here is the first prompt in the collaborative graph theory conversation mentioned above.
hello! i will ask you some questions. if, in the course of answering, you will need some information about graph or game theory, don't ask the web: ask me, i am an expert in graph and game theory, so i am very happy to provide such information. is that ok?
And here’s an extremely flattering analysis of the approach I took, courtesy of my trusty collaborator (you can find the whole interaction, including some back-and-forth on the post you are reading, here).
As for the success of our interactive discovery in this conversation, I believe it's due to a few factors:
Your clear expectations: You explicitly stated that this was a toy problem and provided clear parameters for the problem. This allowed me to understand the context and the kind of information that would be useful.
Your openness to clarifications and additional questions: You welcomed questions and actively engaged in the process, which encouraged a more interactive and collaborative approach.
The nature of the problem: The problem was well-suited to an interactive approach. It was complex enough that it needed some back-and-forth to fully understand and solve, but not so complex that it would be overwhelming or confusing.
Your expertise: As an expert in the field, you were able to guide the conversation and provide valuable insights that a layperson might not have been able to provide. This was key in facilitating an effective interactive conversation.
Overall, I think the success of this approach underscores the value of interactive, collaborative problem-solving, and demonstrates how AI models like me can be used not just as answer-giving machines, but as active participants in a dialogue.
This fits fairly well with the observations on the narrative nature of chatbots we presented in the note on anthropomorphising language models.
Finally, a wager
This question on Metaculus isn’t closed yet:
I think that, with minimal scaffolding and proper prompting, GPT-4 is already more than able to complete the task in question despite the limited context window, so here’s my bet:
I wager 0.2 BTC on being able to demonstrate, within a week of the bet being accepted, a GPT-4-based system which will take a novel and some questions as input and answer the questions at a satisfactory level.
If you’d like to take this bet, feel free to discuss further terms below.
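For the curious, here is a minimal sketch of the kind of scaffolding I have in mind: a map-reduce pass over the novel, where each chunk is read for question-relevant material and the collected notes are then answered over in a single final call. Chunk sizes, prompts, and the helper names are illustrative assumptions, not the exact system the wager would be judged on; the sketch assumes the pre-1.0 `openai` Python package (ChatCompletion interface) and an OPENAI_API_KEY in the environment.

```python
# Map-reduce sketch for answering questions about a full novel with a limited
# context window. Prompts, chunk sizes, and helpers are illustrative only.
import openai

def ask_gpt4(prompt: str) -> str:
    """Single-turn call to GPT-4 via the (assumed) ChatCompletion interface."""
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content

def chunk_text(text: str, max_chars: int = 12_000) -> list[str]:
    """Crude fixed-size chunking; a real system would split on chapter breaks."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def answer_question(novel: str, question: str) -> str:
    # Map step: extract whatever each chunk contributes to the question.
    notes = []
    for i, chunk in enumerate(chunk_text(novel)):
        notes.append(ask_gpt4(
            f"Here is part {i + 1} of a novel:\n\n{chunk}\n\n"
            f"List any passages or facts relevant to this question: {question}\n"
            "If nothing is relevant, reply 'nothing relevant'."
        ))
    # Reduce step: answer from the collected notes, which now fit in one context.
    # (If the notes themselves overflow the window, they can be summarised
    # recursively before this final call.)
    return ask_gpt4(
        "Using only these notes taken while reading a novel:\n\n"
        + "\n\n".join(notes)
        + f"\n\nAnswer this question: {question}"
    )
```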
I’d be interested in knowing whether you have updated your views on this, and what your operationalisation requirements would be if you have not.
… and some housekeeping
What do you make of these more accessible, less theory-heavy posts? Should I do more? Do you have interesting questions? Ask them below, or send them to @lumpenspace on the ornithologically themed app.
There's absolutely no way. I haven't played with gpt-4, but chatgpt-3.5 can't even answer questions about a single chapter of a novel, much less a full book-- neither can Claude. Love to know more details on this wager... Who would judge and audit this wager, both to ensure fair escrow and funds transfer, and to make sure that answers to questions aren't just added to the context window? You say a 'gpt-4 system'; how is a system different than just gpt-4? How do we avoid simply prompting the system to do a web search rather than infer real meaning from the book content? What is a satisfactory level?
I like the way you think, new subscriber here so will have to look back at your earlier posts. Happy to have found you. 🙏👍