QQ is my proposal for a new editing modality that lets the user make quick consultations with a chat-tuned language model directly within the editor. It offers a frictionless way to ask for explanations, examples, and documentation without interrupting the user's workflow.
Here is a brief demo of QQ in action:
How to Trigger QQ
When the user types
When the user presses enter (or return on macOS), the question is sent to a language model – in the case of the demo above, OpenAI's gpt-3.5-turbo – which streams back a response to the user's question. The response is displayed directly in the tooltip, so the user doesn't have to leave the editor to see the answer.
The user can then choose to either close the conversation by pressing esc or continue sending replies to the model to iterate on the generated responses. When the user closes the tooltip, the edited line is cleared and the editor returns to its regular state.
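The streaming step can be sketched in TypeScript as a small parser for the server-sent-events format that OpenAI's chat completions endpoint uses when streaming (`data: {...}` lines, terminated by `data: [DONE]`). The transport (fetch, EventSource, and so on) is left out; this only shows how the tooltip might accumulate deltas as they arrive:

```typescript
// Minimal sketch of extracting streamed response deltas from the
// SSE payload format used by OpenAI's streaming chat completions.
// The JSON chunk shape matches the gpt-3.5-turbo streaming API;
// how the bytes arrive over the network is omitted here.

interface StreamChunk {
  choices: { delta: { content?: string } }[];
}

// Parse one raw SSE line into the text delta it carries, or null
// for non-data lines and the final "[DONE]" sentinel.
function parseSseLine(line: string): string | null {
  if (!line.startsWith("data: ")) return null;
  const payload = line.slice("data: ".length).trim();
  if (payload === "[DONE]") return null;
  const chunk = JSON.parse(payload) as StreamChunk;
  return chunk.choices[0]?.delta.content ?? null;
}

// Accumulate the full response text from a sequence of SSE lines,
// as the tooltip would while rendering the answer incrementally.
function accumulate(lines: string[]): string {
  let text = "";
  for (const line of lines) {
    const delta = parseSseLine(line);
    if (delta !== null) text += delta;
  }
  return text;
}
```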
Why Modal Editing?
Modal editing is a way to preserve flow by reducing the number of context switches. It's a way to "stay in the zone" and keep your hands on the keyboard while you're working. With modal editing there's no need to reach for the mouse or to switch to a different pane to look something up: you can type out your question and ask for help without leaving the editor, and in many cases without even moving focus away from the text you're working on.
Presenting the Conversation
In screen readers, the tooltip is announced as a live region, and the user is informed that they are in a conversation with a language model. I still need to do some research on how best to present the conversation to screen readers, but live regions seem to work better than, say, a modal dialog.
The content of the conversation is presented as a list of message bubbles, analogous to a two-way chat conversation between the user and another individual. Like in an instant-messaging application, the user's questions are shown as accent-colored message bubbles on the right, and the responses from the language model are shown as grayscale message bubbles on the left.
When the conversation is empty, the user is presented with an empty state that shows "No messages".
As the user types their question, the dialog shows the question as a message bubble, but the bubble is presented in an indeterminate state, communicating that the question has not yet been sent to the language model.
When the user presses enter, the question is sent and the message bubble's visual state is updated to reflect it. A live announcement is also dispatched to the screen reader through ARIA live regions, communicating that the message has been sent.
While the response from the language model streams in, it is shown as a message bubble in a similar indeterminate state. Once the response is complete, the message bubble is updated to show the final content.
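The bubble states described above can be modeled as a small message lifecycle. This is only a sketch: the state names are mine, not taken from QQ's actual internals.

```typescript
// Illustrative message lifecycle for the chat tooltip. A user
// question goes pending -> sent; an assistant response goes
// streaming -> complete. The names are assumptions, not QQ's API.

type MessageRole = "user" | "assistant";
type MessageStatus = "pending" | "sent" | "streaming" | "complete";

interface Message {
  role: MessageRole;
  text: string;
  status: MessageStatus;
}

// A user question starts out indeterminate until enter is pressed.
function draftQuestion(text: string): Message {
  return { role: "user", text, status: "pending" };
}

// Pressing enter marks the bubble as sent (this is also where the
// ARIA live announcement would be dispatched).
function markSent(msg: Message): Message {
  return { ...msg, status: "sent" };
}

// Streamed deltas extend an assistant bubble in the indeterminate
// streaming state...
function appendDelta(msg: Message, delta: string): Message {
  return { ...msg, text: msg.text + delta, status: "streaming" };
}

// ...and the bubble is finalized once the stream ends.
function finish(msg: Message): Message {
  return { ...msg, status: "complete" };
}
```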
Responses from the language model are expected to be richer in content and format than the user's questions, so the message bubbles are styled differently to communicate that. Responses from the LLM also support markdown formatting and syntax highlighting of code blocks to make the content more digestible and readable for the user.
The user can close the conversation by pressing enter on an empty line. When the conversation is closed, the tooltip is hidden and the editor returns to its regular state, but the conversation is still available in the editor's state. The user can trigger QQ again on the same line to resume the conversation.
Conversations are grouped by line number, so the user can have multiple conversations open at once. This is useful when the user is working on several parts of the code simultaneously and wants to ask questions about each of them.
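The grouping can be sketched as a map keyed by line number. The types and method names here are illustrative, not QQ's real data model:

```typescript
// Sketch of grouping conversations by line number, so several can
// be open in one document at once. Types are assumptions made for
// illustration, not QQ's actual implementation.

interface Turn {
  role: "user" | "assistant";
  text: string;
}

class ConversationStore {
  private byLine = new Map<number, Turn[]>();

  // Append a turn to the conversation anchored at the given line,
  // creating the conversation on first use.
  add(line: number, turn: Turn): void {
    const turns = this.byLine.get(line) ?? [];
    turns.push(turn);
    this.byLine.set(line, turns);
  }

  // Closing the tooltip keeps the conversation in editor state, so
  // reopening QQ on the same line can restore it from here.
  get(line: number): Turn[] {
    return this.byLine.get(line) ?? [];
  }

  // Lines that currently have a conversation attached.
  openLines(): number[] {
    return [...this.byLine.keys()].sort((a, b) => a - b);
  }
}
```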
The contents of the active document are sent alongside the user's question to the language model. This provides context to the model and helps it generate more relevant responses for the user.
Here is the prompt that is sent to the language model:
As "Loop", you're a programming assistant driven by a language model.
Respect user's instructions strictly, but avoid discussing opinions, existence, sentience, or engage in arguments.
Provide accurate, logical, brief, and Markdown-formatted responses with named language in code blocks.
Respond to technical inquiries with code suggestions only within code blocks.
Account for the user's language in code blocks when using the text editor, Printloop.
Limit responses to one per interaction turn.
editing document: printloop.md
# A Simple SQLite Database
Here is a simple SQLite database:
CREATE TABLE messages (
id INTEGER PRIMARY KEY AUTOINCREMENT,
<- user's cursor is here
INSERT INTO messages (text) VALUES ('Hi!'), ('How are you?');
UPDATE messages SET text = 'Hi there!' WHERE id = 1;
SELECT * FROM messages;
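Assembling that context can be sketched as follows. The "editing document:" header and the cursor marker mirror the example above; everything else (types, delimiters, function names) is an assumption on my part, not QQ's actual implementation:

```typescript
// Sketch of combining the system prompt, the active document, and
// the cursor position into the model's context. The header and
// cursor-marker format follow the prompt example; the rest is an
// illustrative assumption.

interface DocumentContext {
  filename: string;
  lines: string[];
  cursorLine: number; // zero-based index of the user's cursor
}

function buildContext(systemRules: string, doc: DocumentContext): string {
  // Annotate the cursor's line so the model knows where the user is.
  const body = doc.lines
    .map((line, i) =>
      i === doc.cursorLine ? `${line} <- user's cursor is here` : line
    )
    .join("\n");
  return `${systemRules}\n\nediting document: ${doc.filename}\n${body}`;
}
```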
I've been using this prompt for a while now, and it's been working well. I'm sure there are ways to improve it. I'm also thinking of adding a way for the user to customize the prompt, or to add their own prompts.
I'm still working on improving the experience of using QQ, and I'm looking forward to hearing feedback from users. My main focus right now is accessibility: making sure the conversation is presented in a way that is easy to follow and understand for screen-reader users, as well as for users with cognitive or physical disabilities.
I also want to improve the triggering mechanism. Right now, the user has to type