The Prompting Myth: Does Saying ‘Please’ and ‘Thank You’ Affect AI Output?
The conventional wisdom suggests skipping small talk while prompting to save time, but technical writers should use conversation as a tool for cognitive clarity instead.

When drafting documentation, technical writers prioritize efficiency and clarity. We cut the fluff, eliminate ambiguity, and use the active voice. This same mindset has carried over to prompt engineering for Generative AI.
The conventional wisdom is crystal clear. When talking to a Large Language Model (LLM), skip the small talk. Avoid using phrases like “please” and “thank you,” or other unnecessary conversational language. The theory suggests this saves tokens (the AI’s unit of language measurement) and delivers a faster, cleaner output.
But does this pursuit of bare-bones efficiency actually compromise the quality of the prompt and therefore the output you receive?
We’re going to test this common prompting myth and explore why experts advise against skipping small talk.
What do we mean by AI Tokens?
Before we proceed, let’s understand what we mean by tokens. A token is the foundational unit of language measurement for an LLM. Unlike humans, who read continuous strings of text, the AI breaks down all input and output — words, punctuation, and even spaces — into these small segments.
For simplicity, think of a token as a small chunk of data; in English, one token is roughly equivalent to three-quarters of a word. Every request you send to an LLM, and every part of its response, is processed and billed based on these tokens. This is why maximizing token efficiency is such a legitimate financial concern for high-volume enterprise users.
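The three-quarters rule can be turned into a quick back-of-the-envelope estimator. This is a naive word-count heuristic, not a real tokenizer; actual token counts depend on the model's tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~1 token per 0.75 words in English."""
    words = len(text.split())
    return round(words / 0.75)

blunt = "Write a quickstart guide for the /auth endpoint."
polite = ("Hello, I would appreciate it if you could please write "
          "a quickstart guide for the /auth endpoint. Thank you.")

print(estimate_tokens(blunt))   # the blunt prompt costs fewer estimated tokens
print(estimate_tokens(polite))  # the polite version costs more
```

The gap between the two estimates is exactly the "wasted" courtesy tokens the efficiency argument worries about.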
The argument against conversational language in prompts is rooted in the fundamental nature of these LLMs:
1. Token Consumption
Each word, including ‘please,’ ‘thank you,’ and ‘I would appreciate it if you could,’ consumes a token. While a few extra words would not break the bank, using five unnecessary words across thousands of prompts annually wastes computational resources. For high-volume enterprise use, token efficiency is a legitimate financial concern.
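Here is a quick sketch of that cost at scale. Both the per-token price and the prompt volume below are illustrative placeholders, not any vendor's real rate.

```python
# Back-of-the-envelope cost of courtesy words at scale.
# The price per token below is illustrative, not any vendor's real rate.
PRICE_PER_1K_TOKENS = 0.01  # hypothetical USD rate

extra_tokens_per_prompt = 5   # e.g. "please", "thank you", filler
prompts_per_year = 100_000    # a hypothetical high-volume workload

wasted_tokens = extra_tokens_per_prompt * prompts_per_year
wasted_cost = wasted_tokens / 1000 * PRICE_PER_1K_TOKENS
print(f"{wasted_tokens:,} extra tokens ≈ ${wasted_cost:.2f} per year")
```

The cost scales linearly with volume, which is why this is an enterprise concern rather than an individual one.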
2. The Machine Perspective
An LLM doesn’t have emotions, feelings, or a need for affirmation. It operates purely on statistical patterns. When you input a prompt, the AI predicts the most statistically probable and practical next set of words based on its training data. Whether your input is a polite request or a blunt command, the core instructions remain the same.
In short, polite language is considered noise that does not improve the signal of your core instruction.
The Counter-Argument: Defining “Quality” in Prompting
While the efficiency argument is factually correct regarding tokens, it overlooks the most important component of the process: the user.
Writing a prompt is not just about commanding a machine; it’s a cognitive exercise in clarifying our own requirements. In practice, a slightly more conversational approach often leads to better results because it forces the user to provide two critical components: Persona and Context.
The Cognitive Benefit of Clarity
When a user slows down to write a slightly more detailed, even “polite” prompt, they are often mentally reframing the interaction from a command into a request for expertise. This naturally encourages the inclusion of crucial context:
Bare Command
Write a quickstart guide for the /auth endpoint.
This prompt lacks context and comes across as vague.
Conversational Request
Hello, I need an API quickstart guide. As an expert technical writer, please draft content for the /auth endpoint, with a focus on OAuth 2.0. Thank you.
This prompt includes persona, scope, and detail.
The polite framing did not benefit the AI, but it did help the user articulate the full scope of the task.
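The gap between the two prompts above can be made explicit by assembling the request from named parts. This is a sketch; the parameter names are my own, not any standard.

```python
def build_prompt(persona: str, task: str, scope: str) -> str:
    """Assemble a prompt from explicit parts so nothing is left implicit."""
    return f"Act as {persona}. {task} Focus on: {scope}."

prompt = build_prompt(
    persona="an expert technical writer",
    task="Draft an API quickstart guide for the /auth endpoint.",
    scope="OAuth 2.0",
)
print(prompt)
```

Notice that once persona, task, and scope are separate fields, the politeness disappears naturally; what remains is the context the conversational framing coaxed out of the user.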
Practical Test Results: Does Conversational Prompting Improve LLM Output?
To test this, I used a complex task that requires nuance and a specific voice, one technical writers deal with daily: drafting an initial troubleshooting section for a new API endpoint.

The Finding? Politeness is a User Tool, Not an AI Feature
The Actionable Takeaway: Optimize Your Prompting Workflow
The test confirms the conventional wisdom: polite language does not measurably improve the quality of output for a complex technical prompt. The AI is driven by the strength of the command and the definition of the persona, not sentiment.
However, the key is knowing why people use “please” in the first place. You do not need to treat the AI like a person, but you should treat the prompting process with rigor.

Here is the balanced approach for technical writers seeking maximum efficiency and quality:
- Always Set the Persona First: This is a non-negotiable requirement. Begin your prompt by defining the AI’s role in the System Role, such as “Act as an experienced technical editor” or “Assume the role of a senior API documentarian.”
- Focus on Specificity, Not Sentiment: Instead of adding “please,” add details about the output format. For example, “Use only Markdown headings,” “Ensure all steps are numbered,” or “Include a section on prerequisite knowledge.”
- Use Conversation for Cognitive Clarity: If framing your request as a slightly more conversational setup (like, “Hello, I need your help…”) helps you slow down and ensure you have included the Persona and all necessary technical constraints, then it is a beneficial part of your workflow. The few extra tokens are a worthwhile investment against a poor, generic output.
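The persona-first advice above maps directly onto the common system/user message convention used by most chat-based LLM APIs. The payload below is a sketch of that shape, with no network call; adapt the structure to your provider's SDK.

```python
# Sketch of a persona-first chat request payload, following the common
# system/user message convention; adapt the shape to your provider's SDK.
messages = [
    {
        "role": "system",
        "content": "Act as an experienced technical editor.",
    },
    {
        "role": "user",
        "content": (
            "Draft a troubleshooting section for the /auth endpoint. "
            "Use only Markdown headings and ensure all steps are numbered."
        ),
    },
]

# The persona lives in the system message; the specificity lives in the
# user message. No sentiment is required in either.
print(messages[0]["content"])
```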
The core lesson: do not use politeness for the AI; use it for your own thinking process.
Efficiency is key, but clarity and quality are paramount.
While the principle of eliminating fluff (“please,” “thank you,” etc.) is universal, the actual number of tokens saved and the cost difference will vary between models. This is because each LLM (like Gemini, GPT, or Claude) uses a unique tokenizer, the algorithm that breaks text into tokens. What one model counts as a single token, another may break into two. Therefore, the ultimate token savings will depend on the specific model your enterprise uses in production.
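To see why counts differ, compare two deliberately simple tokenization schemes on the same sentence. Neither is a real LLM tokenizer; they only illustrate that the splitting algorithm, not the text, determines the count.

```python
def whitespace_tokens(text: str) -> list[str]:
    """Tokenizer A: split on whitespace (one token per word)."""
    return text.split()

def chunk_tokens(text: str, size: int = 4) -> list[str]:
    """Tokenizer B: fixed 4-character chunks (a crude subword stand-in)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

sentence = "Please write the documentation."
print(len(whitespace_tokens(sentence)))  # one token count...
print(len(chunk_tokens(sentence)))       # ...and a different one
```

Real tokenizers (byte-pair encoding and its variants) sit between these two extremes, which is why the same prompt is billed differently across providers.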
To read more about prompts and prompt engineering, check out some of my other blogs.
Disclaimer: This post may contain affiliate links. If you click and buy, we may receive a small commission at no extra cost to you. Read our full disclosure here.