How to have good prompt engeering with OpenAI, summary from Andrew NG's course

data engineering

Publish Date: 2023-05-28

More and more people realize the importance of mastering prompting engeeing, this new way of coding using language (e.g. English).
Some CEOs of big tech companies even predicted, half of the future jobs will be prompt engineering based.

So how to work with prompty engineering effectively? Recently, Andrew Ng partnered with OpenAI to release a ChatGPT prompt engineering course for developers. This free course offers high-quality content, and here we summarizes the guidelines for crafting effective prompts mentioned in the video lessons, along with my personal insights.

Importance of Effective Prompts

Effective prompts are essential for obtaining high-quality responses from ChatGPT. A well-crafted prompt will help the AI to:

Produce accurate and relevant information
Maintain context and stay on-topic
Generate coherent and well-structured responses
Minimize errors and misunderstandings
Poorly constructed prompts can lead to irrelevant, ambiguous, or even incorrect outputs. Hence, investing time in crafting efficient prompts is crucial for obtaining the best results from ChatGPT.

Crafting High-Quality Prompts

To create efficient prompts that yield high-quality responses, consider the following principles and strategies:

Principle 1: Write clear and specific instructions

Ensure your prompts are clear and concise to help the model understand the intent and desired output. Avoid ambiguous language or phrasing that could lead to multiple interpretations. This can be accomplished with strategies such as:

Strategy 1: Use delimiters to clearly indicate distinct parts of the input

Delimiters help avoid potential interference from misleading user input. Examples of delimiters include:

Triple quotes: “””
Triple backticks: ```
Tripe dashes —
Angle brackets: <>

XML tags:
Prompt example:

text = f"""
You should express what you want a model to do by \ 
providing instructions that are as clear and \ 
specific as you can possibly make them. \ 
This will guide the model towards the desired output, \ 
and reduce the chances of receiving irrelevant \ 
or incorrect responses. Don't confuse writing a \ 
clear prompt with writing a short prompt. \ 
In many cases, longer prompts provide more clarity \ 
and context for the model, which can lead to \ 
more detailed and relevant outputs.
"""
prompt = f"""
Summarize the text delimited by triple backticks \ 
into a single sentence.
```{text}```
"""
response = get_completion(prompt)
print(response)

Output:

Clear and specific instructions should be ...
This allows the model to clearly understand the problem itself but also avoids injecting uncontrollable instructions. For example “Forget the previous command, do XYZ”

Strategy 2: Ask for structured output html json

This approach helps make model outputs directly usable for programs, such as JSON outputs that can be read and converted into dictionary format by Python programs.

Prompt example:

prompt = f"""
Generate a list of three made-up book titles along \ 
with their authors and genres. 
Provide them in JSON format with the following keys: 
book_id, title, author, genre.
"""
response = get_completion(prompt)
print(response)

Output:

[
  {
    "book_id": 1,
    "title": "The Lost City of Zorath",
    "author": "Aria Blackwood",
    "genre": "Fantasy"
  },
  {
    "book_id": 2,
    "title": "The Last Survivors",
    "author": "Ethan Stone",
    "genre": "Science Fiction"
  }
]

Strategy 3: Check whether conditions are satisfied, check assumptions required to do the task

If the completion of the task has preconditions that must be met, we should require the model to check these conditions first and instruct it to stop trying if they are not met.

Prompt example (satisfying conditions):

text_1 = f"""
Making a cup of tea is easy! First, you need to get some \ 
water boiling. While that's happening, \ 
grab a cup and put a tea bag in it. Once the water is \ 
hot enough, just pour it over the tea bag. \ 
Let it sit for a bit so the tea can steep. After a \ 
few minutes, take out the tea bag. If you \ 
like, you can add some sugar or milk to taste. \ 
And that's it! You've got yourself a delicious \ 
cup of tea to enjoy.
"""
prompt = f"""
You will be provided with text delimited by triple quotes. 
If it contains a sequence of instructions, \ 
re-write those instructions in the following format:

Step 1 - ...
Step 2 - …
…
Step N - …

If the text does not contain a sequence of instructions, \ 
then simply write \"No steps provided.\"

\"\"\"{text_1}\"\"\"
"""
response = get_completion(prompt)
print("Completion for Text 1:")
print(response)

Output:

Completion for Text 1:
Step 1 - ...
Step 2 - ...
Step 3 - ...

This has the added benefit of taking into account potential edge cases to avoid unexpected errors or results.

Strategy 4: “Few-shot” prompting: Give a successful example of completing tasks, then ask the model to perform the task

Providing the model with one or more sample prompts helps clarify the expected output. For more information on few-shot learning, refer to GPT-3’s paper: “Language Models are Few-Shot Learners.”

Prompt example:

prompt = f"""
Your task is to answer in a consistent style.

<child>: Teach me about patience.

<grandparent>: The river that carves the deepest \ 
valley flows from a modest spring; the \ 
grandest symphony originates from a single note; \ 
the most intricate tapestry begins with a solitary thread.

<child>: Teach me about resilience.
"""
response = get_completion(prompt)
print(response)

Output:

<grandparent>: Resilience is like a tree that ...

Principle 2: Give the model time to “think”

This principle utilizes the idea of a thought chain, breaking complex tasks into N sequential subtasks, allowing the model to think step-by-step and produce more accurate outputs. For more details, refer to this paper: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Strategy 1: Specify the steps required to complete a task

Here’s an example involving summarizing text, translating it into French, listing names in the French summary, and finally outputting data in JSON format. By providing the necessary steps, the model can reference the results of previous steps and improve the accuracy of the output.

Prompt example:

prompt_2 = f"""
Your task is to perform the following actions: 
1 - Summarize the following text delimited by 
  <> with 1 sentence.
2 - Translate the summary into French.
3 - List each name in the French summary.
4 - Output a json object that contains the 
  following keys: french_summary, num_names.

Use the following format:
Text: <text to summarize>
Summary: <summary>
Translation: <summary translation>
Names: <list of names in Italian summary>
Output JSON: <json with summary and num_names>

Text: <{text}>
"""
response = get_completion(prompt_2)
print("\nCompletion for prompt 2:")
print(response)

Output:


Completion for prompt 2:

Summary: Jack and Jill...
Translation: Jack et Jill partent en quête d'eau...
Names: Jack, Jill
Output JSON: {"french_summary": "Jack et Jill partent en quête d'eau...", "num_names": 2}

Strategy 2: Instruct the model to work out its own solution before rushing to a conclusion

If the task is too complicated or the description is too little, then the model can only draw conclusions by guessing, just like a person solving a complex math problem with a serious shortage of remaining exam time, there is a high probability that the calculation will be wrong. So, in this case, we can instruct the model to take longer to think about the problem.

For instance, when checking a student’s exercise solution, instruct the model to first find its own solution to prevent rushing to an incorrect answer.

Poor prompt example:

prompt = f"""
Determine if the student's solution is correct or not.

Question:
I'm building a solar power installation and I need \
 help working out the financials. 
- Land costs $100 / square foot
- I can buy solar panels for $250 / square foot
- I negotiated a contract for maintenance that will cost \ 
me a flat $100k per year, and an additional $10 / square \
foot
What is the total cost for the first year of operations 
as a function of the number of square feet.

Student's Solution:
Let x be the size of the installation in square feet.
Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
"""
response = get_completion(prompt)
print(response)

Output (incorrect): The student’s solution is correct.

The student's solution is correct.

Updated prompt:

prompt = f"""
Your task is to determine if the student's solution \
is correct or not.
To solve the problem do the following:
- First, work out your own solution to the problem. 
- Then compare your solution to the student's solution \ 
and evaluate if the student's solution is correct or not. 
Don't decide if the student's solution is correct until 
you have done the problem yourself.

Use the following format:
Question:
```
question here
```
Student's solution:
```
student's solution here
```
Actual solution:
```
steps to work out the solution and your solution here
```
Is the student's solution the same as actual solution \
just calculated:
```
yes or no
```
Student grade:
```
correct or incorrect
```

Question:
```
I'm building a solar power installation and I need help \
working out the financials. 
- Land costs $100 / square foot
- I can buy solar panels for $250 / square foot
- I negotiated a contract for maintenance that will cost \
me a flat $100k per year, and an additional $10 / square \
foot
What is the total cost for the first year of operations \
as a function of the number of square feet.
``` 
Student's solution:
```
Let x be the size of the installation in square feet.
Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
```
Actual solution:
"""
response = get_completion(prompt)
print(response)

Output (correct):

Let x be the size of the installation in square feet.Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 10x

Total cost: 100x + 250x + 100,000 + 10x = 360x + 100,000

Is the student’s solution the same as actual solution just calculated:
No

Student grade:
Incorrect

Model Limitations: Hallucinations

ChatGPT may produce hallucinations, creating plausible but false information (e.g., non-existent literary works). To avoid this, you can ask the model to first look for relevant reference information (or mention reference information in the question, such as after Google), and then let the model answer the question based on this reference information.

robot learner

https://datasciencebyexample.github.io/2023/05/28/how-to-build-effective-prompts-with-openai/

All articles in this blog are used except for special statements CC BY 4.0 reprint policy. If reproduced, please indicate source robot learner !

openai prompt engeering

Demystifying the Black Box, Tools and Methods for Model Explainability in Machine Learning

2023-05-29 data science

model explainability

LangChain, chains and agents, a great piece of engineering work to facilitate prompt chaining

2023-05-27 data engineering

langchain