Fine-tuning GPT-3.5 with the Titanic Dataset


In the quest for superior machine learning models, fine-tuning is a crucial step. GPT-3.5, a robust model from OpenAI, offers a fine-tuning feature that lets you tailor the model for specific tasks. One intriguing use case is employing GPT-3.5 to unravel insights from the historic Titanic dataset.

Here’s a simplified example of processing the Titanic dataset into fine-tuning prompts:

import pandas as pd

# Load the Titanic dataset
data = pd.read_csv('titanic.csv')

# Define a function to format one row into a fine-tuning prompt
def format_example(row):
    # The Titanic dataset has missing ages; label them rather than printing "nan"
    age = row['Age'] if pd.notna(row['Age']) else 'unknown'
    prompt = f"Passenger: {row['Name']}, Age: {age}, Sex: {row['Sex']}\n"
    prompt += f"Survived: {'Yes' if row['Survived'] else 'No'}\n"
    return prompt

# Create fine-tuning prompts
prompts = data.apply(format_example, axis=1)

# Save prompts to a text file, one blank line between examples
with open('fine_tuning_prompts.txt', 'w') as f:
    for prompt in prompts:
        f.write(prompt + '\n')

After processing the data, follow OpenAI’s fine-tuning guide to upload it and start a fine-tuning job. Once the job completes, you’re set to sail on a voyage of discovery with your newly fine-tuned GPT-3.5 model, exploring the depths of the Titanic dataset!
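Note that OpenAI’s fine-tuning API for gpt-3.5-turbo expects training data as JSONL in the chat format (a "messages" list per line) rather than a plain text file. Here is a minimal sketch of converting rows into that shape; the inline sample rows and the question wording are illustrative assumptions, standing in for the real titanic.csv:

```python
import json
import pandas as pd

# Hypothetical sample rows standing in for titanic.csv
# (column names assumed to match the Kaggle Titanic dataset)
data = pd.DataFrame([
    {"Name": "Braund, Mr. Owen Harris", "Age": 22.0, "Sex": "male", "Survived": 0},
    {"Name": "Cumings, Mrs. John Bradley", "Age": 38.0, "Sex": "female", "Survived": 1},
])

def to_chat_example(row):
    """Turn one passenger row into a chat-format training example."""
    return {
        "messages": [
            {"role": "user",
             "content": f"Passenger: {row['Name']}, Age: {row['Age']}, "
                        f"Sex: {row['Sex']}. Did this passenger survive?"},
            {"role": "assistant",
             "content": "Yes" if row["Survived"] else "No"},
        ]
    }

# Write one JSON object per line, as the fine-tuning endpoint expects
with open("fine_tuning_data.jsonl", "w") as f:
    for _, row in data.iterrows():
        f.write(json.dumps(to_chat_example(row)) + "\n")
```

The resulting fine_tuning_data.jsonl can then be uploaded with purpose "fine-tune" per OpenAI’s guide.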

The realm of fine-tuning is vast and the potential to uncover new insights is boundless. So, hoist your sails, the sea of data awaits!


Author: robot learner