Understanding the Difference Between Melt and Pivot in Pandas: Examples and Use Cases

data engineering

Publish Date: 2023-06-03

Pandas is a popular data manipulation library in Python that provides powerful tools for data analysis and transformation. Two commonly used functions in Pandas are melt and pivot, which allow users to reshape their data. In this blog post, we will explore the differences between these two functions and provide simple examples to illustrate their usage.

Melt:

The melt function in Pandas transforms a dataset from a wide format to a long format, an operation also known as unpivoting. It gathers several columns and “melts” them into two: one column holding the original column names and one holding their values. The result is a new DataFrame with one row for each pairing of an original row's identifier with a melted column, so values that were previously spread across multiple columns end up stacked in a single column.

Example:

Let’s consider a dataset with information about students and their scores in different subjects:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Maths': [90, 75, 80],
    'Physics': [85, 70, 95],
    'Chemistry': [92, 87, 78]
}

df = pd.DataFrame(data)

print(df)

Output:

      Name  Maths  Physics  Chemistry
0    Alice     90       85         92
1      Bob     75       70         87
2  Charlie     80       95         78

Now, let’s use the melt function to reshape the data:

melted_df = df.melt(id_vars='Name', var_name='Subject', value_name='Score')

print(melted_df)

Output:

      Name    Subject  Score
0    Alice      Maths     90
1      Bob      Maths     75
2  Charlie      Maths     80
3    Alice    Physics     85
4      Bob    Physics     70
5  Charlie    Physics     95
6    Alice  Chemistry     92
7      Bob  Chemistry     87
8  Charlie  Chemistry     78

As shown in the example, the melt function transformed the wide-format DataFrame into a long-format DataFrame, where each row represents a unique combination of the identifier (Name) and the melted column (Subject), with the corresponding values in the Score column.
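When only some of the wide columns should be unpivoted, melt also accepts a value_vars argument. The following is a minimal sketch on the same student data; the name partial_df is only illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Maths': [90, 75, 80],
    'Physics': [85, 70, 95],
    'Chemistry': [92, 87, 78],
})

# Melt only Maths and Physics. Columns that are listed in neither
# id_vars nor value_vars (here, Chemistry) are left out of the result.
partial_df = df.melt(
    id_vars='Name',
    value_vars=['Maths', 'Physics'],
    var_name='Subject',
    value_name='Score',
)

print(partial_df.shape)  # (6, 3): 3 students x 2 subjects
```

If value_vars is omitted, as in the example above, every column that is not an identifier is melted.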

Pivot:

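The pivot function in Pandas performs the inverse operation of melt. It transforms a long-format DataFrame into a wide format by spreading the unique values of one column into multiple columns. Each combination of index and column value must appear only once in the data; if there are duplicates, pivot raises a ValueError, and pivot_table, which aggregates the duplicates, is the usual alternative.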

Example:

Let’s use the melted DataFrame from the previous example and apply the pivot function:

pivoted_df = melted_df.pivot(index='Name', columns='Subject', values='Score')

print(pivoted_df)

Output:

Subject  Chemistry  Maths  Physics
Name
Alice           92     90       85
Bob             87     75       70
Charlie         78     80       95

In this example, the pivot function reshaped the long-format DataFrame back into a wide-format DataFrame. The unique values in the Subject column became the columns in the pivoted DataFrame, and the corresponding values in the Score column were spread across those columns, with each row representing a unique identifier (Name).
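The pivoted result is not yet identical to the original DataFrame: Name is now the index, the column axis carries the label Subject, and the subject columns are sorted alphabetically. The sketch below shows one way to complete the round trip, and also what happens when the long data contains a repeated (Name, Subject) pair. The names restored_df and dup_df are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Maths': [90, 75, 80],
    'Physics': [85, 70, 95],
    'Chemistry': [92, 87, 78],
})
melted_df = df.melt(id_vars='Name', var_name='Subject', value_name='Score')
pivoted_df = melted_df.pivot(index='Name', columns='Subject', values='Score')

# Turn Name back into a regular column and drop the 'Subject' label
# that pivot leaves on the column axis.
restored_df = pivoted_df.reset_index().rename_axis(columns=None)

# Put the columns back in the original order (pivot sorted them).
restored_df = restored_df[df.columns]
print(restored_df)

# pivot requires every (index, columns) pair to be unique.
# Appending a copy of the first row creates a duplicate pair.
dup_df = pd.concat([melted_df, melted_df.iloc[[0]]], ignore_index=True)
try:
    dup_df.pivot(index='Name', columns='Subject', values='Score')
except ValueError as err:
    print('pivot failed:', err)
```

For data with duplicates, pivot_table with an explicit aggfunc (for example 'mean') is one way to decide how the repeated values should be combined.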

robot learner

https://datasciencebyexample.github.io/2023/06/03/dataframe-operation-melt-and-pivot/

Unless otherwise stated, all articles in this blog are licensed under CC BY 4.0. If you reproduce this article, please credit the source: robot learner.

