Understanding the Difference Between Melt and Pivot in Pandas, Examples and Use Cases


Pandas is a popular data manipulation library in Python that provides powerful tools for data analysis and transformation. Two commonly used functions in Pandas are melt and pivot, which allow users to reshape their data. In this blog post, we will explore the differences between these two functions and provide simple examples to illustrate their usage.

Melt:

The melt function in Pandas is used to transform a dataset from a wide format to a long format, also known as unpivoting. It gathers several columns and “melts” them into two: a variable column that records which original column each value came from, and a value column that holds the values previously spread across those columns. The resulting DataFrame has one row for each combination of identifier and melted column.

Example:

Let’s consider a dataset with information about students and their scores in different subjects:

import pandas as pd

data = {
'Name': ['Alice', 'Bob', 'Charlie'],
'Maths': [90, 75, 80],
'Physics': [85, 70, 95],
'Chemistry': [92, 87, 78]
}

df = pd.DataFrame(data)

print(df)

Output:

      Name  Maths  Physics  Chemistry
0    Alice     90       85         92
1      Bob     75       70         87
2  Charlie     80       95         78

Now, let’s use the melt function to reshape the data:

melted_df = df.melt(id_vars='Name', var_name='Subject', value_name='Score')

print(melted_df)

Output:

      Name    Subject  Score
0    Alice      Maths     90
1      Bob      Maths     75
2  Charlie      Maths     80
3    Alice    Physics     85
4      Bob    Physics     70
5  Charlie    Physics     95
6    Alice  Chemistry     92
7      Bob  Chemistry     87
8  Charlie  Chemistry     78

As shown in the example, the melt function transformed the wide-format DataFrame into a long-format DataFrame, where each row represents a unique combination of the identifier (Name) and the melted column (Subject), with the corresponding values in the Score column.
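By default, melt unpivots every column that is not listed in id_vars. If only some columns should be melted, the value_vars parameter can name them. The following is a minimal sketch on the same student data; the variable name partial_df is chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Maths': [90, 75, 80],
    'Physics': [85, 70, 95],
    'Chemistry': [92, 87, 78],
})

# Melt only Maths and Physics; Chemistry is left out of the result.
partial_df = df.melt(
    id_vars='Name',
    value_vars=['Maths', 'Physics'],
    var_name='Subject',
    value_name='Score',
)

print(partial_df)
```

The result has six rows (three students times two subjects) and no Chemistry scores.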

Pivot:

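The pivot function in Pandas is essentially the inverse operation of melt. It transforms a long-format DataFrame into a wide format by spreading one column’s values across multiple columns: one column supplies the new row index, one supplies the new column names, and one supplies the cell values. Each index and column pair must appear only once in the data; otherwise pivot raises a ValueError.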
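One caveat is worth a quick sketch before the main example. When the same index and column pair occurs more than once, pivot cannot decide which value to keep and raises a ValueError. The related pivot_table function aggregates the duplicates instead. The data below is hypothetical (Alice has two Maths scores), and long_df and wide_df are illustrative names.

```python
import pandas as pd

# Hypothetical long-format data: Alice appears twice for Maths.
long_df = pd.DataFrame({
    'Name': ['Alice', 'Alice', 'Bob'],
    'Subject': ['Maths', 'Maths', 'Maths'],
    'Score': [90, 70, 75],
})

try:
    long_df.pivot(index='Name', columns='Subject', values='Score')
except ValueError as err:
    # pivot refuses to reshape when an index/column pair is duplicated.
    print(f'pivot failed: {err}')

# pivot_table resolves duplicates by aggregating them, here with the mean.
wide_df = long_df.pivot_table(
    index='Name', columns='Subject', values='Score', aggfunc='mean'
)

print(wide_df)
```

Whether the mean is the right aggregation depends on the data; sum, max, or first may fit better in other cases.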

Example:

Let’s use the melted DataFrame from the previous example and apply the pivot function:

pivoted_df = melted_df.pivot(index='Name', columns='Subject', values='Score')

print(pivoted_df)

Output:

Subject  Chemistry  Maths  Physics
Name
Alice           92     90       85
Bob             87     75       70
Charlie         78     80       95

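In this example, the pivot function reshaped the long-format DataFrame back into a wide-format DataFrame. The unique values in the Subject column became the columns in the pivoted DataFrame, and the corresponding values in the Score column were spread across those columns, with each row representing a unique identifier (Name). Note that the result is not identical to the original DataFrame: Name is now the index rather than a regular column, and the subject columns appear in alphabetical order.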
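To get back to the exact layout of the original DataFrame, a few extra steps are needed after pivot. The sketch below shows one possible way, assuming the original column order is known; restored_df is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Maths': [90, 75, 80],
    'Physics': [85, 70, 95],
    'Chemistry': [92, 87, 78],
})

melted_df = df.melt(id_vars='Name', var_name='Subject', value_name='Score')
pivoted_df = melted_df.pivot(index='Name', columns='Subject', values='Score')

# Move Name from the index back into a regular column.
restored_df = pivoted_df.reset_index()
# Remove the 'Subject' label that pivot attached to the column axis.
restored_df.columns.name = None
# Put the columns back in their original order.
restored_df = restored_df[['Name', 'Maths', 'Physics', 'Chemistry']]

# Compare the round-tripped frame with the original.
print(restored_df.equals(df))
```

The row order matches here only because pivot sorts the index and the names were already in alphabetical order; with unsorted names, the rows would need reordering as well.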


Author: robot learner
Reprint policy: Unless otherwise stated, all articles in this blog are licensed under CC BY 4.0. If you reproduce this article, please credit the source: robot learner.