- Aug 21, 2021

[python] How to visualize data

Introduction

When working on a machine learning task such as a Kaggle competition, the first step is to visualize the data, and seaborn is a common choice for that. But have you ever wondered which plot to use, given how many types of graph there are? (I have.)

Many articles explain which function draws a given graph, but few explain in which situations a given graph is the right choice. So here I summarize which seaborn function to use for each combination of explanatory-variable type and objective-variable type.

Environment: Python 3.6.6, seaborn 0.10.0

Explanatory variable: Discrete quantity (category) Objective variable: Discrete quantity

The first case is when both the explanatory variable and the objective variable are discrete quantities (categories). Use seaborn's countplot, which draws how many rows of each objective-variable category exist within each explanatory-variable category. Pass the explanatory variable to the x argument of countplot and the objective variable to hue. The data is Kaggle's Titanic dataset.

import pandas as pd
import seaborn as sns

# Kaggle Titanic training data
data = pd.read_csv("train.csv")
# One group of bars per Embarked value, split by Survived
sns.countplot(x='Embarked', data=data, hue='Survived')

You can also swap x and hue; which one to use is a matter of taste.

sns.countplot(x='Survived', data=data, hue='Embarked')
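
To check the numbers behind the bars, the same counts can be tabulated directly. The following is a minimal sketch on a small made-up DataFrame (the `toy` data is hypothetical and only borrows the Titanic column names), using pandas' crosstab:

```python
import pandas as pd

# Hypothetical stand-in for the Titanic columns used above
toy = pd.DataFrame({
    "Embarked": ["S", "C", "S", "Q", "S", "C"],
    "Survived": [0, 1, 1, 0, 0, 1],
})

# Rows: explanatory variable, columns: objective variable.
# Each cell is the height of one bar in the countplot.
counts = pd.crosstab(toy["Embarked"], toy["Survived"])
print(counts)
```

Swapping the two arguments of crosstab corresponds to swapping x and hue in countplot.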

Explanatory variable: Continuous quantity Objective variable: Discrete quantity

Next is the case where the explanatory variable is a continuous quantity and the objective variable is a discrete quantity. Draw the distribution of the explanatory variable for each category of the objective variable with seaborn's distplot.

# `size` was renamed to `height` in seaborn 0.9
g = sns.FacetGrid(data=data, hue='Survived', height=5)
g.map(sns.distplot, 'Fare')
g.add_legend()

Please refer to the other article for how to color-code with a function that does not take hue as an argument.

Explanatory variable: Discrete quantity Objective variable: Continuous quantity

Next is the case where the explanatory variable is a discrete quantity and the objective variable is a continuous quantity. Draw the distribution of the objective variable for each category of the explanatory variable with seaborn's violinplot. The data is Kaggle's House Prices dataset.

# Kaggle House Prices training data (a different train.csv from the Titanic one)
train_data = pd.read_csv("train.csv")
sns.violinplot(x="MSZoning", y="SalePrice", data=train_data)
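
If you also want numbers to read alongside the violins, a per-category summary is one option. This is only a sketch on made-up values (`toy` is hypothetical and just reuses the House Prices column names):

```python
import pandas as pd

# Hypothetical stand-in for the House Prices columns used above
toy = pd.DataFrame({
    "MSZoning": ["RL", "RL", "RM", "RM", "RM"],
    "SalePrice": [200000, 220000, 120000, 140000, 130000],
})

# One row per category of the explanatory variable,
# summarizing the objective variable within that category
summary = toy.groupby("MSZoning")["SalePrice"].agg(["count", "median", "min", "max"])
print(summary)
```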

Explanatory variable: Continuous quantity Objective variable: Continuous quantity

Finally, the case where both the explanatory variable and the objective variable are continuous quantities. Draw the relationship between the explanatory variable and the objective variable with seaborn's jointplot.

sns.jointplot(x="LotArea", y="SalePrice", data=train_data)

This joint plot is excellent because you can see the correlation between two variables and their distribution at the same time.
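
The plot shows the correlation visually. To put a number on it, one option is the Pearson correlation coefficient from pandas. A minimal sketch on made-up values (`toy` is hypothetical and just reuses the House Prices column names):

```python
import pandas as pd

# Hypothetical stand-in for the House Prices columns used above
toy = pd.DataFrame({
    "LotArea": [8000, 9000, 10000, 11000, 12000],
    "SalePrice": [150000, 190000, 210000, 190000, 210000],
})

# Series.corr computes the Pearson correlation coefficient by default
r = toy["LotArea"].corr(toy["SalePrice"])
print(round(r, 3))
```

A value near 1 or -1 indicates a strong linear relationship, and a value near 0 indicates a weak one.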

Summary

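The above is summarized in the table below.

Explanatory variable    Objective variable    seaborn function
Discrete (category)     Discrete              countplot
Continuous              Discrete              distplot (via FacetGrid)
Discrete                Continuous            violinplot
Continuous              Continuous            jointplot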
