A Simple Trick to Do Your Data Analysis in Seconds | by Christopher Tao | Jan, 2024


Uncover Hidden Insights using ydata-profiling

Exploratory Data Analysis (EDA) plays a crucial role in data science: it lets us gain insights into a dataset and understand the patterns within it. In one of my previous articles, I introduced “Pandas GUI”, an out-of-the-box Python EDA tool.

Now, let’s turn our attention to “ydata-profiling,” a successor to the popular “pandas-profiling” library. “ydata-profiling” offers advanced EDA capabilities and addresses the limitations of its predecessor, making it an invaluable resource for data scientists and analysts.

Image by Stevenom from Pixabay

As always, before we can start to use the library, we need to install it using pip.

pip install ydata-profiling

To conduct EDA, we need a dataset. Let’s use one of the most famous public datasets, the Iris dataset, for this demo. You can get it from the scikit-learn library. However, since we are not going to use scikit-learn anywhere else in this demo, it is easier to use the copy I found on the datahub.io website, which you can load directly.

https://datahub.io/machine-learning/iris/r/iris.csv
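If you would rather take the scikit-learn route mentioned above, here is a minimal sketch. It assumes scikit-learn and pandas are installed; the variable name `iris_df` and the `species` column name are my own choices for illustration, not something the rest of this demo relies on.

```python
from sklearn.datasets import load_iris

# as_frame=True asks scikit-learn to return pandas objects instead of NumPy arrays
iris = load_iris(as_frame=True)

# .frame holds the four feature columns plus an integer-coded "target" column
iris_df = iris.frame

# Map the integer codes (0, 1, 2) to species names.
# "species" is a column name chosen here for illustration.
iris_df["species"] = iris.target.map(dict(enumerate(iris.target_names)))

print(iris_df.shape)
print(iris_df["species"].unique())
```

Note that the scikit-learn copy uses its own column names (for example `sepal length (cm)`) and an integer-coded target, so its columns may not line up exactly with those in the CSV file.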

We can easily load the data from the URL into a pandas DataFrame as follows.

import pandas as pd

df = pd.read_csv("https://datahub.io/machine-learning/iris/r/iris.csv")
df.head()
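For comparison, here is a rough sketch of the manual first pass that a profiling report is meant to save us from. The helper name `quick_summary` and the tiny `sample` frame are hypothetical stand-ins so the snippet runs offline; in practice you would pass in the `df` loaded above.

```python
import pandas as pd


def quick_summary(df: pd.DataFrame) -> dict:
    """Collect a few basic facts that a manual first pass of EDA usually checks."""
    return {
        "shape": df.shape,                          # (rows, columns)
        "dtypes": df.dtypes.astype(str).to_dict(),  # column name -> dtype name
        "missing": df.isna().sum().to_dict(),       # column name -> number of missing values
        "numeric_stats": df.describe(),             # count, mean, std, min, quartiles, max
    }


# Hypothetical stand-in rows (not the Iris data) so the sketch runs without a download
sample = pd.DataFrame(
    {
        "length": [5.1, 4.9, None, 6.3],
        "width": [3.5, 3.0, 3.2, 3.3],
        "label": ["a", "a", "b", "b"],
    }
)

summary = quick_summary(sample)
print(summary["shape"])
print(summary["missing"])
print(summary["numeric_stats"])
```

Each of these checks is a one-liner, but they add up quickly once you also want distributions, correlations and duplicates, which is where an automated report becomes attractive.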

Then, we can import the ProfileReport class from the ydata-profiling library to generate the EDA report from the pandas DataFrame.

from ydata_profiling…