Exploratory Data Analysis: What Do We Know About YouTube Channels (Part 2) | by Dmitrii Eliuseev | Nov, 2023

Getting statistical insights using Pandas and the YouTube Data API

Image by Souvik Banerjee, Unsplash

In the first part of the story, I collected statistical data from about 3000 YouTube channels and got some interesting insights. In this part, I will go a bit deeper, from the generic “channel” to the individual “video” level. I will show how to collect data about YouTube videos and what kind of insights we can get.

Methodology

To collect data about YouTube videos, we need to perform several steps:

  • Get credentials for the YouTube Data API. It’s free, and the default quota of 10,000 units per day (the quota is counted in units, not in requests) is enough for our task.
  • Find several YouTube channels that we want to analyze.
  • Write some Python code to get the latest videos and their stats for a selected channel. Historical YouTube analytics is available only to channel owners; through the public API, we can only read the values as they are at the moment of the request. But we can run the code periodically and accumulate the history ourselves. In my case, I collected data for three weeks using Apache Airflow and a Raspberry Pi.
  • Perform the data analysis. I will be using Pandas, Matplotlib, and Seaborn for that.
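Because the API reports only current counters, the history has to be built from repeated snapshots. The following is a minimal sketch of that idea, not the code used in this article: the snapshot rows, the video IDs, and the views_gained helper are all hypothetical, and plain Python is used here instead of Pandas to keep the example dependency-free.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical snapshots: (timestamp, video_id, view_count) rows, as a
# scheduled job might append them. All values are made up for illustration.
snapshots = [
    (datetime(2023, 11, 1, 12), "video_a", 100),
    (datetime(2023, 11, 2, 12), "video_a", 180),
    (datetime(2023, 11, 3, 12), "video_a", 210),
    (datetime(2023, 11, 1, 12), "video_b", 50),
    (datetime(2023, 11, 2, 12), "video_b", 55),
]


def views_gained(rows):
    """Return {video_id: [(timestamp, views gained since the previous snapshot), ...]}."""
    by_video = defaultdict(list)
    # Sorting the tuples orders the rows by timestamp first.
    for ts, video_id, views in sorted(rows):
        by_video[video_id].append((ts, views))

    gains = {}
    for video_id, series in by_video.items():
        # Pair each snapshot with the one before it and take the difference.
        gains[video_id] = [
            (ts, views - prev_views)
            for (_, prev_views), (ts, views) in zip(series, series[1:])
        ]
    return gains


daily_gains = views_gained(snapshots)
```

Each video ends up with one value per pair of consecutive snapshots, so a video seen only once produces an empty list. With Pandas, the same step would typically be a groupby on the video ID followed by a diff.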

Getting the YouTube API credentials and configuring Apache Airflow were described in my previous articles, and I recommend that readers pause here and read them first.

And now, let’s get started.

1. Getting the data

To get information about YouTube videos, I will use the python-youtube library. Surprisingly, there is no ready-to-use method to get the list of videos from a specific channel, so we need to implement it on our own.

First, we need to call the get_channel_info method, which, as its name suggests, returns the basic information about the channel.

from pyyoutube import Api

def get_channel_info(api: Api, channel_id: str)…
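One possible way to build the missing "list of videos" step, not necessarily the one used in this article, relies on a documented feature of the YouTube Data API: every channel resource exposes the ID of its uploads playlist under contentDetails.relatedPlaylists.uploads, and that playlist can then be read page by page. The sketch below only shows the extraction step and works on a plain dictionary, so it assumes the channel info has already been converted to a dict. The helper name get_uploads_playlist_id and the sample IDs are hypothetical.

```python
from typing import Optional


def get_uploads_playlist_id(channel_item: dict) -> Optional[str]:
    """Hypothetical helper: return the uploads playlist ID of a channel, or None.

    Assumes channel_item follows the shape of a YouTube Data API channel
    resource that was requested with the "contentDetails" part.
    """
    # "or {}" guards against keys that are present but set to None.
    content_details = channel_item.get("contentDetails") or {}
    related_playlists = content_details.get("relatedPlaylists") or {}
    return related_playlists.get("uploads")


# Made-up channel resource, reduced to the fields used above.
sample_channel = {
    "id": "UC_example_channel_id",
    "contentDetails": {"relatedPlaylists": {"uploads": "UU_example_uploads_id"}},
}

uploads_id = get_uploads_playlist_id(sample_channel)
```

Returning None instead of raising keeps the caller simple when a channel was requested without the contentDetails part.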
