Process Pandas DataFrames with a Large Language Model | by Dmitrii Eliuseev | Mar, 2024

Seamless Integration of Python, Pandas, and LLM

Pandas, Image by Stone Wang, Unsplash

Nowadays, it is easy to use different large language models (LLMs) via a web interface or a public API. But can we seamlessly integrate an LLM into the data analysis process and use the model directly from Python or a Jupyter Notebook? Indeed we can, and in this article I will show three different ways to do it. As usual, all components used in the article are available for free.

Let’s get into it!

1. Pandas AI

The first Python library I am going to test is Pandas AI. It allows us to ask questions about a Pandas dataframe in natural language. As a toy example, I created a small dataframe of European countries and their populations (the list also includes several non-EU countries, such as Norway and Switzerland):

import pandas as pd

df = pd.DataFrame({
    "Country": ['Austria', 'Belgium', 'Bulgaria', 'Croatia', 'Cyprus', 'Czech Republic', 'Denmark', 'Estonia', 'Finland',
                'France', 'Germany', 'Greece', 'Hungary', 'Iceland', 'Ireland', 'Italy', 'Latvia', 'Liechtenstein', 'Lithuania',
                'Luxembourg', 'Malta', 'Monaco', 'Montenegro', 'Netherlands', 'Norway', 'Poland', 'Portugal', 'Romania', 'Serbia',
                'Slovakia', 'Slovenia', 'Spain', 'Sweden', 'Switzerland'],
    "Population": [8_205000, 10_403000…
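The listing above is cut off, so here is a minimal sketch of what a natural-language question about such a dataframe amounts to in plain Pandas. It does not use the Pandas AI API. The helper names `largest_population` and `total_population` are hypothetical, and the sample deliberately contains only the two population values visible in the truncated listing:

```python
import pandas as pd

# Assumption: a two-row sample built only from the values visible
# in the truncated listing (Austria and Belgium).
sample = pd.DataFrame({
    "Country": ["Austria", "Belgium"],
    "Population": [8_205000, 10_403000],
})


def largest_population(frame: pd.DataFrame) -> str:
    """Plain-Pandas answer to 'Which country has the largest population?'."""
    # idxmax returns the index label of the row with the maximum value.
    return frame.loc[frame["Population"].idxmax(), "Country"]


def total_population(frame: pd.DataFrame) -> int:
    """Plain-Pandas answer to 'What is the total population?'."""
    return int(frame["Population"].sum())


print(largest_population(sample))
print(total_population(sample))
```

With a natural-language interface, the appeal is that questions like these can be asked without writing the indexing and aggregation code by hand. Keeping the plain-Pandas version nearby is still a handy way to check the model's answers.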
