
How To Chat With Multiple CSV Files Together - Azure OpenAI + LangChain

When dealing with multiple CSV files having different columns, it’s essential to have an efficient method for querying and extracting relevant information. Azure OpenAI and LangChain provide a robust combination for handling such scenarios. 



In this article, we’ll explore how to load multiple CSV files, process them, and ask questions across all of them.

Let’s get started by installing the required packages.

Install Required Packages

Here are the packages you need to install:

! pip install openai
! pip install langchain
! pip install pandas
! pip install langchain_experimental
! pip install langchain_openai
! pip install tabulate

Import Required Packages

Here are the imports we need to get started:

from langchain_openai import AzureOpenAI
from langchain_experimental.agents import create_csv_agent

Read Configuration

First, we need to set a few variables using values from the Azure portal and Azure OpenAI Studio:

api_type = "azure"
api_base = "PLACE YOUR ENDPOINT HERE"
api_version = "2024-02-15-preview"
api_key = "PLACE YOUR KEY HERE"
deployment_name = "gpt-35-turbo-instruct"

You can also supply the above configuration through environment variables or by placing it in a configuration file.
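As a minimal sketch of the environment-variable route, the helper below reads the same values from the environment. The variable names (AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY, AZURE_OPENAI_DEPLOYMENT, AZURE_OPENAI_API_VERSION) and the helper load_azure_config are assumptions chosen for illustration, not names taken from this walkthrough.

```python
import os


def load_azure_config(env=None):
    """Read Azure OpenAI settings from environment variables.

    The variable names below are illustrative assumptions; use whatever
    names your own environment defines.
    """
    env = os.environ if env is None else env
    required = {
        "api_base": "AZURE_OPENAI_ENDPOINT",
        "api_key": "AZURE_OPENAI_API_KEY",
        "deployment_name": "AZURE_OPENAI_DEPLOYMENT",
    }
    # Fail early with a clear message if any required value is absent or empty.
    missing = [name for name in required.values() if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))

    config = {key: env[name] for key, name in required.items()}
    # Fall back to the API version used in this article when none is set.
    config["api_version"] = env.get("AZURE_OPENAI_API_VERSION", "2024-02-15-preview")
    return config
```

Calling load_azure_config() with no arguments reads os.environ. The returned dictionary uses the same keys as the variables above, so config["api_key"] can be passed wherever api_key is used.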

Define LLM

Here we will use Azure OpenAI as our LLM, so we need to create an AzureOpenAI object, as shown below:

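llm = AzureOpenAI(
    openai_api_key=api_key,
    deployment_name=deployment_name,
    # MODEL_NAME was never defined above; this assumes the underlying model
    # has the same name as the deployment. Adjust if yours differs.
    model_name=deployment_name,
    api_version=api_version,
    azure_endpoint=api_base,
)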

Create CSV Agent

To create the CSV agent, we will use the create_csv_agent(…) function from langchain_experimental, which takes a few key parameters, such as the LLM and the list of CSV files:

# Note: newer langchain_experimental releases may also require allow_dangerous_code=True here.
agent = create_csv_agent(llm=llm, path=['news_2012.csv', 'nasdaq_2012.csv'], verbose=True)
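Because the files have different columns, it can help to check each file's header and row count before asking the agent about them. The sketch below uses only Python's standard csv module; summarize_csv and summarize_all are hypothetical helper names, and the files are assumed to be UTF-8 encoded with a header row.

```python
import csv


def summarize_csv(path):
    """Return the column names and number of data rows in a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # first row is assumed to hold column names
        # Count the remaining rows, ignoring completely blank lines.
        row_count = sum(1 for row in reader if row)
    return {"path": path, "columns": header, "rows": row_count}


def summarize_all(paths):
    """Summarize several CSV files that may have different columns."""
    return [summarize_csv(path) for path in paths]
```

For the two files used here, summarize_all(['news_2012.csv', 'nasdaq_2012.csv']) gives a baseline to compare against the agent's answer to a row-count question.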

Ask Questions From CSV Files

Finally, we invoke the agent, which takes just one line of code:

agent.invoke("How many rows of data do you have?")

Because verbose=True was set, executing the above line prints the agent's intermediate steps along with its final answer.
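The call also returns a value that can be captured. In recent LangChain versions, an agent executor's invoke typically returns a dictionary with the answer under the "output" key. The helper below is a sketch built on that assumption (extract_answer is a hypothetical name); it falls back to the string form of the result when that key is absent.

```python
def extract_answer(result):
    """Pull the final answer text out of an agent result.

    Assumes the result is a dict with an "output" key; anything else is
    returned as a plain string.
    """
    if isinstance(result, dict) and "output" in result:
        return str(result["output"])
    return str(result)
```

With the agent defined above, this could be used as answer = extract_answer(agent.invoke("How many rows of data do you have?")).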


Conclusion

I hope you find this walkthrough useful.

If anything is unclear, I recommend watching my video recording, which demonstrates this flow end to end.

The video also includes a few more examples of different types of queries executed on the two CSV files.


Happy learning!

Comments

  1. can you try using the latest version of langchain? as there're so many new major updates in the latest versions
