Answer your questions based on local embedding
In this article, I’ll walk you through the high-level steps required to query a comma-separated values (CSV) file and get answers from it.
Let’s get started by importing the required packages.
Import Required Packages
import pandas as pd
import tiktoken
import openai
import os
import numpy as np
Get OpenAI API Key
To get an OpenAI API key, go to https://openai.com/, log in, and copy a key from the API keys page of your account.
Once you have the key, set it in an environment variable (I’m using Windows).
os.environ["OPENAI_API_KEY"] = "YOUR_KEY"
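If you would rather not hard-code the key in the script, one option is to set it outside Python and only read it at startup. The sketch below assumes the same variable name, OPENAI_API_KEY; the helper name require_api_key is hypothetical and not part of any library.

```python
import os


def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key stored in `env`, or fail early with a clear message.

    `require_api_key` is a hypothetical helper for illustration only.
    """
    key = env.get(name, "").strip()
    # Treat a missing value and the article's placeholder the same way.
    if not key or key == "YOUR_KEY":
        raise RuntimeError(f"Set the {name} environment variable before running.")
    return key
```

Calling require_api_key() at the top of the script surfaces a missing key right away instead of at the first API request.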
Parse CSV File
Next, we need to read the CSV file and extract the required columns. Here, I’ve used the HBO Max dataset from Kaggle.
During extraction, we need to decide which columns to use for answering questions. In my case, I introduced a new column named summarized, which contains all the information that could help answer a user’s query.
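Here is a minimal sketch of this step. The column names (title, type, year, rating) and the inline sample rows are assumptions that stand in for the Kaggle file, and add_summarized_column is a hypothetical helper, so adjust both to your data.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the Kaggle CSV; real column names may differ.
SAMPLE_CSV = """title,type,year,rating
The Wire,TV,2002,TV-MA
Dune,MOVIE,2021,PG-13
"""


def add_summarized_column(df, columns):
    """Return a copy of df with a 'summarized' column joining the chosen columns."""
    out = df.copy()
    out["summarized"] = out[columns].astype(str).apply(
        # Each row becomes one "column: value" string, e.g. "title: Dune; type: MOVIE".
        lambda row: "; ".join(f"{col}: {val}" for col, val in row.items()),
        axis=1,
    )
    return out


# With a real file, replace io.StringIO(SAMPLE_CSV) with the path to your CSV.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))
df = add_summarized_column(df, ["title", "type", "year", "rating"])
print(df["summarized"].iloc[0])
```

Keeping the column name next to each value gives the model some context about what every value means when the row is later used to answer a question.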
Once all the information is gathered, we need to calculate the number of tokens associated with every row/cell…
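Token counting can be sketched as below. The encoding name cl100k_base is an assumption (pick the one that matches your embedding model), and count_tokens, token_counts, and tiktoken_encoder are hypothetical helper names. The encoder is passed in as a function so the counting logic stays independent of tiktoken.

```python
def count_tokens(text, encode):
    """Number of tokens `encode` produces for `text`."""
    return len(encode(text))


def token_counts(texts, encode):
    """Token count for each string in `texts`, in order."""
    return [count_tokens(t, encode) for t in texts]


def tiktoken_encoder(encoding_name="cl100k_base"):
    """Return the encode function of a tiktoken encoding.

    Assumption: cl100k_base fits the model in use. The import is kept local
    so the two helpers above work without tiktoken installed.
    """
    import tiktoken

    return tiktoken.get_encoding(encoding_name).encode
```

With a DataFrame that has a summarized column, usage could look like df["tokens"] = token_counts(df["summarized"], tiktoken_encoder()).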