How To Generate SQL Queries Using Azure OpenAI

Shweta Lodha
3 min readOct 9, 2024

--

In the ever-evolving landscape of data management and artificial intelligence, the ability to generate SQL queries using natural language inputs is a game-changer. Azure OpenAI, with its powerful language models, offers a seamless way to translate natural language into SQL queries, making data interaction more intuitive and accessible. This article will guide you through the process of leveraging Azure OpenAI to generate SQL queries, enhancing your data querying capabilities.

Introduction to Azure OpenAI

Azure OpenAI Service provides access to OpenAI’s powerful language models through the Azure platform. These models are capable of understanding and generating human-like text, making them ideal for tasks such as natural language processing, text generation, and, importantly, translating natural language into SQL queries.

Setting Up Azure OpenAI and Azure SQL DB

Before you can start generating SQL queries, you need to set up your Azure OpenAI environment. Here are the steps:

  1. Create an Azure Account: If you don’t already have one, sign up for an Azure account.
  2. Set Up Azure OpenAI Service: Navigate to the Azure portal and create an Azure OpenAI resource.
  3. Set Up Azure SQL Database: You should have your tables ready in Azure SQL DB. If not, you can use below scripts to create sample tables for yourself.

Here’s a script for create 3 tables — Student, Course and Department with proper relationships:

-- Creating the Department table
CREATE TABLE Department (
DepartmentID INT PRIMARY KEY,
DepartmentName NVARCHAR(100) NOT NULL
);

-- Creating the Course table
CREATE TABLE Course (
CourseID INT PRIMARY KEY,
CourseName NVARCHAR(100) NOT NULL,
DepartmentID INT,
FOREIGN KEY (DepartmentID) REFERENCES Department(DepartmentID)
);

-- Creating the Student table
CREATE TABLE Student (
StudentID INT PRIMARY KEY,
StudentName NVARCHAR(100) NOT NULL,
CourseID INT,
FOREIGN KEY (CourseID) REFERENCES Course(CourseID)
);

In this script:

  • Department table contains departments.
  • Course table contains courses offered by each department and references the Department table through DepartmentID.
  • Student table contains students and references the Course table through CourseID.

This setup ensures that each course is associated with a department, and each student is associated with a course. Now you can go ahead and insert data for these tables using your insert script.

At this point, we are good to dive into the code. Let’s start by installing the required packages.

Install Required Packages

Here are the ones which you need to install:

! pip install requests
! pip install azure-core
! pip install azure-identity

Import Required Packages

Here are the packages which we need to import to get started:

import requests
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

Setup Creds and Header

I’m going with the DefaultAzureCredentials as I’ve already logged in the editor, you can choose your own way of authentication. You can opt for key-based authentication as well.

azure_credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(azure_credential,<SCOPE_GOES_HERE>)
token= token_provider()

# Headers for the API request
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {token}"
}

Construct the Prompt and Payload

Here is my prompt for one of the SQL queries:

prompt = """
Generate a SQL query to get courses and their respective departments.

The table schemas are as below:
- Department(DepartmentID, DepartmentName)
- Course(CourseID,CourseName,DepartmentID)
- Student(StudentID,StudentName,CourseID)
"""

and here goes payload:

data = {
"messages": [
{
"role": "system",
"content": [
{
"type": "text",
"text": "You are an AI assistant that can generate SQL queries very efficiently using given table schemas. <ADD_MORE_INSTRUCTIONS_IF_REQUIRE>"
}
]
},
{
"role": "user",
"content": [
{
"type": "text",
"text": prompt
}
]
},
],
"max_tokens": 200
}

Make an API Request and Get Response

endpoint = "<ENDPOINT_GOES_HERE>/openai/deployments/<MODEL_GOES_HERE>/chat/completions?api-version=<VERSION_GOES_HERE>"
response = requests.post(endpoint, headers=headers, json=data)
response.json()["choices"][0]["message"]["content"].strip()

On successful execution, you will see the response something like this:

Happy querying!

--

--