In the rapidly evolving financial services industry, the ability to efficiently handle customer interactions is paramount. The call center, often the first point of contact for customers, plays a crucial role in shaping customer experiences and satisfaction levels. However, traditional call center operations face challenges in terms of speed, efficiency, and accuracy, which can impact customer satisfaction and business outcomes. Large volumes of audio call recordings need to be stored, processed, and governed, and insights and next-best-action recommendations need to be intelligently extracted from this data to drive customer happiness and retention. Many personas have a stake in deriving value from this data, including Banking Service Agents, Investment Advisors, Insurance Claims Officers, Adjusters, and Customer Service Operators.
With the Databricks Data Intelligence Platform, we can unlock the value of this data at scale. The new AI Query (ai_query) makes developing an end-to-end workflow as simple as a few lines of SQL, reducing complexity and saving time.
A traditional call center analytics workflow involves many steps and technologies.
With Large Language Models (LLMs), the NLP analytics steps of the workflow can be simplified with prompt engineering. But there is still a lot of work required to host LLMs, make API calls to LLMs, develop pipelines, and orchestrate workflows. It can be particularly challenging to achieve good performance at scale. This is where the Databricks AI Query (ai_query SQL API) comes to the rescue. It not only simplifies the implementation of the workflow with SQL language but also optimizes the batch LLM inference performance at scale with built-in fault tolerance.
The ai_query function provides a simple way to apply AI directly to data on Databricks. It supports Databricks Foundation Model endpoints, external model endpoints, and custom model endpoints hosted with Databricks Model Serving.
Here is a simple example that shows the syntax of ai_query:
SELECT
  ai_query(
    "llama-70b-endpoint",
    "Summarize this call transcript: " || transcript
  ) AS summary_analysis
FROM call_center_transcripts;
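The same pattern applies to external model endpoints registered with Databricks Model Serving. Here is a minimal sketch, assuming a hypothetical external endpoint named my-external-chat-endpoint that proxies a hosted chat model; replace the endpoint name with your own.

-- Hypothetical: "my-external-chat-endpoint" is an external model endpoint
-- registered in Databricks Model Serving; substitute your own endpoint name.
SELECT
  ai_query(
    "my-external-chat-endpoint",
    "Classify the urgency of this call as high, medium, or low: " || transcript
  ) AS urgency
FROM call_center_transcripts;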
Using ai_query, you can now run batch LLM inference at high scale with unmatched speed, thanks to Databricks Model Serving, which minimizes batch LLM inference processing time and cost by auto-scaling resources, adjusting batching configurations, and improving workload management. In addition, built-in fault tolerance with automatic retries ensures large workloads run smoothly, handling transient errors without disruption.
Here is an example of how one can build a call center batch LLM inference workflow on Databricks (Figure 1) in just four steps using ai_query.
Figure 1. Call Center Batch Inference Workflow
Typically, call center datasets associate a caller ID with each audio file. In this example, the caller ID is embedded in the names of the folders that contain the audio files.
CREATE OR REFRESH STREAMING TABLE raw_audio_files
AS
SELECT
  *,
  regexp_extract(path, r'.*\/caller_id_(\d+)\/.*', 1) AS caller_id
FROM STREAM read_files(
  '/Volumes/genai/call_center/volume_speech/audio_clips/',
  format => 'binaryFile',
  inferColumnTypes => 'true',
  recursiveFileLookup => 'true',
  pathGlobFilter => '*.wav'
);
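A quick sanity check can confirm the caller ID is being extracted from the folder names as expected. This is a sketch only; the example path in the comment is hypothetical.

-- Preview a few rows to verify the caller_id extraction, e.g.
-- .../audio_clips/caller_id_1042/clip_01.wav  ->  caller_id = 1042 (hypothetical path)
SELECT path, caller_id, length, modificationTime
FROM raw_audio_files
LIMIT 5;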
Foundation models such as the OpenAI Whisper large-v3 speech-to-text model and the Meta Llama 3.1 series LLMs are available in the Unity Catalog schema system.ai. One can navigate to a model asset and deploy it with a few clicks. Please follow the screen recording below to deploy a foundation model.
Navigate to system.ai for foundation models
Find and deploy the OpenAI Whisper large-v3 speech-to-text model
The recommended endpoint compute for the OpenAI Whisper large-v3 model is:
Model Name | Suggested workload type (AWS) | Suggested workload type (Azure)
whisper_large_v3 | GPU Medium | GPU Large
Find and deploy the Llama 3.1 70B LLM
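Once the endpoints are deployed, a quick smoke test confirms the LLM endpoint responds before launching the full batch job. This is a sketch; the endpoint name matches the one used in step 4 below, so adjust it to your own deployment.

-- Hypothetical smoke test against the deployed Llama 3.1 70B endpoint
SELECT ai_query(
  'meta-llama3-1-70b',
  'Reply with the single word OK.'
) AS endpoint_check;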
The ai_query function is applied to the content column, which was created in step 1 when the files were loaded and contains the raw audio bytes.
CREATE TABLE IF NOT EXISTS raw_audio_transcription (
  caller_id STRING,
  modificationTime TIMESTAMP,
  length INT,
  transcript STRING
)
USING DELTA;

INSERT INTO raw_audio_transcription (caller_id, modificationTime, length, transcript)
SELECT
  caller_id,
  modificationTime,
  length,
  ai_query(
    "whisper_v3",
    content,
    failOnError => True
  ) AS transcript
FROM raw_audio_files;
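Before running the downstream NLP analysis, it can be useful to spot-check a few transcripts. The sketch below only assumes the raw_audio_transcription table created above.

-- Preview the first 120 characters of a few transcriptions
SELECT caller_id, length, substr(transcript, 1, 120) AS transcript_preview
FROM raw_audio_transcription
LIMIT 5;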
The example here shows a version of the prompts that works well with Llama 3.1 LLMs and an example dataset for the tasks of summarization, sentiment analysis, and topic analysis. We encourage readers to experiment and optimize the prompts based on your own data and chosen LLMs.
CREATE TABLE IF NOT EXISTS transcription_nlp_analysis (
  caller_id STRING,
  modificationTime TIMESTAMP,
  length INT,
  transcript STRING,
  summary STRING,
  sentiment STRING,
  topic STRING
)
USING DELTA;
INSERT INTO transcription_nlp_analysis (caller_id, modificationTime, length, transcript, summary, sentiment, topic)
SELECT
  caller_id,
  modificationTime,
  length,
  transcript,
  ai_query(
    'meta-llama3-1-70b',
    CONCAT(
      "Summarize the conversation in a maximum of 150 words: ",
      transcript
    ),
    failOnError => True,
    modelParameters => named_struct('max_tokens', 300, 'temperature', float(0))
  ) AS summary,
  ai_query(
    'meta-llama3-1-70b',
    CONCAT(
      "Analyze the sentiment of the conversation: ",
      transcript,
      ". Return 'positive', 'negative', or 'neutral'. Return only the overall sentiment, do not explain."
    ),
    failOnError => True,
    modelParameters => named_struct('max_tokens', 50, 'temperature', float(0))
  ) AS sentiment,
  ai_query(
    'meta-llama3-1-70b',
    CONCAT(
      "Return the predominant topic in the below conversation. Please include only one main topic from the provided list: ",
      transcript,
      "\n\nList of topics with descriptions delimited with ':'\n",
      "* car accident: the customer was involved in a car accident\n",
      "* policy change: the customer would like to change, update, or add to their policy or information\n",
      "* home accident: the customer has damage in his or her home\n",
      "* motorcycle: the customer has a motorcycle-related question\n",
      "* theft: the customer had things stolen from their car or home\n",
      "Return only the topic, do not explain."
    ),
    failOnError => True,
    modelParameters => named_struct('max_tokens', 50, 'temperature', float(0))
  ) AS topic
FROM raw_audio_transcription;
Here are a few examples of the analysis results:
Now the call center batch LLM inference workflow is complete, and the results are ready to be consumed by downstream applications for business insights.
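As a sketch of that downstream consumption, a simple aggregation over the transcription_nlp_analysis table created in step 4 can surface which topics drive the most negative calls:

-- Count calls by topic and sentiment to surface problem areas
SELECT topic, sentiment, COUNT(*) AS call_count
FROM transcription_nlp_analysis
GROUP BY topic, sentiment
ORDER BY call_count DESC;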
In this article, we showed how Mosaic AI Batch Inference and ai_query can simplify customer call center NLP analytics for the financial services industry.
We plan to add more exciting features to this capability in the near future.
Stay tuned!