GenAI: Automating Product Feedback Reports Using Generative AI and DataRobot
This accelerator shows how to use predictive AI models in tandem with generative AI models to overcome the lack of guardrails when automating the summarization and segmentation of sentiment text. In a nutshell, it consumes product reviews and ratings and outputs a Design Improvement Report.
Going through customer review comments to generate insights for product development teams is a time-intensive and costly affair. This notebook illustrates how to use DataRobot and generative AI to derive critical insights from customer reviews and automatically create improvement reports to help product teams in their development cycles.
DataRobot provides robust Natural Language Processing capabilities. Using DataRobot models instead of plain summarization on customer reviews lets you extract keywords that are strongly correlated with the sentiment of the feedback. Starting from this list of impactful keywords, generative AI can build context around them in the reviewers' own language for the benefit of end users. The DataRobot AI Platform acts as the guardrail mechanism that traditional text summarization lacks.
In [ ]:
import json
import os
import warnings
import datarobot as dr
from fpdf import FPDF
from langchain.chains import LLMChain
from langchain.chat_models import AzureChatOpenAI
from langchain.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain.schema import BaseOutputParser
import numpy as np
import pandas as pd
warnings.filterwarnings("ignore")
Configuration
Set up the configuration required for a secure connection to the generative AI model. This notebook assumes you have an OpenAI API key, but you can modify it to work with any other hosted LLM, as the process remains the same.
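The chain definition later in this notebook refers to names such as OPENAI_API_KEY and OPENAI_DEPLOYMENT_NAME without showing where they are set. The following is a minimal sketch of one way to supply them, assuming they are stored in environment variables of the same names. That storage choice and the helper `load_azure_openai_config` are assumptions made for illustration, not part of the original notebook.

```python
import os

# Names referenced by the chain definition later in this notebook.
REQUIRED_SETTINGS = [
    "OPENAI_API_KEY",
    "OPENAI_API_BASE",
    "OPENAI_API_VERSION",
    "OPENAI_API_TYPE",
    "OPENAI_DEPLOYMENT_NAME",
]
OPTIONAL_SETTINGS = ["OPENAI_ORGANIZATION"]


def load_azure_openai_config(env=None):
    """Hypothetical helper: read the settings from a mapping (default: os.environ)."""
    env = os.environ if env is None else env
    # Treat unset and empty values the same way: both count as missing.
    missing = [name for name in REQUIRED_SETTINGS if not env.get(name)]
    if missing:
        raise KeyError(f"Missing required settings: {', '.join(missing)}")
    config = {name: env[name] for name in REQUIRED_SETTINGS}
    for name in OPTIONAL_SETTINGS:
        config[name] = env.get(name)  # None when not set
    return config
```

With the variables exported, `config = load_azure_openai_config()` returns a dictionary whose entries can be assigned to the notebook-level names (for example, `OPENAI_API_KEY = config["OPENAI_API_KEY"]`). Failing early on a missing value keeps credentials out of the notebook and avoids a less obvious error at the first LLM call.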
The cell below outlines the functions to accomplish the following:
Extract high impact review keywords from product reviews using DataRobot.
During keyword extraction, implement guardrails for selecting models with higher AUC to make sure keywords are robust and correlated to the review sentiment.
Generate product development recommendations for the final report.
LLM Parameters: Read the reference documentation for all Azure OpenAI parameters and how they affect output.
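The AUC guardrail and keyword selection described above can be tried without a DataRobot project. The sketch below assumes each word cloud row is a dictionary with `ngram`, `coefficient`, and `frequency` fields (the fields the notebook sorts and filters on). The helper name `select_negative_keywords`, the sample rows, and the AUC values are made up for illustration.

```python
import pandas as pd

AUC_THRESHOLD = 0.75  # same cut-off the notebook applies to cross-validation AUC


def select_negative_keywords(ngram_rows, auc, threshold=AUC_THRESHOLD):
    """Hypothetical helper: return '; '-joined negative n-grams, or '' if AUC is too low."""
    # Guardrail: skip keyword extraction when the model is not accurate enough.
    if auc is None or auc <= threshold:
        return ""
    word_cloud = pd.DataFrame(ngram_rows)
    # Most negative coefficients first; ties broken by higher frequency.
    word_cloud = word_cloud.sort_values(
        ["coefficient", "frequency"], ascending=[True, False]
    )
    negative = word_cloud[word_cloud["coefficient"] < 0]
    return "; ".join(negative["ngram"].tolist())


# Made-up rows for illustration only.
sample_rows = [
    {"ngram": "great sound", "coefficient": 0.8, "frequency": 0.30},
    {"ngram": "stopped working", "coefficient": -0.9, "frequency": 0.20},
    {"ngram": "battery", "coefficient": -0.4, "frequency": 0.50},
]

print(select_negative_keywords(sample_rows, auc=0.82))
print(repr(select_negative_keywords(sample_rows, auc=0.60)))
```

Returning an empty string for a weak model means the downstream report step can skip the product entirely instead of sending unreliable keywords to the LLM.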
In [ ]:
class JsonOutputParser(BaseOutputParser):
    """Parse the output of an LLM call to a Json list."""

    def parse(self, text: str):
        """Parse the output of an LLM call."""
        return json.loads(text)
def get_review_keywords(product_id):
    """Parse the Word Cloud from DataRobot AutoML model and generate the text input for the LLM."""
    keywords = ""
    # Copy the slice so the new columns are not written to a view of product_subset.
    product = product_subset[product_subset.product_id == product_id].copy()
    product["review_text_full"] = (
        product["review_headline"] + " " + product["review_body"]
    )
    product["review_class"] = np.where(product.star_rating < 3, "bad", "good")
    # Create a DataRobot AutoML NLP project with the review text.
    project = dr.Project.create(
        product[["review_class", "review_text_full"]],
        project_name=product["product_title"].iloc[0],
    )
    project.analyze_and_model(
        target="review_class",
        mode=dr.enums.AUTOPILOT_MODE.QUICK,
        worker_count=20,
        positive_class="good",
    )
    project.wait_for_autopilot()
    model = project.recommended_model()
    # Accept word n-gram models only, not char n-gram models.
    if not any("word" in proc for proc in model.processes):
        models = project.get_models(order_by="-metric")
        for m in models:
            if any("word" in proc for proc in m.processes):
                model = m
                break
    word_cloud = model.get_word_cloud()
    word_cloud = pd.DataFrame(word_cloud.ngrams_per_class()[None])
    word_cloud.sort_values(
        ["coefficient", "frequency"], ascending=[True, False], inplace=True
    )
    # keywords = '; '.join(word_cloud.head(50)['ngram'].tolist())
    # Guardrail: accept only higher-accuracy models, so that the word cloud
    # contains impactful and significant terms only.
    if model.metrics["AUC"]["crossValidation"] > 0.75:
        keywords = "; ".join(word_cloud[word_cloud.coefficient < 0]["ngram"].tolist())
    return keywords
template = """
You are a product designer. A user will pass top keywords from negative customer reviews. \
Using the keywords list, \
provide multiple design recommendations based on the keywords to improve the sales of the product.
Use only top 10 keywords per design recommendation. \
Output Format should be json with fields recommendation_title, recommendation_description, keyword_tags"""
system_message_prompt = SystemMessagePromptTemplate.from_template(template)
human_template = "{text}"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
chat_prompt = ChatPromptTemplate.from_messages(
    [system_message_prompt, human_message_prompt]
)
chain = LLMChain(
    llm=AzureChatOpenAI(
        deployment_name=OPENAI_DEPLOYMENT_NAME,
        openai_api_type=OPENAI_API_TYPE,
        openai_api_base=OPENAI_API_BASE,
        openai_api_version=OPENAI_API_VERSION,
        openai_api_key=OPENAI_API_KEY,
        openai_organization=OPENAI_ORGANIZATION,
        model_name=OPENAI_DEPLOYMENT_NAME,
        temperature=0,
        verbose=True,
    ),
    prompt=chat_prompt,
    output_parser=JsonOutputParser(),
)
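The JsonOutputParser above passes the raw reply straight to `json.loads`, which fails if the model wraps its JSON in a Markdown code fence, and returns a dictionary rather than a list if the model nests the recommendations under a wrapper key. The sketch below shows one more tolerant approach. The function name `parse_recommendations` is hypothetical, and the "recommendations" wrapper key is assumed from the check made in the report loop later in this notebook.

```python
import json
import re

FENCE = "`" * 3  # Markdown code-fence marker


def parse_recommendations(text):
    """Hypothetical helper: parse LLM output into a list of recommendation dicts."""
    cleaned = text.strip()
    # Drop a leading fence (optionally tagged, e.g. "json") and a trailing fence.
    if cleaned.startswith(FENCE):
        cleaned = re.sub(r"^`{3}[a-zA-Z]*\s*", "", cleaned)
        cleaned = re.sub(r"\s*`{3}$", "", cleaned)
    parsed = json.loads(cleaned)
    # Unwrap {"recommendations": [...]}; treat any other object as a single item.
    if isinstance(parsed, dict):
        parsed = parsed.get("recommendations", [parsed])
    if not isinstance(parsed, list):
        raise ValueError("Expected a JSON list of recommendations")
    return parsed
```

One option would be to call this function from `JsonOutputParser.parse` in place of `json.loads`, so that the chain always returns a list and callers do not need to inspect the result type.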
Import data
This accelerator uses a subset of products from the Home Electronics line of the publicly available Amazon Reviews dataset. The full public dataset can be found here.
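The cell that loads the reviews into `product_subset` is not shown on this page. As a rough guide, the sketch below builds a tiny stand-in DataFrame with the columns that the functions in this notebook read (`product_id`, `product_title`, `star_rating`, `review_headline`, `review_body`) and applies the same labeling step. All rows are made up for illustration and do not come from the Amazon Reviews dataset.

```python
import numpy as np
import pandas as pd

# Made-up rows; the real notebook loads a subset of the Amazon Reviews dataset.
product_subset = pd.DataFrame(
    {
        "product_id": ["X1", "X1", "X1"],
        "product_title": ["Example Headphones"] * 3,
        "star_rating": [1, 3, 5],
        "review_headline": ["Broke fast", "It is okay", "Love it"],
        "review_body": ["Stopped working.", "Average sound.", "Great bass."],
    }
)

product = product_subset[product_subset.product_id == "X1"].copy()
# Headline and body are concatenated into a single text feature.
product["review_text_full"] = product["review_headline"] + " " + product["review_body"]
# Ratings below 3 stars are labelled "bad"; 3 and above are "good".
product["review_class"] = np.where(product.star_rating < 3, "bad", "good")
print(product[["star_rating", "review_class", "review_text_full"]])
```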
This programmatic loop runs through the product list and generates the final report.
In [ ]:
from datetime import datetime
In [ ]:
product_list = ["B000204SWE", "B00EUY59Z8", "B006U1YUZE", "B00752R4PK", "B004OF9XGO"]
pdf = FPDF()
for product_id in product_list:
    print(
        "product id:",
        product_id,
        "started:",
        datetime.now().strftime("%m-%d-%y %H:%M:%S"),
    )
    keywords = get_review_keywords(product_id)
    # Guardrail: generate the report only if there are enough keywords to
    # provide results. keywords is a string, so this checks its character length.
    if len(keywords) > 10:
        # report = chain.run(keywords)['recommendations']
        report = chain.run(keywords)
        # If the parsed output is an object rather than a list, take its
        # "recommendations" field instead of calling the LLM a second time.
        if not isinstance(report, list):
            report = report["recommendations"]
        product_name = product_subset[product_subset.product_id == product_id][
            "product_title"
        ].iloc[0]
        print("Adding to report")
        pdf.add_page()
        pdf.set_font("Arial", "B", 20)
        pdf.multi_cell(w=0, h=10, txt=product_name)
        for reco in report:
            pdf.cell(w=0, h=7, txt="\n", ln=1)
            pdf.set_font("Arial", "B", 14)
            pdf.multi_cell(w=0, h=7, txt=reco["recommendation_title"])
            pdf.set_font("Arial", "", 14)
            pdf.multi_cell(w=0, h=7, txt=reco["recommendation_description"])
            pdf.set_font("Arial", "I", 11)
            pdf.multi_cell(
                w=0, h=5, txt="Review Keywords: " + ", ".join(reco["keyword_tags"])
            )
    print(
        "product id:",
        product_id,
        "completed:",
        datetime.now().strftime("%m-%d-%y %H:%M:%S"),
    )
pdf.output("/home/notebooks/storage/product_development_insights.pdf", "F")
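The report is written with the built-in Arial font. This sketch assumes that the core fonts in the `fpdf` package only cover Latin-1 characters, in which case LLM output containing curly quotes, long dashes, or emoji could raise an encoding error when the PDF is written. The helper `to_latin1` is hypothetical and shows one way to sanitize text before passing it to `multi_cell`.

```python
def to_latin1(text):
    """Hypothetical helper: make text safe for a Latin-1-only PDF font."""
    replacements = {
        "\u2018": "'",
        "\u2019": "'",  # curly single quotes
        "\u201c": '"',
        "\u201d": '"',  # curly double quotes
        "\u2013": "-",
        "\u2014": "-",  # en and em dashes
        "\u2026": "...",  # ellipsis
    }
    for original, plain in replacements.items():
        text = text.replace(original, plain)
    # Any character still outside Latin-1 is replaced with "?".
    return text.encode("latin-1", errors="replace").decode("latin-1")
```

For example, `pdf.multi_cell(w=0, h=7, txt=to_latin1(reco["recommendation_title"]))` would apply it to a recommendation title. Registering a Unicode font is an alternative if the original characters need to be preserved.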
Download the report
Download the PDF named “product_development_insights.pdf” from “/home/notebooks/storage/” or from the notebook files tab in the UI panel.
In [ ]:
from IPython import display
display.Image(
    "https://s3.amazonaws.com/datarobot_public_datasets/ai_accelerators/images/image_report.jpg",
    width=800,
    height=400,
)
Out [ ]:
Conclusion
This accelerator demonstrates how you can use DataRobot and generative AI to identify key patterns in customer reviews and create reports or work items that product development teams can use to improve their products and offerings. With different prompts, you can steer the LLM toward more complex outputs such as Agile stories, development plans, and more.
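As an illustration of steering the output through the prompt, the sketch below defines an alternative system prompt that asks for Agile user stories instead of design recommendations. The wording and the field names `story_title`, `user_story`, and `acceptance_criteria` are hypothetical. The string would take the place of `template` when building the chat prompt, and the report-writing loop would need to read the new field names.

```python
# Hypothetical alternative to the notebook's `template`; field names are illustrative.
STORY_FIELDS = ["story_title", "user_story", "acceptance_criteria", "keyword_tags"]

agile_template = (
    "You are a product owner. A user will pass top keywords from negative "
    "customer reviews. Using the keywords list, write Agile user stories that "
    "address the underlying product issues. "
    "Use only top 10 keywords per story. "
    "Output Format should be a json list of objects with fields "
    + ", ".join(STORY_FIELDS)
)

print(agile_template)
```

The prompt deliberately contains no curly braces, since prompt templates treat them as input variables.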