Yext Showcases Knowledge Injection Research With Large Language Models

Yext presented at the Extended Semantic Web Conference (ESWC) 2023 on using the Yext platform for Knowledge Injection to generate review responses with Large Language Models (LLMs).

By Ariana Martino

Jul 27, 2023

6 min

In May, Yext presented on Knowledge Injection with Large Language Models (LLMs) at the Extended Semantic Web Conference (ESWC) 2023. The paper, Knowledge Injection to Counter Large Language Model (LLM) Hallucination, co-authored by Yext data scientists Ariana Martino and Michael Iannelli and Yext senior data analyst Coleen Truong, represents the second published piece of peer-reviewed research from Yext's research and development function. The paper investigates the results of using related entity data from Yext Content with LLMs for automated review response generation. This research helped develop the Content Generation for Review Response feature included in the 2023 Summer Release.

Read on to see key takeaways from the paper.

Section 1: Yext Platform and Industry

Our research evolved from an initial study, conducted by Yext, that revealed how responding to online reviews can affect business reputation. Businesses that respond to at least 50% of reviews see approximately a 0.35-star rating increase on average.* This elevates a business's reputation online, especially in search results. Prior research has also shown that responding to 60-80% of reviews is optimal.** Depending on the review volume a business sees, this manual review and content generation process can translate to multiple hours or even days of a full-time employee's workload. We set out to discover if we could leverage the emerging technology of LLMs to automate and streamline the review response process for businesses through our Reviews monitoring platform.

A large language model is an AI algorithm that is trained on large data sets and can perform a variety of natural language processing tasks such as text generation or answering questions in a conversational manner (you probably already know this if you've ever played around with the likes of Jasper or ChatGPT). LLMs are already used in other sections of the Yext platform, such as Chat and the Content Generation feature. We set out to explore if AI and LLMs could be used for Reviews.

The foundation of our research lies in Yext Content, based on knowledge graph technology, where brand-approved facts are stored. There are four key features to Yext Content.

1. Maintains a flexible schema

Content allows for platform customizations to align with each individual business's needs and operational structure. For example, a healthcare system would need entity types that reflect healthcare professionals, hospital locations, medical specialties, and health article documents, while a restaurant would have menu items, store locations, and special events. The KG schema can also change over time to adapt to evolving business needs and structures.
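As a minimal sketch of what a flexible, per-business schema could look like (the entity type and field names here are hypothetical illustrations, not Yext's actual schema):

```python
# Illustrative sketch of a flexible, per-business entity schema.
# Entity type and field names are hypothetical, not Yext's real data model.
healthcare_schema = {
    "healthcare_professional": ["name", "specialty", "works_at"],
    "hospital_location": ["name", "address", "phone"],
    "medical_specialty": ["name"],
    "health_article": ["title", "body"],
}

restaurant_schema = {
    "menu_item": ["name", "price"],
    "store_location": ["name", "address", "phone"],
    "special_event": ["name", "date"],
}

def add_entity_type(schema, type_name, fields):
    """Schemas can evolve over time: new entity types can be added later."""
    schema[type_name] = list(fields)
    return schema

# The restaurant later adds catering, so the schema grows with the business.
add_entity_type(restaurant_schema, "catering_package", ["name", "serves"])
```

The point of the sketch is that the two businesses share no fixed set of entity types, and types can be added without restructuring what already exists.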

2. Defines relationships between entities

Content also provides definitions of how entities are related to one another. For example, Doctor A works at the Union Square office. This additional information provides key context for the LLM to understand the relationship structure between two objects.

3. Contains bi-directional relationship connections

Based on graph technology, Content can make connections bi-directional. For example, Doctor A works at the Union Square office. From the same relationship connection, we can infer that the Union Square office has Doctor A working there. This additional context layer provides a wealth of information to draw complex connections between entities.

These connections go beyond entities added into Content, and extend to data returned into the Yext platform across our other product lines such as Reviews, Pages, Search, and Chat. For reviews specifically, review data is aggregated from all available publishers for each unique entity and returned in the Yext Review Monitoring platform. This content can then be reviewed by a brand employee to manually generate an appropriate review response, which then gets pushed back out to the third-party publisher site.

4. Supports multi-hop relationships

Through the linkages between entities, we can confidently make relational ties between entities that are not directly connected. For example, if there is a Doctor who specializes in pediatric gastroenterology, and pediatric gastroenterology appointments are only held at the Union Square office, we know that the doctor works at the Union Square office.
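Bi-directional lookup and multi-hop inference can be sketched over a small set of triples. This is a minimal illustration, not Yext's implementation; the entities and relations come from the example above:

```python
# Minimal sketch of bi-directional and multi-hop inference over a knowledge graph.
# Entities and relations are illustrative, not Yext's actual data model.
triples = [
    ("Doctor A", "specializes_in", "pediatric gastroenterology"),
    ("pediatric gastroenterology", "offered_only_at", "Union Square office"),
]

def related(entity):
    """Bi-directional lookup: a triple can be followed from either end."""
    out = set()
    for subj, _rel, obj in triples:
        if subj == entity:
            out.add(obj)
        if obj == entity:
            out.add(subj)  # the inverse direction comes for free
    return out

def multi_hop(start, hops):
    """Expand outward hop by hop to reach indirectly connected entities."""
    frontier, seen = {start}, {start}
    for _ in range(hops):
        frontier = {n for e in frontier for n in related(e)} - seen
        seen |= frontier
    return seen - {start}

# No triple links Doctor A to the office directly, but two hops connect them.
print("Union Square office" in multi_hop("Doctor A", 2))  # True
```

One hop from Doctor A reaches only the specialty; the second hop reaches the office, which is exactly the inference described above.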

Hypothesis

Our hypothesis was that including related entity information in the prompt text would result in a generated response that reflects the relevant business information. We have defined Knowledge Injection (KI) as injecting related entity information into the prompt text for an LLM.
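A minimal sketch of what Knowledge Injection could look like in practice, serializing related entity fields into text and prepending them to the prompt. The field names and prompt template here are hypothetical, not the format used in the paper:

```python
# Sketch of Knowledge Injection (KI): map structured entity data into
# text space and include it in the LLM prompt. Fields and template are
# hypothetical examples, not the paper's actual prompt format.
entity = {
    "name": "Union Square office",
    "phone": "(212) 555-0100",
    "address": "123 Example St, New York, NY",
}
review = "Great visit, but I couldn't find the phone number to reschedule."

def build_ki_prompt(entity, review):
    # Serialize related entity facts as plain-text bullet points.
    facts = "\n".join(f"- {key}: {value}" for key, value in entity.items())
    return (
        "Business facts:\n"
        f"{facts}\n\n"
        f"Review: {review}\n"
        "Response:"
    )

prompt = build_ki_prompt(entity, review)
```

Under the hypothesis, a model given this prompt should ground its response in the injected phone number rather than hallucinating one.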

Section 2: The Research

To improve the generated responses from LLMs, we developed a prompt-engineering technique called Knowledge Injection (KI) that maps contextual, task-relevant entity data from a knowledge graph into text space for inclusion in an LLM prompt. Brand experts evaluated the assertions within each generated response (i.e., specification of a location name, a contact phone number or web address, the owning brand name, or a location address) as 'correct' or 'incorrect'. Brand experts also evaluated the generated responses for overall quality to assess alignment with review response brand standards.

Experiment 1:

To test whether Knowledge Injection (KI) reduces hallucinations in generated responses, we fine-tuned bloom-560m on review-response pairs as our control. We then fine-tuned bloom-560m on the same review-response pairs with prompts that also contained related entity information, producing a KI-prompted LLM.***

Brand experts then reviewed the model output against related entity data in Yext Content and labeled each assertion in the generated responses as follows:

Incorrect Assertion (Hallucinated): Untrue information contradicted by the knowledge graph, like directing customers to call a fictitious phone number

Correct Assertion: Assertions not otherwise marked as incorrect
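In the study this labeling was done by human brand experts; as a hedged sketch, the contradiction check itself can be expressed as a comparison against knowledge-graph facts (the fact fields and values here are hypothetical):

```python
# Sketch of labeling a generated assertion against knowledge-graph facts.
# In the paper this labeling was done by brand experts; fields are hypothetical.
kg_facts = {
    "phone": "(212) 555-0100",
    "website": "https://example.com",
}

def label_assertion(kind, value):
    """Label 'incorrect' (hallucinated) if the KG contradicts the value;
    otherwise label 'correct', matching the rubric above."""
    if kind in kg_facts and kg_facts[kind] != value:
        return "incorrect"
    return "correct"

print(label_assertion("phone", "(212) 555-0199"))  # contradicted -> incorrect
print(label_assertion("phone", "(212) 555-0100"))  # matches KG -> correct
```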

In this experiment, we found that using Knowledge Injection (KI) reduced hallucinations and increased the count of correct assertions in generated responses by 205%.

Experiment 2:

To test whether Knowledge Injection (KI) improves the overall quality of responses generated by a smaller base model, we ran an experiment comparing generated responses from OpenAI text-davinci-003 and bloom-560m fine-tuned with KI. It's important to note that OpenAI text-davinci-003 has nearly 300 times as many parameters (the learned weights of the neural network) as bloom-560m.****

Brand experts then graded the generated responses on a scale of 1 (bad) to 3 (great) based on qualitative factors the brand experts deemed relevant.

In this experiment, we found that using Knowledge Injection (KI) results in higher-quality generated responses from smaller base models, outperforming larger base models that don't use KI.

Both of these experiments showcase that KI is useful for enterprise tasks like review response, which are manual and costly when done by humans but require factual context about the business to produce trustworthy generated text. KI helps models align generated responses with business brand standards. Fine-tuning with KI could help businesses save on cost by training and hosting a smaller model while producing higher-quality generated responses and improving inference speed. Since KI requires a well-populated, fact-based knowledge graph to build high-quality LLM prompts, consistently updating Content is key to effectively leveraging KI for automated review response and other LLM-based enterprise tasks.

Section 3: Presenting at ESWC

Co-authors Ariana Martino (Data Scientist, Data Science) and Coleen Truong (Data Strategist, Data Insights) represented Yext at ESWC and presented the research findings. The presentation was well received by fellow researchers and data scientists from around the world and across a multitude of companies, and was one of the highest-attended industry-track presentations during the conference.

For more details on our LLM research, the full paper is available through ESWC. Plus, check out this video explaining the findings.

*Yext Study (2020)

**Chamberlain, L.: How Responding To Online Reviews Affects Your Business Reputation (2019).

*** Scao, T.L., et al.: BLOOM: A 176B-Parameter Open-Access Multilingual Language Model (2022). https://doi.org/10.48550/ARXIV.2211.05100, https://arxiv.org/abs/2211.05100

**** Sharir, O., Peleg, B., Shoham, Y.: The Cost of Training NLP Models: A Concise Overview. arXiv preprint arXiv:2004.08900 (2020)
