Yext’s Policy On Ethical AI Design

By Yext

Feb 3, 2023

8 min

One of Yext's core products — and a crucial component of our Answers Platform — is Search, an "AI-powered search experience based on natural language understanding and using a multi-algorithm approach".

The machine learning algorithms at the core of AI modeling emulate decision-making based on available data. These data come from humans (especially previous users of Search) and are often labeled by humans (annotators). Likewise, the models are created, trained, and deployed by humans. All humans have biases, stemming from their different cultural and social backgrounds, socio-economic situations, gender, age, education, world-views, personal histories, and many other factors. As a result, there is always the potential for bias.

At Yext, we are aware of this potential bias, and we strive to minimize or eliminate it by implementing the policies, guidelines, and control mechanisms outlined in this document.

General Principles for Ethical AI Design

At Yext, we seek to deploy innovative AI technologies that are not just economically profitable but also beneficial, fair, and autonomy-preserving for people and societies, drawing from the ethical principles of beneficence, non-maleficence, justice, and autonomy. These high-level principles are rooted in major schools of ethical philosophy and have recently been adopted into digital ethics from bioethics, where they have been applied for decades.* Concretely, this means that we aim to design AI that (1) avoids causing both foreseeable and unintentional harm; (2) helps promote the well-being of people; (3) is fair, unbiased, and treats people equally both during the search process and in the search results it provides; and (4) is transparent and trustworthy.

Implementation of Ethical AI Design in Labeling & Model Training
Customer Selection

Yext is selective in the kinds of customers we work with, which dramatically reduces the risk surface for unethical inputs to AI. We do not publish content generated by end users, which may be found on social media websites (e.g., Facebook or Twitter), and which is often ethically complicated. While our AI models must be able to respond to end-user inputs, our data inputs for these models are derived from upstanding businesses that do not engage in ethically risky content production.

Training Data Characteristics

It is important to understand that not all areas of language are equally prone to bias, let alone ethical bias (i.e., bias related to factors like gender, ethnicity, or age). The vast majority of data that we label and use for machine learning (ML) at Yext represent concrete, verifiable, and specific information on businesses and institutions, provided by those institutions themselves (online on their own webpages or in the form of digitized internal documentation).

Unlike generalized web search across all resources available online, represented by tools like Google Search or Bing, Yext's domain is enterprise search: search exclusively within a particular company or institution and its knowledge base. Given this unique character of Yext's business focus, ethically charged topics or concepts only rarely appear in the materials used for AI training. Consequently, it is hard to imagine a scenario in which bias could swing a particular annotation one way or another. There is always an external "source of objective truth" that the annotators must refer to, as instructed in the labeling guidelines. Should any uncertainties arise, the annotators can escalate a particular labeling task to their manager, who provides advice from both the linguistic and content perspectives and involves other subject matter experts as needed.

That being said, our data scientists always train ML algorithms on sufficiently large volumes of data that are representative of the scenarios in which the algorithm is to be deployed. By doing so, we prevent any idiosyncratic occurrences of inaccurate or biased labels from skewing the statistical pattern-matching that produces the AI algorithms.

Data Selection

The majority of labeling tasks begin with collecting datasets from search logs. When constructing a corpus of data for labeling, we avoid over-indexing on large clients by ensuring that no more than 40% of the data comes from any one client and that at least four clients are represented in the dataset, unless there's a good reason to do otherwise (e.g., training a customer-specific model).
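As a rough illustration, the composition rule above can be expressed as a simple check. This is a hedged sketch only: the function name, the `(client_id, query)` record shape, and the thresholds-as-parameters are our own illustrative assumptions, not Yext's actual tooling.

```python
from collections import Counter

def is_balanced_corpus(records, max_share=0.40, min_clients=4):
    """Return True if no client contributes more than max_share of the
    corpus and at least min_clients distinct clients are represented."""
    if not records:
        return False
    counts = Counter(client_id for client_id, _ in records)
    if len(counts) < min_clients:
        return False
    total = len(records)
    return all(count / total <= max_share for count in counts.values())

# Four clients; the largest holds exactly 40% of 10 records -> passes.
sample = ([("a", i) for i in range(4)] + [("b", i) for i in range(3)]
          + [("c", i) for i in range(2)] + [("d", i) for i in range(1)])
print(is_balanced_corpus(sample))  # True
```

A check like this could run once per corpus draw, before the data ever reaches annotators.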

Labeling Process and Review Mechanisms

To guarantee the highest possible quality of labeled data, each labeling task must have clear written labeling guidelines that reflect the objectives of the project and explain in detail what labels should be used and how they should be applied. The guidelines are a result of collaboration between a linguistic expert/data-labeling manager, a data scientist, and a product manager.

Each labeling project is first tested on a small amount of data in order to gather feedback for further clarification of the guidelines. After that, the labeling project is passed on to the annotators, who are in constant communication with the labeling manager. The manager's task is to resolve any issues or ambiguities that the annotators bring up throughout the labeling process and to track the applied solutions in the labeling guidelines so that they can be applied consistently in the future.

Marking Problematic Data as Corrupt

To uphold the four ethical objectives stated above, the annotators are instructed to mark any queries and/or responses containing vulgar, profane, or ethically questionable content as Corrupt. Data with this tag are discarded from all AI training. The same rule applies to queries containing personally identifiable information (PII) or otherwise corrupted data (content that is meaningless or irrelevant for the given business domain).
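The tagging rule above is applied by human annotators, but its logic can be sketched as an automated pre-screen. Everything here is illustrative: the profanity vocabulary, the PII patterns, and the function names are toy placeholders, not Yext's actual rules.

```python
import re

# Toy placeholders; a real screen would use curated lists and patterns.
PROFANITY = {"damn"}
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # SSN-like numbers
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email addresses
]

def label_query(query: str) -> str:
    """Return 'Corrupt' if the query contains profanity or PII,
    otherwise 'OK' (meaning it may proceed to normal labeling)."""
    tokens = set(query.lower().split())
    if tokens & PROFANITY:
        return "Corrupt"
    if any(p.search(query) for p in PII_PATTERNS):
        return "Corrupt"
    return "OK"

corpus = ["store hours near me", "my ssn is 123-45-6789"]
kept = [q for q in corpus if label_query(q) == "OK"]
print(kept)  # ['store hours near me']
```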

Consensus Between Multiple Annotators

To prevent unwanted bias, the majority of data used for model training or Search performance analysis are labeled by at least two annotators. If the annotators disagree on the label, the task is escalated for a "disagreement resolution", whereby an annotator assesses both labels and chooses the more appropriate one. If there are doubts about which label should be chosen, the task is further escalated to the labeling manager, who discusses the optimal resolution with all annotators involved in the process. If agreement cannot be reached (which rarely happens), the data point is discarded.
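The escalation flow above can be sketched as follows. The function signature, and the idea of passing the resolution annotator's choice as a tiebreak argument, are our own illustrative assumptions.

```python
from typing import Optional

def resolve_label(labels: list, tiebreak: Optional[str] = None) -> Optional[str]:
    """Resolve independent annotator labels for one data point.

    - Unanimous agreement: accept the label.
    - Disagreement plus a 'disagreement resolution' choice: accept it.
    - Otherwise: return None, i.e. discard the data point.
    """
    distinct = set(labels)
    if len(distinct) == 1:
        return labels[0]
    if tiebreak in distinct:
        return tiebreak
    return None  # escalate further or discard

print(resolve_label(["location", "location"]))        # location
print(resolve_label(["location", "hours"], "hours"))  # hours
print(resolve_label(["location", "hours"]))           # None
```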

Final Review

To add an extra layer of protection against bias and unwanted errors that could slip through and compromise the labeled data quality, the more experienced annotators perform a manual review of most labels assigned during the primary annotation process. The systematic implementation of the review process has been possible since March 2022 when Yext invested in the enterprise edition of Label Studio, a cutting-edge labeling software for large-scale labeling operations, where all our labeling tasks are currently carried out.

Preventing Bias and Ensuring Ethical Approach at the Product Level

Yext Search itself contains multiple features and measures for the safe and ethical usage of models trained through the above-described process. These safeguards stem from the philosophy that the use of AI models should be as transparent and configurable as possible, and direct supervision over model outputs should be given to the administrator whenever needed.

Search Algorithm Configurability and Transparency

AI models such as Embedding, Extractive Question Answering, and Named Entity Recognition are used in various places throughout the Yext Search algorithm to affect the recall and ranking of search results. For example:

  • Embedding models may be used to rank results based on semantic similarity to a search query

  • Extractive Question Answering models may be used to retrieve passages from long text documents that address a search intent

  • Named Entity Recognition (NER) models may be used to detect locations in queries, in order to trigger logic to filter by proximity to that location
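The first bullet can be illustrated with a minimal sketch of similarity-based ranking. The `embed` function here is a toy bag-of-characters placeholder standing in for a real trained embedding model; all names and structures are assumptions, not Yext's implementation.

```python
import math

def embed(text: str) -> list:
    # Placeholder: unit-normalized bag-of-characters vector.
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    counts = [text.lower().count(ch) for ch in alphabet]
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]

def cosine(u, v):
    # Vectors are pre-normalized, so the dot product is the cosine.
    return sum(a * b for a, b in zip(u, v))

def rank_results(query: str, results: list) -> list:
    """Order results by semantic similarity to the query embedding."""
    q = embed(query)
    return sorted(results, key=lambda r: cosine(q, embed(r)), reverse=True)

docs = ["parking information", "store opening hours", "careers page"]
print(rank_results("opening hours", docs)[0])  # 'store opening hours'
```

In a real deployment the similarity score would be one signal among several, alongside keyword- and phrase-matching.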

While the results these AI models produce are themselves outside the direct control of the user or administrator of a Yext Search experience, we strive to make their application in Search maximally transparent and configurable. Our customers can always choose which search algorithms to apply to their search experience. It is entirely possible, for instance, to create a search experience that makes no use of AI model outputs and retrieves results entirely on the basis of keyword- and phrase-matching.

Additionally, the output of models and other factors influencing any given search can be viewed in a robust and detailed log record, which includes model outputs such as featured snippets, or the semantic similarity score of a result that influenced its ranking.

Algorithm Overrides in Experience Training

In some cases, administrators have the ability to directly override the outputs of AI models. This is done through a feature called Experience Training, which gives administrators direct control over certain parts of the Search algorithm, including:

  • Featured Snippets: Admins can review predictions being made by the Extractive QA and Embedding models when generating Featured Snippets on long-form text documents

  • NLP Filters: Admins can review filters that are automatically selected using natural language processing of search queries (including NER)

  • Spell Checking: Admins can review spell checking suggestions that are made by the Spell Check error model
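The override mechanism described in this list can be sketched as a simple precedence rule: an administrator's recorded decision replaces the model's raw prediction. The dictionary shapes and names below are hypothetical, not Yext's API.

```python
def apply_overrides(model_predictions: dict, admin_overrides: dict) -> dict:
    """Merge model outputs with admin overrides; overrides win."""
    return {**model_predictions, **admin_overrides}

predictions = {
    "featured_snippet": "Open 9-5 on weekdays.",
    "nlp_filter": {"location": "Boston"},
}
overrides = {"featured_snippet": "Open 9am-5pm, Monday through Friday."}

final = apply_overrides(predictions, overrides)
print(final["featured_snippet"])  # the admin-approved snippet wins
```

Because dictionary unpacking applies the overrides last, any key the administrator has set takes precedence while untouched model outputs pass through unchanged.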

Changes that are made in Experience Training are reflected immediately in live search experiences. For maximal oversight, administrators can also choose whether Featured Snippets must be approved in Experience Training before they are shown in any live search experience.

Further, overrides created by administrators in Experience Training are automatically entered into the labeling queue, where they are reviewed and approved by annotators through the aforementioned review processes. Experience Training thus allows administrators of Yext Search experiences to participate directly in the training of AI models through the review of model outputs and the creation of overrides.

By enhancing our AI design with the control mechanisms described in this section, commonly referred to as "Human-in-the-Loop", we not only improve the overall performance and robustness of the Search product but also make it more ethically accountable and trustworthy.


*Cf. Floridi, L., Cowls, J., Beltrametti, M. et al. AI4People—An Ethical Framework for a Good AI Society: Opportunities, Risks, Principles, and Recommendations. Minds & Machines 28, 689–707 (2018).
