Integrating Yext Search To Mediawiki To Improve Our Engineering Documentation Experience

By Yi Zhang

Aug 23, 2023

6 min

At Yext, we're committed to consistently maintaining and improving our internal documentation tools.

For years, our technical documentation was stored in our own markdown-based system. While this made our documentation relatively hard to edit, it allowed us to leverage the full power of our Yext Search technology to search it.

Recently, however, we decided to upgrade to a Mediawiki-powered wiki for our documentation. This shift made documentation much easier to create and edit, but we found that the built-in search based on pure text matching was inferior to what we're used to with Yext Search.

Now, by integrating Yext Search into Mediawiki, we've found the best of both worlds. Here's how.

Background

Initially, our documentation existed primarily as markdown files stored alongside the relevant code. We developed an internal service called DocMD that allowed us to find and read that documentation.

Using DocMD to read documentation worked well enough, but the problem was that the documentation existed as files distributed across our codebase. That meant that every change required an explicit code review and approval. This friction slowed down the process of making changes and discouraged people from making edits.

To simplify this process, we migrated the majority of our technical documentation to a Mediawiki-based wiki platform. The platform's built-in visual editor and direct edit-and-save workflow allowed engineers to create and improve the documentation with minimal effort.

However, because we chose to keep a lot of the code-specific documentation as README files alongside the code, we faced a new challenge: searching for specific information now required us to look in two places and compare the results. Given how often engineers have to find information, this was incredibly inconvenient. That's when we introduced Yext Search.

Yext Search, a core product at our company, happened to provide exactly what we needed: AI-powered natural language search that allows searches across multiple data sources and types.

We set up an internal Yext Search site for the technology department to integrate several types of data, which were synchronized through API connectors and stored in our Knowledge Platform as their corresponding entity types.

This included the markdown files and wiki pages, but it was easily extended to provide better search over our entire codebase. The result was stunning: Yext Search wove the multiple knowledge types together in a single interface, where we could ask questions in natural language and get contextual answers from the most relevant source.

However, while we were pleased with this solution, it still required engineers to search for documentation outside of where they were reading it — which was somewhat awkward and unintuitive. So, we came up with the next step: integrating Yext Search directly with Mediawiki's native search functionality. This would allow us to search for documents without having to navigate to a different site, taking the search experience to the next level.

Integration Planning

The idea to integrate Yext Search into Mediawiki's native search was inspired by existing Mediawiki extensions such as Google Site Search, which add search results from an external source to the standard wiki search results. These kinds of extensions work by using one of Mediawiki's built-in hooks: triggers that allow custom code to be executed when some specific event (such as displaying search results) occurs. For our purpose, the hook involved is called SpecialSearchResultsAppend, which lets an extension append HTML after the wiki's own search results.
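To make the mechanism concrete, here is a toy model of a hook system, written in JavaScript purely for illustration. It is not Mediawiki's PHP API: `registerHook`, `runHooks`, and `renderSearchPage` are hypothetical names, and only the hook name SpecialSearchResultsAppend comes from Mediawiki itself. The sketch shows a host rendering its own results and then firing the hook so that a registered handler can append extra HTML at the end.

```javascript
// Toy hook registry: maps a hook name to the handlers registered for it.
// ASSUMPTION: this is an illustration only, not how Mediawiki is implemented.
const hooks = new Map();

// Called by an "extension" to attach custom code to a named event.
function registerHook(name, handler) {
  if (!hooks.has(name)) {
    hooks.set(name, []);
  }
  hooks.get(name).push(handler);
}

// Called by the host application when the event occurs.
function runHooks(name, ...args) {
  for (const handler of hooks.get(name) ?? []) {
    handler(...args);
  }
}

// Stand-in for the host's search page: render the native results first,
// then give extensions a chance to append their own HTML.
function renderSearchPage(term, nativeResults) {
  const output = nativeResults.map((title) => `<li>${title}</li>`);
  runHooks("SpecialSearchResultsAppend", output, term);
  return output.join("\n");
}

// A hypothetical extension appends a container for external search results.
registerHook("SpecialSearchResultsAppend", (output, term) => {
  const query = encodeURIComponent(term);
  output.push(`<div id="external-results" data-query="${query}"></div>`);
});
```

The host never needs to know what the handler does. It only promises to call every registered handler at the agreed point, which is what makes this kind of extension possible without modifying the wiki itself.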

With these existing extensions to refer to, all we needed to do was build our own custom Mediawiki extension that used the search hook to append our Yext Search results. There were two options we could have gone with. First, we could have simply appended an iframe of our existing Yext Search site. Alternatively, we could have rewritten a miniature Yext search page using the Yext Search SDK.

Integration Implementation

Initially, since we were aiming for a seamless search experience, we opted for the iframe option, but we found it less than ideal because it forced users to log in for every search. Plus, iframe embedding isn't ideal from a security perspective.

This led us to the second option: using the Yext Search SDK to generate a miniature Yext search page.

We expected this to be a difficult task, since we have relatively little organizational experience with PHP, but it turned out to be easier than we thought. A judicious application of ChatGPT provided some direction, and we were able to get a completed extension working with relatively minor effort.

The first component of the extension is extension.json, which contains the extension's setup instructions and declares the various PHP classes to use and where to find them. Its "Hooks" section tells Mediawiki which functions to run when the search hook is triggered.

The next part is the PHP implementation of the function called by the search hook. It basically just does two things (and in only two lines): it reads the content from an HTML page and appends it to the output.

With all the parts wrapped up, we deployed the Mediawiki server locally with this new extension. We ran a few searches, and our test message showed up at the end of the results. Happily, it worked.

Yext Search SDK

At this point, all that was left was to add the Yext Search logic. For developing Yext products, we have the Hitchhiker Platform, which comes with detailed guidance. We followed its guide to add the search results to our page using the JavaScript tools.

Importantly, rather than hardcoding the API key inside the web page, we chose a more secure route: an API endpoint that hands out the key. While researching the best way to do this, we turned to ChatGPT again, and it pointed us to Mediawiki's API modules. This provided exactly what we were looking for: an endpoint to handle requests. An API module can be added from an extension, and since we were already building an extension, we just put it inside the same one.

The extension.json is then updated to add the new API module class and declare the API name.

Inside the PHP file, we added the API implementation, with authentication and custom logic to request and return an API key.

With the API module added, the JavaScript code could reach the endpoint at "/api.php?action=answers-auth&format=json" and pass the API key from the response to the Yext Search SDK.
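As a rough sketch of the client side, the snippet below requests the key from that endpoint. The response shape is not shown in this post, so the top-level `apiKey` field is an assumption, and `extractApiKey` and `fetchSearchApiKey` are hypothetical names rather than part of Mediawiki or the Yext Search SDK.

```javascript
// Path of the custom API module described above.
const AUTH_ENDPOINT = "/api.php?action=answers-auth&format=json";

// Pull the key out of the parsed response body.
// ASSUMPTION: the key sits in a top-level `apiKey` field; the real module
// may nest it differently.
function extractApiKey(body) {
  if (body && typeof body.apiKey === "string" && body.apiKey.length > 0) {
    return body.apiKey;
  }
  throw new Error("answers-auth response did not contain an API key");
}

// Request the key from the wiki. `fetchImpl` is injectable so the function
// can be exercised without a browser or a running wiki.
async function fetchSearchApiKey(fetchImpl = globalThis.fetch) {
  // Send the wiki session cookie so the server can authenticate the user.
  const response = await fetchImpl(AUTH_ENDPOINT, { credentials: "same-origin" });
  if (!response.ok) {
    throw new Error(`answers-auth request failed with status ${response.status}`);
  }
  return extractApiKey(await response.json());
}
```

The returned key would then be handed to the SDK's initialization call, which is left out here because its options depend on the SDK version in use. Fetching the key at runtime keeps it out of the page source, and the server can refuse to return it to users who are not logged in.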

Wrap Up

Now, our Mediawiki server shows search results from both the wiki's built-in search and Yext Search. We can configure whether Yext Search shows wiki results along with the markdown results. We decided to keep showing wiki results there, even though they sometimes duplicate the built-in results, because we find Yext Search more accurate than the built-in wiki search engine. For example, when searching for "services", the wiki page titled "service" didn't show up in the built-in results, but it did show up in the Yext Search results. The difference is that the wiki search is keyword-based, while Yext Search is semantic.

From initial planning to production deployment, I completed this project within two weeks, working largely independently. It was well received by the other engineering teams.

As next steps, we plan to further improve the search experience based on Yext Search result analytics. For example, we could monitor the click-through rate of different search clusters to find documentation gaps, and add more synonyms to return more accurate results.
