Trust, Grounding, and the Rising Stakes for Brands in an AI-Driven World
In this episode of The Visibility Brief, Yext SVP of Marketing Rebecca Colwell sits down with Chief Data Officer Christian Ward for the final episode of the year — and a timely discussion on one of the most critical issues facing AI today: accuracy.
As AI platforms increasingly power discovery, recommendations, and answers, consumers expect accurate responses every time. But what happens when an LLM gets something wrong – whether it be about a fact, a location, or a brand? And with models improving daily, how should marketers approach their data, and the expanding responsibilities they now share with AI systems?
Rebecca and Christian break down the results of Google's recent FACTS Grounding benchmark suite, why even top models still make mistakes, and what brands must do to keep their data clean, consistent, and trustworthy across all digital touchpoints, including AI.
The episode covers:
Why accuracy matters more in AI search than in traditional search
What brands can (and can't) control
How models decide when to check facts against the live web, and a simple test marketers can use
How inconsistent data invites errors and hallucinations
What high-stakes accuracy really means for brands
Why memory and corroboration will define 2026
If you're a marketing leader preparing for a world where AI powers a large share of searches, answers, and recommendations, this episode will help you understand why accuracy is becoming the new competitive battleground and how to make sure your brand shows up correctly… when it matters most.
Transcript
Rebecca Colwell (00:05) Happy holidays, Christian.
Christian J Ward, Yext (00:17) Happy holidays, good to see you, Rebecca.
Rebecca Colwell (00:19) Yes, this is our final episode of the calendar year. Do you have big plans before we roll into break?
Christian J Ward, Yext (00:26) I don't. Thankfully, we are taking it easy. We've got all of our kids home and we're excited to spend some time. How about you?
Rebecca Colwell (00:33) Amazing. A little bit of personal travel, but I'm very excited about that. I just wrapped up some work travel, my final work travel of the year, and it happened to be to San Francisco for a GEO conference. And it was really great. I got to share our citation research, and it was really, really well received. But I also got to hear from a lot of emerging thought leaders in this space, including Anfal Siddique, who is a machine learning engineer at Google. I have so many questions for you, because he shared some really interesting insights from Google's FACTS Grounding benchmark study, which is in essence measuring the accuracy of AI responses. But before we dive in and analyze all of that, I think it would be worth revisiting why accuracy is so important in an LLM. So let's start there, just to ground ourselves, to play on that word. Why do you think accuracy is still so important for an LLM, and what's the impact of getting it wrong?
Christian J Ward, Yext (01:37) So when it comes to AI, really conversational AI, which is built on LLMs, I think the whole battleground is going to be trust. And it's a little different from search. With search, you get a lot of information back. In fact, there have been studies showing that when people don't get what they're looking for in search, they don't pick up the phone and call Google and say it didn't work. They just retype something, because they think they did something wrong. They actually project the error back onto themselves, because it's a list. It's not really a conversation. In AI, if it brings something wrong to you, you're immediately going to say, you know what, I'm not using this anymore. And the real issue for the AI companies is that adoption is so frictionless. There's no cognitive burden or cost to me as the human of changing from Grok to Gemini to Claude to Perplexity, because it's so easy to start. No one has ever sat there going, I just finished the PDF on how to use AI. You don't have to do that. And that means there's no barrier to switching. So for AI companies, if they have inaccuracies, they're going to lose users immediately and at a horrifying rate. They have to get it right in order for people to trust the AI and to continue to use it.
Rebecca Colwell (02:39) Absolutely, that makes perfect sense. And I was also thinking from a brand perspective: what is the impact if the LLM gets information about me or my company wrong?
Christian J Ward, Yext (03:08) Well, this is where it's going to be very interesting, and I think you'll see a lot of research in the coming year on this. So let's say I'm a brand and the AI has misinterpreted my information, or it's gotten the information from a bad source. What's different from Google is that while people may blame themselves for not getting the data, they tend to also blame the brand a lot. If the data was wrong in Google, we found that people didn't blame Google, they blamed the brand. Now with AI, they're going to blame the AI and the brand. So you never get out of this alive. As a brand, it's always your fault either way. But what happens with AI is you have to make sure, if you can monitor this, if you can pay attention to where it's getting it right or wrong and what the sources are, that from the consumer perspective you are responsible for keeping a clean house. If AI is bringing back bad information and it's citing things you can control, and you're not controlling them, then it's somewhat caveat emptor: I'm sorry, but that's your fault. So people really have to start looking at this. And obviously, the conference you were at, the fact that there is a conference about AEO, GEO, and visibility, just goes to show you that this is very much on people's minds now. They understand they're going to have to take care of things way beyond the classic search perspective of hoping to show up. They've got to show up, and they've got to keep a clean house in terms of all of their data and information. Otherwise, consumers are ultimately going to disapprove of both the AI and the experience from the brand.
Rebecca Colwell (04:46) Absolutely. It seems like there are two sets of factors here: a set within a brand's control that can help improve accuracy, and a set outside of it. So maybe let's start with what they can control. What should a brand be doing to ensure that they're not just making silly errors because they've got inaccurate information out there?
Christian J Ward, Yext (05:12) Yeah, I mean, look, a lot of our research speaks to this. And actually, I'm very excited: we're going to update the research study we did back in September, where we studied about 7 million citations. We're going to be studying about 70 million citations, so in January we'll be able to see whether it has changed, and whether it changed by model. But in those original findings, what we found is that the vast majority of the information cited by AI, from the consumer perspective (if I'm a consumer and I ask, does this restaurant have this on the menu, or is this available near me), comes from the brand's own website, with a huge portion coming from third-party platforms. That could be Google Business Profile or MapQuest or Yahoo; it doesn't matter which. These third-party platforms are phenomenal providers of structured data to the AI. And then there are areas brands can control a little less but can still engage with: reviews and social. What you find is that this is going to be almost a race, whereby brands have to understand that the more knowledge they're willing to share directly with these AI systems, through the crawlers and through putting it on their website, the better off they're going to be. And I don't want to make it sound like they haven't realized that in the past, but a lot of the time a website carries almost the minimum viable information. I know that sounds crazy, but think of all the offers and deals and events. Half that stuff never makes it to the website, because it's hard to get the data to the IT team to push it out, so it doesn't happen. The people that recognize they should provide as much real-time information as they can about the brand and what it's doing are going to win. It's actually a classic lesson I learned in financial services: in all cases, real-time synchronized data beats delayed data. If you're trading a stock and you don't have real-time quotes, you should probably stop trading. You need real-time information, and AI is going to make that an absolute necessity going forward. Most websites are static for days, weeks, months, years. Stop doing that. Start understanding that you need to update this stuff everywhere as fast as you can, and the AI is very likely to reward that behavior.
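One concrete way to share those facts with crawlers is schema.org structured data embedded on your own pages. Below is a minimal sketch in Python; the field names follow the public schema.org LocalBusiness vocabulary, but the business details are invented for illustration.

```python
import json

# Hypothetical restaurant record, expressed as schema.org JSON-LD.
local_business = {
    "@context": "https://schema.org",
    "@type": "Restaurant",
    "name": "Example Bistro",
    "telephone": "+1-555-0100",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "123 Main St",
        "addressLocality": "Springfield",
    },
    "openingHoursSpecification": [
        {
            "@type": "OpeningHoursSpecification",
            "dayOfWeek": "Saturday",
            "opens": "09:00",
            "closes": "18:30",  # change here the moment hours change, then republish
        }
    ],
}

# Emit the tag to embed in the page <head>, so crawlers and AI systems
# see the same facts a human would, in machine-readable form.
print('<script type="application/ld+json">')
print(json.dumps(local_business, indent=2))
print("</script>")
```

Regenerating and republishing this record whenever a fact changes keeps the machine-readable copy as fresh as the human-readable page.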
Rebecca Colwell (07:33) Excellent. You know, you were talking about cleaning house and offers, and it made me cringe to think how many things are out there that are outdated, like promos that have expired or events that are no longer happening, that could also be sending bad signals. So it's not just getting new information out there; it's taking down things that don't make sense anymore, too.
Christian J Ward, Yext (07:53) That's such a great point, because I think a lot of times what we end up with, whether we want to or not, is this expansion of who controls what data. You might have the finance team controlling different variations on offers and pricing. Then you have the real estate team that has all the data regarding new offices, new locations, and closing locations. And then you have the employment or recruiting team controlling job postings. Keeping all these areas separate is a fallacy. We have to start saying: if I don't get everyone on the same page, in a compliant manner, controlling as many of these facts as possible, I'm going to be great in one area and totally behind in another. This is obviously a large reason why we built the Knowledge Graph, but you need a federated understanding across the organization that if we want to dominate in AI, where a dialogue is going to run far longer and get far more entrenched than a search, you've got to make sure all those things are working together, as in the sketch below. And look, I think it's an opportunity. I think for most brands, if they can focus on making sure the answers are readily available to any question, that's going to help them a lot. But this is a new exercise for many marketing departments. I remember the term evergreen; I love the idea of evergreen. And I'm like, listen, other than your mission statement as a business, nothing's evergreen. You need to start realizing every consumer is deciding in real time, in their time, whether or not you're the right choice. And AI is going to make that a big, big hit in '26.
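As a sketch of that single-source-of-truth idea: keep one canonical record per fact, and fan every change out to all surfaces at once. The publisher functions below are hypothetical placeholders for real integrations (your CMS, Google Business Profile, and so on).

```python
# One canonical record per fact; every change fans out to all surfaces at once.
canonical = {
    "saturday_hours": "09:00-18:00",
    "current_offer": "Holiday sale: 20% off through Dec 24",
}

def publish_to_website(record: dict) -> None:
    print("website            <-", record)  # placeholder for a CMS API call

def publish_to_listings(record: dict) -> None:
    print("listings (GBP etc) <-", record)  # placeholder for a listings API call

PUBLISHERS = [publish_to_website, publish_to_listings]

def update_fact(key: str, value: str) -> None:
    """Change the canonical value once, then push it everywhere immediately."""
    canonical[key] = value
    for publish in PUBLISHERS:
        publish(canonical)

update_fact("saturday_hours", "09:00-18:30")  # one edit, synchronized everywhere
```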
Rebecca Colwell (09:31) The second thing that came up that was really interesting to me was how much of the accuracy is entirely outside of our control, and really comes down to the model's ability to correctly interpret that information. So the term grounding came up quite a bit. Before we dive into the results of the study that Google had done, how would you explain grounding in terms a layperson could understand?
Christian J Ward, Yext (09:57) Sure. So think about a question like, why is the sky blue? The AI doesn't think it's likely that the reason the sky is blue has changed in the last 24 hours, right? So it basically says, look, I know that from my training data; I don't need to go out to the web. Grounding is when the AI model recognizes from your question that it should go check the answer on the live web. What time does the Starbucks on the corner open? It should probably check that, just in case it's changed in the last few weeks. Grounding, as you know, is based on the idea that if I can go and check, I have a higher verification rate and a higher likelihood of accuracy. But it's not for all questions. One of the best ways everyone should check this: if you use Anthropic's models, Claude, they actually allow you to turn off web searching, which is a really neat way to test. Hey, I'm asking questions about my brand. If you turn off web searching, do you see more hallucinations or more errors? If so, that means the training data was wrong, and you really have to rethink your strategy. Now, if you turn on web search and it answers differently, and correctly, then you go, okay, I've got to somehow get more of that data back into the training, or I need that data in more places for the next time it runs, so I know I'm getting that data out there. It's a great hack if you want to see what's grounded and what's not.
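Programmatically, the same test looks something like the sketch below, using the Anthropic Python SDK. The web search tool type string and the model ID reflect Anthropic's documentation at the time of writing and may change; the brand question is hypothetical.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Hypothetical brand question; swap in your own.
QUESTION = "What time does the Main Street location of Example Bistro open on Saturdays?"

def ask(question: str, use_web_search: bool) -> str:
    """Ask the same question with and without live web access."""
    kwargs = {}
    if use_web_search:
        # Server-side web search tool; type string per Anthropic's docs
        # at the time of writing.
        kwargs["tools"] = [
            {"type": "web_search_20250305", "name": "web_search", "max_uses": 3}
        ]
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # assumed current model ID
        max_tokens=512,
        messages=[{"role": "user", "content": question}],
        **kwargs,
    )
    # Concatenate any text blocks in the reply.
    return "".join(b.text for b in response.content if b.type == "text")

print("Training data only:\n", ask(QUESTION, use_web_search=False))
print("\nWith web search:\n", ask(QUESTION, use_web_search=True))
```

If the two answers diverge, the grounded one tells you what the live web says about you; the ungrounded one tells you what the model believes.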
Rebecca Colwell (11:18) That's so interesting. When Anfal was talking... I should take a step back. So when Anfal Siddique, the machine learning engineer from Google, did his presentation, he was explaining some research that Google had conducted, the FACTS Grounding benchmark study, right? And in this, he was saying the model was given a set of information, and then questions that should have been answerable within that information, right?
Christian J Ward, Yext (11:37) Yes.
Rebecca Colwell (11:48) Then they went and said, how accurate were the answers? And the top-performing model was Gemini 3 Pro, which had a score of 68.8. Does that mean it only got two-thirds of the information right?
Christian J Ward, Yext (12:00) Yes and no. So, a couple of things. Usually when they do this, they look for at least one error: did that happen? Then they look at the depth of that error and how severe it was. All of these groups measuring accuracy are looking at it across a different span, and that's probably a good thing, because these aren't multiple-choice standardized tests to get into college, where if you get one wrong there's no going back. In life, there are degrees of verification and accuracy, and I think they're actively trying to manage that. The problem is there isn't a perfectly agreed-upon benchmark yet for what accuracy is. So think about subjective questions versus objective questions. Is this a good restaurant to go to with my wife for our anniversary? That has massive variance, so misinterpreting it is a lot more likely than misinterpreting what time the restaurant opens on Saturdays, which has a much higher likelihood of being correct. You can think of it as an asymmetry of verification: there are things that are easily verifiable and things that are very hard to verify. Now, I didn't look at all the examples in his study (I think we should link to it so people can see), but this is not a simple question, because in dialogue we're often asking many different things, and those things sit on a varying curve of verification. So you get into this world where an answer might be 68% right; to the human, it answered nine-tenths of what they were looking for, but made subtle mistakes along the way. Alternatively, I can tell you there was a recent study on Gemini 3 and GPT 5.2, which just came out, where they got the hallucination rate, which is a little different from an error, down another 50%. A hallucination is a misinterpretation or misstatement of things that should be known; an error is literally having the wrong data. There's a lot you can do as a brand or a business about getting rid of errors. That's about having perfectly consistent information everywhere the AI looks: your website, the directories, the review sites. I don't care where you can post it, post it, but it must stay in perfect synchronization every time you make a change. Otherwise you're introducing the likelihood of error, because when it checked over here, you said six o'clock at night, and when it checked over there, it said 6:30, which is what your hours are in the summer, but you didn't update that other location. That's the way to fight errors. Hallucinations, to be honest, there's not a ton you can do about, and in fact most of the models are getting so good that they're down to 1 or 2% hallucination rates, where the model is making something up. And I've got to tell you, I've had interns that hallucinated at a much higher rate than that. So I think the models are handling the hallucination rates, but if you're giving them data that varies across time and places, you're inviting not only errors, but hallucinations off of that inconsistency.
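A minimal sketch of that synchronization audit for a single fact (Saturday hours), assuming a hypothetical snapshot of what each surface currently publishes; a real audit would pull live values through each platform's API.

```python
from collections import Counter

# Hypothetical snapshot of what each surface currently publishes for one fact.
listed_hours = {
    "website": "Sat 09:00-18:00",
    "google_business_profile": "Sat 09:00-18:30",  # stale summer hours
    "yelp": "Sat 09:00-18:00",
    "apple_maps": "Sat 09:00-18:00",
}

def consistency_report(listings: dict) -> None:
    """Flag any source that disagrees with the majority value."""
    consensus, _ = Counter(listings.values()).most_common(1)[0]
    for source, value in sorted(listings.items()):
        status = "OK" if value == consensus else f"MISMATCH (consensus: {consensus})"
        print(f"{source:24} {value:18} {status}")

consistency_report(listed_hours)
```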
Rebecca Colwell (15:20) Interesting, interesting. You're talking about the difference between "is this restaurant good" and "is this restaurant open." I was also wondering: for some things, like "is the restaurant good," if the model doesn't get the answer 100% accurate, the consequence is maybe you have a not-so-great dinner. But there are probably some pretty consequential questions people could be asking about their health or their finances, where the stakes are much, much higher.
Christian J Ward, Yext (15:38) Yes, yes. I'll give you the one that always hits home for us, which is allergens. We have a child with an allergy for whom tree nuts are really dangerous, so the question of whether or not a tree nut is present in a recipe is one you cannot get wrong. I want everyone to think of this phrase: what is the penalty for being wrong? It's a great way to think about what you cannot get wrong. So in that case, if I say, hey, what's on the menu? I don't know if you've ever gone to Disney World, but the chef comes out to every table where there's an allergy and walks through everything the guest can and cannot have. That's because they recognize the penalty for being wrong is life and death. On the other side: was it the ambiance you really wanted, and did the model maybe misinterpret the ambiance from six reviews from three years ago? Quite frankly, that may not even matter, because the way a human interprets the ambiance on a given evening is very different; it could just be the company they're with. So I think the reality is we have to leave some room here between what the accuracy of the data is, where it's stored, and where it's available to the models, and separate that from error rates and verification. I think '26 is going to be a really interesting time where, whether the foundational model companies want it or not, there's going to have to be some concept of how much you can verify this answer right now. Thankfully we can see the citations; we know where the source is, and that's helpful. But how confident the model is that the data is accurate: that's something that's going to come.
Rebecca Colwell (17:25) So do you see it potentially becoming a requirement, the same way that privacy disclosure became required, or accessibility has become required?
Christian J Ward, Yext (17:34) Yeah, I mean, look, I hope to God we're not heading into another cookie-banner situation. But what I'll tell you is you can actually do this already. And if you don't do this, here's another hack I'd love everyone to try, like the turn-off-web-search one. You can ask any question and then go back and say: put all of the knowledge you used to answer that question into a table, then tell me your source, then rate for me your confidence that the information is correct or verifiable. And it will give you a rating. Now, that is a calculated figure, but what you're doing is triggering something useful. Remember, most of these models are not one model; they're a mixture of experts. If you asked a mathematical question, a different expert handles that than a rational-logic question or a web-search dinner question. So when it does that, you're going to get some different ratings. I can tell you, obviously, if it's 50% or less, it's a guess; the AI is saying, I'm not sure, it's 50-50. But everything above that, I think you're going to find, is pretty good. Now, whether that becomes a mandate or not is much more a matter for regulatory authorities and regulatory regimes. I don't think we should head there; I think that would just slow everything down. But I do think if some model company came out and put, right next to the this-was-a-good-answer, this-was-a-bad-answer buttons, some sort of confidence number that you could click on, and it could break it down and say, by the way, I'm not as confident about that restaurant's menu ingredients because they don't expose them (the menu isn't online, which is the restaurant's mistake), then you come back and say, okay, I understand. Does that make sense? I think that's a good way to break it down.
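As a sketch, that confidence-table hack is just a second turn in any chat model; the wording below is illustrative, not an official prompt.

```python
# Illustrative follow-up prompt; paste it as a second turn after any answer.
CONFIDENCE_FOLLOW_UP = (
    "Put all of the knowledge in your previous answer into a table with three "
    "columns: the claim, your source (a URL or 'training data'), and your "
    "confidence (0-100%) that the claim is correct or verifiable. "
    "Treat anything at or below 50% as a guess and flag it explicitly."
)

print(CONFIDENCE_FOLLOW_UP)
```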
Rebecca Colwell (19:08) Yes, yes, I love that possibility, because it's interesting. If I ask the LLM about something that I'm an expert in, say a new marketing strategy, and it comes back with an answer, I have enough judgment most of the time to squint at it and say, okay, that doesn't look right. But if I ask about something that I don't really know about, I'm not going to catch the errors.
Christian J Ward, Yext (19:25) Right. Yes. Thanks to Dunning-Kruger, if you know a little bit, you think you know it all, and you're going to be like, that's totally it. It's the sycophancy problem of AI, where it's like, you're so right, Rebecca. Look, it's definitely a problem, and I think most of the AI models and companies know this. This is a little bit of the battle between safety guidelines, the model not speaking out of turn to a human being, and the model not disrupting a human or being antagonistic. So I understand there's a balance. But when we think about how this will pan out, I think there should be something to tell people, hey, I'm confident in this answer, I'm not as confident in that answer. Citations were a great start. They could put next to the citation its recency, the last time they retrieved it. There are lots of ways to add that value. But I also want to offer that it's probably only people like us, the marketers, the big brands, the people trying to understand this, who are looking at this. I don't think that many humans are clicking the citation panel. I do it all the time, but it's my job. For everyone else, it would be great if there were some other accepted way of sharing that, or just making sure it's available to the human, so they can look and say, hey, you told me there was a sale on Legos this weekend, just before the holiday, and I have to get to the store. If you don't show where you got that, I think that is again the issue for AI: they can say, hey, we got it from right here, this is the source, I guess it was wrong. But it's really tough. I think when people get the wrong answer, they're just going to end up swapping AIs, and the AI companies are very aware of that issue: they've got to get better sources with more information. But remember, AI companies are not going to use one source. They're going to do what Google does, which is seek corroborating evidence. That's the word for 2026: corroborating evidence. Hey, I found this in 26 places, I'm pretty confident; over here, I found it in four places, not so confident. That's going to change how they handle that asymmetry of verification.
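A minimal sketch of that corroboration idea, mapping a corroborating-source count to a rough confidence score; the counts and the saturating curve are illustrative assumptions, not any model's actual formula.

```python
import math

def corroboration_confidence(sources: int, midpoint: float = 5.0) -> float:
    """Map a corroborating-source count to 0-1 confidence on a saturating curve."""
    return 1.0 - math.exp(-sources / midpoint)

for n in (1, 4, 26):
    print(f"{n:2d} corroborating sources -> confidence ~{corroboration_confidence(n):.2f}")
```

With these assumptions, one source scores about 0.18, four about 0.55, and twenty-six about 0.99: the "found it everywhere vs. found it in a few places" asymmetry in miniature.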
Rebecca Colwell (21:25) I love that look forward. Let's wrap up our segment today with a speed round of the year in review, because it's been quite the year. I'll just fire some quick questions at you, and we'll see where we go. So, what do you think was the most important development in AI search this year?
Christian J Ward, Yext (21:46) Okay. So in terms of the different models with AI search, one of the things that has happened, which I wasn't necessarily expecting (I know people were worried about AI Overviews when we started the year), is that Gemini has really been amazing in terms of the way it's merging AI with classic search. In other words, at the beginning of the year we were all debating, are people going to use AI to search? And Google was sort of like, let me answer that for you.
Rebecca Colwell (22:31) Thanks
Christian J Ward, Yext (22:31) We're going to use AI on every search. So it wasn't really something you can say people adopted or didn't. A lot of the SEO community is upset about it, because a lot of what they're used to is changing. But I think AI working its way into the adoption rates, partially through Gemini and partially through ChatGPT, was really the big winner. There were a lot of people early on this year, in early 2025, saying, you know, it's doing okay, it's only this percent. And I think people are realizing, no, it's not just this percent. Look, my kids do not use Google for the majority of what they do, so we're getting to the point where you can't ignore this. That was the big one this year.
Rebecca Colwell (23:14) Okay. There were a lot of models out there, and it feels like there was this arms race all year. It was like one week one would do this, and then the next week another would counter with this and this and this. So at the beginning of the year, what do you think the strongest model was? And now, at the end of the year, who has the best model?
Christian J Ward, Yext (23:21) Yes. So at the beginning of the year, I think it was actually a battle between GPT, meaning OpenAI, and Claude, meaning Anthropic. I mean, if we break this up, I just want to caveat: whenever I'm talking about models, there are three audiences I'm thinking about. The people that code and engineer have a very different use case than the classic consumer-journey use case, and then there's the brand-marketer use case. So there are three here, but I'll say on the coding and brand-marketing side, Claude and ChatGPT were really crushing it. I think the big winner, though, if I'm looking at the models, was Gemini 3. Look, do you remember all the headlines at the beginning of the year that Google's on the run? I think Gemini 3 said, not only are we here, we're ahead. And many people, me included, were so glad to see that. The beauty of it is you have a company with one of the most successful business models in the history of business, maybe the most successful, and you have them not falling into the innovator's dilemma, in my opinion, but running boldly into this future. And that was great, because it's going to keep the competitive landscape going. And I think Anthropic, and to their credit OpenAI, whose newest model is doing phenomenally as well... but remember, like you said, I haven't seen an arms race like this since I was a kid watching Reagan and Gorbachev. This is incredible; it's every few weeks. And I don't think it's going to slow down. I think in 2026 we're going to see the same pace, if not faster, but you're going to see it in multimodal and all these other areas.
Rebecca Colwell (25:02) Yeah. Incredible. To wrap us up, what is your biggest prediction for the coming year?
Christian J Ward, Yext (25:22) Well, I'll go back to where we started, where I think verification of information is going to continue to be a really big space. I think most people are going to start really questioning and wanting to know what the model is doing and why. That will be a big trend. Going forward, what everybody was guessing about earlier was adoption; what you're really going to start seeing is the addition of all the other modalities, all the other experiences. And my big prediction for this coming year is memory. To me, memory is this huge unlock. Sam Altman has already said GPT-6 will be almost exclusively about memory, and Gemini is already demonstrating how powerful memory is. But we humans are going to have a mildly existential concern with having beings that remember everything, and that's going to be both a blessing and a curse. Forgive and forget: as humanity, we forgive most of the time because we forget. We're designed to forget. We're now going to have a compatriot that forgets nothing. So I think 2026 is going to be the year of memory and, quite frankly, the year of personalization, in a way where the consumer, the human, is more empowered, because they're using an AI agent to control what is shared with brands and businesses. This will be uncomfortable for marketers who want to know everything about you and me, and where we went for dinner last night, and everything in between. It's going to be uncomfortable, but this is a better path. It's just going to be different than everyone expects.
Rebecca Colwell (26:53) Okay, so memory and personalization. We have it recorded for posterity, so many years from now we'll see how it played out. And I bet we'll be talking about it every couple of weeks on The Visibility Brief. Thank you so much for joining us, Christian. I hope you have a really restful holiday.
Christian J Ward, Yext (27:02) Excellent. You as well. Great to see you.
Rebecca Colwell (27:15) Thanks. And thank you all for joining us for our final episode of The Visibility Brief in 2025. Thanks for listening, and we'll see you next year.