AI Re-Ranking For Semantic Search

By Chris Barnhart On Aug 11, 2022

Search isn’t just about matching keywords – and that’s even more true when we talk about semantic search.

Semantic search is about finding the right information for the searcher at the right time.

That goes beyond finding the right keywords and concepts; it also means anticipating how searchers will interact with the results.

Artificial intelligence (AI) re-ranking will take information about the people who come to search and tailor search results to the individual.

That might be done on a cohort level, changing results based on trends, seasonality, and popularity.

It might also be done individually, changing results based on the current searcher’s desires.

While AI re-ranking is not easy to implement in a search engine, it brings outsized value for conversions and searcher satisfaction.

Re-Ranking With Artificial Intelligence

AI-driven re-ranking can improve search results, no matter the underlying ranking algorithm a search engine uses.

That’s because good search results are more than textual relevance and business metrics like raw popularity.

Good results take into account other signals and do so on a per-query level.

To see why this is important, let’s focus on the business metric of popularity.

It’s a good general ranking signal but can fall short for specific queries. A search for “red dress” might return two different dresses among the top results: “backless dress with red accents” and “summer dress in bright red.”

The backless dress might be more popular as an overall dress and product.

But in this case, specifically, it’s not what customers want.

They want a red dress, not one with red accents, and they click and buy accordingly.

Shouldn’t the search engine take that as a signal to rank the summer dress higher?

Search Analytics

As the example above shows, understanding what searchers are doing is necessary for re-ranking.

The two most common events to track are clicks and conversions.

Generally, those are the only two events you need, and both must originate from search.

The example above also highlights another important consideration: the events should be tied to specific queries.

That allows the search engine to learn from the interplay between the different result sets and user interactions. It propels the summer dress higher in the search results for the “red dress” query.

For other queries, the same product might be less popular than its neighbors.

When looking at your different events, you’ll want to weigh them differently, too.

Clicking on a result is a sign of interest, while making a purchase (or any other conversion) is a sign of commitment.

The ranking should reflect that.

The weighting doesn’t need to be complex.

You can go as simple as saying that a conversion is worth twice as much as a click.

You should test the right ratio for your own search.
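
As a minimal sketch of that weighting, assume each tracked event is a (query, record ID, event type) tuple. The tuple layout, the record IDs, and the 2:1 weights below are illustrative assumptions, not values taken from any particular search engine.

```python
from collections import defaultdict

# Illustrative weights: a conversion counts twice as much as a click.
EVENT_WEIGHTS = {"click": 1.0, "conversion": 2.0}


def score_events(events):
    """Sum weighted events per (query, record_id) pair.

    Keeping the query in the key means a record's score for
    "red dress" is independent of its score for other queries.
    """
    scores = defaultdict(float)
    for query, record_id, event_type in events:
        # Unknown event types contribute nothing.
        scores[(query, record_id)] += EVENT_WEIGHTS.get(event_type, 0.0)
    return dict(scores)


# Hypothetical events for the "red dress" example.
events = [
    ("red dress", "summer-dress", "click"),
    ("red dress", "summer-dress", "conversion"),
    ("red dress", "backless-dress", "click"),
]
scores = score_events(events)
```

Here the summer dress ends up with a higher score than the backless dress for this query only. Changing the ratio is a one-line edit to the weights, which makes it easy to test alternatives.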

You may also want to discount events based on the result ranking at the time the searcher saw it.

We know that a result’s position influences its clickthrough rate (CTR).

Without discounting events, you may have a situation where the top results become even more entrenched: they get more interactions, which keep them ranked higher, and the cycle repeats indefinitely.
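
One way to sketch position-based discounting is to give an interaction more weight the lower the result was ranked when the searcher saw it. The logarithmic curve below is an assumption chosen for simplicity; the right curve depends on how CTR actually falls off by position in your own search.

```python
import math


def position_weight(position):
    """Weight for an event on a result shown at a 1-based position.

    Assumption: results further down are seen less often, so an
    interaction there is stronger evidence of relevance.
    log2(position + 1) gives 1.0 at position 1 and 2.0 at position 3.
    """
    if position < 1:
        raise ValueError("position is 1-based")
    return math.log2(position + 1)


def discounted_score(event_positions):
    """Sum position-adjusted weights for a list of event positions."""
    return sum(position_weight(p) for p in event_positions)


# Three clicks at position 1 versus two clicks at position 3.
top_result = discounted_score([1, 1, 1])
lower_result = discounted_score([3, 3])
```

With this curve, two clicks on the third result outweigh three clicks on the first, which gives lower-ranked results a chance to climb.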

Freshness And Seasonality

A simple way to combat this self-reinforcing loop is by discounting events based on the time passed since the event.

That works because each event has an increasingly small impact on re-ranking as it ages. That is, until, at some point, it has no impact at all.

For example, you might divide the impact of each event by two, each day, for 30 days. And after 30 days, stop using the event for ranking.
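
The halving rule in that example can be sketched as a small decay function. The function and parameter names are hypothetical; the one-day half-life and 30-day cutoff come straight from the example.

```python
def decayed_weight(age_days, half_life_days=1.0, max_age_days=30.0):
    """Return the remaining weight of an event that is age_days old.

    The weight halves every half_life_days and drops to zero once
    the event is older than max_age_days.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days > max_age_days:
        return 0.0
    return 0.5 ** (age_days / half_life_days)


today = decayed_weight(0)       # full weight
yesterday = decayed_weight(1)   # half weight
expired = decayed_weight(31)    # no longer used for ranking
```

Multiplying each event's weight by this factor before summing gives a decayed score. A longer half-life keeps older events relevant for longer and is worth testing against your own traffic.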

A nice benefit of using freshness in the re-ranking algorithm is that it also introduces seasonality into the results.

Not only do you stop recommending videos that were extremely popular years ago but are boring to people today; you also will recommend “learn how to swim” videos in the summer, and “learn to ski” videos in the winter.

YouTube has seasonality and freshness built into its algorithm precisely for this purpose.

Using Signals To Re-rank

Now that you’ve got the signals and are decaying them over time, you can apply them to the search results.

When we see “artificial intelligence,” we often think of something incredibly complex and inscrutable.

AI, though, can also be as simple as taking data over time and using it to make decisions, like we’re doing here.

One easy approach is to take a certain number of results and simply re-rank them based on a score.

For performance reasons, this number of results will generally be fairly small (10, maybe 20). Then, rank them by score.

As we discussed above, the score could be as simple as adding up the number of conversions times two, plus the number of clicks.

Adding a decay function makes for more complexity, as does discounting based on result position – but the same general principle applies.
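
A minimal sketch of this approach follows, assuming the search engine returns an ordered list of record IDs and that a score per record has already been computed for the current query. The function name and sample data are hypothetical.

```python
def rerank_top_n(results, scores, n=10):
    """Re-rank the first n results by score and leave the rest as-is.

    sorted() is stable, so records with equal scores (including
    records with no events at all) keep the engine's original order.
    """
    head, tail = results[:n], results[n:]
    head = sorted(head, key=lambda record: scores.get(record, 0.0), reverse=True)
    return head + tail


# Engine order for "red dress", with per-query scores (conversions * 2 + clicks).
results = ["backless-dress", "summer-dress", "red-scarf"]
scores = {"backless-dress": 1.0, "summer-dress": 3.0, "red-scarf": 10.0}
reranked = rerank_top_n(results, scores, n=2)
```

Only the first two results are reordered here, so the scarf stays in third place despite its high score. That is the limitation of re-ranking a small window.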

Learning To Rank

A drawback of this re-ranking system is that you are limited to re-ranking a smaller number of results.

If you have a result that would otherwise be popular but isn’t ranking high, that result won’t get the attention it warrants.

This system also requires events on the records and the queries you want to re-rank.

It won’t work for brand new product launches or user-generated content (UGC) that often comes in and out of the search index.

Learning to rank (LTR) can address these issues.

Much like the re-ranking we’ve discussed above, LTR also works based on the idea that the records searchers interact with are better than the ones they don’t.

The previous re-ranking method works by boosting or burying results directly when tied to a specific query.

Meanwhile, LTR is much more flexible. It works by boosting or burying results based on other popular results.

LTR uses machine learning to understand which queries are similar (e.g., “video games” and “gaming console”).

It can then re-rank results on the less popular queries based on interactions on the more common ones.

LTR doesn’t only generalize on queries; it generalizes on records, too.

The LTR model learns that a certain type of result is popular; for example, the Nintendo Switch game “Legend of Zelda: Breath of the Wild.”

Then, it can start to connect to other similar results (for example, “Legend of Zelda: Skyward Sword”) and boost those.

Why, then, not just use LTR if it appears to be much more powerful than your typical re-ranking and provides more query and record coverage?

(In other words: It generalizes better.)

In short, LTR is much more complex and needs more specialized in-house machine learning (ML) expertise.

Additionally, understanding why certain results are ranked in certain places is more difficult.

With the first type of re-ranking, you could look at the number of clicks and conversions over time for one record compared to another.

Meanwhile, with LTR, you have an ML model that makes connections that may not always be obvious.

(Are “Breath of the Wild” and “Sonic Colors” really all that similar?)

Personalization

While re-ranking works across all searchers, personalization is what it sounds like: personal.

The goal of personalization is to take results that are already relevant and re-rank them based on personal tastes.

While there is a debate on how much web search engines like Google use personalization in their results, personalization often impacts the performance of results in on-site search engines.

It is a useful mechanism for increasing search interactions and conversions from search.

Search Analytics

Just as with re-ranking, personalization depends on understanding how users interact with search results.

By tracking clicks and conversions, you’ll have a clearer idea of the kinds of results that the user wants to see.

One significant difference between re-ranking and personalization on this front is that, depending on your search, you might want to adjust how you apply personalization.

For example, if you sell groceries, you definitely want to recommend previously purchased products.

But if your website sells books, you won’t want to recommend a book that a customer has already bought. Indeed, you may even want to move those books down in the search results.

It’s also true, however, that you shouldn’t push personalization so hard that users only see what they’ve interacted with before.

Search empowers both finding and discovery. So, if they return to the search bar, you should be open to the possibility that they want to see something new.

Don’t rank results exclusively via personalization; make it a mix with other ranking signals.
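
One simple way to mix the signals is a weighted blend, with a separate adjustment for items the user has already bought. The function, its parameters, and the numbers below are illustrative assumptions; a grocery site might use a positive adjustment and a bookstore a negative one.

```python
def personalized_score(base_score, user_affinity, already_purchased,
                       personal_weight=0.3, repurchase_boost=0.0):
    """Blend a general ranking score with a per-user affinity score.

    personal_weight controls how much personalization counts (0 to 1).
    repurchase_boost is added for previously purchased items: positive
    to promote them, negative to move them down.
    """
    score = (1 - personal_weight) * base_score + personal_weight * user_affinity
    if already_purchased:
        score += repurchase_boost
    return score


# Same item and user signals, different store policies (hypothetical values).
grocery = personalized_score(1.0, 0.0, True, personal_weight=0.5, repurchase_boost=0.25)
bookstore = personalized_score(1.0, 0.0, True, personal_weight=0.5, repurchase_boost=-0.25)
```

Keeping personal_weight well below 1 leaves room for results the user has never interacted with.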

Just as with re-ranking, personalization also benefits from event decay.

Decreasing the impact of older events makes a search more accurately represent a user’s current tastes.

In a way, you can think of it as personal seasonality.

Personalization Across Users

The kind of personalization we’ve seen so far is based on an individual’s own interactions, but you can also combine it with what others are doing inside search.

This approach has an outsized impact in situations where the user hasn’t interacted with the items in the search results before.

Because the user hasn’t interacted with the search result items, you can’t boost or bury based on their past interactions, by definition.

Instead, you can look at users that are similar to the current user and then personalize based on what they have interacted with.

For example, say you have a user who has never come to you for dresses but has purchased many handbags.

Then, you can look for other users who have similar tastes and have also interacted with dresses.

Intuitively, other customers who like the same type of handbags as our searcher should also like the same dresses.
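
As a rough sketch of that idea, the code below measures how similar two users are by the overlap of the items they have interacted with (Jaccard similarity), then scores candidate results by the similarity of the users who interacted with them. This is one simple technique among many, and all names and data are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of item IDs, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_from_similar_users(target_items, other_users, candidates):
    """Score each candidate by the summed similarity of users who interacted with it."""
    scores = {candidate: 0.0 for candidate in candidates}
    for items in other_users.values():
        similarity = jaccard(target_items, items)
        for candidate in candidates:
            if candidate in items:
                scores[candidate] += similarity
    return scores


# The searcher has bought handbags but never a dress.
searcher = {"tote", "clutch"}
others = {
    "user-1": {"tote", "clutch", "wrap-dress"},
    "user-2": {"backpack", "maxi-dress"},
}
dress_scores = score_from_similar_users(searcher, others, ["wrap-dress", "maxi-dress"])
```

The wrap dress scores higher because the customer who bought it shares the searcher's taste in handbags, while the maxi dress gets no boost.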

Re-Ranking And Personalization For Discovery

Search is only one example of where re-ranking and personalization can make an impact. You can use these same tools for discovery as well.

The secret is to think of your home page and category pages as search results.

Then, it’s clear that you can use the same tools you use for search and gain the same benefits.

For example, a home page is similar to a search page without a query, isn’t it? And a category landing page sure does look like a search page with a category filter applied to it.

If you add personalization and re-ranking to these pages, they can be less static. They will serve users what they prefer to see, and they can push items higher that are more popular with customers overall.

And don’t worry, personalization and re-ranking can mix with editorial decisions on these pages or inside search.

The best way to handle this is by fixing the desired results in certain places and re-ranking around them.
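
A minimal sketch of pinning, assuming editorial choices are expressed as a mapping from a 0-based position to a record ID. That mapping format and the function name are assumptions made for illustration.

```python
def rerank_with_pins(results, scores, pins):
    """Re-rank results by score while keeping pinned records in fixed slots.

    pins maps a 0-based position to the record ID that must appear there.
    """
    pinned = set(pins.values())
    # Rank everything that is not pinned; ties keep their original order.
    ranked = sorted(
        (record for record in results if record not in pinned),
        key=lambda record: scores.get(record, 0.0),
        reverse=True,
    )
    # Insert pins in ascending position order so earlier pins are not shifted.
    for position in sorted(pins):
        ranked.insert(position, pins[position])
    return ranked


# Hypothetical category page with one editorial pick fixed in the first slot.
results = ["a", "b", "c", "d"]
scores = {"a": 1.0, "b": 5.0, "c": 3.0, "d": 2.0}
page = rerank_with_pins(results, scores, pins={0: "d"})
```

The editorial pick stays first, and the remaining records are ordered by their interaction scores around it.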

We’ve seen that personalization and re-ranking are two approaches that combine user interactions with other relevant signals to make search better.

By using those interactions, you let your user base influence the results.

Little by little, these interactions tell the search engine what items should be ranking higher.

Ultimately, searchers benefit from a better search experience, and you benefit from more clicks and conversions.
