Sunday, March 13, 2011

Overriding a Keyword Search?

My most recent project has taken an interesting turn into keyword search. We're using Solr as our search engine, and the philosophical question on the table is how to fine-tune it, or whether we should fine-tune it at all.

That is, we are in a tug-of-war over two questions: whether the most prominent results should be the products management wants to feature, and whether the results are inherently flawed because other things are showing up. Specifically, some stakeholders are insisting that we find a way to re-interpret what the user is searching for or override the results the search engine brings back.

Well that sounds rather Orwellian, and maybe not a good thing. At least not something I care to endorse. But let me explain the dilemma a little more.

First, this is an eCommerce site, not social networking or information sharing. It's all about connecting shoppers to products, and consequently, reinforcing marketing goals for the business.

Second, we are grappling with two problems: iconic brands/products and context-sensitive search. The question at hand is how to strike a balance between iconic brands and products and the apparent randomness of context-sensitive search.

To illustrate the problem, let's think about grocery shopping. And let's imagine your search word is "Kellogg's". What are you searching for? Breakfast cereal? Probably. And what images are in your mind that signal "Kellogg's" and breakfast cereal? Maybe a box of cornflakes? Not that you want to buy cornflakes or even like them, but it's the product that built the company and it's an image that strongly signals your search has landed you in the right place.

That's an iconic brand and an iconic product. And the problem is that people have such strong expectations about iconic brands and products, they may believe the search is broken if these brands or products don't show up according to their expectations.

Think about Kellogg's Cornflakes again. Where did you imagine this product would appear in your imaginary Kellogg's search? Near the top? In the middle? On Peapod, the dominant Chicago-area online grocery, a search on "Kellogg's" sorted by "Best Match" puts Kellogg's Cornflakes in 100th place, trailing a long mix of other cereals, breakfast bars, frozen waffles, and fruit roll-ups. Surprised?

Maybe that is an accurate reflection of the market, or maybe the search algorithm is flawed. After all, we don't know Peapod's criteria for "Best Match" or their preferences for sorting results. But it's not hard to imagine a brand manager somewhere being rather upset and insisting something must be wrong if Cornflakes is not among the top 10 or 12 things you see.

We're not in the grocery business but our client makes and distributes several iconic products and brands and carries others in its inventory. These iconic products and brands are not always obvious in our search results, either. A lot of parts, supplies, and accessories show up first when you search solely for the brand name. And yes, this seems a little odd, or at least, hard to explain. To switch analogies, it's like showing you the windshield wipers and floor mats before we show you the car.

Our search engineer and I are convinced it's just a matter of examining the relevancy logic and re-weighting some variables or adding some variables so that the ancillary products lose relevancy. But how finely do you tune the search when you don't have any real data to work with?
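To give a rough idea of what "re-weighting" could mean in practice, here is a minimal sketch that builds the request parameters for Solr's DisMax query parser, which accepts per-field boosts (`qf`) and an additive boost query (`bq`). The field names (`name`, `brand`, `description`, `product_type`), the `finished_good` value, and every boost number are invented for illustration; they are not our schema, and without real usage data any numbers here would be guesses.

```python
from urllib.parse import urlencode


def build_search_params(user_query):
    """Build query parameters for a Solr DisMax request.

    All field names and boost values below are illustrative assumptions.
    """
    return {
        "q": user_query,
        "defType": "dismax",
        # Weight a match in the product name above a match buried in a long description.
        "qf": "name^5 brand^3 description^1",
        # Nudge whole products above parts and accessories without hiding the latter.
        "bq": "product_type:finished_good^10",
        "rows": 25,
    }


# Encode the parameters as they would appear on a select request's query string.
query_string = urlencode(build_search_params("acme"))
print(query_string)
```

The point of a boost like this is that it only reorders results: the parts and accessories still match and still appear, just further down.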

To complicate matters, this is a B2B site that is not in production yet and we have no budget for user research; we don't really know what kind of search habits and expectations our customers will have. We know that searching on product numbers (full or partial) works fine and that is something our users have been doing for a long time on the legacy site. But soon, they'll have the power of text search over the full product specifications and descriptions and we have no clue how they'll respond to that.

Meanwhile, our stakeholders and product owners are experimenting with keyword search based on their own preferences and guesses, which means a lot of searching for the iconics, and they are seeing too many parts and accessories in the results.

Some of these managers are certain the best bang for their money is to stop spending it on search engine support and start spending it on workarounds, like trapping for brand and product names and running hard-coded queries or highly limited searches instead of letting the search engine do its thing.
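To make that proposal concrete, here is a rough sketch of what trapping for a brand name might look like. Everything in it is hypothetical: the `CANNED_QUERIES` table, the `route_query` helper, the brand "acme", and the query syntax are stand-ins, not anything we have built.

```python
# Hypothetical table of trapped terms and the hard-coded queries they trigger.
CANNED_QUERIES = {
    "acme": "brand:acme AND product_type:finished_good",
}


def route_query(user_input):
    """Return (mode, query): a canned query for a trapped term, else the raw text."""
    key = user_input.strip().lower()
    if key in CANNED_QUERIES:
        return ("canned", CANNED_QUERIES[key])
    # Anything not trapped exactly falls through to the search engine untouched.
    return ("engine", user_input)


print(route_query(" Acme "))
print(route_query("acme filter"))
```

Notice that "acme filter" is not trapped. Every brand, product name, and misspelling needs its own entry, and someone has to keep that table current by hand.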

The more I struggle with these discussions, the more it hits me that we are getting sucked down a dead end. If these brands and products are so central to the business and the site's identity, why should we be spending time and energy on the idea that our customers would be typing these brand names into a search box?

If these brands are so central to the business, why should any customer have to search for them? They should be front and center on the home page and every page. There should be "Famous Brands" and "Featured Products" links and lists and icons everywhere you look. All of these hot-button items should be one click away ... not something that customers will have to manually spell out and then hit "Go".

If we haven't put those features and shortcuts in place (but we mostly have, as well as effective facet filtering *), then all of our time and energy should go to fixing that deficiency, not fiddling with the search engine.

Let the search engine do its thing and return lots of branded parts and compatible parts for brands. Parts are hard to find; parts require searching. Parts are where you make your margin. ("Give away the razor; make a bundle on the blades.") Let's not take a chance on breaking that before the real users have had a chance to make their habits known.

* You can generally filter a result set in the thousands down to fewer than 50 items, usually under 25, in about two or three screens.