  • Manu KT

E-commerce Search – Make & Bake

Updated: Aug 27, 2021

E-commerce search has become the default driving force for acquiring new customers. Below are some of the reasons why search is a critical part of e-commerce:

  • People who search are more likely to buy on the site

  • The better the search experience, the higher the conversion rate

  • The autocomplete feature of search improves conversion rates

  • Recommendations play an important role when no search results are found for the user search query

A PowerReviews survey of 1,000 US consumers found that Amazon is the preferred starting point for product search. Google comes in a close second, followed by brand/retailer sites and other e-commerce marketplaces (eBay, Etsy, etc.):

  • Amazon — 38 percent

  • Google — 35 percent

  • Brand or Retailer Site — 21 percent

  • Other eCommerce Marketplaces — 6 percent

Do not try to overbuild the system

Current market trends show that the technologies listed below are where every software developer would like to spend most of their time, inventing new things and building their profile and experience:

  1. AR & VR

  2. Machine Learning

  3. Humanized Big Data

  4. Physical-Digital Integrations

  5. On-Demand services

Though all of the above technologies play a pivotal role in building e-commerce and search systems that are customer-centric and handle changing data, the builder should never overbuild the system by bringing in the latest technologies in the early stages…

Analyze, Learn & Grow

A careful study of user behavior and of the types of queries sent to the system gives a lot of insight for gradually tuning the system so that it responds to customer queries with results that are as relevant as possible.

An e-commerce search engine is ‘good’ when its results are relevant to the customer's query, not merely when the spell-correction algorithm behind the scenes works well.

Here is an example of the comparison between search engines for a search term: “pamper shop”

Beware of attacks from bots/crawlers

More and more consumers are switching from the traditional shopping model to e-commerce, either to save the time spent shopping or because of the throwaway discounts offered by big e-commerce businesses. As a result, bot-based attacks have become common, for multiple reasons:

  • Malicious intent

Bots can access your site and place orders. Orders placed by bots are fake but appear legitimate. This reduces the inventory available to genuine customers and damages the brand. Ultimately, the owner of the site incurs huge losses.

  • Data scrapers

E-commerce businesses have a lot of competitors, and third-party tools can be used to scrape the metadata of the products in your catalog. Competitors can use information such as your product prices and stock levels to undercut your prices, gaining an unfair advantage and hitting their targeted GMV during sale periods or on normal sales days.

  • Bring down systems by crawling through the entire list of the product catalog

All e-commerce websites, when searched with certain terms, return a long list of results that can be paginated. Some sites limit the number of pages a user can paginate through and others do not. The ones that do not are affected, because paginating across millions of products is hard work for any system, even one with a lot of resources at its disposal, and will eventually bring the system to its knees.

  • Wild card queries / Long query issues

A search engine that supports queries like ‘*’, ‘*:*’, ‘a*’, or a maliciously long query will be under a lot of pressure to serve results, because the matching documents will be the entire catalog or close to it. This is a huge setback for the business, especially during big sale days. Instead of serving customers fast, the system works unnecessarily to dig out entire document sets for users who have no intent to purchase.

  • Paid affiliate partners to boost your revenue – does it really work?

Paid partners can boost the traffic but can they really boost the sales? How many affiliates and types of affiliates can grow your revenue? Is the paid traffic really doing good for your business or is it a setback for a system already overloaded?

These are some of the questions any e-commerce business should answer to manage paid partner traffic. Typically, partners are allowed to crawl the catalog data and allowed to divert traffic from different sites. With no control mechanisms built into the system, sales will suffer.
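The pagination and wildcard problems described above can be reduced by validating every request before it reaches the search engine. The following is a minimal sketch, not a complete bot defense: the limits (`MAX_QUERY_CHARS`, `MAX_PAGE`, `PAGE_SIZE`) and the `SearchRequest` type are hypothetical and should be tuned against real traffic.

```python
import re
from dataclasses import dataclass

# Hypothetical limits; tune them against real traffic and catalog size.
MAX_QUERY_CHARS = 128
MAX_PAGE = 50
PAGE_SIZE = 24

# Queries made only of wildcards, colons and whitespace: "*", "*:*", "*: *".
_WILDCARD_ONLY = re.compile(r"^[\s*?:]+$")
# A very short prefix followed by a wildcard, such as "a*".
_SHORT_PREFIX = re.compile(r"^\w{1,2}\*$")


@dataclass
class SearchRequest:
    query: str
    page: int = 1


def guard(request: SearchRequest) -> SearchRequest:
    """Return a safe copy of the request, or raise ValueError if it is rejected."""
    query = request.query.strip()
    if not query:
        raise ValueError("empty query")
    if len(query) > MAX_QUERY_CHARS:
        raise ValueError("query too long")
    if _WILDCARD_ONLY.match(query) or _SHORT_PREFIX.match(query):
        raise ValueError("match-all or near match-all wildcard query")
    # Clamp deep pagination instead of walking the whole catalog.
    page = min(max(request.page, 1), MAX_PAGE)
    return SearchRequest(query=query, page=page)


def start_offset(request: SearchRequest) -> int:
    """Offset of the first result for an already guarded request."""
    return (request.page - 1) * PAGE_SIZE
```

Clamping the page number rather than rejecting the request keeps genuine users who click "last page" from seeing an error, while a crawler can never walk past the cap.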


Facets are an integral part of any e-commerce site and are served by the search engine. When the product catalog contains specific, high-quality information, search with facets has been shown to improve conversion rates.

A user cannot provide all the information about the product they are looking for in the search query. A search query is therefore usually much broader than the actual need, and facets help drill down into the results.

The faceting time increases with high cardinality. In Solr, there are different ways in which a faceting operation can be achieved:

  1. Normal faceting

  2. JSON faceting

JSON facets on fields with docValues enabled have proven to be faster. However, on fields with a large number of unique values, they are still slow. One has to decide which type of faceting is useful in each situation, based on system load and capacity.
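As an illustration, a JSON facet request body can be built as shown below. This is a sketch under assumptions: the field names `brand` and `rom` are hypothetical, and the body is meant to be posted to the collection's `/query` endpoint of Solr's JSON Request API. Capping `limit` is one simple way to bound the work done on high-cardinality fields.

```python
import json


def build_facet_request(query, facet_fields, facet_limit=10, rows=24):
    """Build a Solr JSON Request API body with one terms facet per field."""
    facets = {
        field: {
            "type": "terms",       # bucket documents by field value
            "field": field,
            "limit": facet_limit,  # cap the number of buckets returned
            "mincount": 1,         # skip values with no matching documents
        }
        for field in facet_fields
    }
    return {"query": query, "limit": rows, "facet": facets}


# "brand" and "rom" are hypothetical field names used only for illustration.
body = build_facet_request("samsung", ["brand", "rom"])
print(json.dumps(body, indent=2))
```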


A user's search query expresses the intent to buy a specific product. For machines to understand that intent, however, appropriate algorithms must be in place so that the results shown are as relevant as possible and close to what the user meant.

A user searching for Samsung s8 64 GB should be shown results that have the exact specification, with the ROM facet value 64 GB selected. If the search is for a phone cover for the Samsung s8, then cases for the s8 should be listed on top. The intent mining system needs to build different types of dictionaries that help uncover such intent from the query.

If a query is very broad (e.g. Samsung) and can match multiple categories of data, a decider algorithm should be in place to resolve the ambiguity and boost the appropriate results to the top. The history of click-through data helps to boost results and categories while discovering intent.
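A dictionary-based approach to the examples above might look like the sketch below. Everything in it is an assumption made for illustration: the dictionaries, the category names, and the click counts are made-up placeholders, and a real system would build them from the catalog and from click-through logs.

```python
import re

# Hypothetical dictionaries and click counts, for illustration only.
BRANDS = {"samsung", "apple", "nokia"}
ACCESSORY_TERMS = {"cover", "case", "casing"}
STORAGE = re.compile(r"\b(\d+)\s*gb\b")
CATEGORY_CLICKS = {"samsung": {"phones": 900, "televisions": 300}}


def mine_intent(query: str) -> dict:
    """Extract a category guess and facet filters from the raw query."""
    text = query.lower()
    tokens = text.split()
    intent = {"category": None, "filters": {}}

    brand = next((t for t in tokens if t in BRANDS), None)
    if brand:
        intent["filters"]["brand"] = brand

    storage = STORAGE.search(text)
    if storage:
        intent["filters"]["rom"] = f"{storage.group(1)} GB"

    # Accessory words win over specifications: "cover of s8" is not a phone.
    if any(t in ACCESSORY_TERMS for t in tokens):
        intent["category"] = "phone accessories"
    elif storage:
        intent["category"] = "phones"
    return intent


def decide_category(query: str, intent: dict):
    """For broad queries, fall back to the most-clicked category."""
    if intent["category"]:
        return intent["category"]
    clicks = CATEGORY_CLICKS.get(query.lower().strip(), {})
    return max(clicks, key=clicks.get) if clicks else None
```

With these placeholder dictionaries, `mine_intent("Samsung s8 64 GB")` selects the `rom` filter `64 GB`, while a bare `Samsung` query has no category until `decide_category` consults the click history.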

Near real-time Update of Data

E-commerce businesses are more real-time in nature than traditional businesses. Considering the kind of discounts offered and the big sale days, information such as price, stock, discount, and shipping should be updated in real time, and the updated data should also become visible to users in real time.

Pushing updates for the large number of item SKUs in the catalog requires a lot of processing. User searches are affected if updates are pushed and made searchable too frequently. In Solr terms, if the automatic soft commit runs too often, each commit opens a new searcher, and CPU cycles are spent warming its caches, which reduces throughput. An optimal value has to be chosen for the soft commit interval. If the search system is both index-heavy and read-heavy and you need near real-time views of the data, keeping it at around 30-60 seconds would be optimal.
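The soft commit interval is normally set through `autoSoftCommit` in `solrconfig.xml`. As a hedged sketch, the snippet below builds the equivalent payload for Solr's Config API instead. It assumes that the Config API is available in your Solr version and that `updateHandler.autoSoftCommit.maxTime` is an editable property there; the bounds check is an arbitrary safety net, not a Solr requirement.

```python
import json


def soft_commit_payload(seconds: int) -> str:
    """Build a Config API body that sets the auto soft commit interval."""
    # Arbitrary sanity bounds for this sketch, not limits imposed by Solr.
    if not 1 <= seconds <= 600:
        raise ValueError("soft commit interval out of expected range")
    # Solr expects maxTime in milliseconds.
    return json.dumps(
        {"set-property": {"updateHandler.autoSoftCommit.maxTime": seconds * 1000}}
    )


# 30 seconds, the lower end of the range suggested above.
print(soft_commit_payload(30))
```

The resulting JSON would be posted to the collection's `/config` endpoint; verify the property name against the reference guide for your Solr version before relying on it.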


As a starting point, we have tried to point out the problems we have faced and addressed, and we will share more such insights and learnings in upcoming write-ups. Read our blog on Understanding the basics of the Jenkins Pipeline.
