How to Build a FAQ Bot With Pre-Trained BERT and Elasticsearch
In this tutorial, we will demonstrate a simple way to create a FAQ bot that matches user questions to pre-defined FAQs using Sentence-BERT and dense vector search in Elasticsearch, with concrete code examples. The solution is fast, accurate, and scalable enough for a production environment.
Chatbots have emerged as one of the most popular interfaces as NLP techniques have improved. Within this big family, a FAQ bot is usually designed to handle domain-specific question answering over a list of pre-defined question-answer pairs. From a machine learning standpoint, the problem can be translated as “find the question in the database most similar to the user’s question”, also known as Semantic Question Matching. In this post, we walk through a step-by-step tutorial for building a FAQ bot interface using Sentence-BERT and Elasticsearch.
There are generally two directions for solving semantic question matching. One is the information retrieval approach, which treats the pre-defined FAQs as documents and the user’s question as the query. The advantage of this search-based approach is that it is generally more efficient and scalable: running the system on 100 FAQs vs. 1 million FAQs usually makes little difference in inference time (though this still depends on the algorithm; BM25, for example, will be much faster than a ranking model). The other approach is to use a classification model that takes the user question and a candidate FAQ as a question pair and classifies whether they are the same question. This method requires running the classification model over every candidate FAQ paired with the user question to find the most similar one. Compared with the first approach, it can be more accurate but consumes more compute at inference time. In the rest of this post, we focus on the search-based method, which computes semantic embeddings for both the user question and the FAQs and selects the best match by cosine similarity, as sketched below.
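To make the search-based flow concrete, here is a minimal sketch assuming some embed function that maps a piece of text to a fixed-size vector (the function names are illustrative; later in the post the embedding comes from Sentence-BERT, and the index lookup is handled by Elasticsearch):

import numpy as np

# Search-based matching: FAQ embeddings are computed once and stored,
# so each incoming user question only needs a single encoding pass.
def build_index(faq_questions, embed):
    return np.stack([embed(q) for q in faq_questions])  # (num_faqs, dim)

def best_match(user_question, faq_questions, faq_embeddings, embed):
    query = embed(user_question)
    # Cosine similarity between the query and every FAQ embedding
    scores = faq_embeddings @ query / (
        np.linalg.norm(faq_embeddings, axis=1) * np.linalg.norm(query)
    )
    best = int(np.argmax(scores))
    return faq_questions[best], float(scores[best])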
Sentence-BERT for Question Embedding
BERT-style models have achieved SOTA performance on various NLP tasks1. However, BERT’s token-level embeddings cannot be turned directly into a good sentence embedding: simply averaging the token embeddings or using the [CLS] vector turns out to perform poorly on textual similarity tasks.
Sentence-BERT (SBERT)2 improves BERT sentence embeddings by fine-tuning the BERT model with the Siamese network structure shown in Figure 1. The model takes a pair of sentences as one training data point. Each sentence goes through the same BERT encoder to generate token-level embeddings, and a pooling layer is added on top to create a sentence-level embedding. The final loss function is based on the cosine similarity between the embeddings of the two sentences.
Figure 1: Sentence-BERT (SBERT) with Siamese architecture
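As a rough illustration of the pooling step, here is a minimal numpy sketch of mean pooling, the default pooling used by the released SBERT models (array shapes and names here are illustrative, for a single sentence):

import numpy as np

def mean_pooling(token_embeddings, attention_mask):
    # token_embeddings: (seq_len, hidden_dim) output of the BERT encoder
    # attention_mask:   (seq_len,) with 1 for real tokens and 0 for padding
    mask = attention_mask[:, None].astype(float)       # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)     # sum over real tokens only
    count = np.clip(mask.sum(), 1e-9, None)            # avoid division by zero
    return summed / count                              # sentence embedding (hidden_dim,)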
According to the SBERT paper, fine-tuned SBERT significantly outperforms various baselines such as averaged GloVe3 or BERT token embeddings in terms of the Spearman rank correlation of sentence-embedding similarity on textual similarity data sets.
The author team has also released a Python package called “sentence-transformers”, which allows us to easily embed sentences with SBERT and fine-tune the model through a PyTorch interface. Following the GitHub link (https://github.com/UKPLab/sentence-transformers), we can install the package with pip (pip install -U sentence-transformers) and embed our example questions in a few lines:
from sentence_transformers import SentenceTransformer

# Load a pre-trained SBERT model (BERT-base fine-tuned on NLI with mean pooling)
sentence_transformer = SentenceTransformer("bert-base-nli-mean-tokens")

questions = [
    "How do I improve my English speaking?",
    "How does the ban on 500 and 1000 rupee notes helps to identify black money?",
    "What should I do to earn money online?",
    "How can changing 500 and 1000 rupee notes end the black money in India?",
    "How do I improve my English language?",
]

# Encode all questions into fixed-size sentence embeddings (one 768-dim vector each)
question_embeddings = sentence_transformer.encode(questions)
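With those embeddings in hand, we can already run a small sanity check before bringing Elasticsearch into the picture. The sketch below scores an illustrative query (the user_question string is made up for this example) against the encoded questions by cosine similarity:

import numpy as np

user_question = "How can I get better at speaking English?"  # illustrative query
query_embedding = sentence_transformer.encode([user_question])[0]

# Cosine similarity between the query embedding and every question embedding
scores = np.array([
    np.dot(query_embedding, q) / (np.linalg.norm(query_embedding) * np.linalg.norm(q))
    for q in question_embeddings
])
best_idx = int(np.argmax(scores))
print(questions[best_idx], scores[best_idx])
# The English-speaking questions should come out on top.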