Modern chatbots can serve as digital agents, providing a new avenue for delivering 24/7 customer service and support across many industries. Their popularity stems from the ability to respond to customer inquiries in real time and handle multiple queries simultaneously in different languages. Chatbots also offer valuable data-driven insights into customer behavior while scaling effortlessly as the user base grows; therefore, they present a cost-effective solution for engaging customers. Chatbots use the advanced natural language capabilities of large language models (LLMs) to respond to customer questions. They can understand conversational language and respond naturally. However, chatbots that merely answer basic questions have limited utility. To become trusted advisors, chatbots need to provide thoughtful, tailored responses. One way to enable more contextual conversations is by linking the chatbot to internal knowledge bases and information systems. Integrating proprietary enterprise data from internal knowledge bases enables chatbots to contextualize their responses to each user’s individual needs and interests. For example, a chatbot could suggest products that match a shopper’s preferences and past purchases, explain details in language adapted to the user’s level of expertise, or provide account support by accessing the customer’s specific records. The ability to intelligently incorporate information, understand natural language, and provide customized replies in a conversational flow allows chatbots to deliver real business value across diverse use cases. The popular architecture pattern of Retrieval Augmented Generation (RAG) is often used to augment user query context and responses. RAG combines the capabilities of LLMs with the grounding in facts and real-world knowledge that comes from retrieving relevant texts and passages from corpus of data. These retrieved texts are then used to inform and ground the output, reducing hallucination and improving relevance. In this post, we illustrate contextually enhancing a chatbot by using Knowledge Bases for Amazon Bedrock, a fully managed serverless service. The Knowledge Bases for Amazon Bedrock integration allows our chatbot to provide more relevant, personalized responses by linking user queries to related information data points. Internally, Amazon Bedrock uses embeddings stored in a vector database to augment user query context at runtime and enable a managed RAG architecture solution. We use the Amazon letters to shareholders dataset to develop this solution. Retrieval Augmented Generation RAG is an approach to natural language generation that incorporates information retrieval into the generation process. RAG architecture involves two key workflows: data preprocessing through ingestion, and text generation using enhanced context. The data ingestion workflow uses LLMs to create embedding vectors that represent semantic meaning of texts. Embeddings are created for documents and user questions. The document embeddings are split into chunks and stored as indexes in a vector database. The text generation workflow then takes a question’s embedding vector and uses it to retrieve the most similar document chunks based on vector similarity. It augments prompts with these relevant chunks to generate an answer using the LLM. For more details, refer to the Primer on Retrieval Augmented Generation, Embeddings, and Vector Databases section in Preview – Connect Foundation Models to Your Company Data Sources with Agents for Amazon Bedrock. The following diagram illustrates the high-level RAG architecture. Although the RAG architecture has many advantages, it involves multiple components, including a database, retrieval mechanism, prompt, and generative model. Managing these interdependent parts can introduce complexities in system development and deployment. The integration of retrieval and generation also requires additional engineering effort and computational resources. Some open source libraries provide wrappers to reduce this overhead; however, changes to libraries can introduce errors and add additional overhead of versioning. Even with open source libraries, significant effort is required to write code, determine optimal chunk size, generate embeddings, and more. This setup work alone can take weeks depending on data volume. Therefore, a managed solution that handles these undifferentiated tasks could streamline and accelerate the process of implementing and managing RAG applications. Knowledge Bases for Amazon Bedrock Knowledge Bases for Amazon Bedrock is a serverless option to build powerful conversational AI systems using RAG. It offers fully managed data ingestion and text generation workflows. For data ingestion, it handles creating, storing, managing, and updating text embeddings of document data in the vector database automatically. It splits the documents into manageable chunks for efficient retrieval. The chunks are then converted to embeddings and written to a vector index, while allowing you to see the source documents when answering a question. For text generation, Amazon Bedrock provides the RetrieveAndGenerate API to create embeddings of user queries, and retrieves relevant chunks from the vector database to generate accurate responses. It also supports source attribution and short-term memory needed for RAG applications. This enables you to focus on your core business applications and removes the undifferentiated heavy lifting. Solution overview The solution presented in this post uses a chatbot created using a Streamlit application and includes the following AWS services: The following diagram is a common solution architecture pattern you can use to integrate any chatbot application to Knowledge Bases for Amazon Bedrock. This architecture includes the following steps: A user interacts with the Streamlit chatbot interface and submits a query in natural language This triggers a Lambda function, which invokes the Knowledge Bases RetrieveAndGenerate API. Internally, Knowledge Bases uses an Amazon Titan embedding model and converts the user query to a vector and finds chunks that are semantically similar to the user query. The user prompt is than augmented with the chunks that are retrieved from the knowledge base. The prompt alongside the additional context is then sent to a LLM for response generation. In this solution, we use Anthropic Claude Instant as our LLM to generate user responses using additional context. Note that this solution is supported in Regions where Anthropic Claude on Amazon Bedrock is available. A contextually relevant response is sent back to the chatbot application and user. Prerequisites Amazon Bedrock users need to request access to foundation models before they are available for use. This is a one-time action and takes less than a minute. For this solution, you’ll need to enable access to the Titan Embeddings G1 – Text and Claude Instant – v1.2 model in Amazon Bedrock. For more information, refer to Model access. Clone the GitHub repo The solution presented in this post is available in the following GitHub repo. You need to clone the GitHub repository to your local machine. Open a terminal window and run the following command. Note this is one single git clone command. git clone –depth 2 –filter=blob:none –no-checkout https://github.com/aws-samples/amazon-bedrock-samples && cd amazon-bedrock-samples && git checkout main rag-solutions/contextual-chatbot-using-knowledgebase Upload your knowledge dataset to Amazon S3 We download the dataset for our knowledge base and upload it into a S3 bucket. This dataset will feed and power knowledge base. Complete the following steps: Navigate to the Annual reports, proxies and shareholder letters data repository and download the last few years of Amazon shareholder letters. On the Amazon S3 console, choose Buckets in the navigation pane. Choose Create bucket. Name the bucket knowledgebase-<your-awsaccount-number>. Leave all other bucket settings as default and choose Create. Navigate to the knowledgebase-<your-awsaccount-number> bucket. Choose Create folder and name it dataset. Leave all other folder settings as default and choose Create. Navigate back to the bucket home and choose Create folder to create a new folder and name it lambdalayer. Leave all other settings as default and choose Create. Navigate to the dataset folder. Upload the annual reports, proxies and shareholder letters dataset files you downloaded earlier to this bucket and choose Upload. Navigate to the lambdalayer folder. Upload the knowledgebase-lambdalayer.zip file available under the /lambda/layer folder in the GitHub repo you cloned earlier and choose Upload. You will use this Lambda layer code later to create the Lambda function. Create a knowledge base In this step, we create a knowledge base using the Amazon shareholder letters dataset we uploaded to our S3 bucket in the previous step. On the Amazon Bedrock console, under Orchestration in the navigation pane, choose Knowledge base. Choose Create knowledge base. In the Knowledge base details section, enter a name and optional description. In the IAM permissions section, select Create and use a new service role and enter a name for the role. Add tags as needed. Choose Next. Leave Data source name as the default name. For S3 URI, choose Browse S3 to choose the S3 bucket knowledgebase-<your-account-number>/dataset/.You need to point to the bucket and dataset folder you created in the previous steps. In the Advanced settings section, leave…
Build a contextual chatbot application using Knowledge Bases for Amazon Bedrock
Sign Up for Our Newsletters
Get notified of the best deals on our WordPress themes.