Coding up the RAG capability of the site – Looking Ahead into the Future of IoT-2
By Patrice Duren / June 27, 2024
We can then initialize the Flask application with the following line of code:
app = Flask(__name__)
We can then initialize the Amazon Bedrock client through the following code. Replace the values for aws_access_key_id and aws_secret_access_key with your account’s values:
print("Initializing Boto3 client...")
bedrock_runtime = boto3.client(
    service_name='bedrock-runtime',
    aws_access_key_id='<YOUR_AWS_ACCESS_KEY_ID_HERE>',
    aws_secret_access_key='<YOUR_AWS_SECRET_ACCESS_KEY_HERE>',
    region_name='us-west-2')
print("Boto3 client has been initialized.")
We can then move ahead and initialize the embeddings and language model:
llm = Bedrock(model_id="anthropic.claude-v2", client=bedrock_runtime, model_kwargs={'max_tokens_to_sample': 200})
bedrock_embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1", client=bedrock_runtime)
With this, we create an instance of a large language model (LLM) using Anthropic's claude-v2 model. This model is designed for tasks that require an understanding of human language, such as conversation, content generation, reasoning, and coding. The max_tokens_to_sample parameter caps the length of the model's responses; in this case, at 200 tokens.
We then create an instance for generating text embeddings using Amazon’s titan-embed-text-v1 model. Text embeddings are vectorized representations of text that can be used for a variety of natural language processing (NLP) tasks, such as similarity search or classification.
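To make similarity search over embeddings concrete, here is a small, self-contained sketch using cosine similarity on toy three-dimensional vectors. Real Titan embeddings have far more dimensions, and the vectors and document labels below are made up purely for illustration:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" standing in for real model output
doc_a = [0.9, 0.1, 0.0]   # e.g., "incident response playbook"
doc_b = [0.8, 0.2, 0.1]   # e.g., "vulnerability response guide"
doc_c = [0.0, 0.1, 0.9]   # e.g., "ransomware payment policy"

print(round(cosine_similarity(doc_a, doc_b), 3))  # → 0.984 (similar topics)
print(round(cosine_similarity(doc_a, doc_c), 3))  # → 0.012 (unrelated topics)
```

Texts about similar topics end up with nearby vectors, which is exactly the property the vector store exploits when matching a query against the indexed documents.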
We then load and preprocess the documents:
print("Preprocessing your files...")
ssl._create_default_https_context = ssl._create_unverified_context
os.makedirs("data", exist_ok=True)
files = [
    "https://www.cisa.gov/sites/default/files/publications/Federal_Government_Cybersecurity_Incident_and_Vulnerability_Response_Playbooks_508C.pdf",
    "https://www.cisa.gov/sites/default/files/2023-10/StopRansomware-Guide-508C-v3_1.pdf",
]
for url in files:
    file_path = os.path.join("data", url.rpartition("/")[2])
    urlretrieve(url, file_path)
print("Files preprocessed.")
We download the files over HTTPS. The code first prints a message indicating that preprocessing has started. It then modifies the SSL context to create an unverified context, meaning SSL certificates will not be verified; this is generally not recommended for security reasons, but may be necessary in environments where certificates cannot be validated. The script then ensures that a directory named data exists, creating it if it does not, defines a list of file URLs, and downloads each file into the data directory with urlretrieve. Finally, it prints a message confirming that the files have been processed.
We then need to perform embedding on the files:
loader = PyPDFDirectoryLoader("./data/")
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
text_to_process = text_splitter.split_documents(documents)
vectorstore_faiss = FAISS.from_documents(text_to_process, bedrock_embeddings)
This code processes the PDF documents for text analysis. It begins by creating a loader that targets the directory where the PDFs are stored and loads them into a format suitable for analysis. The documents are then split into smaller, overlapping chunks of text, a common practice in NLP that makes large documents easier to handle. Finally, the chunks are embedded and indexed with FAISS, a library designed for fast similarity search, setting the stage for retrieving relevant passages based on query inputs. This setup is foundational for systems that aim to extract and utilize knowledge from extensive textual data efficiently.
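To see what the chunk_size and chunk_overlap parameters do, here is a simplified, self-contained sketch of fixed-window splitting with overlap. Note that RecursiveCharacterTextSplitter is more sophisticated: it prefers to split on paragraph and sentence boundaries first, so this is only an approximation of its behavior:

```python
def split_text(text, chunk_size=1000, chunk_overlap=100):
    # Slide a window of chunk_size characters across the text,
    # stepping by (chunk_size - chunk_overlap) so adjacent chunks
    # share context at their boundary.
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

# A 2,500-character dummy document
document = "".join(str(i % 10) for i in range(2500))
chunks = split_text(document)
print([len(c) for c in chunks])                  # → [1000, 1000, 700]
print(chunks[0][-100:] == chunks[1][:100])       # → True (adjacent chunks overlap)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which improves retrieval quality at the cost of some duplicated storage.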
We then start initializing the landing page of the Flask app:
@app.route('/', methods=['GET'])
def home():
    return render_template('query_form.html')
The @app.route decorator registers the URL path (/, the root of the website) with the home function, which is defined immediately below it. When a user navigates to the root URL of the web application using a web browser (which sends a GET request), the home function will be invoked. This function responds by rendering and returning the query_form.html HTML template we created.
We now define a route to handle the queries that we will make:
@app.route('/query', methods=['POST'])
def query():
    try:
        user_query = request.form['query']
        docs = vectorstore_faiss.similarity_search(user_query)
        response = {'results': [doc.page_content for doc in docs]}
        return jsonify(response)
    except Exception as e:
        return jsonify({'error': str(e)}), 500
We define an endpoint in the Flask web application that responds to POST requests at the /query URL path. When this endpoint is hit, typically by a user submitting a form on a web page, it reads the input named query. This input is transformed into an embedding, a numerical vector representation used for ML tasks, and matched against the FAISS index, which is optimized for efficient similarity search.
The application then performs a search to find documents that are like the query embedding. The matched documents are collated into a response object, which is returned as JSON to the requester. If an error occurs during this process, it catches the exception and returns an error message, also in JSON format, with an appropriate HTTP status code. This setup allows for the creation of a web-based interface where users can search through indexed documents by submitting queries and receiving relevant document content as a response.
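To see the shape of this request/response cycle without the Bedrock and FAISS dependencies, here is a minimal, self-contained sketch of the endpoint exercised with Flask's built-in test client. The stubbed results list stands in for the real similarity search, so the route names and response keys here are illustrative rather than the site's exact ones:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route('/query', methods=['POST'])
def query():
    try:
        user_query = request.form['query']
        # In the real app, vectorstore_faiss.similarity_search(user_query)
        # would run here; we stub the results for illustration.
        results = [f"Matched document for: {user_query}"]
        return jsonify({'results': results})
    except Exception as e:
        # Any failure is reported back as JSON with a 500 status code
        return jsonify({'error': str(e)}), 500

# Exercise the endpoint without starting a server
with app.test_client() as client:
    resp = client.post('/query', data={'query': 'ransomware'})
    print(resp.get_json())  # → {'results': ['Matched document for: ransomware']}
```

The test client sends the same form-encoded POST a browser would, which makes it a convenient way to check the endpoint's happy path and error handling before wiring in the real vector store.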
Finally, we set up the host and port for the Flask application to run on when it is started:
if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
With this, you’ve coded up the appropriate RAG capability that is needed for the site. Now, we can move ahead and test that the application is working without any issues.