How we used GenAI to make sense of the government
Feb 26, 2024
Rushil Daya
We built a RAG chatbot with AWS Bedrock and GPT-4 to answer questions about the Flemish government.
In 2024 we are flying into a perfect storm: 49% of the world’s population is headed to the polls (the largest voting year in history) at the same time that GenAI-generated misinformation is becoming harder and harder to spot. Doubtless 2024 is going to be a year of big change. The one thing I think won’t change is that our smooth monkey brains are incapable of making wise decisions when overwhelmed by (mis)information.
But — and there is always a but…
GenAI + politics doesn’t have to equal disaster. As the saying goes: “The only way to stop a bad guy with an AI bot is a good guy with an AI bot”. With this in mind, we wanted to see if we could use GenAI to make sense of the government and empower everyday citizens to understand the decisions made by their government.
Let’s get specific 🇧🇪
Living in Flanders, our attention was of course focused on the Flemish government, which thankfully is very transparent when it comes to sharing information: all of its decisions are publicly available online. While this is a useful resource, it can quickly become overwhelming and hard to make sense of. We believed that by leveraging GenAI we could add an interpretation and summarisation layer on top of this resource, making it more accessible and useful for the general public.
Introducing the RegeringsRobot* 🤖
After much brainstorming and deliberation, it became clear that what we needed was a chatbot capable of answering a user’s questions about the government, focusing in particular on the decisions made by the Flemish government over the last election cycle. The chatbot needed to be capable of uncovering relevant government decisions and synthesizing these into a single easy-to-digest response.
I will get into the technical aspects of how we implemented our bot in the next section, but before doing that let me first introduce you to what we built: the Regeringsrobot. I encourage you to get hands-on and ask your burning questions.
All of the source code for the bot is available on our GitHub. This project is an experiment, so all feedback is welcome. You can share your thoughts with us via the website or on GitHub.
Our implementation 🛠
Now it is time to get a bit technical… if you are allergic to nerd speak, I suggest skipping ahead to the next section.
Our goal of summarising and interpreting text naturally leads us to Large Language Models (LLMs), and since we also want to provide factual, reference-able information, the go-to technique is retrieval augmented generation (RAG). There are many deep dives on RAG available online, so to keep it brief: RAG is when you provide the specific pieces of context deemed semantically relevant to the user’s query directly in the prompt you pass to the LLM. A simple RAG prompt combines an instruction to answer from the supplied context, the retrieved context itself, and the user’s question.
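To make that concrete, here is a minimal sketch of how such a prompt could be assembled. This is an illustration only and not the exact prompt used by the Regeringsrobot: the function name `build_rag_prompt`, the instruction wording, and the `[1]`, `[2]` numbering scheme are all assumptions.

```python
def build_rag_prompt(question: str, contexts: list[str]) -> str:
    """Assemble a simple RAG prompt: instruction, numbered context, question.

    Hypothetical helper for illustration; the wording of the instruction
    is an assumption, not the prompt used in the original project.
    """
    # Number each retrieved passage so the model can refer back to it.
    numbered = "\n\n".join(
        f"[{i}] {text}" for i, text in enumerate(contexts, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

Numbering the passages is a small design choice that pays off later: it gives the model a compact way to say which pieces of context it actually relied on.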
The retrieval part of RAG, i.e. getting the pieces of semantically relevant context, is typically done by storing vector embeddings of chunks of the source data in a vector database. When a question comes in, we compute its vector embedding and retrieve the pieces of context closest to it in the vector space.
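The idea of "closest in the vector space" can be sketched in a few lines of plain Python. The three-dimensional vectors below are made-up toy values standing in for real embeddings, cosine similarity is assumed as the distance measure, and a real vector database would use an index rather than sorting every chunk.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], chunks: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the texts of the k chunks most similar to the query vector."""
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, chunk[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Toy "embeddings": invented values, not the output of a real model.
chunks = [
    ("decision about education", [0.9, 0.1, 0.0]),
    ("decision about mobility", [0.1, 0.9, 0.0]),
    ("decision about energy", [0.0, 0.2, 0.9]),
]
query = [0.8, 0.2, 0.0]  # pretend embedding of the user's question
closest = top_k(query, chunks, k=2)
```

With these toy vectors, the education chunk ranks first and the mobility chunk second, because the query vector points in almost the same direction as the education vector.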
Knowing that we wanted to take a RAG approach, we needed to implement a system capable of serving users in a scalable, cost-controlled, and ideally serverless fashion. Below I will give a high-level overview of the system we implemented.
Data collection, cleaning, and storage: We built a Python web scraper to gather information from the government decisions website. This content was then cleaned and each decision was stored in a .txt file within an S3 bucket.
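As a rough sketch of the cleaning and storage step, assuming the scraped pages are HTML and that boto3 is used for the upload: the key layout `decisions/<id>.txt`, the helper names, and the very naive tag stripping are illustrative assumptions, not the project's actual scraper. The `s3_client` argument is expected to be something like `boto3.client("s3")`.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text content of an HTML document, dropping the tags."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def clean_html(html: str) -> str:
    """Strip tags and collapse whitespace (naive: keeps script/style text)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def store_decision(s3_client, bucket: str, decision_id: str, html: str) -> str:
    """Upload one cleaned decision as a .txt object and return its key.

    `s3_client` is assumed to be a boto3 S3 client; the key layout is a
    hypothetical choice for this example.
    """
    key = f"decisions/{decision_id}.txt"
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=clean_html(html).encode("utf-8"),
    )
    return key
```

Keeping one decision per object makes it straightforward to trace a retrieved passage back to a single source document later on.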
Vector database: Once our data is stored, it is loaded into a vector database. We opted for a knowledge base on AWS Bedrock. The knowledge base is in essence an abstraction over a vector database and an embedding function. Using this approach provides several benefits:
Easy to use: a knowledge base can be set up in a matter of minutes. All you are required to do is provide an S3 prefix to specify where your data is and select your embedding function. You do not need to manage the vector database yourself.
Plug & play embedding functions: Bedrock attempts to provide a unified platform where you can easily switch between foundation models, and this also applies to embedding functions. As our dataset is almost entirely in Dutch, we opted for a multilingual embedding function from Cohere.
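Once a knowledge base exists, querying it is a single API call. The sketch below assumes boto3's `bedrock-agent-runtime` client and its `retrieve` operation; the wrapper name `retrieve_decisions`, the default of 5 results, and the shape of the returned dictionaries are choices made for this example rather than the project's actual code.

```python
def retrieve_decisions(client, knowledge_base_id: str, question: str, n_results: int = 5) -> list[dict]:
    """Fetch the n_results chunks closest to the question from a knowledge base.

    `client` is assumed to be boto3.client("bedrock-agent-runtime").
    Bedrock embeds the question and runs the vector search for us.
    """
    response = client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": question},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": n_results}
        },
    )
    # Keep only what the rest of the pipeline needs: the chunk text,
    # the S3 object it came from, and the similarity score.
    return [
        {
            "text": result["content"]["text"],
            "s3_uri": result["location"]["s3Location"]["uri"],
            "score": result.get("score"),
        }
        for result in response["retrievalResults"]
    ]
```

Passing the client in as an argument, instead of creating it inside the function, keeps the function easy to test with a stub and lets a Lambda function reuse one client across invocations.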
Client-side initiated RAG flow: We built a single-page application hosted on Cloudflare Pages which provides a chat interface for users to initiate a RAG interaction by making a call to an API Gateway endpoint. To control cost without imposing a restrictive login procedure, we take advantage of API Gateway usage plans, which allow us to set usage quotas and rate limits on our endpoint. The core logic of our system runs within a Lambda function which performs the following 3 steps:
Retrieval: unsurprisingly, the first step of the RAG pipeline is retrieval. We retrieve the closest N decisions from the Bedrock knowledge base.
Generation: we use GPT-4 to perform the generation, with our Lambda function making calls to the OpenAI API.
Updating references: lastly, we need to perform some post-processing. As there is no guarantee that the LLM will use all of the context provided, we explicitly prompt the model to indicate which references it used. This allows us to reduce the list of retrieved contexts we pass back to the user to only those specifically cited in the response. Additionally, Bedrock returns references to S3 objects, so to be useful to the end user we also need to convert these S3 object links into links to the official government website.
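The reference post-processing in the last step can be sketched as follows. Several things here are assumptions made for illustration: that the model cites sources with bracketed numbers like `[1]`, that each S3 object is named after a decision identifier, and that a public link can be built by appending that identifier to a base URL. The `BASE_URL` below is a placeholder, not the real government website.

```python
import re
from urllib.parse import urlparse

# Placeholder base URL, purely illustrative.
BASE_URL = "https://example.org/decisions/"


def cited_indices(answer: str) -> set[int]:
    """Find every bracketed citation number, e.g. [2], in the model's answer."""
    return {int(match) for match in re.findall(r"\[(\d+)\]", answer)}


def s3_uri_to_public_url(s3_uri: str, base_url: str = BASE_URL) -> str:
    """Map s3://bucket/path/<id>.txt to <base_url><id> (assumed naming scheme)."""
    key = urlparse(s3_uri).path  # e.g. "/decisions/abc123.txt"
    decision_id = key.rsplit("/", 1)[-1].removesuffix(".txt")
    return base_url + decision_id


def filter_references(answer: str, retrieved: list[dict]) -> list[dict]:
    """Keep only the retrieved contexts that the answer actually cites."""
    used = cited_indices(answer)
    return [
        {"index": i, "url": s3_uri_to_public_url(item["s3_uri"])}
        for i, item in enumerate(retrieved, start=1)
        if i in used
    ]
```

For example, if three decisions were retrieved but the answer only cites `[1]` and `[3]`, `filter_references` returns two entries, and the second retrieved decision is not shown to the user.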
Why not Bedrock all the way? AWS Bedrock is marketed as a one-stop shop for all of your GenAI needs, so it may seem strange that we are only using Bedrock for our knowledge base and not for the generation step. The reason is sadly not very exciting: at the time we implemented the Regeringsrobot, we found Bedrock’s generation functionality to still be buggy. I look forward to revisiting this architecture once the Bedrock platform matures.
My key takeaways (did we save the world?) 🤔
The Regeringsrobot is capable of answering questions for which it has context in an insightful and reference-able manner. I believe we were successful in proving that GenAI is capable of adding value in this domain.
But — and there is always a but…
Using GenAI to answer questions in the political space is difficult for many reasons, not least because the person asking the question already has a political viewpoint which shapes how they judge the “correctness” of any response given to them. However, we believe that by taking the RAG approach and making sure we can always reference the claims made by the AI, it is possible to build a system capable of helping people navigate a political landscape filled with an overabundance of information.
The system we built is a tangible showcase of the potential of GenAI within the political environment, and I believe it is a step in the right direction. However, as promising as this step is, we should not underestimate the challenges that lie ahead. What do you do when the government data itself cannot be trusted? What if reporting is biased or not transparent?
Lastly, I think a big challenge would be the tendency of a RAG-based system to put greater emphasis on the actions the government did take than on those it didn’t. Actions that were not taken tend to fly under the radar, and as such a RAG-based system would have a difficult time evaluating the opportunity costs of the decisions not taken by the government. A solution to this may be to incorporate into our knowledge base content from outside of the government itself which provides a critique of the government’s performance. Doing this in a manner that does not introduce biased information is where the challenge would lie.
Final Thoughts 👋
The combination of GenAI with politics has (understandably) a bad reputation and it can be easy to regard this entire domain as a swamp of misinformation and stay away from it completely. However, I believe this is the wrong approach and risks ceding too much of the space to bad actors — as the deluge of misinformation becomes more sophisticated we need to make sure that the tools and techniques we use to combat it follow suit.
Challenges lie ahead, but haven’t they always? The important thing for us is that we continue to find solutions.
* Regering is the Flemish word for government, and robot is the Flemish word for robot. Coming up with overly creative names is not our forte.