Build a Slack AI app powered by Amazon Bedrock
Introduction
I've always found the idea of ChatOps alluring. With the right people building the ChatOps functionality, it can be a very powerful tool for helping teams get things done. However, many ChatOps tools were unable to understand intent from natural language, required hardcoded logic for user actions, and faced other limitations.
With LLMs and Gen AI, I think ChatOps is worth revisiting. LLMs are fantastic at intent recognition. And depending on your risk tolerance, the hardcoded logic problem can be solved by letting a foundation model loose on a problem (ideally in a secure sandbox), giving it access to tools via MCP, and just sitting back and watching magic happen.
It is also a particularly good time to work on this, as Slack recently released support for developing apps with AI features. These AI-powered apps let users open them in a dedicated pane, which means they can work with them side by side with their other open channels. They also follow the user around as their context changes and can take conversation history into account.
I'd like to explore what LLMs can offer ChatOps by building an AI agent to help with security automation. But this post is not about that. Before going all in, I wanted to see how easily I could hook up a Slack app to Amazon Bedrock. Turns out, it's pretty straightforward. If you're a CloudFormation nerd, check out Deploy a Slack gateway for Amazon Bedrock. But since Terraform is my jam, I borrowed some ideas from that post to create slackbot-lambdalith.
What we're building
Because it's Labor Day weekend, I'm too lazy to draw up a proper architecture diagram. Instead, I'll use an image of the detailed trace view of our Slack AI app.
We'll build a Lambda function that receives Slack requests, calls out to Amazon Bedrock, and returns useful responses to the Slack user. Along the way, it uses DynamoDB to store message deduplication information. Sidebar: if anyone knows an AI app I can use to create architecture diagrams in the style of draw.io with just prompting, please ping me!
Setting up the infrastructure
Slackbot-lambdalith lets you set up a Slack AI app where all the functionality lives in a single Lambda function. This design choice is deliberate, as it makes prototyping quicker.
You will need to apply the Terraform twice: the first time to generate the manifest.json you can use to set up the Slack app, and the second time to wire everything up with the required credentials. See the setup guide for detailed instructions.
module "slack_bot" {
source = "mbuotidem/slackbot-lambdalith/aws"
slack_bot_token = var.slack_bot_token
slack_signing_secret = var.slack_signing_secret
# Optional: Customize your Slack app manifest
slack_app_name = "Qofiy"
slack_app_description = "A custom bot built with Terraform and AWS Lambda"
slack_slash_command = "/slash-command"
slack_slash_command_description = "Executes my custom command"
lambda_function_name = "qofiy"
lambda_source_path = "./lambda"
lambda_source_type = "directory"
bedrock_model_id = "anthropic.claude-3-5-haiku-20241022-v1:0"
bedrock_model_inference_profile = "us.anthropic.claude-3-5-haiku-20241022-v1:0"
lambda_env_vars = {
"BEDROCK_MODEL_INFERENCE_PROFILE" = "us.anthropic.claude-3-5-haiku-20241022-v1:0"
}
use_function_url = true
enable_application_signals = true
tags = {
Environment = "production"
Project = "slack-bot"
}
}
I set up the module to deploy a Lambda function URL with use_function_url. While slackbot-lambdalith supports API Gateway, I went with the function URL option for simplicity. If you're in an enterprise setting with strict security requirements around externally exposed resources, you should consider the API Gateway option.
And because we're building a chat app, keeping an eye on latency and overall performance is key to a good user experience, so we set up Application Signals with enable_application_signals.
As you can see in the image above, enabling Application Signals gives us automatic instrumentation. This means we can interrogate our app's performance with traces that include critical response-time metrics. More on this later.
Our Lambda function
Here's how our lambdalith is set up. It's loosely modeled after the bolt-python-assistant-template provided by Slack.
16:39:45 ~/slackbot/lambda
$ tree
.
├── index.py
├── listeners
│   ├── __init__.py
│   ├── assistant.py
│   └── llm_caller.py
├── requirements.txt
└── utils
    ├── __init__.py
    └── deduplication.py
3 directories, 9 files
Entrypoint
Our entry point is index.py, where we have our Lambda function handler. Slack needs a response within 3 seconds, and normally you'd send a quick HTTP 200 OK and then do the work. But with Lambda, that's complicated because returning a response effectively terminates execution. To get around this, we use the Slack Bolt SDK's lazy listeners. These work by acknowledging the request right away, then invoking another instance of the same Lambda function asynchronously to perform the required task. The key setting that enables this behavior is process_before_response=True. A simplified sketch of index.py looks like this:
# Simplified sketch of index.py; the listener import path is assumed from the tree above
import logging

from slack_bolt import App
from slack_bolt.adapter.aws_lambda import SlackRequestHandler

from listeners.assistant import assistant

# process_before_response must be True when running on FaaS
# See https://tools.slack.dev/bolt-python/concepts/lazy-listeners/
app = App(process_before_response=True)
app.use(assistant)

SlackRequestHandler.clear_all_log_handlers()
logging.basicConfig(format="%(asctime)s %(message)s", level=logging.INFO)


def handler(event, context):
    return SlackRequestHandler(app=app).handle(event, context)
Listeners
The Slack Bolt SDK provides the convenient Assistant class. It handles the assistant_thread_started, assistant_thread_context_changed, and message.im events. For our purposes, we don't need to worry about the context changed event, but we'll be using the other two.
@assistant.thread_started
This is invoked when your user opens an assistant thread. You can use it, as we do here, to say something nice and helpful, or you could use it to seed the first interaction with suggested prompts.
# Sketch of the thread_started listener; the greeting text and helper imports (utils.deduplication) are assumptions
@assistant.thread_started
def start_assistant_thread(say, body, logger):
    thread = body["event"]["assistant_thread"]
    channel_id, thread_ts = thread["channel_id"], thread["thread_ts"]
    # Check for duplicate thread start using channel_id and thread_ts
    if is_duplicate_message(channel_id, thread_ts):
        return
    # Mark thread start as processed
    mark_message_processed(channel_id, thread_ts)
    say(":wave: Hi, how can I help you today?")
To ensure we don't respond to the same message twice when the app is invoked from a cold start, we implement DynamoDB-based message deduplication in our calls to is_duplicate_message and mark_message_processed. This allows us to track processed messages across all our Lambda invocations. We also use DynamoDB's native TTL feature to remove stale deduplication markers.
@assistant.user_message
This is invoked when the user replies in the assistant thread. We use the aforementioned lazy listener functionality here to handle the call to Bedrock with @assistant.user_message(lazy=[process_message_lazily]).
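Wired up, the registration might look something like this; the handler name and status text are assumptions, while the decorator itself comes straight from above:

# Sketch of the listener registration; process_message_lazily must be defined
# earlier in the module (it's shown in full below)
@assistant.user_message(lazy=[process_message_lazily])
def respond_to_user_message(set_status, logger):
    # Runs inside Slack's 3-second window: just acknowledge and show a typing indicator
    set_status("is typing...")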
It's also recommended to send the user an indicator that their message was received and is being acted on. We do that above with set_status. Now let's take a deeper look at process_message_lazily.
"""Process the message, call Bedrock, and send a reply."""
=
=
=
=
# Check for duplicate message
return
# Mark message as being processed
return
=
: =
=
=
Once again, we perform deduplication to ensure we handle multiple invocations due to cold starts gracefully.
Notice also how we use the conversations.replies method to gain access to a bit of conversation history to pass on to the model. This allows us to mimic the sensation of conversational continuity even though every model call is a new invocation.
To ensure the model knows who is who, we cycle through the messages and clearly delineate which message was from the user and which was a reply from the assistant. Then we invoke the model, passing in this list of attributed messages.
Calling Bedrock
If this is your first time using Amazon Bedrock, make sure to request model access. In our case, we're using a Claude model, which requires filling in some information regarding your desired use case. The whole process gives one the impression that you have to wait for a human to review your request, but in my experience, approval comes pretty quickly, so it's likely automated.
Our prompt
Your prompt can make a difference in how the model responds. Our prompt for this exploration is relatively simple since we just want to get things working; the placeholder below gives you the idea. If you'd like to learn more tricks of the trade, Anthropic has a whole course on prompt engineering.
SYSTEM_PROMPT = "You are a friendly Slack assistant. Keep responses concise and helpful."  # placeholder; the exact prompt text isn't shown here
Calling with the Converse API
Armed with our prompt, we can now call Bedrock. The recommended way to do so is with the Converse API. Here's roughly what that looks like:
# Sketch of the Bedrock call in listeners/llm_caller.py; client setup and inference settings are assumptions
import os
import boto3
bedrock = boto3.client("bedrock-runtime")

def call_bedrock(thread_messages):
    # Format messages for Bedrock API - content must be a list
    # Convert thread messages to Bedrock format
    messages = [{"role": m["role"], "content": [{"text": m["content"]}]} for m in thread_messages]
    response = bedrock.converse(
        modelId=os.environ["BEDROCK_MODEL_INFERENCE_PROFILE"],
        system=[{"text": SYSTEM_PROMPT}],
        messages=messages,
        inferenceConfig={"maxTokens": 1024},
        performanceConfig={"latency": "optimized"},
    )
    # Process the response from the Bedrock AI model
    return response["output"]["message"]["content"][0]["text"]
The latency problem
Eagle-eyed readers will have clocked the performanceConfig={"latency": "optimized"} part of the Bedrock model invocation. Getting responses down to a reasonable latency is arguably the hardest part of building a Slack AI app.
Fortunately, with Application Signals enabled, you can see detailed information about which parts of your app are taking the longest. Here's a trace from one invocation of our app.
As you can see, the init phase of the Lambda function took almost 3 seconds, while the call to Bedrock took almost 5 seconds. And this was with latency-optimized inference enabled. Without the latency config, I was seeing significantly higher response times.
What this means for you is that you need to be ruthless about minimizing the things your Lambda function has to do on initialization as well as during regular operation. Additionally, test different foundation models to find one that consistently delivers fast responses without sacrificing quality.
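As an illustration (not code from the repo), one easy win is to pay setup costs once at module scope so warm invocations can reuse them:

import boto3

# Clients created at module scope are built once per execution environment
# and reused across warm invocations, keeping them out of the request path.
bedrock = boto3.client("bedrock-runtime")
dynamodb = boto3.resource("dynamodb")


def handler(event, context):
    # Only per-request work happens here: no client creation, no config parsing
    ...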
Wrapping up
If you've made it this far, you should now have a fully functional Slack AI app. Here's what using our app looks like.
If you are familiar with building Gen AI chat applications, you might wonder if we could take advantage of either Lambda response streaming or the Converse API's ConverseStream method to get responses to users more quickly.
Unfortunately, true streaming is not currently possible, as the Slack API does not natively support streaming HTTP requests. There is a workaround, however, that involves calling chat.update with each received chunk to update the previously sent message. This does have the effect of marking every message sent by the Slack app as 'edited', but the improved UX may be worth it in your case.
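Here's a rough sketch of what that workaround could look like, assuming a slack_sdk WebClient as client and the same Converse-style message list from earlier; rate-limit handling and chunk batching are omitted:

# Illustrative sketch of the chat.update workaround; not the code from this post's repo
import os
import boto3

bedrock = boto3.client("bedrock-runtime")


def stream_reply(client, channel_id, thread_ts, messages):
    # Post a placeholder message first so there's something to update
    first = client.chat_postMessage(channel=channel_id, thread_ts=thread_ts, text="...")
    ts = first["ts"]

    buffer = ""
    stream = bedrock.converse_stream(
        modelId=os.environ["BEDROCK_MODEL_INFERENCE_PROFILE"],
        messages=messages,
    )
    for event in stream["stream"]:
        delta = event.get("contentBlockDelta", {}).get("delta", {}).get("text", "")
        if delta:
            buffer += delta
            # Each update marks the message as edited, but the user sees text appear sooner
            client.chat_update(channel=channel_id, ts=ts, text=buffer)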
In part 2, I'll walk through how to implement this approach. And depending on how long that post is, we might briefly discuss other considerations for running a Slack AI app in production, such as guardrails, evals, and security.