An engineering manager’s guide to AI security and governance | Retool Blog

For many eng leaders, recent months have seen AI implementations looming over already bustling roadmaps, as companies grapple to keep pace with competitors, carve out new niches, and leverage emerging AI technologies to advance their position in the market. Suddenly, you might be trying to upskill team members, hiring new folks with expertise you’ve never hired for before, pivoting priorities, and wrangling expectations around how AI will (and won’t) be helpful to the business.

But with advancements in AI technologies, specifically around LLMs, it’s not all business as usual. As AI gives businesses new “powers” in the way they build, scale, and operate, considerations around privacy, security, governance (not to mention myriad ethical elements) are increasingly—and incredibly—significant.

Cutting corners when it comes to AI security and governance poses big risks—including losing customers’ trust, violating government regulations, and exposing your company to security vulnerabilities. Engineering managers need to do more than just lead their teams in AI development—we need to proactively commit to responsible and well-considered governance and fully equip our teams to understand the unique considerations around security and privacy so that they can build and innovate responsibly. “With great power comes great responsibility,” as it were, is not just an adage for Spider-Man.

While the AI landscape is rapidly evolving on both technological and regulatory levels, our eng team has thought deeply about best practices and guardrails as we’ve implemented AI-powered features for customers and leveraged LLMs for internal tooling. Here are a few of our learnings and recommendations to help you get started and set you and your team up for success.

Create your AI playbook in advance

Deploying AI tools irresponsibly is, literally and figuratively, bad business. If there’s one takeaway to share with your team, it’s that security and privacy in AI can’t be an afterthought. (Not that any security/privacy initiatives should be!) Responsible use of AI requires particularly intentional, upfront investment—and consistent follow-ups.

Ideally before you start shipping AI-assisted features or leveraging AI and LLMs internally, create a playbook of processes, expectations, and best practices (including the other ones we recommend here) such that your organization can judiciously reason about how it applies AI. Be sure to synthesize diverse perspectives when outlining your principles: Consider gathering insights from a cross-functional group that spans engineering, design, legal, privacy, trust & safety, product, and other roles. (Comms and marketing might be helpful, too, in considering the public ramifications of missteps, or in framing content to ensure it resonates across the organization.)

If your team is largely quite new to AI technologies, you may also want to consult an independent expert with experience implementing responsible AI practices at other companies.

Some questions to consider when creating your playbook include:

What use cases should we apply AI to? What, if anything, is off limits?
How will we conduct internal privacy/security reviews for AI-powered features?
What kind of monitoring/auditing will we implement?
How will we govern access to AI features?
What’s the protocol in the event of an escalation?
How will we communicate our AI protocols to customers and external parties

Some of the answers here might draw upon your team’s existing guidelines for other technologies, but don’t assume they’ll fit—be sure to turn over all the stones.

You may also benefit from creating a couple versions of the playbook to distinguish between AI use cases for internal stakeholders versus external customers. While there are many similarities, you might find that, say, the nuances of data access controls or interfacing with users is different for a tool that serves people within your organization versus the general public.

Document your AI use case(s)

In addition to your general-purpose playbook, ensure the team documents how each of your specific applications of AI puts those standards into practice. This documentation should include justification for the AI use case, as well as implementation details as they relate to privacy and security.

If you don’t document each of your AI applications, you leave room for uncertainty, which in turn heightens the risk of security and privacy breaches. Good documentation is an asset for your team to come back to and evaluate that everything is working as intended, as well as for partner teams to leverage as they build out their own AI toolkits.

At the most basic level, be very clear about the AI’s inputs and outputs. Whether your models are entirely homegrown or from a third-party (as is the case for many LLMs), try to answer the following questions:

Where does the model’s training data come from and how are the model’s features derived from that data?
What’s the source of the test data that the model is making inferences on?
What exactly does the model predict and how do those outputs get used?

Going a step further, you can investigate model fairness, drift, and other factors that could present security or privacy risks.

How does your model perform on different classes of data?
Is the model sensitive to changes in the production data distribution?
Has the industry identified vulnerabilities with a system you’re using (such as prompt injection attacks for LLMs)?
What are you doing to mitigate these concerns?

Like your playbook of AI standards, application-specific documentation isn’t meant to be static. Regularly revisit and update it to reflect the current state of your AI toolset.

Do deep diligence around third-party AI tools

Often, you’ll depend on third-party AI models and tools when building out your own internal stack. That’s not inherently an issue; after all, it can be prudent to get external support so your team can stay focused on the core business. But it’s a known-known that it’s risky to share sensitive business data with third-party software that you haven’t thoroughly vetted—and AI tools are no exception.

You should review third-party AI tools both on a contractual level and a technical one. The contract should specify who owns the data—ideally, data that’s yours stays yours and isn’t used for the third party model’s learning. (This is not a given—read the fine print thoroughly.) On the technical side, evaluate whether the tool you’re considering adheres to its own contract and any other security/privacy standards you’ve outlined in your playbook.

You also have some choices about how you deploy third-party software. This isn’t unique to AI tools—for example, you’ll need to decide between an on-cloud or on-premises deployment model if the third party offers those options. Some solutions may not be available on-prem, which could be a dealbreaker if you prefer to store all your data on your own servers. On the other hand, some organizations may prefer a cloud deployment because of lower maintenance costs and dedicated security support from the cloud provider.

Just because you don’t directly own an AI model doesn’t mean you can’t thoroughly debug it, in the same way that you would any other third-party APIs and libraries you use. You still have leverage over the model’s inputs, so validate that the model is secure and maintains data privacy even when given malicious inputs (for example, try prompt injections). Any infrastructure you control that pre-processes model inputs is also worth debugging, in addition to your downstream code that acts on a model’s outputs.

Ensure AI usage is auditable

When you're using AI to handle large volumes of enterprise and customer data, robust auditing is essential. In the same way that businesses maintain logs of employees accessing internal datasets, AI’s access to data (and employees’ access to AI) should be monitored.

You don’t have to reinvent the wheel when it comes to logging, nor do you need to separate logs for AI systems from the rest of your logs. (Ensuring your AI-related logs live alongside your existing logs can be helpful for auditability.) Continue to follow your existing logging patterns and just be sure to include records of who interacted with which of your AI models as well as what the inputs/outputs were. This way, you have the full context of how your AI models are operating in relation to the rest of your stack.

In addition to auditing end-users’ interactions with AI, it’s just as critical you monitor how developers are invoking AI tools. Changes in which APIs are called or how they’re used can sometimes merit further investigation.

Of course, auditing isn’t just for detecting security breaches. It can, for instance, help verify that your AI tool is generating reproducible and fair results. (You could also query your logs for product insights beyond a privacy and security context.)

Detailed records are especially valuable when you need to demonstrate compliance with regulatory requirements, bringing us to the next best practice.

Give extra attention to compliance

Regulations like GDPR have been shaping the software industry for years, but because of the rapid rise in LLMs and novel applications of AI, the regulatory landscape is in flux. If you operate in a variety of markets, you may need to be especially mindful of following a combination of local, regional, and international requirements. In all cases, be prepared to react quickly to sudden changes.

This means exercising care when using AI to accelerate software development. It’s still your company’s responsibility, for example, to ensure that machine-generated code follows relevant regulations. That responsibility extends to other job functions that use AI tools to supercharge their workflows. Ultimately, it’s wise to require a human gives the final stamp of approval on AI-generated results, code, copy, and the like—and to actively enforce this requirement.

While you can’t know for sure what new regulations might crop up, the more you can do to get ahead of potential challenges, the better positioned you’ll be to keep services running if you have to make quick changes. For example, when it comes to how AI features are presented to end-users, it can be helpful to make them opt-in experiences. It’s safer to be very explicit about seeking consent and to communicate how and why an application needs to process user data.

How to lead AI development responsibly

Ensuring your company is well positioned to use AI responsibly is an ongoing journey that starts before your team builds its first AI-powered tool and continues well after you’ve deployed. Reading this guide is a first step in forming your organization’s playbook. As the AI ecosystem develops, you’ll surely revise your strategy accordingly.

If you’re just getting started, you may also need to consider whether, in fact, AI is the right tool and gather some foundational best practices. If leveraged thoughtfully and securely, AI can be a key differentiator, and a powerful force for innovation, enablement, and accelerating output. As a technical leader, learning from others in the space can help your team and company get there.

If you’re interested in helping your team securely build AI-powered tools without starting from scratch, Retool comes complete with auditing, SSO, and other features to keep your data secure and private. You can also check Retool AI—or book a demo to learn more.

Big thanks to Bob Nisco, Ford Filer, and Yujian Yao for the input on this piece.