Understanding RBAC While Building Our AI Foundry Infrastructure

 

A few days ago my dad and I were setting up credentials for our startup’s Azure AI Foundry infrastructure, and through that discussion I discovered one of the core ideas behind modern cloud security: RBAC.

RBAC — Role-Based Access Control.

RBAC is basically about answering three questions:

Who are you?
What are you allowed to do?
And where are you allowed to do it?

In Azure, this is implemented using three components:

  • Security principal – the identity making the request (a user, application, or managed identity)

  • Role – the permissions (for example read, write, delete)

  • Scope – the resource those permissions apply to (a resource, resource group, subscription, etc.)

So instead of giving everything unlimited access, you assign specific roles to specific identities at specific scopes. It’s a simple idea, but it becomes incredibly powerful when systems start scaling.
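The three components line up naturally in code. Here is a minimal, self-contained sketch (toy data and function names of my own, not the Azure SDK) of how a role assignment — a principal bound to a role at a scope — decides whether a request is allowed:

```python
# Toy RBAC model: a role maps to a set of permissions, and an assignment
# is a (principal, role, scope) triple, mirroring Azure's model.

ROLE_PERMISSIONS = {
    "Reader": {"read"},
    "Contributor": {"read", "write", "delete"},
}

ASSIGNMENTS = [
    ("dev-app", "Contributor", "/subscriptions/sub1/resourceGroups/dev-rg"),
    ("prod-app", "Reader", "/subscriptions/sub1/resourceGroups/prod-rg"),
]

def is_allowed(principal: str, action: str, resource: str) -> bool:
    """Allow if some assignment grants the action at the resource's scope or a parent scope."""
    for who, role, scope in ASSIGNMENTS:
        if who == principal and resource.startswith(scope):
            if action in ROLE_PERMISSIONS[role]:
                return True
    return False
```

Because scopes nest, a role assigned at a resource group covers everything inside it; that is why the check uses a prefix match on the scope rather than exact equality.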


The Problem That Needs to Be Solved

Our system has multiple environments:

  • Local development

  • Development server

  • Staging

  • Production

Each of these environments interacts with our Azure AI Foundry services, which expose endpoints that our servers call through HTTP requests.

Traditionally, a lot of services handle this with something simple:

Endpoint URL + API key

If you have the key, you can call the service.

But that approach has a major weakness:
the key becomes the entire security model.

If someone gets that key, they effectively become the system.

So instead of relying purely on keys, we explored a much better approach:

Identity-based authentication with RBAC.


Identity Instead of Secrets

The idea is to authenticate using an identity provider, in Azure’s case Microsoft Entra ID.

Instead of saying:

“This request has the correct key”

you say:

“This request is coming from this verified identity.”

Once the identity is verified, RBAC determines what it is allowed to do.

A good analogy my dad used is passports.

The government issues the passport (identity provider), proving who you are.

Then different services — banks, airports, border control — decide what you are allowed to do based on that identity.

Cloud infrastructure works the same way:

  1. Identity provider verifies who you are

  2. It issues an authentication token

  3. Azure checks RBAC rules to determine permissions


When a Server Becomes an Identity

One of the most interesting ideas I learned is that servers and applications can have identities too.

In Azure this is called a Managed Identity.

Instead of hardcoding credentials into your code, Azure can assign an identity directly to a service — for example an App Service running your API.

Then the flow looks something like this:

  1. The app requests a token from the identity provider

  2. Azure verifies the identity of the app

  3. The app receives an authentication token

  4. That token is used to access other services

The big advantage here is that you don’t need to store secrets in your code anymore.

Azure manages the identity and the token lifecycle.
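The four steps above can be simulated with a toy identity provider. Everything here is a simplification of my own: real tokens come from Entra ID and are signed JWTs with much richer claims, not random strings in a dictionary.

```python
import secrets
import time

# Identities the platform "manages" for us (step 2 checks against this).
KNOWN_IDENTITIES = {"billing-api"}

# token -> (identity, expiry timestamp); stands in for the token lifecycle.
_issued: dict[str, tuple[str, float]] = {}

def request_token(identity: str, ttl: float = 3600.0) -> str:
    """Steps 1-3: the app asks for a token; the provider verifies the identity and issues one."""
    if identity not in KNOWN_IDENTITIES:
        raise PermissionError(f"unknown identity: {identity}")
    token = secrets.token_hex(16)
    _issued[token] = (identity, time.time() + ttl)
    return token

def call_service(token: str) -> str:
    """Step 4: the target service validates the token before doing any work."""
    entry = _issued.get(token)
    if entry is None or entry[1] < time.time():
        raise PermissionError("invalid or expired token")
    identity, _ = entry
    return f"hello, {identity}"
```

The point of the sketch is the shape of the flow: the application never holds a long-lived secret, only a short-lived token it can request again when needed.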

There are two main types of managed identity:

System-Assigned Identity

  • Attached directly to a single resource

  • Deleted automatically if the resource is deleted

  • Good for tightly coupled resources

User-Assigned Identity

  • Separate Azure resource

  • Can be shared across multiple services

  • Persists independently of the services using it

That distinction becomes really important when designing larger systems.


Connecting This to Our Environment Setup

Once I understood identities, the design of our infrastructure made much more sense.

Each environment can have its own identity:

  • Development environment identity

  • Staging environment identity

  • Production environment identity

Each of those identities can then be given different RBAC roles.

For example:

  • Development might have broader access for experimentation

  • Production might have extremely restricted permissions

  • Staging might have read/write access but no destructive permissions

Instead of everything sharing the same secret key, Azure knows exactly which environment is making the request.

That makes systems far easier to audit and secure.
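As a sketch, a per-environment role plan might look like the following. The role sets here are illustrative assumptions, not our actual assignments, and the audit rule is a made-up example of the kind of invariant this design lets you enforce:

```python
# Hypothetical plan: each environment's identity gets only the roles it needs.
ENVIRONMENT_ROLES = {
    "development": {"Cognitive Services User", "Contributor"},
    "staging":     {"Cognitive Services User", "Storage Blob Data Contributor"},
    "production":  {"Cognitive Services User"},
}

# Roles that can modify or delete infrastructure.
DESTRUCTIVE_ROLES = {"Contributor", "Owner"}

def audit(env: str) -> list[str]:
    """Flag destructive roles held by any identity outside development."""
    if env == "development":
        return []
    return sorted(ENVIRONMENT_ROLES[env] & DESTRUCTIVE_ROLES)
```

With one identity per environment, a check like this can run in CI and catch an over-privileged assignment before it reaches production.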


Credential Chaining

Another clever design I discovered is something called credential chaining, implemented in the Azure SDKs through DefaultAzureCredential.

The idea is simple: the SDK tries multiple authentication methods in sequence until one works.

For example it might try:

  1. Environment credentials

  2. Managed identity

  3. Visual Studio Code login

  4. Azure CLI login

  5. Azure PowerShell login

This means the same code can run in different environments without modification.

For example:

When the code runs in Azure:

Managed Identity is used

When the same code runs on my laptop:

VS Code login or Azure CLI identity is used

The application code doesn’t need to change. The authentication system adapts automatically based on where the code is running.
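The fallback behavior is easy to sketch in plain Python. This is a toy chain of my own, not the azure-identity implementation, but it captures the mechanism: try each source in order and use the first one that works.

```python
class CredentialUnavailable(Exception):
    """Raised when a credential source can't authenticate in this environment."""

def env_credential():
    raise CredentialUnavailable("no environment variables set")

def managed_identity_credential():
    raise CredentialUnavailable("not running inside Azure")

def cli_credential():
    # e.g. on a developer laptop where `az login` has been run
    return "token-from-azure-cli-login"

def get_token(chain):
    """Return the first token any source in the chain can produce."""
    for source in chain:
        try:
            return source()
        except CredentialUnavailable:
            continue
    raise CredentialUnavailable("every credential in the chain failed")
```

Running `get_token([env_credential, managed_identity_credential, cli_credential])` on the laptop falls through to the CLI source; inside Azure, the managed identity source would succeed first and the CLI would never be consulted.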

That design genuinely impressed me.

It’s one of those engineering decisions that makes infrastructure much cleaner.


Applying This to Azure AI Foundry

Since our system interacts with Azure AI Foundry models, I also looked into how RBAC works there.

Azure supports two ways to authenticate requests to model endpoints:

  1. API key authentication

  2. Microsoft Entra ID authentication

When using identity-based authentication, RBAC controls which identities can access the models.

One important role is:

Cognitive Services User

This role allows an identity to perform inference calls to AI models.

Interestingly, even high-level roles like Owner or Contributor don’t automatically grant inference permissions through Entra ID. You still need the correct role assignment.

That separation between management permissions and runtime permissions is a subtle but very important design decision.
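A toy model makes the split concrete. The action names below are simplified inventions; only the "Cognitive Services User" data-plane role name matches Azure's:

```python
# Management plane: controls the resource itself.
MANAGEMENT_ACTIONS = {
    "Owner": {"deploy", "delete", "assign_roles"},
    "Contributor": {"deploy", "delete"},
}

# Data plane: controls what you can do *through* the resource at runtime.
DATA_PLANE_ACTIONS = {
    "Cognitive Services User": {"inference"},
}

def can(roles: set[str], action: str) -> bool:
    """An action is allowed if any held role grants it on either plane."""
    for role in roles:
        granted = MANAGEMENT_ACTIONS.get(role, set()) | DATA_PLANE_ACTIONS.get(role, set())
        if action in granted:
            return True
    return False
```

In this model an Owner can deploy and delete the resource but still can't call the model: inference lives on the data plane, which only the dedicated role grants.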


This was a great discussion to have with my dad. I love how we started with a practical problem — how do we connect our AI services securely? — and ended up understanding a deeper principle behind how modern cloud infrastructure is designed.

