Incident.io Product RBAC

Product RBAC (role-based access control) lets teams control and manage user permissions and access within their product. Incident.io, an incident response tool for site reliability engineers, uses RBAC to take a systematic approach to assigning roles, defining permissions, and granting access based on users' roles or responsibilities.

The document outlines a preferred solution of using predefined roles that are consistent across customers. Each role builds upon the previous one, with three default roles: Regular User, Administrator, and Owner. The document also discusses permissions that organizations may want to move between roles, such as creating workflows and approving private workflows.

With a focus on clear implementation steps, including modeling individual permissions as 'scopes' and tagging users with predefined roles, Incident.io's RBAC framework enables organizations to control user access and permissions within their product efficiently and securely.

Incident.io Product RBAC

We want to restrict what users can do inside of the product, so that:

People can’t accidentally modify global settings
Only specific people can see sensitive data (private incidents)

RBAC can be complex, and is trickier than most features for three reasons: errors have serious consequences, some users are motivated to break it, and decisions are hard to reverse (it is easy to walk through one-way doors).

Any solution should be:

Simple and easily understandable
Explainable, so we can diagnose unexpected permission errors
Minimally flexible while still achieving product goals

The last aim is a reflection on Hyrum’s Law, which implies any possible RBAC configuration will be used by at least one customer, who is unlikely to cheer if we remove it.

This author has a preferred solution, which is where we can start this discussion:

Proposal: Predefined roles

We’ll model RBAC using predefined roles that are consistent across our customers.

Roles will have an order, where each successive role will include and build on (be a proper superset of) the previous role.

We would launch with three default roles:

Regular User (default):
  Create and view incidents
  Interact with incidents, i.e. create actions and post updates
Administrator:
  Modify organisation settings, such as integrations (most of /settings)
  Create workflows (can be delegated to user)
  Create announcement rules (can be delegated to user)
Owner:
  Can see all private incidents, regardless of membership
  Approve workflows which apply to private incidents (can be delegated to admin)

Modifying roles

Each organisation has a different opinion on how roles should work, and we’ve identified a few permissions that people will want to move between roles.

These permissions are:

Creating workflows: we (incident.io) want all users to access workflows, to increase adoption of the feature. Some organisations will want to restrict workflows to admins, to prevent accidental breakage or too many workflows being created.
Approving private workflows: workflows can be used to escalate permissions, as a user can execute a workflow that performs actions they are not permitted to take themselves (for example, inviting the user to a private incident). We should restrict approval to owners by default, but some organisations may want to delegate authority to admins, so more people can approve workflow changes.

This is not an exhaustive list of contentious permissions, but it illustrates a class of problem.

For these permissions, we allow organisations to relax the restriction, or perhaps better expressed, to delegate the ability to lower roles.

As an example, an organisation owner could delegate authority to admins for approving private incident workflows. In a similar fashion, an admin could permit regular users to create workflows.

This is a bad UX, but sketches how this might appear:

When organisations first sign up, the default roles would match our opinion on how incident.io is best configured, with all users being able to create announcement rules and workflows, while owners are the only role that can approve private workflows (TBC).

Critique against requirements

Most of our requirements around RBAC focus on maintainability for us, ensuring we can implement a viable MVP while preserving our ability to extend it.

Before we discuss implementation, it’s worth evaluating this proposal against those trade-offs:

Users have just one role

In mature RBAC systems, you normally have:

Entities, in our case users
Roles, which comprise individual scopes
Role bindings, which apply a role to an entity, granting it the role's scopes

Our system rejects the one-to-many relationship of users to roles (and thereby scopes) in favour of users having just one role.

Things to note:

This is much simpler than users having many roles, which should benefit our users
In terms of implementation, a one-to-one mapping of user to role is a specialised case of one-to-many, and can easily be extended into the latter
We can only adopt this simplification because each role is a proper superset of the one before it

The last point is important.

There are roles, such as a billing manager, which would break this system as the billing manager role should be exclusively for managing billing.

With the constraint of one role per user, we could not grant the ability to manage billing to an existing user without putting them in a role that removes their other abilities.

It seems unlikely we’ll need this within 12 months, which makes it an acceptable trade-off.
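To make the constraint concrete, here is a minimal sketch (not part of the proposal) of a check that a list of roles, ordered from lowest to highest, forms a strict chain of scope sets. The names ScopeSet and FormsChain, and the billing.manage scope used in the note below, are hypothetical and only serve the illustration.

```go
package main

// ScopeSet is a hypothetical set of scope names held by a role.
type ScopeSet map[string]bool

// isProperSuperset reports whether a contains every scope in b plus at least
// one scope that b lacks.
func isProperSuperset(a, b ScopeSet) bool {
	for name := range b {
		if !a[name] {
			return false
		}
	}
	return len(a) > len(b)
}

// FormsChain reports whether each role strictly extends the one before it.
// Only when this holds can a user carry a single role without losing the
// abilities granted by lower roles.
func FormsChain(roles []ScopeSet) bool {
	for i := 1; i < len(roles); i++ {
		if !isProperSuperset(roles[i], roles[i-1]) {
			return false
		}
	}
	return true
}
```

Under these assumptions, user and admin scope sets pass the check, while a billing role holding only billing.manage fails it wherever it is placed in the order, which is why such a role cannot fit a one-role-per-user model.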

Everything is scopes, under the hood

While we expose pre-packaged roles to our customers, we’ll model roles internally as a collection of scopes.

We really want to do this, as it provides a layer of abstraction between our users and how we model the scopes. It is only acceptable, however, if we permit delegation of the tricky scopes.

Implementation

In summary bullet form:

Model individual permissions as ‘scopes’
Tag each user with a predefined role
Define a function from role and organisation settings to a list of scopes
When performing a restricted action, build the user's scopes from their role and confirm the user has the desired scope

As a sketch, here is a basic rbac package:

package rbac

// Scope is a permission to execute an action, often at a very granular level, that can be
// included in a role.
//
// When implementing features that require restriction, add a scope and provide it to the
// appropriate roles.
type Scope struct {
	Name        string
	Description string
}

var (
	// Announcement rules
	ScopeAnnouncementRulesCreate = Scope{
		Name:        "announcementRules.create",
		Description: "Create announcement rules.",
	}
	// Incidents
	ScopeIncidentsCreate = Scope{
		Name:        "incidents.create",
		Description: "Create incidents.",
	}
	ScopeIncidentsRespond = Scope{
		Name:        "incidents.respond",
		Description: "Perform incident operations as part of responding, ie. creating incident actions.",
	}
	ScopeIncidentsGlobalAccess = Scope{
		Name:        "incidents.globalAccess",
		Description: "Administrate all incidents irrespective of membership.",
	}
	// Workflows
	ScopeWorkflowsCreate = Scope{
		Name:        "workflows.create",
		Description: "Create workflows.",
	}
	ScopeWorkflowsApprovePrivate = Scope{
		Name:        "workflows.approvePrivate",
		Description: "Approve workflows that act on private incidents.",
	}
)

// Role is a named collection of scopes, representing a predefined role within the
// product.
type Role struct {
	Name        string
	Description string
	Rank        uint
}

var (
	RoleUser = Role{
		Name:        "user",
		Description: "Regular users, with no special permissions.",
		Rank:        1,
	}
	RoleAdmin = Role{
		Name:        "admin",
		Description: "Admins can manage organisation settings.",
		Rank:        2,
	}
	RoleOwner = Role{
		Name:        "owner",
		Description: "Owners can see all incidents and take any administrative action.",
		Rank:        3,
	}
)

// Organisation contains the fields we'd add to domain.Organisation
type Organisation struct {
	ScopeDelegateAnnouncementRulesCreateToUser  bool
	ScopeDelegateWorkflowsCreateToUser          bool
	ScopeDelegateWorkflowsApprovePrivateToAdmin bool
}

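// ScopesFor returns a list of scopes for a given role, taking into account the
// organisation's delegation settings.
func ScopesFor(org *Organisation, role *Role) []Scope {
	scopes := []Scope{}

	// User: every role is at least a user, as each role includes the ones below it.
	if role.Rank >= RoleUser.Rank {
		scopes = append(scopes,
			ScopeIncidentsCreate,
			ScopeIncidentsRespond,
		)

		if org.ScopeDelegateAnnouncementRulesCreateToUser {
			scopes = append(scopes, ScopeAnnouncementRulesCreate)
		}
		if org.ScopeDelegateWorkflowsCreateToUser {
			scopes = append(scopes, ScopeWorkflowsCreate)
		}
	}

	// Admin
	if role.Rank >= RoleAdmin.Rank {
		// Skip scopes already granted above through delegation to users.
		if !org.ScopeDelegateAnnouncementRulesCreateToUser {
			scopes = append(scopes, ScopeAnnouncementRulesCreate)
		}
		if !org.ScopeDelegateWorkflowsCreateToUser {
			scopes = append(scopes, ScopeWorkflowsCreate)
		}

		if org.ScopeDelegateWorkflowsApprovePrivateToAdmin {
			scopes = append(scopes, ScopeWorkflowsApprovePrivate)
		}
	}

	// Owner
	if role.Rank >= RoleOwner.Rank {
		scopes = append(scopes, ScopeIncidentsGlobalAccess)

		// Skip the approval scope if it was already granted above through delegation to admins.
		if !org.ScopeDelegateWorkflowsApprovePrivateToAdmin {
			scopes = append(scopes, ScopeWorkflowsApprovePrivate)
		}
	}

	return scopes
}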
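
The last implementation step, confirming the user has the desired scope, could look something like the sketch below. Require is a hypothetical helper that is not part of the package above, and Scope is repeated here only so the example stands alone. In practice the granted list would come from ScopesFor(org, role). Naming the missing scope in the error is one way to meet the aim of explainable permission errors.

```go
package main

import "fmt"

// Scope mirrors the Scope type from the rbac sketch.
type Scope struct {
	Name        string
	Description string
}

// Require returns nil when the required scope appears in granted, and
// otherwise an error naming the missing scope so that an unexpected
// permission failure can be diagnosed.
func Require(granted []Scope, required Scope) error {
	for _, scope := range granted {
		if scope.Name == required.Name {
			return nil
		}
	}
	return fmt.Errorf("missing required scope %q", required.Name)
}
```

A caller performing a restricted action would return early when Require reports an error, and surface the message to whoever is debugging the denial.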

At the API layer, we would tag endpoints with required scopes in the API design DSL, which allows us to solve this generically for each endpoint.
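
The API design DSL is not shown in this document, so as a rough stand-in, here is a sketch of the same idea using plain net/http middleware. RequireScope, the scopesOf hook, and the /api/workflows path are all assumptions made for illustration. The statusFor helper exists only to exercise the guard.

```go
package main

import (
	"net/http"
	"net/http/httptest"
)

// RequireScope wraps next so that it only runs when the caller holds the
// required scope. scopesOf is a hypothetical hook that resolves the caller's
// scope names from the request, for example from their role.
func RequireScope(required string, scopesOf func(*http.Request) []string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for _, name := range scopesOf(r) {
			if name == required {
				next.ServeHTTP(w, r)
				return
			}
		}
		http.Error(w, "missing scope: "+required, http.StatusForbidden)
	})
}

// statusFor sends a test request through a guarded endpoint and returns the
// resulting HTTP status code.
func statusFor(required string, callerScopes []string) int {
	endpoint := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusNoContent)
	})
	guarded := RequireScope(required, func(*http.Request) []string { return callerScopes }, endpoint)

	rec := httptest.NewRecorder()
	guarded.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/api/workflows", nil))
	return rec.Code
}
```

Because the required scope is declared once per endpoint, the check stays out of the handler bodies, which is the generic property the DSL tagging is meant to provide.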

Slack would be harder, but a similar check for scopes would be applied wherever we implement restricted actions.

API access

When we offer an API, we are likely to provide organisation access tokens.

This provides another opportunity for privilege escalation, in that a user could create a token that gains permissions they lack.

If we implement RBAC as above with scopes, we should be able to guard against this by preventing users from creating tokens with scopes they do not themselves possess. This is a common prevention mechanism in RBAC systems, one example being Kubernetes.
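
A minimal sketch of that guard, assuming scopes are compared by name: a token may be issued only when every requested scope is already held by the user creating it. MissingScopes is a hypothetical helper, not an existing function.

```go
package main

// MissingScopes returns the requested scopes that the creating user does not
// hold. An empty result means the token may be issued without escalating the
// user's privileges.
func MissingScopes(userScopes, requested []string) []string {
	held := make(map[string]bool, len(userScopes))
	for _, name := range userScopes {
		held[name] = true
	}

	missing := []string{}
	for _, name := range requested {
		if !held[name] {
			missing = append(missing, name)
		}
	}
	return missing
}
```

Returning the offending scopes, rather than a bare boolean, lets the token creation endpoint explain exactly why a request was refused.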

That we can do this simply and securely is a positive indication of this system.

Roll-out

When we first release RBAC, we’ll be forced to categorise existing users into roles.

We can categorise by:

First user (who is still active in the workspace) that signed up becomes the account owner
Anyone who has created a workflow or announcement rule (or any other detectable setting change) is granted admin
All other users will be regular

This isn’t perfect, but should be approximately workable. We can notify owners and admins to explain their new role, and allow anyone to see the list of owners and admins in the dashboard so it’s easy to request a role bump.
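
The categorisation rules above could be sketched as follows. ExistingUser and its fields are hypothetical stand-ins for whatever data is available at roll-out, and the role names match those used in the rbac sketch.

```go
package main

// ExistingUser holds the hypothetical facts needed to categorise a user.
type ExistingUser struct {
	ID              string
	Active          bool // still active in the workspace
	SignupOrder     int  // lower means the user signed up earlier
	ChangedSettings bool // created a workflow, announcement rule, or similar
}

// InitialRoles maps each user ID to a role name: the earliest active signup
// becomes the owner, anyone who has changed settings becomes an admin, and
// everyone else is a regular user.
func InitialRoles(users []ExistingUser) map[string]string {
	// Find the index of the earliest user who is still active, if any.
	owner := -1
	for i, u := range users {
		if u.Active && (owner == -1 || u.SignupOrder < users[owner].SignupOrder) {
			owner = i
		}
	}

	roles := make(map[string]string, len(users))
	for i, u := range users {
		switch {
		case i == owner:
			roles[u.ID] = "owner"
		case u.ChangedSettings:
			roles[u.ID] = "admin"
		default:
			roles[u.ID] = "user"
		}
	}
	return roles
}
```

Note that an organisation with no active users ends up with no owner under this sketch, a case the roll-out would need to handle separately.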

Changing a user’s role

The API to change a role should target a user and specify the new role name.

This is a permitted action only if:

The target user's current role is equal to or lower than the actor's role
The actor's role is at least the new role being assigned

It would look like this:

POST /api/users/:id/actions/set_role
{
  "role": "admin"
}
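
One reading of the two conditions above, expressed against role ranks, is sketched below. CanSetRole and the rank constants are hypothetical, and the ranks mirror the Rank values in the rbac sketch.

```go
package main

// Rank values mirror the ordering of the predefined roles: a higher rank
// includes everything granted to the ranks below it.
const (
	RankUser  = 1
	RankAdmin = 2
	RankOwner = 3
)

// CanSetRole reports whether an actor may move a target user to a new role.
// The actor must rank at least as high as both the target's current role and
// the role being assigned.
func CanSetRole(actorRank, targetCurrentRank, newRank uint) bool {
	return targetCurrentRank <= actorRank && newRank <= actorRank
}
```

Under this reading an admin can promote a regular user to admin, but cannot promote anyone to owner or demote an existing owner.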
