Before diving into the SLSA Levels, we need to establish a core set of terminology and models to describe what we’re protecting.
Software supply chain
SLSA’s framework addresses every step of the software supply chain: the sequence of steps resulting in the creation of an artifact. We represent a supply chain as a directed acyclic graph of sources, builds, dependencies, and packages. One artifact’s supply chain is a combination of its dependencies’ supply chains plus its own sources and builds.
|Term||Description||Example|
|Artifact||An immutable blob of data; primarily refers to software, but SLSA can be used for any artifact.||A file, a git commit, a directory of files (serialized in some way), a container image, a firmware image.|
|Attestation||An authenticated statement (metadata) about a software artifact or collection of software artifacts.||A signed SLSA Provenance file.|
|Source||Artifact that was directly authored or reviewed by persons, without modification. It is the beginning of the supply chain; we do not trace the provenance back any further.||Git commit (source) hosted on GitHub (platform).|
|Build||Process that transforms a set of input artifacts into a set of output artifacts. The inputs may be sources, dependencies, or ephemeral build outputs.||.travis.yml (process) run by Travis CI (platform).|
|Package||Artifact that is “published” for use by others. In the model, it is always the output of a build process, though that build process can be a no-op.||Docker image (package) distributed on DockerHub (platform). A ZIP file containing source code is a package, not a source, because it is built from some other source, such as a git commit.|
|Dependency||Artifact that is an input to a build process but that is not a source. In the model, it is always a package.||Alpine package (package) distributed on Alpine Linux (platform).|
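The graph model above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the `Artifact` and `Build` classes, the `supply_chain` function, and all artifact names are hypothetical and are not part of SLSA. It shows that one artifact's supply chain is its own sources and dependencies plus, recursively, the supply chains of those dependencies.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Artifact:
    """An immutable blob, identified here by a hypothetical name."""
    name: str


@dataclass
class Build:
    """A process that turns input artifacts into one output artifact."""
    output: Artifact
    sources: list[Artifact] = field(default_factory=list)
    dependencies: list[Artifact] = field(default_factory=list)


def supply_chain(artifact: Artifact, builds: dict[Artifact, Build]) -> set[Artifact]:
    """Return every upstream artifact that contributed to `artifact`.

    An artifact with no recorded build is treated as a source: the
    beginning of the chain, which is not traced any further.
    """
    seen: set[Artifact] = set()
    stack = [artifact]
    while stack:
        current = stack.pop()
        build = builds.get(current)
        if build is None:
            continue  # a source has nothing upstream
        for upstream in build.sources + build.dependencies:
            if upstream not in seen:
                seen.add(upstream)
                stack.append(upstream)
    return seen


# Hypothetical example: an image built from its own source plus one package.
app_src = Artifact("git:app@abc123")
lib_src = Artifact("git:lib@def456")
lib_pkg = Artifact("pkg:lib-1.0.tar.gz")
app_img = Artifact("image:app:1.0")

builds = {
    lib_pkg: Build(output=lib_pkg, sources=[lib_src]),
    app_img: Build(output=app_img, sources=[app_src], dependencies=[lib_pkg]),
}

chain = supply_chain(app_img, builds)
print(sorted(a.name for a in chain))
```

Here the image's supply chain contains its own source, the library package it depends on, and the library's source, even though the image's build never referenced the library source directly.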
Throughout the specification, you will see reference to the following roles that take part in the software supply chain. Note that in practice a role may be filled by more than one person or an organization. Similarly, a person or organization may act as more than one role in a particular software supply chain.
|Role||Description||Examples|
|Producer||A party who creates software and provides it to others. Producers are often also consumers.||An open source project’s maintainers. A software vendor.|
|Verifier||A party who inspects an artifact’s provenance to determine the artifact’s authenticity.||A business’s software ingestion system. A programming language ecosystem’s package registry.|
|Consumer||A party who uses software provided by a producer. The consumer may verify provenance for software they consume or delegate that responsibility to a separate verifier.||A developer who uses open source software distributions. A business that uses a point of sale system.|
|Infrastructure provider||A party who provides software or services to other roles.||A package registry’s maintainers. A build platform’s maintainers.|
We model a build as running on a multi-tenant build platform, where each execution is independent.
- A tenant invokes the build by specifying external parameters through an interface, either directly or via some trigger. Usually, at least one of these external parameters is a reference to a dependency. (External parameters are literal values while dependencies are artifacts.)
- The build platform’s control plane interprets these external parameters, fetches an initial set of dependencies, initializes a build environment, and then starts the execution within that environment.
- The build then performs arbitrary steps, which might include fetching additional dependencies, and then produces one or more output artifacts. The steps within the build environment are under the tenant’s control. The build platform isolates build environments from one another to some degree (which is measured by the SLSA Build Level).
- Finally, for SLSA Build L2+, the control plane outputs provenance describing this whole process.
Notably, there is no formal notion of “source” in the build model, just external parameters and dependencies. Most build platforms have an explicit “source” artifact to build from, which is often a git repository; in the build model, the reference to this artifact is an external parameter while the artifact itself is a dependency.
For examples of how this model applies to real-world build platforms, see the index of build types.
|Term||Description|
|Platform||System that allows tenants to run builds. Technically, it is the transitive closure of software and services that must be trusted to faithfully execute the build. It includes software, hardware, people, and organizations.|
|Admin||A privileged user with administrative access to the platform, potentially allowing them to tamper with builds or the control plane.|
|Tenant||An untrusted user that builds an artifact on the platform. The tenant defines the build steps and external parameters.|
|Control plane||Build platform component that orchestrates each independent build execution and produces provenance. The control plane is managed by an admin and trusted to be outside the tenant’s control.|
|Build||Process that converts input sources and dependencies into output artifacts, defined by the tenant and executed within a single build environment on a platform.|
|Steps||The set of actions that comprise a build, defined by the tenant.|
|Build environment||The independent execution context in which the build runs, initialized by the control plane. In the case of a distributed build, this is the collection of all such machines/containers/VMs that run steps.|
|Build caches||Intermediate artifact storage managed by the platform that maps intermediate artifacts to their explicit inputs. A build may share build caches with any subsequent build running on the platform.|
|External parameters||The set of top-level, independent inputs to the build, specified by a tenant and used by the control plane to initialize the build.|
|Dependencies||Artifacts fetched during initialization or execution of the build process, such as configuration files, source artifacts, or build tools.|
|Outputs||Collection of artifacts produced by the build.|
|Provenance||Attestation (metadata) describing how the outputs were produced, including identification of the platform and external parameters.|
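As a rough illustration of how the terms in this table fit together, the sketch below simulates one build execution. Everything in it is an assumption made for illustration: `ARTIFACT_STORE`, `run_build`, `tenant_steps`, the example URLs, and the keys of the provenance dictionary are hypothetical, and the dictionary is not the SLSA Provenance schema. The control plane interprets the external parameters, fetches the initial dependency, initializes the build environment, runs the tenant's steps, and then describes the result.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in for remote storage: reference -> artifact bytes.
ARTIFACT_STORE = {
    "git+https://example.com/app@v1": b"print('hello')\n",
}


@dataclass
class BuildEnvironment:
    """Independent execution context, initialized by the control plane."""
    dependencies: dict[str, bytes]


def digest(data: bytes) -> str:
    """Hex SHA-256 digest used to identify an artifact."""
    return hashlib.sha256(data).hexdigest()


def run_build(
    external_parameters: dict[str, str],
    steps: Callable[[BuildEnvironment], dict[str, bytes]],
    platform_id: str = "https://builder.example.com",  # hypothetical identifier
) -> tuple[dict[str, bytes], dict]:
    """Toy control plane: interpret parameters, run tenant steps, emit provenance."""
    # 1. The external parameter is a literal reference; the thing it points
    #    to is a dependency that the control plane fetches.
    source_ref = external_parameters["source"]
    initial = {source_ref: ARTIFACT_STORE[source_ref]}

    # 2. Initialize a fresh build environment for this execution only.
    env = BuildEnvironment(dependencies=dict(initial))

    # 3. Run the tenant-defined steps inside that environment.
    outputs = steps(env)

    # 4. The control plane, not the tenant, describes what happened.
    provenance = {
        "platform": platform_id,
        "external_parameters": dict(external_parameters),
        "dependencies": {ref: digest(data) for ref, data in env.dependencies.items()},
        "outputs": {name: digest(data) for name, data in outputs.items()},
    }
    return outputs, provenance


def tenant_steps(env: BuildEnvironment) -> dict[str, bytes]:
    """Tenant-controlled steps: bundle every dependency into one output."""
    bundle = b"".join(env.dependencies[ref] for ref in sorted(env.dependencies))
    return {"app.bundle": bundle}


outputs, provenance = run_build(
    {"source": "git+https://example.com/app@v1"}, tenant_steps
)
print(provenance["platform"], sorted(provenance["outputs"]))
```

Note that the sketch has no separate notion of “source”: the reference is an external parameter and the fetched bytes are a dependency. The provenance is assembled outside `tenant_steps`, which mirrors the requirement that the control plane stays outside the tenant's control.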
Ambiguous terms to avoid
- Build recipe: Could mean external parameters, but may include concrete steps of how to perform a build. To avoid implementation details, we don’t define this term and instead always use “external parameters”, which is the interface to a build platform. Similar terms are build configuration source and build definition.
- Builder: Usually means build platform, but might be used for build environment, the user who invoked the build, or a build tool from dependencies. To avoid confusion, we always use “build platform”. The only exception is in the provenance, where “builder” is used as a more concise field name.
Software is distributed in identifiable units called packages according to the rules and conventions of a package ecosystem. Examples of formal ecosystems include Python/PyPA, Debian/Apt, and OCI, while examples of informal ecosystems include links to files on a website or distribution of first-party software within a company.
Abstractly, a consumer locates software within an ecosystem by asking a package registry to resolve a mutable package name into an immutable package artifact.[1] To publish a package artifact, the software producer asks the registry to update this mapping to resolve to the new artifact. The registry represents the entity or entities with the power to alter what artifacts are accepted by consumers for a given package name. For example, if consumers only accept packages signed by a particular public key, then it is access to that public key that serves as the registry.
The package name is the primary security boundary within a package ecosystem. Different package names represent materially different pieces of software—different owners, behaviors, security properties, and so on. Therefore, the package name is the primary unit being protected in SLSA. It is the primary identifier to which consumers attach expectations.
|Term||Description|
|Package||An identifiable unit of software intended for distribution, ambiguously meaning either an “artifact” or a “package name”. Only use this term when the ambiguity is acceptable or desirable.|
|Package artifact||A file or other immutable object that is intended for distribution.|
|Package ecosystem||A set of rules and conventions governing how packages are distributed, including how clients resolve a package name into one or more specific artifacts.|
|Package manager client||Client-side tooling to interact with a package ecosystem.|
|Package name||The primary identifier for a mutable collection of artifacts that all represent different versions of the same software. This is the primary identifier that consumers use to obtain the software. A package name is specific to an ecosystem + registry, has a maintainer, is more general than a specific hash or version, and has a “correct” source location. A package ecosystem may group package names into some sort of hierarchy, such as the Group ID in Maven, though SLSA does not have a special term for this.|
|Package registry||An entity responsible for mapping package names to artifacts within a packaging ecosystem. Most ecosystems support multiple registries, usually a single global registry and multiple private registries.|
|Publish [a package]||Make an artifact available for use by registering it with the package registry. In technical terms, this means associating an artifact to a package name. This does not necessarily mean making the artifact fully public; an artifact may be published for only a subset of users, such as internal testing or a closed beta.|
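The relationship between a mutable package name and immutable package artifacts can be sketched as a small in-memory registry. This is a toy under stated assumptions, not how any real registry works: the `PackageRegistry` class, its `publish` and `resolve` methods, and the package name `example-lib` are all hypothetical, and the sketch ignores versions, access control, and signatures.

```python
from __future__ import annotations

import hashlib


class PackageRegistry:
    """Toy registry: maps a mutable package name to an immutable artifact."""

    def __init__(self) -> None:
        self._artifacts: dict[str, bytes] = {}  # digest -> artifact bytes
        self._names: dict[str, str] = {}        # package name -> current digest

    def publish(self, package_name: str, artifact: bytes) -> str:
        """Associate `artifact` with `package_name` and return its digest."""
        artifact_digest = hashlib.sha256(artifact).hexdigest()
        self._artifacts[artifact_digest] = artifact  # content never changes
        self._names[package_name] = artifact_digest  # the mapping does
        return artifact_digest

    def resolve(self, package_name: str) -> tuple[str, bytes]:
        """Resolve a package name to (digest, artifact bytes)."""
        artifact_digest = self._names[package_name]
        return artifact_digest, self._artifacts[artifact_digest]


registry = PackageRegistry()
first = registry.publish("example-lib", b"release 1.0")
second = registry.publish("example-lib", b"release 1.1")

current_digest, current_artifact = registry.resolve("example-lib")
print(current_digest == second)
```

Publishing a second artifact changes what the name resolves to, while each artifact remains addressable by its own digest. That is why whoever controls the name-to-artifact mapping controls what consumers receive.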
Ambiguous terms to avoid
- Package repository: Could mean either package registry or package name, depending on the ecosystem. To avoid confusion, we always use “repository” exclusively to mean “source repository”, where there is no ambiguity.
- Package manager (without “client”): Could mean package ecosystem, package registry, or client-side tooling.
Mapping to real-world ecosystems
Most real-world ecosystems fit the package model above but use different terms. The table below attempts to document how various ecosystems map to the SLSA Package model. There are likely mistakes and omissions; corrections and additions are welcome!
|Package ecosystem||Package registry||Package name||Package artifact|
|Cargo (Rust)||Registry||Crate name||Artifact|
|CPAN (Perl)||Upload server||Distribution||Release (or Distribution)|
|Go||Module proxy||Module path||Module|
|Maven (Java)||Repository||Group ID + Artifact ID||Artifact|
|PyPA (Python)||Index||Project Name||Distribution|
|Dpkg (e.g. Debian)||?||Package name||Package|
|Homebrew (e.g. Mac)||Repository (Tap)||Package name (Formula)||Binary package (Bottle)|
|Pacman (e.g. Arch)||Repository||Package name||Package|
|RPM (e.g. Red Hat)||Repository||Package name||Package|
|Nix (e.g. NixOS)||Repository (e.g. Nixpkgs) or binary cache||Derivation name||Derivation or store object|
|deps.dev: System||Packaging authority||Package||n/a|
- Go uses a significantly different distribution model than other ecosystems. In Go, the package name is a source repository URL. While clients can fetch directly from that URL (in which case there is no “package” or “registry”), they usually fetch a zip file from a module proxy. The module proxy acts as both a build platform (by constructing the package artifact from source) and a registry (by mapping package name to package artifact). People trust the module proxy because builds are independently reproducible and a checksum database guarantees that all clients receive the same artifact for a given URL.
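The idea behind the checksum guarantee in the note above can be sketched in a few lines. This shows only the principle and is not the actual Go checksum database, which works differently in detail. The `ChecksumDatabase` class, its `check` method, and the module path `example.com/mod` are hypothetical names chosen for illustration.

```python
from __future__ import annotations

import hashlib


class ChecksumDatabase:
    """Toy record of the first digest seen for each module version."""

    def __init__(self) -> None:
        self._digests: dict[str, str] = {}

    def check(self, module_version: str, artifact: bytes) -> bool:
        """Record the digest on first sight; afterwards require an exact match."""
        observed = hashlib.sha256(artifact).hexdigest()
        recorded = self._digests.setdefault(module_version, observed)
        return recorded == observed


checksums = ChecksumDatabase()
ok_first = checksums.check("example.com/mod@v1.0.0", b"zip bytes")
ok_same = checksums.check("example.com/mod@v1.0.0", b"zip bytes")
ok_tampered = checksums.check("example.com/mod@v1.0.0", b"different zip bytes")
print(ok_first, ok_same, ok_tampered)
```

Once a digest is recorded for a given module version, any client that is served different bytes for the same version fails the check. That is the property that lets all clients agree on one artifact.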
Verification in SLSA is performed in two ways. Firstly, the build platform is certified to ensure conformance with the requirements at the level claimed by the build platform. This certification should happen on a recurring cadence with the outcomes published by the platform operator for their users to review and make informed decisions about which build platforms to trust.
Secondly, artifacts are verified to ensure they meet the producer-defined expectations of where the package source code was retrieved from and on what build platform the package was built.
|Term||Description|
|Expectations||A set of constraints on the package’s provenance metadata. The package producer sets expectations for a package, whether explicitly or implicitly.|
|Provenance verification||Artifacts are verified by the package ecosystem to ensure that the package’s expectations are met before the package is used.|
|Build platform certification||Build platforms are certified for their conformance to the SLSA requirements at the stated level.|
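Provenance verification against expectations can be sketched as a simple comparison. The sketch assumes the provenance has already been authenticated (for example, its signature was checked) and reduced to a flat dictionary. The `Expectations` class, the `verify_provenance` function, the dictionary keys, and the URLs are hypothetical and do not reflect the SLSA Provenance format.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Expectations:
    """Constraints a producer sets on a package's provenance."""
    source_repository: str
    trusted_build_platforms: frozenset[str]


def verify_provenance(provenance: dict[str, str], expectations: Expectations) -> list[str]:
    """Return a list of violations; an empty list means expectations are met."""
    violations = []
    if provenance.get("source_repository") != expectations.source_repository:
        violations.append("unexpected source repository")
    if provenance.get("build_platform") not in expectations.trusted_build_platforms:
        violations.append("untrusted build platform")
    return violations


# Hypothetical expectations and two provenance records to check against them.
expectations = Expectations(
    source_repository="https://example.com/org/app",
    trusted_build_platforms=frozenset({"https://builder.example.com"}),
)
good = {
    "source_repository": "https://example.com/org/app",
    "build_platform": "https://builder.example.com",
}
bad = {
    "source_repository": "https://example.com/fork/app",
    "build_platform": "https://builder.example.com",
}

print(verify_provenance(good, expectations))
print(verify_provenance(bad, expectations))
```

The two checks correspond to the two expectations named above: where the source code was retrieved from and which build platform produced the package. Returning a list of violations rather than a boolean is a design choice that lets a verifier report every failed expectation at once.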
The examples below suggest some ways that expectations and verification may be implemented for different, broadly defined, package ecosystems.
Example: Small software team
|Expectations||Defined by the producer’s security personnel and stored in a database.|
|Provenance verification||Performed automatically on cluster nodes before execution by querying the expectations database.|
|Build platform certification||The build platform implementer follows secure design and development best practices, does annual penetration testing exercises, and self-certifies their conformance to SLSA requirements.|
Example: Open source language distribution
|Expectations||Defined separately for each package and stored in the package registry.|
|Provenance verification||The language distribution registry verifies that newly uploaded packages meet expectations before publishing them. Further, the package manager client also verifies expectations prior to installing packages.|
|Build platform certification||Performed by the language ecosystem packaging authority.|
[1] This resolution might include a version number, label, or some other selector in addition to the package name, but that is not important to SLSA.