git://codeberg.org:stagex/stagex | matrix://#stagex:matrix.org | ircs://irc.oftc.net:6697#stagex
A minimalism-first and security-first repository of reproducible, multi-signed OCI images of common open source software toolchains, full-source bootstrapped from Stage 0 all the way up.
If you want to build or deploy software on a foundation of minimalism and determinism with reasonable security, stagex might be the solution you are looking for.
You can do anything with these images that you would with almost any other musl-based containerized Linux distro, only with high supply chain integrity and determinism.
For a full list of images see the "packages" directory.
```sh
docker run -it stagex/stage3
docker run -it stagex/pallet-python -c "print('hello, world!')"
```
```dockerfile
FROM stagex/pallet-rust AS build
RUN ["cargo", "new", "--bin", "pattern_matcher"]
WORKDIR /pattern_matcher
RUN ["cargo", "add", "regex"]
COPY <<-EOF /pattern_matcher/src/main.rs
use regex::Regex;

fn main() {
    let mut args = std::env::args();
    args.next();
    let pattern = args.next().expect("pattern not given");
    let text = args.next().expect("text to match not given");
    let re = Regex::new(&pattern).expect("given pattern is invalid regex");
    if let Some(r#match) = re.find(&text) {
        println!("Found match: {match:?}");
    }
}
EOF
ENV RUSTFLAGS="-C target-feature=+crt-static"
RUN ["cargo", "build", "--release"]

FROM stagex/core-filesystem AS package
COPY --from=build /pattern_matcher/target/release/pattern_matcher /usr/bin/pattern_matcher
ENTRYPOINT ["/usr/bin/pattern_matcher"]
```
Note the difference between the "build" stage and the final image: the build stage has to pull the rust pallet, which includes just the binaries required to build a Rust program, but the final OCI image only contains the statically compiled Rust binary, and is tiny as a result.
Oftentimes, you'll need dependencies that aren't included by default, such as clang when building crates using Rust's bindgen crate. StageX makes adding packages super simple. In your build phase, add the following line:
```dockerfile
COPY --from=stagex/core-clang . /
```
No RUN commands needed.
Unlike most linux distros, stagex was built for determinism, minimalism, and containers first, and thus has no concept of a traditional package manager. In fact, stagex ships no first-party code at all. We just package things in the most "stock" way possible with exceptions only to maintain determinism.
Every image is "from scratch": it starts from an empty filesystem and contains only the installed package.
By default you always get the latest updates to dependencies on the fly, but you retain the option for bit-for-bit reproducible builds by locking any given dependency at a particular tag or image hash.
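A quick way to see whether a build is actually locked is to look at its FROM lines. The sketch below is a hypothetical helper (the name check_pinned is an assumption, not part of stagex) that prints every FROM line in a Dockerfile that lacks a full sha256 image hash. It is deliberately strict: references to earlier build stages and "FROM scratch" are reported too, since they carry no hash.

```sh
#!/bin/sh
# Hypothetical helper: report FROM lines that are not locked to an image hash.
# Returns 0 if every FROM line is pinned, 1 if any is not, 2 on a read error.
check_pinned() {
    dockerfile="$1"
    [ -r "$dockerfile" ] || { echo "cannot read $dockerfile" >&2; return 2; }

    # Keep FROM lines, then drop the ones that end in a 64-hex-digit sha256.
    # "|| true" keeps the assignment from failing when nothing is left over.
    unpinned=$(grep -iE '^[[:space:]]*FROM[[:space:]]' "$dockerfile" \
        | grep -vE '@sha256:[0-9a-f]{64}([[:space:]]|$)' || true)

    if [ -n "$unpinned" ]; then
        printf '%s\n' "$unpinned"
        return 1
    fi
    return 0
}
```

Running `check_pinned Dockerfile` on the example above would list both of its FROM lines, because they follow the latest image rather than a fixed hash.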
If you want an old version of rust with a recent version of GCC to work around
some problem build, you can do that without resorting to low security
"curl | bash" style solutions like rustup.
We built stagex to support very high risk threat models where trusting any single system or maintainer in our software supply chain cannot be tolerated. That said, it should also function as a drop-in replacement for musl-based Linux distributions for virtually any threat model.
Our aim is to provide a reasonably secure set of toolchains for every major programming language to be the basis of your containers, build systems, firmware, secure enclaves, or hosting infrastructure.
Not all of these goals are 100% realized yet, but they should at least help you decide if this project is something you want to contribute to or keep an eye on for the future.
- Anyone can reproduce the entire tree with tools from their current distro
- Hosted CI servers auto-sign confirmed deterministic builds
  - Like NixOS
- Multiple maintainers reproduce the entire build and ensure that everything matches down to the last bit
- Maintainers sign all package additions/changes
  - Like Gentoo, Debian, Fedora, Guix
- Reviewers/Reproducers locally build and counter-sign all new binary packages
  - No one does this, as far as we can tell
- Based on musl libc
  - Basis of successful minimal distros like Alpine, Adelie, Talos, Void
  - Implemented with about 1/4 the code of glibc
  - Required to produce portable static binaries in some languages
  - Less prone to buffer overflows
  - Puts being light, fast, and correct before compatibility
- Package using tools you already have
  - OCI build tool of choice (Docker, Buildah, Podman)
  - Make (for dependency management)
- Prove hashes of bootstrap layer builds match before proceeding
- Keep package definitions lean and readable with simple CLI and no magic
We have learned a lot of lessons about supply chain integrity over the years, and the greatest of them may be this: any system that is complex to review and that entrusts significant components to single individuals creates single points of failure and will eventually be compromised.
Distros (Linux distributions) rely on complex package management systems for which only a single implementation exists. They typically generate a lot of custom tooling, which in turn rapidly grows in complexity to meet demands ranging from hobby desktops to production servers. This complexity demands a lot of effort to maintain, and in practice results in a tendency to reduce security overhead in order to lower the barrier to entry and attract more maintainers. As a result, projects rarely mandate cryptographic signing or reproducible builds, let alone multiple signed reproduction proofs. In fact, some popular distros use a server to blindly sign all contributions from the public, which can give a false sense of security to the unsuspecting user.
We will cover an exhaustive comparison of the supply chain strategies of other package management solutions elsewhere, but while many are pursuing reproducible builds, minimalism, or signing, there isn't currently another solution which delivers on all of these basic tenets of supply chain security. stagex is an attempt to fix this, in order to satisfy the criteria of a reasonably secure supply chain strategy, which requires more than one individual to deterministically build and sign software.
Ask yourself the following: do I have a way of verifying that this binary was produced based on this source code?
While software is often reviewed for security flaws, and sometimes provides signed releases, what is missing is the ability to prove that the resulting binary is the direct result of that code and nothing has been modified along the way. To achieve this, we have to make the software always build the exact same thing, down to the last bit. This is what determinism or reproducibility is. You may be reading this and thinking "of course it should always build to the same exact binary", but this is usually not the case: it's highly unlikely that any of the software you have ever built is deterministic. By forcing software to always produce the same binary, we can use hashes to easily verify that nothing has been modified and no new code has been introduced to the software during compilation. This is a significant security improvement, but it's not enough for only one individual to build something deterministically, as they could be compromised. The real guarantee comes from multiple individuals compiling the software using different setups and still getting the same hashes. This gives us multiple points of reference, which we can use to figure out if the integrity of the software is truly intact.
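The comparison step described above can be sketched in a few lines of shell. This is a minimal illustration, not the tooling stagex itself uses: digests_match is a hypothetical helper name, and it assumes the GNU coreutils sha256sum command is available. It succeeds only when every file passed to it, for example the same artifact built by several independent parties, hashes to the same SHA-256 value.

```sh
#!/bin/sh
# Hypothetical helper: succeed only if all given files share one SHA-256 digest.
# Returns 0 on a full match, 1 on any mismatch, 2 on usage or read errors.
digests_match() {
    [ "$#" -ge 2 ] || { echo "need at least two artifacts" >&2; return 2; }

    # Hash every file; abort if any of them cannot be read.
    sums=$(sha256sum -- "$@") || return 2

    # Keep only the digest column and count how many distinct values remain.
    distinct=$(printf '%s\n' "$sums" | cut -d' ' -f1 | sort -u | wc -l | tr -d ' ')
    [ "$distinct" -eq 1 ]
}
```

A call such as `digests_match alice/out.tar bob/out.tar carol/out.tar` (paths are placeholders) returns non-zero as soon as a single builder's output differs, which is the signal that either the build is not deterministic or one environment has been tampered with.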
To develop a further intuition about the distinction between trusting source code and trusting what the compiler translates that source code to, you may refer to the seminal paper by Ken Thompson, Reflections on Trusting Trust.
A comparison of stagex to other distros in some of the areas we care about:
Distro | Trust Model | OCI | Packaging | Bootstrapped | Reproducible |
---|---|---|---|---|---|
Stagex | Decentralized | Native | Declarative | Yes | Yes |
Guix | Distributed | Exported | Declarative | Yes | Mostly |
Debian | Distributed | Published | Imperative | No | Mostly |
Arch | Distributed | Published | Imperative | No | Mostly |
Nix | Centralized | Exported | Declarative | Partial | Mostly |
Yocto | Centralized | Exported | None | No | No |
Buildroot | Centralized | Exported | None | No | No |
Alpine | Centralized | Published | Imperative | No | No |
Fedora | Centralized | Published | Imperative | No | No |
- "Trust Model"
  - "Decentralized": No single system or individual is trusted
  - "Centralized": One single system or individual is trusted
  - "Distributed": All members of a system or organization are trusted
- "OCI"
  - "Native": OCI layers are the native package management system
  - "Exported": Has the capability to export OCI from non-OCI build system
  - "Published": Has published official OCI images
- "Packaging"
  - "Declarative": Can declare exact dependency chain at time of usage
  - "Imperative": Packaging system chooses dependencies for you at build time
  - "None": No packages at all, only source code
- "Bootstrapped"
  - Can the entire distro be full-source-bootstrapped from Stage0?
- "Reproducible"
  - Is the entire distro reproducible bit-for-bit identically?
- Signatures are made with the PGP keys whose public halves are in the "keys" directory
- Signatures are made by any tool that implements "Container Signature Format"
  - We provide a minimal shell script implementation as a convenience
  - Podman also implements support for this signature scheme
- Signatures are "PR"ed and committed to this repo as a source of truth
- Signatures can be mirrored to any HTTPS URL
- Container daemons can verify signatures on pull with a containers-policy.json
- As a policy, we expect all published signers to:
  - Maintain their PGP private keys offline and/or on personal HSMs
    - E.g. Nitrokey, Yubikey, Ledger, Trezor, etc.
  - Maintain a keyoxide profile self-certifying keys
  - Maintain a Hagrid profile with verified UIDs
  - Make best efforts to meet in person and sign each other's keys
  - Create signatures from highly trusted operating systems
    - E.g. a dedicated QubesOS VM or an airgapped signing system
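As an illustration of verification on pull, the sketch below prints a containers-policy.json that rejects every image by default and accepts images under one scope only when they carry a valid signature from a key in a given keyring. The helper name write_policy, the scope docker.io/stagex, and the keyring path are all assumptions made for this example. Where the signatures themselves are fetched from is configured separately and is not shown here.

```sh
#!/bin/sh
# Hypothetical helper: print a signature policy for one image scope.
#   $1: scope to protect, e.g. docker.io/stagex (assumed namespace)
#   $2: path to an exported GPG public keyring (assumed location)
write_policy() {
    scope="$1"
    keyring="$2"
    cat <<EOF
{
  "default": [{ "type": "reject" }],
  "transports": {
    "docker": {
      "$scope": [
        { "type": "signedBy", "keyType": "GPGKeys", "keyPath": "$keyring" }
      ]
    }
  }
}
EOF
}
```

For example, `write_policy docker.io/stagex /path/to/stagex-keys.gpg > policy.json` produces a file that can then be installed where your container tooling expects its policy. Every requirement listed for a scope must be satisfied, so adding one "signedBy" entry per signer, each pointing at that signer's own key file, is one way to require several independent signatures.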
The only way to produce trustworthy packages is to make sure no single system or human is ever trusted in the process of compiling them. Everything we release must be built deterministically. Further, to avoid trusting any specific distro or platform, we must be able to reproduce even from wildly different toolchains, architectures, kernels, etc.
Using OCI container images as our base packaging system helps a lot here by making it easy to throw away non-deterministic build stages and control many aspects of the build environment. Also, as a well documented spec, it allows our packages to (ideally) be built with totally different OCI toolchains such as Docker, Podman, Kaniko, or Buildah.
This is only part of the story though, because being able to build deterministically means the compilers that compile our code themselves must be bootstrapped all the way from source code in a deterministic way.
- Final distributable packages are always OCI container images
  - OCI allows reproduction by totally different toolchains, e.g. Docker, Podman, Kaniko, or Buildah
  - OCI allows unlimited signatures on builds as part of the spec
    - E.g. each party that chooses to reproduce adds their own signature
- We always "Full Source Bootstrap" everything from 0
  - Stage0: 387 bytes of x86 assembly built by 3 distros with the same hash
    - Also the same hash many others get from wildly different toolchains
    - Relevant: Guix: Building From Source All The Way Down
  - Stage1: A full x86 toolchain built from stage0 via live-bootstrap
  - Stage2: Cross toolchain bridging us to modern 64 bit architectures
  - Stage3: Native toolchain in native 64 bit architecture
  - Stage(x): Later stages build the distributed packages in this repo
For further reading see the Bootstrappable Builds Project.
- An OCI building runtime
  - Currently Docker is supported (v25+)
    - containerd support is required
  - Support for buildah and podman coming soon
- GNU Make
make
make rust
make NOCACHE=1
Do this after successfully reproducing all packages and stages:
make sign
- Sui Blockchain Fullnode
  - Large rust application w/ C dependencies
- Nimiq Blockchain Protocol
  - Large rust application implementing all supporting software for the Nimiq blockchain protocol
- QuorumOS
  - Nitro Enclave Framework w/ minimal rust init system and support applications
- EnclaveOS
  - Minimal Nitro Enclave Hello World
- AirgapOS
  - Standalone minimal bootable Linux ISO for workstations
- ReproOS
  - Server Linux image w/ minimal hypervisor guest image
- Reproducible builds made easy: introducing StageX
- Arnaud Brousseau | 2024
- Remote attestations are useless without reproducible builds
- Arnaud Brousseau | 2024
- Bitcoin Optech: Reproducible Builds
- Arch Linux: Reproducible Builds
- NixOS: Reproducible Builds
- Guix: Reproducible Builds
- Debian: Reproducible Builds
- SoK: Analysis of Software Supply Chain Security by Establishing Secure Design Properties
- Chinenye Okafor, Taylor R. Schorlemmer, Santiago Torres-Arias, James C. Davis | June 2024
- A Review of Attacks Against Language-Based Package Managers
- Aarnav M. Bos | February 2023
- Software supply chain: review of attacks, risk assessment strategies and security controls
- Betul Gokkaya, Leonardo Aniello, Basel Halak | May 2023
- Enhancing Software Supply Chain Resilience: Strategy For Mitigating Software Supply Chain Security Risks And Ensuring Security Continuity In Development Lifecycle
- Ahmed Akinsola, Abdullah Akinde | July 2024
- What is Software Supply Chain Security
- Marcela S. Melara and Mic Bowman | September 2022
- An Industry Interview Study of Software Signing for Supply Chain Security
- Kelechi G. Kalu, Tanya Singla, Chinenye Okafor, Santiago Torres-Arias, James C. Davis | June 2024
- Journey to the Center of Software Supply Chain Attacks
- Piergiorgio Ladisa, Serena Elisa Ponta, Antonino Sabetta, Matias Martinez, Olivier Barais | April 2023
- SoK: A Defense-Oriented Evaluation of Software Supply Chain Security
- Eman Abu Ishgair, Marcela S. Melara, Santiago Torres-Arias | May 2024
- S3C2 Summit 2023-02: Industry Secure Supply Chain Summit
- Trevor Dunlap, Yasemin Acar, Michel Cucker, William Enck, Alexandros Kapravelos, Christian Kastner, Laurie Williams | July 2023
- An Integrity-Focused Threat Model for Software Development Pipelines
- B. M. Reichert, R. R. Obelheiro | November 2022
- Dirty-Waters: Detecting Software Supply Chain Smells
- Raphina Liu, Sofia Bobadilla, Benoit Baudry, Martin Monperrus | October 2024
- A Systematic Literature Review on Trust in the Software Ecosystem
- Fang Hou, Slinger Jansen | March 2022
- Backstabber's Knife Collection: A Review of Open Source Software Supply Chain Attacks
- Marc Ohm, Henrik Plate, Arnold Sykosch, Michael Meier | May 2020
- Reproducible Builds: Increasing the Integrity of Software Supply Chains
- Chris Lamb, Stefano Zacchiroli | April 2021
- Reproducibility of Build Environments through Space and Time
- Julien Malka, Stefano Zacchiroli, Théo Zimmermann | February 2024
- Levels of Binary Equivalence for the Comparison of Binaries from Alternative Builds
- Jens Dietrich, Tim White, Behnaz Hassanshahi, Paddy Krishnan | October 2024
- Repro: An Open-Source Library for Improving the Reproducibility and Usability of Publicly Available Research Code
- Daniel Deutsch and Dan Roth | April 2022
- Reproducible and User-Controlled Software Environments in HPC with Guix
- Ludovic Courtès, Ricardo Wurmus | June 2015
- Reflections on trusting trust
- Ken Thompson | 1984
- The Full-Source Bootstrap: Building from source all the way down
- Janneke Nieuwenhuizen, Ludovic Courtès | 2023
- Breaking Bitcoin: The Bitcoin Build System
- Carl Dong | 2020
- Expanding (Dis)trust
- Anton Livaja | 2024