Reproducible Builds in June 2022

View all our monthly reports


Welcome to the June 2022 report from the Reproducible Builds project. In these reports, we outline the most important things that we have been up to over the past month. As a quick recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries.


Save the date!

Despite several delays, we are pleased to announce dates for our in-person summit this year:

November 1st 2022 — November 3rd 2022


The event will happen in/around Venice (Italy), and we intend to pick a venue reachable via the train station and an international airport. However, the precise venue will depend on the number of attendees.

Please see the announcement mail from Mattia Rizzolo, and do keep an eye on the mailing list for further announcements as it will hopefully include registration instructions.


News

David Wheeler filed an issue against the Rust programming language to report that builds are “not reproducible because full path to the source code is in the panic and debug strings”. Luckily, as one of the responses mentions: “the --remap-path-prefix solves this problem and has been used to great effect in build systems that rely on reproducibility (Bazel, Nix) to work at all” and that “there are efforts to teach cargo about it here”.


The Python Security team announced that:

The ctx hosted project on PyPI was taken over via user account compromise and replaced with a malicious project which contained runtime code which collected the content of os.environ.items() when instantiating Ctx objects. The captured environment variables were sent as a base64 encoded query parameter to a Heroku application […]

As their announcement later goes onto state, version-pinning using “hash-checking mode” can prevent this attack, although this does depend on specific installations using this mode, rather than a prevention that can be applied systematically.


Developer vanitasvitae published an interesting and entertaining blog post detailing the blow-by-blow steps of debugging a reproducibility issue in PGPainless, a library which “aims to make using OpenPGP in Java projects as simple as possible”.

Whilst their in-depth research into the internals of the .jar may have been unnecessary given that diffoscope would have identified the, it must be said that there is something to be said with occasionally delving into seemingly “low-level” details, as well describing any debugging process. Indeed, as vanitasvitae writes:

Yes, this would have spared me from 3h of debugging 😉 But I probably would also not have gone onto this little dive into the JAR/ZIP format, so in the end I’m not mad.


Kees Cook published a short and practical blog post detailing how he uses reproducibility properties to aid work to replace one-element arrays in the Linux kernel. Kees’ approach is based on the principle that if a (small) proposed change is considered equivalent by the compiler, then the generated output will be identical… but only if no other arbitrary or unrelated changes are introduced. Kees mentions the “fantastic” diffoscope tool, as well as various kernel-specific build options (eg. KBUILD_BUILD_TIMESTAMP) in order to “prepare my build with the ‘known to disrupt code layout’ options disabled”.


Stefano Zacchiroli gave a presentation at GDR Sécurité Informatique based in part on a paper co-written with Chris Lamb titled Increasing the Integrity of Software Supply Chains. (Tweet)


Debian

In Debian in this month, 28 reviews of Debian packages were added, 35 were updated and 27 were removed this month adding to our knowledge about identified issues. Two issue types were added: nondeterministic_checksum_generated_by_coq and nondetermistic_js_output_from_webpack.

After Holger Levsen found hundreds of packages in the bookworm distribution that lack .buildinfo files, he uploaded 404 source packages to the archive (with no meaningful source changes). Currently bookworm now shows only 8 packages without .buildinfo files, and those 8 are fixed in unstable and should migrate shortly. By contrast, Debian unstable will always have packages without .buildinfo files, as this is how they come through the NEW queue. However, as these packages were not built on the official build servers (ie. they were uploaded by the maintainer) they will never migrate to Debian testing. In the future, therefore, testing should never have packages without .buildinfo files again.

Roland Clobus posted yet another in-depth status report about his progress making the Debian Live images build reproducibly to our mailing list. In this update, Roland mentions that “all major desktops build reproducibly with bullseye, bookworm and sid” but also goes on to outline the progress made with automated testing of the generated images using openQA.


GNU Guix

Vagrant Cascadian made a significant number of contributions to GNU Guix:

Elsewhere in GNU Guix, Ludovic Courtès published a paper in the journal The Art, Science, and Engineering of Programming called Building a Secure Software Supply Chain with GNU Guix:

This paper focuses on one research question: how can [Guix]((https://www.gnu.org/software/guix/) and similar systems allow users to securely update their software? […] Our main contribution is a model and tool to authenticate new Git revisions. We further show how, building on Git semantics, we build protections against downgrade attacks and related threats. We explain implementation choices. This work has been deployed in production two years ago, giving us insight on its actual use at scale every day. The Git checkout authentication at its core is applicable beyond the specific use case of Guix, and we think it could benefit to developer teams that use Git.

A full PDF of the text is available.


openSUSE

In the world of openSUSE, SUSE announced at SUSECon that they are preparing to meet SLSA level 4. (SLSA (Supply chain Levels for Software Artifacts) is a new industry-led standardisation effort that aims to protect the integrity of the software supply chain.)

However, at the time of writing, timestamps within RPM archives are not normalised, so bit-for-bit identical reproducible builds are not possible. Some in-toto provenance files published for SUSE’s SLE-15-SP4 as one result of the SLSA level 4 effort. Old binaries are not rebuilt, so only new builds (e.g. maintenance updates) have this metadata added.

Lastly, Bernhard M. Wiedemann posted his usual monthly openSUSE reproducible builds status report.


diffoscope

diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 215, 216 and 217 to Debian unstable. Chris Lamb also made the following changes:

  • New features:

    • Print profile output if we were called with --profile and we were killed via a TERM signal. This should help in situations where diffoscope is terminated due to some sort of timeout. []
    • Support both PyPDF 1.x and 2.x. []
  • Bug fixes:

    • Also catch IndexError exceptions (in addition to ValueError) when parsing .pyc files. (#1012258)
    • Correct the logic for supporting different versions of the argcomplete module. []
  • Output improvements:

    • Don’t leak the (likely-temporary) pathname when comparing PDF documents. []
  • Logging improvements:

    • Update test fixtures for GNU readelf 2.38 (now in Debian unstable). [][]
    • Be more specific about the minimum required version of readelf (ie. binutils), as it appears that this ‘patch’ level version change resulted in a change of output, not the ‘minor’ version. []
    • Use our @skip_unless_tool_is_at_least decorator (NB. at_least) over @skip_if_tool_version_is (NB. is) to fix tests under Debian stable. []
    • Emit a warning if/when we are handling a UNIX TERM signal. []
  • Codebase improvements:

    • Clarify in what situations the main finally block gets called with respect to TERM signal handling. []
    • Clarify control flow in the diffoscope.profiling module. []
    • Correctly package the scripts/ directory. []

In addition, Edward Betts updated a broken link to the RSS on the diffoscope homepage and Vagrant Cascadian updated the diffoscope package in GNU Guix [][][].


Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:


Testing framework

The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:

  • Holger Levsen:

    • Add a package set for packages that use the R programming language [] as well as one for Rust [].
    • Improve package set matching for Python [] and font-related [] packages.
    • Install the lz4, lzop and xz-utils packages on all nodes in order to detect running kernels. []
    • Improve the cleanup mechanisms when testing the reproducibility of Debian Live images. [][]
    • In the automated node health checks, deprioritise the “generic kernel warning”. []
  • Roland Clobus (Debian Live image reproducibility):

    • Add various maintenance jobs to the Jenkins view. []
    • Cleanup old workspaces after 24 hours. []
    • Cleanup temporary workspace and resulting directories. []
    • Implement a number of fixes and improvements around publishing files. [][][]
    • Don’t attempt to preserve the file timestamps when copying artifacts. []

And finally, node maintenance was also performed by Mattia Rizzolo [].


Mailing list and website

On our mailing list this month:

Lastly, Chris Lamb updated the main Reproducible Builds website and documentation in a number of small ways, but primarily published an interview with Hans-Christoph Steiner of the F-Droid project. Chris Lamb also added a Coffeescript example for parsing and using the SOURCE_DATE_EPOCH environment variable []. In addition, Sebastian Crane very-helpfully updated the screenshot of salsa.debian.org’s “request access” button on the How to join the Salsa group. []


Contact

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:




View all our monthly reports