En Provence [Some thoughts about npm package provenance - and why I have not enabled it]

Thursday, July 25, 2024

Last year, the GitHub blog outlined efforts to secure the Node.js package repository by Introducing npm package provenance. They write:

In order to increase the level of trust you have in the npm packages you download from the registry you must have visibility into the process by which the source was translated into the published artifact.

This requirement is addressed by npm package provenance:

What we need is a way to draw a direct line from the npm package back to the exact source code commit from which it was derived.

The npm documentation on Generating provenance statements explains the implications:

When a package in the npm registry has established provenance, it does not guarantee the package has no malicious code. Instead, npm provenance provides a verifiable link to the package's source code and build instructions, which developers can then audit and determine whether to trust it or not.

It is important to call out that provenance does NOT:

Establish trustworthiness: A package published with provenance could contain code to delete personal files when installed or imported. And if a package did so, it would be inappropriate to remove provenance as that is not a statement about trust.
Enable reproducible builds: The process used to produce a package can (and frequently does) reference artifacts outside its own repository when creating the package. A common scenario is installing other npm packages without a lock file. Even using a lock file offers no guarantee because dependent packages could reference ephemeral content like a URL, local state, randomness, etc..
Avoid the need to audit package contents: Knowing which repository commit was used to generate a package offers no guarantee about what is actually in the package. So it is necessary to manually audit every file in the package if security is important and trustworthiness needs to be established.
Avoid the need to audit package contents for every new version: Similarly, just because one version of a package was found to be trustworthy, there is no guarantee the next version will be. Therefore, it is necessary to perform a complete audit for every update.

It's also notable that provenance DOES:

Require giving an npm publish token to GitHub: As outlined in the documentation on Publishing packages to the npm registry, provenance requires granting GitHub permission to publish packages on your behalf. This creates a new opportunity for integrity to be compromised (and makes GitHub an attractive target for attackers).
Require bypassing two-factor authentication (2FA) for npm package publish: Multi-factor authentication is widely considered a baseline security practice. As outlined in the documentation on Requiring 2FA for package publishing and settings modification, this security measure must be disabled to give GitHub permission to publish on your behalf (because GitHub does not have access to the second factor).
Require defining and maintaining a GitHub Actions workflow for package publish: This additional effort should not be overwhelming for a package maintainer, but it represents ongoing time and attention that does not add (direct) value for a package's users. Furthermore, this workflow must be replicated across each package.

Considering the advantages and disadvantages, it does not seem to me that introducing npm provenance offers compelling enough benefits (for package consumers or for producers) to offset the cost for a maintainer. (Especially for a maintainer like myself with multiple packages and limited free time.) For now, my intent is to continue using git tags to identify which repository commit is associated with published package versions (e.g., markdownlint tags).

Tags: Node.js Technical