Draft ECMA-xxx

1st Edition / July 5, 2025

Package-URL Specification

About this Specification

The document at https://tc54.org/ecmaXXX/ is the most accurate and up-to-date Package-URL specification.

This document is available as a single page and as multiple pages.

Contributing to this Specification

This specification is developed on GitHub with the help of the Package-URL community. There are a number of ways to contribute to the development of this specification.

Refer to the colophon for more information on how this document is created.

Introduction

Software ecosystems have evolved into highly interconnected networks of components, packages, and dependencies. Managing this complexity demands a robust, uniform mechanism to identify and track software packages across diverse ecosystems and tools. Package-URL (PURL) was developed to address this challenge by providing a simple, consistent, and flexible approach to identifying software packages with precision and clarity.

PURL introduces a standardized URL-based syntax that uniquely identifies software packages, independent of their ecosystem or distribution channel. Unlike traditional identification methods, PURL embeds critical metadata directly into its structure, enabling efficient, accurate package identification at scale. This standardization ensures interoperability between tools and ecosystems, fostering greater collaboration and reducing ambiguity in software supply chain management.

Challenges addressed by PURL:

As software supply chain security becomes a global priority, formalizing PURL as an international standard ensures its adoption and consistent implementation. Standardization under Ecma International Technical Committee 54 (TC54) positions PURL as a foundational building block for secure, transparent, and efficient software ecosystems worldwide.

By enabling a universally recognized and implementable specification, PURL aligns with global efforts to improve the security, reliability, and accountability of software supply chains. Its adoption ensures that organizations and developers can rely on a common language to manage software packages across the diverse and rapidly evolving software landscape.

1 Scope

This Standard defines the Package-URL specification.

2 Conformance

2.1 Requirements Terminology

In this standard, the words that are used to define the significance of each requirement are detailed below. These words are used in accordance with their definitions in RFC 2119, and their respective meanings are reproduced below:

  • Must: This word, or the adjective "required" and the auxiliary verb "shall", means that the item is an absolute requirement of the standard.
  • Should: This word, or the adjective "recommended", means that there might exist valid reasons in particular circumstances to ignore this item, but the full implications should be understood and the case carefully weighed before making an implementation decision.
  • May: This word, or the adjective "optional", means that this item is truly optional.

The words "must not", "shall not", "should not", and "not recommended", are the negative forms of "must", "shall", "should", and "recommended", respectively. There is no negative form of "may".

2.2 Implementation Conformance

A conforming implementation of Package-URL (PURL) must fully implement and support all elements defined within this specification, including the syntax, components, and semantic requirements for constructing and interpreting valid PURLs.

A conforming implementation of PURL must adhere to the syntax defined in this specification, ensuring that all PURLs are parsed, constructed, and validated according to the prescribed rules. The implementation must provide full support for ecosystem-agnostic behaviour, enabling PURLs to function consistently and reliably across diverse environments.

All required components of a PURL, such as the scheme, type, and name, must be present and validated according to the rules defined in this specification. Additionally, optional components, including qualifiers and subpaths, must be handled appropriately if provided, in full compliance with their specified behaviours.

Implementations must ensure that equivalent PURLs are consistently resolved to the same canonical representation. This includes strict adherence to normalisation and equivalence rules. Furthermore, implementations must process URI encoding and decoding for PURL components according to the standards outlined in RFC 3986.

Invalid PURLs that fail to conform to the specification must be identified and rejected by any conforming implementation. This guarantees the integrity and reliability of PURLs in all supported contexts.

A conforming implementation of PURL may extend its functionality by providing ecosystem-specific validation, processing, or metadata handling, as long as these extensions do not violate the core specification. Additionally, implementations may offer auxiliary tools or features, such as utilities for constructing or validating PURLs, provided they align with the standard's requirements.

A conforming implementation must not redefine or alter the core syntax, components, or semantics defined by this specification. Any prohibited extensions explicitly identified in the specification must not be implemented. Furthermore, behaviours that compromise the interoperability of PURLs across tools, platforms, or ecosystems are strictly disallowed.

A conforming implementation of Package-URL may choose to implement or not implement Normative Optional subclauses. If any Normative Optional behaviour is implemented, all of the behaviour in the containing Normative Optional clause must be implemented. A Normative Optional clause is denoted in this specification with the words "Normative Optional" in a coloured box, as shown below.

2.3 Example Normative Optional Clause Heading

Example clause contents.

3 Normative References

The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

RFC 3986, Uniform Resource Identifier (URI): Generic Syntax.
https://datatracker.ietf.org/doc/html/rfc3986

4 Overview

This section contains a non-normative overview of the Package-URL specification.

The Package-URL (PURL) specification defines a lightweight, universal syntax for identifying software packages. By leveraging a URL-based format, PURL provides a consistent and interoperable mechanism for referencing software packages across a wide range of ecosystems and tools. Its design addresses the challenges of ambiguity, inconsistency, and fragmentation in software package identification, enabling better interoperability and traceability in modern software supply chains.

This specification focuses on the core aspects of PURL, including its syntax, required components, optional attributes, and conformance requirements. It does not cover ecosystem-specific types or extensions such as PURL Version Ranges (VERS). However, the flexibility of PURL allows it to be extended to meet the needs of diverse package ecosystems without compromising its universal applicability.

The primary audience for this specification includes developers, tool implementers, and organisations involved in software composition analysis, dependency management, and supply chain security. PURL is foundational to a variety of use cases, from software bill of materials (SBOM) generation and license compliance to vulnerability tracking and software artifact exchange.

While this document serves as the authoritative reference for implementing PURL, it is complemented by various ecosystem-specific guidance documents, examples, and related standards. These resources provide additional context and practical insights for leveraging PURL effectively.

This overview is non-normative and serves to provide context for the specification’s intent, purpose, and audience. For detailed requirements and conformance criteria, refer to the normative sections of this specification.

5 Package-URL Specification

PURL stands for Package-URL.

A purl is a URL composed of seven components:

Table 1: Components of a PURL

| Component | Requirement | Description |
|---|---|---|
| scheme | Required | The URL scheme with the constant value of "pkg". One of the primary reasons for this single scheme is to facilitate the future official registration of the "pkg" scheme for package URLs. |
| type | Required | The package "type" or package "protocol" such as maven, npm, nuget, gem, pypi, etc. |
| namespace | Optional | A name prefix such as a Maven groupid, a Docker image owner, a GitHub user or organization. Namespace is type-specific. |
| name | Required | The name of the package. |
| version | Optional | The version of the package. |
| qualifiers | Optional | Qualifier data for a package such as OS, architecture, repository, etc. Qualifiers are type-specific. |
| subpath | Optional | Subpath within a package, relative to the package root. |

Components are separated by a specific character for unambiguous parsing. Components are designed such that they form a hierarchy from the most significant on the left to the least significant on the right.
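
As a non-normative illustration, the seven components could be held in a small record type. This is a minimal sketch: the class name `PackageURL`, its field layout, and the sample values are hypothetical and are not defined by this specification.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class PackageURL:
    """Hypothetical container for the seven purl components (decoded values)."""

    type: str                         # required
    name: str                         # required
    namespace: Optional[str] = None   # optional, type-specific
    version: Optional[str] = None     # optional
    qualifiers: Dict[str, str] = field(default_factory=dict)  # optional
    subpath: Optional[str] = None     # optional
    scheme: str = "pkg"               # required, constant value


# Illustrative values only; the hierarchy runs scheme, type, namespace,
# name, version, qualifiers, subpath from most to least significant.
example = PackageURL(
    type="maven",
    namespace="org.apache.commons",
    name="io",
    version="1.3.4",
)
```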

5.1 A PURL is a URL

  • A purl is a valid URL and URI that conforms to the URL definitions or specifications at:
    • https://tools.ietf.org/html/rfc3986
    • https://en.wikipedia.org/wiki/URL#Syntax
    • https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Syntax
    • https://url.spec.whatwg.org/
  • A purl is a valid URL because it is a locator even though it has no Authority URL component: each type has a default repository location when defined.
  • The purl components are mapped to these URL components:
    • scheme: this is a URL scheme with a constant value: pkg
    • purl type, namespace, name and version components: these are collectively mapped to a URL path
    • purl qualifiers: this maps to a URL query
    • purl subpath: this is a URL fragment
  • Special URL schemes as defined in https://url.spec.whatwg.org/ such as file://, https://, http:// and ftp:// are NOT valid purl types. They are valid URL or URI schemes but they are not purl. They may be used to reference URLs in separate attributes outside of a purl or in a purl qualifier.
  • Version control system (VCS) URLs such as git://, svn://, hg:// or as defined in Python pip or SPDX download locations are NOT valid purl types. They are valid URL or URI schemes but they are not purl. They are a closely related, compact and uniform way to reference VCS URLs. They may be used as references in separate attributes outside of a purl or in a purl qualifier.
  • A purl must NOT contain a URL Authority because there is no support for username, password, host or port components. A namespace segment may sometimes look like a host, but its interpretation is specific to a type.
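
The mapping above can be observed with a generic URL parser. The following non-normative sketch uses `urlsplit` from Python's standard library on an illustrative purl string; the sample value is an assumption chosen for demonstration only.

```python
from urllib.parse import urlsplit

# Illustrative purl; the values are not taken from this specification.
parts = urlsplit("pkg:npm/foo@1.0?arch=x86#lib")

print(parts.scheme)    # the purl scheme
print(parts.netloc)    # empty: a purl has no Authority component
print(parts.path)      # type, namespace, name and version together
print(parts.query)     # the purl qualifiers
print(parts.fragment)  # the purl subpath
```

A generic URL parser only recovers the coarse URL components; splitting the path into type, namespace, name and version still requires purl-specific logic.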

5.2 Permitted characters

A canonical purl is composed of these permitted ASCII characters:

  • the Alphanumeric Characters: A to Z, a to z, 0 to 9,
  • the Punctuation Characters: .-_~ (period '.', dash '-', underscore '_' and tilde '~'),
  • the Plus Character: + (plus '+'),
  • the Percent Character: % (percent sign '%'), and
  • the Separator Characters :/@?=&# (colon ':', slash '/', at sign '@', question mark '?', equal sign '=', ampersand '&' and pound sign '#').
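
A non-normative sketch of a character check built from the classes listed above. The constant and function names are hypothetical, and the tilde is included in the Punctuation Characters because the list names it.

```python
import string

ALPHANUMERIC = string.ascii_letters + string.digits
PUNCTUATION = ".-_~"
PLUS = "+"
PERCENT = "%"
SEPARATORS = ":/@?=&#"

PERMITTED = frozenset(ALPHANUMERIC + PUNCTUATION + PLUS + PERCENT + SEPARATORS)


def uses_only_permitted_characters(purl: str) -> bool:
    """Return True when every character belongs to the permitted ASCII set."""
    return all(ch in PERMITTED for ch in purl)
```

This check is necessary but not sufficient for validity: a string can use only permitted characters and still violate the component rules.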

5.3 Separator characters

A canonical purl uses the following separator characters:

  • ':' (colon) is the separator between scheme and type
  • '/' (slash) is the separator between type, namespace and name
  • '/' (slash) is the separator between subpath segments
  • '@' (at sign) is the separator between name and version
  • '?' (question mark) is the separator before qualifiers
  • '=' (equals) is the separator between a key and a value of a qualifier
  • '&' (ampersand) is the separator between qualifiers (each being a key=value pair)
  • '#' (number sign) is the separator before subpath
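
The following non-normative sketch splits a canonical purl string on these separators and returns the raw, still percent-encoded components. The function name `split_purl` and the order in which the separators are applied are assumptions made for illustration; this section does not prescribe a parsing algorithm.

```python
from typing import Dict, Union


def split_purl(purl: str) -> Dict[str, Union[str, Dict[str, str]]]:
    """Split a canonical purl into raw (undecoded) components."""
    rest, _, subpath = purl.partition("#")      # '#' precedes the subpath
    rest, _, raw_quals = rest.partition("?")    # '?' precedes the qualifiers
    scheme, _, rest = rest.partition(":")       # ':' separates scheme and type
    rest = rest.strip("/")
    type_, _, rest = rest.partition("/")        # first '/' ends the type
    version = ""
    if "@" in rest:
        rest, _, version = rest.rpartition("@")  # '@' precedes the version
    namespace, _, name = rest.rpartition("/")    # last '/' precedes the name

    qualifiers: Dict[str, str] = {}
    for pair in raw_quals.split("&"):            # '&' separates key=value pairs
        if pair:
            key, _, value = pair.partition("=")  # '=' separates key and value
            qualifiers[key] = value

    return {
        "scheme": scheme,
        "type": type_,
        "namespace": namespace,
        "name": name,
        "version": version,
        "qualifiers": qualifiers,
        "subpath": subpath,
    }
```

No validation or percent-decoding is performed here; the sketch only shows where each separator falls.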

5.4 Character encoding

  • In the "Component-level rules" section, each component defines when and how to apply percent-encoding and decoding to its content.
  • When percent-encoding is required by a component definition, the component string MUST first be encoded as UTF-8.
  • In the component string, each "data octet" MUST be replaced by the percent-encoded "character triplet" applying the percent-encoding mechanism defined in RFC 3986 section 2.1 (https://datatracker.ietf.org/doc/html/rfc3986#section-2.1), including the RFC definition of "data octet" and "character triplet", and using these definitions for RFC's "allowed set" and "delimiters":
    • "allowed set" is composed of the Alphanumeric Characters and the Punctuation Characters
    • "delimiters" is composed of the Separator Characters
  • The following characters MUST NOT be percent-encoded:
    • the Alphanumeric Characters,
    • the Punctuation Characters,
    • the Separator Characters when being used as purl separators,
    • the colon ':', whether used as a Separator Character or otherwise, and
    • the percent sign '%' when used to represent a percent-encoded character.
  • Where the space ' ' is permitted, it MUST be percent-encoded as '%20'.
  • With the exception of the percent-encoding mechanism, the rules regarding percent-encoding are defined by this specification alone.
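
A non-normative sketch of these encoding rules using `quote` and `unquote` from Python's standard library. `quote` encodes the string as UTF-8 and never encodes ASCII letters, digits, or the characters `_.-~`; passing `safe=":"` additionally leaves the colon unencoded. The helper names are hypothetical, and encoding the plus sign as `%2B` is an assumption that follows from the "allowed set" given above.

```python
from urllib.parse import quote, unquote


def encode_component(value: str) -> str:
    """Percent-encode a decoded component string (UTF-8 first, then triplets)."""
    # safe=":" replaces the default safe="/" so that '/' inside a value is encoded.
    return quote(value, safe=":", encoding="utf-8")


def decode_component(value: str) -> str:
    """Reverse encode_component()."""
    return unquote(value, encoding="utf-8")
```

For example, `encode_component("a b")` yields `a%20b` and `encode_component("@angular")` yields `%40angular`. Separators that act as purl separators are added when the components are joined, not by these helpers.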

5.5 Component-level rules

A purl string is an ASCII URL string composed of seven components. Except as expressly stated otherwise in this section, each component:

  • MAY be composed of any of the characters defined in the "Permitted characters" section
  • MUST be encoded as defined in the "Character encoding" section

The rules for each component are:

5.5.1 Scheme

  • The scheme is a constant with the value "pkg".
  • The scheme MUST be followed by an unencoded colon ':'.
  • PURL parsers MUST accept URLs where the scheme and colon ':' are followed by one or more slash '/' characters, such as 'pkg://', and MUST ignore and remove all such '/' characters.
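
A non-normative sketch of the scheme rules. The function name is hypothetical, and the scheme is compared exactly against "pkg" because this section does not state whether other letter cases are accepted.

```python
def strip_scheme(purl: str) -> str:
    """Remove 'pkg:' and any slashes that follow it; return the remainder."""
    scheme, colon, rest = purl.partition(":")
    if colon != ":" or scheme != "pkg":
        raise ValueError("a purl must start with the scheme 'pkg' followed by ':'")
    # 'pkg://npm/foo' and 'pkg:npm/foo' are treated identically.
    return rest.lstrip("/")
```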

5.5.2 Type

  • The package type MUST be composed only of ASCII letters and numbers, period '.', plus '+', and dash '-'.
  • The type MUST start with an ASCII letter.
  • The type MUST NOT be percent-encoded.
  • The type is case insensitive. The canonical form is lowercase.
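
A non-normative sketch of the type rules as a regular expression plus lowercasing. The names `TYPE_PATTERN` and `normalize_type` are hypothetical.

```python
import re

# An ASCII letter, then ASCII letters, digits, period, plus or dash.
TYPE_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9.+-]*")


def normalize_type(raw_type: str) -> str:
    """Validate a type and return its canonical lowercase form."""
    if TYPE_PATTERN.fullmatch(raw_type) is None:
        raise ValueError(f"invalid purl type: {raw_type!r}")
    return raw_type.lower()
```

Because a percent sign is not in the permitted set, a percent-encoded type is rejected by the same pattern.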

5.5.3 Namespace

  • The namespace is optional, unless required by the package's type definition.
  • If present, the namespace MAY contain one or more segments, separated by a single unencoded slash '/' character.
  • All leading and trailing slashes '/' are not significant and SHOULD be stripped in the canonical form. They are not part of the namespace.
  • Each namespace segment MUST be a percent-encoded string.
  • When percent-decoded, a segment:
    • MUST NOT contain any slash '/' characters.
    • MUST NOT be empty.
    • MAY contain any Unicode character other than '/' unless the package's type definition provides otherwise.
  • A URL host or Authority MUST NOT be used as a namespace. Use instead a repository_url qualifier. Note however that for some types, the namespace may look like a host.
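
A non-normative sketch of namespace handling under the rules above. The function names are hypothetical, type-specific restrictions are not modelled, and the sample values in the usage note are illustrative.

```python
from typing import List
from urllib.parse import quote, unquote


def parse_namespace(raw: str) -> List[str]:
    """Split an encoded namespace into decoded segments."""
    stripped = raw.strip("/")  # leading and trailing slashes are not significant
    if not stripped:
        return []
    segments = []
    for encoded in stripped.split("/"):
        segment = unquote(encoded)
        if segment == "":
            raise ValueError("namespace segments must not be empty")
        if "/" in segment:
            raise ValueError("a decoded namespace segment must not contain '/'")
        segments.append(segment)
    return segments


def format_namespace(segments: List[str]) -> str:
    """Join decoded segments into the encoded form, one '/' between segments."""
    return "/".join(quote(segment, safe=":") for segment in segments)
```

For instance, `parse_namespace("/%40angular/")` returns `["@angular"]`, and `format_namespace(["@angular"])` returns `%40angular`.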

5.5.4 Name

  • The name is prefixed by a single slash '/' separator when the namespace is not empty.
  • All leading and trailing slashes '/' are not significant and SHOULD be stripped in the canonical form. They are not part of the name.
  • A name MUST be a percent-encoded string.
  • When percent-decoded, a name MAY contain any Unicode character unless prohibited by the package's type definition in PURL-TYPES.rst.

5.5.5 Version

  • The version is prefixed by a '@' separator when not empty.
  • This '@' is not part of the version.
  • A version MUST be a percent-encoded string.
  • When percent-decoded, a version MAY contain any Unicode character unless the package's type definition provides otherwise.
  • A version is a plain and opaque string.

5.5.6 Qualifiers

  • The qualifiers component MUST be prefixed by an unencoded question mark '?' separator when not empty. This '?' separator is not part of the qualifiers component.
  • The qualifiers component is composed of one or more key=value pairs. Multiple key=value pairs MUST be separated by an unencoded ampersand '&'. This '&' separator is not part of an individual qualifier.
  • A key and value MUST be separated by the unencoded equal sign '=' character. This '=' separator is not part of the key or value.
  • A value MUST NOT be an empty string: a key=value pair with an empty value is the same as if no key=value pair exists for this key.
  • For each key=value pair:
    • The key MUST be composed only of lowercase ASCII letters and numbers, period '.', dash '-' and underscore '_'.
    • A key MUST start with an ASCII letter.
    • A key MUST NOT be percent-encoded.
    • Each key MUST be unique among all the keys of the qualifiers component.
    • A value MAY be composed of any character and all characters MUST be encoded as described in the "Character encoding" section.
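
A non-normative sketch of qualifier parsing that applies the key rules, rejects duplicate keys, and treats an empty value as an absent qualifier. The names `KEY_PATTERN` and `parse_qualifiers` are hypothetical, as are the sample keys in the usage note.

```python
import re
from typing import Dict
from urllib.parse import unquote

# A lowercase ASCII letter, then lowercase letters, digits, '.', '-' or '_'.
KEY_PATTERN = re.compile(r"[a-z][a-z0-9._-]*")


def parse_qualifiers(raw: str) -> Dict[str, str]:
    """Parse the text after '?' into a mapping of key to decoded value."""
    qualifiers: Dict[str, str] = {}
    seen = set()
    for pair in raw.split("&"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        if KEY_PATTERN.fullmatch(key) is None:
            raise ValueError(f"invalid qualifier key: {key!r}")
        if key in seen:
            raise ValueError(f"duplicate qualifier key: {key!r}")
        seen.add(key)
        if value == "":
            continue  # an empty value is the same as no pair for this key
        qualifiers[key] = unquote(value)
    return qualifiers
```

For example, `parse_qualifiers("arch=x86_64&empty=")` returns `{"arch": "x86_64"}`.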

5.5.7 Subpath

  • The subpath string is prefixed by a '#' separator when not empty.
  • This '#' is not part of the subpath.
  • The subpath contains zero or more segments, separated by slash '/'.
  • Leading and trailing slashes '/' are not significant and SHOULD be stripped in the canonical form.
  • Each subpath segment MUST be a percent-encoded string.
  • When percent-decoded, a segment:
    • MUST NOT contain a '/'
    • MUST NOT be any of '..' or '.'
    • MUST NOT be empty
  • The subpath MUST be interpreted as relative to the root of the package
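
A non-normative sketch that validates a subpath and returns its canonical encoded form. The function name is hypothetical, and rejecting an invalid segment with an error (rather than silently discarding it) is an assumption of this sketch.

```python
from urllib.parse import quote, unquote


def normalize_subpath(raw: str) -> str:
    """Validate each segment and return the subpath without outer slashes."""
    stripped = raw.strip("/")  # leading and trailing slashes are not significant
    if not stripped:
        return ""
    encoded_segments = []
    for encoded in stripped.split("/"):
        segment = unquote(encoded)
        if segment == "":
            raise ValueError("subpath segments must not be empty")
        if segment in (".", ".."):
            raise ValueError("subpath segments must not be '.' or '..'")
        if "/" in segment:
            raise ValueError("a decoded subpath segment must not contain '/'")
        encoded_segments.append(quote(segment, safe=":"))
    return "/".join(encoded_segments)
```

For example, `normalize_subpath("/src/main/")` returns `src/main`.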

6 PURL Type Schema

Each package manager, platform, type, or ecosystem has its own conventions and protocols to identify, locate, and provision software packages. The package type is the component of a package URL that is used to capture this information with a short string such as maven, npm, nuget, gem, pypi, etc. Known purl type definitions are formalized here independent of the core Package URL specification. See also a candidate list further down.

Definitions can also include types reserved for future use.

The PURL Type JSON Schema is the reference implementation for the Ecma standard.

Table 2: Properties for the root object
Property Type Requirement Description
$id object Schema defining the structure and constraints of a specific PURL type
definitions string Specifies whether this component is required, optional, or prohibited
character_constraints string Regex defining valid characters
case_rules object Defines case sensitivity and normalization rules
properties Determines if case must be preserved or ignored
normalization Defines if values must be normalized to lowercase, uppercase, or kept as provided

Annex A (informative) Colophon

This specification is authored on GitHub in a plaintext source format called Ecmarkup. Ecmarkup is an HTML and Markdown dialect that provides a framework and toolset for authoring ECMA specifications in plaintext and processing the specification into a full-featured HTML rendering that follows the editorial conventions for this document. Ecmarkup builds on and integrates a number of other formats and technologies including Grammarkdown for defining syntax and Ecmarkdown for authoring algorithm steps. PDF renderings of this specification are produced by printing the HTML rendering to a PDF.

We extend our gratitude to TC39 for their exceptional work in developing Ecmarkup, which has greatly facilitated TC54's successful adoption of this tool for the preparation and maintenance of our technical specifications.

Annex B (informative) Bibliography

TODO

Copyright & Software License

Ecma International

Rue du Rhone 114

CH-1204 Geneva

Tel: +41 22 849 6000

Fax: +41 22 849 6001

Web: https://ecma-international.org/

Copyright Notice

© 2025 Ecma International

This draft document may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published, and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this section are included on all such copies and derivative works. However, this document itself may not be modified in any way, including by removing the copyright notice or references to Ecma International, except as needed for the purpose of developing any document or deliverable produced by Ecma International.

This disclaimer is valid only prior to final version of this document. After approval all rights on the standard are reserved by Ecma International.

The limited permissions are granted through the standardization phase and will not be revoked by Ecma International or its successors or assigns during this time.

This document and the information contained herein is provided on an "AS IS" basis and ECMA INTERNATIONAL DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY OWNERSHIP RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Software License

All Software contained in this document ("Software") is protected by copyright and is being made available under the "BSD License", included below. This Software may be subject to third party rights (rights from parties other than Ecma International), including patent rights, and no licenses under such third party rights are granted under this license even if the third party concerned is a member of Ecma International. SEE THE ECMA CODE OF CONDUCT IN PATENT MATTERS AVAILABLE AT https://ecma-international.org/memento/codeofconduct.htm FOR INFORMATION REGARDING THE LICENSING OF PATENT CLAIMS THAT ARE REQUIRED TO IMPLEMENT ECMA INTERNATIONAL STANDARDS.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
  3. Neither the name of the authors nor Ecma International may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE ECMA INTERNATIONAL "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL ECMA INTERNATIONAL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.