ASN.1 Utilities
Utilities for reading and writing ASN.1-based content, including handling of the
BER and DER encodings.
The Ups and Downs of ASN.1
Pros
- Encoding independence. The `ASN.1` IDL (schema) is entirely decoupled from the wire format. A single schema can be used with `BER`, `DER`, `PER`, `XER`, `JER`, or `OER` without any changes to the type definitions. This makes it uniquely adaptable: the same protocol definition can target a compact binary transport and a human-readable XML or JSON transport simultaneously.
- Highly standardized. `ASN.1` is defined by ITU-T X.680–X.696 and the corresponding ISO 8824/8825 series. Every encoding rule (BER, DER, PER, …) is itself a formal standard, not a convention or implementation detail. This rigour is why `ASN.1` is the foundation of X.509 certificates, PKCS/CMS cryptographic messages, SNMP, LDAP, Kerberos, and most of the LTE/5G air interface protocols: domains where interoperability is non-negotiable and correctness is safety-critical.
- Rich, formal type system with built-in constraints. The IDL supports value range constraints (`INTEGER (1..255)`), size constraints (`UTF8String (SIZE (1..64))`), alphabet constraints (`FROM ("A".."Z")`), component constraints (`WITH COMPONENTS`), and more. Constraints are part of the schema, not application logic, so a conformant codec can validate them automatically.
- Deterministic canonical encoding. `DER` (Distinguished Encoding Rules) is a strict subset of `BER` that produces exactly one valid byte sequence for any value. This property is essential for cryptographic use cases (signature verification, certificate fingerprinting) and for protocol interoperability where two independent implementations must produce identical output.
- Forward and backward compatibility. Extension markers (`...`) allow new fields to be added to `SEQUENCE` and `SET` types without breaking existing decoders. A decoder that does not know about an extension field simply ignores it, so the protocol evolves gracefully over decades, as seen in cellular standards that have been extended continuously since the 1980s.
- Self-describing in BER. The `BER` Type-Length-Value encoding embeds the tag of every value on the wire. A generic `BER` decoder can walk any encoded message and display its structure without the schema, which is useful for diagnostics and protocol analysis tools.
- Information Object Classes. The class mechanism (X.681) allows strongly typed, parameterised table-driven protocols: a `SEQUENCE` field can be constrained to carry exactly the type identified by a companion OID field, enforcing relationships between fields at the schema level rather than at runtime.
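
The self-describing property of BER can be illustrated with a small schema-less TLV walker. This is a sketch in Python for illustration only and is not this library's API: the names `read_tlv` and `walk` are hypothetical, and it assumes low tag numbers (below 31) and definite lengths.

```python
def read_tlv(data, pos):
    """Read one definite-length TLV starting at pos.

    Returns (tag_class, constructed, tag_number, value_bytes, next_pos).
    Assumes low tag numbers (< 31) and definite lengths; both are
    simplifications made for this sketch.
    """
    first = data[pos]
    tag_class = first >> 6            # 0=UNIVERSAL, 1=APPLICATION, 2=CONTEXT, 3=PRIVATE
    constructed = bool(first & 0x20)  # this bit marks a constructed encoding
    tag_number = first & 0x1F
    if tag_number == 0x1F:
        raise ValueError("high tag numbers are not handled in this sketch")
    pos += 1
    length = data[pos]
    pos += 1
    if length == 0x80:
        raise ValueError("indefinite length is not handled in this sketch")
    if length & 0x80:                 # long form: low 7 bits count the length octets
        count = length & 0x7F
        length = int.from_bytes(data[pos:pos + count], "big")
        pos += count
    end = pos + length
    if end > len(data):
        raise ValueError("truncated value")
    return tag_class, constructed, tag_number, data[pos:end], end


def walk(data, depth=0, out=None):
    """Collect one indented line per TLV, recursing into constructed values."""
    out = [] if out is None else out
    pos = 0
    while pos < len(data):
        cls, constructed, number, value, pos = read_tlv(data, pos)
        kind = "constructed" if constructed else "primitive"
        out.append(f"{'  ' * depth}class={cls} tag={number} {kind} len={len(value)}")
        if constructed:
            walk(value, depth + 1, out)
    return out


# SEQUENCE { INTEGER 5, UTF8String "hi" } encoded as DER
encoded = bytes([0x30, 0x07, 0x02, 0x01, 0x05, 0x0C, 0x02, 0x68, 0x69])
lines = walk(encoded)
print("\n".join(lines))
```

The walker never consults a schema: the tag class, the constructed bit and the length are enough to recover the nesting. Mapping tag 16 to `SEQUENCE` or tag 2 to `INTEGER` is only possible for `UNIVERSAL` tags; context-specific tags still need the schema to be interpreted.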
Cons
- The `ASN.1` IDL syntax is very complicated. The grammar has evolved over 40 years of ITU/ISO standards and accumulated significant complexity: multiple ways to write the same construct, context-sensitive parsing rules, and corner cases that trip up even experienced implementors. Reading a real-world `.asn1` file (e.g. from a 3GPP spec) requires considerable domain knowledge.
- Very limited OSS tooling. Unlike `protobuf` or `JSON`, the `ASN.1` ecosystem is dominated by expensive commercial compilers. Open-source alternatives for IDL parsing, `BER`/`DER` codecs, and code generation are sparse and often incomplete. This library exists in part to address that gap for the JVM.
- Non-human-readable binary encodings. `BER` and `DER` bytes are opaque without a decoder. While `XER` and `JER` exist, they are rarely used in practice, so debugging live traffic requires specialised tooling rather than a plain text editor or `curl`.
- Steep learning curve. The combination of the complex IDL, multiple encoding rules with subtle differences, and sparse documentation means onboarding a new developer is significantly harder than for `protobuf` or `JSON`-based systems.
- Niche community. `ASN.1` expertise is concentrated in telecom, cryptography, and defense. General-purpose web and backend development communities have almost no exposure to it, making recruitment and knowledge transfer harder.
Comparison to alternatives
vs. protobuf (and gRPC)
Protobuf is the most direct competitor: both are binary, IDL-first, and
designed for compact, efficient encoding.
| Aspect | ASN.1 (DER/BER) | Protobuf |
|---|---|---|
| Standardisation | ITU-T/ISO formal standard | Google-maintained, de-facto |
| Schema language | Rich, complex; constraints, classes, multiple encoding rules | Simple, approachable; field numbers, oneof, maps |
| Encoding options | BER, DER, PER, XER, JER, OER (all standardised) | Binary wire format + JSON mapping |
| Canonical encoding | DER is fully deterministic, as required for crypto | Not canonical by default; field order is unspecified |
| Compactness | PER is among the most compact binary formats that exist | Compact, but not as tunable as Aligned/Unaligned PER |
| Tooling | Sparse OSS ecosystem; commercial tools dominate | Excellent: protoc, plugins for every major language |
| Constraints | First-class schema-level (size, value range, alphabet) | None in schema; must validate in application code |
| Extensibility | Extension markers (`...`) with formal semantics | New fields with new field numbers; unknown fields forwarded |
| Adoption | Telecom, PKI, SNMP, LDAP, Kerberos | Microservices, gRPC, storage formats (BigQuery, etc.) |
Bottom line: if you are designing a new service inside a modern infrastructure,
protobuf is almost always the better practical choice — the tooling, community,
and simplicity win. ASN.1 is the right answer when the protocol must be
formally standardised, interoperate with existing telecom or security
infrastructure, or requires canonical binary encoding (e.g. for signing).
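
To make the canonical-encoding point concrete, here is a minimal sketch of how a DER encoder might serialise an `INTEGER`. It is written in Python for illustration only and is not this library's API; `der_length` and `der_integer` are hypothetical names. The sketch always picks the shortest content and the shortest length form, so each value has exactly one encoding.

```python
def der_length(n):
    """Encode a length as DER requires: short form below 128, else minimal long form."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def der_integer(value):
    """Encode an INTEGER: tag 0x02, then the shortest two's-complement content."""
    # Number of magnitude bits, leaving room for one sign bit on top.
    bits = value.bit_length() if value >= 0 else (~value).bit_length()
    size = bits // 8 + 1
    content = value.to_bytes(size, "big", signed=True)
    return b"\x02" + der_length(len(content)) + content


for v in (0, 127, 128, -128, -129):
    print(v, der_integer(v).hex(" "))
```

BER would also accept the length of a one-byte value written in long form (`81 01` instead of `01`), which is exactly the kind of freedom DER removes. Two independent DER encoders therefore produce the same bytes, which is what makes signing the encoded form safe.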
vs. JSON (e.g. Jackson)
JSON is the de-facto standard for web APIs and configuration; Jackson is the
dominant Java library for working with it.
| Aspect | ASN.1 (DER/BER) | JSON / Jackson |
|---|---|---|
| Human readability | Not readable without a decoder | Fully human-readable |
| Compactness | Highly compact (binary TLV or PER) | Verbose; field names repeated in every message |
| Schema | Formal IDL with enforced constraints | Optional (JSON Schema); rarely enforced at codec level |
| Type fidelity | Full: distinct integer sizes, binary data, OIDs, dates | Weak: no integers vs. floats, no native binary, no date type |
| Number precision | Arbitrary-precision integers, exact decimals | IEEE 754 doubles; large integers lose precision |
| Binary data | Native `OCTET STRING` and `BIT STRING` types | Requires Base64 encoding; 33 % size overhead |
| Tooling / ecosystem | Sparse | Ubiquitous; every language has multiple mature libraries |
| Developer experience | High barrier; specialist knowledge required | Minimal barrier; every web developer knows JSON |
| Interoperability | Guaranteed by formal standards | Loose; behaviour differences across parsers (number limits, key order, duplicate keys) |
| Streaming / partial decode | BER supports indefinite-length encoding | Requires streaming JSON parsers; less standardised |
Bottom line: JSON/Jackson wins on developer experience, ecosystem breadth, and
debuggability for everything that lives near a browser or HTTP API. ASN.1
wins on encoding efficiency, type safety, and formal correctness for
bandwidth-constrained or security-critical protocols. For a JVM microservice
exchanging data over HTTP, JSON is the pragmatic default; for an IoT device
sending telemetry over a constrained radio link, ASN.1 PER is often worth
the complexity.
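
Two of the rows in the table above (number precision and Base64 overhead) can be checked with a few lines of arithmetic. The sketch below uses Python's standard library only. Python's own `json` module keeps integers exact, so `parse_int=float` is used here to emulate a parser that stores every number as an IEEE 754 double, as JavaScript's `JSON.parse` does.

```python
import base64
import json

# An integer just above 2**53 cannot be represented exactly as a double.
big = 2**53 + 1
text = json.dumps({"n": big})

exact = json.loads(text)["n"]                    # arbitrary-precision int
lossy = json.loads(text, parse_int=float)["n"]   # emulates a double-based parser

print(exact == big)        # the exact parser round-trips the value
print(int(lossy) == big)   # the double-based one does not

# Base64 turns every 3 input bytes into 4 output characters.
payload = bytes(300)
encoded = base64.b64encode(payload)
overhead = len(encoded) / len(payload) - 1
print(f"{len(payload)} bytes -> {len(encoded)} characters ({overhead:.0%} overhead)")
```

An ASN.1 `INTEGER` or `OCTET STRING` carries the same values without either loss: the integer content is variable-length two's complement, and the octets are written as-is after the tag and length.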
The ASN.1 Syntax
There is plenty of material for figuring out the ASN.1 DSL syntax. Among others, I have used various collections of ASN.1 files found on the internet, and publicly available books.
First a number of terms and what is meant by them:
- `identifier`: Identifiers are strings of the form `[a-zA-Z][a-zA-Z0-9]*(-[a-zA-Z0-9]+)*`.
- `Type-Name`: Types are identified by an identifier that starts with an upper-case letter and names a type, either one specified somewhere in the ASN.1 IDL or one from the `UNIVERSAL` namespace. Note that the `UNIVERSAL` type namespace has a number of types whose names contain a space, like `OCTET STRING`, `BIT STRING` or `OBJECT IDENTIFIER`. Only these universally known types may contain the space character as part of the name.
- `value-name`: Values (non-type fields and identifiers) start with a lower-case letter, and identify a name pointing to a value with a type.
- `SubType`: A modification of a type, usually enclosed in `(...)`, but sometimes not; exceptions will be mentioned later.
- String literals may be:
  - `"Double quoted for unicode string"`
  - `'0100'B` for encoding bit strings, with single quotes.
  - `'917421436587'H` for hexadecimal-encoded octet strings, with single quotes.
  - `'foo'` or `{'foo' 9 'bar'}` using single quotes, or a bracket-enclosed sequence of single-quoted strings and decimal code points, for `ASCII`-encoded strings (aka IA5 in the ASN.1 spec), where the code points can represent non-printable characters.
- `Integer` numbers are of the form `0|-?[1-9][0-9]*`, and can represent all
- It is not allowed to represent decimal form numbers (with a dot), but are defined
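
The lexical rules in the list above can be sketched as a small token classifier. This is an illustration in Python under stated assumptions, not this library's lexer: `classify` and `hstring_to_bytes` are hypothetical names, keywords such as `SEQUENCE` and the space-containing `UNIVERSAL` names are not special-cased, and hex strings with an odd number of digits are simply rejected.

```python
import re

# Identifier pattern from the list above: letters and digits, with single
# hyphens allowed between groups (never leading, trailing, or doubled).
IDENTIFIER = re.compile(r"[a-zA-Z][a-zA-Z0-9]*(-[a-zA-Z0-9]+)*")
BSTRING = re.compile(r"'([01]*)'B")        # e.g. '0100'B
HSTRING = re.compile(r"'([0-9A-F]*)'H")    # e.g. '917421436587'H


def classify(token):
    """Return 'type', 'value', 'bstring', 'hstring', or 'unknown'."""
    if IDENTIFIER.fullmatch(token):
        # Upper-case first letter names a type, lower-case names a value.
        return "type" if token[0].isupper() else "value"
    if BSTRING.fullmatch(token):
        return "bstring"
    if HSTRING.fullmatch(token):
        return "hstring"
    return "unknown"


def hstring_to_bytes(token):
    """Decode a 'xxxx'H literal into bytes (even digit count only in this sketch)."""
    match = HSTRING.fullmatch(token)
    if match is None:
        raise ValueError(f"not a hex string literal: {token!r}")
    return bytes.fromhex(match.group(1))


for tok in ("Type-Name", "value-name", "'0100'B", "'917421436587'H", "bad--name"):
    print(tok, "->", classify(tok))
```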
Each ASN.1 file is referenced as a Module.
Module-Name { module object id }
DEFINITIONS
    -- add file options here
    -- ( AUTOMATIC | IMPLICIT | EXPLICIT )? TAGS
::= BEGIN
    -- put stuff here
END
Note that the ID part is not mandatory, but if it is missing, this ASN.1 spec may not be imported into other ASN.1 spec files. If EXPORTS is specified, then ONLY those types and values may be referenced outside this module, though if no EXPORTS are present, then ALL types and values may be referenced outside the module.
EXPORTS Type-Name, Type-Name2;
Any type or value that is not defined in the module itself or in the UNIVERSAL type namespace
must be imported before it can be used in this module.
IMPORTS
Type-Name, value-name FROM Other-File-Spec { id }
Type-Name2 FROM Third-File-Spec { id };
Note that the ID part is mandatory when importing.
Type-Name ::= `Type-Definition`
Class-Name ::= CLASS `Class-Specification`
Type-Set `Type-Specification` ::= { v1 | v2 }
value-name ::= `Type-Specification` `value`
value-name `Type-Specification` ::= `value`
value-name `Type-Specification` ::= { `value specification` }
Here I use the term Type definition for how a type is defined, while the term Type specification is a reference to
a type with optional sub-typing. The latter has less flexibility when it comes to what it can specify, but both of
them can take a type and modify it before using it in its relative position.
And yes, for some reason both ways of defining values are allowed (:facepalm:).
Classes
Class-Name ::= CLASS {
    &Class-Type,
    &class-value Value-Type
} WITH SYNTAX {
    [`optional string containing one or more fields from above`]
    `string containing fields from above`
    `in total all fields must be referenced exactly once`
}
class-Value Class-Name ::= {
    SYNTAX WORD value-name
    MORE WORDS Type-Name
    EVEN MORE WORDS 42
}
Class-Object-Set Class-Name ::= { class-Value | class-Value-2, ... }
-- AFAIK only SEQUENCE types may be templated in this way.
Templated-Type ::= SEQUENCE {
    identifier Class-Name.&class-value({Class-Object-Set}),
    argument Class-Name.&Class-Type({Class-Object-Set}{@identifier})
}
- TODO: Figure out the `ANY DEFINED BY` type. It looks like the only reasonable way to represent it is to have a TLV of unknown type, and let the implementers connect the dots.
Constructed types.
There are three kinds of constructed types:

- `CHOICE`
- `SEQUENCE`
- `SET`
Constructed-Type ::= Base-Type {
    component Component-Type,
    ...,                        -- either this or ...
    ...! err1 err2,             -- <- exception marker ?????
    extended Component-Type-2   -- for version2
}
err1 INTEGER ::= 501
err2 INTEGER ::= 503
Constructed types and extensibility.
Normally constructed types have a fixed component set, but by either
adding the ... extensible-marker component, or by setting.
Choice
A choice is a meta-type that signifies the presence of exactly one of a set of alternative values.
Choice-Type ::= CHOICE {
    first-choice [0] Value-Type-1,
    second-choice [1] Value-Type-2
}
- Type matching a `CHOICE` is the same as matching any of its fields, and it is not encoded into its own constructed group unless used in an EXPLICIT numbered field in a sequence or set.
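
Because a `CHOICE` has no tag of its own, a decoder can dispatch directly on the tag of whichever alternative is present. The sketch below shows that idea in Python. It is an illustration only, not this library's API: `CHOICE_ALTERNATIVES` and `match_choice` are hypothetical names, and it assumes the alternatives are IMPLICIT-tagged primitive values with short-form lengths.

```python
# Hypothetical lookup table for the Choice-Type above:
# context-specific tag number -> alternative name.
CHOICE_ALTERNATIVES = {0: "first-choice", 1: "second-choice"}

CONTEXT_SPECIFIC = 2  # value of the two class bits for context-specific tags


def match_choice(tlv, alternatives=CHOICE_ALTERNATIVES):
    """Return (alternative name, content bytes) for one encoded CHOICE value.

    Assumes low tag numbers and a short-form length (content under 128 bytes).
    """
    first = tlv[0]
    tag_class = first >> 6
    tag_number = first & 0x1F
    if tag_class != CONTEXT_SPECIFIC or tag_number not in alternatives:
        raise ValueError(f"tag byte 0x{first:02x} matches no alternative")
    length = tlv[1]
    if length & 0x80:
        raise ValueError("long-form lengths are not handled in this sketch")
    return alternatives[tag_number], tlv[2:2 + length]


# [1] IMPLICIT tag over a two-byte primitive value: 81 02 68 69
print(match_choice(bytes([0x81, 0x02, 0x68, 0x69])))
```

There is no wrapper TLV around the chosen alternative here: the first tag byte the decoder sees already belongs to `first-choice` or `second-choice`.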
Sequence and Set
Sequences and sets are pretty similar, except that the SEQUENCE type can take advantage of the fact that fields must come in the order they were defined.
Sub-Types and constraints
String-Type ::= UTF8String (SIZE (from..to))      -- bounded
String-Type ::= UTF8String (SIZE (from..MAX))     -- unbounded on higher numbers
String-Type ::= UTF8String (SIZE (fixed))         -- fixed size
String-Type ::= IA5String (FROM ("A".."Z"))       -- limited alphabet
String-Type ::= IA5String (FROM ("0".."9" | "A".."Z")) (SIZE (1..MAX)) -- limited alphabet, non-empty
Integer-Type ::= INTEGER (min..max)               -- integer with bounded value.
Integer-Type ::= INTEGER (min..MAX)               -- integer with lower-bound value.
Integer-Type ::= INTEGER (MIN..max)               -- integer with upper-bound value.
Sequence-Type ::= Origin-Type (WITH COMPONENTS {..., `additional components`})
For extending a sequence with additional components: They will be inserted where the extension point is, or at the end if