elisp: Bindat Spec
36.20.1 Describing Data Layout
------------------------------
To control unpacking and packing, you write a “data layout
specification”, a special nested list describing named and typed
“fields”. This specification controls the length of each field to be
processed, and how to pack or unpack it. We normally keep bindat specs
in variables whose names end in ‘-bindat-spec’; Emacs automatically
treats variables with such names as risky file-local variables.
A field’s “type” describes the size (in bytes) of the object that the
field represents and, in the case of multibyte fields, how the bytes are
ordered within the field. The two possible orderings are “big endian”
(also known as “network byte ordering”) and “little endian”. For
instance, the number ‘#x23cd’ (decimal 9165) in big endian would be the
two bytes ‘#x23’ ‘#xcd’; and in little endian, ‘#xcd’ ‘#x23’. Here are
the possible type values:
‘u8’
‘byte’
Unsigned byte, with length 1.
‘u16’
‘word’
‘short’
Unsigned integer in network byte order, with length 2.
‘u24’
Unsigned integer in network byte order, with length 3.
‘u32’
‘dword’
‘long’
Unsigned integer in network byte order, with length 4. Note: These
values may be limited by the range of integers that Emacs supports.
‘u16r’
‘u24r’
‘u32r’
Unsigned integer in little endian order, with length 2, 3 and 4,
respectively.
‘str LEN’
String of length LEN.
‘strz LEN’
Zero-terminated string, in a fixed-size field with length LEN.
‘vec LEN [TYPE]’
Vector of LEN elements of type TYPE, defaulting to bytes. The TYPE
is any of the simple types above, or another vector specified as a
list of the form ‘(vec LEN [TYPE])’.
‘ip’
Four-byte vector representing an Internet address. For example:
‘[127 0 0 1]’ for localhost.
‘bits LEN’
List of set bits in LEN bytes. The bytes are taken in big endian
order and the bits are numbered starting with ‘8 * LEN − 1’ and
ending with zero. For example: ‘bits 2’ unpacks ‘#x28’ ‘#x1c’ to
‘(2 3 4 11 13)’ and ‘#x1c’ ‘#x28’ to ‘(3 5 10 11 12)’.
‘(eval FORM)’
FORM is a Lisp expression evaluated at the moment the field is
unpacked or packed. The result of the evaluation should be one of
the above-listed type specifications.
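As a rough illustration of the types above, the following sketch
unpacks a few short byte sequences with single-field specs. The helper
‘my-unpack-one’ and the field name ‘value’ are hypothetical names
introduced only for this example; ‘bindat-unpack’ and
‘bindat-get-field’ come from the ‘bindat’ library.

```elisp
(require 'bindat)

(defun my-unpack-one (type-spec bytes)
  "Unpack BYTES, a list of integers, as one field of TYPE-SPEC.
TYPE-SPEC is a list such as (u16) or (bits 2).  This helper is
hypothetical; it only exists to keep the examples short."
  (bindat-get-field
   ;; Build a one-field spec such as ((value u16)).
   (bindat-unpack (list (cons 'value type-spec))
                  (apply #'unibyte-string bytes))
   'value))

(my-unpack-one '(u16)    '(#x23 #xcd))           ; big endian:    #x23cd = 9165
(my-unpack-one '(u16r)   '(#x23 #xcd))           ; little endian: #xcd23 = 52515
(my-unpack-one '(bits 2) '(#x28 #x1c))           ; (2 3 4 11 13)
(my-unpack-one '(ip)     '(127 0 0 1))           ; [127 0 0 1]
(my-unpack-one '(strz 8) '(?a ?b ?c 0 0 0 0 0))  ; "abc"
```

The same two bytes yield different integers under ‘u16’ and ‘u16r’,
which is the byte-ordering difference described at the start of this
section.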
For a fixed-size field, the length LEN is given as an integer
specifying the number of bytes in the field.
When the length of a field is not fixed, it typically depends on the
value of a preceding field. In this case, the length LEN can be given
either as a list ‘(NAME ...)’ identifying a “field name” in the format
specified for ‘bindat-get-field’ below, or by an expression ‘(eval
FORM)’ where FORM should evaluate to an integer, specifying the field
length.
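A minimal sketch of both ways to give a variable length, assuming a
hypothetical record made of a one-byte prefix followed by a payload.
The spec names, field names, and sample bytes are invented for
illustration. The second spec relies on ‘last’, one of the dynamically
bound variables described later in this section, which holds the value
of the last field processed.

```elisp
(require 'bindat)

;; LEN given as a field name: `payload' is as long as `len' says.
(defvar my-lv-bindat-spec
  '((len     u8)
    (payload str (len))))

;; LEN given as an `eval' form: the prefix is assumed to count 16-bit
;; words, so the byte length is twice the value just unpacked.
(defvar my-words-bindat-spec
  '((nwords  u8)
    (payload vec (eval (* 2 last)))))

(defun my-payload (spec bytes)
  "Unpack BYTES, a list of integers, with SPEC and return its payload."
  (bindat-get-field
   (bindat-unpack spec (apply #'unibyte-string bytes))
   'payload))

(my-payload my-lv-bindat-spec    '(3 ?a ?b ?c ?d))  ; "abc"; ?d is not consumed
(my-payload my-words-bindat-spec '(2 1 2 3 4 9))    ; [1 2 3 4]

;; Packing uses the same spec; the length field must agree with the data.
(bindat-pack my-lv-bindat-spec '((len . 2) (payload . "hi")))
```

When packing, the sketch assumes the caller keeps ‘len’ consistent with
the payload; the spec itself does not compute it.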
A field specification generally has the form ‘([NAME] HANDLER)’,
where NAME is optional. Don’t use names that are symbols meaningful as
type specifications (above) or handler specifications (below), since
that would be ambiguous. NAME can be a symbol or an expression ‘(eval
FORM)’, in which case FORM should evaluate to a symbol.
HANDLER describes how to unpack or pack the field and can be one of
the following:
‘TYPE’
Unpack/pack this field according to the type specification TYPE.
‘eval FORM’
Evaluate FORM, a Lisp expression, for its side effects only. If a
field name is specified, the value of FORM is bound to that field name.
‘fill LEN’
Skip LEN bytes. In packing, this leaves them unchanged, which
normally means they remain zero. In unpacking, this means they are
ignored.
‘align LEN’
Skip to the next multiple of LEN bytes.
‘struct SPEC-NAME’
Process SPEC-NAME as a sub-specification. This describes a
structure nested within another structure.
‘union FORM (TAG SPEC)...’
Evaluate FORM, a Lisp expression, find the first TAG that matches
it, and process its associated data layout specification SPEC.
Matching can occur in one of three ways:
• If a TAG has the form ‘(eval EXPR)’, evaluate EXPR with the
variable ‘tag’ dynamically bound to the value of FORM. A
non-‘nil’ result indicates a match.
• TAG matches if it is ‘equal’ to the value of FORM.
• TAG matches unconditionally if it is ‘t’.
‘repeat COUNT FIELD-SPECS...’
Process the FIELD-SPECS recursively, in order, then repeat starting
from the first one, processing all the specifications COUNT times
overall. The COUNT is given using the same formats as a field
length; if an ‘eval’ form is used, it is evaluated just once. For
correct operation, each specification in FIELD-SPECS must include a
name.
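The handlers above can be combined in one specification. The following
is a sketch under stated assumptions, not a real protocol: the packet
layout, the spec variables, and the helper ‘my-unpack-packet’ are all
hypothetical. Each ‘union’ case is written here as a tag followed by
the field specifications to process for that tag.

```elisp
(require 'bindat)

;; Sub-specification used through the `struct' handler below.
(defvar my-header-bindat-spec
  '((kind u8)
    (fill 3)))                      ; three reserved bytes, skipped

(defvar my-packet-bindat-spec
  '((header struct my-header-bindat-spec)
    (union (header kind)            ; dispatch on the unpacked `kind'
           (1 (port u16))           ; kind 1: a 16-bit port number
           (2 (addr ip))            ; kind 2: a four-byte address
           (t (raw  u8)))           ; any other kind: one raw byte
    (align 4)                       ; skip to the next multiple of 4
    (n     u8)
    (items repeat (n)               ; `n' repetitions of a named field
           (id u8))))

(defun my-unpack-packet (bytes)
  "Unpack BYTES, a list of integers, with `my-packet-bindat-spec'."
  (bindat-unpack my-packet-bindat-spec (apply #'unibyte-string bytes)))

(let ((packet (my-unpack-packet '(1 0 0 0 #x1f #x90 0 0 2 7 9))))
  (list (bindat-get-field packet 'header 'kind)   ; 1
        (bindat-get-field packet 'port)           ; 8080, i.e. #x1f90
        (bindat-get-field packet 'items 0 'id)    ; 7
        (bindat-get-field packet 'items 1 'id)))  ; 9
```

In the sample bytes, the port ends at index 6, so ‘align 4’ skips the
two padding bytes before ‘n’ is read at index 8. Because the ‘union’
has no name, the fields of the matching case appear directly in the
enclosing structure, which is why ‘port’ is fetched without a prefix.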
For the ‘(eval FORM)’ forms used in a bindat specification, the FORM
can access and update these dynamically bound variables during
evaluation:
‘last’
Value of the last field processed.
‘bindat-raw’
The data as a byte array.
‘bindat-idx’
Current index (within ‘bindat-raw’) for unpacking or packing.
‘struct’
The alist containing the structured data that have been unpacked so
far, or the entire structure being packed. You can use
‘bindat-get-field’ to access specific fields of this structure.
‘count’
‘index’
Inside a ‘repeat’ block, these contain the maximum number of
repetitions (as specified by the COUNT parameter), and the current
repetition number (counting from 0). Setting ‘count’ to zero will
terminate the innermost ‘repeat’ block after the current repetition
has completed.
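As a small sketch of how these variables can be read from ‘eval’ forms,
the hypothetical spec below records extra information while unpacking a
length-prefixed payload. The spec name, field names, and the helper
‘my-trace’ are assumptions made for this example; only ‘last’,
‘bindat-idx’, and ‘struct’ are exercised, and ‘count’ and ‘index’ are
not shown.

```elisp
(require 'bindat)

(defvar my-trace-bindat-spec
  '((len     u8)
    (start   eval bindat-idx)          ; index where the payload begins
    (payload str (len))
    (size    eval (length last))       ; `last' is the payload just read
    (total   eval (+ 1 (bindat-get-field struct 'len)))))

(defun my-trace (bytes)
  "Unpack BYTES, a list of integers, with `my-trace-bindat-spec'."
  (bindat-unpack my-trace-bindat-spec (apply #'unibyte-string bytes)))

(let ((s (my-trace '(3 ?a ?b ?c))))
  (list (bindat-get-field s 'start)    ; 1
        (bindat-get-field s 'size)     ; 3
        (bindat-get-field s 'total)))  ; 4
```

Each named ‘eval’ field stores the value of its form under that name,
so the computed values can be fetched with ‘bindat-get-field’ like any
unpacked field.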