Portable-Network-Archive-Specification

4. Chunk Specifications

4.1. Critical chunks

4.1.1. AHED Archive header

The AHED chunk must appear first. It contains:

| Significance | Size | Description |
|---|---|---|
| Major version | 1-byte | Major version of PNA |
| Minor version | 1-byte | Minor version of PNA |
| General purpose bit flag | 2-byte | Bit flags |
| Archive number | 4-byte | Archive number |

Major version

Currently, only 0 is defined. It may be changed if there is a change in the structure of each chunk that makes up PNA.

Minor version

Currently, only 0 is defined. It may be changed when there is a change in the type of chunks that make up PNA.

General purpose bit flag

Bit 0 through Bit 15 are currently unused and reserved for future use.

Archive number

Contains the number of this archive when the archive is split.
Archive numbers start at 0.
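
The AHED layout above can be decoded with a few lines of code. The following is a minimal sketch, not a reference implementation: it assumes multi-byte fields are stored big-endian (this section does not state the byte order), and the names `ArchiveHeader` and `parse_ahed` are hypothetical. Rejecting versions other than 0.0 is a strict policy chosen for this sketch.

```python
import struct
from typing import NamedTuple


class ArchiveHeader(NamedTuple):
    major: int
    minor: int
    flags: int
    archive_number: int


def parse_ahed(data: bytes) -> ArchiveHeader:
    """Decode the 8-byte AHED data area (byte order is an assumption)."""
    if len(data) != 8:
        raise ValueError(f"AHED data must be 8 bytes, got {len(data)}")
    # B = 1-byte, H = 2-byte, I = 4-byte unsigned; ">" = big-endian (assumed).
    header = ArchiveHeader(*struct.unpack(">BBHI", data))
    # Only version 0.0 is currently defined, so this sketch rejects others.
    if (header.major, header.minor) != (0, 0):
        raise ValueError(f"unsupported PNA version {header.major}.{header.minor}")
    return header
```

The flag bits are returned as-is and not validated, since all 16 bits are reserved.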

4.1.2. AEND Archive tailer

This chunk must appear last. It signals the end of the PNA data stream. The chunk data area is empty. Decoders should not read any data beyond this chunk.

4.1.3. ANXT Archive continues marker

Indicates that the archive is split and that a following file exists. The Archive number field of the AHED chunk in the next file is the Archive number field of the AHED chunk in the current file incremented by 1. The chunk data area is empty.
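
The relationship between ANXT and the Archive number field can be checked across the files of a split archive. The sketch below is illustrative only: the function name `split_problems` and the `(archive_number, has_anxt)` tuple representation are assumptions, and the caller is expected to have already read each file's AHED chunk and noted whether an ANXT chunk is present.

```python
from typing import List, Tuple


def split_problems(parts: List[Tuple[int, bool]]) -> List[str]:
    """Check (archive_number, has_anxt) pairs given in file order."""
    problems = []
    for index, (number, has_anxt) in enumerate(parts):
        is_last = index == len(parts) - 1
        # Archive numbers start at 0 and increase by 1 per file.
        if number != index:
            problems.append(f"file {index}: archive number {number}, expected {index}")
        # ANXT announces a following file, so the last file must not carry it.
        if has_anxt and is_last:
            problems.append(f"file {index}: ANXT present but no following file")
        if not has_anxt and not is_last:
            problems.append(f"file {index}: ANXT missing but another file follows")
    return problems
```

An empty list means the sequence is consistent with the rules in this section.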

4.1.4. FHED File header

Basic information about each entry is stored.

| Significance | Size | Description |
|---|---|---|
| Major version | 1-byte | Major version |
| Minor version | 1-byte | Minor version |
| Entry kind | 1-byte | Entry kind |
| Compression method | 1-byte | Compression method |
| Encryption method | 1-byte | Encryption method |
| Cipher mode | 1-byte | Cipher mode |
| Path | n-byte | File path |

Entry kind

The entry kind is recorded: 0 is a regular file, 1 is a directory, 2 is a symbolic link, 3 is a hard link, and 4 is a file that has previously appeared in the archive.

Compression method

The compression method is recorded: 0 is no compression, 1 is deflate, 2 is zstandard, and 4 is lzma.

Encryption method

The encryption method is recorded: 0 is no encryption, 1 is AES (Rijndael), and 2 is Camellia.

When the value of this field is 0, the PHSF chunk is not required.

Cipher mode

The cipher mode used for encryption: 0 is CBC mode and 1 is CTR mode.

The value of this field is ignored if the Encryption method field is 0.

File path

The path of the entry is encoded in UTF-8. The / character is used as the path separator. Paths should not begin or end with /. Decoders should ignore a leading or trailing / if one is present.
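
The FHED fields described above map naturally onto enumerations. The following sketch shows one way a decoder might read the chunk data. It assumes the path occupies all bytes after the six fixed fields (the table gives no separate length field), and every class and function name here is hypothetical.

```python
from enum import IntEnum
from typing import NamedTuple, Optional


class EntryKind(IntEnum):
    FILE = 0
    DIRECTORY = 1
    SYMLINK = 2
    HARDLINK = 3
    PREVIOUS_FILE = 4  # a file that previously appeared in the archive


class Compression(IntEnum):
    NONE = 0
    DEFLATE = 1
    ZSTANDARD = 2
    LZMA = 4


class Encryption(IntEnum):
    NONE = 0
    AES = 1
    CAMELLIA = 2


class CipherMode(IntEnum):
    CBC = 0
    CTR = 1


class FileHeader(NamedTuple):
    major: int
    minor: int
    kind: EntryKind
    compression: Compression
    encryption: Encryption
    cipher_mode: Optional[CipherMode]
    path: str


def parse_fhed(data: bytes) -> FileHeader:
    """Decode FHED chunk data; undefined codes raise ValueError."""
    if len(data) < 6:
        raise ValueError("FHED data is shorter than its six fixed fields")
    major, minor, kind, comp, enc, mode = data[:6]
    encryption = Encryption(enc)
    # The cipher mode byte is ignored when no encryption is used.
    cipher_mode = CipherMode(mode) if encryption != Encryption.NONE else None
    # Assumption: the path fills the rest of the chunk data.
    # Leading and trailing separators are dropped, as decoders should ignore them.
    path = data[6:].decode("utf-8").strip("/")
    return FileHeader(major, minor, EntryKind(kind), Compression(comp),
                      encryption, cipher_mode, path)
```

For example, `parse_fhed(bytes([0, 0, 0, 2, 0, 0]) + b"/docs/readme.txt")` yields a regular file entry with Zstandard compression and the path `docs/readme.txt`.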

FHED field values (PNA method code)

The Compression method, Encryption method, and Cipher mode fields in the FHED chunk must be set to the following PNA method codes (integer values):

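| Field | Value | Algorithm/Mode | Note |
|---|---|---|---|
| Compression method | 0 | No compression | |
| Compression method | 1 | Deflate | zlib compatible |
| Compression method | 2 | Zstandard | |
| Compression method | 4 | LZMA | xz |
| Encryption method | 0 | No encryption | |
| Encryption method | 1 | AES (Rijndael) | 256-bit key |
| Encryption method | 2 | Camellia | 256-bit key |
| Cipher mode | 0 | CBC | Cipher Block Chaining |
| Cipher mode | 1 | CTR | Counter Mode |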

Note:

4.1.5. PHSF Password hash

Records information about the key derivation function used when encrypting a file.
This chunk appears after the FHED chunk and before the FDAT chunk.
If the value of the Encryption method field of the FHED chunk is not 0, this chunk is required.

| Size | Description |
|---|---|
| n-byte | PHC string format |

About PHC string format
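
As an illustration of what a decoder might do with the PHSF data, the sketch below splits a PHC-format string into its parts, following the general shape `$<id>[$v=<version>][$<params>][$<salt>[$<hash>]]`. It is a simplified parser with no character-set validation. The names `PhcString` and `parse_phc` are hypothetical, and this section does not say which key derivation functions or parameters PNA encoders actually emit.

```python
from typing import Dict, NamedTuple, Optional


class PhcString(NamedTuple):
    algorithm: str
    version: Optional[int]
    params: Dict[str, str]
    salt: Optional[str]
    digest: Optional[str]


def parse_phc(data: bytes) -> PhcString:
    """Split PHSF chunk data, assumed to be an ASCII PHC string, into fields."""
    text = data.decode("ascii")
    if not text.startswith("$"):
        raise ValueError("PHC string must start with '$'")
    fields = text[1:].split("$")
    algorithm = fields.pop(0)
    if not algorithm:
        raise ValueError("missing algorithm identifier")
    version = None
    if fields and fields[0].startswith("v="):
        version = int(fields.pop(0)[2:])
    params: Dict[str, str] = {}
    # Salt and hash use unpadded Base64, so only a parameter field contains '='.
    if fields and "=" in fields[0]:
        for pair in fields.pop(0).split(","):
            name, _, value = pair.partition("=")
            params[name] = value
    salt = fields.pop(0) if fields else None
    digest = fields.pop(0) if fields else None
    if fields:
        raise ValueError("unexpected trailing fields in PHC string")
    return PhcString(algorithm, version, params, salt, digest)
```

For instance, the made-up input `b"$argon2id$v=19$m=65536,t=2,p=1$c2FsdHNhbHQ"` parses into the algorithm `argon2id`, version 19, three parameters, and a salt with no hash part.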

4.1.6. FDAT File data

The actual data of the file is recorded. Multiple of these chunks are permitted for a single entry.

4.1.7. FEND File tailer

This signals the end of the entry data stream. The chunk data area is empty.

4.1.8. SHED Solid mode header

Basic information about a solid mode archive is stored.

| Significance | Size | Description |
|---|---|---|
| Major version | 1-byte | Major version |
| Minor version | 1-byte | Minor version |
| Compression method | 1-byte | Compression method |
| Encryption method | 1-byte | Encryption method |
| Cipher mode | 1-byte | Cipher mode |

4.1.9. SDAT Solid mode data

Solid mode archive data. Contains chunks representing entries.

4.1.9.1 Structure of Entries in SDAT

In solid mode archives, the SDAT chunk contains a continuous stream of data representing multiple entries. Unlike per-entry mode where each entry is composed of discrete FHED through FEND chunks within the archive, the SDAT chunk serializes these chunks together within its own data stream.

The following structural rules apply to the contents of an SDAT chunk:

Each entry in solid mode is not stored as an independent top-level chunk sequence in the file but is instead embedded within the data field of the SDAT chunk. Decoders must parse the contents of SDAT as a logical concatenation of standard entry structures.

The use of solid mode is intended to improve compression efficiency by processing similar files together as a single compression/encryption unit.

Example structure of SDAT content (conceptual):

[ FHED | Ancillary* | FDAT* | FEND ]
[ FHED | Ancillary* | FDAT* | FEND ]
...

The outer SDAT chunk may be followed by additional SDAT chunks if the solid data stream is split. These must be interpreted as a single contiguous logical stream, terminated by the SEND chunk.
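
To make the nesting concrete, the sketch below joins the data areas of consecutive SDAT chunks and groups the embedded chunks into entries. It rests on several assumptions that this section does not state: that each chunk is laid out PNG-style as a 4-byte big-endian length, a 4-byte type, the data, and a 4-byte CRC-32 over type and data; and that the joined stream has already been decrypted and decompressed according to SHED. All function names are hypothetical.

```python
import struct
import zlib
from typing import Iterator, List, Tuple

Chunk = Tuple[bytes, bytes]  # (chunk type, chunk data)


def make_chunk(ctype: bytes, data: bytes = b"") -> bytes:
    """Serialize one chunk using the assumed length/type/data/CRC layout."""
    crc = zlib.crc32(ctype + data)
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)


def iter_chunks(stream: bytes) -> Iterator[Chunk]:
    """Yield (type, data) for each chunk in an already-decoded byte stream."""
    offset = 0
    while offset < len(stream):
        (length,) = struct.unpack_from(">I", stream, offset)
        ctype = stream[offset + 4:offset + 8]
        data = stream[offset + 8:offset + 8 + length]
        (crc,) = struct.unpack_from(">I", stream, offset + 8 + length)
        if zlib.crc32(ctype + data) != crc:
            raise ValueError(f"CRC mismatch in {ctype!r} chunk")
        yield ctype, data
        offset += 12 + length


def solid_entries(sdat_payloads: List[bytes]) -> List[List[Chunk]]:
    """Treat consecutive SDAT data areas as one stream and split it into entries."""
    stream = b"".join(sdat_payloads)
    entries: List[List[Chunk]] = []
    current = None
    for ctype, data in iter_chunks(stream):
        if ctype == b"FHED":
            if current is not None:
                raise ValueError("FHED found before the previous entry's FEND")
            current = []
        if current is None:
            raise ValueError(f"{ctype!r} chunk found outside an entry")
        current.append((ctype, data))
        if ctype == b"FEND":
            entries.append(current)
            current = None
    if current is not None:
        raise ValueError("last entry is not terminated by FEND")
    return entries
```

Because the payloads are joined before parsing, an embedded chunk may straddle the boundary between two SDAT chunks without special handling.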

4.1.10. SEND Solid mode tailer

This signals the end of the solid data stream.
The chunk data area is empty.

4.2. Ancillary chunks

All ancillary chunks must appear before the AEND chunk.

4.2.1 Timestamp information

4.2.1.1 cTIM Created timestamp

The creation datetime is recorded in Unix time. When this chunk appears after the FHED chunk and before the FEND chunk, it indicates the creation datetime of the entry.

| Size | Description |
|---|---|
| 8-byte | Unix timestamp |

4.2.1.2 mTIM Modified timestamp

The last modified datetime is recorded in Unix time. When this chunk appears after the FHED chunk and before the FEND chunk, it indicates the last modified datetime of the entry.

| Size | Description |
|---|---|
| 8-byte | Unix timestamp |

4.2.1.3 aTIM Accessed timestamp

The last accessed datetime is recorded in Unix time. When this chunk appears after the FHED chunk and before the FEND chunk, it indicates the last accessed datetime of the entry.

| Size | Description |
|---|---|
| 8-byte | Unix timestamp |

4.2.1.4 cTNS Created timestamp (Nanoseconds)

Provides the nanosecond portion of the file creation time, to be used in conjunction with the cTIM chunk.

| Size | Description |
|---|---|
| 4-byte | Unsigned integer (u32), nanoseconds |

Valid values are in the range 0 <= value < 1,000,000,000.

4.2.1.5 mTNS Modified timestamp (Nanoseconds)

Provides the nanosecond portion of the last modified time, to be used in conjunction with the mTIM chunk.

| Size | Description |
|---|---|
| 4-byte | Unsigned integer (u32), nanoseconds |

Valid values are in the range 0 <= value < 1,000,000,000.

4.2.1.6 aTNS Accessed timestamp (Nanoseconds)

Provides the nanosecond portion of the last accessed time, to be used in conjunction with the aTIM chunk.

| Size | Description |
|---|---|
| 4-byte | Unsigned integer (u32), nanoseconds |

Valid values are in the range 0 <= value < 1,000,000,000.
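
The seconds and nanoseconds chunks combine into a single timestamp. A minimal sketch follows, assuming both values are big-endian and that the 8-byte seconds value is unsigned (this section states neither). The helper name `timestamp_ns` is hypothetical.

```python
import struct
from typing import Optional

NANOS_PER_SECOND = 1_000_000_000


def timestamp_ns(tim_data: bytes, tns_data: Optional[bytes] = None) -> int:
    """Combine a *TIM data area with an optional *TNS data area into nanoseconds."""
    (seconds,) = struct.unpack(">Q", tim_data)  # 8-byte value, assumed unsigned
    nanos = 0
    if tns_data is not None:
        (nanos,) = struct.unpack(">I", tns_data)  # 4-byte unsigned (u32)
        # Valid values satisfy 0 <= value < 1,000,000,000.
        if nanos >= NANOS_PER_SECOND:
            raise ValueError(f"nanoseconds out of range: {nanos}")
    return seconds * NANOS_PER_SECOND + nanos
```

When the nanoseconds chunk is absent, the result simply has a zero sub-second part.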

Decoder Notes:

4.2.2 Permission information

4.2.2.1 fPRM File permission

File permissions are recorded. This chunk appears after the FHED chunk and before the FEND chunk.

| Significance | Size | Description |
|---|---|---|
| uid | 8-byte | User ID |
| uname length | 1-byte | Length of uname |
| uname | n-byte | Unix user name |
| gid | 8-byte | Group ID |
| gname length | 1-byte | Length of gname |
| gname | n-byte | Unix group name |
| permissions | 2-byte | File permission bytes |
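
Because fPRM mixes fixed-size and length-prefixed fields, it has to be read sequentially. The sketch below assumes big-endian integers and UTF-8 names, neither of which this section specifies. `FilePermission` and `parse_fprm` are hypothetical names.

```python
from typing import NamedTuple


class FilePermission(NamedTuple):
    uid: int
    uname: str
    gid: int
    gname: str
    permissions: int


def parse_fprm(data: bytes) -> FilePermission:
    """Decode fPRM chunk data field by field."""
    offset = 0

    def take(count: int) -> bytes:
        # Return the next `count` bytes and advance the read position.
        nonlocal offset
        if offset + count > len(data):
            raise ValueError("fPRM data is truncated")
        piece = data[offset:offset + count]
        offset += count
        return piece

    uid = int.from_bytes(take(8), "big")
    uname = take(take(1)[0]).decode("utf-8")  # 1-byte length, then the name
    gid = int.from_bytes(take(8), "big")
    gname = take(take(1)[0]).decode("utf-8")
    permissions = int.from_bytes(take(2), "big")
    if offset != len(data):
        raise ValueError("unexpected trailing bytes in fPRM data")
    return FilePermission(uid, uname, gid, gname, permissions)
```

The 1-byte length prefixes limit each name to 255 bytes.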

4.2.3 Extended attribute

4.2.3.1 xATR Extended attribute

An extended attribute is recorded. This chunk appears after the FHED chunk and before the FEND chunk. This chunk can appear multiple times within an entry.

| Significance | Size | Description |
|---|---|---|
| name length | 4-byte | Length of name |
| name | n-byte | Attribute name |
| body length | 4-byte | Length of body |
| body | n-byte | Attribute value |
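
A round trip through the xATR layout might look like the following sketch. It assumes big-endian length fields and a UTF-8 attribute name (both assumptions), and keeps the value as raw bytes. `build_xatr` and `parse_xatr` are hypothetical helper names.

```python
from typing import Tuple


def build_xatr(name: str, body: bytes) -> bytes:
    """Serialize one extended attribute into xATR chunk data."""
    encoded = name.encode("utf-8")
    return (len(encoded).to_bytes(4, "big") + encoded
            + len(body).to_bytes(4, "big") + body)


def parse_xatr(data: bytes) -> Tuple[str, bytes]:
    """Decode xATR chunk data into (attribute name, attribute value)."""
    name_len = int.from_bytes(data[0:4], "big")
    name_end = 4 + name_len
    body_len = int.from_bytes(data[name_end:name_end + 4], "big")
    body_end = name_end + 4 + body_len
    # Truncated or oversized data fails this single consistency check.
    if body_end != len(data):
        raise ValueError("xATR lengths do not match the chunk data size")
    return data[4:name_end].decode("utf-8"), data[name_end + 4:body_end]
```

Since the chunk may repeat within an entry, a decoder would typically collect the results into a mapping keyed by attribute name.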

4.3. Summary of standard chunks

This table summarizes some properties of the standard chunk types.

Critical chunks

| Name | Multiple in Archive | Multiple in Entry | Optional | Ordering constraints |
|---|---|---|---|---|
| AHED | No | N/A | No | Must be first |
| FHED | Yes | No | Yes | Must start an entry |
| PHSF | Yes | No | Yes | Before FDAT or SDAT if used |
| FDAT | Yes | Yes | Yes | Multiple FDATs must be consecutive |
| FEND | Yes | No | Yes | Must end an entry |
| SHED | Yes | No | Yes | Must start a solid mode data stream |
| SDAT | Yes | Yes | Yes | Multiple SDATs must be consecutive |
| SEND | Yes | No | Yes | Must end a solid mode data stream |
| ANXT | No | N/A | Yes | |
| AEND | No | N/A | No | Must be last |

Ancillary chunks

| Name | Multiple in Archive | Multiple in Entry | Optional | Ordering constraints |
|---|---|---|---|---|
| cTIM | Yes | No | Yes | Between FHED and FEND |
| mTIM | Yes | No | Yes | Between FHED and FEND |
| aTIM | Yes | No | Yes | Between FHED and FEND |
| cTNS | Yes | No | Yes | Must accompany cTIM, between FHED and FEND |
| mTNS | Yes | No | Yes | Must accompany mTIM, between FHED and FEND |
| aTNS | Yes | No | Yes | Must accompany aTIM, between FHED and FEND |
| fPRM | Yes | No | Yes | Between FHED and FEND |
| xATR | Yes | Yes | Yes | Between FHED and FEND |
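
Some of the ordering constraints in these tables can be checked mechanically. The sketch below covers only a subset for a per-entry (non-solid) archive: AHED first, AEND last, balanced FHED/FEND pairs, and entry-level chunks appearing only inside an entry. It does not check consecutiveness of FDAT chunks or the pairing of the nanosecond chunks with their timestamp chunks. `order_problems` is a hypothetical name.

```python
from typing import List

# Chunk types that the tables place between FHED and FEND.
ENTRY_ONLY = {"FDAT", "cTIM", "mTIM", "aTIM", "cTNS", "mTNS", "aTNS", "fPRM", "xATR"}


def order_problems(chunk_types: List[str]) -> List[str]:
    """Return a list of ordering violations for a top-level chunk sequence."""
    problems = []
    if not chunk_types or chunk_types[0] != "AHED":
        problems.append("AHED must be first")
    if not chunk_types or chunk_types[-1] != "AEND":
        problems.append("AEND must be last")
    in_entry = False
    for position, ctype in enumerate(chunk_types):
        if ctype == "FHED":
            if in_entry:
                problems.append(f"chunk {position}: FHED before previous FEND")
            in_entry = True
        elif ctype == "FEND":
            if not in_entry:
                problems.append(f"chunk {position}: FEND without FHED")
            in_entry = False
        elif ctype in ENTRY_ONLY and not in_entry:
            problems.append(f"chunk {position}: {ctype} outside FHED..FEND")
    if in_entry:
        problems.append("last entry is not closed by FEND")
    return problems
```

An empty result means no violation of the checked subset was found, not that the archive is fully valid.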

4.4. Additional chunk types

Additional public PNA chunk types are defined in the document “Extensions to the PNA 0.0 Specification, Version 0.0.0” [PNA-EXTENSIONS]. Chunks described there are expected to be less widely supported than those defined in this specification. However, application authors are encouraged to use those chunk types whenever appropriate for their applications. Additional chunk types can be proposed for inclusion in that list by contacting the PNA specification maintainers at @Portable-Network-Archive on GitHub.

New public chunks will be registered only if they are of use to others and do not violate the design philosophy of PNA. Chunk registration is not automatic, although it is the intent of the authors that it be straightforward when a new chunk of potentially wide application is needed. Note that the creation of new critical chunk types is discouraged unless absolutely necessary.

Applications can also use private chunk types to carry data that is not of interest to other applications. See Recommendations for Encoders: Use of private chunks.

Decoders must be prepared to encounter unrecognized public or private chunk-type codes. Unrecognized chunk types must be handled as described in Chunk naming conventions.
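
How a decoder reacts to an unrecognized chunk depends on the naming conventions defined elsewhere in the specification. The sketch below assumes a PNG-style rule in which an uppercase first letter (bit 5 of the first byte cleared) marks a critical chunk, which matches the names used in this section (AHED versus cTIM). It further assumes that unknown critical chunks are fatal while unknown ancillary chunks may be skipped. The names `KNOWN_CHUNKS`, `is_critical`, and `classify` are hypothetical.

```python
KNOWN_CHUNKS = frozenset({
    b"AHED", b"AEND", b"ANXT", b"FHED", b"PHSF", b"FDAT", b"FEND",
    b"SHED", b"SDAT", b"SEND",
    b"cTIM", b"mTIM", b"aTIM", b"cTNS", b"mTNS", b"aTNS", b"fPRM", b"xATR",
})


def is_critical(chunk_type: bytes) -> bool:
    """Assumed convention: bit 5 of the first byte is 0 for critical chunks."""
    return (chunk_type[0] & 0x20) == 0


def classify(chunk_type: bytes) -> str:
    """Decide what to do with a chunk: 'process', 'skip', or 'error'."""
    if chunk_type in KNOWN_CHUNKS:
        return "process"
    # Assumed policy: an unknown critical chunk cannot be safely ignored.
    if is_critical(chunk_type):
        return "error"
    return "skip"
```

Returning a label rather than raising keeps the policy decision with the caller, which may want to report the offending chunk type before aborting.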