Graft Identifier (GID)
Graft uses a 16-byte identifier called a Graft Identifier (GID) to identify Volumes, Logs, and Segments. GIDs are similar to UUIDs and ULIDs, but with a Graft-specific prefix and a different canonical encoding.
- 128-bit compatibility (same size as UUID)
- Up to 2^64 unique GIDs per millisecond
- Lexicographically sortable by creation time!
- Canonically encoded as a 22-character string, compared to the 36-character UUID
- Case sensitive
- URL safe representation
- Creation time is embedded: newer GIDs sort after older ones
GIDs have the following layout:
            0               1               2               3
     0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |    prefix     |                   timestamp                   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                   timestamp                   |    prefix     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                            random                             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                            random                             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Every GID has a 1-byte prefix which encodes its type. There are currently three GID prefixes: Volume, Log, and Segment. The prefix may contain other types or namespace metadata in the future.
The three GID types are:
- VolumeId: Identifies a Volume
- LogId: Identifies a Log (an ordered list of Commits)
- SegmentId: Identifies a Segment (compressed storage of pages in object storage)
Following the prefix is a 48-bit timestamp encoding milliseconds since the Unix epoch, stored in network byte order (MSB first).
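As a rough illustration of the timestamp field, the sketch below packs a millisecond Unix timestamp into 6 big-endian bytes and checks that plain byte-wise comparison agrees with chronological order. The helper names `encode_timestamp` and `decode_timestamp` are invented for this example and are not part of Graft's API.

```python
import time

TIMESTAMP_BYTES = 6  # 48 bits


def encode_timestamp(millis: int) -> bytes:
    """Pack milliseconds since the Unix epoch into 6 bytes, MSB first."""
    if not 0 <= millis < 1 << 48:
        raise ValueError("timestamp does not fit in 48 bits")
    return millis.to_bytes(TIMESTAMP_BYTES, "big")


def decode_timestamp(raw: bytes) -> int:
    """Recover the millisecond timestamp from its 6-byte encoding."""
    if len(raw) != TIMESTAMP_BYTES:
        raise ValueError("expected exactly 6 bytes")
    return int.from_bytes(raw, "big")


now_ms = time.time_ns() // 1_000_000
earlier = encode_timestamp(now_ms)
later = encode_timestamp(now_ms + 1)

# Big-endian storage means comparing the raw bytes orders them by time.
assert earlier < later
assert decode_timestamp(earlier) == now_ms
```

Storing the most significant byte first is what makes this work: a little-endian encoding would round-trip just as well, but its bytes would not sort chronologically.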
Following the timestamp is a duplicate of the prefix. This second prefix ensures that the random bytes section of the GID does not start with a zero byte, which is an important aspect of its encoded representation.
Finally, there are 64 bits of random noise, allowing up to 2^64 GIDs to be generated per millisecond.
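Putting the four fields together, a GID could be assembled along the following lines. This is a minimal sketch, not Graft's implementation: the function names, the prefix value `0x40`, and the requirement that the prefix be non-zero are all assumptions made for illustration.

```python
import secrets
import time
from typing import Optional

# Hypothetical prefix byte, chosen only for illustration. The actual values
# Graft assigns to Volume, Log, and Segment are not specified in this section.
EXAMPLE_PREFIX = 0x40


def new_gid(prefix: int, millis: Optional[int] = None) -> bytes:
    """Assemble 16 bytes: prefix | 48-bit timestamp | prefix | 64 random bits."""
    if not 0 < prefix <= 0xFF:
        raise ValueError("prefix must be a single non-zero byte")
    if millis is None:
        millis = time.time_ns() // 1_000_000  # milliseconds since the Unix epoch
    if not 0 <= millis < 1 << 48:
        raise ValueError("timestamp does not fit in 48 bits")
    return (
        bytes([prefix])
        + millis.to_bytes(6, "big")  # network byte order, MSB first
        + bytes([prefix])            # duplicate of the prefix
        + secrets.token_bytes(8)     # 64 bits of randomness
    )


def parse_gid(gid: bytes) -> dict:
    """Split a GID back into its fields, checking that both prefixes agree."""
    if len(gid) != 16:
        raise ValueError("a GID is exactly 16 bytes")
    if gid[0] != gid[7]:
        raise ValueError("prefix bytes do not match")
    return {
        "prefix": gid[0],
        "millis": int.from_bytes(gid[1:7], "big"),
        "random": gid[8:],
    }


older = new_gid(EXAMPLE_PREFIX, millis=1_700_000_000_000)
newer = new_gid(EXAMPLE_PREFIX, millis=1_700_000_000_001)
assert older < newer  # same prefix, so ordering follows the timestamp
assert parse_gid(older)["millis"] == 1_700_000_000_000
```

Because the prefix is the first byte, GIDs of different types sort into separate groups, and within a type the timestamp decides the order.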
GIDs are canonically serialized into 22 characters using the bs58 algorithm with the Bitcoin alphabet:
123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz
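For readers unfamiliar with base58, the following sketch shows one straightforward way to encode and decode bytes with the Bitcoin alphabet. It is a plain reference-style implementation written for this page, not the code Graft uses, and the function names and sample bytes are assumptions.

```python
BITCOIN_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes as base58 using the Bitcoin alphabet."""
    # Each leading zero byte is represented by one leading '1' character.
    zeros = len(data) - len(data.lstrip(b"\x00"))
    value = int.from_bytes(data, "big")
    digits = []
    while value > 0:
        value, remainder = divmod(value, 58)
        digits.append(BITCOIN_ALPHABET[remainder])
    return "1" * zeros + "".join(reversed(digits))


def base58_decode(text: str) -> bytes:
    """Decode a base58 string; raises ValueError on characters outside the alphabet."""
    zeros = len(text) - len(text.lstrip("1"))
    value = 0
    for char in text:
        value = value * 58 + BITCOIN_ALPHABET.index(char)
    body = value.to_bytes((value.bit_length() + 7) // 8, "big")
    return b"\x00" * zeros + body


# A made-up 16-byte value with a non-zero first byte, standing in for a GID.
sample = bytes([0x40]) + bytes(range(15))
encoded = base58_encode(sample)
assert base58_decode(encoded) == sample
assert all(char in BITCOIN_ALPHABET for char in encoded)
```

The alphabet omits `0`, `O`, `I`, and `l`, which removes visually ambiguous characters and is also why the encoding is case sensitive.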