Universally Unique Identifiers (UUIDs)

Course Code: U10814

Course Name: Networking and Operating Systems

Credits: 20

Module Leaders: Dr Alexios Louridas & Mr Danny Werb

Material created by Seb Blair, modified by Danny Werb

UUIDs

  • 32 base 16 characters ([0-9][A-F])

  • 128-bit numbers

center

UUID Version 1

UUID Versions

There are 8 versions as of 23 June 2022

  • UUID1 = Time-based + Unique or MAC [no repeats till 3603AD] ​

  • UUID2 = Time-based (LSB) + user ID

  • UUID3 = Namespace + MD5 hash​

  • UUID4 = PRNG [1 trillion UUIDs for a chance of 2 repeats]​

  • UUID5 = Namespace + SHA-1 hash​

  • UUID6 = Timestamp and monotonic counter

  • UUID7 = UNIX Timestamp

  • UUID8 - User defined data

Who?

  • Created by Microsoft but standardised by the Internet Engineering Task Force (IETF) and the International Telecommunication Union (ITU), so that each user or thing can be uniquely identifiable.

  • ITU-T X.667 | ISO/IEC 9834-8

Where?

Combinations: UUID 4

  • grain of sand
  • UUID4 has 122 random bits,
  • =
  • Volume of sand as UUID4 = (50 quadrillion, 190 trillion and 8)
  • Compare to volume of Jupiter = (1 quadrillion, 434 trillion, 281 billion, 810 million, 739 thousand and 360)
  • So, you could give each grain of sand on 35 Jupiters a version 4 UUID

UUID 1

UUID1 is calculated based on a combination of the current timestamp and the host computer's MAC address:

  • Timestamp: The first 60 bits represent the current timestamp in 100ns intervals since 15/10/1582 (the date the Gregorian calendar was introduced)

  • Clock sequence: The following 14 bits are the clock sequence value, designed to avoid conflicts in case multiple UUIDs are created within the same 100ns window

  • Node: The final 48 bits are reserved for the MAC address of the computer's NIC. Sometimes these bits may be generated using PRNG if the MAC isn't available

Epochs/Time

  • 01/01/1970

    • Unix engineers set the arbitrary timestamp because... it was convenient...
  • 19/01/2038 03:14:07 the storage for 32-bit will become obsolete, as the value will be too large (aka. Epochcalypse or Y2k38 bug)

    • Will need to migrate to 64-bit or just deal with the timestamp showing 19/01/1901 03:14:08

UUID 1: Time format

Name Bytes Hex Bits Comments
time_low 4 8 32 integer giving the low 32-bits of time
time_mid 2 4 16 integer giving the middle 16-bits of time
time_hi+version 2 4 16 16 bits of time high with bits 6-7 multiplexed with version number
clock_seq_hi_and_res clock_seq_low 2 4 16 1 to 3-bit "variant" in the most significant bits, followed by the 13 to 15-bit clock sequence

Example

  1. Current timestamp in Unix Epoch Time (ns) = .

  2. Divide the timestamp by 100 to convert it to 100-nanosecond intervals:

  1. Add the number of 100-nanosecond intervals between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01):

at the time of making the slide

Example Continued

Breaking down the UUID components as follows:

  1. Time Low (32 bits): The first 32 bits of the timestamp in hexadecimal:

    $ printf "0x%08X\n" 1392857302
    > 0x530550D6
    
  2. Time Mid (16 bits): The next 16 bits of the timestamp in hexadecimal:

    $ printf "0x%04X\n" 7346
    > 0x1CB2
    
  3. Time High and Version (16 bits): The next 16 bits of the timestamp () with the version (1) in hexadecimal:

    $ printf "%04X\n" <<< echo $((7459+1))
    > 0x1D24
    

Example Continued....

  1. Clock Sequence (14 bits), in truth this can be 14 random bits so:

    $ printf "%04X\n" <<< echo $((RANDOM % 16384))
    > 21FF
    
  2. Node (48 bits): A randomly generated 48-bit value or MAC address if you must:

    $ dd if=/dev/urandom bs=1 count=6 2>/dev/null | od -An -tx1 | tr -d ' \n'
    > 2876c5202c7f
    
  3. Put it all togehter:

    • timeLow - timeMid - timeHigh+version - clockSeq - Node

    • 530550D6-1CB2-1D24-21FF-2876C5202C7F

UUID 2

  • Distributed Computing Environment (DCE)

  • Combination of:

    • Current time and date.
    • The local identifier replaces the lower 32 bits of the timestamp.48-bit MAC address of the host machine
      • Domain Name or Hostname
        $ id -u; id -g; whoami;
        
    • MAC Address or random generated Hex
      $ cat -A /dev/urandom | less
      

UUID 3

  • Namespace could be website, DNS information, plain text, etc
  • The namespace value is hashed using the md5hash algorithm
  • GNU coreutils implements this using md5sum
    $ md5sum <<< "Test"
    > 2205e48de5f93c784733ffcca841d2b5  -
    

MD5 Algorithm

vertical

MD5 Algorithm

  • Round 1: (b AND c) OR ((NOT b) AND (d))

  • Round 2: (b AND d) OR (c AND (NOT d))

  • Round 3: b XOR c XOR d

  • Round 4: c XOR (b OR (NOT d))

UUID​4

  1. Generate 128 random bits:
$ dd if=/dev/random count=16 bs=1 2> /dev/null | xxd -ps
> 7c1e598398eb691f3f4be4123c3ce9a7

[0]0111110 00001111 00101100 11000001 11001100 01110101 10110100 100011111 00111111 01001011 11100100 00010010 00111100 00111100 11101001 10100111

  1. Take the 7th byte and perform an AND operation with 0x0F to clear out the high nibble. Then, OR it with 0x40 to set the version number to 4.

00000100 = 10110100 & 00001111​ (0x0f)

01000100 = 00000100 | 01000000 (0x40)

UUID4 Continued

  1. Next, take the 9th byte (00111111) and perform an AND operation with 0x3F and then OR it with 0x80.

00111111​ = 00111111​ & 00111111​ (0x3f)

100111111 = 00111111​ | 10000000 (0x80)

  1. Convert the 128 bits to hexadecimal representation and insert the hyphens to achieve the canonical text representation.​

Before: 7C1E5983-98EB-691F-3f4B-E4123C3CE9A7

After: 7C1E5983-98EB-441F-CF4B-E4123C3CE9A7

Your Turn

10101110 00100001 10110100 11111100 01101111 01110010 10100011 00110010 10111010 10000110 11010010 00001001 11010001 11100000 10101111 01101001 ​

  1. Take the 7th byte and perform an AND operation with 0x0F to clear out the high nibble. Then, OR it with 0x40 to set the version number to 4.​

  2. Next, take the 9th byte and perform an AND operation with 0x3F and then OR it with 0x80.​

  3. Convert the 128 bits to hexadecimal representation and insert the hyphens to achieve the canonical text representation.​

Answer

Before: AE21B4FC-6F72-A332-BA86-D209D1E0AF69

After: AE21B4FC-6F72-4332-BA86-D209D1E0AF69

UUID 5

  • Namespace could be website, DNS information, plain text, etc
  • The namespace value is hashed using the sha1sum algorithm

Resources

Original RFC: https://www.rfc-editor.org/rfc/rfc4122

IETF Draft document that is intended to obsolete the original RFC4122: https://www.ietf.org/archive/id/draft-ietf-uuidrev-rfc4122bis-14.html

(RFC4122 Gregorian calendar no roll overs till 3603)

minecraft uses Version 4

50 quadrillion, 190 trillion and 8 1 quadrillion 434 trillion 281 billion 810 million 739 thousand and 360

Gregorian Calendar 15/10/1582 - The Gregorian calendar, also known as the Western calendar, is the most widely used calendar in the world today. Its predecessor, the Julian calendar, was replaced because it did not correctly reflect the actual time it takes the Earth to circle once around the Sun, known as a tropical year. Unix 01/01/1970 - The year 2038 problem is related to Unix time because times after 03:14:07 UTC on 19 January 2038 will require computers to store the value as greater than 32-bits.

Years designed as two digits not four. In January, a World Bank report estimated that only 21 of 139 developing countries had taken concrete steps to address the year 2000 problem. The report went on to anticipate year 2000 impacts on power, telecommunications, energy, food distribution and medical care in developing countries hundreds of billions of dollars spent to kill off the bug worldwide, Y2K ended as mere only glitches.

md5sum needs to be stripped of whitespace and `-`

Each 512-bit block gets broken down further into 16 sub-blocks of 32 bits each. There are four rounds of operations, with each round utilizing all the sub-blocks, the buffers, and a constant array value. This constant array can be denoted as T[1] -> T[64]. Each of the sub-blocks are denoted as M[0] -> M[15].

Command - dd == data defintion arguments - if == read from FILE instead of stdin - count == Copy N blocks - bs == read/write byes 1 at a time - 2> errors redirected to blackhole Command - xxd == hexdump - ps == plain text from beginnning of stream