UUID v4 Generator · 5 min read

What Is a UUID and When Should You Use One?

UUIDs are 128-bit identifiers designed to be unique without central coordination. Learn how a v4 UUID is laid out, when to use UUIDs instead of auto-increment IDs, and what the odds of a collision are.

What Is a UUID?

A UUID (Universally Unique Identifier) is a 128-bit number used to identify information without requiring a central authority to assign it. The standard text format, defined in RFC 4122, looks like this:

550e8400-e29b-41d4-a716-446655440000

The 32 hexadecimal digits are split into five groups of 8-4-4-4-12 characters, separated by hyphens. That gives 32 hex digits × 4 bits = 128 bits in total, of which 6 are reserved: 4 for the version number and 2 for the variant.
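As a rough illustration of that layout, the example string can be parsed with Python's standard uuid module and its groups counted. This is a sketch only; the variable names are chosen for illustration.

```python
import uuid

# Parse the example UUID shown above.
example = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")

groups = str(example).split("-")
group_lengths = [len(g) for g in groups]  # hex digits in each group
total_bits = sum(group_lengths) * 4       # each hex digit carries 4 bits

print(group_lengths)        # [8, 4, 4, 4, 12]
print(total_bits)           # 128
print(len(example.bytes))   # 16 (the same value as raw bytes)
```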

UUID Versions Explained

Version 1: Time-based

Version 1 UUIDs combine a 60-bit timestamp (counted in 100-nanosecond intervals since 15 October 1582) with a 48-bit node identifier, which is either the machine's MAC address or a random value. They are designed to be unique across machines and time, but they embed information about when and where they were generated, which is a privacy concern in some contexts.
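To make the privacy point concrete, here is a minimal sketch that recovers the creation time from a version 1 UUID using Python's standard uuid module. The helper name uuid1_datetime is hypothetical; the offset constant is the number of 100-nanosecond intervals between 15 October 1582 and the Unix epoch.

```python
import datetime
import uuid

# 100-ns intervals between 1582-10-15 and 1970-01-01 (the Unix epoch).
GREGORIAN_TO_UNIX_100NS = 0x01B21DD213814000

def uuid1_datetime(u: uuid.UUID) -> datetime.datetime:
    """Return the UTC creation time embedded in a version 1 UUID."""
    if u.version != 1:
        raise ValueError("not a version 1 UUID")
    unix_100ns = u.time - GREGORIAN_TO_UNIX_100NS  # u.time is the 60-bit timestamp
    return datetime.datetime.fromtimestamp(unix_100ns / 1e7, tz=datetime.timezone.utc)

u1 = uuid.uuid1()
print(u1, uuid1_datetime(u1))
print(f"node: {u1.node:012x}")  # MAC address or a random 48-bit value
```

Anyone who receives such an ID can run the same extraction, which is why v1 is usually avoided for identifiers exposed to the public.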

Version 4: Random

Version 4 is the most widely used. It consists of 122 random bits, ideally drawn from a cryptographically secure generator, plus 4 bits fixed to the version number (4) and 2 bits fixed to the variant. The version appears as the first digit of the third group (41d4 → version 4). The variant bits are the top two bits of the first digit of the fourth group, so that digit is always 8, 9, a, or b (a716 in the example above).
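The bit layout can be sketched by building a v4 UUID by hand from 16 random bytes. In practice uuid.uuid4() already does this; the function make_uuid4 below is a hypothetical name used only to show where the version and variant bits go.

```python
import secrets
import uuid

def make_uuid4() -> uuid.UUID:
    """Build a version 4 UUID from 16 cryptographically random bytes."""
    b = bytearray(secrets.token_bytes(16))
    b[6] = (b[6] & 0x0F) | 0x40  # high nibble of byte 6 -> version 4
    b[8] = (b[8] & 0x3F) | 0x80  # top two bits of byte 8 -> variant 10
    return uuid.UUID(bytes=bytes(b))

u4 = make_uuid4()
text = str(u4)
print(text)
print(text[14])  # first digit of the third group: always '4'
print(text[19])  # first digit of the fourth group: one of 8, 9, a, b
```

The remaining 128 - 4 - 2 = 122 bits are left untouched, which is where the figure of 122 random bits comes from.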

Version 7: Time-ordered random (new)

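Version 7 is a newer format, standardized in RFC 9562, that is gaining adoption. It places a 48-bit, millisecond-precision Unix timestamp in the high bits, followed by random data. This makes v7 UUIDs sort in creation order, which avoids the index fragmentation that random v4 UUIDs cause, while keeping up to 74 bits of randomness. Unlike v1, no MAC address is embedded, although the creation time is still visible to anyone who sees the ID.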
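A minimal sketch of the v7 layout, assuming the simplest variant in which every non-timestamp bit is random (RFC 9562 also allows counters or sub-millisecond precision in some of those bits). The function make_uuid7 and its unix_ms parameter are hypothetical names; older Python versions have no built-in v7 generator, so the value is assembled by hand.

```python
import secrets
import time
import uuid

def make_uuid7(unix_ms=None) -> uuid.UUID:
    """Assemble a version 7 UUID: 48-bit ms timestamp, then random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    value = (unix_ms & ((1 << 48) - 1)) << 80  # bits 127..80: timestamp
    value |= 0x7 << 76                         # bits 79..76: version 7
    value |= secrets.randbits(12) << 64        # bits 75..64: random
    value |= 0b10 << 62                        # bits 63..62: variant
    value |= secrets.randbits(62)              # bits 61..0: random
    return uuid.UUID(int=value)

earlier = make_uuid7(1_700_000_000_000)
later = make_uuid7(1_700_000_000_001)
print(earlier)
print(later)
print(str(earlier) < str(later))  # True: later timestamps sort later
```

Two IDs created within the same millisecond are ordered only by their random bits, so this sketch does not guarantee strict ordering inside a millisecond.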

The Collision Probability

A version 4 UUID has 122 random bits, so there are 2^122 (about 5.3 × 10^36) possible values. The probability of generating two identical UUIDs seems vanishingly small, and it is. To have a 50% chance of at least one collision, you would need to generate approximately 2.7 × 10^18 UUIDs. At a rate of one billion UUIDs per second, that would take about 86 years.
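The figures above follow from the standard birthday-problem approximation n ≈ sqrt(2 · N · ln 2), where N is the number of possible values. A short sketch of the arithmetic, with constant names chosen for illustration:

```python
import math

SPACE = 2 ** 122  # number of distinct v4 UUIDs

# Birthday approximation for a 50% chance of at least one collision.
n_half = math.sqrt(2 * SPACE * math.log(2))

RATE_PER_SECOND = 1_000_000_000        # one billion UUIDs per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600
years = n_half / RATE_PER_SECOND / SECONDS_PER_YEAR

print(f"{n_half:.2e} UUIDs, about {years:.0f} years")
```

This should come out at roughly 2.7 × 10^18 UUIDs and about 86 years, matching the numbers quoted in the text.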

In practice, UUID collisions are not a realistic concern for a production system, provided the generator has a good source of randomness. The birthday problem math confirms that even at internet scale (billions of UUIDs generated per day across all systems worldwide) the cumulative collision probability over years of operation remains negligible.
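To put a number on "negligible", the same approximation gives the collision probability for any count n as p ≈ 1 − exp(−n² / 2N). The sketch below assumes a hypothetical workload of one billion UUIDs per day for ten years; the function name is illustrative.

```python
import math

def collision_probability(n: float, random_bits: int = 122) -> float:
    """Approximate P(at least one collision) among n random IDs."""
    space = 2.0 ** random_bits
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(-(n * n) / (2.0 * space))

per_day = 1_000_000_000  # hypothetical: one billion UUIDs per day
days = 365 * 10          # ten years of operation
p = collision_probability(per_day * days)
print(f"{p:.2e}")
```

Under these assumptions the result is on the order of 10^-12, roughly one in a trillion.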

UUID vs Auto-Increment IDs

Auto-incrementing integer IDs (1, 2, 3...) are simpler and more efficient in many database scenarios, but they have real drawbacks:

  • Predictability: Knowing that record 1001 exists makes it trivial to guess that records 1000 and 1002 exist, which is an information leak.
  • Merge conflicts: Merging data from two databases with separate auto-increment sequences produces collisions. UUIDs eliminate this problem entirely.
  • Distributed systems: In microservice architectures, records created independently on different nodes cannot safely use auto-increment without a coordination layer.

The tradeoff: random v4 UUIDs are hard on database index performance. Because they are random, each insert lands at an arbitrary position in a B-tree index, causing frequent page splits and cache misses. This is why v7 UUIDs and ULIDs (Universally Unique Lexicographically Sortable Identifiers) exist: they are time-ordered, so inserts cluster at the end of the index, as auto-increment IDs would.
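The locality difference can be illustrated with a toy model in which a sorted Python list stands in for the B-tree index. This is not a benchmark: the time-ordered keys below are simplified 16-byte values with a 6-byte counter prefix (a stand-in for a millisecond timestamp), not valid v7 UUIDs, and tail_insert_fraction is a hypothetical helper.

```python
import bisect
import secrets
import uuid

def tail_insert_fraction(keys) -> float:
    """Fraction of inserts that land at the end of a sorted index."""
    index = []
    tail_inserts = 0
    for key in keys:
        pos = bisect.bisect_right(index, key)
        if pos == len(index):
            tail_inserts += 1
        index.insert(pos, key)
    return tail_inserts / len(keys)

COUNT = 2000
random_keys = [uuid.uuid4().bytes for _ in range(COUNT)]
# Simplified time-ordered keys: increasing 6-byte prefix + 10 random bytes.
ordered_keys = [ms.to_bytes(6, "big") + secrets.token_bytes(10) for ms in range(COUNT)]

print(f"random v4:    {tail_insert_fraction(random_keys):.3f}")
print(f"time-ordered: {tail_insert_fraction(ordered_keys):.3f}")
```

With time-ordered keys every insert appends to the end, while random keys almost never do, which is the pattern that leads to page splits in a real index.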

Common Uses for UUIDs

  • Database primary keys: particularly in distributed or multi-tenant systems
  • File names: uploaded user files get UUID names to prevent collisions and path traversal attacks
  • API keys and session tokens: a v4 UUID provides 122 bits of randomness, sufficient for a temporary token as long as it comes from a cryptographically secure generator
  • Request correlation IDs: trace a single request across distributed microservices
  • Idempotency keys: ensure a payment or operation is not processed twice

When NOT to Use UUIDs

UUIDs are 16 bytes versus 4 bytes for a 32-bit integer or 8 bytes for a 64-bit integer. At scale, this storage overhead matters, both in the table itself and in every index, foreign key, and join that references that column. For a table with billions of rows and many foreign key references, switching from bigint to UUID can measurably increase storage and slow queries.
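A back-of-the-envelope estimate of that overhead, under loudly hypothetical assumptions: one billion rows, and four places where each key is stored (the primary key plus three foreign key or index copies). It counts raw key bytes only and ignores row headers, padding, and page fill factors.

```python
import uuid

u = uuid.uuid4()
UUID_BYTES = len(u.bytes)       # 16 bytes in binary form
UUID_TEXT_BYTES = len(str(u))   # 36 bytes if stored as text instead
BIGINT_BYTES = 8

rows = 1_000_000_000            # hypothetical table size
key_columns = 4                 # hypothetical: PK plus three FK/index copies

extra_bytes = rows * key_columns * (UUID_BYTES - BIGINT_BYTES)
print(f"extra raw key storage: {extra_bytes / 1e9:.0f} GB")
```

Storing UUIDs as 36-character text rather than in a native 16-byte type makes the gap considerably larger.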

If your system is single-node, never needs to merge data from separate sources, and has high-volume write patterns where index performance is critical, a bigint auto-increment ID is often the better engineering choice.

References

  1. Leach, P., Mealling, M., & Salz, R. (2005). RFC 4122: A Universally Unique IDentifier (UUID) URN Namespace. Internet Engineering Task Force.
  2. Schwartz, B. (2007). UUID vs auto increment in MySQL: performance and storage analysis. Percona Database Performance Blog.
  3. The PostgreSQL Global Development Group. (2024). uuid-ossp: generate universally unique identifiers. PostgreSQL Documentation.
  4. Wikipedia contributors. (2024). Universally unique identifier: collision probability section. Wikipedia.