What Is a UUID and When Should You Use One?

UUIDs are 128-bit identifiers designed to be unique without central coordination. Learn how a v4 UUID is laid out, when to use UUIDs vs auto-increment IDs, and the odds of a collision.

What Is a UUID?

A UUID (Universally Unique Identifier) is a 128-bit number used to identify information without requiring a central authority to assign it. The standard format, defined in RFC 4122 and carried forward by its 2024 successor, RFC 9562, looks like this:

550e8400-e29b-41d4-a716-446655440000

The 32 hexadecimal digits are split into five groups of 8-4-4-4-12 characters, separated by hyphens. This gives 32 hex digits × 4 bits = 128 bits in total, although in a v4 UUID 6 of those bits are fixed to encode the version and variant.
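
As a quick check of that layout, the sketch below parses the example UUID with Python's standard-library uuid module and counts digits and bits. The variable names are illustrative only.

```python
import uuid

# Parse the example UUID shown above.
u = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")

groups = str(u).split("-")
group_lengths = [len(g) for g in groups]   # hex digits in each group
hex_digits = len(u.hex)                    # same value with hyphens removed
total_bits = hex_digits * 4                # each hex digit encodes 4 bits

print(group_lengths)            # [8, 4, 4, 4, 12]
print(hex_digits, total_bits)   # 32 128
print(len(u.bytes))             # 16
```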

UUID Versions Explained

Version 1 — Time-based

Version 1 UUIDs combine a 60-bit timestamp (100-nanosecond intervals since 15 October 1582) with a clock sequence and a node identifier, normally the machine's MAC address or a random value. They are designed to be unique across machines and time, although that relies on distinct node identifiers and a well-behaved clock. They also embed information about when and where they were generated, which is a privacy concern in some contexts.
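
To make the privacy point concrete, here is a minimal sketch that recovers the creation time and node identifier from a v1 UUID using Python's standard library. The helper name v1_timestamp is an assumption for illustration, not part of any library.

```python
import uuid
from datetime import datetime, timedelta, timezone

# The v1 timestamp counts 100-ns intervals from the Gregorian calendar start.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)

def v1_timestamp(u: uuid.UUID) -> datetime:
    """Recover the creation time embedded in a version 1 UUID (hypothetical helper)."""
    if u.version != 1:
        raise ValueError("not a version 1 UUID")
    # u.time is the 60-bit timestamp; 10 intervals of 100 ns make 1 microsecond.
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)

u1 = uuid.uuid1()
print(u1)
print("created:", v1_timestamp(u1).isoformat())
print("node:   ", format(u1.node, "012x"))   # 48-bit node identifier
```

Anyone who sees the identifier can run the same extraction, which is why v1 is a poor fit for IDs exposed to untrusted parties.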

Version 4 — Random

Version 4 is the most widely used. Of its 128 bits, 122 are random (ideally drawn from a cryptographically secure generator, though that depends on the implementation), 4 bits hold the version number (4), and 2 bits hold the variant. The version appears as the first digit of the third group (41d4 → version 4). The variant bits are the top two bits of the first digit of the fourth group, so that digit is always 8, 9, a, or b (a716 in the example).
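
The positions described above can be checked directly. This sketch generates a v4 UUID with Python's uuid.uuid4() and reads the version and variant both from the text form and from the underlying 128-bit integer; the shift amounts follow from the 8-4-4-4-12 layout.

```python
import uuid

u = uuid.uuid4()
groups = str(u).split("-")

version_digit = groups[2][0]   # first hex digit of the third group
variant_digit = groups[3][0]   # first hex digit of the fourth group

# The same fields read from the 128-bit integer.
# The third group occupies bits 79..64, the fourth group bits 63..48.
version_bits = (u.int >> 76) & 0xF   # 4 version bits
variant_bits = (u.int >> 62) & 0x3   # top 2 bits of the fourth group

print(u)
print(version_digit, version_bits)                  # 4 4
print(variant_digit, format(variant_bits, "02b"))   # one of 8/9/a/b, then 10
```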

Version 7 — Time-ordered random (new)

Version 7 is a newer format, standardized in RFC 9562 (2024) and gaining adoption. It places a 48-bit, millisecond-precision Unix timestamp in the high bits, followed by random data. As a result, v7 UUIDs sort in approximate creation order, which avoids the database indexing problem that plagues v4 UUIDs, while keeping most of the randomness and embedding no MAC address. The creation time is still recoverable from the ID.
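
The layout can be sketched by hand for illustration. The function below, uuid7_sketch, is a hypothetical helper rather than a library API: it packs a 48-bit millisecond timestamp above 80 random bits and then overwrites the version and variant fields. It makes no attempt to keep IDs ordered within the same millisecond, which a production generator may want.

```python
import os
import time
import uuid
from typing import Optional

def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a v7-style UUID: 48-bit ms timestamp, then random bits (illustrative)."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")          # 80 random bits
    value = ((unix_ms & ((1 << 48) - 1)) << 80) | rand
    # Overwrite the 4 version bits (bits 79..76) with 0b0111.
    value = (value & ~(0xF << 76)) | (0x7 << 76)
    # Overwrite the 2 variant bits (bits 63..62) with 0b10.
    value = (value & ~(0x3 << 62)) | (0x2 << 62)
    return uuid.UUID(int=value)

earlier = uuid7_sketch(1_700_000_000_000)
later = uuid7_sketch(1_700_000_000_001)
print(earlier)
print(later)
print(earlier < later, str(earlier) < str(later))   # True True
```

Because the timestamp sits in the most significant bits, both the numeric and the string comparison follow creation time whenever the timestamps differ.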

The Collision Probability

A version 4 UUID has 122 random bits, so there are 2¹²² possible values. The probability of generating two identical UUIDs is vanishingly small: by the birthday approximation, reaching a 50% chance of any collision takes approximately 2.7 × 10¹⁸ UUIDs. At a rate of one billion UUIDs per second, generating that many would take about 86 years.

In practice, UUID collisions are not a realistic concern for production systems, provided the random source is sound. The birthday problem math confirms that even at internet scale, with billions of UUIDs generated per day across all systems worldwide, the cumulative collision probability over years of operation remains negligible.
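
The figures above can be reproduced with the standard birthday approximation. In this sketch the function names and the ten-year workload are assumptions chosen for illustration.

```python
import math

SPACE = 2 ** 122   # number of distinct v4 UUIDs (122 random bits)

def uuids_for_collision_probability(p: float) -> float:
    """Birthday approximation: n ~= sqrt(2 * N * ln(1 / (1 - p)))."""
    return math.sqrt(2 * SPACE * math.log(1 / (1 - p)))

def collision_probability(n: float) -> float:
    """Birthday approximation: p ~= 1 - exp(-n^2 / (2 * N))."""
    return -math.expm1(-(n * n) / (2 * SPACE))

n_half = uuids_for_collision_probability(0.5)
years = n_half / 1e9 / (365.25 * 24 * 3600)
print(f"{n_half:.2e} UUIDs for a 50% collision chance")   # roughly 2.7e+18
print(f"{years:.0f} years at one billion per second")     # roughly 86

# Hypothetical workload: one billion UUIDs per day for ten years.
n_decade = 1e9 * 365 * 10
print(f"collision probability: {collision_probability(n_decade):.1e}")
```

The last figure comes out on the order of one in a trillion, which is what "negligible" means here.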

UUID vs Auto-Increment IDs

Auto-incrementing integer IDs (1, 2, 3...) are simpler and more efficient in many database scenarios, but they have real drawbacks:

Predictability: Knowing that record 1001 exists makes it trivial to guess that records 1000 and 1002 exist — an information leak.
Merge conflicts: Merging data from two databases with separate auto-increment sequences produces key collisions. UUIDs avoid this problem for all practical purposes.
Distributed systems: In microservice architectures, records created independently on different nodes cannot safely use auto-increment without a coordination layer.
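
The merge problem in the list above is easy to see with two toy in-memory tables. The dictionaries below stand in for two independent databases and are purely illustrative.

```python
import uuid

# Two independent databases, each with its own auto-increment sequence.
db_a_rows = {1: "alice", 2: "bob"}
db_b_rows = {1: "carol", 2: "dave"}
clashing_ids = db_a_rows.keys() & db_b_rows.keys()
print(sorted(clashing_ids))   # [1, 2]

# The same rows keyed by v4 UUIDs generated independently on each side.
db_a_uuid = {uuid.uuid4(): name for name in db_a_rows.values()}
db_b_uuid = {uuid.uuid4(): name for name in db_b_rows.values()}
merged = {**db_a_uuid, **db_b_uuid}
print(len(merged))            # 4
```

With integer keys, every row in the second table needs renumbering (along with every foreign key that points at it); with UUID keys, the merge is a plain union.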

The tradeoff: random v4 UUIDs are hard on database index performance. Because the keys are random, each insert lands at an arbitrary position in a B-tree index, which causes frequent page splits and cache misses once the index outgrows memory. This is why v7 UUIDs and ULIDs (Universally Unique Lexicographically Sortable Identifiers) exist: they are time-ordered, so inserts cluster at the end of the index, much as auto-increment IDs do.
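
A rough way to see the difference in insert locality is to count how often a new key lands at the end of a sorted sequence. The sketch below uses a plain sorted Python list as a stand-in for an index; it is not a B-tree and says nothing about real page-split costs, only about where inserts land.

```python
import bisect
import uuid

def tail_insert_fraction(keys):
    """Fraction of inserts that land at the end of a sorted list of keys."""
    index = []
    tail_inserts = 0
    for key in keys:
        pos = bisect.bisect_right(index, key)
        if pos == len(index):
            tail_inserts += 1
        index.insert(pos, key)
    return tail_inserts / len(keys)

N = 5_000
random_keys = [uuid.uuid4().int for _ in range(N)]   # v4-style keys
ordered_keys = list(range(N))   # stand-in for auto-increment or time-ordered IDs

print(tail_insert_fraction(ordered_keys))   # 1.0
print(tail_insert_fraction(random_keys))    # close to 0: most inserts land mid-index
```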

Common Uses for UUIDs

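Database primary keys: particularly in distributed or multi-tenant systems
File names: uploaded user files are stored under UUID names, which prevents name collisions and, because the client-supplied name never becomes part of the path, path traversal attacks
API keys and session tokens: a v4 UUID carries 122 bits of randomness, which is enough for a temporary token only if the generator uses a cryptographically secure random source; RFC 4122 itself warns against assuming UUIDs are hard to guess
Request correlation IDs: trace a single request across distributed microservices
Idempotency keys: ensure a payment or operation is not processed twice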
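
As one example from the list above, here is a minimal sketch of UUID-based upload naming. The function storage_name and the extension allow-list are hypothetical; the point is that nothing from the client-supplied name except a vetted extension reaches the stored path.

```python
import uuid
from pathlib import PurePosixPath

ALLOWED_SUFFIXES = {".png", ".jpg", ".pdf"}   # hypothetical allow-list

def storage_name(client_filename: str) -> str:
    """Return a UUID-based file name; keep only an allow-listed extension."""
    suffix = PurePosixPath(client_filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        suffix = ""
    return f"{uuid.uuid4()}{suffix}"

hostile = storage_name("../../etc/passwd")
photo = storage_name("holiday photo.PNG")
print(hostile)   # a bare UUID, no path components
print(photo)     # a UUID followed by .png
```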

When NOT to Use UUIDs

UUIDs are 16 bytes versus 4 bytes for a 32-bit integer or 8 bytes for a 64-bit integer. At scale, this storage overhead matters — both in the table itself and in every index, foreign key, and join that references that column. For a table with billions of rows and many foreign key references, switching from bigint to UUID can measurably increase storage and slow queries.
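
The size difference is simple arithmetic, sketched here in Python. The table shape (one billion rows, the key stored four times across primary and foreign-key columns) is an assumed example, and the result ignores index and page overhead.

```python
import struct
import uuid

u = uuid.uuid4()

int32_size = struct.calcsize("<i")   # 32-bit integer
int64_size = struct.calcsize("<q")   # 64-bit integer (bigint)
uuid_binary_size = len(u.bytes)      # UUID stored in binary form
uuid_text_size = len(str(u))         # UUID stored as hyphenated text

print(int32_size, int64_size, uuid_binary_size, uuid_text_size)   # 4 8 16 36

# Assumed example: one billion rows, key stored in 4 columns overall.
rows, copies = 1_000_000_000, 4
extra_bytes = rows * copies * (uuid_binary_size - int64_size)
print(f"{extra_bytes / 1e9:.0f} GB extra versus bigint")           # 32 GB extra versus bigint
```

Storing UUIDs as 36-character text rather than in a native 16-byte type more than doubles the cost again, so a binary or native UUID column type is usually preferable where the database offers one.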

If your system is single-node, never needs to merge data from separate sources, and has high-volume write patterns where index performance is critical, a bigint auto-increment ID is often the better engineering choice.

References

Leach, P., Mealling, M., & Salz, R. (2005). RFC 4122: A Universally Unique IDentifier (UUID) URN Namespace. Internet Engineering Task Force.
Schwartz, B. (2007). UUID vs auto increment in MySQL — performance and storage analysis. Percona Database Performance Blog.
The PostgreSQL Global Development Group. (2024). uuid-ossp — generate universally unique identifiers. PostgreSQL Documentation.
Wikipedia contributors. (2024). Universally unique identifier — collision probability section. Wikipedia.