zxcvbn vs Entropy: Why Password Strength Meters Disagree

One meter says "weak", another says "strong", and the entropy calculator says something else again. Here is what each is actually measuring — and which one to trust.

Type Tr0ub4dor&3 into five different password strength meters and you will get five different answers. One says "Strong". One says "Fair". The entropy calculator gives you a number with too many decimal places. zxcvbn says it falls in 3 hours of offline cracking. They are all looking at the same string. The disagreement comes from what they are trying to measure.

What "Entropy" Meters Compute

Naive meters count character classes. Lowercase only? 26-character alphabet. Add uppercase? 52. Add digits? 62. Add symbols? Roughly 95. Then they multiply the password length by log2 of the alphabet size to get a bit count.

This formula is correct only when the password was drawn uniformly at random from that alphabet. aaaaaaaa from a 95-character alphabet would score about 52 bits — even though an attacker would guess it in roughly one attempt. The formula does not know anything about likelihood; it only knows the size of the search space if every character were independent.
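A minimal sketch of the naive calculation, assuming the class sizes quoted above (26 lowercase, 26 uppercase, 10 digits, and a 33-character symbol class that includes the space, for roughly 95 in total) and the formula bits = length × log2(alphabet size). The function names are illustrative and not taken from any particular meter.

```python
import math
import string

ASCII_ALNUM = string.ascii_letters + string.digits

def naive_alphabet_size(password: str) -> int:
    """Guess the alphabet from the character classes present in the password."""
    size = 0
    if any(c in string.ascii_lowercase for c in password):
        size += 26
    if any(c in string.ascii_uppercase for c in password):
        size += 26
    if any(c in string.digits for c in password):
        size += 10
    # Assumption: anything else counts as one 33-character symbol class
    # (32 ASCII punctuation marks plus the space).
    if any(c not in ASCII_ALNUM for c in password):
        size += 33
    return size

def entropy_bits(length: int, alphabet_size: int) -> float:
    """length * log2(alphabet_size); meaningful only for uniformly random choices."""
    if length <= 0 or alphabet_size <= 1:
        return 0.0
    return length * math.log2(alphabet_size)

def naive_entropy_bits(password: str) -> float:
    """What a class-counting meter would report for this password."""
    return entropy_bits(len(password), naive_alphabet_size(password))

for pw in ("aaaaaaaa", "Tr0ub4dor&3"):
    print(pw, naive_alphabet_size(pw), round(naive_entropy_bits(pw), 1))
```

A meter that detects classes this way would place aaaaaaaa in a 26-character alphabet (about 37.6 bits); scoring it against the full 95-character alphabet, as in the example above, gives about 52.6 bits. Neither figure reflects how quickly it would actually be guessed.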

What Composition-Rule Meters Compute

The green-bar-on-the-signup-page meter usually does something even simpler: it adds points for length, points for having mixed case, points for digits, points for symbols. Hit the magic threshold and the bar turns green.

de Carné de Carnavalet and Mannan's 2014 study tested 13 of these meters from major sites. Their conclusion: most are inconsistent with each other and with reality. The same password could be "weak" on one site and "very strong" on the next. Password1! often scores green. It is in every breach corpus.
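A composition-rule meter can be sketched in a few lines. The point values, the 8-character threshold, and the colour mapping below are hypothetical, chosen only to illustrate the approach; real sites each use their own rules.

```python
def composition_score(password: str) -> int:
    """Award one point per satisfied composition rule (hypothetical rules)."""
    score = 0
    if len(password) >= 8:
        score += 1  # long enough
    if any(c.islower() for c in password) and any(c.isupper() for c in password):
        score += 1  # mixed case
    if any(c.isdigit() for c in password):
        score += 1  # contains a digit
    if any(not c.isalnum() for c in password):
        score += 1  # contains a symbol (or space)
    return score

def bar_colour(score: int) -> str:
    """Map the point total onto the familiar coloured bar."""
    return ("red", "red", "orange", "yellow", "green")[score]

for pw in ("Password1!", "correct horse battery staple"):
    print(pw, bar_colour(composition_score(pw)))
```

Under these assumed rules Password1! ticks every box and turns the bar green, while a long lowercase passphrase only reaches orange.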

What zxcvbn Computes

zxcvbn, written by Dropbox engineer Daniel Wheeler in 2012, takes a different approach: it tries to estimate how an attacker would actually crack the password. It does this by decomposing the input into the most plausible sequence of patterns and summing the cost of each pattern.

The patterns it knows about include:

  • Dictionary words (English, names, common passwords) plus leet-speak variants and capitalisation.
  • Sequences (abcdef, 123456).
  • Repeats (aaaa, abcabc).
  • Keyboard patterns (qwerty, asdfgh).
  • Dates, years, and common formats.

It then adds up the search cost across the parsed patterns. Tr0ub4dor&3 looks complex but parses as "capitalised dictionary word with leet substitutions, plus symbol, plus digit", a structure that an attacker's rule set covers cheaply. zxcvbn's estimate matches reality much better than the entropy formula.
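The decomposition idea can be illustrated with a toy estimator. This is not zxcvbn's code: the four-word dictionary, the ranks, the leet table, the doubling for capitalisation and leet variants, and the 10-guesses-per-character fallback are all assumptions made for the sketch. It only knows three pattern types (dictionary word, repeat, single-character brute force) and picks the cheapest way to cover the whole password, adding costs in log10 guesses.

```python
import math

# Hypothetical mini-dictionary: word -> frequency rank (1 = most common).
RANKED_WORDS = {"password": 1, "dragon": 10, "monkey": 12, "troubador": 8000}
# Hypothetical leet table: maps common substitutions back to letters.
LEET = str.maketrans("04@31!5$7", "oaaeiisst")
BRUTE_FORCE_GUESSES_PER_CHAR = 10  # illustrative constant

def match_guesses(token: str):
    """Yield a guess count for every pattern that explains the whole token."""
    if len(token) == 1:
        yield BRUTE_FORCE_GUESSES_PER_CHAR  # fallback: guess the character
    if len(token) >= 3 and len(set(token)) == 1:
        # Repeat: pick the character, then the repeat count.
        yield BRUTE_FORCE_GUESSES_PER_CHAR * len(token)
    lowered = token.lower()
    plain = lowered.translate(LEET)
    if plain in RANKED_WORDS:
        guesses = RANKED_WORDS[plain]
        if lowered != token:
            guesses *= 2  # attacker also tries capitalised variants
        if plain != lowered:
            guesses *= 2  # attacker also tries leet variants
        yield guesses

def estimate_log10_guesses(password: str) -> float:
    """Cheapest cover of the password by known patterns, in log10 guesses."""
    n = len(password)
    best = [0.0] + [math.inf] * n  # best[i] = cheapest cover of password[:i]
    for end in range(1, n + 1):
        for start in range(end):
            for guesses in match_guesses(password[start:end]):
                cost = best[start] + math.log10(guesses)
                best[end] = min(best[end], cost)
    return best[n]

for pw in ("Tr0ub4dor&3", "aaaaaaaa", "Password1!"):
    print(pw, round(estimate_log10_guesses(pw), 2))
```

Even this toy version rates Tr0ub4dor&3 at under 10^7 guesses, far below the roughly 10^21 implied by treating it as 11 random characters from a 95-character alphabet.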

Why They Disagree

The three approaches are answering different questions:

  • Entropy: "If this password were random, how many guesses would brute force take?"
  • Composition meter: "Does this password tick the corporate compliance boxes?"
  • zxcvbn: "If a real attacker with a real cracking dictionary tried this, how long would it take?"

For human-chosen passwords, zxcvbn is closest to the truth. For machine-generated random passwords, the entropy formula is correct and zxcvbn underestimates strength because it can't prove the input was random. The composition meter is wrong in both cases.

The Edge Case That Breaks zxcvbn

zxcvbn's 0-to-4 score tops out at around 10^10 guesses, because beyond that the exact number stops being useful. So a genuinely random 20-character password registers as "practically uncrackable" without the score distinguishing between 70 bits and 130 bits. For comparing very strong passwords, fall back to the entropy formula.
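The saturation is easy to see by mapping guess counts onto a 0 to 4 score. The thresholds below follow the ones described in the zxcvbn documentation (10^3, 10^6, 10^8, 10^10); treat them as approximate, since the reference implementation may apply small offsets. The function name is illustrative.

```python
# Approximate score boundaries, in guesses, as described in the zxcvbn docs.
SCORE_THRESHOLDS = (10**3, 10**6, 10**8, 10**10)

def score_from_guesses(guesses: int) -> int:
    """Return 0..4: the number of thresholds the guess count has reached."""
    return sum(guesses >= threshold for threshold in SCORE_THRESHOLDS)

# A 70-bit and a 130-bit random password land in the same top bucket.
for bits in (20, 40, 70, 130):
    print(bits, "bits -> score", score_from_guesses(2**bits))
```

Both 2^70 and 2^130 guesses map to score 4, which is why the score alone cannot rank very strong passwords against each other.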

zxcvbn's dictionaries also reflect 2012-era data. A modern reimplementation, zxcvbn-ts, refreshes the corpora and adds language packs, but the underlying assumption — attackers use roughly these patterns — has not fundamentally shifted.

Which Meter to Trust

For a password you typed yourself, trust zxcvbn. If it says "crackable in seconds", believe it, even if a green bar elsewhere disagrees. The estimator was specifically built to model the cracking pipeline used in real breaches.

For a password generated from a CSPRNG with a known character set, the entropy formula is correct: bits = length × log2(charset). Use that.
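A short sketch of that case, using Python's standard `secrets` module as the CSPRNG. The 94-character alphabet (ASCII letters, digits, and punctuation, without the space) and the helper names are choices made for this example.

```python
import math
import secrets
import string

# Assumed character set: 52 letters + 10 digits + 32 punctuation marks = 94.
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def generate_password(length: int, alphabet: str = ALPHABET) -> str:
    """Draw each character independently and uniformly from the alphabet."""
    return "".join(secrets.choice(alphabet) for _ in range(length))

def entropy_bits(length: int, charset_size: int) -> float:
    """bits = length * log2(charset); valid because the choices are uniform."""
    return length * math.log2(charset_size)

def length_for_bits(target_bits: float, charset_size: int) -> int:
    """Smallest length whose entropy reaches the target."""
    return math.ceil(target_bits / math.log2(charset_size))

password = generate_password(20)
print(password, round(entropy_bits(len(password), len(ALPHABET)), 1), "bits")
print("length needed for 128 bits:", length_for_bits(128, len(ALPHABET)))
```

The formula applies here only because every character is drawn independently from a known set; it says nothing useful about a password a person composed by hand.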

For the green bar on a signup page — politely ignore it. NIST SP 800-63B has been telling sites to drop composition rules since 2017, but most have not. The bar is a UX nudge, not a measurement.

References

  1. Wheeler, D. L. (2016). zxcvbn: Low-Budget Password Strength Estimation. USENIX Security Symposium.
  2. de Carné de Carnavalet, X. & Mannan, M. (2014). From Very Weak to Very Strong: Analyzing Password-Strength Meters. Network and Distributed System Security Symposium.
  3. Grassi, P. A. et al. (2017). NIST Special Publication 800-63B: Digital Identity Guidelines: Authentication and Lifecycle Management. National Institute of Standards and Technology.
  4. Ur, B. et al. (2017). Design and Evaluation of a Data-Driven Password Meter. ACM Conference on Human Factors in Computing Systems (CHI).