Email Length VARCHAR Limit in a User Database

Published: 15 Mar 2023

This page summarizes the factors to consider when deciding the maximum VARCHAR length of the email address field in a typical user database.

The theoretical limit is 254 characters, based on IETF RFC 5321. Therefore, VARCHAR(254) is the default data type if you want to accommodate every possible email address.

That said, the vast majority of real emails are much shorter: 99.9%+ are below 40 characters based on various analyses of user/email databases. For reasons including storage, performance, UI/UX, or security, you may want to consider a lower limit, such as VARCHAR(63), if your particular use case allows.

Theoretical limits: 320 or 254?

320 characters

Every email address has two parts:

  • The local part (before the "@") = max 64 characters
  • The domain part (after the "@") = max 255 characters

Therefore, the maximum length is 64 + 1 (for the "@") + 255 = 320 characters.

These limits are specified in RFC 5321 section 4.5.3.1. Size Limits and Minimums:

4.5.3.1.1.  Local-part

   The maximum total length of a user name 
   or other local-part is 64 octets.

4.5.3.1.2.  Domain

   The maximum total length of a domain name 
   or number is 255 octets.

RFCs (Requests for Comments) are documents published by the Internet Engineering Task Force (IETF) which describe internet protocols, standards, and best practices. RFC 5321 defines the Simple Mail Transfer Protocol (SMTP) used for sending emails over the internet.

The 255-character domain name limit is based on RFC 1035 (2.3.4. Size limits), which defines the Domain Name System (DNS) used for resolving domain names.

254 characters

In practice, while you can theoretically create an email address with up to 320 characters, it would be unreliable to actually use it for sending and receiving emails.

Most email clients and servers impose a limit of 256 characters on the so-called Path, which is the email address enclosed in angle brackets, e.g. <john@gmail.com>.

The same RFC 5321 section 4.5.3.1. Size Limits and Minimums mentions this limit:

4.5.3.1.3.  Path

   The maximum total length of a reverse-path 
   or forward-path is 256 octets (including 
   the punctuation and element separators).

Therefore, excluding the angle brackets, the email address alone is limited to 254 characters.
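
The three limits above can be combined into a single length check. The following is a minimal Python sketch, not a full address validator: the function name fits_rfc_limits and the constant names are made up for illustration, and it checks only lengths (in octets, as the RFC counts them), not syntax.

```python
MAX_LOCAL_OCTETS = 64    # RFC 5321, 4.5.3.1.1
MAX_DOMAIN_OCTETS = 255  # RFC 5321, 4.5.3.1.2
MAX_PATH_OCTETS = 256    # RFC 5321, 4.5.3.1.3, includes "<" and ">"
MAX_ADDRESS_OCTETS = MAX_PATH_OCTETS - 2  # 254 without the angle brackets


def fits_rfc_limits(address: str) -> bool:
    """Return True if the address respects the RFC 5321 length limits.

    Only lengths are checked; this is not a syntax validator.
    """
    # Split on the last "@": the domain part cannot contain one.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return False
    # The RFC counts octets, so measure the UTF-8 encoded form.
    return (
        len(local.encode("utf-8")) <= MAX_LOCAL_OCTETS
        and len(domain.encode("utf-8")) <= MAX_DOMAIN_OCTETS
        and len(address.encode("utf-8")) <= MAX_ADDRESS_OCTETS
    )
```

Under this check, a 64-character local part leaves at most 189 characters for the domain (254 - 64 - 1), so the 255-character domain limit can never be reached in a usable address.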

Best in most cases: VARCHAR(254)

Based on the above limits, the default data type to use for email addresses is VARCHAR(254).

It should accommodate any valid email address.

Some people use VARCHAR(255) out of habit, but there is no use for the extra character.
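
As an illustration, here is a small sketch of such a column using Python's built-in sqlite3 module. The table and column names are hypothetical. One caveat: SQLite accepts VARCHAR(254) in the schema but does not enforce the length, so the sketch adds a CHECK constraint to emulate what other systems (for example PostgreSQL, or MySQL in strict mode) do with the VARCHAR length alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite ignores the "(254)" in VARCHAR(254), so the CHECK constraint
# does the actual enforcement here. length() counts characters for text.
conn.execute(
    """
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email VARCHAR(254) NOT NULL UNIQUE
              CHECK (length(email) <= 254)
    )
    """
)


def try_insert(email: str) -> bool:
    """Insert an email; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        return False
```

Calling try_insert with a 255-character address returns False, as does inserting the same address twice (because of the UNIQUE constraint).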

99.9% of emails are below 40 characters

In the real world, most email addresses are much shorter than either 320 or 254 characters.

The typical length is somewhere around 15-30 characters.

  • john@gmail.com = 14 characters
  • john.smith@company.com = 22 characters
  • alexander.stapleton@smallercompany.com = 38 characters

Even Alexander in the above example would probably have preferred a shorter email address if his employer allowed.

My own data

I checked the user database from one of my own sites, with a sample of 6797 email addresses of paying customers (therefore all valid emails):

  • Maximum email address length was 47 characters. It was a hotmail.com email, with local part of the funny kind (x-rated), but a valid paying customer indeed.
  • The next two longest emails were 43 and 41 characters.
  • All others were up to 39 characters.

These are the cumulative shares (the first row ">30 = 2.32%" means 2.32% of all emails in the database were longer than 30 characters):

>30 = 2.32%
>31 = 1.69%
>32 = 1.31%
>33 = 0.82%
>34 = 0.53%
>35 = 0.35%
>36 = 0.22%
>37 = 0.18%
>38 = 0.12%
>39 = 0.04%
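
If you want to run the same kind of check on your own user table, a table like the one above can be computed with a few lines of Python. This is only a sketch: the function name share_longer_than is made up, and the sample list is invented for the demonstration (it is not my customer data).

```python
from typing import Dict, Iterable


def share_longer_than(emails: Iterable[str], thresholds: Iterable[int]) -> Dict[int, float]:
    """For each threshold, return the percentage of emails longer than it."""
    lengths = [len(e) for e in emails]
    if not lengths:
        return {t: 0.0 for t in thresholds}
    total = len(lengths)
    return {
        t: 100.0 * sum(1 for n in lengths if n > t) / total
        for t in thresholds
    }


# Made-up sample, not the data from the table above.
sample = [
    "john@gmail.com",                          # 14 characters
    "john.smith@company.com",                  # 22 characters
    "alexander.stapleton@smallercompany.com",  # 38 characters
    "x" * 35 + "@hotmail.com",                 # 47 characters
]
for threshold, pct in share_longer_than(sample, [30, 39]).items():
    print(f">{threshold} = {pct:.2f}%")
```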

Another analysis on a bigger sample

A similar analysis posted on StackOverflow using a much larger sample (>10M but including invalid emails) confirmed my own findings. Their main points (quote):

  • The longest valid one is 89.
  • There are hundreds longer ones up to the limit of our column (255) but they are apparently fake by visual inspection.
  • The peak of the length distribution is at 19.
  • There isn't long tail. Everything falls off sharply after 38.

And their conclusion:

We cleaned up the DB by throwing away anything longer than 40. The good news is that no one has complained but the bad news is not many records got cleaned out.

More email address length statistics:
https://www.atdata.com/blog/long-email-addresses

Some use shorter VARCHAR

In light of the email address length distribution, many sources mention using a VARCHAR shorter than 254, such as 50, 63, 80, or 127.

Here are a few more quotes from StackOverflow:

I've used 120 variable characters for years. The real world logic is that even if someone is ready to fill your 320 varchar field...I bet they have a 40 char alternative email just standing by

In my systems I also use varchar(50) and I have never had a complaint that a user cannot register.

The questions are:

  • Why use a VARCHAR shorter than 254 in the first place? (it is not just performance)
  • When can you safely use a shorter VARCHAR and when not?

Why use shorter VARCHAR

The obvious motivations are storage and performance, but there are often other reasons, not limited to the database itself, such as UI/UX and security.

Storage

When using VARCHAR, storage space is not important from the single column perspective, as VARCHAR only uses the space actually needed for the particular string (unlike fixed length CHAR). If most email addresses are shorter than 40 characters, the space needed to store them in VARCHAR(50) is not much different from VARCHAR(254).

That said, in tables with many columns it may become important for the total row size limit, which varies across DBMS and storage engines. This is closely related to performance.
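
To make the single-column point concrete, here is a rough Python model of the storage cost. It rests on two simplifying assumptions that are mine, not a description of any particular engine: every character takes one byte, and each VARCHAR value carries a 1-byte length prefix. Real DBMS differ in the exact overhead, and the function names are hypothetical.

```python
def varchar_bytes(value: str, length_prefix: int = 1) -> int:
    """Bytes for one VARCHAR value: the encoded string plus a length prefix.

    The declared maximum (50, 254, ...) does not appear here at all,
    which is exactly the point: it does not affect the stored size.
    """
    return len(value.encode("utf-8")) + length_prefix


def char_bytes(declared_length: int) -> int:
    """Bytes for one CHAR(n) value, padded to n single-byte characters."""
    return declared_length


emails = ["john@gmail.com", "john.smith@company.com"]
varchar_total = sum(varchar_bytes(e) for e in emails)  # same for VARCHAR(50) and VARCHAR(254)
char_total = char_bytes(254) * len(emails)             # fixed-length CHAR(254) for comparison
print(varchar_total, char_total)
```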

Performance

In a typical web application, the email address is one of the most important pieces of user data. It uniquely identifies users and is used in login forms, database indexes, and joins. There are many reasons to make it as small and fast as possible.

UI/UX

A long email address takes up a lot of screen space. If you need to also consider extremely long emails, designing the user interface becomes more complicated – in login and signup forms, user profile pages, and other places where email address appears alongside other information.

Security

Shorter inputs are also preferable for security, as they give a potential attacker less room to work with.

When to use shorter VARCHAR and when not

You can use a lower limit on email address length if your use case satisfies the following conditions:

You have complete control over user input. All emails come into your database from online forms which you can implement and validate yourself. For example, if your database uses VARCHAR(63) and someone tries to register with a 70-character email, your signup form should inform them about the limit and prompt them to use a shorter email address. Some people will comply, some people will leave.

You are OK with rejecting a small portion of potential customers. This is in line with the Pareto principle of 80/20, in this case more like a 75% length reduction by rejecting <0.01% of users.

Conversely, you should use the full VARCHAR(254) if you expect to get users and emails from sources you can't fully control.

This applies, for instance, if you collect leads offline or from a third party, or if your company acquires another and you want to merge its database with yours.
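
The first condition (full control over input) could look like this on the server side of a signup form. This is a hedged sketch in Python: the function name email_length_error, the constant EMAIL_MAX_LENGTH, and the message wording are all made up, and a real form would also validate the address syntax.

```python
from typing import Optional

EMAIL_MAX_LENGTH = 63  # assumed to match the column definition, e.g. VARCHAR(63)


def email_length_error(email: str, max_length: int = EMAIL_MAX_LENGTH) -> Optional[str]:
    """Return a user-facing error message, or None if the length is acceptable."""
    email = email.strip()
    if not email:
        return "Please enter your email address."
    if len(email) > max_length:
        return (
            f"This email address has {len(email)} characters. "
            f"Please use one with at most {max_length} characters."
        )
    return None
```

Keeping the limit in one constant shared by the form and the schema avoids the situation where the form accepts an address that the database then rejects or truncates.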
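
Before importing addresses from a source like that into a shorter column, it is worth checking what would not fit. A minimal Python sketch, with hypothetical names (ImportAudit, audit_email_lengths) and an invented input list:

```python
from typing import Iterable, List, NamedTuple


class ImportAudit(NamedTuple):
    longest: int          # length of the longest address seen
    too_long: List[str]   # addresses that exceed the column limit


def audit_email_lengths(emails: Iterable[str], limit: int) -> ImportAudit:
    """Report which addresses would not fit a VARCHAR(limit) column."""
    longest = 0
    too_long = []
    for email in emails:
        longest = max(longest, len(email))
        if len(email) > limit:
            too_long.append(email)
    return ImportAudit(longest=longest, too_long=too_long)


# Invented example data standing in for a third-party list.
incoming = ["john@gmail.com", "a" * 58 + "@example.com"]
audit = audit_email_lengths(incoming, limit=63)
print(audit.longest, len(audit.too_long))
```

If too_long is not empty, the column is too short for that source, which is the case where VARCHAR(254) is the safer choice.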

Premature optimization?

When building a new database and application from scratch, limiting email length to something like 50 or 63 characters may be a case of premature optimization.

The truth is that 99% of projects will never grow to a size where you need to worry about the performance implications of email VARCHAR length.

Even if you do, there are probably other, more effective places where database performance can be optimized.

That said, reducing the email VARCHAR limit in an existing database would be close to impossible. You would probably have to contact the few unlucky users with something like this:

Dear valued customer,

Your email address no longer fits in our database. Please change it to maximum 63 characters or stop using our service.

Increasing an existing VARCHAR field length is always easier than decreasing it.

What I use

Personally, I use VARCHAR(63) in all projects where I have full control, including this site. It will help performance when I have a billion users a few years down the road.

Good StackOverflow threads

https://stackoverflow.com/questions/386294/what-is-the-maximum-length-of-a-valid-email-address

https://stackoverflow.com/questions/1199190/what-is-the-optimal-length-for-an-email-address-in-a-database

https://stackoverflow.com/questions/1297272/how-long-should-sql-email-fields-be

https://dba.stackexchange.com/questions/37014/in-what-data-type-should-i-store-an-email-address-in-database

© 2024 Dvut