This page summarizes the factors to consider when deciding on the maximum VARCHAR length of the email address field in a typical user database. In most cases, VARCHAR(254) is the safe default.
That said, the vast majority of real emails are much shorter: 99.9%+ are below 40 characters, based on various analyses of user/email databases. For reasons including storage, performance, UI/UX, or security, you may want to consider a lower limit, such as VARCHAR(63), if your particular use case allows.
On this page:
- Theoretical limits: 320 or 254?
- 320 characters
- 254 characters
- Best in most cases: VARCHAR(254)
- 99.9% emails are below 40 characters
- My own data
- Another analysis on bigger sample
- Some use shorter VARCHAR
- Why use shorter VARCHAR
- When to use shorter VARCHAR and when not
- Premature optimization?
- What I use
- Good StackOverflow threads
Theoretical limits: 320 or 254?
Every email address has two parts:
- The local part (before the "@") = max 64 characters
- The domain part (after the "@") = max 255 characters
Therefore, the maximum length is 64 + 1 (for the "@") + 255 = 320 characters.
These limits come from RFC 5321:
4.5.3.1.1. Local-part: The maximum total length of a user name or other local-part is 64 octets.
4.5.3.1.2. Domain: The maximum total length of a domain name or number is 255 octets.
RFCs (Requests for Comments) are documents published by the Internet Engineering Task Force (IETF) which describe internet protocols, standards, and best practices. This one defines the Simple Mail Transfer Protocol (SMTP) used for sending emails over the internet.
In practice, while you can theoretically create an email address with up to 320 characters, it would be unreliable to actually use it for sending and receiving emails.
Most email clients and servers impose a limit of 256 characters on the so-called Path, which is the email address enclosed in angle brackets, e.g. <firstname.lastname@example.org>.
4.5.3.1.3. Path: The maximum total length of a reverse-path or forward-path is 256 octets (including the punctuation and element separators).
Therefore, excluding the angle brackets, the email address alone is limited to 254 characters.
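The three limits above can be checked together. The following is a minimal Python sketch, not a full address validator: the function name `email_length_problems` and the constant names are made up for illustration, and it only measures lengths without checking syntax. The RFC counts octets, so the sketch measures the UTF-8 encoded length; for plain ASCII addresses this equals the character count.

```python
MAX_LOCAL_OCTETS = 64     # local-part limit quoted above
MAX_DOMAIN_OCTETS = 255   # domain limit quoted above
MAX_ADDRESS_OCTETS = 254  # 256-octet path minus the two angle brackets


def email_length_problems(address: str) -> list[str]:
    """Return a list of length-limit violations (empty if none).

    Checks lengths only; it does not validate the address syntax.
    """
    # Split on the last "@", since a quoted local part may itself contain "@".
    local, sep, domain = address.rpartition("@")
    if not sep:
        return ["missing @"]

    problems = []
    # The limits are stated in octets, so measure the encoded length.
    if len(local.encode("utf-8")) > MAX_LOCAL_OCTETS:
        problems.append("local part longer than 64 octets")
    if len(domain.encode("utf-8")) > MAX_DOMAIN_OCTETS:
        problems.append("domain longer than 255 octets")
    if len(address.encode("utf-8")) > MAX_ADDRESS_OCTETS:
        problems.append("address longer than 254 octets")
    return problems
```

Note that an address with a 64-character local part and a 255-character domain passes the first two checks but fails the third, which is exactly the 320 versus 254 distinction.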
Best in most cases: VARCHAR(254)
Based on the above limits, the default data type to use for email addresses is VARCHAR(254).
It should accommodate any valid email address.
Some people use VARCHAR(255) out of habit, but there is no use for the extra character.
99.9% emails are below 40 characters
In the real world, most email addresses are much shorter than either 320 or 254 characters.
The typical length is somewhere around 15-30 characters.
- email@example.com = 14 characters
- firstname.lastname@example.org = 22 characters
- email@example.com = 38 characters
Even Alexander in the above example would probably have preferred a shorter email address if his employer allowed.
My own data
I checked the user database from one of my own sites, with a sample of 6797 email addresses of paying customers (therefore all valid emails):
- Maximum email address length was 47 characters. It was a hotmail.com email, with local part of the funny kind (x-rated), but a valid paying customer indeed.
- The next two longest emails were 43 and 41 characters.
- All others were up to 39 characters.
These are the relative frequencies (the first row ">30 = 2.32%" means 2.32% of all emails in the database were longer than 30 characters):
- >30 = 2.32%
- >31 = 1.69%
- >32 = 1.31%
- >33 = 0.82%
- >34 = 0.53%
- >35 = 0.35%
- >36 = 0.22%
- >37 = 0.18%
- >38 = 0.12%
- >39 = 0.04%
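If you want to produce the same kind of breakdown for your own database, a short Python sketch along these lines should do. The function name `share_longer_than` and the sample list are hypothetical; the sample is made up for illustration and is not the data described above.

```python
def share_longer_than(emails, thresholds):
    """Map each threshold to the percentage of emails longer than it."""
    total = len(emails)
    if total == 0:
        return {t: 0.0 for t in thresholds}
    lengths = [len(e) for e in emails]
    return {
        t: round(100 * sum(1 for n in lengths if n > t) / total, 2)
        for t in thresholds
    }


# Made-up sample with lengths 6, 37, 42 and 22 characters.
sample = [
    "a@b.co",
    "x" * 25 + "@example.com",
    "y" * 30 + "@example.com",
    "z" * 10 + "@example.com",
]
shares = share_longer_than(sample, [30, 40])
print(shares)  # {30: 50.0, 40: 25.0}
```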
Another analysis on bigger sample
A similar analysis posted on StackOverflow using a much larger sample (>10M but including invalid emails) confirmed my own findings. Their main points (quote):
- The longest valid one is 89.
- There are hundreds longer ones up to the limit of our column (255) but they are apparently fake by visual inspection.
- The peak of the length distribution is at 19.
- There isn't long tail. Everything falls off sharply after 38.
And their conclusion:
We cleaned up the DB by throwing away anything longer than 40. The good news is that no one has complained but the bad news is not many records got cleaned out.
Some use shorter VARCHAR
In light of the email address length distribution, many sources mention using a VARCHAR shorter than 254, such as 50, 63, 80, or 127.
Here are a few more quotes from StackOverflow:
I've used 120 variable characters for years. The real world logic is that even if someone is ready to fill your 320 varchar field...I bet they have a 40 char alternative email just standing by
In my systems I also use varchar(50) and I have never had a complaint that a user cannot register.
The questions are:
- Why use a VARCHAR shorter than 254 in the first place? (it is not just performance)
- When can you safely use a shorter VARCHAR and when not?
Why use shorter VARCHAR
When using VARCHAR, storage space is not important from the single column perspective, as VARCHAR only uses the space actually needed for the particular string (unlike fixed length CHAR). If most email addresses are shorter than 40 characters, the space needed to store them in VARCHAR(50) is not much different from VARCHAR(254).
That said, in tables with many columns the declared length may become important for the total row size limit, which varies across DBMS and storage engines. This is closely related to performance.
In a typical web application, the email address is one of the most important pieces of user data. It uniquely identifies users, and it is used in login forms, database indexes, and joins. There are many reasons to make it as small and fast as possible.
A long email address takes up a lot of screen space. If you need to also consider extremely long emails, designing the user interface becomes more complicated – in login and signup forms, user profile pages, and other places where email address appears alongside other information.
Shorter inputs are also preferable for security, as they give a potential attacker less room to work with.
When to use shorter VARCHAR and when not
You can use a lower limit on email address length if your use case satisfies the following conditions:
- You have complete control over user input. All emails come into your database from online forms which you can implement and validate yourself. For example, if your database uses VARCHAR(63) and someone tries to register with a 70-character email, your signup form should inform them about the limit and prompt them to use a shorter email address. Some people will comply, some people will leave.
- You are OK with rejecting a small portion of potential customers. This is in line with the Pareto principle (80/20), although in this case it is more like a 75% reduction in length (from 254 to 63) in exchange for rejecting fewer than 0.01% of users.
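The signup-form check described in the first condition might look roughly like this in Python. This is only a sketch: `EMAIL_MAX_LENGTH`, `validate_signup_email`, and the wording of the message are assumptions for illustration, and the function checks length only, not whether the address is otherwise valid.

```python
EMAIL_MAX_LENGTH = 63  # assumed to match the column definition, e.g. VARCHAR(63)


def validate_signup_email(email: str):
    """Return (ok, message) for a signup form; checks length only."""
    email = email.strip()
    if len(email) > EMAIL_MAX_LENGTH:
        message = (
            f"Email addresses are limited to {EMAIL_MAX_LENGTH} characters; "
            f"yours has {len(email)}. Please use a shorter address."
        )
        return False, message
    return True, ""
```

Doing this check in the application, before the INSERT, means the user sees a clear message instead of a database error or a silently truncated address.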
Conversely, you should use the full VARCHAR(254) if you expect to get users and emails from sources you can't fully control, for instance if you collect leads offline or from a third party, or if your company acquires another one and you want to merge their database with yours.
Premature optimization?
When building a new database and application from scratch, limiting email length to something like 50 or 63 characters may be a case of premature optimization.
The truth is that 99% of projects will never grow to a size where you need to worry about the performance implications of email VARCHAR length.
Even if yours does, there are probably other, more effective places where database performance can be optimized.
That said, reducing the email VARCHAR limit in an existing database would be close to impossible. You would probably have to contact the few unlucky users with something like this:
Dear valued customer,
Your email address no longer fits in our database. Please change it to maximum 63 characters or stop using our service.
Increasing an existing VARCHAR field length is always easier than decreasing it.
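Before even considering a decrease, you would want to know how many stored addresses exceed the new limit. Below is a small Python sketch using the standard `sqlite3` module with an in-memory database. The `users` table, its `email` column, and the `emails_exceeding` helper are hypothetical; SQLite is used only to keep the example self-contained, and it does not enforce the declared VARCHAR length the way many other DBMS do.

```python
import sqlite3


def emails_exceeding(conn, new_limit):
    """Return stored emails longer than new_limit characters, longest first."""
    rows = conn.execute(
        "SELECT email FROM users "
        "WHERE LENGTH(email) > ? "
        "ORDER BY LENGTH(email) DESC",
        (new_limit,),
    )
    return [row[0] for row in rows]


# In-memory demo database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email VARCHAR(254))")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("short@example.com",), ("x" * 60 + "@example.com",)],
)

too_long = emails_exceeding(conn, 63)
print(len(too_long))  # 1
```

An empty result means the shorter limit would not affect any existing user; anything else is the list of people who would receive the letter above.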
What I use
Personally, I use VARCHAR(63) in all projects where I have full control, including this site. It will help performance when I have a billion users a few years down the road.