TIL - Email Address Validation
Meta note: I'm going to start writing some short TIL posts about technical topics that I come across in the course of my work or personal tech tinkering. I was inspired by Julia Evans' TIL blog. It seems like a great way to record the things I'm learning for later reference, and I'm hoping it will also be a positive thing to have in my portfolio when job-hunting.
I went down a rabbit hole of email validation discourse today. It's a topic I've encountered before, but usually only in passing, and I wanted to figure out what's considered to be "best practice" here.
The most authoritative source I found that concisely sums up the situation is the email address validation section of OWASP's Input Validation Cheat Sheet. Some key points:
- Lots of strings that you probably don't want to accept as email addresses are, technically, valid email addresses.
- Just because something is a technically valid email address doesn't mean you can send an email to it.
- The only way to reliably tell whether you can send an email to an arbitrary address is to... try to send the email!
- They also recommend some basic initial validations which can rule out strings that you probably don't want to accept as email addresses.
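To make that last point concrete, here is a minimal sketch of what such initial checks might look like in Python. The function name `passes_basic_checks`, the length limits, and the rule that the domain must contain at least one period are my own assumptions for illustration, not a copy of OWASP's wording. Passing these checks says nothing about whether mail can actually be delivered.

```python
import re

# Assumed limits for this sketch (local part up to 63 characters, whole
# address up to 254); adjust them to whatever your own policy says.
MAX_LOCAL_LENGTH = 63
MAX_TOTAL_LENGTH = 254

# Domain: labels of letters, digits, and hyphens, separated by periods.
# Requiring at least one period is an assumption; it rejects "user@localhost".
DOMAIN_PATTERN = re.compile(r"[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def passes_basic_checks(address: str) -> bool:
    """Return True if the string is plausibly an email address.

    This only rules out obvious junk. It does not prove the address exists.
    """
    if len(address) > MAX_TOTAL_LENGTH:
        return False

    # Split on the last "@" so there must be a local part and a domain part.
    local, separator, domain = address.rpartition("@")
    if not separator or not local or not domain:
        return False

    if len(local) > MAX_LOCAL_LENGTH:
        return False

    # Reject whitespace and control characters in the local part.
    if any(ch.isspace() or not ch.isprintable() for ch in local):
        return False

    return DOMAIN_PATTERN.fullmatch(domain) is not None
```

Anything that passes would still need a confirmation email before you treat the address as real.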
Some interesting discussions I found while researching:
- "How can I validate an email address using a regular expression?" on Stack Overflow
- "Can it cause harm to validate email addresses with a regex?" on Stack Overflow
- "I Knew How To Validate An Email Address Until I Read The RFC"
That last one has an interesting comment by someone named David: "The best compromise I've seen so far is to have it complain if you use a 'weird' character, but then offer you a chance to say 'no, that really is my email address'. That way it catches stupid mistakes, but lets you have the final say as to whether it's valid or not."
That strikes me as an elegant solution if you don't want to block the rare user with an esoteric email address, but still want to provide some guidance in the more common situation where a user has simply entered a typo. Though of course, then you're back to square one with deciding when to display an "are you sure?" warning vs silently accepting the email, albeit with lower stakes.
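Here's a rough sketch of how that "warn, but let the user overrule" idea could look in Python. Everything in it is hypothetical: the `Verdict` enum, the `triage` function, and especially `COMMON_PATTERN`, which is just my guess at what an unremarkable address looks like. Where to draw that line is exactly the judgment call described above.

```python
import re
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"    # take the address as-is
    CONFIRM = "confirm"  # ask "are you sure this is right?"
    REJECT = "reject"    # cannot possibly be an email address


# Hypothetical definition of an "unremarkable" address: common characters in
# the local part and a dotted domain. Anything else triggers a warning.
COMMON_PATTERN = re.compile(r"[A-Za-z0-9._+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def triage(address: str, user_confirmed: bool = False) -> Verdict:
    """Decide whether to accept, warn about, or reject an address."""
    address = address.strip()

    # Hard failure: without a local part, an "@", and a domain there is
    # nothing to send to, so confirmation cannot rescue it.
    local, separator, domain = address.rpartition("@")
    if not separator or not local or not domain:
        return Verdict.REJECT

    # Ordinary-looking addresses pass silently; unusual ones pass only once
    # the user has said "no, that really is my email address".
    if COMMON_PATTERN.fullmatch(address) or user_confirmed:
        return Verdict.ACCEPT

    return Verdict.CONFIRM
```

In a form handler, a `CONFIRM` result would re-render the form with a warning, and the resubmission would call `triage` again with `user_confirmed=True`.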