Validating data is key both to ensuring a good user experience and to keeping your databases clean and secure. And while the HTML standard now includes its own browser-based validation for things such as numbers and emails, you will often want a custom approach that fits your brand and your theme.
In this article, I will go over the main ways in which you can validate emails, starting with the built-in HTML methods that browsers provide. I will also discuss some of the challenges that are present today, such as uncommon character use and inconsistencies with regex validation across browsers.
Let's start with the simplest approach first:
Browser-based email validation
The HTML specification allows for input fields to be set to an 'email' type.
<input type='email' />
When the form is submitted, if the browser detects that the field isn't a properly formatted email address, then a notification will spring up to inform the user.
While this method works just fine for standard workflows (i.e. user fills in the form -> user hits enter -> done), out of the box it offers very little flexibility. By default, you get the same pop-up window and much the same notification message regardless of the issue.
And it is up to the browser vendor to implement this for us. The above screenshot was from a Firefox session for example. On Chrome, the message would look something like the following:
Definitely more helpful. But again, the design and styling are left up to the browser. In a real-world corporate environment, there is a good chance that the design for validation modals, alerts and notifications will come from a designer. And those designs will more than likely not be as simple as a tooltip popup.
This is the easiest approach to implement on this list, but maybe not the most useful overall.
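The message text, at least, can be changed from script through the browser's Constraint Validation API, although the bubble itself stays browser-styled. The following is a minimal sketch of that idea, not a full solution. The function name emailMessageFor, the element id "email" and the message wording are all assumptions made for illustration.

```javascript
// Map the browser's ValidityState flags to our own wording.
// The message texts are placeholders, not taken from any browser.
function emailMessageFor(validity) {
  if (validity.valueMissing) return 'Please enter your email address.';
  if (validity.typeMismatch) return 'That does not look like an email address.';
  return ''; // An empty string tells the browser the field has no custom error.
}

// Browser wiring. Assumes the page contains <input type="email" id="email">.
// The guard lets the file load outside a browser without errors.
if (typeof document !== 'undefined') {
  const input = document.getElementById('email');
  if (input) {
    input.addEventListener('input', () => {
      // Replaces the text shown in the browser's own validation bubble.
      input.setCustomValidity(emailMessageFor(input.validity));
    });
  }
}
```

Note that valueMissing is only ever set when the field carries the required attribute, and that the look of the bubble is still decided by the browser.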
Regex approach
Regular expressions are ideal for this type of pattern-matching scenario. What are regular expressions? Essentially, they are strings that describe a search pattern to look for in other strings.
There are plenty of expressions that you will find online that aim to validate emails. This particular one from https://emailregex.com/ is claimed to work in 99.99% of cases.
/^(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/
A few things to note. Notice that regular expressions are not pretty to read. They are complex sequences of characters, which makes them more error-prone than other alternatives.
Also notice that the claimed success rate is 99.99% rather than 100%. That's because no regular expression for emails is perfect. The email standard is not static. While in the past you mainly saw a username@url.topleveldomain sequence, these days things are less predictable. For one, top-level domains like .com and .net used to be the only game in town. But now you can find niche and trendy TLDs like .museum and .party.
And we can assume that this will continue to change going forward. So while we can't have 100% regex supremacy, this is still a reasonable approach to consider.
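To see how the expression behaves in practice, here is a small JavaScript sketch that wraps it in a function and runs it against a few sample inputs. The names EMAIL_PATTERN and matchesEmailPattern are made up for this example, and the sample addresses are illustrative only. The last one uses xn--p1ai, the encoded form of an internationalized TLD, as an example of an address this particular pattern turns away.

```javascript
// The expression quoted above from emailregex.com, unchanged.
const EMAIL_PATTERN = /^(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;

// Hypothetical helper: true when the whole string matches the pattern.
function matchesEmailPattern(value) {
  return EMAIL_PATTERN.test(value);
}

console.log(matchesEmailPattern('username@url.com'));      // true
console.log(matchesEmailPattern('username@url.museum'));   // true
console.log(matchesEmailPattern('username@url'));          // false: no dot in the domain
console.log(matchesEmailPattern('username@url.xn--p1ai')); // false: digits and hyphens in the TLD
```

The last case fails because the pattern only allows letters in the final segment, which is the kind of gap that keeps any single expression short of 100%.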
Custom validation
And lastly, you have the option of parsing the string yourself and creating your own rules for what makes an email valid. Let's take a look at a quick example of a standard email.
username @ url . com
There are essentially 5 main parts that comprise an email address.
1. The username portion (local name)
2. An @ symbol
3. The domain name
4. A (.) symbol
5. The domain extension
If you can ensure that these 5 elements are in place, then you have a good place to start for validation.
Here is a C# example that would do just that.
public static bool EmailIsValid(string strEmail)
{
    if (string.IsNullOrEmpty(strEmail))
        return false;

    // Split at the first '@' into the username and everything after it.
    int atIndex = strEmail.IndexOf('@');
    if (atIndex == -1)
        return false;
    string strPrefix = strEmail.Substring(0, atIndex);
    string strSuffix = strEmail.Substring(atIndex + 1);

    // The '.' must appear after the '@', not just anywhere in the address.
    int dotIndex = strSuffix.LastIndexOf('.');
    if (dotIndex == -1)
        return false;
    string strDomain = strSuffix.Substring(0, dotIndex);
    string strExtension = strSuffix.Substring(dotIndex + 1);

    // Each of the three text parts must contain at least one character.
    return strPrefix.Length > 0 && strDomain.Length > 0 && strExtension.Length > 0;
}
The code essentially relies on finding substrings of characters in order from @ to . and ensuring that there is some content in each of the primary areas.
This method is not perfect, mind you. There is a lot more in the email specification that is not taken into account, such as valid characters and maximum allowable lengths. But it is a starting point if you wish to add to it and make it more robust.
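As one example of making the check more robust, length limits could be layered on top. The sketch below is written in JavaScript and assumes the limits usually derived from RFC 5321: 64 characters for the username portion and 254 for the address as a whole. The function and constant names are hypothetical.

```javascript
// Assumed limits, based on RFC 5321: 64 for the part before the '@'
// and 254 for the complete address.
const MAX_LOCAL_LENGTH = 64;
const MAX_TOTAL_LENGTH = 254;

// Hypothetical helper: true when both length limits are respected.
// Note: String.length counts UTF-16 code units, which equals the byte
// count only for plain ASCII addresses.
function withinLengthLimits(email) {
  const atIndex = email.lastIndexOf('@');
  if (atIndex === -1) return false; // no '@', so nothing to measure
  const local = email.slice(0, atIndex);
  return email.length <= MAX_TOTAL_LENGTH && local.length <= MAX_LOCAL_LENGTH;
}
```

A check like this would run alongside the structural check rather than replace it, since it says nothing about which characters are allowed.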
Ideally, you want to validate emails both on the client side (through JavaScript) and on the server side. Server-side validation ensures that older browsers without built-in email validation, or requests that bypass the browser entirely, do not affect the data on your server.
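On the client side, the same five-part idea could look something like the following JavaScript sketch. It is not a definitive implementation. The function name checkEmailParts and the reason codes are invented for this example, and the codes exist only so that your own UI can decide which message to show.

```javascript
// Hypothetical client-side check built around the five parts listed earlier.
// Returns { valid, reason } so the caller can pick its own error message.
function checkEmailParts(email) {
  const atIndex = email.indexOf('@');
  if (atIndex === -1) return { valid: false, reason: 'missing-at' };

  const local = email.slice(0, atIndex);
  const afterAt = email.slice(atIndex + 1);
  if (local.length === 0) return { valid: false, reason: 'missing-local-part' };

  // Look for the dot only in the part after the '@'.
  const dotIndex = afterAt.lastIndexOf('.');
  if (dotIndex === -1) return { valid: false, reason: 'missing-dot' };

  const domain = afterAt.slice(0, dotIndex);
  const extension = afterAt.slice(dotIndex + 1);
  if (domain.length === 0) return { valid: false, reason: 'missing-domain' };
  if (extension.length === 0) return { valid: false, reason: 'missing-extension' };

  return { valid: true, reason: null };
}

console.log(checkEmailParts('username@url.com')); // { valid: true, reason: null }
console.log(checkEmailParts('username@url'));     // { valid: false, reason: 'missing-dot' }
```

Whatever runs in the browser should be repeated on the server, since a request can reach the server without ever passing through this script.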
The biggest benefit of having your own custom validation is that you are kept in control of the design. If you have a particular UI/UX theme on your web forms, then you won't have to rely on the built-in popup balloon with a static message.
Having any type of validation, whether custom or built-in, is pretty much mandatory these days. With millions of bots crawling the web daily, the last thing that you want is to wake up with a database full of garbage data that you will then have to spend hours cleaning up.