First and Last name validation for forms and databases

First name and last name validation is not trivial. How many times have you had to build a database table with a first name and last name? Let me guess, your answer is 'All the time'.

Now, how many times have you googled "How to validate human names?" We face this problem with every project we create. An easy solution would be to use validator.isAlpha('name', ['en-US', 'en-GB', 'pt-pt']) right out of npm, but we always end up realizing that we need to allow a larger set of characters. As a project grows, so does its user base, and the validation rules for names have to cover a much wider range of inputs.

Let's solve this problem once and for all. We will start by supporting North American, Australian and English names. We will then add support for Western European names, followed by all human names covering North America, South America, Europe, Africa, Australia and Asia. We will also learn how to apply fine-grained validation for specific languages such as Arabic or Russian.

Starting point

We will start by creating a simple NAME_REGEX regular expression in JavaScript that allows "word" characters. We will also create the function isValidName, which tests a string against NAME_REGEX. This function will be run on the first name and the last name separately. This gives us more control over validation and keeps us compatible with most systems that store the first name (given name) and last name (surname) in separate fields, such as banks.

const NAME_REGEX = /^\w+$/

/** Validates a name field (first or last name) */
const isValidName = (name) => NAME_REGEX.test(name)
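
A quick check of this first version may help show where it falls short. This is only a minimal sketch: the sample inputs are made up for illustration. Because \w is shorthand for [A-Za-z0-9_], digits and underscores pass validation, while a space does not.

```javascript
// \w is shorthand for [A-Za-z0-9_], so digits and underscores are accepted too.
const NAME_REGEX = /^\w+$/

/** Validates a name field (first or last name) */
const isValidName = (name) => NAME_REGEX.test(name)

// Sample inputs chosen for illustration only.
const samples = ['Alice', 'alice42', 'snake_case', 'Yuv Raj', '']
for (const sample of samples) {
  console.log(JSON.stringify(sample), isValidName(sample))
}
// "Alice" true, "alice42" true, "snake_case" true, "Yuv Raj" false, "" false
```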

Remove numbers and underscores

We will now fine-tune our regex so that it no longer allows numerical digits or underscores.

const NAME_REGEX = /^[a-zA-Z]+$/

We replaced the \w with [a-zA-Z].
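
As a rough illustration of what this change does (the sample names are assumptions picked for the demo), the tighter character class rejects digits and underscores, but it also rejects accented letters and apostrophes, which is why the pattern needs to grow further:

```javascript
// Only unaccented ASCII letters are accepted by this version.
const NAME_REGEX = /^[a-zA-Z]+$/
const isValidName = (name) => NAME_REGEX.test(name)

console.log(isValidName('Alice'))      // true
console.log(isValidName('alice42'))    // false: digits are rejected now
console.log(isValidName('snake_case')) // false: underscores are rejected now
console.log(isValidName('José'))       // false: é is outside a-z and A-Z
console.log(isValidName("O'Neil"))     // false: apostrophes are not allowed yet
```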

Allow for multiple words, hyphens and apostrophes

There are a lot of first names that consist of more than one word, for example Yuv Raj (Indian) and Fatima ElZahraa (Arabic), so we need to allow spaces as well. Irish names often contain apostrophes, for example O'Neil. Some names include hyphens, for example Kunis-Edison.

const NAME_REGEX = /^([a-zA-Z]+[ \-']{0,1}){1,3}$/

We added [ \-']{0,1} to allow zero or one space, hyphen or apostrophe after each run of alphabetic characters. Finally, we wrapped the expression in parentheses and added {1,3}, requiring the group to occur a minimum of 1 and a maximum of 3 times.

An important thing to note is that we want hyphens, apostrophes and spaces to appear only in the middle of the text, not at the end. For example, O'Neil should be valid, but O' should not.
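
The problem can be reproduced with the pattern /^([a-zA-Z]+[ \-']{0,1}){1,3}$/ from the previous step. The snippet below is a small sketch; the name LOOSE_NAME_REGEX and the sample inputs are only used here for illustration. Since each group may end with a separator, so may the whole name.

```javascript
// Each group is "letters, then an optional separator", so a trailing
// separator is accepted by this version.
const LOOSE_NAME_REGEX = /^([a-zA-Z]+[ \-']{0,1}){1,3}$/

console.log(LOOSE_NAME_REGEX.test("O'Neil"))  // true
console.log(LOOSE_NAME_REGEX.test('Yuv Raj')) // true
console.log(LOOSE_NAME_REGEX.test("O'"))      // true, although we want false
console.log(LOOSE_NAME_REGEX.test('Kunis-'))  // true, although we want false
console.log(LOOSE_NAME_REGEX.test('-Kunis'))  // false: a group must start with a letter
```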

const NAME_REGEX = /^[a-zA-Z]+([ \-']{0,1}[a-zA-Z]+){0,2}$/

To achieve this we put [a-zA-Z]+ at the front and replaced ([a-zA-Z]+[ \-']{0,1}){1,3} with ([ \-']{0,1}[a-zA-Z]+){0,2}. This forces at least one alphabetic letter after each special character, while still allowing a name of length 1. We replaced {1,3} with {0,2} to keep the same limits (min 1, max 3 blocks): since the leading [a-zA-Z]+ already counts as one block, each bound n becomes n-1.
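
One way to convince yourself that the stricter pattern behaves as described is a small table of inputs and expected results. The cases below are illustrative assumptions, not an exhaustive test suite.

```javascript
const NAME_REGEX = /^[a-zA-Z]+([ \-']{0,1}[a-zA-Z]+){0,2}$/
const isValidName = (name) => NAME_REGEX.test(name)

// [input, expected result]; sample names are made up for the demo.
const cases = [
  ['A', true],                 // a single letter is still valid
  ["O'Neil", true],
  ['Yuv Raj', true],
  ['Fatima ElZahraa', true],
  ['kunis-edison', true],
  ['Mary Ann Lee', true],      // three blocks is the maximum
  ["O'", false],               // separator at the end
  ['Kunis-', false],
  ['Yuv  Raj', false],         // two separators in a row
  ['Mary Ann Lee Roy', false], // four blocks is one too many
]

// Collect every case whose actual result differs from the expected one.
const failures = cases.filter(([input, expected]) => isValidName(input) !== expected)
console.log(failures.length === 0 ? 'all cases pass' : failures)
```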

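Side note: if a name must contain at least two letters and may contain at most one separator, you can simplify [a-zA-Z]+([ \-']{0,1}[a-zA-Z]+){0,2} to [a-zA-Z]+[ \-']{0,1}[a-zA-Z]+. Keep in mind that this shorter form no longer accepts names made of three blocks.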
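
To see how the simplified pattern differs from the full one, here is a short comparison. It is only a sketch: FULL_REGEX and SIMPLIFIED_REGEX are names chosen for this example, and the inputs are made up.

```javascript
const FULL_REGEX = /^[a-zA-Z]+([ \-']{0,1}[a-zA-Z]+){0,2}$/
const SIMPLIFIED_REGEX = /^[a-zA-Z]+[ \-']{0,1}[a-zA-Z]+$/

// Map each sample name to its result under both patterns.
const comparison = ['A', 'Al', "O'Neil", 'Mary Ann Lee'].map((name) => ({
  name,
  full: FULL_REGEX.test(name),
  simplified: SIMPLIFIED_REGEX.test(name),
}))
console.log(comparison)
// 'A'            full: true, simplified: false (needs at least two letters)
// 'Al'           full: true, simplified: true
// "O'Neil"       full: true, simplified: true
// 'Mary Ann Lee' full: true, simplified: false (only one separator allowed)
```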

Add period support