Inside Technique : Form Validation Made Easy : Regular Expressions

Regular expressions (or "regexes" as I prefer) offer a comprehensive way to define patterns of characters. The power of a regular expression is its ability to describe the order, number, types, and even absence of a series of characters or character groups within a string. JScript in Internet Explorer 4 is roughly equivalent to JavaScript 1.2, which implements Perl 4 regular expression syntax. Since regular expressions are such a complex topic, I will touch on the syntax just enough to get you excited enough about them to run out and get a book.

Regular expressions use special characters called "metacharacters" to describe textual patterns. The following table lists the metacharacters and their corresponding meanings [the table comes straight from the Microsoft Scripting Technologies site]. I recommend that you print or otherwise store this information somewhere that is quickly accessible if you aren't already familiar with the syntax.
NOTE: The backslash ("\") has special meaning when used with the metacharacters. It acts as an escape character, telling the regex parser to interpret the following character literally (e.g. "\*" matches a literal asterisk character).

Now that you know all the metacharacters, let's look at a couple of examples to test your understanding. I'll start with some common uses and move on to more complex regexes.

    /^\d{5}$/
Maybe we should step through this one bit by bit. The caret (^) indicates the beginning of the input; that just means that whatever follows it must match starting at the first character of the string. A \d indicates a digit (any single number 0 through 9). The braces following it require that exactly five of the preceding item [the digit] occur together. It does not require the same number to occur five times, since we used the digit class and not a specific digit. The dollar sign ($) means the end of input. All together, this regex simply matches any string that is exactly five digits - no more, no fewer.

If we were to remove the beginning- and end-of-input markers (^ and $, respectively), the string would only have to contain five digits in a row somewhere within it. When attacking form validation with regular expressions, you will almost always need these two metacharacters to make certain your user has entered correct data. Let's take a look at a more advanced version of the first example:

    /^\d{5}(\-?\d{4})?$/
We already know the first little bit (\d{5}), so let's move directly to the last half. The parentheses group all the metacharacters within them. Any repetition metacharacter following the parentheses ("?" in this case) operates on the group as a whole; therefore, we expect zero or one occurrence of \-?\d{4}. If we decompose this group, we can expect a literal hyphen (-) zero or one time followed by exactly four digits. [As you can glean now, this regex represents a US ZIP code.]

With regular expressions, you don't need to fan through the string character by character looking for non-digits and an optional hyphen delimiter. Here is a JavaScript example of how we test a string for a match:

    var bResult = /^\d{5}(\-?\d{4})?$/.test(string);
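As a rough sketch of how the two patterns discussed so far behave, the following wraps the ZIP code test in a small function and runs it against a few inputs. The function name isZipCode and the sample strings are made up for illustration; they are not part of the original article. The last lines show the effect of dropping the ^ and $ markers from the five-digit pattern.

```javascript
// Hypothetical helper (not from the article): returns true when sValue is
// five digits, optionally followed by a four-digit extension with or
// without a hyphen.
function isZipCode(sValue) {
  return /^\d{5}(\-?\d{4})?$/.test(sValue);
}

// Sample inputs invented for this sketch.
var aSamples = ["90210", "90210-1234", "902101234", "9021", "90210-12", "x90210"];
for (var i = 0; i < aSamples.length; i++) {
  console.log(aSamples[i] + " -> " + isZipCode(aSamples[i]));
}

// Without ^ and $, the five digits only have to occur somewhere in the string.
var bAnchored = /^\d{5}$/.test("abc12345xyz");   // false
var bUnanchored = /\d{5}/.test("abc12345xyz");   // true
console.log(bAnchored, bUnanchored);
```

The first three samples should pass and the last three should fail, which is the behavior you want when a form field must contain nothing but a ZIP code.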
Regular expression literals are written between forward slashes. The test method of the RegExp object returns true if the string parameter matches the regular expression and false if it does not. If you prefer to construct your regexes at runtime, you may use the RegExp constructor function:

    var re = new RegExp("^\\d{5}(\\-?\\d{4})?$");
Notice the escaped backslashes. Since this constructor takes a string parameter, and the backslash is also the escape character inside a string literal, each backslash must be doubled so that a single literal backslash reaches the regex parser.

There is a second parameter to the RegExp constructor that you should know as well. It defines how the expression should act when it is attempting to match: ignoring the case of the letters ("i"), matching all instances of the pattern within the input ("g"), or both ("ig"). The second parameter has its counterpart in the normal regex syntax as well, where the same letters follow the closing slash. First, the constructor form:

    var sInput = "10:52 AM";
    var re = new RegExp("am", "i");
    var bResult = re.test(sInput); // true
The above regular expression looks for the string "am" in the input without regard to case. The following replaces all instances of the string "flounder" with "fish":

    // must match exactly (case-sensitive)
    sInput = sInput.replace(/flounder/g, "fish");
    // case-insensitive match
    sInput = sInput.replace(/flounder/gi, "fish");
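To make the backslash escaping and the two flags more concrete, here is a minimal sketch that compares the literal and constructor forms of the ZIP code pattern, shows what happens if the backslashes in the constructor string are not doubled, and applies the replacement with no flag, with "g", and with "gi". The variable names and the sample text are assumptions made for this illustration only.

```javascript
// Literal and constructor forms of the same ZIP code pattern.
var reLiteral = /^\d{5}(\-?\d{4})?$/;
var reBuilt = new RegExp("^\\d{5}(\\-?\\d{4})?$");
console.log(reLiteral.test("12345-6789"), reBuilt.test("12345-6789")); // true true

// With single backslashes, the string literal drops them before the regex
// parser sees the pattern, so "\d" arrives as a plain letter d.
var reWrong = new RegExp("^\d{5}$");
console.log(reWrong.test("12345"), reWrong.test("ddddd")); // false true

// Sample text invented for this sketch.
var sFish = "flounder, Flounder, flounder";
var sFirstOnly = sFish.replace(/flounder/, "fish");     // first exact match only
var sAllExact = sFish.replace(/flounder/g, "fish");     // every exact match
var sAllAnyCase = sFish.replace(/flounder/gi, "fish");  // every match, any case
console.log(sFirstOnly);   // fish, Flounder, flounder
console.log(sAllExact);    // fish, Flounder, fish
console.log(sAllAnyCase);  // fish, fish, fish
```

The reWrong case is the mistake most worth remembering: the constructor does not complain, it simply builds a pattern that looks for five letter d's instead of five digits.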
Now you have seen a small portion of what regular expressions can do. We can combine the power of regular expression testing with the technique of creating your own methods to make a validation method for form elements.

© 1997-2000 InsideDHTML.com, LLC. All rights reserved.