Regular Expressions


One of PHP's most useful features is its string processing. Feed PHP any string and it can process it in any number of ways with a multitude of built-in functions. Finding letter occurrences, replacing certain words, limiting the number of characters, etc. - it's all made very easy.

One very useful function in particular is preg_replace(), which allows you to find certain occurrences of words in an advanced, customized way and replace them with a string of your choice. The searched string can either be a simple string (although for plain strings I recommend str_replace() instead, due to its superior speed), or it can be a regular expression (REGEX). Regular expressions are like targeted wildcards, albeit MUCH more complex. The aim of this tutorial is to describe how various REGEX patterns are formulated, what they do, and how to customize them for your own purposes. As you can guess, this is an ADVANCED tutorial, so no effort will be made to explain the preg_replace() or str_replace() functions themselves. If you need this tutorial, you are more than likely able to read the PHP manual anyway... ;)
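For orientation only, here is a minimal sketch of the difference between the two functions. The sample sentence and variable names are made up for illustration, and the \b (word boundary) token is not covered in this tutorial; it is used here only to show something a plain string search cannot do.

```php
<?php
// Hypothetical sample text, used only for illustration.
$text = 'The cat sat on the catalog.';

// Fixed search string: str_replace() replaces every literal "cat",
// including the one inside "catalog".
$plain = str_replace('cat', 'dog', $text);

// Pattern search: preg_replace() expects delimiters (here "/") around the REGEX.
// \b is a word boundary, so "cat" inside "catalog" is left alone.
$regex = preg_replace('/\bcat\b/', 'dog', $text);

echo $plain, "\n"; // The dog sat on the dogalog.
echo $regex, "\n"; // The dog sat on the catalog.
```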

Basic REGEX
Make no mistake - pattern matching is widely used today - even file searches in Microsoft Windows use wildcards, a much simpler cousin of REGEX. Let me point you towards a simple example:

*.* - In Windows this wildcard means "find any file with any extension in a given directory". It is not valid REGEX in PHP, though, because there the "*" is a quantifier and needs something in front of it to repeat. The closest REGEX equivalent is .+\..+ which means "one or more characters followed by a literal dot followed by one or more characters". Let us enhance that a little:

[A-Z]+\..+ - The "[A-Z]" is a character class and it means any uppercase letter from A to Z, so this pattern asks for one or more uppercase letters before the dot. If you want to collect lowercase letters you would enter "[a-z]". If you would like to collect any letter, the obvious solution would be "[A-Za-z]".

TIP: If you want to check for a custom range of characters you could always use [g-p], etc.
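As a rough sketch of how these classes behave in practice, the snippet below feeds a few made-up strings to preg_match(), which returns 1 on a match and 0 otherwise. The file names and test strings are assumptions chosen for illustration; the ^ and $ anchors in the first two lines are explained later in this tutorial.

```php
<?php
// The Windows wildcard "*.*" rewritten as REGEX:
// something, then a literal dot, then something.
$hasExtension = preg_match('/^.+\..+$/', 'photo.jpg'); // 1
$noExtension  = preg_match('/^.+\..+$/', 'README');    // 0: no dot at all

// Character classes: each one matches a single character from its set.
$upper  = preg_match('/[A-Z]/', 'hello World'); // 1: "W" is uppercase
$lower  = preg_match('/[a-z]/', 'HELLO');       // 0: no lowercase letter
$letter = preg_match('/[A-Za-z]/', '1234x');    // 1: "x" is a letter
$custom = preg_match('/[g-p]/', 'abc');         // 0: a, b and c lie outside g-p
```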

Occurrence-Counting REGEX
A character class followed by a "*" means "zero or more characters from the selected character class". So this string: [a-z]* would mean "zero or more lowercase letters". If you need to check for at least one occurrence of a letter you would use:

[a-z]+ - A "+" basically means "one or more occurrences". You could also do:

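[a-z]{1} - This means "exactly one occurrence of a lowercase letter". For "one or more" you would write [a-z]{1,} (the same as [a-z]+), and "two to three occurrences of a lowercase letter" would be: [a-z]{2,3} - note the comma; a hyphen does not work inside the curly brackets.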

If you want to check for an optional character you use the question mark (?), like this: [a-z]? - And the explanation of this line is "an optional lowercase character". Now that we have this covered let's move on...
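The quantifiers above can be tried out with a small sketch like the following. The test strings are made up, and each pattern is wrapped in ^ and $ (covered in the next section) so that the whole string has to fit the count. The last line shows why the range needs a comma: with a hyphen, the curly brackets are simply read as literal characters.

```php
<?php
$zeroOrMore = preg_match('/^[a-z]*$/', '');          // 1: an empty string is "zero letters"
$oneOrMore  = preg_match('/^[a-z]+$/', '');          // 0: at least one letter is required
$exactlyOne = preg_match('/^[a-z]{1}$/', 'ab');      // 0: two letters, not exactly one
$twoToThree = preg_match('/^[a-z]{2,3}$/', 'abc');   // 1: three letters is within 2 to 3
$tooMany    = preg_match('/^[a-z]{2,3}$/', 'abcd');  // 0: four letters is too many
$optional   = preg_match('/^[a-z]?$/', 'a');         // 1: the letter may appear once or not at all

// {2-3} is not a quantifier, so it only matches the literal text "{2-3}".
$hyphen = preg_match('/^[a-z]{2-3}$/', 'a{2-3}');    // 1
```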

Character-Counting REGEX
^(.){4,6}$ - In PHP REGEX the caret (^) symbol means the beginning of the string, and the dollar ($) symbol means the end of it. If you add the "m" modifier they match at the beginning and end of every line instead, where a line ends at a "\n" character. So this expression means "the start of the string followed by 4 to 6 of any character followed by the end of the string". Yes, the dot (.) character means "any character" (except a newline, by default). So the line: (.*) would mean "any amount of any character". The caret (^) can also be used for negating character classes when it is the first character inside the square brackets. By negating I mean matching any character that is NOT in the specified range. So a string like ^[^0-9]*$ would mean "start of the string followed by zero or more characters that are NOT digits followed by the end of the string".
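A short sketch of the anchors and the negated class, using made-up input strings. The last part assumes the "m" (multiline) modifier to show ^ and $ working per line rather than on the whole string.

```php
<?php
$fourToSix = preg_match('/^(.){4,6}$/', 'hello');         // 1: five characters
$tooLong   = preg_match('/^(.){4,6}$/', 'toolongstring'); // 0: more than six characters

$noDigits  = preg_match('/^[^0-9]*$/', 'no digits here'); // 1: not a single digit
$hasDigit  = preg_match('/^[^0-9]*$/', 'route 66');       // 0: "6" is a digit

// Without "m" the whole string must be lowercase letters; with "m" each line is tested.
$subject     = "one\ntwo\n3\nfour";
$wholeString = preg_match('/^[a-z]+$/', $subject);                // 0
$lineCount   = preg_match_all('/^[a-z]+$/m', $subject, $matches); // 3
// $matches[0] now holds the matching lines: one, two, four
```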

The Zen of Brackets
By now you have probably noticed all the different brackets that are used. All of them have a different meaning. Let me explain:

  • The parentheses "(" and ")" are used to group different expressions together. If you use preg_replace(), you can refer back to a group in the replacement string using a simple "$n", where n is the number of the group, counted from left to right. So, if you want to extract the text from the second group in this: ^([a-z]+)[A-Z]?([0-5]{1,3})$ you would have to use "$2" (the first group is ([a-z]+) and the second is ([0-5]{1,3})). And, of course, the usual translation of the string to human language is "the start of the line followed by one or more lowercase letters followed by an optional uppercase letter followed by 1 to 3 digits, each no higher than 5, followed by the end of the line".

  • The curly brackets "{" and "}" hold the widely used minimum/maximum values. As explained earlier, they can be used to further customize checking for characters in a string instead of the usual "one or more" or "zero or more". The syntax is: {n} for exactly n, e.g. {1}; {n,} for n or more, e.g. {2,}; or {n,m} for no fewer than n and no more than m, e.g. {3,7}

  • And finally, of course, there are the square brackets "[" and "]". These represent a character class, which was also explained earlier. The syntax for a range inside one is: [a-b] where a is the range start and b is the range end, e.g. [A-Z]
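To see the groups at work, here is a minimal sketch that runs the pattern from the first bullet (written with the comma form of the range, {1,3}) against a made-up input string. The replacement strings are in single quotes so that PHP passes "$1" and "$2" through to preg_replace() untouched.

```php
<?php
$pattern = '/^([a-z]+)[A-Z]?([0-5]{1,3})$/';
$subject = 'abcX123'; // hypothetical input: letters, one uppercase letter, digits 0-5

// Keep only the second group.
$second = preg_replace($pattern, '$2', $subject);     // 123

// Groups can be reordered and mixed with ordinary text.
$swapped = preg_replace($pattern, '$2-$1', $subject); // 123-abc

// preg_match() exposes the same groups through its third argument.
preg_match($pattern, $subject, $groups);
// $groups[1] is "abc", $groups[2] is "123"
```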

Of course, you don't have to build the whole pattern out of special characters. You can also check for occurrences of words in a more advanced way. If, for example, you would like to search for a string containing the word "military" followed by an optional digit followed by the end of the line, you would write something like this: [Mm]ilitary[0-9]?$ Take note that the "[Mm]" is also a character class - it specifies a search for either character in the brackets. You can use all kinds of characters in your searches, but if you want to use a special character (e.g. a bracket) literally you will need to escape it using the all-saving backslash (\). This is, of course, the rule for PHP in general anyway! So, for example, if you want to search for "[word]" you would write the REGEX like this: \[word\]
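The following sketch tries the "military" pattern and the escaped brackets on a few made-up strings. It also uses preg_quote(), a built-in PHP function that adds the backslashes for you, which can be handy when the text to search for comes from a variable.

```php
<?php
$endsWithWord  = preg_match('/[Mm]ilitary[0-9]?$/', 'Join the military2'); // 1
$wordInMiddle  = preg_match('/[Mm]ilitary[0-9]?$/', 'military base');      // 0: not at the end

// Escaped brackets are matched literally.
$literal = preg_match('/\[word\]/', 'a [word] in brackets');               // 1

// Unescaped, [word] is a character class: any ONE of w, o, r, d.
$asClass = preg_match('/^[word]$/', 'w');                                  // 1

// preg_quote() escapes special characters; the second argument is the delimiter in use.
$quoted = preg_quote('[word]', '/');                                       // \[word\]
$viaQuote = preg_match('/' . $quoted . '/', 'a [word] in brackets');       // 1
```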

Commonly Used Examples
Now that we have all the advanced theory out of the way, here are some frequently used reference REGEX patterns found in popular PHP-driven scripts:

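\[b\](.*?)\[/b\] - What you see here is REGEX used to search for text encased in a [b] and [/b] tag. The "?" after ".*" makes the match stop at the first [/b] it finds instead of the last one. Because the pattern contains a "/", pick a different delimiter (e.g. "#") when you pass it to preg_replace(), or escape the slash. This is used very widely among forums, news systems of all kinds, etc.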
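A minimal sketch of the [b] tag replacement, assuming a made-up forum post and <strong> as the output tag. The pattern uses "#" as its delimiter because it contains a "/". The second call drops the "?" to show what the greedy version does with two tags on one line.

```php
<?php
$post = 'This is [b]bold[/b] and this is [b]also bold[/b].';

// Lazy (.*?): each [b]...[/b] pair is replaced on its own.
$html = preg_replace('#\[b\](.*?)\[/b\]#', '<strong>$1</strong>', $post);
// This is <strong>bold</strong> and this is <strong>also bold</strong>.

// Greedy (.*): the match runs from the first [b] to the LAST [/b].
$greedy = preg_replace('#\[b\](.*)\[/b\]#', '<strong>$1</strong>', $post);
// This is <strong>bold[/b] and this is [b]also bold</strong>.
```

In a real script the post would normally be passed through htmlspecialchars() first, so that user-supplied HTML is not sent to the browser as-is.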

^[0-9A-Za-z]{8,15}$ - This could be used in scripts that utilise registration with passwords. This REGEX only accepts a string made up of digits and letters, with a minimum of 8 and a maximum of 15 characters. The ^ and $ are needed so that the whole string is checked, not just a part of it.
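Wrapped in a function, the password check could look something like this. isValidPassword() is a hypothetical helper name, and the pattern is anchored with ^ and $ and uses the comma form {8,15}. The "D" modifier is an extra precaution not discussed above: it stops $ from also matching just before a trailing newline.

```php
<?php
// Hypothetical helper: true when $password is 8 to 15 letters and/or digits.
function isValidPassword(string $password): bool
{
    return preg_match('/^[0-9A-Za-z]{8,15}$/D', $password) === 1;
}

$ok        = isValidPassword('abc12345');          // true: 8 allowed characters
$tooShort  = isValidPassword('short1');            // false: only 6 characters
$tooLong   = isValidPassword(str_repeat('a', 16)); // false: 16 characters
$badChars  = isValidPassword('has space 1');       // false: spaces are not allowed
$trailing  = isValidPassword("abc12345\n");        // false: the newline is rejected
```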

The Speed Issue & Techniques
Using preg_replace() is definitely convenient, but it isn't too fast, considering that PHP has to parse and compile the pattern first instead of proceeding straight to the searching. I can't stress this enough: if you want to search a rather large text file for the word "cat" then, FOR THE LOVE OF GOD, use a plain string function such as strstr() or strpos() instead of preg_match(). Don't use preg functions when you're not using REGEX. Trust me on this one!
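For a fixed word, the plain string functions give the same answer as the REGEX version, as this small sketch on a made-up sentence shows. No timings are claimed here; the point is only that no pattern is needed.

```php
<?php
$haystack = 'The quick brown cat jumps over the lazy dog.';

// strpos() returns the position of the word, or false when it is absent.
// Compare with !== false, because a match at position 0 is also "falsy".
$foundPlain = strpos($haystack, 'cat') !== false;    // true

// strstr() returns the rest of the string starting at the word.
$rest = strstr($haystack, 'cat');                    // cat jumps over the lazy dog.

// The same test with a REGEX function, which is overkill for a fixed word.
$foundRegex = preg_match('/cat/', $haystack) === 1;  // true
```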

Also, many newcomers don't see the magic of arrays and go the clumsy way of calling 30 preg functions one after the other instead of just putting the patterns in an array and working through that (preg_replace() even accepts an array of patterns and an array of replacements in a single call). Arrays are more convenient and, most of all, they won't make your code look messy. Incidentally, if you are still rusty with arrays, you will do well to check out Scrowler's tutorial on arrays, also on Biorust...
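One way to read the array advice is sketched below: preg_replace() accepts an array of patterns and a matching array of replacements, so several tags can be handled in one call. The [i] and [u] tags and the HTML they map to are assumptions made for this example.

```php
<?php
// Each pattern is paired with the replacement at the same position.
$patterns = [
    '#\[b\](.*?)\[/b\]#',
    '#\[i\](.*?)\[/i\]#',
    '#\[u\](.*?)\[/u\]#',
];
$replacements = [
    '<strong>$1</strong>',
    '<em>$1</em>',
    '<u>$1</u>',
];

$post = '[b]Bold[/b], [i]italic[/i] and [u]underlined[/u].';

// One call instead of three separate preg_replace() calls.
$html = preg_replace($patterns, $replacements, $post);
// <strong>Bold</strong>, <em>italic</em> and <u>underlined</u>.
```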

Well, this is the end of the tutorial, so if you have any questions (or just want to flame me for writing some inane babble) then proceed to the Biorust forums and leave your opinions there. I promise someone will get back to you.

- Tutorial written by Blodo
