<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>HM2K.com</title>
	<atom:link href="http://www.hm2k.com/feed" rel="self" type="application/rss+xml" />
	<link>http://www.hm2k.com</link>
	<description>The research of an internet entrepreneur and IT consultant</description>
	<pubDate>Thu, 17 Jul 2008 14:33:09 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
	<language>en</language>
			<item>
		<title>suPHP and .phps PHP code highlighting support</title>
		<link>http://www.hm2k.com/posts/suphp-and-phps</link>
		<comments>http://www.hm2k.com/posts/suphp-and-phps#comments</comments>
		<pubDate>Thu, 17 Jul 2008 12:27:12 +0000</pubDate>
		<dc:creator>hm2k</dc:creator>
		
		<category><![CDATA[Apache]]></category>

		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Sysadmin]]></category>

		<category><![CDATA[cPanel]]></category>

		<guid isPermaLink="false">http://www.hm2k.com/?p=214</guid>
		<description><![CDATA[Today a user on one of my web servers asked me why .phps files would only download and not show the highlighted PHP code as expected.
This is usually done by adding the following to your &#8220;httpd.conf&#8221;&#8230;
AddType application/x-httpd-php-source .phps
We use the cPanel web hosting control panel and to improve security cPanel recommend using suPHP, which allows [...]]]></description>
			<content:encoded><![CDATA[<p>Today a user on one of my web servers asked me why .phps files would only download and not show the highlighted PHP code as expected.</p>
<p>This is usually done by adding the following to your &#8220;httpd.conf&#8221;&#8230;</p>
<blockquote><p>AddType application/x-httpd-php-source .phps</p></blockquote>
<p>We use the cPanel web hosting control panel, and to improve security <a href="http://www.cpanel.net/support/docs/ea/ea3/ea3php_hardening_php.html" onclick="javascript:urchinTracker ('/outbound/article/www.cpanel.net');">cPanel recommend using suPHP</a>, which allows PHP scripts to run as the user who owns them rather than as &#8220;nobody&#8221;.</p>
<p>This means that adding the above line to &#8220;httpd.conf&#8221; <a href="http://lists.marsching.com/pipermail/suphp/2005-January/000638.html" onclick="javascript:urchinTracker ('/outbound/article/lists.marsching.com');">does not work with suPHP</a>.</p>
<p>So what can be done?</p>
<p><span id="more-214"></span></p>
<p>The official word is located in the <a href="http://www.suphp.org/FAQ.html" onclick="javascript:urchinTracker ('/outbound/article/www.suphp.org');">suPHP FAQ</a>, which says:</p>
<blockquote><p><strong> Does suPHP support code highlighting by using the &#8220;.phps&#8221; extension?</strong></p>
<p>suPHP itself has no support for code highlighting. The main reason      is that PHP-CGI does not support any input parameter to activate      code highlighting. However there is a solution based on a small      PHP script and some rewrite rules. You can find the discussion at      <a href="http://forums.macosxhints.com/archive/index.php/t-23595.html" onclick="javascript:urchinTracker ('/outbound/article/forums.macosxhints.com');">http://forums.macosxhints.com/archive/index.php/t-23595.html</a>.</p></blockquote>
<p>So I decided to check out the suggested link.</p>
<p>I noticed that even though the FAQ suggested using rewrite rules, the forum did not provide any kind of working solution.</p>
<p>Using the PHP code supplied and a bit of rewrite ingenuity, we can get this working as expected.</p>
<p>First, create a file called &#8220;phpsource.php&#8221; and paste the following code into it:</p>
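<blockquote><p>&lt;?php<br />
// Only highlight when everything from the first dot onwards is exactly ".phps"<br />
$file = isset($_GET['file']) ? $_GET['file'] : '';<br />
if (substr($file, strpos($file, '.')) == '.phps') {<br />
highlight_file($file);<br />
}<br />
?&gt;</p></blockquote>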
<p>Then, in your &#8220;.htaccess&#8221;, paste the following code:</p>
<blockquote><p>RewriteRule ^(.+\.phps)$ phpsource.php?file=$1 [L]</p>
<p><em>Note: If you don&#8217;t already have rewrites turned on in your &#8220;.htaccess&#8221; file, you will also need the line &#8220;RewriteEngine On&#8221; at the top.</em></p></blockquote>
<p>What this will do is pass all &#8220;.phps&#8221; files through your &#8220;phpsource.php&#8221; script, and output a highlighted version.</p>
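<p>To make the extension check in &#8220;phpsource.php&#8221; easier to reason about, here is a minimal sketch of the same test written in Python rather than PHP. The function name is hypothetical and only mirrors the comparison of everything from the first dot onwards against &#8220;.phps&#8221;; it does not read or highlight any file.</p>

```python
def is_phps_request(path):
    """Return True when the text from the first dot onwards is exactly '.phps'."""
    dot = path.find(".")
    if dot == -1:
        # No dot at all, so there is no extension to compare
        return False
    return path[dot:] == ".phps"


if __name__ == "__main__":
    for candidate in ["index.phps", "lib/index.phps", "../secret.phps", "index.php"]:
        print(candidate, is_phps_request(candidate))
```

<p>Because the comparison starts at the first dot, a path that begins with &#8220;../&#8221; or contains more than one dot is rejected, which is a side effect worth keeping in mind if you change the check.</p>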
<p>The benefits of this solution are that it&#8217;s portable (it should work on any Apache server with mod_rewrite and &#8220;.htaccess&#8221; overrides enabled); it shouldn&#8217;t break when you upgrade Apache or PHP; it&#8217;s reasonably secure, as it will only handle .phps files; and it&#8217;s quick and effective.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hm2k.com/posts/suphp-and-phps/feed</wfw:commentRss>
		</item>
		<item>
		<title>OpenCart v0.7.8 released</title>
		<link>http://www.hm2k.com/posts/opencart</link>
		<comments>http://www.hm2k.com/posts/opencart#comments</comments>
		<pubDate>Tue, 08 Jul 2008 22:17:50 +0000</pubDate>
		<dc:creator>hm2k</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<guid isPermaLink="false">http://www.hm2k.com/?p=209</guid>
		<description><![CDATA[OpenCart is an open source PHP-based e-commerce online shop website solution. Ideal for new or existing stores to start selling online.
OpenCart all began because (at the time) the leading open source e-commerce solution out there was not very good, to say the least.
The first notable release was OpenCart v0.5 back in late 2006 and has [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.opencart.com/" onclick="javascript:urchinTracker ('/outbound/article/www.opencart.com');">OpenCart</a> is an open source, PHP-based e-commerce solution for online shops, ideal for new or existing stores that want to start selling online.</p>
<p>OpenCart all began because (at the time) the leading open source e-commerce solution out there was not very good, to say the least.</p>
<p>The first notable release was OpenCart v0.5 back in late 2006, and the project has been gaining momentum ever since.</p>
<p>The project is led by Daniel Kerr, and I have also recently joined the team.</p>
<p><a href="http://open-cart.googlecode.com/files/opencart_v0.7.8.zip" onclick="javascript:urchinTracker ('/outbound/article/open-cart.googlecode.com');">Download OpenCart v0.7.8</a></p>
<p>If you need any assistance with OpenCart, you can find me on the <a href="http://forum.opencart.com/" onclick="javascript:urchinTracker ('/outbound/article/forum.opencart.com');">OpenCart Community Forums</a>, and on the <a href="http://code.google.com/p/open-cart/" onclick="javascript:urchinTracker ('/outbound/article/code.google.com');">OpenCart Google Code project site</a>.</p>
<p>Enjoy!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hm2k.com/posts/opencart/feed</wfw:commentRss>
		</item>
		<item>
		<title>longip script</title>
		<link>http://www.hm2k.com/posts/longip-script</link>
		<comments>http://www.hm2k.com/posts/longip-script#comments</comments>
		<pubDate>Wed, 25 Jun 2008 23:52:29 +0000</pubDate>
		<dc:creator>hm2k</dc:creator>
		
		<category><![CDATA[Development]]></category>

		<category><![CDATA[FreeBSD]]></category>

		<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://www.hm2k.com/?p=210</guid>
		<description><![CDATA[I wanted to create a script that would convert a normal IP address to a long IP, just like mIRC Script&#8217;s $longip alias.
$longip(address)
Converts an IP address into a long value and  vice-versa.
$longip(158.152.50.239)  returns  2660774639
$longip(2660774639)       returns  158.152.50.239
What I was originally trying to do was increase an IP by 1, but due to the [...]]]></description>
			<content:encoded><![CDATA[<p>I wanted to create a script that would convert a normal IP address to a long IP, just like mIRC Script&#8217;s $longip alias.</p>
<blockquote><p><span style="font-weight: bold; font-size: 9pt; font-family: 'Verdana'; color: #00007f;">$longip(address)</span></p>
<p><span style="color: #000000;">Converts an IP address into a long value and  vice-versa.</span></p>
<p><span style="color: #000000;">$longip(158.152.50.239)  returns  2660774639</span></p>
<p><span style="color: #000000;">$longip(2660774639)       returns  158.152.50.239</span></p></blockquote>
<p>What I was originally trying to do was increase an IP by 1, but because each octet only goes up to 255, handling the carry between octets quickly became awkward.</p>
<p>What I decided to do in the end was convert the IP to a &#8220;longip&#8221;, increase it by 1, then convert the result back to a normal IP.</p>
<p>This required a way to convert an IP to and from a long IP. I was told it could be done purely in shell script. Here&#8217;s what I did&#8230;</p>
<p><span id="more-210"></span></p>
<p>I decided that shell script wasn&#8217;t powerful enough for what I wanted, and that I could do it more easily in Perl. This is the result:</p>
<blockquote><p>#!/usr/bin/perl</p>
<p># longip by HM2K 2008 (Updated: 17/01/08)</p>
<p># Description: Converts (short) IPs to long IPs and vice versa.<br />
# Usage: ./longip.pl &lt;ip&gt;</p>
<p>use warnings;<br />
use strict;<br />
use Socket;</p>
<p>sub longip {<br />
my $input=shift;<br />
# Dotted-quad input is converted to a long; anything else is treated as a long<br />
if ($input =~ /^\d+\.\d+\.\d+\.\d+$/) { return ip2long($input); }<br />
else { return long2ip($input); }<br />
}</p>
<p># Unpack as an unsigned 32-bit big-endian value so results match $longip<br />
sub ip2long { return unpack("N", inet_aton(shift)); }</p>
<p>sub long2ip { return inet_ntoa(pack("N", shift)); }</p>
<p>print longip(shift), "\n";</p></blockquote>
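<p>For comparison, the convert, add one, convert back idea can be sketched in a few lines of Python using only the standard library. This is an illustration rather than a port of the Perl script; the function names (including next_ip) are my own, and wrapping around at the top of the address space is an assumption.</p>

```python
import socket
import struct


def ip2long(ip):
    """Dotted-quad string to an unsigned 32-bit integer (network byte order)."""
    return struct.unpack("!I", socket.inet_aton(ip))[0]


def long2ip(value):
    """Unsigned 32-bit integer back to a dotted-quad string."""
    return socket.inet_ntoa(struct.pack("!I", value))


def next_ip(ip):
    """Increase an address by one, wrapping at 255.255.255.255 (assumed behaviour)."""
    return long2ip((ip2long(ip) + 1) % 2**32)


if __name__ == "__main__":
    print(ip2long("158.152.50.239"))
    print(long2ip(2660774639))
    print(next_ip("192.168.0.255"))
```

<p>The last call shows why the detour through a long value helps: the carry from the fourth octet into the third happens as ordinary integer addition.</p>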
<p>Thanks for the assistance from #perlhelp (EFnet).</p>
<p>It&#8217;s also worth noting that cls (EFnet) created a full shell script version called &#8220;ipconv.sh&#8221;, which is about 50 long lines in total (too long for such a simple task imo). If you ask him (or me) nicely, you may receive a copy.</p>
<p>Enjoy!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hm2k.com/posts/longip-script/feed</wfw:commentRss>
		</item>
		<item>
		<title>What is a hacker?</title>
		<link>http://www.hm2k.com/posts/what-is-a-hacker</link>
		<comments>http://www.hm2k.com/posts/what-is-a-hacker#comments</comments>
		<pubDate>Wed, 18 Jun 2008 11:37:36 +0000</pubDate>
		<dc:creator>hm2k</dc:creator>
		
		<category><![CDATA[Rants]]></category>

		<category><![CDATA[Sysadmin]]></category>

		<guid isPermaLink="false">http://www.hm2k.com/?p=208</guid>
		<description><![CDATA[Not so long ago I was asked to answer some questions for a friend of a friend who was writing a dissertation about the &#8220;hacking and warez scene&#8221; (which I have not been heavily involved in since I turned 18).
As I had known him for a long time, I felt obliged to help out, plus [...]]]></description>
			<content:encoded><![CDATA[<p>Not so long ago I was asked to answer some questions for a friend of a friend who was writing a dissertation about the &#8220;hacking and warez scene&#8221; (which I have not been heavily involved in since I turned 18).</p>
<p>As I had known him for a long time, I felt obliged to help out, plus I was now interested in the questions that would be asked.</p>
<p>Based on what had been said, I knew they were going to be questions on defining what a hacker is and what a hacker does, something I&#8217;ve been interested in defining for quite some time.</p>
<p>Here&#8217;s what I said&#8230;</p>
<p><span id="more-208"></span></p>
<p><strong>What would you say a hacker was in your eyes?</strong></p>
<p>It&#8217;s said that a hacker is not defined by the hacker but by others. You soon come to realise the extent of this when the media use this term in an article about you.</p>
<p>As I see it, the term is applied based on the observer&#8217;s level of understanding. Your average Joe, who knows very little about computers beyond a basic office package, may view someone who knows more as a hacker.</p>
<p>However, it&#8217;s likely that these more knowledgeable people do not view themselves as hackers, and have formed their own opinions of what a hacker is or, more importantly, what a hacker isn&#8217;t.</p>
<p><strong>Do you know about the history of Hackers?</strong></p>
<p>Traditionally a &#8220;hacker&#8221; was defined as a programmer who &#8220;hacked&#8221; up bits of code. It&#8217;s since been defined in many different ways.</p>
<p>Ever since the first network was constructed, there have always been hackers. If you build a system, people will inevitably mess with it.</p>
<p><strong>What type of things do they do?</strong></p>
<p>These days hackers are most notorious for breaching the security of a system. However, hackers come in different forms and colours.</p>
<p><strong>Have you ever heard of the hacker ethic?</strong></p>
<p>Well, you have your white hat hackers, commonly known as &#8220;security experts&#8221;, who are often employed by companies to test the security of systems or, failing that, hack to prove a point or as a proof of concept. Ultimately their skills are used for good purposes; they are hackers with morals, if you like.</p>
<p>On the other hand you have your black hat hackers who use their skills for various &#8220;bad&#8221; purposes, whether it be for illegal activities, or purely malicious.</p>
<p>You also have your grey hat hackers, a cross between the two. They generally do bad things, such as breaching system security, to achieve things that, in their minds, are morally &#8220;not bad&#8221; or sometimes even good, usually for personal gain.</p>
<p><strong>So do you think a modder is different from a hacker?</strong></p>
<p>In my opinion, and in the context of software engineering, a hacker is someone who essentially makes an application do something it wouldn&#8217;t normally do, which is effectively exactly what a modder does.</p>
<p>If you look at some of the most commonly used software at the very core of the internet, including Apache HTTPD and BIND, you&#8217;ll soon find that it is made up of many hacks and workarounds.</p>
<p>However, this type of hacker would be a software hacker, a distant relative of the security expert.</p>
<p><strong>So how did you get started and involved in it?</strong></p>
<p>I have always been interested in computers, but I have also always been interested in how things work.</p>
<p>When I was young I didn&#8217;t understand enough about &#8220;hacking&#8221; to actually be involved with it. I did, however, learn a little about phreaking and then software cracking.</p>
<p>I found it very interesting that with the changes of a few bytes in an executable file you could turn a piece of software from trial to registered. To achieve this you must understand the basics of reverse engineering software. Often a painstakingly slow process, with little reward at the end.</p>
<p>Eventually, through increasing use of the internet, I began to learn about web servers, mail servers, RFCs, DNS, IRC and all sorts of scripting languages including Perl and PHP.</p>
<p>It wasn&#8217;t long before I began hacking around with these platforms and protocols.</p>
<p><strong>How did you learn?</strong></p>
<p>I&#8217;m very much a hands-on learner. I learn by example and by trial and error, not by reading a book.</p>
<p>Having said that, I have read books on all sorts of computer-related topics, but I find them more useful as a resource than a method of learning.</p>
<p>It&#8217;s taken me quite some time to fully understand some aspects of the way things work, especially online. I think learning the basics of how computers and the internet work is a very important part of education.</p>
<p>It concerns me that the kids of today, although they know how to use computers and the internet, do not understand how they work.</p>
<p><strong>What do you make of the new generation of hackers?</strong></p>
<p>I don&#8217;t think there is a new generation of hackers. What I see are the same old spammers, but all new developers.</p>
<p>Hacking has evolved.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hm2k.com/posts/what-is-a-hacker/feed</wfw:commentRss>
		</item>
		<item>
		<title>Friendly URLs (revisited)</title>
		<link>http://www.hm2k.com/posts/friendly-urls</link>
		<comments>http://www.hm2k.com/posts/friendly-urls#comments</comments>
		<pubDate>Mon, 16 Jun 2008 12:02:59 +0000</pubDate>
		<dc:creator>hm2k</dc:creator>
		
		<category><![CDATA[Apache]]></category>

		<category><![CDATA[Google]]></category>

		<category><![CDATA[PHP]]></category>

		<category><![CDATA[SEO]]></category>

		<guid isPermaLink="false">http://www.hm2k.com/posts/friendly-urls</guid>
		<description><![CDATA[Turn dynamic URLs into friendly URLs
I&#8217;m sure we&#8217;re all familiar with URLs that look like this:
http://www.example.com/?nav=page
These types of URLs aren&#8217;t particularly &#8220;friendly&#8221;; in fact they are so ugly that even Google doesn&#8217;t like them!
Google suggests that search engines do not like dynamic URLs as much as static URLs.
&#8220;static&#8221; or &#8220;friendly&#8221; versions of the above URL [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Turn dynamic URLs into friendly URLs</strong></p>
<p>I&#8217;m sure we&#8217;re all familiar with <a href="http://en.wikipedia.org/wiki/URL" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">URLs</a> that look like this:</p>
<blockquote><p>http://www.example.com/?nav=page</p></blockquote>
<p>These types of URLs aren&#8217;t particularly &#8220;friendly&#8221;; in fact they are so ugly that even Google doesn&#8217;t like them!</p>
<p><a href="http://www.google.co.uk/intl/en/webmasters/guidelines.html" onclick="javascript:urchinTracker ('/outbound/article/www.google.co.uk');"><span id="more-53"></span>Google</a> suggests that search engines do not like dynamic URLs as much as static URLs.</p>
<p>&#8220;Static&#8221; or &#8220;friendly&#8221; versions of the above URL could be as follows:</p>
<blockquote><p>http://www.example.com/page.html</p></blockquote>
<p>Apache&#8217;s <a href="http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html" onclick="javascript:urchinTracker ('/outbound/article/httpd.apache.org');">mod_rewrite</a> can easily be used via a file called &#8220;.htaccess&#8221; to turn dynamic URLs into friendly URLs.</p>
<p>Here is an example of how it&#8217;s done:</p>
<blockquote><p>#Turn on the Rewrite Engine<br />
RewriteEngine on<br />
#Set the base path<br />
RewriteBase /<br />
#Check that the lookup isn&#8217;t an existing file<br />
RewriteCond %{REQUEST_FILENAME} !-f<br />
#Check that the lookup isn&#8217;t an existing directory<br />
RewriteCond %{REQUEST_FILENAME} !-d<br />
#Check that the file isn&#8217;t index.php (avoid looping)<br />
RewriteCond %{REQUEST_URI} !^/index\.php$<br />
#Force all .html lookups to the index file<br />
RewriteRule ^(.+)\.html$ index.php?nav=$1 [QSA,L]<br />
#Note: QSA=query string append;L=Last, no more rules</p></blockquote>
<p>This will rewrite all paths ending in &#8220;.html&#8221; to your index file.</p>
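<p>If you want to see what the rule does to a given path before touching a live server, the substitution can be imitated with a regular expression. The sketch below is in Python, uses an anchored form of the pattern (an assumption on my part), and ignores the file and directory checks, so treat it as a rough model of the RewriteRule rather than of mod_rewrite itself. The function name is hypothetical.</p>

```python
import re

# Anchored stand-in for the RewriteRule pattern; in .htaccess context the
# path is matched without its leading slash.
HTML_RULE = re.compile(r"^(.+)\.html$")


def rewrite(path):
    """Return the internal target for a request path, or the path unchanged."""
    match = HTML_RULE.match(path)
    if match is None:
        return path
    return "index.php?nav=" + match.group(1)


if __name__ == "__main__":
    for requested in ["page.html", "about/team.html", "style.css"]:
        print(requested, "becomes", rewrite(requested))
```

<p>Anything that does not end in &#8220;.html&#8221; falls through untouched, which matches the intent of only forcing .html lookups to the index file.</p>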
<p>From there, it&#8217;s simply a case of tailoring the rewrite to your requirements.</p>
<p>Check out the <a href="http://www.ilovejackdaniels.com/mod_rewrite_cheat_sheet.png" onclick="javascript:urchinTracker ('/outbound/article/www.ilovejackdaniels.com');">mod_rewrite cheat sheet</a> for more help on rewrites.</p>
<p>If you ARE using PHP, a better way might be to hand over ALL the path information to your &#8220;index.php&#8221; and handle it from there. The rewrite to do that looks something like this:</p>
<blockquote><p>RewriteEngine on<br />
RewriteBase /<br />
RewriteCond %{REQUEST_FILENAME} !-f<br />
RewriteCond %{REQUEST_FILENAME} !-d<br />
RewriteCond %{REQUEST_URI} !^/index\.php<br />
RewriteRule ^(.+)$ index.php/$1 [QSA,L]</p></blockquote>
<p>As per above this will only rewrite paths that don&#8217;t exist.</p>
<p>In your &#8220;index.php&#8221;, you can parse $_SERVER['PATH_INFO'] (and sometimes $_SERVER['ORIG_PATH_INFO']) for the path information.</p>
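<p>What &#8220;parsing the path information&#8221; means in practice is usually just splitting the path into segments and deciding what each one stands for. Here is a minimal sketch of that step in Python rather than PHP; the function names and the idea of treating the first segment as a section are assumptions for illustration only.</p>

```python
def parse_path_info(path_info):
    """Split a PATH_INFO style string into its non-empty segments."""
    return [segment for segment in path_info.split("/") if segment]


def route(path_info):
    """Treat the first segment as a section and the rest as arguments (assumed layout)."""
    segments = parse_path_info(path_info)
    if not segments:
        return ("home", [])
    return (segments[0], segments[1:])


if __name__ == "__main__":
    print(parse_path_info("/posts/friendly-urls"))
    print(route("/posts/friendly-urls"))
    print(route("/"))
```

<p>Filtering out empty segments means leading, trailing and doubled slashes all behave the same way.</p>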
]]></content:encoded>
			<wfw:commentRss>http://www.hm2k.com/posts/friendly-urls/feed</wfw:commentRss>
		</item>
		<item>
		<title>What is a valid email address?</title>
		<link>http://www.hm2k.com/posts/what-is-a-valid-email-address</link>
		<comments>http://www.hm2k.com/posts/what-is-a-valid-email-address#comments</comments>
		<pubDate>Tue, 03 Jun 2008 16:31:12 +0000</pubDate>
		<dc:creator>hm2k</dc:creator>
		
		<category><![CDATA[Development]]></category>

		<category><![CDATA[Email]]></category>

		<category><![CDATA[PHP]]></category>

		<guid isPermaLink="false">http://www.hm2k.com/?p=205</guid>
		<description><![CDATA[With the ongoing abuse of email-based systems, we are in need of ways to validate the email addresses we&#8217;re handling.
We all know what an email address looks like, we see them and use them every single day. But how do you know if it&#8217;s valid or not? The next obvious question should be, what [...]]]></description>
			<content:encoded><![CDATA[<p>With the ongoing abuse of email-based systems, we are in need of ways to validate the email addresses we&#8217;re handling.</p>
<p>We all know what an email address looks like, we see them and use them every single day. But how do you know if it&#8217;s valid or not? The next obvious question should be, what defines a valid email address?</p>
<p>This is what I intend on investigating.</p>
<p><span id="more-205"></span></p>
<p>Before you begin, I would like to make you aware of the difference between validation and verification, which is as follows:</p>
<blockquote><p>Validation is a check to ensure it is true to the specification (eg: is the number N digits long?). Not to be confused with verification which is a check to ensure it is correct within the intended system (eg: does the number work when phoned?).</p></blockquote>
<p>A good starting point for investigating what anything is, is Wikipedia. So, to make this easy to follow, that&#8217;s where we&#8217;re going to start, by looking at the &#8220;<a href="http://en.wikipedia.org/wiki/E-mail_address" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">E-mail address</a>&#8221; article.</p>
<p>As you read the article, you&#8217;ll soon find out about the limitations and validation (not to be confused with <a href="http://en.wikipedia.org/wiki/E-mail_authentication" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">authentication</a>) set by the <a href="http://en.wikipedia.org/wiki/Request_for_Comments" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">RFCs</a>. The best-known early RFC covering the email message format was [<a href="http://tools.ietf.org/html/rfc822" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC822</a>], which was made obsolete by [<a href="http://tools.ietf.org/html/rfc2822" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822</a>]. There are other RFCs listed in the article that you should perhaps also pay attention to; however, I intend to go over these later.</p>
<p>To fully understand how to find out what a valid email address is, we need to fully understand what an RFC is and why we need them.</p>
<p>An RFC (request for comments) essentially is a way in which internet developers can set standards and protocols. The RFCs we need to be focusing on are the ones relating to email, as they will tell us exactly what defines an email address as an email address. Thus in order for us to fully understand what defines an email as valid, we MUST read the RFCs.</p>
<p>RFCs, however, aren&#8217;t easy. They are written in what appears to be a mystical language that looks like English, but isn&#8217;t. Okay, so maybe it&#8217;s not that bad, but it isn&#8217;t exactly a straightforward task to translate it into &#8220;Plain English&#8221;.</p>
<p>After reading <a href="http://haacked.com/archive/2007/08/21/i-knew-how-to-validate-an-email-address-until-i.aspx" onclick="javascript:urchinTracker ('/outbound/article/haacked.com');">I Knew How To Validate An Email Address Until I Read The RFC</a> and <a href="http://www.pgregg.com/projects/php/code/showvalidemail.php" onclick="javascript:urchinTracker ('/outbound/article/www.pgregg.com');">Paul Gregg&#8217;s Demonstrating why email regexs are poor</a>, I knew this wasn&#8217;t going to be easy.</p>
<p>To utilise the specification written in the RFC, we need to convert it into a usable language. In this case we will be using regular expressions within PHP. This article assumes you understand PHP and regular expressions, or will at least try&#8230;</p>
<p>And so I decided to start translating [<a href="http://tools.ietf.org/html/rfc2822" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822</a>] into PHP based regular expressions.</p>
<p>The RFC often gives US-ASCII characters as decimal values alongside standard characters. In most cases I will translate them to hexadecimal escapes using <a href="http://www.php.net/chr" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">chr()</a>, <a href="http://www.php.net/ord" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">ord()</a> and <a href="http://www.php.net/dechex" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">dechex()</a> (eg: %d109 -&gt; chr(109) -&gt; m -&gt; ord(m) -&gt; 109 -&gt; dechex(109) -&gt; \\x6D).</p>
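<p>The decimal-to-hexadecimal step is mechanical, so it can be scripted. The following is a small sketch in Python (not the PHP functions named above) that turns the RFC&#8217;s decimal notation into the escapes used in the regular expressions; both function names are hypothetical helpers of mine.</p>

```python
def decimal_to_hex_escape(token):
    # "%d109" becomes the four-character regex escape \x6d
    return "\\x{:02x}".format(int(token[2:]))


def decimal_range_to_hex(token):
    # "%d14-31" becomes \x0e-\x1f, ready to drop inside a character class
    low, high = token[2:].split("-")
    return decimal_to_hex_escape("%d" + low) + "-" + decimal_to_hex_escape("%d" + high)


if __name__ == "__main__":
    print(chr(109))
    print(decimal_to_hex_escape("%d109"))
    print(decimal_range_to_hex("%d14-31"))
```

<p>Zero-padding to two hex digits matters here, since a value such as 13 has to come out as 0d rather than d for the escape to be valid.</p>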
<p><em>Note: The PHP code here is for display purposes only. It may not actually work due to the changes WordPress makes to the formatting (in particular to the double quotes). If you require the proper code, it is available on request.</em></p>
<blockquote><p>FROM: General Description [<a href="http://tools.ietf.org/html/rfc2822#section-2.1" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822 Section 2.1</a>]<br />
Messages are divided into lines of characters.  A line is a series of<br />
characters that is delimited with the two characters carriage-return<br />
and line-feed; that is, the carriage return (CR) character (ASCII<br />
value 13) followed immediately by the line feed (LF) character (ASCII<br />
value 10).  (The carriage-return/line-feed pair is usually written in<br />
this document as &#8220;CRLF&#8221;.)</p></blockquote>
<p>$CR            = "\\x0d";<br />
$LF            = "\\x0a";<br />
$CRLF        = "(?:$CR$LF)";</p>
<blockquote><p>FROM: Primitive Tokens [<a href="http://tools.ietf.org/html/rfc2822#section-3.2.1" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822 Section 3.2.1</a>]</p>
<p>The following are primitive tokens referred to elsewhere in this<br />
standard, but not otherwise defined in [<a href="http://tools.ietf.org/html/rfc2234">RFC2234</a>].  Some of them will<br />
not appear anywhere else in the syntax, but they are convenient to<br />
refer to in other parts of this document.</p>
<p>NO-WS-CTL       =       %d1-8 /         ; US-ASCII control characters<br />
%d11 /          ;  that do not include the<br />
%d12 /          ;  carriage return, line feed,<br />
%d14-31 /       ;  and white space characters<br />
%d127<br />
text            =       %d1-9 /         ; Characters excluding CR and LF<br />
%d11 /<br />
%d12 /<br />
%d14-127 /<br />
obs-text<br />
specials        =       "(" / ")" /     ; Special characters used in<br />
"&lt;" / "&gt;" /     ;  other parts of the syntax<br />
"[" / "]" /<br />
":" / ";" /<br />
"@" / "\" /<br />
"," / "." /<br />
DQUOTE</p>
<p>No special semantics are attached to these tokens.  They are simply<br />
single characters.</p></blockquote>
<p>$NO_WS_CTL        = "[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x7f]";<br />
$text            = "[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f]";<br />
$DQUOTE            = "\\x22";<br />
$specials        = "[\\x28\\x29\\x3c\\x3e\\x5b\\x5d\\x3a\\x3b\\x40\\x5c\\x2c\\x2e$DQUOTE]";</p>
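<p>A quick way to sanity-check character classes like these is to run a few known characters through them. The sketch below rebuilds two of the classes in Python&#8217;s re module purely for testing; the constant and function names are mine, and Python is used here only as a convenient test bench, not as a replacement for the PHP.</p>

```python
import re

# Same ranges as the PHP strings, written as raw strings for Python's re module
NO_WS_CTL = r"[\x01-\x08\x0b\x0c\x0e-\x1f\x7f]"
DQUOTE = r"\x22"
SPECIALS = r"[\x28\x29\x3c\x3e\x5b\x5d\x3a\x3b\x40\x5c\x2c\x2e" + DQUOTE + "]"


def is_special(char):
    """True when a single character is one of the RFC2822 specials."""
    return re.fullmatch(SPECIALS, char) is not None


def is_no_ws_ctl(char):
    """True when a single character is a non-whitespace control character."""
    return re.fullmatch(NO_WS_CTL, char) is not None


if __name__ == "__main__":
    for char in ["@", ".", "a", "\t", "\x7f"]:
        print(repr(char), is_special(char), is_no_ws_ctl(char))
```

<p>Checking a horizontal tab is a useful probe, because it is a control character that NO-WS-CTL deliberately leaves out.</p>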
<blockquote><p>FROM: Miscellaneous obsolete tokens [<a href="http://tools.ietf.org/html/rfc2822#section-4.1" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822 Section 4.1</a>]<br />
obs-qp          =       "\" (%d0-127)<br />
obs-text        =       *LF *CR *(obs-char *LF *CR)<br />
obs-char        =       %d0-9 / %d11 /          ; %d0-127 except CR and<br />
%d12 / %d14-127         ;  LF</p></blockquote>
<p>$obs_qp            = "(?:\\x5c[\\x00-\\x7f])";<br />
$obs_char        = "[\\x00-\\x09\\x0b\\x0c\\x0e-\\x7f]";<br />
$obs_text        = "(?:$LF*$CR*(?:$obs_char$LF*$CR*)*)";</p>
<blockquote><p>FROM: Structured Header Field Bodies [<a href="http://tools.ietf.org/html/rfc2822#section-2.2.2" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822 Section 2.2.2</a>]<br />
the space (SP, ASCII value 32) and horizontal tab (HTAB, ASCII value 9) characters<br />
(together known as the white space characters, WSP)</p></blockquote>
<p>$WSP            = "[\\x20\\x09]";</p>
<blockquote><p>FROM: Obsolete folding white space [<a href="http://tools.ietf.org/html/rfc2822#section-4.2" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822 Section 4.2</a>]<br />
obs-FWS         =       1*WSP *(CRLF 1*WSP)</p></blockquote>
<p>$obs_FWS        = "(?:$WSP+(?:$CRLF$WSP+)*)";</p>
<blockquote><p>FROM: Quoted characters [<a href="http://tools.ietf.org/html/rfc2822#section-3.2.2" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822 Section 3.2.2</a>]<br />
quoted-pair     =       (&#8221;\&#8221; text) / obs-qp</p></blockquote>
<p>$quoted_pair    = "(?:\\x5c$text|$obs_qp)";</p>
<blockquote><p>FROM: Folding white space and comments [<a href="http://tools.ietf.org/html/rfc2822#section-3.2.3" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822 Section 3.2.3</a>]<br />
FWS             =       ([*WSP CRLF] 1*WSP) /   ; Folding white space<br />
obs-FWS<br />
ctext           =       NO-WS-CTL /     ; Non white space controls</p>
<p>%d33-39 /       ; The rest of the US-ASCII<br />
%d42-91 /       ;  characters not including &#8220;(&#8221;,<br />
%d93-126        ;  &#8220;)&#8221;, or &#8220;\&#8221;<br />
ccontent        =       ctext / quoted-pair / comment<br />
comment         =       &#8220;(&#8221; *([FWS] ccontent) [FWS] &#8220;)&#8221;<br />
CFWS            =       *([FWS] comment) (([FWS] comment) / FWS)</p></blockquote>
<p>$FWS            = "(?:(?:(?:$WSP*$CRLF)?$WSP*)|$obs_FWS)";<br />
$ctext            = "(?:$NO_WS_CTL|[\\x21-\\x27\\x2A-\\x5b\\x5d-\\x7e])";<br />
$ccontent        = "(?:$ctext|$quoted_pair)";<br />
/* NOTICE: 'ccontent' translated only partially to avoid an infinite loop. */<br />
$comment        = "(?:\\x28((?:$FWS?(?:$ccontent|(?1)))*$FWS?\\x29))";<br />
$CFWS           = "((?:$FWS?$comment)*(?:(?:$FWS?$comment)|$FWS))";</p>
<blockquote><p>FROM: Atom [<a href="http://tools.ietf.org/html/rfc2822#section-3.2.4" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822 Section 3.2.4</a>]<br />
atext           =       ALPHA / DIGIT / ; Any character except controls,<br />
&#8220;!&#8221; / &#8220;#&#8221; /     ;  SP, and specials.<br />
&#8220;$&#8221; / &#8220;%&#8221; /     ;  Used for atoms<br />
&#8220;&amp;&#8221; / &#8220;&#8216;&#8221; /<br />
&#8220;*&#8221; / &#8220;+&#8221; /<br />
&#8220;-&#8221; / &#8220;/&#8221; /<br />
&#8220;=&#8221; / &#8220;?&#8221; /<br />
&#8220;^&#8221; / &#8220;_&#8221; /<br />
&#8220;`&#8221; / &#8220;{&#8221; /<br />
&#8220;|&#8221; / &#8220;}&#8221; /<br />
&#8220;~&#8221;<br />
atom            =       [CFWS] 1*atext [CFWS]<br />
dot-atom        =       [CFWS] dot-atom-text [CFWS]<br />
dot-atom-text   =       1*atext *(&#8221;.&#8221; 1*atext)</p></blockquote>
<p>$ALPHA            = '[\\x41-\\x5a\\x61-\\x7a]';<br />
$DIGIT            = '[\\x30-\\x39]';<br />
$atext            = "(?:$ALPHA|$DIGIT|[\\x21\\x23-\\x27\\x2a\\x2b\\x2d\\x2f\\x3d\\x3f\\x5e\\x5f\\x60\\x7b-\\x7e])";<br />
$atom            = "(?:$CFWS?$atext+$CFWS?)";<br />
$dot_atom_text    = "(?:$atext+(?:\\x2e$atext+)*)";<br />
$dot_atom        = "(?:$CFWS?$dot_atom_text$CFWS?)";</p>
<blockquote><p>FROM: Quoted strings [<a href="http://tools.ietf.org/html/rfc2822#section-3.2.5" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822 Section 3.2.5</a>]<br />
qtext           =       NO-WS-CTL /     ; Non white space controls</p>
<p>%d33 /          ; The rest of the US-ASCII<br />
%d35-91 /       ;  characters not including &#8220;\&#8221;<br />
%d93-126        ;  or the quote character<br />
qcontent        =       qtext / quoted-pair<br />
quoted-string   =       [CFWS]<br />
DQUOTE *([FWS] qcontent) [FWS] DQUOTE<br />
[CFWS]</p></blockquote>
<p>$qtext            = "(?:$NO_WS_CTL|[\\x21\\x23-\\x5b\\x5d-\\x7e])";<br />
$qcontent        = "(?:$qtext|$quoted_pair)";<br />
$quoted_string    = "(?:$CFWS?\\x22(?:$FWS?$qcontent)*$FWS?\\x22$CFWS?)";</p>
<blockquote><p>FROM: Miscellaneous tokens [<a href="http://tools.ietf.org/html/rfc2822#section-3.2.6" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822 Section 3.2.6</a>]<br />
word            =       atom / quoted-string</p></blockquote>
<p>$word            = "(?:$atom|$quoted_string)";</p>
<blockquote><p>FROM: Obsolete Addressing [<a href="http://tools.ietf.org/html/rfc2822#section-4.4" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822 Section 4.4</a>]<br />
obs-local-part  =       word *(&#8220;.&#8221; word)<br />
obs-domain      =       atom *(&#8220;.&#8221; atom)</p></blockquote>
<p>$obs_local_part    = "(?:$word(?:\\x2e$word)*)";<br />
$obs_domain        = "(?:$atom(?:\\x2e$atom)*)";</p>
<blockquote><p>FROM: Addr-spec specification [<a href="http://tools.ietf.org/html/rfc2822#section-3.4.1" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822 Section 3.4.1</a>]</p>
<p>addr-spec       =       local-part &#8220;@&#8221; domain<br />
local-part      =       dot-atom / quoted-string / obs-local-part<br />
domain          =       dot-atom / domain-literal / obs-domain<br />
domain-literal  =       [CFWS] &#8220;[&#8221; *([FWS] dcontent) [FWS] &#8220;]&#8221; [CFWS]<br />
dcontent        =       dtext / quoted-pair<br />
dtext           =       NO-WS-CTL /     ; Non white space controls</p>
<p>%d33-90 /       ; The rest of the US-ASCII<br />
%d94-126        ;  characters not including &#8220;[&#8221;,<br />
;  &#8220;]&#8221;, or &#8220;\&#8221;</p></blockquote>
<p>$dtext            = "(?:$NO_WS_CTL|[\\x21-\\x5a\\x5e-\\x7e])";<br />
$dcontent        = "(?:$dtext|$quoted_pair)";<br />
$domain_literal    = "(?:$CFWS?\\x5b(?:$FWS?$dcontent)*$FWS?\\x5d$CFWS?)";<br />
$local_part        = "(?:$dot_atom|$quoted_string|$obs_local_part)";<br />
$domain            = "(?:$dot_atom|$domain_literal|$obs_domain)";<br />
$addr_spec        = "($local_part\\x40$domain)";</p>
<p>There we have it, how to validate an email address according to [<a href="http://tools.ietf.org/html/rfc2822" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822</a>].</p>
<p>However, let&#8217;s stop right there and reflect on what we have here. What we have is a regular expression based on [<a href="http://tools.ietf.org/html/rfc2822" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822</a>] that must be correct, but does it work? Are there any problems? Well yes, there are some problems&#8230;</p>
<ul>
<li>Comments can be nested, so translating comments and their content directly leads to an infinite loop.</li>
<li>It does not appear to validate folding white space where it should.</li>
<li>It does not correctly validate domain literals (IP addresses). They are simply not validated by [<a href="http://tools.ietf.org/html/rfc2822" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822</a>], which means that IP addresses that are invalid under the current protocol (eg: 300.300.300.300) are accepted.</li>
<li>Domain names are not validated correctly either. IP addresses are allowed when they shouldn&#8217;t be, and certain characters are allowed in places they shouldn&#8217;t be, like a dash (-) at the start or end of a domain name (eg: test@-example.com).</li>
<li>Length is not checked at all; email addresses can be as long as you like, much like the regex.</li>
<li>There are many more RFCs to investigate and translate before we can fully validate all parts of an email address.</li>
<li>The email address validation regular expression according to [<a href="http://tools.ietf.org/html/rfc2822" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822</a>] ALONE is almost 20,000 characters long, and that&#8217;s BEFORE we look into solving these other issues.</li>
</ul>
<p>This is simply <strong>unacceptable</strong>.</p>
<p>Although there are fixes and workarounds, in the form of stripping and further validation based on other RFCs, I began to feel that this wasn&#8217;t really suitable for validating real world email addresses.</p>
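<p>To illustrate the kind of further validation meant here, the following is a minimal sketch of checking a domain literal separately from the regex. It assumes PHP&#8217;s filter extension (filter_var()) is available and only handles IPv4; the function name is_valid_ip_literal() is made up for this example.</p>

```php
<?php
// Hypothetical helper: checks that a domain literal such as "[192.0.2.1]"
// contains a syntactically valid IPv4 address. RFC2822 alone does not check this.
function is_valid_ip_literal($domain)
{
    // A domain literal is wrapped in square brackets.
    if (!preg_match('/^\[(.*)\]$/D', $domain, $m)) {
        return false;
    }
    // filter_var() returns the address on success and false on failure.
    return filter_var($m[1], FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false;
}

var_dump(is_valid_ip_literal('[192.0.2.1]'));       // bool(true)
var_dump(is_valid_ip_literal('[300.300.300.300]')); // bool(false)
```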
<p>Ultimately I feel that unless you&#8217;re building a mail client or a mail server, sticking so strictly to the RFC (especially [<a href="http://tools.ietf.org/html/rfc2822" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2822</a>]) isn&#8217;t always going to give you the best results in real world situations.</p>
<p>Look around, email addresses in the real world aren&#8217;t so strict and are far more loosely defined.</p>
<ul>
<li>No folding white space (FWS) - I&#8217;ve never seen a multi-line email address field for a single address.</li>
<li>No comments (CFWS) - Comments simply do not belong in an email address, they can go elsewhere.</li>
<li>No quotes - When was the last time you saw quoted text in an email address?</li>
<li>No IP addresses, domains only - They are only used in temporary circumstances, not live.</li>
<li>No new lines - they could result in &#8220;email header injection&#8221;.</li>
<li>Reasonable lengths - both parts, and the whole thing needs to be kept to a reasonable maximum length.</li>
<li>The domain part doesn&#8217;t need to be so strict - We can easily verify it later using DNS.</li>
<li>TLDs need to be future proof - Don&#8217;t restrict yourself to a set list. Don&#8217;t forget about <a href="http://idn.icann.org/" onclick="javascript:urchinTracker ('/outbound/article/idn.icann.org');">IDN</a>.</li>
<li>Most RFCs are outdated, and unreliable - Remember they are technical documents for servers and clients, but not for real world situations.</li>
<li>Only need to validate real world email addresses - Don&#8217;t be concerned with edge case test samples.</li>
</ul>
<p>Henceforth, the rest of this article will concentrate on this &#8220;less strict&#8221; or &#8220;LOOSE&#8221; specification, defined by real world situations rather than technical ones.</p>
<p>Upon going back to the drawing board I discovered [<a href="http://www.apps.ietf.org/rfc/rfc3696.html" onclick="javascript:urchinTracker ('/outbound/article/www.apps.ietf.org');">RFC3696</a>], written by the guy who wrote [<a href="http://www.apps.ietf.org/rfc/rfc2821.html" onclick="javascript:urchinTracker ('/outbound/article/www.apps.ietf.org');">RFC2821</a>] (SMTP). This will give us the basics of what is required for a valid email address.</p>
<p>[<a href="http://www.apps.ietf.org/rfc/rfc3696.html#sec-3" onclick="javascript:urchinTracker ('/outbound/article/www.apps.ietf.org');">RFC3696 Section 3</a>] entitled &#8220;Restrictions on email addresses&#8221; states:</p>
<blockquote><p>Contemporary email addresses consist of a &#8220;local part&#8221; separated from    a &#8220;domain part&#8221; (a fully-qualified domain name) by an at-sign (&#8221;@&#8221;).</p></blockquote>
<p>We&#8217;ll look at the &#8220;local part&#8221; first.</p>
<p>First off, as above, we will be overlooking quoted forms.</p>
<blockquote><p>[<a href="http://www.apps.ietf.org/rfc/rfc3696.html#sec-3" onclick="javascript:urchinTracker ('/outbound/article/www.apps.ietf.org');">RFC3696 Section 3</a>]</p>
<p>&#8220;These quoted    forms are rarely recommended, and are uncommon in practice&#8221;</p></blockquote>
<p>We&#8217;ll ignore anything about using quotes, &#8220;real world&#8221; email addresses don&#8217;t contain quotes.</p>
<blockquote><p>[<a href="http://www.apps.ietf.org/rfc/rfc3696.html#sec-3" onclick="javascript:urchinTracker ('/outbound/article/www.apps.ietf.org');">RFC3696 Section 3</a>]</p>
<p>Without quotes, local-parts may consist of any combination of    alphabetic characters, digits, or any of the special characters</p>
<pre>      ! # $ % &amp; ' * + - / = ?  ^ _ ` . { | } ~</pre>
<p>period (&#8221;.&#8221;) may also appear, but may not be used to start or end the    local part, nor may two or more consecutive periods appear.</p></blockquote>
<p>&#8220;Alphabetic characters&#8221; are &#8220;a-zA-Z&#8221;, digits are &#8220;0-9&#8221;, and the special characters appear as above. In a PHP-based regex, the combination, or &#8220;comb&#8221; for short, looks like this:</p>
<blockquote><p>$comb        = '[a-zA-Z0-9!#$%&amp;\'*+\/=?^_`{|}~.-]';</p></blockquote>
<p>You&#8217;ll notice that some of the special characters have backslashes (\) next to them. This is to &#8220;escape&#8221; them when used in a regular expression, as they normally hold special meaning. Also, the dash (-) was moved to the end so that it is treated as a literal character rather than as a range operator (as in &#8220;a-z&#8221;).</p>
<p>Putting this information together, including the bit about periods appearing in the middle, but never two together, that appears like this:</p>
<blockquote><p>$local_part    = "($comb(?:\.$comb)?)+";</p></blockquote>
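<p>One caveat: because the character class itself contains the period, the expression above does not by itself reject leading, trailing or doubled periods. A minimal sketch of a stricter variant follows, assuming the period is taken out of the class and only allowed between runs of other characters. The names $atext and is_valid_local_part() are made up for this example.</p>

```php
<?php
// Characters allowed in an unquoted local part, WITHOUT the period.
$atext = '[a-zA-Z0-9!#$%&\'*+\/=?^_`{|}~-]';
// One or more runs of $atext, separated by single periods.
$local_part = "$atext+(?:\\.$atext+)*";

// Hypothetical helper: anchors the pattern and tests one local part.
function is_valid_local_part($local, $pattern)
{
    return preg_match('/^' . $pattern . '$/D', $local) === 1;
}

var_dump(is_valid_local_part('john.smith', $local_part));  // bool(true)
var_dump(is_valid_local_part('.john', $local_part));       // bool(false)
var_dump(is_valid_local_part('john..smith', $local_part)); // bool(false)
var_dump(is_valid_local_part('john.', $local_part));       // bool(false)
```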
<p>That&#8217;s the local part done. Now onto the domain part, which we&#8217;ll base on [<a href="http://www.apps.ietf.org/rfc/rfc3696.html#sec-2" onclick="javascript:urchinTracker ('/outbound/article/www.apps.ietf.org');">RFC3696 Section 2</a>].</p>
<blockquote><p>the labels (words or strings    separated by periods) that make up a domain name must consist of only    the ASCII [ASCII] alphabetic and numeric characters, plus the hyphen. No other symbols or punctuation characters are permitted, nor is    blank space.  If the hyphen is used, it is not permitted to appear at    either the beginning or end of a label.  There is an additional rule    that essentially requires that top-level domain names not be all-    numeric.</p></blockquote>
<blockquote><p>Most internet applications that reference other hosts or systems    assume they will be supplied with &#8220;fully-qualified&#8221; domain names,    i.e., ones that include all of the labels leading to the root,    including the TLD name.  Those fully-qualified domain names are then    passed to either the domain name resolution protocol itself or to the    remote systems.  Consequently, purported DNS names to be used in    applications and to locate resources generally must contain at least    one period (&#8221;.&#8221;) character.</p>
<p>A DNS label may be no more than 63 octets long.</p></blockquote>
<p>Although [RFC3696] doesn&#8217;t say so explicitly, we understand that periods cannot appear at the start or end of a domain name, because periods are only used to &#8220;separate labels&#8221;.</p>
<p>When building this I had some issues to overcome&#8230;</p>
<ul>
<li>DNS labels cannot start or end with a dash (-), however two or more are allowed together in a label.</li>
<li>TLDs cannot be &#8220;all numerics&#8221;. TLDs are generally all alphabetical, APART from IDN TLDs, which start with &#8220;xn--&#8221; followed by a string of ASCII characters. This does throw a spanner in the works. However, there&#8217;s one consistency throughout: all valid TLDs start with at least 1 alphabetical character, so this is what we will check for.</li>
<li>TLDs are generally between 2 and 6 characters long. IDN TLDs change all this, as I have seen IDN TLDs as long as 18 characters; the RFC, however, allows up to 63.</li>
<li>A label can be 1 character long.</li>
</ul>
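<p>The label rules above can be sketched as a regular expression. This is only an illustration of the strict reading of those rules, assuming a TLD of at least 2 characters that starts with a letter; the helper name is_valid_domain() is made up for this example.</p>

```php
<?php
// A label: starts and ends with a letter or digit, hyphens only inside,
// 1 to 63 characters in total.
$label = '[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?';
// A TLD label: starts with a letter, ends with a letter or digit,
// 2 to 63 characters (covers "com" as well as "xn--..." style IDN TLDs).
$tldlabel = '[a-zA-Z][a-zA-Z0-9-]{0,61}[a-zA-Z0-9]';
// At least one label followed by a period, then the TLD label.
$domain = "(?:$label\\.)+$tldlabel";

// Hypothetical helper: anchors the pattern and tests one domain name.
function is_valid_domain($domain_name, $pattern)
{
    return preg_match('/^' . $pattern . '$/D', $domain_name) === 1;
}

var_dump(is_valid_domain('example.com', $domain));  // bool(true)
var_dump(is_valid_domain('-example.com', $domain)); // bool(false)
var_dump(is_valid_domain('x-.x.com', $domain));     // bool(false)
var_dump(is_valid_domain('example.123', $domain));  // bool(false)
```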
<p>Finally, we need to ensure that the length is correct. For this we need to read the [<a href="http://www.rfc-editor.org/errata_search.php?rfc=3696" onclick="javascript:urchinTracker ('/outbound/article/www.rfc-editor.org');">RFC3696 errata</a>].</p>
<blockquote>
<pre class="rfctext">   In addition to restrictions on syntax, there is a length limit on
   email addresses.  That limit is a maximum of 64 characters (octets)
   in the "local part" (before the "@") and a maximum of 255 characters
   (octets) in the domain part (after the "@") for a total length of 320
   characters. However, there is a restriction in RFC 2821 on the length of an
   address in MAIL and RCPT commands of 256 characters.  Since addresses
   that do not fit in those fields are not normally useful, the upper
   limit on address lengths should normally be considered to be 256.</pre>
</blockquote>
<p>When it comes to dealing with lengths in regular expressions, it can often become very confusing, so I wrote this little piece of advice to refer to&#8230;</p>
<blockquote><p>(it){X,Y} means &#8220;match it at least X and at most Y times&#8221;</p></blockquote>
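<p>A quick way to convince yourself of this is to try it. The pattern below is just a toy example, not part of the email regex:</p>

```php
<?php
// {X,Y} applies to the element immediately before it, here the group (?:ab).
$pattern = '/^(?:ab){2,3}$/';

var_dump(preg_match($pattern, 'abab'));     // int(1): two repetitions
var_dump(preg_match($pattern, 'ab'));       // int(0): only one repetition
var_dump(preg_match($pattern, 'abababab')); // int(0): four repetitions

// The same idea caps a run of characters, e.g. at 64:
var_dump(preg_match('/^[a-z]{1,64}$/', str_repeat('x', 65))); // int(0)
```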
<p>What we need to do in terms of length is as follows:</p>
<ul>
<li>The &#8220;local-part&#8221; total length must be no longer than 64 characters.</li>
<li>The &#8220;domain-part&#8221; total length must be no longer than 255 characters.</li>
<li>Each &#8220;dns-label&#8221; total length must be no longer than 63 characters.</li>
<li>The entire &#8220;email address&#8221; total length must be no longer than 256 characters.</li>
</ul>
<p>Put this together with the fact that certain elements cannot start or end with certain characters, and it becomes difficult to correctly place the end check. Here&#8217;s a rundown of that:</p>
<ul>
<li>The &#8220;local-part&#8221; cannot start or end with a period (.)</li>
<li>The &#8220;local-part&#8221; must not have two periods together</li>
<li>A &#8220;dns-label&#8221; cannot start or end with a dash (-)</li>
</ul>
<p>I found that I was unable to satisfy both the lengths and the character placements in a single regular expression. This forced me to make a decision: I could have one or the other, or neither.</p>
<p>I figured that lengths actually hold very little value in validation. Providing the email looks right specific lengths won&#8217;t matter. Besides, we don&#8217;t need regular expressions in order to check lengths, it&#8217;s a very simple principle. It&#8217;s also worth noting that I discovered the local part <a href="mailto:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx@mailinator.com">CAN be over 64 characters</a>, <a href="http://www.mailinator.com/maildir.jsp?email=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx&amp;x=0&amp;y=0" onclick="javascript:urchinTracker ('/outbound/article/www.mailinator.com');">check it out</a>.</p>
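<p>If you do want to enforce the limits from the errata, a few strlen() calls will do. The following is a minimal sketch that only checks the upper limits listed earlier; the function name has_valid_lengths() is made up for this example, and it assumes the last &#8220;@&#8221; separates the two parts.</p>

```php
<?php
// Hypothetical helper: applies the upper length limits quoted from the
// RFC3696 errata. It does not check anything else about the address.
function has_valid_lengths($email)
{
    $at = strrpos($email, '@');
    if ($at === false) {
        return false;
    }
    $local  = (string) substr($email, 0, $at);
    $domain = (string) substr($email, $at + 1);

    if (strlen($email) > 256 || strlen($local) > 64 || strlen($domain) > 255) {
        return false;
    }
    // Each DNS label may be no more than 63 octets long.
    foreach (explode('.', $domain) as $label) {
        if (strlen($label) > 63) {
            return false;
        }
    }
    return true;
}

var_dump(has_valid_lengths('user@example.com'));                      // bool(true)
var_dump(has_valid_lengths(str_repeat('x', 65) . '@example.com'));    // bool(false)
```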
<p>After playing around with dots and dashes in various places in email addresses on various servers and clients I soon discovered that it wasn&#8217;t as strict as I had first perceived. I found many examples of dots and dashes where they shouldn&#8217;t be, mainly at the end of dns-labels (such as &#8220;x-.x.com&#8221;). Ultimately, at least for the &#8220;local-part&#8221;, it&#8217;s down to the user. For both parts verification should be used instead.</p>
<p>So the local-part now looks like this:</p>
<blockquote><p>$local_part    = "[a-zA-Z0-9!#\$%&amp;\'\*\+\/=\?\^_`\{\|\}~\.-]+";</p></blockquote>
<p>And FINALLY, the domain part looks like this:</p>
<blockquote><p>$consists    = '[a-zA-Z0-9][a-zA-Z0-9-]*';<br />
$label        = "(?:$consists(?:\.$consists)?)";<br />
$tldlabel    = "(?:[a-zA-Z][a-zA-Z0-9-]+)";<br />
$domain        = "$label+\.$tldlabel";</p></blockquote>
<p>We now need to bring the two parts back together, separated by an at-sign (@)&#8230;</p>
<blockquote><p>$addr_spec = "$local_part@$domain";</p></blockquote>
<p>Once you&#8217;ve added the syntax to match the start and end position, the resulting regular expression looks something like this:</p>
<blockquote><p>/^[a-zA-Z0-9!#$%&amp;\'*+\/=?^_`{|}~.-]+@(?:[a-zA-Z0-9][a-zA-Z0-9-]*(?:\.[a-zA-Z0-9][a-zA-Z0-9-]*)?)+\.(?:[a-zA-Z][a-zA-Z0-9-]+)$/i</p></blockquote>
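<p>Here is a minimal sketch of how the pieces above might be assembled and used with preg_match(). The parts are written as single-quoted strings so PHP does not interpret the &#8220;$&#8221; or the backslashes; the wrapper name is_loosely_valid_email() is made up for this example.</p>

```php
<?php
// The building blocks from above, as single-quoted strings.
$local_part = '[a-zA-Z0-9!#$%&\'*+\/=?^_`{|}~.-]+';
$consists   = '[a-zA-Z0-9][a-zA-Z0-9-]*';
$label      = '(?:' . $consists . '(?:\.' . $consists . ')?)';
$tldlabel   = '(?:[a-zA-Z][a-zA-Z0-9-]+)';
$domain     = $label . '+\.' . $tldlabel;
$addr_spec  = $local_part . '@' . $domain;

// Hypothetical wrapper: anchors the pattern and applies the "i" modifier.
function is_loosely_valid_email($email, $addr_spec)
{
    return preg_match('/^' . $addr_spec . '$/i', $email) === 1;
}

var_dump(is_loosely_valid_email('john.smith@example.com', $addr_spec)); // bool(true)
var_dump(is_loosely_valid_email('user@example', $addr_spec));           // bool(false)
var_dump(is_loosely_valid_email('user@-example.com', $addr_spec));      // bool(false)
```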
<p>I&#8217;m sure some of you have probably been shouting all the way through this that you can shorten the regex. I purposely didn&#8217;t do this, to make it easier to follow. However, you can shorten [a-zA-Z] by using the &#8220;case insensitive&#8221; modifier, which allows you to remove &#8220;A-Z&#8221;. It&#8217;s also worth noting that you can use &#8220;\d&#8221; instead of &#8220;0-9&#8221;.</p>
<p>Here&#8217;s what I did:</p>
<blockquote><p>$addr_spec = str_replace('a-zA-Z', 'a-z', $addr_spec);<br />
$addr_spec = str_replace('0-9', '\d', $addr_spec);</p></blockquote>
<p>You may also wish to take it further and consider replacing &#8220;a-z\d&#8221; with &#8220;\w&#8221;, and also removing the extra &#8220;_&#8221;, since &#8220;\w&#8221; means word, which includes &#8220;a-zA-Z0-9_&#8221;.</p>
<p>Here&#8217;s how it looks:</p>
<blockquote><p>/^[\w!#$%&amp;\'*+\/=?^`{|}~.-]+@(?:[a-z\d][a-z\d-]*(?:\.[a-z\d][a-z\d-]*)?)+\.(?:[a-z][a-z\d-]+)$/i</p></blockquote>
<p><strong>Update:</strong> Due to <a href="http://www.php-security.org/MOPB/PMOPB-45-2007.html" onclick="javascript:urchinTracker ('/outbound/article/www.php-security.org');">recent vulnerabilities</a> in PHP&#8217;s very own email address validation regex (FILTER_VALIDATE_EMAIL), used by the filter_var() function, it&#8217;s recommended that you use the D modifier (PCRE_DOLLAR_ENDONLY), which stops &#8220;$&#8221; from matching before a trailing newline. ie:</p>
<blockquote><p>/^[\w!#$%&amp;\'*+\/=?^`{|}~.-]+@(?:[a-z\d][a-z\d-]*(?:\.[a-z\d][a-z\d-]*)?)+\.(?:[a-z][a-z\d-]+)$/iD</p></blockquote>
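<p>The difference is easy to demonstrate. This short sketch runs the same pattern with and without the D modifier against an address that has a newline on the end; the variable names are made up for this example.</p>

```php
<?php
// The shortened pattern from above, without delimiters or modifiers.
$base = '^[\w!#$%&\'*+\/=?^`{|}~.-]+@(?:[a-z\d][a-z\d-]*(?:\.[a-z\d][a-z\d-]*)?)+\.(?:[a-z][a-z\d-]+)$';

$without_d = '/' . $base . '/i';
$with_d    = '/' . $base . '/iD';

// An otherwise valid address with a trailing newline.
$input = "user@example.com\n";

var_dump(preg_match($without_d, $input)); // int(1): "$" matches before the final newline
var_dump(preg_match($with_d, $input));    // int(0): "$" only matches at the very end
```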
<p><strong>Final thoughts</strong></p>
<p>Learning how to correctly validate an email address has been one of the most stressful and time consuming things I&#8217;ve had to do in web development.</p>
<p>RFCs aren&#8217;t easy to understand, they are a complete minefield, and it results in something that is incomprehensible and unmaintainable.</p>
<p>There&#8217;s a lot that can be said for proper validation, so many people get it wrong, and it can mean the difference between a sale and no sale, but there&#8217;s a difference between doing it properly based strictly on technical specification and doing it properly for real world situations.</p>
<p>In order to validate correctly, you must be in touch with the real world, and not get caught up too much in the technical documentation, otherwise you will find yourself far from the original objective.</p>
<p>Thus a lot can be said about the outdated RFCs, and the people who write them. The technical specification is so far out of touch with reality it does not actually work in practice.</p>
<p>Having said all this, of course validation has its limitations and can only do so much. Once you&#8217;ve validated the email address to the best of your ability without using too many resources, verification is the next step.</p>
<p>This article set out, for all intents and purposes, to validate an email address. Although basic levels of verification can be done very easily, I feel that it goes beyond the scope of this article.</p>
<p>For more information with regards to email address verification, I suggest you look into the Simple Mail Transfer Protocol (SMTP), details can be found in [<a href="http://tools.ietf.org/html/rfc2821" onclick="javascript:urchinTracker ('/outbound/article/tools.ietf.org');">RFC2821</a>], you may also be interested in the <a href="http://www.php.net/getmxrr" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">getmxrr()</a> function. Also consider the use of DNS to verify the domain name.</p>
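<p>As a starting point, here is a minimal sketch of a DNS check on the domain part. It assumes PHP&#8217;s checkdnsrr() is available on your platform; the function name domain_accepts_mail() and its $lookup parameter are made up for this example, the latter only so that the DNS call can be swapped out when testing.</p>

```php
<?php
// Hypothetical helper: checks whether the domain part of an address has an MX
// record or, failing that, an A record. By default it uses PHP's checkdnsrr();
// $lookup is an assumption of this sketch so the DNS call can be replaced.
function domain_accepts_mail($email, $lookup = 'checkdnsrr')
{
    $at = strrpos($email, '@');
    if ($at === false) {
        return false;
    }
    $domain = (string) substr($email, $at + 1);
    if ($domain === '') {
        return false;
    }
    return call_user_func($lookup, $domain, 'MX')
        || call_user_func($lookup, $domain, 'A');
}

// Usage against real DNS (result depends on your network):
// var_dump(domain_accepts_mail('user@example.com'));
```

<p>Bear in mind that a DNS record only tells you the domain exists, not that the mailbox does.</p>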
<p>I hope you&#8217;ve enjoyed reading this article, it took me a long time to complete, and was quite stressful, but I feel satisfied that I am now fully qualified to validate email addresses to a satisfactory level. I hope that now, you are too.</p>
<p>I look forward to your comments.</p>
<p><strong>Resources</strong></p>
<ul>
<li><a href="http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html" onclick="javascript:urchinTracker ('/outbound/article/www.ex-parrot.com');">Perl&#8217;s Mail::RFC822::Address</a></li>
<li><a href="http://code.iamcal.com/php/rfc822/rfc2822.phps" onclick="javascript:urchinTracker ('/outbound/article/code.iamcal.com');">Cal&#8217;s is_valid_email_address PHP function</a></li>
<li><a href="http://uk.php.net/manual/en/function.preg-match-all.php#62104" onclick="javascript:urchinTracker ('/outbound/article/uk.php.net');">sinful-music.com&#8217;s mime_extract_rfc2822_address</a></li>
<li><a href="http://SimonSlick.com/VEAF/" onclick="javascript:urchinTracker ('/outbound/article/SimonSlick.com');">SimonSlick&#8217;s Validate Email Address Format</a></li>
<li><a href="http://www.santosj.name/php/stop-doing-email-validation-the-wrong-way/" onclick="javascript:urchinTracker ('/outbound/article/www.santosj.name');">Jacob Santos&#8217;s &#8220;Stop Doing Email Validation the Wrong Way&#8221; rant.</a></li>
<li><a href="http://www.markussipila.info/pub/emailvalidator.php" onclick="javascript:urchinTracker ('/outbound/article/www.markussipila.info');">Validate email addresses using regular expressions</a></li>
<li><a href="http://www.ilovejackdaniels.com/php/email-address-validation/" onclick="javascript:urchinTracker ('/outbound/article/www.ilovejackdaniels.com');">ilovejackdaniels.com on email address validation</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.hm2k.com/posts/what-is-a-valid-email-address/feed</wfw:commentRss>
		</item>
		<item>
		<title>Firefox &#8220;Always On Top&#8221; on Windows XP</title>
		<link>http://www.hm2k.com/posts/firefox-always-on-top</link>
		<comments>http://www.hm2k.com/posts/firefox-always-on-top#comments</comments>
		<pubDate>Mon, 26 May 2008 22:05:24 +0000</pubDate>
		<dc:creator>hm2k</dc:creator>
		
		<category><![CDATA[Software]]></category>

		<category><![CDATA[Windows]]></category>

		<guid isPermaLink="false">http://www.hm2k.com/?p=204</guid>
		<description><![CDATA[When watching long streaming online videos I often watch them while I do other things, perhaps even browse other websites. I need something to keep Firefox on top!
The problem is the moment that I use another application the focus is taken away from the Firefox window and it goes into the background. This is no [...]]]></description>
			<content:encoded><![CDATA[<p>When watching long streaming online videos I often watch them while I do other things, perhaps even browse other websites. I need something to keep Firefox on top!</p>
<p>The problem is the moment that I use another application the focus is taken away from the Firefox window and it goes into the background. This is no good as I can no longer see the video.</p>
<p>I decided to investigate a solution that could keep the Mozilla Firefox window &#8220;Always on Top&#8221;&#8230;</p>
<p><span id="more-204"></span></p>
<p>Like most people my first port of call was the Mozilla Firefox community. I made a post called &#8220;<a href="http://forums.mozillazine.org/viewtopic.php?p=2810517" onclick="javascript:urchinTracker ('/outbound/article/forums.mozillazine.org');">Always on top</a>&#8221;, which asked if there was JavaScript or a plugin somewhere that can force the window to always be on top.</p>
<p>I quickly discovered that there wasn&#8217;t going to be any kind of easy way to achieve this using Firefox alone.</p>
<p>What I needed was a third party application, ideally something freeware, or even better open source.</p>
<p>Here&#8217;s a short list of what I found:</p>
<ul>
<li><a href="http://www.actualtools.com/windowmanager/" onclick="javascript:urchinTracker ('/outbound/article/www.actualtools.com');">Actual Window Manager</a> - $49.95
<ul>
<li>Seems rather expensive for something that should be free.</li>
</ul>
</li>
<li><a href="http://www.delayedreaction.com/freestuff/index.html" onclick="javascript:urchinTracker ('/outbound/article/www.delayedreaction.com');">Keep On Top</a> - Freeware
<ul>
<li>Very basic, doesn&#8217;t really offer anything special</li>
</ul>
</li>
<li><a href="http://www.abstractpath.com/powermenu/" onclick="javascript:urchinTracker ('/outbound/article/www.abstractpath.com');">AbstractPath PowerMenu</a> - Freeware
<ul>
<li>Gives you context menus that offer: Always On Top, Transparency and Minimize To Tray.</li>
</ul>
</li>
<li><a href="http://rbtray.sourceforge.net/" onclick="javascript:urchinTracker ('/outbound/article/rbtray.sourceforge.net');">RBTray</a> - Open Source
<ul>
<li>Offers the context menus: Always On Top, and Minimize To Tray</li>
</ul>
</li>
<li><a href="http://www.codeproject.com/KB/DLL/WinPin.aspx" onclick="javascript:urchinTracker ('/outbound/article/www.codeproject.com');">WinPin</a> - Open Source
<ul>
<li>Gives you a little pin to the left of your &#8220;Minimise&#8221; button that will &#8220;pin&#8221; the window on top.</li>
</ul>
</li>
</ul>
<p>All in all, I&#8217;d say WinPin does the job just fine. If you also want to minimise to tray, try RBTray.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hm2k.com/posts/firefox-always-on-top/feed</wfw:commentRss>
		</item>
		<item>
		<title>Seen script for mIRC updated</title>
		<link>http://www.hm2k.com/posts/seen-script-for-mirc-updated</link>
		<comments>http://www.hm2k.com/posts/seen-script-for-mirc-updated#comments</comments>
		<pubDate>Wed, 21 May 2008 10:33:18 +0000</pubDate>
		<dc:creator>hm2k</dc:creator>
		
		<category><![CDATA[IRC]]></category>

		<guid isPermaLink="false">http://www.hm2k.com/?p=201</guid>
		<description><![CDATA[Looking for someone? huh? Well, look no further, this script is designed to keep a log of people quitting, parting, being kicked out of and changing their nick&#8230; It also tells you if they are still on IRC, on a different channel or such, it&#8217;s basically the easiest way to keep track of people. It can [...]]]></description>
			<content:encoded><![CDATA[<p>Looking for someone? huh? Well, look no further, this script is designed to keep a log of people quitting, parting, being kicked out of and changing their nick&#8230; It also tells you if they are still on IRC, on a different channel or such, it&#8217;s basically the easiest way to keep track of people. It can now also tell you when someone last spoke.</p>
<p><span id="more-201"></span>History:</p>
<p>Seen v1.1 - 20/05/08 Lastspoke can work from seen.log now too, added relay so can be used locally, redid input and output, reset your log file<br />
Seen v1.03    - 09/06/06 now has lastspoke, other functions were fixed too.<br />
Seen v1.02    - Cleaned up the script, added this file.<br />
Seen v1.01    - Fixed a few bugs and added new features.<br />
Seen v1.0    - Initial Public Release - Basic Code.</p>
<p>Download <a href="http://www.hm2k.com/?dl=seen.mrc" >seen.mrc </a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.hm2k.com/posts/seen-script-for-mirc-updated/feed</wfw:commentRss>
		</item>
		<item>
		<title>50+ PHP optimisation tips revisited</title>
		<link>http://www.hm2k.com/posts/50-php-optimisation-tips-revisited</link>
		<comments>http://www.hm2k.com/posts/50-php-optimisation-tips-revisited#comments</comments>
		<pubDate>Tue, 20 May 2008 17:51:28 +0000</pubDate>
		<dc:creator>hm2k</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<guid isPermaLink="false">http://www.hm2k.com/?p=178</guid>
		<description><![CDATA[After reading an article some time ago entitled &#8220;40 Tips for optimizing your php Code&#8221; (and some others that are suspiciously similar), I decided to redo it, but properly this time with more accurate tips, providing references and citations for each and every one.
The result is this list of over 50 PHP optimisation tips&#8230;
Enjoy!


echo is [...]]]></description>
			<content:encoded><![CDATA[<p>After reading an article some time ago entitled &#8220;<a href="http://digg.com/programming/40_Tips_for_optimizing_your_php_Code" onclick="javascript:urchinTracker ('/outbound/article/digg.com');">40 Tips for optimizing your php Code</a>&#8221; (and some others that are suspiciously similar), I decided to redo it, but properly this time with more accurate tips, providing references and citations for each and every one.</p>
<p>The result is this list of over 50 PHP optimisation tips&#8230;</p>
<p>Enjoy!</p>
<p><span id="more-178"></span></p>
<ol>
<li><a href="http://www.php.net/echo" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');"><em>echo</em></a> is faster than <a href="http://www.php.net/print" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');"><em>print</em></a>. <a href="http://web.archive.org/web/20050407085143/http://dynacker.dotgeek.org/printvsecho/" onclick="javascript:urchinTracker ('/outbound/article/web.archive.org');">[Citation]</a></li>
<li>Wrapping your string in single quotes (') instead of double quotes (") is faster because PHP searches for variables inside "&#8230;" but not inside '&#8230;'; use this when your string contains no variables that need evaluating. <a href="http://spindrop.us/2007/03/03/php-double-versus-single-quotes/" onclick="javascript:urchinTracker ('/outbound/article/spindrop.us');">[Citation]</a></li>
<li>Use sprintf instead of variables contained in double quotes; it&#8217;s about 10x faster. <a href="http://teroheikkinen.iki.fi/blog/view/php-s_different_echo_methods_performance_comparison.html" onclick="javascript:urchinTracker ('/outbound/article/teroheikkinen.iki.fi');">[Citation]</a></li>
<li>Use <a href="http://www.php.net/echo" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">echo</a>&#8217;s multiple parameters (or stacked) instead of string concatenation. <a href="http://blog.libssh2.org/index.php?/archives/28-How-long-is-a-piece-of-string.html" onclick="javascript:urchinTracker ('/outbound/article/blog.libssh2.org');">[Citation]</a></li>
<li>Use pre-calculations: set the maximum value for your for-loops before the loop, not in it. For example, <a href="http://www.php.net/for" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">for</a> ($x=0; $x &lt; count($array); $x++) calls the count() function on every iteration; set $max=count($array) before the for-loop starts and compare against $max instead. <a href="http://www.php.lt/benchmark/phpbench.php" onclick="javascript:urchinTracker ('/outbound/article/www.php.lt');">[Citation]</a></li>
<li>Unset or null your variables to free memory, especially large arrays. <a href="http://lists.nyphp.org/pipermail/talk/2003-January/001855.html" onclick="javascript:urchinTracker ('/outbound/article/lists.nyphp.org');">[Citation]</a></li>
<li>Avoid magic like <a href="http://uk2.php.net/manual/en/language.oop5.overloading.php" onclick="javascript:urchinTracker ('/outbound/article/uk2.php.net');">__get, __set</a>, <a href="http://www.php.net/__autoload" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">__autoload</a>. <a href="http://www.ilia.ws/files/zend_performance.pdf" onclick="javascript:urchinTracker ('/outbound/article/www.ilia.ws');">[Citation]</a></li>
<li>Use require() instead of require_once() where possible. <a href="http://peter.mapledesign.co.uk/weblog/archives/writing-faster-php-code-1-require_once" onclick="javascript:urchinTracker ('/outbound/article/peter.mapledesign.co.uk');">[Citation]</a></li>
<li>Use <a href="http://en.wikipedia.org/wiki/Path_(computing)" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">full paths</a> in includes and requires, less time spent on resolving the OS paths. <a href="http://t3.dotgnu.info/blog/php/include_once-mostly-harmless.html" onclick="javascript:urchinTracker ('/outbound/article/t3.dotgnu.info');">[Citation]</a></li>
<li>require() and include() are identical in every way except require halts if the file is missing. Performance wise there is very little difference. <a href="http://groups.google.com/group/php.general/browse_thread/thread/72332fe1ed21e104/b1650148cd6e3c17?lnk=st&amp;q=php+require+vs+include+performance#b1650148cd6e3c17" onclick="javascript:urchinTracker ('/outbound/article/groups.google.com');">[Citation]</a></li>
<li>Since PHP 5.1, the time at which the script started executing (the start of the request) can be found in $_SERVER['REQUEST_TIME']; use this instead of time() or microtime(). <a href="http://www.php.net/time" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">[Citation]</a></li>
<li><a href="http://www.php.net/pcre" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">PCRE</a> regex is quicker than <a href="http://www.php.net/ereg" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">EREG</a>, but always see if you can use quicker native functions such as <a href="http://www.php.net/strncasecmp" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">strncasecmp</a>, <a href="http://www.php.net/strpbrk" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">strpbrk</a> and <a href="http://www.php.net/stripos" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">stripos</a> instead. <a href="http://talks.php.net/show/php-best-practices/36" onclick="javascript:urchinTracker ('/outbound/article/talks.php.net');">[Citation]</a></li>
<li>When parsing XML in PHP, try <a href="http://www.bin-co.com/php/scripts/xml2array/" onclick="javascript:urchinTracker ('/outbound/article/www.bin-co.com');">xml2array</a>, which makes use of the <a href="http://www.php.net/xml" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">PHP XML functions</a>; for HTML you can try PHP&#8217;s <a href="http://www.php.net/dom" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">DOM document</a>, or <a href="http://www.php.net/domxml" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">DOM XML</a> in PHP4. <a href="http://htmlparsing.icenine.ca/" onclick="javascript:urchinTracker ('/outbound/article/htmlparsing.icenine.ca');">[Citation]</a></li>
<li><a href="http://www.php.net/str_replace" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">str_replace</a> is faster than <a href="http://www.php.net/preg_replace" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">preg_replace</a> and is best overall; however, <a href="http://www.php.net/strtr" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">strtr</a> is sometimes quicker with larger strings. Passing arrays to a single str_replace call is usually quicker than making multiple str_replace calls. <a href="http://www.tummblr.com/web-development/php/php-speed-of-string-replacement-functions/" onclick="javascript:urchinTracker ('/outbound/article/www.tummblr.com');">[Citation]</a></li>
<li>&#8220;else if&#8221; statements are faster than <a href="http://en.wikipedia.org/wiki/Switch_statement" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">select statements</a> aka <a href="http://www.php.net/switch" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">case/switch</a>. <a href="http://www.php.lt/benchmark/phpbench.php" onclick="javascript:urchinTracker ('/outbound/article/www.php.lt');">[Citation]</a></li>
<li>Error suppression with @ is very slow. <a href="http://michelf.com/weblog/2005/bad-uses-of-the-at-operator/" onclick="javascript:urchinTracker ('/outbound/article/michelf.com');">[Citation]</a></li>
<li>To reduce bandwidth usage turn on mod_deflate in Apache v2 <a href="http://howtoforge.com/apache2_mod_deflate" onclick="javascript:urchinTracker ('/outbound/article/howtoforge.com');">[Citation]</a> or for Apache v1 try mod_gzip. <a href="http://talks.php.net/show/php-best-practices/40" onclick="javascript:urchinTracker ('/outbound/article/talks.php.net');">[Citation]</a></li>
<li>Close your database connections when you&#8217;re done with them. <a href="http://uk.php.net/manual/en/function.mysql-close.php#69063" onclick="javascript:urchinTracker ('/outbound/article/uk.php.net');">[Citation]</a></li>
<li>$row['id'] is <a href="http://www.moskalyuk.com/blog/php-optimization-tips/1272" onclick="javascript:urchinTracker ('/outbound/article/www.moskalyuk.com');">7 times faster</a> than $row[id], because without quotes PHP first looks for a constant named id and only then falls back to treating it as a string index. <a href="http://www.php.net/constants#language.constants.syntax" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">[Citation]</a></li>
<li>Use &lt;?php &#8230; ?&gt; tags when declaring PHP as all other styles are deprecated, including short tags. <a href="http://talks.php.net/show/php-best-practices/10" onclick="javascript:urchinTracker ('/outbound/article/talks.php.net');">[Citation]</a></li>
<li>Use strict code and avoid suppressing errors, notices and warnings; this results in cleaner code and less overhead. Consider having <a href="http://www.php.net/error_reporting" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">error_reporting(E_ALL)</a> always on. <a href="http://talks.php.net/show/php-best-practices/11" onclick="javascript:urchinTracker ('/outbound/article/talks.php.net');">[Citation]</a></li>
<li>PHP scripts are served 2-10 times slower by Apache httpd than a static page. Try to use static pages instead of server-side scripts. <a href="http://talks.php.net/show/php-best-practices/34" onclick="javascript:urchinTracker ('/outbound/article/talks.php.net');">[Citation]</a></li>
<li>PHP scripts (unless cached) are compiled on the fly every time you call them. Install a PHP opcode cache (such as <a href="http://eaccelerator.net/" onclick="javascript:urchinTracker ('/outbound/article/eaccelerator.net');">eAccelerator</a> or <a href="http://sourceforge.net/projects/turck-mmcache/" onclick="javascript:urchinTracker ('/outbound/article/sourceforge.net');">Turck MMCache</a>) to typically increase performance by 25-100% by removing compile times. You can even <a href="http://www.cpanel.net/support/docs/ea/ea3/ea3php_php_extensionmgr.html" onclick="javascript:urchinTracker ('/outbound/article/www.cpanel.net');">set up eAccelerator on cPanel using EasyApache3</a>. For caching data rather than compiled scripts, see <a href="http://www.php.net/memcache" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">memcached</a>. <a href="http://www.phpfive.net/php-opcode-caching-with-eaccelerator-article45.htm" onclick="javascript:urchinTracker ('/outbound/article/www.phpfive.net');">[Citation]</a></li>
<li>An alternative caching technique when you have pages that don&#8217;t change too frequently is to cache the HTML output of your PHP pages. Try <a href="http://smarty.php.net/" onclick="javascript:urchinTracker ('/outbound/article/smarty.php.net');">Smarty</a> or <a href="http://pear.php.net/Cache_Lite" onclick="javascript:urchinTracker ('/outbound/article/pear.php.net');">Cache Lite</a>.  <a href="http://phplens.com/phpeverywhere/tuning-apache-php" onclick="javascript:urchinTracker ('/outbound/article/phplens.com');">[Citation]</a></li>
<li>Use isset in place of strlen where possible, e.g. if (strlen($foo) &lt; 5) { echo "Foo is too short"; } vs. if (!isset($foo[4])) { echo "Foo is too short"; }. $foo[4] is the fifth character, so it is only set when the string has at least 5 characters. <a href="http://blog.dynom.nl/archives/String-length-vs-isset-to-check-string-lengths_20070807_5.html" onclick="javascript:urchinTracker ('/outbound/article/blog.dynom.nl');">[Citation]</a></li>
<li>++$i is faster than $i++, so <a href="http://www.hudzilla.org/phpwiki/index.php?title=Pre-increment_where_possible" onclick="javascript:urchinTracker ('/outbound/article/www.hudzilla.org');">use pre-increment where possible</a>. <a href="http://talks.php.net/show/php-best-practices/32" onclick="javascript:urchinTracker ('/outbound/article/talks.php.net');">[Citation]</a></li>
<li>Make use of the countless predefined functions of PHP, don&#8217;t attempt to build your own as the native ones will be far quicker; if you have very time and resource consuming functions, consider writing them as C extensions or modules. <a href="http://talks.php.net/show/php-best-practices/31" onclick="javascript:urchinTracker ('/outbound/article/talks.php.net');">[Citation]</a></li>
<li>Profile your code. A profiler shows you which parts of your code consume how much time. The <a href="http://xdebug.org/" onclick="javascript:urchinTracker ('/outbound/article/xdebug.org');">Xdebug debugger</a> already contains a profiler. Profiling gives you an overview of the bottlenecks. <a href="http://talks.php.net/show/php-best-practices/39" onclick="javascript:urchinTracker ('/outbound/article/talks.php.net');">[Citation]</a></li>
<li>Document your code. <a href="http://talks.php.net/show/php-best-practices/16" onclick="javascript:urchinTracker ('/outbound/article/talks.php.net');">[Citation]</a></li>
<li>Learn the difference between good and bad code. <a href="http://www.sitepoint.com/blogs/2007/05/25/good-and-bad-php-code/" onclick="javascript:urchinTracker ('/outbound/article/www.sitepoint.com');">[Citation]</a></li>
<li>Stick to coding standards, it will make it easier for you to understand other people&#8217;s code and other people will be able to understand yours. <a href="http://talks.php.net/show/php-best-practices/15" onclick="javascript:urchinTracker ('/outbound/article/talks.php.net');">[Citation]</a></li>
<li>Separate code, content and presentation: keep your PHP code separate from your HTML. <a href="http://www.ibm.com/developerworks/library/wa-phprock1/index.html" onclick="javascript:urchinTracker ('/outbound/article/www.ibm.com');">[Citation]</a></li>
<li>Don&#8217;t bother using complex template systems such as Smarty, use the one that&#8217;s included in PHP already, see <a href="http://www.php.net/ob_get_contents" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">ob_get_contents</a> and <a href="http://www.php.net/extract" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">extract</a>, and simply pull the data from your database. <a href="http://www.massassi.com/php/articles/template_engines/" onclick="javascript:urchinTracker ('/outbound/article/www.massassi.com');">[Citation]</a></li>
<li>Never trust variables coming from user land (such as from $_POST): use <a href="http://www.php.net/mysql_real_escape_string" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">mysql_real_escape_string</a> when using MySQL, and <a href="http://www.php.net/htmlspecialchars" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">htmlspecialchars</a> when outputting as HTML. <a href="http://talks.php.net/show/php-best-practices/19" onclick="javascript:urchinTracker ('/outbound/article/talks.php.net');">[Citation]</a></li>
<li>For security reasons never have anything that could expose information about paths, extensions and configuration, such as display_errors or <a href="http://www.php.net/phpinfo" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">phpinfo</a>() in your webroot. <a href="http://talks.php.net/show/php-best-practices/24" onclick="javascript:urchinTracker ('/outbound/article/talks.php.net');">[Citation]</a></li>
<li>Turn off <a href="http://www.php.net/register_globals" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">register_globals</a> (it&#8217;s disabled by default for a reason!). No script at production level should need this enabled as it is a security risk. Fix any scripts that require it on, and fix any scripts that require it off using <a href="http://uk.php.net/manual/en/security.globals.php#82542" onclick="javascript:urchinTracker ('/outbound/article/uk.php.net');">unregister_globals()</a>. Do this now, as it&#8217;s set to be removed in PHP6. <a href="http://talks.php.net/show/php-best-practices/27" onclick="javascript:urchinTracker ('/outbound/article/talks.php.net');">[Citation]</a></li>
<li>Avoid using plain text when storing and evaluating passwords to avoid exposure, instead use a hash, such as an <a href="http://www.php.net/md5" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">md5</a> hash. <a href="http://talks.php.net/show/php-best-practices/28" onclick="javascript:urchinTracker ('/outbound/article/talks.php.net');">[Citation]</a></li>
<li>Use <a href="http://www.php.net/ip2long" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">ip2long</a>() and <a href="http://www.php.net/long2ip" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">long2ip</a>() to store IP addresses as integers instead of strings. <a href="http://blog.rightbrainnetworks.com/2006/09/18/10-things-you-probably-didnt-know-about-php/" onclick="javascript:urchinTracker ('/outbound/article/blog.rightbrainnetworks.com');">[Citation]</a></li>
<li>You can avoid reinventing the wheel by using the PEAR project, giving you existing code of a high standard. <a href="http://www.moskalyuk.com/blog/php-optimization-tips/1272" onclick="javascript:urchinTracker ('/outbound/article/www.moskalyuk.com');">[Citation]</a></li>
<li>When using header('Location: '.$url); remember to follow it with a die(); as the script continues to run even though the redirect header has been sent, or avoid using it altogether where possible. <a href="http://richardlynch.blogspot.com/2007/06/php-header-location-redirect-refresh.html" onclick="javascript:urchinTracker ('/outbound/article/richardlynch.blogspot.com');">[Citation]</a></li>
<li>In <a href="http://www.php.net/oop" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">OOP</a>, if a method can be a <a href="http://en.wikipedia.org/wiki/Method_(computer_science)#Static_methods" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">static method</a>, declare it static. Speed improvement is by a factor of 4. <a href="http://ilia.ws/files/frankfurt_perf.pdf" onclick="javascript:urchinTracker ('/outbound/article/ilia.ws');">[Citation]</a>.</li>
<li>Incrementing a local variable in an OOP method is the fastest, nearly the same as incrementing a local variable in a function. Incrementing a global variable is 2 times slower than a local variable. <a href="http://phplens.com/lens/php-book/optimizing-debugging-php.php" onclick="javascript:urchinTracker ('/outbound/article/phplens.com');">[Citation]</a></li>
<li>Incrementing an object property (eg. $this-&gt;prop++) is 3 times slower than a local variable. <a href="http://phplens.com/lens/php-book/optimizing-debugging-php.php" onclick="javascript:urchinTracker ('/outbound/article/phplens.com');">[Citation]</a></li>
<li>Incrementing an undefined local variable is 9-10 times slower than a pre-initialized one. <a href="http://phplens.com/lens/php-book/optimizing-debugging-php.php" onclick="javascript:urchinTracker ('/outbound/article/phplens.com');">[Citation]</a></li>
<li>Just declaring a global variable without using it in a function slows things down (by about the same amount as incrementing a local var). PHP probably does a check to see if the global exists. <a href="http://phplens.com/lens/php-book/optimizing-debugging-php.php" onclick="javascript:urchinTracker ('/outbound/article/phplens.com');">[Citation]</a></li>
<li>Method invocation appears to be independent of the number of methods defined in the class because I added 10 more methods to the test class (before and after the test method) with no change in performance. <a href="http://phplens.com/lens/php-book/optimizing-debugging-php.php" onclick="javascript:urchinTracker ('/outbound/article/phplens.com');">[Citation]</a></li>
<li>Methods in derived classes run faster than ones defined in the base class. <a href="http://phplens.com/lens/php-book/optimizing-debugging-php.php" onclick="javascript:urchinTracker ('/outbound/article/phplens.com');">[Citation]</a></li>
<li>A function call with one parameter and an empty function body takes about the same time as doing 7-8 $localvar++ operations. A similar method call is of course about 15 $localvar++ operations. <a href="http://phplens.com/lens/php-book/optimizing-debugging-php.php" onclick="javascript:urchinTracker ('/outbound/article/phplens.com');">[Citation]</a></li>
<li>Not everything has to be OOP, often it is just overhead, each method and object call consumes a lot of memory. <a href="http://talks.php.net/show/php-best-practices/33" onclick="javascript:urchinTracker ('/outbound/article/talks.php.net');">[Citation]</a></li>
<li>Never trust user data, escape your strings that you use in SQL queries using <a href="http://www.php.net/mysql_real_escape_string" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">mysql_real_escape_string</a>, instead of mysql_escape_string or addslashes. Also note that if <a href="http://www.php.net/magic_quotes_gpc" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">magic_quotes_gpc</a> is enabled you should use <a href="http://www.php.net/stripslashes" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">stripslashes</a> first. <a href="http://www.jemjabella.co.uk/articles/php-security-tips" onclick="javascript:urchinTracker ('/outbound/article/www.jemjabella.co.uk');">[Citation]</a></li>
<li>Unset your database variables (the password at a minimum), you shouldn&#8217;t need it after you make the database connection.</li>
<li><a href="http://en.wikipedia.org/wiki/RTFM" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">RTFM!</a> PHP offers a <a href="http://www.php.net/manual/" onclick="javascript:urchinTracker ('/outbound/article/www.php.net');">fantastic manual</a>, possibly one of the best out there, which makes it a very hands on language, providing working examples and talking in plain English. Please USE IT! <a href="http://xkcd.com/293/" onclick="javascript:urchinTracker ('/outbound/article/xkcd.com');">[Citation]</a></li>
</ol>
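<p>The pre-calculation tip (number 5) is easy to check on your own server. Here is a minimal sketch, not taken from any of the cited benchmarks: time_it(), loop_with_count() and loop_with_max() are hypothetical names, and the 10,000-element array and 100 runs are arbitrary assumptions.</p>
<pre><code>&lt;?php
// Hypothetical helper (not from the article): time $runs calls of $fn($arg).
function time_it($fn, $arg, $runs = 100)
{
    $start = microtime(true);
    for ($i = 0; $i &lt; $runs; ++$i) {
        call_user_func($fn, $arg);
    }
    return microtime(true) - $start;
}

// count() sits in the loop condition, so it is called on every iteration.
function loop_with_count($array)
{
    $sum = 0;
    for ($x = 0; $x &lt; count($array); ++$x) {
        $sum += $array[$x];
    }
    return $sum;
}

// The upper bound is computed once, before the loop starts.
function loop_with_max($array)
{
    $sum = 0;
    $max = count($array);
    for ($x = 0; $x &lt; $max; ++$x) {
        $sum += $array[$x];
    }
    return $sum;
}

$data = range(1, 10000); // sample data; the size is an assumption

printf("count() in loop: %.4f s\n", time_it('loop_with_count', $data));
printf("precomputed max: %.4f s\n", time_it('loop_with_max', $data));
</code></pre>
<p>Run it from the command line a few times. The absolute numbers depend on your PHP version and hardware, so only the relative difference between the two lines is worth looking at.</p>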
<p>If you still need help, try #PHP on the <a href="http://chat.efnet.org/" onclick="javascript:urchinTracker ('/outbound/article/chat.efnet.org');">EFnet</a> IRC Network. (Read the !rules first).</p>
<p>Also see:</p>
<ul>
<li>an <a href="http://phplens.com/lens/php-book/optimizing-debugging-php.php" target="_blank" onclick="javascript:urchinTracker ('/outbound/article/phplens.com');">Excellent Article</a> about optimizing PHP by John Lim</li>
<li><a href="http://pear.php.net/manual/en/standards.php" onclick="javascript:urchinTracker ('/outbound/article/pear.php.net');">PEAR coding standards</a></li>
<li><a href="http://talks.php.net/show/php-best-practices/" onclick="javascript:urchinTracker ('/outbound/article/talks.php.net');">PHP best practices</a> by ez.no (Use left and right keys to scroll through the pages)</li>
<li><a href="http://phplens.com/phpeverywhere/tuning-apache-php" onclick="javascript:urchinTracker ('/outbound/article/phplens.com');">Tuning Apache and PHP for Speed on Unix</a></li>
<li><a href="http://c2.com/cgi/wiki?PrematureOptimization" onclick="javascript:urchinTracker ('/outbound/article/c2.com');">Premature Optimisation</a></li>
<li><a href="http://ilia.ws/files/frankfurt_perf.pdf" onclick="javascript:urchinTracker ('/outbound/article/ilia.ws');">PHP and Performance</a></li>
<li><a href="http://www.scribd.com/doc/10633/Performance-Tuning-PHP" onclick="javascript:urchinTracker ('/outbound/article/www.scribd.com');">Performance Tuning PHP</a></li>
<li><a href="http://www.ibm.com/developerworks/library/wa-phprock1/index.html" onclick="javascript:urchinTracker ('/outbound/article/www.ibm.com');">Develop rock-solid code in PHP</a></li>
<li><a href="http://www.moskalyuk.com/blog/php-optimization-tips/1272" onclick="javascript:urchinTracker ('/outbound/article/www.moskalyuk.com');">12 PHP optimization tips</a></li>
<li><a href="http://blog.rightbrainnetworks.com/2006/09/18/10-things-you-probably-didnt-know-about-php/" onclick="javascript:urchinTracker ('/outbound/article/blog.rightbrainnetworks.com');">10 things you (probably) didn’t know about PHP</a></li>
</ul>
<p>Think you&#8217;re a PHP guru now? See if you can <a href="http://www.nickhalstead.com/2007/05/23/php-interview-questions-from-yahoo/" onclick="javascript:urchinTracker ('/outbound/article/www.nickhalstead.com');">answer these questions</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hm2k.com/posts/50-php-optimisation-tips-revisited/feed</wfw:commentRss>
		</item>
		<item>
		<title>Geek in the Park 2008</title>
		<link>http://www.hm2k.com/posts/geek-in-the-park-2008</link>
		<comments>http://www.hm2k.com/posts/geek-in-the-park-2008#comments</comments>
		<pubDate>Mon, 19 May 2008 12:34:45 +0000</pubDate>
		<dc:creator>hm2k</dc:creator>
		
		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://www.hm2k.com/?p=199</guid>
		<description><![CDATA[It&#8217;s about this time that geeks start to think about things to do over the summer.
Geek in the Park is a day-long event for geeks (mainly web developers) to have a get together.
The event is going to be held in Jephson Gardens, Leamington Spa, Warwickshire.
The Picnic during the day, followed by The Discussion in the [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s about this time that geeks start to think about things to do over the summer.</p>
<p><a href="http://www.geekinthepark.co.uk/" onclick="javascript:urchinTracker ('/outbound/article/www.geekinthepark.co.uk');">Geek in the Park</a> is a day-long event for geeks (mainly web developers) to have a get together.</p>
<p>The event is going to be held in <a href="http://maps.google.co.uk/maps?f=q&amp;hl=en&amp;q=Jephson+Pl,+Willes+Rd,+Leamington+Spa+CV31,+United+Kingdom&amp;sll=52.793338,-2.147051&amp;sspn=0.008758,0.020084&amp;ie=UTF8&amp;cd=3&amp;geocode=0,52.286316,-1.523581&amp;ll=52.288316,-1.527379&amp;spn=0.002215,0.005021&amp;t=h&amp;z=18" onclick="javascript:urchinTracker ('/outbound/article/maps.google.co.uk');">Jephson Gardens, </a><a href="http://maps.google.co.uk/maps?f=q&amp;hl=en&amp;q=Jephson+Pl,+Willes+Rd,+Leamington+Spa+CV31,+United+Kingdom&amp;sll=52.793338,-2.147051&amp;sspn=0.008758,0.020084&amp;ie=UTF8&amp;cd=3&amp;geocode=0,52.286316,-1.523581&amp;ll=52.288316,-1.527379&amp;spn=0.002215,0.005021&amp;t=h&amp;z=18" onclick="javascript:urchinTracker ('/outbound/article/maps.google.co.uk');">Leamington Spa, Warwickshire</a>.</p>
<blockquote><p>The Picnic during the day, followed by The Discussion in the evening. The event starts on Saturday at noon and everything should wrap up by around 11pm.</p></blockquote>
<p>&#8230;and yes, I will be attending.</p>
<p>So, come see me at Geek in the Park on Saturday 9th August 2008. Put it in your diary!</p>
<p>I aim to get there around mid-day, but have been known to be fashionably late.</p>
<p><span class="url">You&#8217;ll find me listed on both the </span><a href="http://upcoming.yahoo.com/event/665107/" class="url" rel="external" onclick="javascript:urchinTracker ('/outbound/article/upcoming.yahoo.com');">Upcoming</a> and <a href="http://www.facebook.com/event.php?eid=18849565907" class="url" rel="external" onclick="javascript:urchinTracker ('/outbound/article/www.facebook.com');">Facebook</a><span class="url"> event pages.</span></p>
<p>Don&#8217;t forget to visit the <a href="http://2008.geekinthepark.co.uk/" onclick="javascript:urchinTracker ('/outbound/article/2008.geekinthepark.co.uk');">Geek in the Park</a> website and add your email address to the announcements.</p>
<p>For those of you on IRC, you can visit #webdev @ EFnet to discuss it further.</p>
<p><em>See you there!</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.hm2k.com/posts/geek-in-the-park-2008/feed</wfw:commentRss>
		</item>
	</channel>
</rss>
