This site took several weeks to complete. Part of that was because I spent most of my time doing client work, rather than work for my own site. But part of that was because of my vision, and the place my tools left me.
Don't get me wrong. I like WordPress. I think the UI is intuitive, and I use it all the time for building client sites. I can remember the web without your simple CMS systems. I can remember clients calling me up to adjust wording. They didn't like it (I don't work for free, you know) and I wasn't thrilled about being distracted by minor edits. WordPress is a significant improvement over the old way.
But WordPress doesn't have the kind of commitment I'd like to see when it comes to a functioning internet. WordPress itself produces insane code at times, often leaving theme developers in the lurch. Some of its theme functions allow you to substitute your own tags. Others don't.
When it comes to HTML 5 there are two major problems in WordPress 3.x that need fixing. The first is WP's use of the dreaded "rel" attribute on categories. Evidently, the people who put WordPress together didn't see a need to let me keep absolutely pointless attributes out of my code. That means that when you call a function like
the_category();
You get a list of category names that look like this:
<li class="cat-item cat-item-1">
<a href="http://wp.ajseidl.com/?cat=1" title="1 posts filed under Uncategorized" rel="category tag">Uncategorized</a>
</li>
The trouble with that is that in HTML 5, the rel attribute has a specific set of valid values, and "category" isn't one of them ("tag" is, but it's stuck in the same attribute).
So, we have to remove the attribute. The way to do this as a theme developer is by attaching a function to the the_category filter in our functions.php file. That function should look like this:
function ajs_kill_rel($thelist){
return preg_replace('/\srel=".*"/uU','',$thelist);
}
For those of you who aren't as regex-literate, the above will take the rel attributes and their values out of the code WordPress produces when you call the_category(). It will "replace" the instances of rel="whatever is in here" with nothing. That's why the function's called "ajs_kill_rel()."
To make this work, mind you, you still have to add the filter. To do that, you simply add this line to the main portion of the functions.php file:
add_filter('the_category','ajs_kill_rel');
That will work to take the rel off of anything, by the way, so if it's turning up elsewhere uglying up your code, just add it as a filter there as well.
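If stripping every rel attribute is more than you want (rel="nofollow", for instance, is perfectly valid), a narrower filter might look something like the sketch below. The function name ajs_kill_category_rel is made up for illustration, and the pattern assumes WordPress writes the attribute with double quotes and with "category" as the first keyword, as in the output shown earlier.

```php
<?php
// Hypothetical variant of ajs_kill_rel(): only removes rel attributes whose
// value starts with the "category" keyword, leaving others (e.g. nofollow) alone.
// Assumes double-quoted attribute values, as in the sample WordPress output.
function ajs_kill_category_rel($thelist) {
    return preg_replace('/\srel="category[^"]*"/u', '', $thelist);
}

// Stand-alone check, no WordPress required:
$sample = '<a href="/?cat=1" rel="category tag">Uncategorized</a> '
        . '<a href="http://example.com/" rel="nofollow">Elsewhere</a>';
echo ajs_kill_category_rel($sample), "\n";
```

In functions.php you would hook it up the same way, with add_filter('the_category','ajs_kill_category_rel');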
The other major offense committed by WordPress against HTML 5 validity is the treatment of so-called "empty tags." For instance, the old XHTML image tag:
<img src="cool.jpg" alt="Cool" />
That slash there isn't needed anymore. (HTML 5 tolerates it on void elements but gives it no meaning, unless you're serving pages up as application/xhtml+xml, in which case you've got more problems than this post can solve.)
The slash has to go. The function I arrived at looked like this:
function ajs_fix_xhtml($thecontent)
{
    // Replace every XHTML-style "/>" with a plain ">".
    return preg_replace('/\/>/uU','>',$thecontent);
}
Again, for you non-regexperts, this function takes the content (of the page or post) and strips out the /> replacing it instead with >. So the above image goes in like this:
<img src="cool.jpg" alt="Cool" />
And comes out like this:
<img src="cool.jpg" alt="Cool" >
It's worth a moment's note here that this function will also do the same for line-breaks (<br/>) and horizontal rules (<hr/>) and all the other empty tags.
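If the leftover space before the > bothers you, one possible tweak is to swallow the whitespace along with the slash. This is only a sketch: ajs_strip_void_slash is a hypothetical name, and like the original it works on raw text, so it will also rewrite any /> that appears inside inline SVG or MathML, where the slash does mean something.

```php
<?php
// Hypothetical variant: removes the XHTML-style closing slash, plus any
// whitespace in front of it, so <br /> becomes <br> rather than <br >.
function ajs_strip_void_slash($thecontent) {
    return preg_replace('/\s*\/>/u', '>', $thecontent);
}

// Stand-alone check, no WordPress required:
$sample = '<img src="cool.jpg" alt="Cool" /><br/><hr />';
echo ajs_strip_void_slash($sample), "\n";
// prints: <img src="cool.jpg" alt="Cool"><br><hr>
```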
We're not quite done yet. When you add an image into a WordPress blog post or page, and you use the little GUI buttons to make it align to the left, right or center, WordPress adds a class to the image, as well as an align attribute.
Now, while the XHTML empty tag ending is excusable (after all, WordPress is built for XHTML), the align attribute is absolutely not. The align attribute on things like images was deprecated in HTML 4.01! It wasn't valid in HTML 4.01 Strict, or XHTML 1.0 Strict, and it's certainly not valid in HTML 5.
So, it has to go. We can do this on the same function, because these elements will only show up in the content portions of our pages. We add a new line to the function, and now it looks like this:
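function ajs_fix_xhtml($thecontent)
{
    // Replace every XHTML-style "/>" with a plain ">".
    $stripped_imgs = preg_replace('/\/>/uU','>',$thecontent);
    // Remove the align attribute and its value.
    return preg_replace('/\salign=".*"/Uu','',$stripped_imgs);
}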
To make this work, we attach it as a filter to the the_content(); call.
add_filter('the_content','ajs_fix_xhtml');
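Running an align-stripping regex over the whole post means it will also hit any align="..." that turns up outside an image, including in the text itself. If that worries you, here is a rough sketch that only looks inside img tags. The name ajs_strip_img_align is hypothetical, and the code assumes PHP 5.3 or later (for the anonymous function), double-quoted attributes, and no > character inside any attribute value.

```php
<?php
// Hypothetical helper: strips the align attribute from <img> tags only,
// leaving any other markup or text in the post content untouched.
// Assumes double-quoted attribute values and no ">" inside them.
function ajs_strip_img_align($thecontent) {
    return preg_replace_callback(
        '/<img\b[^>]*>/i',
        function ($match) {
            // $match[0] is one complete <img ...> tag.
            return preg_replace('/\salign="[^"]*"/i', '', $match[0]);
        },
        $thecontent
    );
}

// Stand-alone check, no WordPress required:
$sample = '<p>align="left" in text</p>'
        . '<img class="alignleft" align="left" src="cool.jpg" alt="Cool">';
echo ajs_strip_img_align($sample), "\n";
```

The alignleft class survives, so your stylesheet can still float the image. Attached with add_filter('the_content','ajs_strip_img_align'); it would run alongside the filter above.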
And there you have it. That will get your standard WP output back up to snuff. Of course, this won't last long. The minute you try to install a plugin, you'll almost certainly break the validity of your code. But, that's a matter for a different post.