camen design

Targeting Browsers With CSS Using mod_rewrite

Ⅰ. the Need for CSS-Hacks

In a world of imperfect browsers, the CSS-hack can be both a beautiful thing and an ugly thing.

It is unavoidable that at some point some feature or design you wish to implement hits a limitation or parsing / rendering bug of one (or more) browsers you are supporting.

If this has never happened to you; well done—you’re an IE only web-developer who’s never stepped outside of Microsoft software, nor table-based design. You’re probably confused by the term CSS.

If, however, you do not own that privileged position of ignorance, you will sooner or later come across the need to target a particular bit of CSS to a particular browser.

Ⅱ. the Question of Maintenance

There are many arguments about whether CSS-hacks should be totally avoided, embraced, or properly managed somehow.

Firstly, to think they can be avoided is naïve. I have a website that only supports Acid2 compliant browsers, uses CSS3 liberally, and doesn’t support IE one iota—and still I have CSS-hacks to deal with browser idiosyncrasies.

The next two lines of argument are either a) just embed hacks in your normal document, since it’s less maintenance to only edit one document, or b) keep hacks in a separate file where they can be managed individually, ideally using conditional comments to target IE specifically or Javascript to correct behaviour.

Embedding CSS-hacks in stylesheets increases maintenance by making them an integral part of testing in all browsers. CSS designed to correct one browser may have undesired effects on another browser. Or a hack that has been used before, may no longer work on a newer version of a browser, but now you have to scramble to somehow support both browsers.

A hack is anything that has the potential to fail disastrously
each time something changes expectedly.

Having to make corrections to your whole stylesheet to please one browser is unnecessary maintenance. Having all your failure points in one document is not less maintenance. I have less than fond memories of having to rewrite large portions of a stylesheet because I was unable to please one browser without affecting others.

The answer therefore is to tie CSS-hacks, or at least alternative CSS, directly to the browser it fixes, so that it does not become a ticking time bomb for other browsers and future versions.

In the case of IE, conditional comments give very precise targeting capabilities. Conditional comments are an excellent solution for applying alternative CSS, rather than relying on CSS-hacks preying on broken parsing. However, there are a couple of let-downs:

Having to maintain HTML because of a browser is like suddenly waking up one day to discover that the letter Z has been officially deprecated by the European Union and now you have to update all your documents to conform.

Your HTML, like diamonds, should be forever.

Lastly, using a Javascript fix is only increasing the length of thread you have to follow when something breaks. If suddenly you find that a CSS property you apply does not have the expected behaviour in IE, but gives different results with the Javascript fix on, and then off: you now have a maintenance nightmare that involves back-tracking through ‘x’-tons of Javascript, that likely, you didn’t write yourself.

What if a new version of a browser comes out and your Javascript fix doesn’t work with that browser, but also, the site design won’t render correctly without it? Now you’re forced to either remove the Javascript and do the CSS again, properly, from scratch; or update the Javascript with all the work that entails.

A solution is needed that:

  1. Works for all browsers and does not rely on browser-specific capabilities to implement
  2. Negates the need for CSS-hacks, and instead encourages alternative CSS
  3. Targets browsers/versions specifically or according to a range
  4. Does not fail when a new browser or version arrives on the scene

Ⅲ. the Answer Is Still Targeting

We all know that user-agent targeting is bad. It defeats the purpose of having standards to work across all browsers. However, thus far, we have, by proxy, ascertained that there are two types of CSS:

The maintenance problems so far outlined are because there is no clean separation of these two—except for IE conditional comments, with the acknowledgement that they are an incomplete solution.

We have also ascertained that browser-specific CSS is eventually required given imperfect browsers. You can either choose to compromise on your design to avoid this, or compromise on the browser and keep the design. Either way, the end-user should never be compromised on.

It’s important to state that writing standards-compliant, proper CSS is always desired. Legacy browsers are exactly that: legacy. Therefore a solution to dividing the two types of CSS must primarily acknowledge that proper CSS comes first and does not require any maintenance going forward. The current solutions mentioned earlier do not give weight to the better CSS over the bad CSS. Basically, with a perfect solution, a perfect browser would never parse any hack or CSS intended for only one browser—it would be ignorant of legacy browsers.

The important thing with User-Agent targeting is the ability to also not target user-agents. Some believe user-agent targeting to be a form of quicksand that has you chasing after every browser ever made.

A browser that applies the standards, and requires no browser-specific CSS to render the design, does not need to be targeted. No solution is actually needed.

My proposal is therefore a solution only for legacy browsers. If a browser does not need browser-specific CSS to get it to render a design right, then it requires zero maintenance on the part of this solution. The solution does not interfere with browsers that do not need it.

The Solution

Apache’s mod_rewrite module allows us to take a file requested by the browser, and provide a different file instead, based on some matching or non-matching criteria.

What if we had a “fixes.css” file that was different for each browser that needs to be targeted, but void by default so that any browser that doesn’t need fixes, doesn’t get any?

We start with two stylesheets declared in the HTML head element:

<link rel="stylesheet" type="text/css" href="css/main.css" />
<link rel="stylesheet" type="text/css" href="css/fixes.css" />

You write your main stylesheet using absolutely no CSS-hacks or browser-specific workarounds. Write for the future. As more and more browsers come on board with Acid 2 / 3, CSS3 compatibility &c. they should automatically fall into working with your site—zero maintenance needed.

Remember, this mod_rewrite solution is only to bring non-compliant browsers into spec. Ideally, for example, you’d write your stylesheet to meet Acid 2 capabilities and when IE8 is released the site just works, and the IE7 workarounds stay with IE7.

The actual ‘fixes.css’ file itself is, therefore, completely blank.

In your ‘.htaccess’ file use the following two lines to target a browser and rewrite the ‘fixes.css’ file to a browser-specific version: (IE7 in this example)

(Assuming you’ve already got “RewriteEngine on” in your ‘.htaccess’ file and “RewriteBase” if needed.)
RewriteCond %{HTTP_USER_AGENT} MSIE\s7\.(?!.*Opera)
RewriteRule ^(.*fixes)\.css$ /$1.ie7.css [L]

This selects a user-agent containing MSIE 7. as long as it’s not followed by Opera at some point (to ignore Opera spoofing as IE). The ‘fixes.css’ file is then rewritten to a ‘fixes.ie7.css’ file where you may store the IE7 specific fixes you need to apply.

Now users of IE7 will receive a different ‘fixes.css’ than users of any other browser.

It’s a bit like having a user style on your own website. You can apply a CSS “patch” to the site for specific browsers, keeping your main stylesheet legacy free and future-facing.
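
If you want to check the condition and rule above without touching a live server, here is a rough sketch in Python. It is not part of the ‘.htaccess’ solution itself: the function name `rewrite_fixes` and the sample user-agent strings are made up for illustration, and it only models a single rewrite pass.

```python
import re

# The condition and rule from the '.htaccess' example, as Python regexes.
IE7 = re.compile(r"MSIE\s7\.(?!.*Opera)")
FIXES = re.compile(r"^(.*fixes)\.css$")

def rewrite_fixes(path, user_agent):
    """Return the path served for `path` (hypothetical helper, one pass only)."""
    match = FIXES.match(path)
    if match and IE7.search(user_agent):
        return "/" + match.group(1) + ".ie7.css"
    return "/" + path

# Illustrative user-agent strings, not captured from real traffic.
UA_IE7 = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
UA_OPERA_AS_IE = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; en) Opera 9.50"
UA_FIREFOX = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.5) "
              "Gecko/2008120122 Firefox/3.0.5")

print(rewrite_fixes("css/fixes.css", UA_IE7))          # -> /css/fixes.ie7.css
print(rewrite_fixes("css/fixes.css", UA_OPERA_AS_IE))  # -> /css/fixes.css
print(rewrite_fixes("css/fixes.css", UA_FIREFOX))      # -> /css/fixes.css
```

Only the genuine IE7 string is sent to the IE7 sheet; Opera pretending to be IE, and everything else, gets the blank ‘fixes.css’.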

Potential Uses

In short, old browsers shouldn’t prevent you from using the latest and greatest that up-to-date browsers offer. It’s up to you to keep legacy where it belongs.

Addressing Concerns

“Aha!”, I hear you say. “But surely this creates more maintenance, now we have many CSS files!”

With this solution, when you make a browser-specific correction, you only need to test in that browser itself. Less maintenance. Without this solution, a browser-specific fix also gets parsed by all browsers, and you need to test them as well. More maintenance.

“But now you have to maintain the ‘.htaccess’ file!”

The more capable browsers get, the less browser-specific CSS you will need, if any at all. An old browser doesn’t stop being an old browser. As long as you’re not changing the HTML, this solution can safely bit-rot without ever being looked at.

This solution also allows you to react smoothly and quickly to an unexpected regression between versions of browsers. Imagine you try out a beta browser like Opera 10 or Firefox 3.1 and you suddenly find your design breaks with it. You can—without touching a line of your existing main stylesheet—add two lines to the ‘.htaccess’ file to detect the new browser, and then develop, on a new sheet, the necessary fixes without compromising any of the existing CSS. You can then choose to leave that as is, or then once working roll it back into the main sheet. This is significantly less panic-prone than having to apply fixes for an unreleased browser to a live, perfectly working stylesheet.

Imagine you want to stop supporting a legacy browser. With this method, you remove the two lines in the ‘.htaccess’ file, delete the specific ‘fixes.css’ file and that’s it—browser unsupported. Without this solution, you would have to manually unpick the CSS-hacks from your main stylesheet and slowly rub-out any influence that browser had on the design of the CSS. Sometimes the only way to truly remove a browser’s influence from a large bit of CSS is to simply rewrite it (due to all the extra capabilities opened up with newer versions that you should now be making use of).

“What about people/browsers spoofing user-agents?”

Not a problem. Seriously. Writing the right regex to spot lame attempts at spoofing is not difficult. Many browsers add tokens such as “like Gecko” or “KHTML” to their user-agent despite this being patently false. It’s becoming more common even to add Firefox into Gecko browsers given Firefox’s weight in the market. It’s sad, really, that some have given in to this pathetic behaviour.

If you’re literally browsing with a completely false user-agent by default, you are a) 0.01% of the population and b) have a personality disorder, or something. Spoofing has its uses, but it’s legacy as well and should be sent the way of the dodo (if browser vendors stop with this ridiculous behaviour of saying they’re engines they are nothing like).

In other words, the answer is: learn regex.

The HTTP-Request Tax

I hate wasting HTTP-Requests. If you want a fast site, it’s not bandwidth or file size that matters; no, it’s the number of HTTP-Requests. Each one can add a delay of up to 1–2 seconds in the worst cases.

So there’s a severe flaw with this solution: the browser has to fetch two CSS files, even if the browser doesn’t need any fixes. This is a tax on good behaviour!

There is a workaround for this that presents only one stylesheet to compliant browsers, but two stylesheets to those browsers that need fixes.

If you present just one stylesheet in the header:

<link rel="stylesheet" type="text/css" href="css/main.css" />

And then, in the ‘.htaccess’ file, redirect ‘main.css’ to the fix sheets, for those legacy browsers:

# (?<!_) skips the imported ‘_main.css’, so it is not rewritten to a fixes sheet again
RewriteCond %{HTTP_USER_AGENT} MSIE\s7\.(?!.*Opera)
RewriteRule ^(.*)(?<!_)main\.css$ /$1fixes.ie7.css [L]
# ...

# after legacy fixes, redirect ‘main.css’ to the actual main sheet ‘_main.css’
RewriteRule ^(.*)(?<!_)main\.css$ /$1_main.css [L]

And at the top of each fixes sheet, just simply import the main stylesheet.
(The main sheet uses a dummy name to prevent the @import from getting stuck in an infinite loop.)

@import "_main.css";
/* browser specific fixes here... */

This way, a browser that needs correction gets the fixes sheet first (with the main stylesheet imported at the top) and a browser that doesn’t need any fixes just gets the main sheet alone.
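
To see which file each kind of browser ends up with, here is a rough Python sketch of the two rules. It assumes the `(?<!_)` guard is applied to the legacy rule as well as the final one, so that the imported ‘_main.css’ is never rewritten again; `resolve` and the user-agent strings are hypothetical, and the sketch makes a single pass rather than re-running the rules the way mod_rewrite does.

```python
import re

IE7 = re.compile(r"MSIE\s7\.(?!.*Opera)")
# Matches 'main.css' in any folder, but not '_main.css'.
MAIN = re.compile(r"^(.*)(?<!_)main\.css$")

def resolve(path, user_agent):
    """Return the file served for a requested path (hypothetical helper)."""
    match = MAIN.match(path)
    if match is None:
        return "/" + path                          # e.g. the imported '_main.css'
    if IE7.search(user_agent):
        return "/" + match.group(1) + "fixes.ie7.css"
    return "/" + match.group(1) + "_main.css"

# Illustrative user-agent strings.
UA_IE7 = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
UA_OTHER = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.5) "
            "Gecko/2008120122 Firefox/3.0.5")

print(resolve("css/main.css", UA_IE7))     # -> /css/fixes.ie7.css
print(resolve("css/_main.css", UA_IE7))    # -> /css/_main.css
print(resolve("css/main.css", UA_OTHER))   # -> /css/_main.css
```

IE7 makes two requests (the fixes sheet, then the import), everybody else makes one.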

Example Reference

For those with worn C and V keys, here are some additional rewrite conditions to select particular browsers (add the RewriteRule according to your chosen method). If you’ve any suggestions for good examples, do send them my way.

If you want some sample user-agent strings to work out the right regex needed for a particular browser, here’s a good list.

Internet Explorer (Any Version):

RewriteCond %{HTTP_USER_AGENT} MSIE\s\d\.(?!.*Opera)

Change \d to any number to select a particular version

Internet Explorer, Version ‘X’ and Below:

RewriteCond %{HTTP_USER_AGENT} MSIE\s(\d)\.(?!.*Opera)
RewriteCond %1 <=7

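Change the <= to >= to target a particular version and above. Note that mod_rewrite compares these values as strings rather than numbers, which is fine here because the pattern captures a single digit.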
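
As a rough illustration of the “version X and below” pair of conditions, here is the same logic in Python: capture the version digit, then compare it as a string. The helper name `ie_at_most` and the sample user-agent strings are assumptions for the sake of the example.

```python
import re

IE_VERSION = re.compile(r"MSIE\s(\d)\.(?!.*Opera)")

def ie_at_most(user_agent, max_version="7"):
    """True for IE up to max_version; a string comparison on one digit."""
    match = IE_VERSION.search(user_agent)
    return bool(match) and match.group(1) <= max_version

# Illustrative user-agent strings.
UA_IE6 = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
UA_IE7 = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
UA_IE8 = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0)"

print(ie_at_most(UA_IE6), ie_at_most(UA_IE7), ie_at_most(UA_IE8))
# -> True True False
```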

Firefox, Versions Before 3.0:

RewriteCond %{HTTP_USER_AGENT} Gecko/(\d{8})(?!.*Opera)
RewriteCond %1 <20080529

Do not ever try to detect Firefox using “Firefox”. Remember that Gecko, the rendering engine that powers Firefox, is used in many other browsers too that would have equal capabilities, such as Camino. Many browsers now spoof as Firefox too, so always detect using the Gecko date of the browser in the user-agent.
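
The Gecko date check can be sketched the same way in Python. The helper `gecko_before` and the sample strings are illustrative only. Because both sides are eight digits, comparing them as strings gives the same answer as comparing them as dates.

```python
import re

GECKO_DATE = re.compile(r"Gecko/(\d{8})(?!.*Opera)")

def gecko_before(user_agent, build_date="20080529"):
    """True when the Gecko build date is earlier than build_date."""
    match = GECKO_DATE.search(user_agent)
    return bool(match) and match.group(1) < build_date

# Illustrative user-agent strings.
UA_FX2 = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) "
          "Gecko/20061204 Firefox/2.0.0.1")
UA_FX3 = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9) "
          "Gecko/2008052906 Firefox/3.0")

print(gecko_before(UA_FX2), gecko_before(UA_FX3))  # -> True False
```

One caveat worth testing for yourself: the date is when the build was compiled, so a late point release of an older branch may carry a date after this cut-off. Treat the date as a rough guide rather than a guarantee.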

iPhone:

RewriteCond %{HTTP_USER_AGENT} iPhone|iPod

Enjoy.

ReMarkable!

ReMarkable is my own Markdown-like syntax, used to write and publish the content on this site.
It is a plain-text method of writing articles that is then converted into HTML.

To skip the talk and see it in use, you can add “.rem” to the end of any article on this site to see the original ReMarkable source for the article. For example: “/code/remarkable.rem”.

Version History

v0.2.4—31st December ’08
  • Added testing script to verify output (incomplete)
  • **strong** will now give <strong>*strong*</strong> instead of <strong>*strong</strong>*
  • Lines being conjoined even if word-wrapping was off
  • Correctly indent small blocks as paragraphs, also allow small blocks within lists/blockquotes
v0.2.3.1—17th December ’08
  • Links were not working, bug introduced in previous version.
    I need to implement some unit tests or something
v0.2.3—17th December ’08
  • nofollow support for links, prefix URLs with “!”
  • PRE blocks now allowed within lists, definition lists and blockquotes
  • Blockquotes were not converting if they were the last thing in a document
v0.2.2—14th December ’08
  • Soft-space as newline marker has been removed. Too many problems involved with it being invisible unless your text editor has white-space on and differentiates spaces from soft-spaces. Instead, use the ‘not’ character “¬”. This is located directly on the keyboard on PCs and obtained via Alt+L on Mac.
v0.2.1—11th December ’08
  • Some HTML tidying not working when $indent>0
v0.2
  • Initial Beta Release

Why ReMarkable?

I have a lot of HTML. In writing and tweaking the content on this site, I found I was getting bogged down in a lot of HTML tags for relatively minor things, such as links, abbreviations and citations.

Though I love writing in HTML, I wanted to reduce this complexity and focus more on the content than the markup.

The first place I turned to was Markdown, probably the most widely known plain-text formatter. John Gruber wrote Markdown for his site, daringfireball.net and to suit his way of writing articles.

There were a number of shortcomings with Markdown when placed against my site:

Working around these concerns of mine could have been possible by using another Markdown clone that would be easier for me to customise. A project called PHPMarkdown implements a Markdown parser in PHP. It also adds to the syntax, providing additions like abbreviations and definition lists.

There’s one big problem with PHPMarkdown though. Its size. My site’s PHP is almost 600 lines long. PHPMarkdown is almost 3’000 lines long. To include PHPMarkdown in my site would be like strapping an elephant to a flea and asking it to jump a canyon.

Really then, what I knew I had to do was to write my own Markdown clone, in my own particular demeanour. I could cherry-pick the best syntax I wanted, but ultimately make it as artful as Camen Design is itself.

Features

At a glance, compared to Markdown, ReMarkable has:

ⅰ. More Inline Syntax:

ReMarkable adds syntax for {abbr|abbreviations}, ((small text)), ~citations~, [insertions], --deletions-- and «inline quotations».

ⅱ. Definition Lists:

The PHPMarkdown syntax doesn’t allow for optional descriptions, whereas the HTML spec and ReMarkable do.

:: Definition Term
	Description…
	
:: Definition Term 2

:: Definition Term 3
	Description…

You can see this structure put to good use in this article.

ⅲ. IDs in Headings:

An idea taken from PHPMarkdown, for the benefit of writing a table of contents, or others linking to specific parts of an article, headings can have an HTML ID like so:

# Heading Level 1 # (#id)

Heading Level 2 (#id2)
======================
Heading Level 3 (#id3)
----------------------
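
To give a feel for how little is needed to pick the ID off a heading line, here is a small Python sketch. It is not ReMarkable’s actual code (that is PHP, and does much more); the function name `split_heading_id` and the exact pattern are assumptions made for illustration.

```python
import re

# Trailing "(#id)" on a heading line; the ID characters allowed here are a guess.
HEADING_ID = re.compile(r"^(.*?)\s*\(#([\w-]+)\)\s*$")

def split_heading_id(line):
    """Return (heading text, id), with id None when no "(#id)" suffix is present."""
    match = HEADING_ID.match(line)
    if match:
        return match.group(1), match.group(2)
    return line, None

print(split_heading_id("Heading Level 2 (#id2)"))  # -> ('Heading Level 2', 'id2')
print(split_heading_id("Heading without an ID"))   # -> ('Heading without an ID', None)
```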

Links with absolute URLs are automatically marked up with rel="external".

Hyperlinks directly to a file have the type="mime/type" attribute added automatically for the most common file types. ReMarkable has an easy to modify internal list of these types automatically recognised.

The benefit of this is that a) you should be doing this anyway, and b) you can use CSS mime-type icons like this.
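
The two link rules are simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not how ReMarkable does it: `link_attributes` is a hypothetical helper, and it leans on Python’s standard `mimetypes` table where ReMarkable uses its own internal list.

```python
import mimetypes
from urllib.parse import urlparse

def link_attributes(href):
    """Return the extra attributes a link to `href` would be given."""
    attrs = {}
    if urlparse(href).netloc:                 # absolute URL: it names a host
        attrs["rel"] = "external"
    mime, _encoding = mimetypes.guess_type(href)
    if mime:                                  # recognised file extension
        attrs["type"] = mime
    return attrs

print(link_attributes("http://example.com/"))  # -> {'rel': 'external'}
print(link_attributes("/files/paper.pdf"))     # -> {'type': 'application/pdf'}
```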

ⅴ. Intelligent List Paragraphing:

ReMarkable intelligently adds <p> tags to <li> items.
A normal list:

•	Item 1
•	Item 2
•	Item 3

Produces:

<ul>
	<li>Item 1</li>
	<li>Item 2</li>
	<li>Item 3</li>
</ul>

If any list item contains more than one paragraph, or any other block such as another list or blockquote, paragraphs are automatically added.

•	Item 1
•	Lorem ipsum dolor sit amet, consectetur adipisicing elit.
	
	sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
•	Item 3

Produces:

<ul>
	<li>Item 1</li>
	<li>
		<p>
			Lorem ipsum dolor sit amet, consectetur adipisicing elit.
		</p><p>
			sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
		</p>
	</li>
	<li>Item 3</li>
</ul>

But, if you put blank lines between the list items:

•	Item 1

•	Item 2

•	Item 3

ReMarkable adds paragraph tags for all list items:

<ul>
	<li><p>Item 1</p></li>
	<li><p>Item 2</p></li>
	<li><p>Item 3</p></li>
</ul>

Lastly, if a list item contains no space between the opening text, and a list within the list item, ReMarkable does not add the first paragraph. Why? For a table of contents list:

1.	Features
	1.1.	More Inline Syntax:
	1.2.	Definition Lists:

Produces:

<ol>
	<li>
		Features
		<ol>
			<li>More Inline Syntax:</li>
			<li>Definition Lists:</li>
		</ol>

	</li>
</ol>

Notice Features is not wrapped in a paragraph.
ReMarkable does all of these conversion cases using only one regex replace.
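
For comparison, here is the paragraphing rule written out the long way in Python. It is a deliberately naive sketch, not ReMarkable’s single regex: it covers only the first three cases (plain items, an item with several paragraphs, and blank lines between items), ignores nested lists, and skips the pretty indentation. `render_list` and its arguments are made up for illustration.

```python
def render_list(items, loose=False):
    """Render an unordered list.

    items: one entry per list item, each a list of paragraph strings.
    loose: True when the source had blank lines between the items.
    """
    lines = ["<ul>"]
    for paragraphs in items:
        if loose or len(paragraphs) > 1:
            # Several blocks, or a loose list: every paragraph gets a <p>.
            body = "".join("<p>" + text + "</p>" for text in paragraphs)
        else:
            # A single short item stays bare.
            body = paragraphs[0]
        lines.append("\t<li>" + body + "</li>")
    lines.append("</ul>")
    return "\n".join(lines)

print(render_list([["Item 1"], ["Lorem ipsum", "sed do eiusmod"], ["Item 3"]]))
print(render_list([["Item 1"], ["Item 2"]], loose=True))
```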

ⅵ. Human Readable Output:

In order for ReMarkable to be acceptable, it had to replace my hand-written HTML.
ReMarkable outputs clean and organised HTML and does perfect word wrapping so that when you view-source you don’t have to scroll sideways.

ReMarkable can also indent the whole output to your liking, so you can fit it into your blog template. <pre> blocks are intelligently unindented so that your code samples don’t break trying to fit ReMarkable output into your site design.

Remarkable Code

Whilst not on exact feature parity with PHPMarkdown, ReMarkable achieves its design with very compact code.

PHPMarkdown is spread across 120 functions composing two classes.
ReMarkable is nearly 400 lines long in just one function.

Lists, definition lists and blockquotes are recursively converted (including all the conversion cases mentioned above) in two lines of code. Needless to say, the regex kills a kitten each time you run ReMarkable.

ReMarkable’s code has been a real labour of love. I have tried to make sure it is well commented, but the tricks being used to reduce the amount of looping, and to achieve all that it does within one function can be difficult to understand. A future Under the Hood article will do a more detailed breakdown of the PHP used.

What’s Next?

The goal with ReMarkable was to be able to publish my entire site’s contents. That has been achieved.

Now ReMarkable is in your hands. I’m sure you’ll find a million bugs I never noticed. I don’t write articles the same way you do. Send bugs and suggestions to kroccamen@gmail.com

What’s Planned

There are a number of shortcomings of ReMarkable that I’m aware of and will be tackling at various stages of the future (as Camen Design’s needs demand, really).

Autotext

For the benefit of being a) lazy and b) writing on systems where Unicode is not so easy to type, I’d like ReMarkable to integrate a SmartyPants like auto-text converter to add smart quotes and other typographical highlights like en & em dashes. Also things like automatically adding <sup> to number ordinals.

Autotext would also include syntax to automatically generate a table of contents list from ReMarkable-headings with IDs

Syntax highlighting for PRE blocks

This won’t be rolled into ReMarkable itself, but instead be a separate add-on module (which will also allow you to use a different, better syntax-highlighter than my own). At the moment I’m manually marking up my code examples as there is no current syntax-highlighter that is small enough and able to deal with my website’s position of having no classes.

I intend to write a tiny and simple syntax highlighter, one that works without classes, to automatically mark up (not just colour) my code samples.

Tables?
Syntax for writing tables will be difficult to add in an elegant way. A lot of thought will have to be put into this as it must maintain clean markup and work primarily without classes. (It’s a sin of software to force the artist to use a particular class-name, when that is their decision and shouldn’t be taken out of their hands). How would you like it if you bought a canvas to paint on, but the canvas mandated that you must paint only fruit?

Any suggestions for these features, though, are always welcome.


Priorities

That said however, ReMarkable was not designed to ever be all things to all people. In accepting suggestions and fixes into my version of the code, these are ReMarkable’s priorities:

HTML5, UTF-8. No classes
If you need every HTML tag to have a class or ID you need to learn how to write better code.
ReMarkable doesn’t output any HTML5 specific tags, so it’s safe to use in HTML4
Sloppy writing is not a fringe case

ReMarkable will never allow escaping of characters. That’s a cop out to avoid properly designing the syntax.
Any weird text you’re writing that’s firing off ReMarkable syntax is more than likely a computer term or technical quote and should be wrapped in `code` spans.

For example: I copied the ~luser folder to ~root would be recognised as a citation between the two tildes. Regardless of any literal readability, these folder names (including the tilde) are not part of the English language. Use I copied the `~luser` folder to `~root` instead.

If the bug can be solved by following the syntax documentation correctly, or by changing two or three letters in your source, then it’s not a bug.

Code is Art
If I can’t maintain elegance and beauty in the code then it can’t be worth it.

Enjoy.

How to Learn HTML5

I received this email:

Hi Kroc,

I stumbled on your website today and was quite impressed with quite a few different things about it (the design, the tone of your writing, &c.).

The one thing in particular, and something I wanted to question you on, was your reference to HTML5. Something I know nothing about. In the post I came across you talked about a guy named Sam Ruby and how he referred to it as minimalist code. Anyway that’s beside the point. I’m really interested in using no IDs and/or classes, but know nothing about HTML5.

So here’s my question, do you recommend any other resources or tutorials for someone like me? I rely very heavily on classes and my stuff is sloppy for the most part. I’d love to convert my site but have no idea where to start.

Any advice or words of wisdom would be greatly appreciated. Thanks in advance.

Joe Holst

Hello Sir,

You only require one thing, and then to do three things in order to succeed in your goals here.

Firstly, you need to get some willpower. You already seem to have that, as you’ve taken interest to email me to ask about where to start. Without the want to write better code, no learning in the world can help!

It should be noted that absolutely nothing I’m doing is undocumented; in fact, none of the code on my website is even special. All it is, is the representation of my personal drive for quality, as I measure it.

I don’t think willpower is going to be a problem for you if you already twig that HTML5 and cleaner code is the direction you might want to go. Some people don’t ever progress beyond writing IE-only junk, and that’s not down to skill level, it’s a willpower problem.

Secondly, you need to do three things:

  1. revise,
  2. revise,
  3. revise.

I proof-read and break my code over and over again, to chip away what niggles me here and there. Getting rid of classes is a one-by-one process, because each class has a use that may be totally unrelated to the other classes in the document. Each class is its own problem, some big, some small, some requiring a complete rewrite even.

I’m quite happy to break my entire codebase for a week—rewriting and reorganising everything—just to get one tiny annoyance out of sight and out of mind.

If you proof read your codebase you’ll spot various things that could be better, all of varying difficulty levels, some maybe without a clear solution. Start by picking something that annoys you, that you know could be done better, and fix it… even if it’s just tidying up a comment so that it looks nicer. Polish your code.

That is how I work. I scan through my code and look at it objectively. I think about where I can make it cleaner, tidier, less complicated, better documented, easier for others to understand; anything that catches my eye.

I pick something that I personally feel motivated toward (I have no boss on my personal site, so I only have to fix what I care about; and in that, the quality, through passion, is maintained) and I set about fixing it. Sometimes that’s a big problem, like what I’m working on right now: ReMarkable, a Markdown clone, so as to reduce the amount of HTML I’m writing for minor things like abbreviations, links and citations. Sometimes it’s a small thing like shaving a few lines off here and there. I do whatever my heart feels capable of doing that day. I try never to strain myself by doing work I’m not interested in; that’s for secular work.

Once I’ve made my fix, I go through the whole process again. Often whilst I’m implementing one fix it unearths other annoyances that I want to solve and I’ll either get distracted and go off and fix those, or wait until I’ve finished the thing I started on and be well set with another task to do afterwards—with a much stronger understanding of the problem to drive me forward with the design issues.

O, the design issues.

I am a slave to my code sometimes. I will not accept a sloppy solution. I pace around a lot. I wrestle with the architectural design of the code in my mind for days. I spend 100 hours writing 29 lines of code. Because to me, good code is not about how l33t your programming knowledge is, good code is about how much you rethink what you’re trying to do. What cohesive statement are you trying to make with your code?

Sometimes your code is not just about trying to solve a problem,
it’s trying to solve a problem using your personality.

But getting back to HTML5

Before you learn HTML5, first learn HTML4. I know that sounds stupid; you may very well think you already know HTML4. In learning HTML5, I first referred to this list. Apart from the deprecated ones, do you know how to use all of them? I found that there were a number of HTML4 tags that I rarely used, that could easily replace sloppy <div>s and classes.

I see this often, and it really annoys me.

<div id="header">
	<h1>Title</h1>
</div>

Elements that are not <div>s can be styled just the same as a <div>. A <h1> is not somehow magically unable to have borders or backgrounds or margins or padding or anything a <div> can.

People seem to get this mindset that only <div>s can be used as boxes.
Here’s a list of elements you can use instead of <div>s.

P, BLOCKQUOTE (with P inside), H1/H2/H3…, ADDRESS, DL/DT/DD, UL/OL/LI, HR

(Note: You can’t put just any block level element inside another, i.e. you can’t have a H1 inside a P or vice-versa)

All of these can be styled with any effect, meaning that you can get rid of a <div> and/or class by just referring to the element directly, or by its parent.

For example:

<h1>Website Title</h1>
<ol id="menu">
	<li>menu 1</li>
	<li>menu 2</li>
</ol>
<h2>Article title</h2>

Is better than

<div id="title">Website Title</div>
<div id="menu">
	<div class="menuitem">menu 1</div>
	<div class="menuitem">menu 2</div>
</div>
<div class="title2">Article title</div>

And for that matter, why would you have a class for a menu item, when the menu is perfectly identifiable? Even in this bad example, you could still get rid of the menuitem class, and just refer to them with #menu div {...}.

An <ol> makes a perfect menu. It is after all, an Ordered List. Get to know what each of the element names means, how you would think of that in a standard word processor document, and then how that can be applied to your site, imagining your site as a word processor document without any CSS. Your menu would be a table of contents of sorts, and therefore would obviously be an Ordered List.

Here’s a different example I helped someone with:

<div id="leftcol">
	<h2 class="blue">Recent Project</h2>

</div>
<div id="rightcol">
	<h2 class="green">News Updates</h2>

</div>

He wanted the headings on the left blue, and the headings on the right, green. Which is fair enough.
However, could you not just select the column, and do away with the need for the classes?

/* as a rough example */
#leftcol h2	{color: blue;}
#rightcol h2	{color: green;}

Knowing, and using more elements, instead of resorting to <div>s all the time, allows you to use CSS to select those elements more widely as well as specifically. Here, both of these are <h2>s. Now if they were <div>s, we would have to use classes, because there would likely be yet more <div>s in the columns and you couldn’t say #leftcol div without turning many things blue or green instead of just what you wanted.

Getting rid of <span>s with classes requires knowing the meaning of the many inline elements. Google them.

ABBR, ACRONYM, BIG, CITE, CODE, DEL, DFN, EM, INS, KBD, Q, SAMP, SMALL, STRONG, SUB, SUP, VAR

Think of these elements outside of the browser, on a printed piece of paper. You can then bombard them with CSS to make them look like anything (even if they don’t look anything like what they’re supposed to represent) but their use will be semantically sound, having the right meaning in your website.


Learn the selectors.

If you know the selectors and the tags well enough, then you only need a class (or ID) when you cannot differentiate two elements from each other with the browser you are supporting. Since I’m using CSS3 and not supporting IE at all, I don’t need any classes because I’ve made the right choice of tags, and can differentiate all of them with the right CSS selectors.

If you want clean code using few, if any, classes then right away ditch IE6. Stop supporting it, tell people to upgrade. Without + and > selectors, IE6 is too frightened to go anywhere that isn’t within sight of a class or ID.

IE7 does support + and >, and whilst it lacks in many other areas, it has the necessary basics to write good HTML/CSS. Check what your targeted browsers support.


Once you have made a decent HTML4 site, then you will look at the HTML5 specification, and it will make sense—you will know what to do with it.


Kind regards,

How to Use <abbr> in HTML5, and in General

Before I begin, I should profess that I am completely accountable for having never followed any of these rules in the past. However, the whole reason for writing this article was to solve that problem. Since moving to my new website back-end, I decided to go through the entire site’s contents with a fine brush and polish all of the code.

In doing that, I discovered how vague I was on the semantics of the abbr element, and working through all the test-cases that have sprung up in the wealth of HTML I’ve written for this site, I’ve documented here my new understanding of the often-abused abbr element.

Ⅰ. Abbreviations Are Not Dictionary Definitions

Let’s first define abbreviation clearly:

An abbreviation is where you have shortened one or more words into:
either one word, or an alternative phrase or acronym

The problem with the use of <abbr> so far, has been that developers have assumed that every abbreviation and acronym has had to be defined in full. This is incorrect.

BAD:	I made some <abbr title="American Standard Code for Information Interchange">ASCII</abbr> art.

An abbr element expands its contents into the desired spoken form. When you read a document, you naturally expand the abbreviations as appropriate in your mind.

GOOD:	Red <abbr title="versus">vs.</abbr> Blue

You would not read out aloud the abbreviated “et cetera” in “Granny went to the market and bought apples, bread & milk etc.” as “eee-tee-see”? So as it should be with HTML abbreviations. Here are some examples:

BAD:	My <abbr title="Cascading Style Sheets">CSS</abbr> is tweaked almost daily.
GOOD:	My <abbr title="style sheet">CSS</abbr> is tweaked almost daily.

Here we’ve used the abbr element to span over an abbreviation and provide an alternative, natural way of reading the abbreviation.

GOOD:	price <abbr title="does not equal">!=</abbr> <abbr title="total cost of ownership">TCO</abbr>

We have adapted something unpronounceable as letters into something perfectly readable.
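To make the idea of a “spoken form” concrete, here is a minimal sketch in Python of how a reader might expand titled abbreviations. It is only an illustration: the class name SpokenText and the function spoken are hypothetical, and this is not how any particular screen reader is implemented.

```python
from html.parser import HTMLParser


class SpokenText(HTMLParser):
    """Collect text, reading a titled <abbr> as its title instead of its contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.in_titled_abbr = False

    def handle_starttag(self, tag, attrs):
        if tag == "abbr":
            title = dict(attrs).get("title")
            if title:
                # Speak the title and skip the written abbreviation.
                self.parts.append(title)
                self.in_titled_abbr = True

    def handle_endtag(self, tag):
        if tag == "abbr":
            self.in_titled_abbr = False

    def handle_data(self, data):
        if not self.in_titled_abbr:
            self.parts.append(data)


def spoken(markup):
    """Return the text of `markup` as it would be read aloud (hypothetical helper)."""
    parser = SpokenText()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


print(spoken('Red <abbr title="versus">vs.</abbr> Blue'))
```

An abbr without a title is left as written, so the reader falls back to spelling it out or pronouncing it as they see fit.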

In general, abbreviations should maintain the grammar. Whilst not necessary, this example demonstrates how grammatical flow can be improved, whilst also expanding a Latin abbreviation:

Along the way, open-source has forgotten what it really means (<abbr title="that is,">i.e.</abbr> in real life) to give.

Try and communicate your intentions. If you would personally read something one-way, define the abbreviation how you intend it to be read:

Switch to using the <abbr title="“wizzy-wig”">WYSIWYG</abbr> editor, instead.

In the example below however, there’s an abbreviation CDs inside the abbreviation title:

<abbr title="recordable CDs">CD-Rs</abbr> and <abbr title="recordable DVDs">DVD±Rs</abbr> are susceptible to literal bit-rot.

Isn’t this wrong? No, because remember that the point of abbreviations is to expand one phrase into another. The user is assumed to already know what a CD is; it doesn’t have to be spelt out for them.

This follows neatly into the next point: when and where to expand abbreviations at all…

Ⅱ. the title Attribute Is Optional

Oh man, this is so important. The misuse of the abbr element is because almost everybody is under the assumption that abbr elements must have a title attribute, in fact— it’d seem pointless otherwise!

Your users do not need to know the definition of every single acronym and abbreviated technical term. In fact, they don’t care. They don’t have to know what the V in DVD stands for if they know a DVD when they see one.

Only title abbreviations that you expect people to read as the expanded form in their mind, or out aloud.

An abbr element without a title attribute should be used on any abbreviation / acronym that is written in all-capitals (unless you are providing a spoken alternative, like the WYSIWYG example from earlier), to communicate that the abbreviation is either unpronounceable as a word, or that it is capitalised—not for emphasis—but because each letter has an individual meaning. E.g.

The <abbr>FBI</abbr> are like the British <abbr>MI5</abbr>.

Ⅲ. Citations Are Not Abbreviations

This one is very sneaky and can easily catch you out.

BAD:	The site will be built using <abbr title="Hypertext Pre-Processor">PHP</abbr>.

Firstly, this reads wrong; the abbreviation breaks the grammar. Secondly, remember that abbreviations are to communicate how things should be read, not to define terms.

But thirdly, it is not an abbreviation. It is not a section of the document that has been shortened or re-phrased by the author to fit their chosen grammar. It is not a personal rendering of words. The sentence is referring to a software product. This is a citation.

GOOD:	The site will be built using <cite>PHP</cite>.

Even though a cited name can be an abbreviation of something else, the name seals that abbreviation and turns that name into a real word of sorts (a brand). Names that are already made from an abbreviation, can then even be abbreviated! (since they behave as normal words) For example “Mac OS X” is already an abbreviation of “Macintosh Operating System version Ten”, and people then often abbreviate further, calling it “OS X” or by referring to the version number / name “10.5 / Leopard”.

What Counts as a Citation?

A citation represents the title of a work, where you are referring to it in the context of your sentence, or in passing. A work is defined as an intellectual human creation.

A work can be a book, a poem, a published piece of writing, a piece of art, a website, a song, a film, a TV show, a game &c. and also software.

However this does not include the following: people’s names, the name of a ship or real products in general; such as a packet of crisps, a stereo or computer hardware.

There Are Exceptions

I won’t go into details, but there are exceptions here and there, that mostly lie around the context; whereby you are either referring to the citation itself, or the use of that work in a specific case - particularly with broadly used technologies like HTML, CSS and PHP.

I am referring directly to the <cite>PHP</cite> language/technology.
My website’s <abbr>PHP</abbr> is small.

That said, details like this will boil down to personal taste, and it’ll never really hurt to just stick to using one element or the other for all such instances, regardless.

Ⅳ. Abbreviations Should Be Meek

An abbreviation is merely anything that is read differently from how it is written and vice-versa. It does not need to be in your face, Javascript-powered, “intelli-text”.

What if the Reader Does Not Know What a Technical Abbreviation / Acronym Means?

Isn’t the point of an abbr element so that these technical terms can be defined by hovering the mouse over the term?

There are two valid answers to this:

  1. That’s what the <dfn> element is for, and…
  2. It is not your responsibility to be an encyclopædia.
    Being paranoid about your reader’s abilities is just going to make your life difficult

You, the author, only have to take the responsibility to know your audience and define those terms which you think they won’t know, or that you may be newly introducing to them.

If a user does not know a term, your website is not the only resource in existence where they can then find the definition! The user can easily google the term. In many browsers they can just right-click the word and choose to search the web for it. On a Mac, there’s a system-wide integrated dictionary you can access in a number of ways. There is no end to the ways a user can find out what a term means if they need to.

How to Style Your Abbreviations

The traditional way to style abbreviations is a grey dotted line, like so:

abbr	{border-bottom: 1px dotted #666;}

However, this was under the previous model of using abbr as some kind of inline dictionary. Abbreviations are for the benefit of screen readers, search engines and enthusiasts like me. Generally, abbreviations shouldn’t be styled at all.

That said, abbreviations still do provide a useful service by allowing readers to uncover how something should be read. We need a subtle approach that doesn’t fill the user’s screen with grey dotted lines, but at the same time does allow them to discover where you’ve provided reading “hints”.

The method I’m using is to only show the grey-dotted underline when the user’s mouse is within the paragraph containing the abbreviations, so that when the user moves their mouse into the surrounding text, the abbreviations (with titles) will be marked, and the user can hover over them to then see the tooltip.

*:hover>abbr[title]	{border-bottom: 1px dotted #666; cursor: help;}
Update: The above code only works with abbreviations directly within paragraphs; if the abbr element is wrapped in a link or any other kind of tag, the grey dotted line won’t appear until you hover directly over it. The new CSS below fixes this:
(where section is the element/ID containing your blog posts)
/* first, the immediate descendants of the content area are set to highlight abbreviations on hover, but avoiding lists; as I don’t want *all* abbreviations highlighted when you hover on a root list… */
section>*:not(ol):not(ul):not(dl):hover abbr[title],
/* …only when hovering on each list-item */
 p:hover abbr[title], li:hover abbr[title], dl>*:hover abbr[title] {
	border-bottom: 1px dotted #666; cursor: help;
}

I hope this article provides you with some practical guidance and enthusiasm.

If you spot any flaws in the HTML of my articles, please do contact me and let me know, I’ve got so many thousands of lines I’m sure to have made mistakes everywhere. Also you’re free to e-mail me if you’ve any questions about this article and using abbr, cite and HTML5 in general.

Special thanks go to Adam of firsttube.com for reviewing the article whilst it was being prepared and spotting a number of flaws.

Under the Hood #5:
New Website-Ish

Welcome to Camen Design v0.2-ish. I’ve replaced the publishing code in the site, leaving the HTML5 & CSS intact. They will be replaced in the next update. I plan to target Firefox 3.1 (and hopefully Safari 4 may be out by then too), allowing me to make use of CSS animation/transitions and border-image.

In fact, because my site has its PHP / HTML / CSS entirely separated, any one can be replaced without touching a line of the other.

On the subject of future-proof CSS, I noted:

A CSS file is such that you can throw it away easily and start again. I could design my website any way I wanted without ever changing the HTML.

Clean and separated HTML/CSS means that parts can be replaced. That’s what future-proof is about – the ability to adapt to changes.

Being bit-rot proof is an entirely different matter!

That’s a different topic for another day though. This article is about the new back-end:

Clean URLs

In the previous version of the site, a PHP script handled the database, spitting out the data as requested. v0.2 is now all static XHTML5 files, ensuring faster load speed and better caching. The publishing script generates pre-gzipped “.xhtml” files for each of the articles, as well as each of the index pages. The home page is “1.xhtml”, the second page “2.xhtml” and so on. Each content-type is a folder, containing another set of numbered files for the pages. e.g. “blog/1.xhtml” and so on.

mod_rewrite is used to mask the “.xhtml” extension, so it isn’t required, giving nice looking URLs with no querystring as before. “/?blog&page=2” now becomes “/blog/2”.

The ‘.htaccess’ file I wrote now handles everything dynamic, applying the ‘application/xhtml+xml’ mime-type to the HTML, but falling back to ‘text/html’ for browsers that can’t deal with that.

I’ve opened up my ‘.htaccess’ file so you can view it fully, but a detailed break-down is covered below.

Serving Compressed XHTML5

A big problem with the old code was that a single PHP page was not being cached very well, relying on me manually setting all the HTTP-Headers for the various pages requested. In this new version each page is a separate file, and so Apache and your browser can handle things fine.

FileETag MTime Size
AddDefaultCharset utf-8

This declaration tells Apache to send ETags in HTTP-Headers. The ETag is an identifier derived from the file’s attributes, so that the browser knows when the file has actually changed. Apache sends ETags automatically anyway, but uses the default “MTime INode Size”, which ties the ETag to the file’s storage cluster on the disk. If you were to upload the same file again, despite its contents not changing, Apache would send a different ETag in that case.
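As a rough illustration of why leaving the inode out matters, here is a Python sketch that builds an ETag-like string from modification time and size only. The function names are hypothetical and the string format is an assumption for demonstration; it is not Apache’s exact output.

```python
import os
import shutil
import tempfile


def simple_etag(path):
    """Build an ETag-like string from modification time and size only (assumed format)."""
    st = os.stat(path)
    return f'"{int(st.st_mtime):x}-{st.st_size:x}"'


def demo():
    """Copy a file, keeping its timestamp, and compare the two tags and inodes."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "1.xhtml")
        with open(original, "wb") as f:
            f.write(b"<html/>")
        # copy2 keeps the modification time, but the copy lives in a new inode.
        duplicate = shutil.copy2(original, os.path.join(tmp, "copy.xhtml"))
        same_tag = simple_etag(original) == simple_etag(duplicate)
        same_inode = os.stat(original).st_ino == os.stat(duplicate).st_ino
        return same_tag, same_inode


print(demo())
```

The two files share a tag even though they are stored separately; a tag that included the inode would differ between them.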

# .xhtml files are gzipped html5 documents ready to serve
AddType application/xhtml+xml .xhtml
AddEncoding gzip .xhtml

This creates a new file-type “.xhtml”, and serves it as ‘application/xhtml+xml’ by default. Though it is possible to serve HTML5 as ‘text/html’ in Firefox 3 & Safari, the <legend> tag will not work correctly when used inside a <figure> element. This is due to the all-round broken-ness of the <legend> tag in all browsers (caused by pandering to IE’s even more broken implementation).

The publishing script uses the gzencode PHP function when saving the files to zip the contents for bandwidth-savings and fast delivery. The “AddEncoding” declaration applies this to Apache, adding the necessary “Content-Encoding: gzip” HTTP-Header automatically.
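The publishing script itself is PHP, but the pre-compression step can be sketched in Python with the standard gzip module. The function name precompress and the sample markup are assumptions for illustration only.

```python
import gzip


def precompress(markup):
    """Return the gzip-compressed bytes that would be saved to disk as a ".xhtml" file."""
    return gzip.compress(markup.encode("utf-8"))


# Hypothetical page content, repeated so the saving is visible.
page = "<!DOCTYPE html><title>Hello</title>" * 20
blob = precompress(page)

print(len(page.encode("utf-8")), "bytes before,", len(blob), "bytes after")
```

Because the bytes on disk are already compressed, the server only has to label them with the right encoding rather than compress on every request.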

# load page 1 by default
DirectoryIndex 1.xhtml index.php index.html

The home page is just page 1 of however many pages of the full archive of the site. Therefore “1.xhtml” is set as the default page to go to in a folder so that “/art/” returns “/art/1.xhtml”.

# if the url contains the ".xhtml", show the source code
RewriteCond %{THE_REQUEST} \b([^\.]*[^/])\.xhtml\b
RewriteRule ^ - [T=text/plain,L]

Viewing the HTML source of the pages on this site is an integral part of its design, so I wanted to make it very easy to do so. Just click the “html” link at the top of any page. The code above checks if the URL typed into the browser had the “.xhtml” included and if so, keeps the URL as is, but serves it as “text/plain” instead, preventing the browser from rendering the HTML.
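To see what that condition matches, the same pattern can be tried against sample request lines in Python. This is only a sketch: the function name wants_source and the example requests are made up, and Apache evaluates the pattern with its own regex engine.

```python
import re

# Same pattern as the RewriteCond above, written as a Python regular expression.
SOURCE_VIEW = re.compile(r"\b([^\.]*[^/])\.xhtml\b")


def wants_source(request_line):
    """True when the raw request line names a ".xhtml" file explicitly."""
    return SOURCE_VIEW.search(request_line) is not None


for line in ("GET /blog/hello.xhtml HTTP/1.1", "GET /blog/hello HTTP/1.1"):
    print(line, "->", wants_source(line))
```

The check is made against the request as the browser sent it, so a clean URL that is later rewritten internally to a “.xhtml” file does not count as a request for the source.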

# leave the ".xhtml" off (clean urls)
RewriteRule ^([^\.]*[^/])$ /$1.xhtml [L]

This matches any URL that contains no dot and does not end in a slash, which on this site means zero or one subfolder followed by a name without an extension. It then rewrites the URL to append “.xhtml” as the actual file to return. This is so “/blog/hello” secretly returns the file “/blog/hello.xhtml”.
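The rule can be imitated in a few lines of Python to check which paths it touches. The function name is hypothetical, and the sketch assumes the per-directory behaviour of “.htaccess” files, where the pattern sees the path without its leading slash.

```python
import re
from typing import Optional

# Same pattern as the RewriteRule above.
CLEAN_URL = re.compile(r"^([^\.]*[^/])$")


def rewrite_clean_url(path: str) -> Optional[str]:
    """Return the file a clean URL maps to, or None when the rule does not apply."""
    match = CLEAN_URL.match(path)
    return f"/{match.group(1)}.xhtml" if match else None


for path in ("blog/hello", "blog/2", "blog/", "design/design.css"):
    print(path, "->", rewrite_clean_url(path))
```

Folder URLs and anything that already has an extension are left for the other directives to handle.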

# although I don’t support IE, I do have to fall back to text/html,
# otherwise it will try and download the page instead of rendering it
RewriteCond %{HTTP_ACCEPT} !application/xhtml\+xml
RewriteCond %{REQUEST_FILENAME} .*\.xhtml
RewriteRule ^ - [T=text/html,L]

As described, this will check the browser capabilities to see if it does not accept ‘application/xhtml+xml’ and revert to ‘text/html’. If this is not done, IE will try and download the file instead of showing it. In 2008.
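The decision being made here is small enough to sketch in Python. The function name is hypothetical, and the two Accept headers are only examples of what a browser might send.

```python
import re

# Same test as the RewriteCond above: does the Accept header mention XHTML?
XHTML = re.compile(r"application/xhtml\+xml")


def mime_for_xhtml(accept):
    """Pick the Content-Type for a ".xhtml" file from the request's Accept header."""
    return "application/xhtml+xml" if XHTML.search(accept) else "text/html"


# Example headers (assumed for illustration).
print(mime_for_xhtml("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
print(mime_for_xhtml("image/gif, image/jpeg, */*"))
```

A browser that only advertises a wildcard is treated as not understanding XHTML, which is the safe assumption for the fallback.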

Compressed CSS

# "csz" compressed CSS filetype
AddType text/css .csz
AddEncoding gzip .csz

As with the “.xhtml” definition, this creates a “.csz” filetype of mime-type “text/css”, and default gzip (compressed) encoding. The publishing script takes the normal ‘design.css’ file and spins off a compressed copy ‘design.csz’.

# on my localhost, don’t use a cached CSS file
RewriteCond %{DOCUMENT_ROOT} "^/Users/kroc/Sites/Camen Design/upload"
RewriteRule ^design/$ /design/design.css [L]
RewriteRule ^design/$ /design/design.csz [L]

When I’m editing the website on my computer, I’m refreshing constantly to see new CSS changes. This code checks if the webroot is that of my Mac’s localhost and passes the standard ‘.css’ file and stops processing. The next line passes the compressed CSS file in the case the document root match was not made (live server).

Compressed RSS

The publishing script creates a compressed ‘rss.rsz’ file in each folder and on root.

AddType application/rss+xml .rsz
AddEncoding gzip .rsz

RewriteRule ^([a-z0-9-]+/)?rss$ /$1rss.rsz [L]

This rewrites URLs ending in “rss” to the compressed “rsz” file. e.g. “/tweet/rss” becomes “/tweet/rss.rsz”.
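The mapping described here, with its optional folder prefix, can be sketched in Python as follows. The function name is hypothetical and the paths are written without a leading slash, as a “.htaccess” rule would see them.

```python
import re
from typing import Optional

# An optional single folder, followed by "rss" and nothing else.
RSS_URL = re.compile(r"^([a-z0-9-]+/)?rss$")


def rewrite_rss(path: str) -> Optional[str]:
    """Map "rss" or "<folder>/rss" to the compressed feed file, else None."""
    match = RSS_URL.match(path)
    if not match:
        return None
    folder = match.group(1) or ""  # the folder part is absent for the root feed
    return f"/{folder}rss.rsz"


for path in ("tweet/rss", "rss", "blog/hello"):
    print(path, "->", rewrite_rss(path))
```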

Static Publishing

When I mentioned my plans for v0.2, I noted one particular fallacy:

A simple text field is never going to replicate the editing power I have with TextMate. I’ve got no search and replace, no syntax highlighting, no keyboard shortcuts.

Trying to add these things is just re-implementing the wheel, and thus breaks my very own design principle №3, Let Everybody Else Do Their Job

Therefore, I removed the administration interface, in favour of a Laguna 2 (sadly offline now) style system.

The publish script is available to view online, but is not much use out of context. You can download a stub copy of this website with everything necessary to roll your own using the enclosure in your RSS reader, or the attachment at the bottom of this article.

Content on Disk

The source content of this website is just a folder, with a sub folder for each of the “content-types” (blog | tweet | photo &c.). In each folder is a file containing a JSON meta-data header and the raw HTML of the article. This layout directly maps to the new clean URLs too.

camen design v0.2’s data folder layout

Creating a new blog post is nothing more than creating a new file. Because content is now disk files, instead of database entries, I can use my text-editor’s global search and replace and HTML editing capabilities and I can use my O.S. to manage the files instead of having to implement more and more server-side administration pages to do the same thing.

Now I can blog the same way I create the stuff I blog about.

Inside a Content Entry

A content file looks like this inside: (this one is for this article)

{	"date"		:	200807101232,
	"updated"	:	200807101232,
	"title"		:	"Under The Hood #3: ¬Using A Quick &amp; Easy SQLite Database",
	"licence"	:	"cc-by",
	"tags"		:	["code-is-art", "web-dev"],
	"enclosure"	:	"sqlite.php"
}

HTML content goes here...

Pretty self-explanatory. When creating a new article, the “date” and “updated” fields are left out, and the publishing script then adds them in automatically. If I want to mark an article as updated, and thus push it to the top of the RSS regardless of its original publishing date, I just delete the “updated” line and the publishing script puts in a new timestamp.
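The real publishing script is PHP and is not reproduced here, but the handling of a content file can be sketched in Python. The function names split_entry and stamp, the sample entry, and the assumption that timestamps are YYYYMMDDHHMM integers are all illustrative guesses based on the example above.

```python
import json
from datetime import datetime


def split_entry(text):
    """Split a content file into its JSON meta-data header and its HTML body."""
    text = text.lstrip()
    meta, end = json.JSONDecoder().raw_decode(text)
    return meta, text[end:].strip()


def stamp(meta, now):
    """Fill in "date" and "updated" only when they are missing."""
    timestamp = int(now.strftime("%Y%m%d%H%M"))
    meta.setdefault("date", timestamp)
    meta.setdefault("updated", timestamp)
    return meta


# A made-up entry whose "updated" line has been deleted by hand.
SAMPLE = """{
    "date": 200807101232,
    "title": "Hello",
    "tags": ["code-is-art", "web-dev"]
}

HTML content goes here..."""

meta, body = split_entry(SAMPLE)
print(stamp(meta, datetime(2008, 7, 11, 9, 5)))
print(body)
```

Because only missing fields are filled in, deleting the “updated” line is enough to have it refreshed on the next publish while the original “date” is preserved.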


The attached zip file is updated every time I publish, so it always contains the most up to date code.