Making the Ugly Elegant: Templating With DOM
- How It Works
- The Code
- Caveats
- The API
  - Instantiation
  - Shorthand XPath Syntax
  - (string) Output
  - repeat
  - next
  - setValue
  - set
  - addClass
  - remove
- History
Templating is easy to do in any particular way, but doing it right is hard. I can’t
count how many hip new template engines have popped up in the last few years alone. I’m about to add one to
the pile, but it is certainly not ‘hip’. It is, however, the closest I have ever gotten to the fabled golden
fleece of “100% separation”. Unlike most other forms of templating, this really doesn’t mix logic and
HTML, nor does it try to mask the blatant logic (“if this, then this”) by renaming ‘logic’ or using a
{{special syntax}}.
What we’re going to do is this: take a static (and I mean static) HTML page, load it into the DOM as
an XML tree and then use PHP as the logic, removing the bits of the template that are not needed and changing
the text where required.
I got this idea from the blog post “Your templating engine sucks and everything you have ever written is
spaghetti code (yes, you)”. The article itself is long, aggressive, rambling and fails to demonstrate the
principle concretely. I simply ignored all the text and focused on the core point being made: instead of
embedding some form of code in the HTML (even if it’s just evolved search/replace syntax), just load the HTML
into the DOM and manipulate it there, so that the HTML itself is ignorant of the templating.
The reason why this is not just the same as a {{special-syntax}} is that we are not mixing two
different languages, syntaxes or programming models in one HTML file. If you change your templating engine,
it’s still HTML. If you change your logic, it’s still HTML. Special syntaxes invent another language to intermix
with HTML and thus add programmatic concepts to a declarative syntax, which is not clean separation
no matter what you name it.
Ever since the ’Web was invented there has been a translucent, yet intransient divisor between those
developers who understand the fundamental difference between a declarative
markup syntax and a programming language, and those who don’t. Some learn to see this difference, others
simply ignore it and believe that it is a swell idea to tie structured data to a structured program that
will bit rot one thousand times quicker than the data will. If you are trying to replace HTML or CSS with
JavaScript, you are doing it wrong and have just signed a maintenance contract from hell, with yourself,
for yours and your data’s life.
Kroc Camen—I Don’t Want to Do This Any More
By doing it this way, the HTML file itself can be designed independently of the software, and whoever writes the
HTML doesn’t have to know PHP. You could change the whole server language and it wouldn’t change the template
one bit. More importantly you can actually view the whole look of the template in the browser without running the
software. The reason I’m adopting this templating approach for NoNonsense Forum
is to make it easier for anybody to modify the look of their forum without having to learn PHP, and hopefully
encourage more contribution from all skill levels.
It took a few revisions, two weeks and a lot of head-scratching to beat the DOM into something elegant,
but here it is, NoNonsense Templating:
How It Works
The first thing to wrap your head around is that DOM templating works on the principle of mostly taking
away rather than adding. Logic-wise this is more difficult to get used to than you would think; you will be used to
adding data according to logic rather than “if this, then remove the thing that it is not”.
Firstly, your template should be a static HTML page that contains all of the content and ‘possibilities’ of your
output, from which we will remove whatever is not relevant to the page. For example:
<p class="login logged-out">
    You are not logged in.
</p>
<p class="login logged-in">
    You are logged in as <b class="username">Bob</b>
</p>
In the PHP we can modify the HTML this way:
(Please note that templates you load must be valid XML and have a single root node, e.g. “<html>”, in order to
work. The examples in this article omit this for simplicity. See XML caveats for more details.)
//load the template and provide an interface
$template = new DOMTemplate (file_get_contents ('test.html'));
//let's imagine the user is logged in: remove the logged-out section and set the username
$template->remove ('.logged-out');
$template->setValue ('.username', 'Alice');
The command “remove ('.logged-out')” finds all elements that have a class of “logged-out” and deletes them
(you can also refer to IDs using ‘#id’).
The setValue method sets the text content of an element, removing anything that was within. Because the element
content is replaced, you can provide dummy text to test the look and feel of your template, and it will be
replaced with the real data. No more staring at {{NAME_GOES_HERE}}!
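To make the "take away rather than add" idea concrete, here is a rough sketch of the same two operations written directly against PHP's built-in DOM extension. The helper names (`sketchFindByClass`, `sketchRemove`, `sketchSetValue`) are made up for illustration and this is not DOMTemplate's actual source. It only assumes that a class lookup uses the `contains(@class, …)` XPath described in this article.

```php
<?php
// Sketch only: hypothetical helpers, not DOMTemplate's implementation.

/** Return all elements whose class attribute contains $class, as a plain array. */
function sketchFindByClass(DOMDocument $doc, string $class): array
{
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('.//*[contains(@class,"' . $class . '")]');
    // Copy into an array so the tree can be changed while looping.
    return iterator_to_array($nodes);
}

/** Delete every element carrying the class, children included. */
function sketchRemove(DOMDocument $doc, string $class): void
{
    foreach (sketchFindByClass($doc, $class) as $node) {
        $node->parentNode->removeChild($node);
    }
}

/** Replace the children of every matching element with a single text node. */
function sketchSetValue(DOMDocument $doc, string $class, string $text): void
{
    foreach (sketchFindByClass($doc, $class) as $node) {
        while ($node->firstChild !== null) {
            $node->removeChild($node->firstChild);
        }
        $node->appendChild($doc->createTextNode($text));
    }
}

$doc = new DOMDocument();
$doc->loadXML(
    '<div>'
    . '<p class="logged-out">You are not logged in.</p>'
    . '<p class="logged-in">You are logged in as <b class="username">Bob</b></p>'
    . '</div>'
);

sketchRemove($doc, 'logged-out');
sketchSetValue($doc, 'username', 'Alice');

$result = $doc->saveXML($doc->documentElement);
echo $result, "\n";
```

The dummy text "Bob" only ever exists in the static file; by the time the markup is output it has been swapped for the real value.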
Behind the scenes “.logged-out” becomes the full XPath “.//*[contains(@class,"logged-out")]”. The shorthand
syntax also supports specifying a required element type and/or an attribute to target, e.g.:
$template->setValue ('a.my-button@href', '/some_url');
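As an illustration of how such a shorthand could be expanded, the following is a minimal sketch of a translator that handles only the `element`, `.class`, `#id` and `@attribute` parts. The function name `sketchShorthandToXPath` and the regular expression are assumptions of mine, not DOMTemplate's code; the class test follows the `contains(@class, …)` form quoted above, while the `@id="…"` test for IDs is a guess at a reasonable equivalent.

```php
<?php
// Sketch only: a hypothetical translator for the simplest shorthand forms.
// Indexes, child paths, attribute tests and comma lists are deliberately left out.

function sketchShorthandToXPath(string $query): string
{
    // optional element name, optional .class or #id, optional @attribute
    $pattern = '/^([a-z][\w:-]*)?(?:([.#])([\w-]+))?(?:@([\w:-]+))?$/i';
    if (!preg_match($pattern, $query, $m)) {
        // Anything else is assumed to be a full XPath query already.
        return $query;
    }

    $element = ($m[1] ?? '') !== '' ? $m[1] : '*';
    $xpath   = './/' . $element;

    if (($m[2] ?? '') === '.') {
        $xpath .= '[contains(@class,"' . $m[3] . '")]';
    } elseif (($m[2] ?? '') === '#') {
        $xpath .= '[@id="' . $m[3] . '"]';
    }

    if (($m[4] ?? '') !== '') {
        // Target the attribute node rather than the element.
        $xpath .= '/@' . $m[4];
    }

    return $xpath;
}

foreach (['.logged-out', 'a.my-button@href', '#list'] as $shorthand) {
    echo $shorthand, ' => ', sketchShorthandToXPath($shorthand), "\n";
}
```

A query that does not fit the pattern, such as one starting with `//`, is passed through untouched, which is one simple way of letting shorthand and full XPath coexist.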
You can also use full XPath syntax:
//if using HTTPS, change the Google search box to use HTTPS too
if (@$_SERVER['HTTPS'] == 'on') $template->setValue (
'//form[@action="http://google.com/search"]/@action', 'https://encrypted.google.com/search'
);
Looping is always a sore point in templating. How do you take a chunk and repeat it down the page without having to
define a ton of logic in your templates?
Looping with the DOM is shockingly elegant!
$item = $template->repeat ('.list-item');
foreach ($data as $value) {
    $item->setValue ('.item-name', $value);
    $item->next ();
}
The repeat method takes an element (via shorthand/XPath) to be used as the repeating template and
copies it, then you just set and remove elements from the repeating template as if it were its own template. Once
you’ve templated that iteration you call the next method and the HTML is added after the previous
element, then the template repeater resets itself back to the original HTML so you can template it again!
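The copy, fill and reset cycle can be approximated with `cloneNode` from PHP's DOM extension. This is a simplified sketch under my own assumptions, not how DOMTemplate is written: the function name `sketchRepeat` is hypothetical, only the first matching element is repeated, and the untouched original is used as the pristine copy and removed at the end instead of being reset in place.

```php
<?php
// Sketch only: a hypothetical, simplified repeater built on cloneNode().

function sketchRepeat(DOMDocument $doc, string $itemClass, string $nameClass, array $values): void
{
    $xpath = new DOMXPath($doc);
    $placeholder = $xpath->query('.//*[contains(@class,"' . $itemClass . '")]')->item(0);
    if ($placeholder === null) {
        return; // nothing to repeat
    }

    foreach ($values as $value) {
        // A deep copy of the untouched placeholder is the "reset" step.
        $copy = $placeholder->cloneNode(true);
        // Inserting before the placeholder keeps the copies in order.
        $placeholder->parentNode->insertBefore($copy, $placeholder);

        // Template this copy as if it were its own small document.
        foreach ($xpath->query('.//*[contains(@class,"' . $nameClass . '")]', $copy) as $target) {
            while ($target->firstChild !== null) {
                $target->removeChild($target->firstChild);
            }
            $target->appendChild($doc->createTextNode((string) $value));
        }
    }

    // The dummy item from the static template is no longer needed.
    $placeholder->parentNode->removeChild($placeholder);
}

$doc = new DOMDocument();
$doc->loadXML('<ul><li class="list-item"><span class="item-name">Name</span></li></ul>');

sketchRepeat($doc, 'list-item', 'item-name', ['Apple', 'Pear']);

$result = $doc->saveXML($doc->documentElement);
echo $result, "\n";
```

Passing an empty array removes the dummy item altogether, which fits the "take away" mindset: a static template shows one example row, and the output shows exactly as many as the data has.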
Once you’ve made all your changes to the template, just retrieve the final HTML and output.
die ($template);
See the API for details of all the functions.
The Code
If you would like to see a real-world use of this templating system, with a ton of examples drawn from real,
practical code, you can examine the source code of my forum system, NoNonsense Forum.
If you don’t like the idea of targeting classes or IDs in your HTML, have a look at v4 of DOMTemplate, which finds
elements according to data-template attributes.
Caveats
- Whitespace handling is good, but not perfect
  In the case of repeating an element the whitespace within is kept, but the whitespace outside the
  element is not. This is not a major problem; it just means that the closing and opening tags of
  your lists will be paired (e.g. “…</li><li>…”).
  The biggest issue is that when elements are removed, the whitespace around them remains, meaning
  that you get a number of blank lines in the output HTML where the elements used to be. There’s
  no direct way of handling this other than perhaps using a search/replace to remove blank lines in
  the HTML after it’s been templated.
  One benefit of using the DOM, however, is that if you want to minify the HTML a little,
  you can just add “$this->DOMDocument->preserveWhiteSpace = false;” to the
  constructor function of DOMTemplate and the markup will be returned as a big blob
  with few line-breaks.
  If you add “$this->DOMDocument->formatOutput = true;” instead, the markup
  will be ‘tidied’ for you, re-nesting the elements neatly in an easy to read fashion.
- XML woes
  DOMTemplate stores and manipulates the template internally as strict XML. Thankfully, since
  v16, DOMTemplate automatically converts your source HTML to XML on
  loading and converts from XML to HTML on output, thus alleviating most of the input-strictness
  problems of earlier versions. There are, however, still a few caveats to remember:
  - HTML must be valid
  - The automatic conversion of HTML named-entities (invalid in XML) into Unicode is
    still not comprehensive. 248 of the most common are covered, but a total of
    over 2100 exist. DOMTemplate may in a future version cover all 2100+ named
    entities, but until then ensure that your HTML source does not use any
    named-entities outside of the 248 recognised by DOMTemplate
  - HTML that you load either through DOMTemplate or apply to the template
    using setValue must have only one root node,
    i.e. a list of elements can not be used unless wrapped by an element.
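The whitespace options mentioned in the first caveat can be tried in isolation on a bare `DOMDocument`. The sketch below is not DOMTemplate code: the sample markup and the helper name `sketchStripBlankLines` are invented for the demonstration. Two things I am assuming from how PHP's DOM extension behaves: `preserveWhiteSpace` has to be set before the markup is loaded, and `formatOutput` tends to re-indent reliably only when the whitespace-only nodes were dropped on load, so the second document sets both.

```php
<?php
// Sketch only: plain DOMDocument, not DOMTemplate's constructor.

$source = "<ul>\n  <li>One</li>\n  <li>Two</li>\n</ul>";

// 1. Drop whitespace-only text nodes while parsing.
$compact = new DOMDocument();
$compact->preserveWhiteSpace = false;
$compact->loadXML($source);
$compactOut = $compact->saveXML($compact->documentElement);

// 2. Let libxml re-indent the tree when serialising.
$pretty = new DOMDocument();
$pretty->preserveWhiteSpace = false;
$pretty->formatOutput = true;
$pretty->loadXML($source);
$prettyOut = $pretty->saveXML();

// 3. The search/replace fallback: delete lines that are empty or whitespace-only.
function sketchStripBlankLines(string $html): string
{
    return preg_replace('/^[ \t]*\R/m', '', $html) ?? $html;
}

$stripped = sketchStripBlankLines("<ul>\n\n  <li>One</li>\n   \n</ul>\n");

echo $compactOut, "\n", $prettyOut, $stripped;
```

The regular expression only removes whole blank lines, so indentation in front of real markup is left alone.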
The API
Instantiation
Provide the HTML to load as a string when instantiating the template class. It must be valid and have only one root
element (e.g. <html>).
$template = new DOMTemplate (file_get_contents ('index.html'));
If you are loading an XHTML document, or any XML file with a default namespace (e.g.
<html xmlns="http://www.w3.org/1999/xhtml">), you must specify a prefix (any will do)
and the namespace URL like so:
$template = new DOMTemplate (file_get_contents ('index.html'), 'html', 'http://www.w3.org/1999/xhtml');
All XPath queries you make with this template must prefix element names with the namespace, including for
the shorthand:
$template->setValue ('//html:title', 'Hello World'); //XPath
$template->setValue ('html:a#my-button@href', 'http://google.co.uk'); //shorthand
This bizarre requirement is a limitation in the design of XPath itself.
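The limitation is easy to reproduce with PHP's own `DOMXPath`, independent of DOMTemplate. In XPath 1.0 an unprefixed element name only matches elements in no namespace, so a document with a default namespace needs a prefix registered before anything can be found. The prefix `html` below is arbitrary, as noted above.

```php
<?php
// Sketch only: plain DOMXPath, showing why a prefix is needed.

$doc = new DOMDocument();
$doc->loadXML(
    '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Old</title></head></html>'
);

$xpath = new DOMXPath($doc);

// Without a prefix the query looks for <title> in no namespace and finds nothing.
$unprefixedMatches = $xpath->query('//title')->length;

// Bind any prefix to the namespace URL, then use it on every element name.
$xpath->registerNamespace('html', 'http://www.w3.org/1999/xhtml');
$prefixedMatches = $xpath->query('//html:title')->length;

echo "unprefixed: $unprefixedMatches, prefixed: $prefixedMatches\n";
```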
Shorthand XPath Syntax
- All of the methods that accept a query (setValue, set, addClass, remove & repeat) use a
  shorthand syntax where you only need to provide the class (“.class”) or ID (“#id”) you want
  to target and the full XPath query is built for you. E.g. `.my-button`
- An element type can be provided: `a#my-button`
- An attribute name can be provided, which will be the target of the setValue, set and remove
  methods: `a#my-button@href`
- You can test attributes for values (the element will be selected, not the attribute):
  `label@for="submit"`
- You can specify the index of an element to select: `li[1]`
- You can select child elements: `#list/li/a`
- You can also just use a full XPath query, as-is: `/html/head/title`
- You can provide multiple targets by separating the queries with commas, e.g.:
  `.header, .body, .footer`
  You can intermix shorthand and full XPath like this.
(string) Output
To get the HTML out of the template, cast the template class object to a string, e.g.:
$template = new DOMTemplate ('<span>test</span>');
echo $template;
In instances where the intended type is ambiguous, use PHP’s casting syntax to force a string conversion:
$html = (string) $template;
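String casting of this kind relies on PHP's `__toString` magic method. The class below is a stand-in I made up (`SketchTemplate`), not DOMTemplate itself; it only shows the mechanism and one case where the intended type is ambiguous enough that an explicit cast matters.

```php
<?php
// Sketch only: a hypothetical stand-in class demonstrating __toString.

final class SketchTemplate
{
    private $doc;

    public function __construct(string $xml)
    {
        $this->doc = new DOMDocument();
        $this->doc->loadXML($xml);
    }

    public function __toString(): string
    {
        // Serialise from the root element so no XML prolog is emitted.
        return (string) $this->doc->saveXML($this->doc->documentElement);
    }
}

$template = new SketchTemplate('<span>test</span>');

// String context: __toString is called implicitly.
echo $template, "\n";

// Ambiguous context: json_encode sees an object, not a string,
// so it encodes the (empty) set of public properties.
$asObject = json_encode($template);

// An explicit cast removes the ambiguity.
$html = (string) $template;
$asString = json_encode($html);
```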
repeat
repeat (string $query)
Takes a shorthand XPath query and returns a DOMTemplateRepeaterArray object instantiated with the
element(s) selected in the query. This object supports the set, setValue, addClass & remove methods,
in addition to the following method:
next
Takes the current HTML content of the elements within the DOMTemplateRepeaterArray object and appends it
as a sibling to the previously repeated template (i.e. either the element(s) you
instantiated the repeater with, or the element(s) that were added by the previous call to the next
method), then resets its HTML content back to the original HTML it had when it was created.
In simple terms, it adds the templated HTML to the end of a list and then resets it back to the original HTML, to be
used again. In practical terms, like this:
$item = $template->repeat ('.list-item');
foreach ($data as $value) {
    $item->setValue ('.item-name', $value);
    $item->next ();
}
setValue
setValue (string $query, string $value, [bool $asHTML=false])
Replaces the content of all elements matched with the shorthand XPath query with the
given value. The string value is HTML-encoded (unless you give `asHTML` as true), so any HTML in the
value will appear as-is, rather than be rendered as HTML. This method intelligently sets the value to elements,
attributes and classes according to the XPath used. See addClass for
details on HTML class behaviour.
$template->setValue ('#name', 'Kroc');
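The difference between the encoded default and the `asHTML` option can be sketched with two standard DOM calls: `createTextNode` for literal text and `DOMDocumentFragment::appendXML` for markup. The helper name `sketchSetContent` is hypothetical and this is only my approximation of the behaviour described above, not the library's code. Note that `appendXML` requires the value to be well-formed XML.

```php
<?php
// Sketch only: a hypothetical helper contrasting text and markup insertion.

function sketchSetContent(DOMElement $element, string $value, bool $asHTML = false): void
{
    // Remove whatever dummy content the template held.
    while ($element->firstChild !== null) {
        $element->removeChild($element->firstChild);
    }

    $doc = $element->ownerDocument;
    if ($asHTML) {
        // Parse the value as markup and insert the resulting nodes.
        $fragment = $doc->createDocumentFragment();
        $fragment->appendXML($value);
        $element->appendChild($fragment);
    } else {
        // A text node is escaped on output, so tags show up literally.
        $element->appendChild($doc->createTextNode($value));
    }
}

$doc = new DOMDocument();
$doc->loadXML('<div><p id="plain">old</p><p id="markup">old</p></div>');

$paragraphs = $doc->getElementsByTagName('p');
$plainP  = $paragraphs->item(0);
$markupP = $paragraphs->item(1);

sketchSetContent($plainP, '<b>Kroc</b>');
sketchSetContent($markupP, '<b>Kroc</b>', true);

$plain  = $doc->saveXML($plainP);
$markup = $doc->saveXML($markupP);
echo $plain, "\n", $markup, "\n";
```

Defaulting to the text node is the safer choice for user-supplied data, since nothing in the value can turn into live markup unless the caller asks for it.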
set
set (array $queries, [bool $asHTML=false])
Allows you to write code in a more compact way by specifying an array of shorthand XPath
queries and their associated values to set.
$template->set (array (
    '#name' => 'Kroc',
    '#site' => 'http://camendesign.com'
));
addClass
addClass (string $query, string $class)
Adds the specified HTML class name to every element matched with the shorthand XPath
query. If an element already has a class attribute, multiple class names will be separated by spaces when the
new class is added.
$template->addClass ('#section', 'open');
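On a single element, the space-separated behaviour described above amounts to very little code. The helper `sketchAddClass` below is hypothetical and works on one `DOMElement` rather than on a query; it is a sketch of the idea, not the library's method.

```php
<?php
// Sketch only: a hypothetical single-element version of the idea.

function sketchAddClass(DOMElement $element, string $class): void
{
    // getAttribute() returns an empty string when the attribute is missing.
    $existing = trim($element->getAttribute('class'));
    $element->setAttribute(
        'class',
        $existing === '' ? $class : $existing . ' ' . $class
    );
}

$doc = new DOMDocument();
$doc->loadXML('<div><section class="panel"/><section/></div>');

$sections = $doc->getElementsByTagName('section');
$withClass    = $sections->item(0);
$withoutClass = $sections->item(1);

sketchAddClass($withClass, 'open');
sketchAddClass($withoutClass, 'open');

echo $withClass->getAttribute('class'), "\n";
echo $withoutClass->getAttribute('class'), "\n";
```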
remove
remove (string $query | array $queries)
Deletes all the elements (and their children) matched with the shorthand XPath query.
$template->remove ('.secret-stuff');
Also accepts an array in the format of “'xpath' => true|false”.
If the value is false, the XPath will be skipped. This allows you to write compact removal code by not having to
write “if (x) $template->remove ('y');” several times in a row, e.g.:
$template->remove (array (
    '.section-1' => $section == 1,
    '.section-2' => $section == 2,
    // ⋮
));
For a good example of this style of writing, see the code for NoNonsense Forum.
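The array form boils down to "loop over the queries and skip the false ones". Here is a minimal sketch of that loop using plain `DOMXPath` with full XPath queries; the function name `sketchRemoveIf` and the sample markup are mine, and the real method also understands the shorthand syntax, which this sketch does not.

```php
<?php
// Sketch only: a hypothetical conditional remover taking 'xpath' => bool pairs.

function sketchRemoveIf(DOMDocument $doc, array $queries): void
{
    $xpath = new DOMXPath($doc);
    foreach ($queries as $query => $shouldRemove) {
        if (!$shouldRemove) {
            continue; // a false value means "leave this one alone"
        }
        // Copy the matches to an array before changing the tree.
        foreach (iterator_to_array($xpath->query($query)) as $node) {
            $node->parentNode->removeChild($node);
        }
    }
}

$doc = new DOMDocument();
$doc->loadXML('<div><p class="section-1">One</p><p class="section-2">Two</p></div>');

$section = 2;
sketchRemoveIf($doc, [
    '//*[@class="section-1"]' => $section == 1,
    '//*[@class="section-2"]' => $section == 2,
]);

$result = $doc->saveXML($doc->documentElement);
echo $result, "\n";
```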
In addition to this behaviour, you can also remove class names from a class attribute, whilst retaining any other
class names present, by targeting a class attribute with the XPath and specifying the class name to remove as
the value, thusly:
$template->remove (array ('a@class' => 'undesired'));
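Removing one name from a class attribute while keeping the rest is a split, filter and join. The helper `sketchRemoveClass` is hypothetical and operates on a single element; dropping the attribute entirely once it is empty is my own choice for the sketch, not something the article specifies.

```php
<?php
// Sketch only: a hypothetical single-element class remover.

function sketchRemoveClass(DOMElement $element, string $class): void
{
    // Split on any run of whitespace, ignoring empty pieces.
    $names = preg_split('/\s+/', $element->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);
    $kept  = array_diff($names, [$class]);

    if (count($kept) === 0) {
        // Assumption: an empty class attribute is not worth keeping.
        $element->removeAttribute('class');
    } else {
        $element->setAttribute('class', implode(' ', $kept));
    }
}

$doc = new DOMDocument();
$doc->loadXML('<a class="button undesired primary" href="/x">Go</a>');

sketchRemoveClass($doc->documentElement, 'undesired');

$result = $doc->saveXML($doc->documentElement);
echo $result, "\n";
```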
History
- v16 Filtering of HTML on input and output, removing the strict-XML requirement for source
text. The `html` method was removed in favour of casting the class to a string
- v15 Throw an exception for invalid XPath queries or HTML
- v14 XPaths are cached for speed
- v13 Multiple namespace support
- v12 Ability to remove classNames using the `remove` method
- v11 Changed instantiation to use a string instead of a filename
- v10 `repeat` now works simultaneously with multiple elements instead of just one
- v9 Greatly improved shorthand XPath syntax adding index matching, child matching &
attribute testing
- v8 Changed `setValue` to intelligently apply to elements, attributes or
classes, with a parameter to include HTML as-is (`setHTML` was removed)
- v7 XML prolog is kept if already present and UTF-8 characters are no longer hex-encoded
- v6 XML namespace support. Also, template repeating now appends as a sibling, not as the
last child of the parent (removes the need for a superfluous parent element).
- v5 New shorthand XPath syntax for classes and IDs instead of `data-template` attributes
- v4 Added multiple XPath targets
- v3 Added method chaining
- v2 Added HTML entity decoding
- v1 Initial release