Jun 15, 2023 • 5:20pm


Making the Ugly Elegant: Templating With DOM

How It Works
The Code
Caveats
The API
1. Instantiation
2. Shorthand XPath Syntax
3. (string)
4. repeat
  1. next
5. setValue
6. set
7. addClass
8. append
9. remove
History

Templating is easy to do in any particular way, but doing it right is hard. I can’t count how many hip new template engines have popped up in just the last few years alone. I’m about to add one to the pile, but it is certainly not ‘hip’. It is however the closest I have ever gotten to the fabled golden fleece of “100% separation”. Unlike most other forms of templating, this really doesn’t mix logic and HTML, nor does it try to mask the blatant logic (“if this, then this”) by renaming ‘logic’ or using a {{special syntax}}.

What we’re going to do is this: take a static (and I mean static) HTML page, load it into the DOM as an XML tree and then use PHP as your logic, removing the bits of the template that are not needed and changing the text about.

I got this idea from this blog post: Your templating engine sucks and everything you have ever written is spaghetti code (yes, you). The article itself is long, aggressive, rambling and fails to demonstrate the principle concretely. I simply ignored all the text and focused on the core principle being put forward: instead of embedding some form of code in the HTML (even if it’s just evolved search / replace syntax), just load the HTML into the DOM and manipulate it there, so that the HTML itself is ignorant of the templating.

The reason why this is not just the same as a {{special-syntax}} is that we are not mixing two different languages, syntaxes or programming models in one HTML file. If you change your templating engine, it’s still HTML. If you change your logic, it’s still HTML. Special syntaxes invent another language to intermix with HTML and thus add programmatic concepts to a declarative syntax—which is not clean separation no matter what you name it.

By doing it this way, the HTML file itself can be designed independently of the software, and whoever does the HTML doesn’t have to know PHP. You could change the whole server language and it wouldn’t change the template one bit. More importantly, you can actually view the whole look of the template in the browser without running the software. The reason I’m adopting this templating approach for NoNonsense Forum is to make it easier for anybody to modify the look of their forum without having to learn PHP, and hopefully encourage more contribution from all skill levels.

It took a few revisions, two weeks and a lot of head-wracking to beat the DOM into something elegant, but here it is, NoNonsense Templating:

How It Works

The first thing to wrap your head around is that DOM templating works on the principle of mostly taking away rather than adding. Logic-wise this is more difficult to get used to than you would think; you will be used to adding data according to logic rather than “if this, then remove the thing that it is not”.

Firstly, your template should be a static HTML page that contains all of the content and ‘possibilities’ of your output, from which we will remove whatever is not relevant to the page. For example:

<p id="login" class="logged-out">
	You are not logged in.
</p>
<p id="login" class="logged-in">
	You are logged in as <b class="username">Bob</b>
</p>

In the PHP we can modify the HTML this way:

(Please note that templates you load must be valid XML and have a single root node, e.g. `<html>`, in order to work; the examples in this article omit this for simplicity. See “XML woes” under Caveats for more details.)

//load the template and provide an interface
$template = new DOMTemplate (file_get_contents ('test.html'));

//let's imagine the user is logged in: remove the logged-out section and set the username
$template->remove ('.logged-out');
$template->setValue ('.username', 'Alice');

The `remove` call finds all elements that have a class of `logged-out` and deletes them (you can also refer to IDs using `#id`).

The `setValue` method sets the text-content of an element, removing anything that was within. By replacing element content it means that you can provide dummy text to test the look and feel of your template, and it will be replaced with the real data.
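To make the idea concrete without DOMTemplate itself, here is a minimal sketch of the same two operations using nothing but PHP’s built-in DOM extension. The helper names `sketch_remove` and `sketch_set_text` are made up for illustration, and this is not DOMTemplate’s actual implementation:

```php
<?php
// Hypothetical helpers, for illustration only; not part of DOMTemplate.

// Delete every node matched by an XPath query
function sketch_remove (DOMXPath $xpath, string $query): void {
	// Copy the matches into an array first, so that removing nodes
	// cannot disturb the list we are walking
	foreach (iterator_to_array ($xpath->query ($query)) as $node) {
		$node->parentNode->removeChild ($node);
	}
}

// Replace the content of every matched element with plain text
function sketch_set_text (DOMXPath $xpath, string $query, string $text): void {
	foreach ($xpath->query ($query) as $node) {
		// Emptying nodeValue throws away the dummy content, child elements included
		$node->nodeValue = '';
		$node->appendChild ($node->ownerDocument->createTextNode ($text));
	}
}

$doc = new DOMDocument ();
$doc->loadXML (
	'<div>'.
	'<p class="logged-out">You are not logged in.</p>'.
	'<p class="logged-in">You are logged in as <b class="username">Bob</b></p>'.
	'</div>'
);
$xpath = new DOMXPath ($doc);

// The same full XPath form that the shorthand expands to
sketch_remove   ($xpath, './/*[contains(@class,"logged-out")]');
sketch_set_text ($xpath, './/*[contains(@class,"username")]', 'Alice');

$html = $doc->saveXML ($doc->documentElement);
echo $html;
```

The dummy “Bob” never reaches the output; it only exists so that the static page looks right in a browser.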

Behind the scenes `.logged-out` becomes the full XPath query `.//*[contains(@class,"logged-out")]`. The shorthand syntax also supports specifying a required element type and/or an attribute to target, e.g:

$template->setValue ('a.my-button@href', '/some_url');

You can also use full XPath syntax:

//if using HTTPS, change the Google search box to use HTTPS too
if (($_SERVER['HTTPS'] ?? '') === 'on') $template->setValue (
	'//form[@action="http://google.com/search"]/@action',
	'https://encrypted.google.com/search'
);

Looping is always a sore point in templating. How do you take a chunk and repeat it down the page without having to define a ton of logic in your templates?

Looping with the DOM is shockingly elegant!

$item = $template->repeat ('.list-item');
foreach ($data as $value) {
	$item->setValue ('.item-name', $value);
	$item->next ();
}

The `repeat` method takes an element (via shorthand/XPath) to be used as the repeating template and copies it, then you just `set` and `remove` elements from the repeating template as if it were its own template. Once you’ve templated that iteration you call the `next` method and the HTML is added after the previous element, then the template repeater resets itself back to the original HTML so you can template it again!
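If you are curious how such a repeater can be built, the following is a rough sketch using only PHP’s DOM extension. The class `SketchRepeater` and its `setText` method are invented for this example; the real repeater supports the full shorthand syntax and may well be organised differently:

```php
<?php
// Hypothetical repeater, for illustration only; not DOMTemplate's real class.
class SketchRepeater {
	private $template;  // pristine copy of the repeating element
	private $current;   // the copy being templated right now
	private $last;      // the most recently placed copy in the document

	public function __construct (DOMElement $element) {
		$this->template = $element->cloneNode (true);
		$this->current  = $element;  // the first pass works on the original, in place
		$this->last     = $element;
	}

	// Set the text of the first descendant carrying the given class
	public function setText (string $class, string $text): void {
		$xpath = new DOMXPath ($this->current->ownerDocument);
		$node  = $xpath->query ('.//*[contains(@class,"'.$class.'")]', $this->current)->item (0);
		if ($node !== null) {
			$node->nodeValue = '';
			$node->appendChild ($node->ownerDocument->createTextNode ($text));
		}
	}

	// Place the templated copy after the previous one, then reset to pristine
	public function next (): void {
		if ($this->current !== $this->last) {
			$this->last->parentNode->insertBefore ($this->current, $this->last->nextSibling);
		}
		$this->last    = $this->current;
		$this->current = $this->template->cloneNode (true);
	}
}

$doc = new DOMDocument ();
$doc->loadXML ('<ul><li class="list-item"><span class="item-name">Dummy</span></li></ul>');

$item = new SketchRepeater ($doc->getElementsByTagName ('li')->item (0));
foreach (['Apple', 'Banana', 'Cherry'] as $value) {
	$item->setText ('item-name', $value);
	$item->next ();
}

$html = $doc->saveXML ($doc->documentElement);
echo $html;
```

The key design point is that the pristine clone is taken before anything is changed, so every iteration starts from the same dummy markup.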

Once you’ve made all your changes to the template, just retrieve the final HTML and output.

die ($template);

See the API for details of all the functions.

The Code

View source code
View source code on GitHub

If you would like to see a real-world use of this templating system, with plenty of practical code to draw from, examine the source code of my forum system, NoNonsense Forum.

If you don’t like the idea of targeting classes or IDs in your HTML, have a look at v4 of DOMTemplate, which finds elements according to `data-template` attributes.

Caveats

Whitespace handling is good, but not perfect

In the case of repeating an element the whitespace within is kept, but the whitespace outside the element is not. This is not a major problem, it just means that the closing and opening tags of your lists will be paired (e.g. “…</li><li>…”).

The biggest issue is that when elements are removed, the whitespace around them remains, meaning that you get a number of blank lines in the output HTML where the elements used to be. There’s no direct way of handling this other than perhaps using a search/replace to remove blank lines in the HTML after it’s been templated.
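The search/replace could be as simple as the sketch below. This is my assumption of how you might post-process the output, not something DOMTemplate does for you, and the function name is made up. Be aware that it would also strip blank lines inside `<pre>` elements:

```php
<?php
// Hypothetical post-processing step, for illustration only.
// Removes whitespace-only lines left behind where elements were deleted.
function sketch_strip_blank_lines (string $html): string {
	// With the /m flag, ^ matches at the start of every line;
	// \h is horizontal whitespace (spaces, tabs) and \R is any line-break
	return preg_replace ('/^\h*\R/m', '', $html);
}

$html  = "<ul>\n\t<li>One</li>\n\t\n\n\t<li>Two</li>\n</ul>\n";
$clean = sketch_strip_blank_lines ($html);
echo $clean;
```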

One benefit of using the DOM, however, is that if you want to minify the HTML a little, you can just add “$this->DOMDocument->preserveWhiteSpace = false;” to the constructor function of DOMTemplate and the markup will be returned as a big blob with few line-breaks.

If you add “$this->DOMDocument->formatOutput = true;” instead, the markup will be ‘tidied’ for you, re-nesting the elements neatly in an easy to read fashion.
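Both settings are ordinary properties of PHP’s `DOMDocument`, so you can try them outside of DOMTemplate. A small sketch (the variable names are mine); note that `preserveWhiteSpace` has to be set before the markup is loaded, and `formatOutput` before it is saved:

```php
<?php
$source = "<ul>\n\t<li>One</li>\n\t<li>Two</li>\n</ul>";

// Compact: whitespace-only text nodes are dropped while parsing
$compact = new DOMDocument ();
$compact->preserveWhiteSpace = false;
$compact->loadXML ($source);
$compact_out = $compact->saveXML ($compact->documentElement);
echo $compact_out, "\n";

// Tidied: libxml re-indents on output. In my experience this works best
// when the whitespace-only nodes were dropped on load as well
$tidy = new DOMDocument ();
$tidy->preserveWhiteSpace = false;
$tidy->formatOutput = true;
$tidy->loadXML ($source);
$tidy_out = $tidy->saveXML ($tidy->documentElement);
echo $tidy_out, "\n";
```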

XML woes

DOMTemplate stores and manipulates the template internally as strict XML. Thankfully, since v16, DOMTemplate automatically converts your source HTML to XML on loading and converts from XML to HTML on output, thus alleviating most of the input-strictness problems with earlier versions. There are, however, still a few caveats to remember:

HTML must be valid.
The automatic conversion of HTML named-entities (invalid in XML) into Unicode is still not comprehensive. 248 of the most common are covered, but over 2100 exist in total. DOMTemplate may cover all 2100+ named entities in a future version (recent PHP versions appear to return the complete set now), but until then ensure that your HTML source does not use any named-entities outside of the 248 recognised by DOMTemplate.
HTML that you load, either through instantiation or by applying it to the template using `setValue`, must have only one root node. I.e. a list of elements cannot be used unless wrapped by an element.
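If you need to pre-process your own source to get around the entity limit, something along these lines might do. It is only a sketch resting on the assumption that PHP’s `html_entity_decode` with `ENT_HTML5` knows the entity in question; the function name is made up and this is not DOMTemplate’s own conversion code:

```php
<?php
// Hypothetical helper, for illustration only.
// Turns HTML named entities into literal UTF-8 characters, leaving alone the
// five entities that XML itself defines (&amp; &lt; &gt; &quot; &apos;).
function sketch_entities_to_unicode (string $html): string {
	return preg_replace_callback ('/&[A-Za-z][A-Za-z0-9]*;/', function (array $match): string {
		if (in_array ($match[0], ['&amp;', '&lt;', '&gt;', '&quot;', '&apos;'], true)) {
			return $match[0];
		}
		// Names PHP does not recognise come back unchanged
		return html_entity_decode ($match[0], ENT_QUOTES | ENT_HTML5, 'UTF-8');
	}, $html);
}

$xml = sketch_entities_to_unicode ('<p>Fish &amp; chips &mdash; &pound;5&nbsp;only &bogus;</p>');
echo $xml;
```

Anything left unconverted (such as the deliberately fake `&bogus;` above) would still be rejected by an XML parser, so it is worth checking the result.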

The API

Instantiation

Provide the HTML to load as a string when instantiating the template class. It must be valid and have only one root element (e.g. `<html>`).

$template = new DOMTemplate (file_get_contents ('index.html'));

If you are loading an XHTML document, or any XML file with a default namespace (e.g. `<html xmlns="http://www.w3.org/1999/xhtml">`), you must specify a prefix (any will do) and the namespace URL like so:

$template = new DOMTemplate (file_get_contents ('index.html'), 'html', 'http://www.w3.org/1999/xhtml');

All XPath queries you make with this template must prefix element names with the namespace, including for the shorthand:

$template->setValue ('//html:title',         'Hello World');            //XPath
$template->setValue ('html:a#my-button@href', 'http://google.co.uk');   //shorthand

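This bizarre requirement is a limitation in the design of XPath itself: XPath 1.0 has no notion of a default namespace, so an unprefixed name in a query only ever matches elements that are in no namespace.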
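You can see the underlying behaviour with PHP’s own `DOMXPath`, independent of DOMTemplate. In this small demonstration the prefix `html` is an arbitrary choice, exactly as it is above:

```php
<?php
$doc = new DOMDocument ();
$doc->loadXML (
	'<html xmlns="http://www.w3.org/1999/xhtml">'.
	'<head><title>Dummy</title></head><body/></html>'
);
$xpath = new DOMXPath ($doc);

// Without a prefix, the query looks for a <title> in no namespace at all
$unprefixed = $xpath->query ('//title')->length;

// Bind a prefix of our choosing to the namespace URL, then use it in the query
$xpath->registerNamespace ('html', 'http://www.w3.org/1999/xhtml');
$prefixed = $xpath->query ('//html:title')->length;

echo "unprefixed: $unprefixed, prefixed: $prefixed\n";
```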

Shorthand XPath Syntax

All of the methods that accept a query (`setValue`, `set`, `addClass`, `append`, `remove` & `repeat`) use a shorthand-syntax where you only need to provide the class (`.class`) or ID (`#id`) you want to target and the full XPath query is built for you. E.g. `.my-button`
An element type can be provided: `a#my-button`
An attribute name can be provided which will be the target of the `setValue`, `set` and `remove` methods: `a#my-button@href`
You can test attributes for values (the element will be selected, not the attribute): `label@for="submit"`
You can specify the index of an element to select: `li[1]`
You can select child elements: `#list/li/a`
You can also just use full XPath query, as-is: `/html/head/title`
You can provide multiple targets by separating the queries with commas, e.g: `.header, .body, .footer`
You can intermix shorthand and full XPath like this.
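To give a feel for what the shorthand expansion involves, here is a heavily simplified converter. It is a sketch under my own assumptions, covering only the element, class / ID and attribute parts of the list above (no indexes, child selectors, attribute tests or commas), and the function name is made up; DOMTemplate’s real conversion is more thorough:

```php
<?php
// Hypothetical, heavily simplified converter, for illustration only.
// Handles [element][.class | #id][@attribute]; anything else is passed through untouched.
function sketch_shorthand_to_xpath (string $query): string {
	$pattern = '/^([a-z][a-z0-9]*)?(?:([.#])([\w-]+))?(?:@([\w-]+))?$/i';
	if ($query === '' || !preg_match ($pattern, $query, $m)) {
		return $query;  // assume it is already full XPath
	}
	$element   = ($m[1] ?? '') !== '' ? $m[1] : '*';
	$sigil     = $m[2] ?? '';
	$name      = $m[3] ?? '';
	$attribute = $m[4] ?? '';

	$xpath = './/'.$element;
	if ($sigil === '#') $xpath .= '[@id="'.$name.'"]';
	// A plain substring test, matching the expanded form shown earlier in this article
	if ($sigil === '.') $xpath .= '[contains(@class,"'.$name.'")]';
	if ($attribute !== '') $xpath .= '/@'.$attribute;
	return $xpath;
}

foreach (['.my-button', 'a#my-button', 'a#my-button@href', '/html/head/title'] as $shorthand) {
	echo $shorthand, ' => ', sketch_shorthand_to_xpath ($shorthand), "\n";
}
```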

`(string)`

To get the HTML out of the template, cast the template class object to a string, e.g.:

$template = new DOMTemplate ('<span>test</span>');
echo $template;

In instances where the intended type is ambiguous, use PHP’s casting syntax to force a string conversion:

$html = (string) $template;
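The casting works because of PHP’s `__toString` magic method. A stripped-down sketch of the mechanism, using a made-up class `SketchTemplate` rather than DOMTemplate’s real code:

```php
<?php
// Hypothetical class, for illustration only; not DOMTemplate itself.
class SketchTemplate {
	private $doc;

	public function __construct (string $xml) {
		$this->doc = new DOMDocument ();
		$this->doc->loadXML ($xml);
	}

	// PHP calls this whenever the object is used where a string is expected
	public function __toString (): string {
		return $this->doc->saveXML ($this->doc->documentElement);
	}
}

$template = new SketchTemplate ('<span>test</span>');
echo $template, "\n";          // implicit conversion
$html = (string) $template;    // explicit conversion
```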

`repeat`

`repeat (string $query)`

Takes a shorthand XPath query and returns a `DOMTemplateRepeaterArray` object instantiated with the element(s) selected in the query. This object supports the `set`, `setValue`, `addClass`, `append` & `remove` methods, in addition to the following method:

`next`

Takes the current HTML content of the elements within the `DOMTemplateRepeaterArray` object and appends it as a sibling to the previously repeated template (i.e. either the element(s) you instantiated the repeater with, or the element(s) that were added by the previous call to the `next` method), then resets its HTML content back to the original HTML it had when it was created.

In simple terms, it adds the templated HTML to the end of a list and then resets it back to the original HTML, to be used again. In practical terms, like this:

$item = $template->repeat ('.list-item');
foreach ($data as $value) {
	$item->setValue ('.item-name', $value);
	$item->next ();
}

`setValue`

`setValue (string $query, string $value, [bool $asHTML=false])`

Replaces the content of all elements matched with the shorthand XPath query with the given value. The string value is HTML-encoded (unless you give `asHTML` as true), so any HTML in the value will appear as-is, rather than be rendered as HTML. This method intelligently sets the value to elements, attributes and classes according to the XPath used. See `addClass` for details on HTML class behaviour.

$template->setValue ('#name', 'Kroc');
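The difference the `asHTML` parameter makes can be illustrated with plain DOM calls. This is a sketch of the principle only; the helper name and its `$as_html` parameter are invented here and DOMTemplate’s own method may differ in the details:

```php
<?php
// Hypothetical helper, for illustration only; not DOMTemplate's real method.
function sketch_set_value (DOMElement $node, string $value, bool $as_html = false): void {
	// Empty the element first
	while ($node->firstChild !== null) {
		$node->removeChild ($node->firstChild);
	}
	if ($as_html) {
		// Parse the value as markup; it has to be well-formed XML
		$fragment = $node->ownerDocument->createDocumentFragment ();
		$fragment->appendXML ($value);
		$node->appendChild ($fragment);
	} else {
		// A text node is escaped on output, so tags in the value show up literally
		$node->appendChild ($node->ownerDocument->createTextNode ($value));
	}
}

$doc = new DOMDocument ();
$doc->loadXML ('<div><p id="a">dummy</p><p id="b">dummy</p></div>');
$paragraphs = $doc->getElementsByTagName ('p');

sketch_set_value ($paragraphs->item (0), '<b>Kroc</b>');
sketch_set_value ($paragraphs->item (1), '<b>Kroc</b>', true);

$html = $doc->saveXML ($doc->documentElement);
echo $html;
```

Encoding by default is the safer choice, since user-supplied text can never smuggle markup into the page unless you explicitly ask for it.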

`set`

`set (array $queries, [bool $asHTML=false])`

Allows you to write code in a more compact way by specifying an array of shorthand XPath queries and their associated value to set.

$template->set (array (
	'#name' => 'Kroc',
	'#site' => 'http://camendesign.com'
));

`addClass`

`addClass (string $query, string $class)`

Adds the specified HTML class name to every element matched with the shorthand XPath query. If an element already has a class attribute, multiple class names will be separated by spaces when the new class is added.

$template->addClass ('#section', 'open');
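The class handling described here amounts to treating the attribute as a space-separated list. A minimal sketch of that idea (the helper is hypothetical, and skipping a class that is already present is my own choice rather than documented behaviour):

```php
<?php
// Hypothetical helper, for illustration only; not DOMTemplate's real method.
function sketch_add_class (DOMElement $node, string $new_class): void {
	// getAttribute returns '' when there is no class attribute yet
	$classes = preg_split ('/\s+/', $node->getAttribute ('class'), -1, PREG_SPLIT_NO_EMPTY);
	if (!in_array ($new_class, $classes, true)) {
		$classes[] = $new_class;
	}
	$node->setAttribute ('class', implode (' ', $classes));
}

$doc = new DOMDocument ();
$doc->loadXML ('<div><section class="panel"/><section/></div>');
$sections = $doc->getElementsByTagName ('section');

sketch_add_class ($sections->item (0), 'open');
sketch_add_class ($sections->item (1), 'open');

$first  = $sections->item (0)->getAttribute ('class');
$second = $sections->item (1)->getAttribute ('class');
echo $first, "\n", $second, "\n";
```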

`append`

`append (string $query, string $content)`

Appends content to the end of the inside of any element(s) matched by the shorthand XPath query. E.g.:

<article>
	Stuff here
	⋮
	<== Append new content here
</article>

`remove`

`remove (string $query | array $queries)`

Deletes all the elements (and their children) matched with the shorthand XPath query.

$template->remove ('.secret-stuff');

Also accepts an array in the format of `'xpath' => true|false`.
If the value is false, the XPath will be skipped. This allows you to write compact removal code by not having to write `if (x) $template->remove ('y');` several times in a row, e.g:

$template->remove (array (
	'.section-1' => $section == 1,
	'.section-2' => $section == 2,
	⋮
));

For a good example of this style of writing, see the code for NoNonsense Forum.

In addition to this behaviour, you can also remove individual class names from a class attribute whilst retaining any others present. Target the class attribute with the XPath and give the class name to remove as the value, thusly:

$template->remove (array ('a@class' => 'undesired'));
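Underneath, removing one class name is a matter of filtering the space-separated list. Here is a sketch of the principle with plain DOM calls; the helper name is made up, and dropping the attribute entirely once it is empty is my own choice rather than confirmed DOMTemplate behaviour:

```php
<?php
// Hypothetical helper, for illustration only; not DOMTemplate's real method.
function sketch_remove_class (DOMElement $node, string $old_class): void {
	$classes = preg_split ('/\s+/', $node->getAttribute ('class'), -1, PREG_SPLIT_NO_EMPTY);
	$classes = array_diff ($classes, [$old_class]);
	if (count ($classes) > 0) {
		$node->setAttribute ('class', implode (' ', $classes));
	} else {
		// Assumption for this sketch: no classes left, so no attribute
		$node->removeAttribute ('class');
	}
}

$doc = new DOMDocument ();
$doc->loadXML ('<p><a class="button undesired big">One</a><a class="undesired">Two</a></p>');
foreach ($doc->getElementsByTagName ('a') as $a) {
	sketch_remove_class ($a, 'undesired');
}

$html = $doc->saveXML ($doc->documentElement);
echo $html;
```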

History

v20
- Switch to PHP7 as a minimum requirement, use a namespace
- Added `append` method
- Fixes for empty elements when converting from XML to HTML,
  and a regression in `repeat`, with thanks to Mauskin
v19 Add more void elements
v18 Three community bug fixes:
- Eric Desbiens (olace): Adding class to an element that already had a class would fail
- Zegnat: iframes should not self-close
- Peter: Typo with `$this::XML` should be `$this->XML`
v17 Fixed regex bug where the same letter either side of an equals sign was being removed
v16 Filtering of HTML on input and output, removing the strict-XML requirement for source text. The `html` method was removed in favour of casting the class to a String
v15 Throw an exception for invalid XPath queries or HTML
v14 XPaths are cached for speed
v13 Multiple XMLNS support
v12 Ability to remove classNames using `remove` method
v11 Changed instantiation to use a string instead of a filename
v10 `repeat` now works simultaneously with multiple elements instead of just one
v9 Greatly improved shorthand XPath syntax adding index matching, child matching & attribute testing
v8 Changed `setValue` to intelligently apply to elements, attributes or classes, with a parameter to include HTML as-is (`setHTML` was removed)
v7 XML prolog is kept if already present and UTF-8 characters are no longer hex-encoded
v6 XML namespace support. Also, template repeating now appends as a sibling, not as the last child of the parent (removes the need for a superfluous parent element).
v5 New shorthand XPath syntax for classes and IDs instead of `data-template` attributes
v4 Added multiple XPath targets
v3 Added method chaining
v2 Added HTML entity decoding
v1 Initial release