Camen Design

remarkable 49.5 KB

ReMarkable!

ReMarkable is my own Markdown-like syntax, used to write and publish the content on this site.
It is a plain-text method of writing articles that is then converted into HTML.

To skip the talk and see it in use, you can add “.rem” to the end of any article on this site (or click the “rem” link at the bottom of each page) to see the original ReMarkable source for the article. For example: “/code/remarkable.rem”.

Version History

v0.4.5—14th January ’12
  • Incorrect text-escaping in e-mode regexes caused “$” to break text in table of contents (essentially some text was being interpreted as PHP code)
v0.4.4–9th February ’11
  • Applied some changes made by Zengnat
  • Stop throwing a Notice when no title is specified for a link
  • Fixes two more Notices that apply to e-mail links
  • Add support for the HTML5 hgroup element (this is paragraph processing, not syntax support)
v0.4.3—5th October ’10
  • Large update to documentation, including how to mix HTML and ReMarkable syntax
  • Changed command line to accept source text from stdin. This will break existing code; see source for examples
  • Added options parameter for output preferences (NOXHTML / TABSPACE_2 / TABSPACE_4),
    see documentation. This will expand over time.
  • Titles allowed on links “<description (/href) "title">”
v0.4.2—22nd August ’10
  • Spelling mistake with <figcaption>
  • A hyperlink as the only thing on a line should not be wrapped in a paragraph
    (was supposed to be doing this since 0.4 but was broken)
v0.4.1—15th August ’10
v0.4—12th August ’10
  • Shaved 38 lines off the code (was about 100 but added comments and tidy formatting)
  • Added Quick Reference Guide
  • Anything allowed in the bookmark portion of hyperlink URLs
    (this is being abused by more and more sites now)
  • Changed en-dash syntax to “ -to- ” (requires spaces either side) because it was clashing too much with common writing like “up-to-date” &c.
  • If a hyperlink is the only thing on a line, it does not get wrapped in a paragraph
    (useful for block images / figures)
  • Added image linking “<"alt" thumb.jpg = image.png>”
  • Syntax for HTML5 <figure>
  • Don’t wrongly exit if calling a PHP script from the command line that then includes ReMarkable
  • Do not warn if “$_SERVER['argc']” is not present (thanks Andrew Rowls)
  • Title casing now capitalises correctly after em/en dash
  • DTs now support IDs. Use “:: (#id) …”
  • LIs now support IDs. Use “• (#id)” before the tab character (any bullet type allowed). You can also indent the following lines further to account for an ID pushing the margin to two tabs or more
  • Added “¾” to autocorrection
  • Ellipses converted to unicode “...” → “…”
  • Accented letters title-casing properly in headings
  • Blank line between DT & DD no longer causes infinite loop
  • Syntax language in pre fence can now contain numbers, e.g. “VB6”

For previous history, see the changelog

Why ReMarkable?

I have a lot of HTML. In writing and tweaking the content on this site, I found I was getting bogged down in a lot of HTML tags for relatively minor things, such as links, abbreviations and citations.

Though I love writing in HTML I wanted to reduce this complexity and focus more on the content than the markup.

The first place I turned to was Markdown, probably the most widely known plain-text formatter. John Gruber wrote Markdown for his site, daringfireball.net and to suit his way of writing articles.

There were a number of shortcomings with Markdown when placed against my site:

Working around these concerns could have been possible by using another Markdown clone that would be easier for me to customise. A project called PHPMarkdown implements a Markdown parser in PHP. It also extends the syntax with additions like abbreviations and definition lists.

There’s one big problem with PHPMarkdown though. Its size. PHPMarkdown is almost 3’000 lines long. To include PHPMarkdown in my site would be like strapping an elephant to a flea and asking it to jump a canyon.

Really then, what I knew I had to do was to write my own Markdown clone, in my own particular demeanour. I could cherry-pick the best syntax I wanted but ultimately make it as artful as Camen Design is itself.

Features

At a glance, compared to Markdown, ReMarkable has:

More Inline Syntax:
ReMarkable adds syntax for {abbr|abbreviations}, {{dfn|definitions}}, ~citations~, ((side text)), [insertions], ---deletions--- and «inline quotations».
Definition lists:

The PHPMarkdown syntax doesn’t allow for optional descriptions, whereas the HTML spec, and ReMarkable, do.

:: Definition Term
	Description…

:: Definition Term 2

:: Definition Term 3
	Description…

You can see this structure put to good use in this article.

IDs in Headings:

An idea taken from PHPMarkdown for the benefit of writing a table of contents or others linking to specific parts of an article, headings can have an HTML ID like so:

# Heading Level 1 # (#id)

Heading Level 2 (#id2)
======================
Heading Level 3 (#id3)
----------------------
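To illustrate how an ID suffix like the ones above could be separated from the heading text, here is a minimal sketch in Python. It is not ReMarkable’s own PHP; the function name `split_heading_id` and the regex are assumptions made up for this example, and it only looks at a single heading line (not the underline that follows level 2 and 3 headings).

```python
import re

# Hypothetical pattern: an optional "(#id)" suffix at the end of a heading line.
HEADING_ID = re.compile(r"^(?P<text>.*?)\s*(?:\(#(?P<id>[^)\s]+)\))?\s*$")

def split_heading_id(line):
    """Return (heading text, id or None) for one heading line."""
    m = HEADING_ID.match(line)
    # Drop the surrounding "#" markers of the "# Heading #" form, if present.
    text = m.group("text").strip().strip("#").strip()
    return text, m.group("id")
```

Calling `split_heading_id("Heading Level 2 (#id2)")` gives the heading text and the ID as separate values, and a heading with no ID suffix simply yields `None` for the ID.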

Automatic Title Casing of Headings:
Using my PHP port of David Gouch’s toTitleCase, ReMarkable automatically converts headings to correct Title Case.

Automatic Table of Contents:
Using the special marker “&__TOC__;”, ReMarkable will generate a table of contents listing every heading after the marker that has an ID.
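
The idea can be sketched roughly as follows, in Python rather than ReMarkable’s PHP. Everything here is an assumption for illustration: the function name `build_toc`, the flat (un-nested) output list, and the restriction to single-line “# Text # (#id)” headings. The real implementation may differ on all of these.

```python
import html
import re

TOC_MARKER = "&__TOC__;"
# Assumed heading form for this sketch: "# Text # (#id)".
HEADING = re.compile(r"^(#+)\s*(.*?)\s*#*\s*\(#([^)\s]+)\)\s*$")

def build_toc(source):
    """Return an HTML list linking to every ID'd heading after the marker."""
    _, found, rest = source.partition(TOC_MARKER)
    if not found:
        return ""  # no marker, no table of contents
    items = []
    for line in rest.splitlines():
        m = HEADING.match(line)
        if m:  # headings without an ID never match, so they are skipped
            items.append('\t<li><a href="#%s">%s</a></li>'
                         % (m.group(3), html.escape(m.group(2))))
    if not items:
        return ""
    return "<ol>\n" + "\n".join(items) + "\n</ol>"
```

Headings that appear before the marker, and headings without an ID, are left out of the list.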

Links with absolute URLs are automatically marked up with rel="external".

Hyperlinks directly to a file have the type="mime/type" attribute added automatically for the most common file types. ReMarkable keeps an easy-to-modify internal list of the file types it recognises.

The benefit of this is that a) you should be doing this anyway, and b) you can use CSS mime-type icons like this.
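
A minimal sketch of both rules, in Python rather than ReMarkable’s PHP: the function name `link_attributes` and the four-entry `MIME_TYPES` table are assumptions for the example, not ReMarkable’s actual list, which lives in its source.

```python
import posixpath
from urllib.parse import urlparse

# Illustrative subset only (an assumption, not ReMarkable's real list).
MIME_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".pdf": "application/pdf",
    ".zip": "application/zip",
}

def link_attributes(href):
    """Return the extra attributes a link to `href` might receive."""
    parts = urlparse(href)
    attrs = {}
    if parts.scheme and parts.netloc:  # absolute URL, e.g. http://example.com/
        attrs["rel"] = "external"
    ext = posixpath.splitext(parts.path)[1].lower()
    if ext in MIME_TYPES:  # unknown extensions get no type attribute
        attrs["type"] = MIME_TYPES[ext]
    return attrs
```

A site-relative link to a page, such as “/code/remarkable”, gets neither attribute; an absolute link to a recognised file type gets both.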

Intelligent List Paragraphing:

ReMarkable intelligently adds <p> tags to <li> items.
A normal list:

•	Item 1
•	Item 2
•	Item 3

Produces:

<ul>
	<li>Item 1</li>
	<li>Item 2</li>
	<li>Item 3</li>
</ul>

If any list item contains more than one paragraph, or any other block such as another list or blockquote, paragraphs are automatically added.

•	Item 1
•	Lorem ipsum dolor sit amet, consectetur adipisicing elit.
	
	sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
•	Item 3

Produces:

<ul>
	<li>Item 1</li>
	<li>
		<p>
			Lorem ipsum dolor sit amet, consectetur adipisicing elit.
		</p><p>
			sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
		</p>
	</li>
	<li>Item 3</li>
</ul>

But, if you put blank lines between the list items:

•	Item 1

•	Item 2

•	Item 3

ReMarkable adds paragraph tags to all list items:

<ul>
	<li><p>Item 1</p></li>
	<li><p>Item 2</p></li>
	<li><p>Item 3</p></li>
</ul>

Lastly, if there is no blank line between a list item’s opening text and a list nested inside it, ReMarkable does not wrap the opening text in a paragraph. Why? For a table of contents list:

1.	Features
	1.1.	More Inline Syntax:
	1.2.	Definition Lists:
	…

Produces:

<ol>
	<li>
		Features
		<ol>
			<li>More Inline Syntax:</li>
			<li>Definition Lists:</li>
		</ol>
		…
	</li>
</ol>

Notice that “Features” is not wrapped in a paragraph.
ReMarkable handles all of these conversion cases using only one regex replace.
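
The decision rule behind the first three cases can be written out long-hand. This is a Python sketch of the rule only, not of ReMarkable’s single-regex approach; the function `render_list` and its already-parsed input are assumptions for the example, nested lists are not handled, and the output is more compact than ReMarkable’s wrapped HTML.

```python
def render_list(items, loose=False):
    """Render a bullet list.

    items: list of items, each a non-empty list of paragraph strings.
    loose: True when the source had blank lines between the items.
    """
    out = ["<ul>"]
    for paragraphs in items:
        if loose or len(paragraphs) > 1:
            # Several blocks in one item, or a spaced-out list: wrap in <p>.
            body = "".join("<p>%s</p>" % p for p in paragraphs)
        else:
            # A plain single-line item stays bare.
            body = paragraphs[0]
        out.append("\t<li>%s</li>" % body)
    out.append("</ul>")
    return "\n".join(out)
```

Note that only the item with more than one paragraph gains `<p>` tags in a tight list, while `loose=True` wraps every item.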

Human Readable Output:

In order for ReMarkable to be acceptable, it had to replace my hand-written HTML.
ReMarkable outputs clean and organised HTML and does perfect word wrapping so that when you view-source you don’t have to scroll sideways.

ReMarkable can also indent the whole output to your liking, so you can fit it into your blog template. <pre> blocks are intelligently unindented so that your code samples don’t break trying to fit ReMarkable output into your site design.
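
One way such indenting could work is sketched below in Python. This is an assumption about the general technique, not a port of ReMarkable’s PHP: the function name `indent_html` is made up, and the `<pre>` detection is a simple substring check that would not cope with nested or unusual markup.

```python
def indent_html(markup, prefix="\t"):
    """Indent every line of `markup` except lines continuing a <pre> block."""
    out = []
    in_pre = False
    for line in markup.split("\n"):
        # Whitespace inside <pre> is significant, so those lines are left
        # alone; empty lines are also skipped to avoid trailing whitespace.
        out.append(line if in_pre or not line else prefix + line)
        if "<pre" in line:
            in_pre = True
        if "</pre>" in line:
            in_pre = False
    return "\n".join(out)
```

The line that opens a `<pre>` block is still indented, since any whitespace before the tag is outside the block; only the lines that follow, up to and including the one holding `</pre>`, are passed through unchanged.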

Remarkable Code

Whilst not on exact feature parity with PHPMarkdown, ReMarkable achieves its design with very compact code.

PHPMarkdown is spread across 120 functions composing two classes.
ReMarkable is nearly 600 lines long in just one function (and heavily commented).

Lists, definition lists and blockquotes are recursively converted (including all the conversion cases mentioned above) in two lines of code. Needless to say, the regex kills a kitten each time you run ReMarkable.

ReMarkable’s code has been a real labour of love. I have tried to make sure it is well commented, but the tricks being used to reduce the amount of looping, and to achieve all that it does within one function can be difficult to understand. A future Under the Hood article will do a more detailed breakdown of the PHP used.

What’s Next?

The goal with ReMarkable was to be able to publish my entire site’s contents. That has been achieved.

Now ReMarkable is in your hands. I’m sure you’ll find a million bugs I never noticed. I don’t write articles the same way you do. Send bugs and suggestions to kroc@camendesign.com.

What’s Planned

There are a number of shortcomings in ReMarkable that I’m aware of and will be tackling at various points in the future (as Camen Design’s needs demand, really).

Suggestions for features are always welcome. That said, ReMarkable was never designed to be all things to all people. In accepting suggestions and fixes into my version of the code, these are ReMarkable’s priorities:

HTML5, UTF-8. No classes
If you need every HTML tag to have a class or ID you need to learn how to write better code.
Sloppy writing is not a fringe case

ReMarkable will never allow escaping of characters. That’s a cop-out to avoid properly designing the syntax. Any weird text you’re writing that triggers ReMarkable syntax is more than likely a computer term or technical quote and should be wrapped in `code` spans.

If the bug can be solved by following the syntax documentation correctly, or by changing two or three letters in your source, then it’s not a bug.

Code is Art
If I can’t maintain elegance and beauty in the code then it can’t be worth it.

Enjoy.