camen design

ReMarkable!

ReMarkable is my own Markdown-like syntax, used to write and publish the content on this site.
It is a plain-text method of writing articles that is then converted into HTML.

To skip the talk and see it in use, you can add “.rem” to the end of any article on this site to see the original ReMarkable source for the article. For example: “/code/remarkable.rem”.

Version History

v0.3.2—17th June ’09
  • Default “base_path” changed to “.”
  • Fixed segfault with PHP, memory overflow in the regex for PRE blocks
  • Corrected warnings when PHP “error_reporting” is switched on
  • Unicode multiplication symbol “×” for “4x4”, “Increased 3×!” or “6ft × 10ft”
  • New line-break convention supported: ‘space-underscore’ at the end of a line,
    e.g. “The quick brown fox _”
  • “`...`” now allows double-backtick inside and “ ```` ” gives “<samp>``</samp>”
  • “`...`” is now <samp>, and “``...``” is for <code>
v0.3.1—20th April ’09
  • Superscript was failing on “10^th” as the unicode quotes were confusing the regex
  • Hyperlinks / images now support “//” shorthand for “http://”
    (see meiert.com/en/blog/20090218/performance-and-rfc-2396/)
  • Markup for images, base_path parameter, see documentation
  • En-dash “-to-” was being replaced in heading ID fragment
  • No-follow marker in hyperlinks was not actually working!
v0.3.0—24th March ’09
  • Auto-correction: converts a number of ASCII conventions to nicer unicode / HTML. See documentation.
  • Important: deletion markup has changed to “---...---”, to make way for “--” as em-dash.
  • ReMarkable can be used from the command line
  • Asterisk bullet points were being confused with bold (e.g. “* *text...*”)
  • Important: Pre-fence syntax has changed to “~~~>” to distinguish opening / closing fence posts in nested fences
  • Fixed slashes being added to apostrophes, e.g. “Don\\\'t”
  • Important: Changed no-follow marker from “!” to “^” because “!” could be a valid relative URL fragment
  • Placeholder tags were being padded 1 extra because “¡” is 2 bytes, not one
  • No longer need to have blank line after a pre-fence inside a pre-fence
  • Placeholder tags were incorrectly being restored (use array_reverse instead of arsort)
  • Increased list of block-level elements to not wrap with paragraphs

For previous history, see the changelog

Why ReMarkable?

I have a lot of HTML. In writing and tweaking the content on this site, I found I was getting bogged down in a lot of HTML tags for relatively minor things, such as links, abbreviations and citations.

Though I love writing in HTML, I wanted to reduce this complexity and focus more on the content than the markup.

The first place I turned to was Markdown, probably the most widely known plain-text formatter. John Gruber wrote Markdown for his site, daringfireball.net and to suit his way of writing articles.

There were a number of shortcomings with Markdown when placed against my site:

Working around these concerns of mine could have been possible by using another Markdown clone that would be easier for me to customise. A project called PHPMarkdown implements a Markdown parser in PHP. It also adds to the syntax, providing additions like abbreviations and definition lists.

There’s one big problem with PHPMarkdown though. Its size. My site’s PHP is almost 600 lines long. PHPMarkdown is almost 3’000 lines long. To include PHPMarkdown in my site would be like strapping an elephant to a flea and asking it to jump a canyon.

Really then, what I knew I had to do was to write my own Markdown clone, in my own particular demeanour. I could cherry-pick the best syntax I wanted, but ultimately make it as artful as Camen Design is itself.

Features

At a glance, compared to Markdown, ReMarkable has:

ⅰ. More Inline Syntax:

ReMarkable adds syntax for {abbr|abbreviations}, ((small text)), ~citations~, [insertions], ---deletions--- and «inline quotations».
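
As a rough illustration of how inline markers like these can be turned into HTML, here is a minimal sketch in Python (not ReMarkable’s own PHP). The HTML tag chosen for each marker is an assumption inferred from the feature names above, the function name `convert_inline` is hypothetical, and the abbreviation syntax is left out because its exact form is not described here.

```python
import re

# Each pair is (pattern, replacement). The HTML tag chosen for each marker is
# an assumption inferred from the feature names, not ReMarkable's real rules.
INLINE_RULES = [
    (re.compile(r"\(\((.+?)\)\)"), r"<small>\1</small>"),  # ((small text))
    (re.compile(r"---(.+?)---"), r"<del>\1</del>"),        # ---deletions---
    (re.compile(r"~(.+?)~"), r"<cite>\1</cite>"),          # ~citations~
    (re.compile(r"\[(.+?)\]"), r"<ins>\1</ins>"),          # [insertions]
    (re.compile(r"«(.+?)»"), r"<q>\1</q>"),                # «inline quotations»
]

def convert_inline(text: str) -> str:
    """Apply every inline rule in order and return the resulting HTML."""
    for pattern, replacement in INLINE_RULES:
        text = pattern.sub(replacement, text)
    return text
```

Calling `convert_inline("~A Book~ was [added]")` would wrap the first phrase in a citation tag and the second in an insertion tag.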

ⅱ. Definition Lists:

The PHPMarkdown syntax doesn’t allow for optional descriptions, whereas the HTML spec and ReMarkable do.

:: Definition Term
	Description…
	
:: Definition Term 2

:: Definition Term 3
	Description…

You can see this structure put to good use in this article.
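
The definition list syntax above is simple enough to sketch a converter for. This is a Python illustration only, not ReMarkable’s PHP: the function name `definition_list` is hypothetical, and it assumes each tab-indented line is its own description, which is a simplification.

```python
from html import escape

def definition_list(source: str) -> str:
    """Convert ':: Term' lines plus tab-indented descriptions into a <dl>."""
    out = ["<dl>"]
    for line in source.splitlines():
        if line.startswith(":: "):
            out.append(f"\t<dt>{escape(line[3:].strip())}</dt>")
        elif line.startswith("\t") and line.strip():
            # Assumption: every indented line is a separate description.
            out.append(f"\t<dd>{escape(line.strip())}</dd>")
        # Blank lines only separate entries, so they are skipped.
    out.append("</dl>")
    return "\n".join(out)
```

A term with no indented line after it simply produces a `<dt>` with no `<dd>`, which is the optional-description case.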

ⅲ. IDs in Headings:

An idea taken from PHPMarkdown: for the benefit of writing a table of contents, or of others linking to specific parts of an article, headings can have an HTML ID like so:

# Heading Level 1 # (#id)

Heading Level 2 (#id2)
======================
Heading Level 3 (#id3)
----------------------
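
One way to pull the ID out of a heading line is a single regular expression. The following is a Python sketch under assumptions, not ReMarkable’s implementation: the name `heading_html` is hypothetical, the heading level is passed in rather than detected from the underline, and IDs are assumed to contain only letters, digits, underscores and hyphens.

```python
import re

# A heading may end with "(#some-id)"; the allowed ID characters are assumed.
HEADING_ID = re.compile(r"^(?P<title>.*?)\s*\(#(?P<id>[\w-]+)\)\s*$")

def heading_html(text: str, level: int) -> str:
    """Render one heading, moving a trailing (#id) into an id attribute."""
    match = HEADING_ID.match(text.strip())
    title, ident = (match["title"], match["id"]) if match else (text, None)
    title = title.strip("# ")  # drop the "#" markers around level-1 headings
    attr = f' id="{ident}"' if ident else ""
    return f"<h{level}{attr}>{title}</h{level}>"
```

A heading without an ID falls through unchanged apart from the surrounding tag.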

ⅳ. Automatic Title Casing of Headings

Using my PHP port of David Gouch’s toTitleCase, headings are automatically and correctly Title Cased by ReMarkable.

ⅴ. Automatic Table of Contents

Using the special marker &__TOC__;, ReMarkable will generate a table of contents for each heading after the marker that has an ID.
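
A minimal Python sketch of that behaviour follows. It is an illustration under assumptions rather than ReMarkable’s code: the name `table_of_contents` is hypothetical, any line ending in “(#id)” is treated as a heading, and the result is a flat list with no nesting by heading level.

```python
import re

TOC_MARKER = "&__TOC__;"
# Assumption: any line that ends with "(#id)" is a heading with an ID.
HEADING_WITH_ID = re.compile(
    r"^#*[ \t]*(.*?)[ \t#]*\(#([\w-]+)\)[ \t]*$", re.MULTILINE
)

def table_of_contents(source: str) -> str:
    """Build a flat list of links to every ID'd heading after the marker."""
    _, found, rest = source.partition(TOC_MARKER)
    if not found:
        return ""  # no marker, so no table of contents
    items = [
        f'\t<li><a href="#{ident}">{title}</a></li>'
        for title, ident in HEADING_WITH_ID.findall(rest)
    ]
    return "<ol>\n" + "\n".join(items) + "\n</ol>"
```

Headings before the marker, and headings without an ID, are left out of the list.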

Links with absolute URLs are automatically marked up with rel="external".

Hyperlinks directly to a file have the type="mime/type" attribute added automatically for the most common file types. ReMarkable has an easy-to-modify internal list of the types it recognises automatically.

The benefit of this is that a) you should be doing this anyway, and b) you can use CSS mime-type icons like this.
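
The two link rules above could be sketched like this in Python. Everything named here is hypothetical: the function `link_attributes`, the small `MIME_TYPES` table (a stand-in for ReMarkable’s internal list, whose real contents are not shown in this article), and the assumption that the “//” shorthand counts as an absolute URL.

```python
import posixpath
from urllib.parse import urlparse

# A small, easy-to-extend table of file extensions; these entries are examples.
MIME_TYPES = {
    ".pdf": "application/pdf",
    ".zip": "application/zip",
    ".png": "image/png",
    ".txt": "text/plain",
}

def link_attributes(href: str) -> str:
    """Return an opening <a> tag with rel/type attributes added as needed."""
    parts = urlparse(href)
    attrs = [f'href="{href}"']
    # "//host/path" shorthand has no scheme but does have a network location.
    if parts.scheme or parts.netloc:
        attrs.append('rel="external"')
    extension = posixpath.splitext(parts.path)[1].lower()
    if extension in MIME_TYPES:
        attrs.append(f'type="{MIME_TYPES[extension]}"')
    return "<a " + " ".join(attrs) + ">"
```

With the type attribute in place, a CSS attribute selector such as `a[type="application/pdf"]` can attach an icon to matching links.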

ⅶ. Intelligent List Paragraphing:

ReMarkable intelligently adds <p> tags to <li> items.
A normal list:

•	Item 1
•	Item 2
•	Item 3

Produces:

<ul>
	<li>Item 1</li>
	<li>Item 2</li>
	<li>Item 3</li>
</ul>

If any list item contains more than one paragraph, or any other block such as another list or blockquote, paragraphs are automatically added.

•	Item 1
•	Lorem ipsum dolor sit amet, consectetur adipisicing elit.
	
	sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
•	Item 3

Produces:

<ul>
	<li>Item 1</li>
	<li>
		<p>
			Lorem ipsum dolor sit amet, consectetur adipisicing elit.
		</p><p>
			sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
		</p>
	</li>
	<li>Item 3</li>
</ul>

But, if you put blank lines between the list items:

•	Item 1

•	Item 2

•	Item 3

ReMarkable adds paragraph tags for all list items:

<ul>
	<li><p>Item 1</p></li>
	<li><p>Item 2</p></li>
	<li><p>Item 3</p></li>
</ul>

Lastly, if a list item has no blank line between its opening text and a list nested inside it, ReMarkable does not wrap that opening text in a paragraph. Why? For a table of contents list:

1.	Features
	1.1.	More Inline Syntax:
	1.2.	Definition Lists:

Produces:

<ol>
	<li>
		Features
		<ol>
			<li>More Inline Syntax:</li>
			<li>Definition Lists:</li>
		</ol>

	</li>
</ol>

Notice that “Features” is not wrapped in a paragraph.
ReMarkable does all of these conversion cases using only one regex replace.
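
To make the first three cases concrete, here is a small Python sketch. It deliberately uses a plain loop instead of the single regex replace described above, it does not handle nested lists, and it assumes the “•” plus tab bullet shown in the examples. The name `render_list` is hypothetical.

```python
import re

def render_list(source: str) -> str:
    """Render a '•<tab>' bullet list, adding <p> tags only where needed."""
    # A blank line directly before a bullet makes the whole list "loose".
    loose = re.search(r"\n[ \t]*\n•", source) is not None
    html = ["<ul>"]
    # Split at each bullet; the first chunk (text before any bullet) is dropped.
    for chunk in re.split(r"^•\t", source, flags=re.MULTILINE)[1:]:
        # Paragraphs within an item are separated by blank (or tab-only) lines.
        paragraphs = [
            " ".join(part.split())
            for part in re.split(r"\n[ \t]*\n", chunk)
            if part.strip()
        ]
        if loose or len(paragraphs) > 1:
            body = "".join(f"<p>{p}</p>" for p in paragraphs)
        else:
            body = "".join(paragraphs)  # a single short item stays bare
        html.append(f"\t<li>{body}</li>")
    html.append("</ul>")
    return "\n".join(html)
```

The output is more compact than ReMarkable’s (paragraphs are not placed on their own lines), but the decision about when to add `<p>` follows the rules described above.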

ⅷ. Human Readable Output:

In order for ReMarkable to be acceptable, it had to replace my hand-written HTML.
ReMarkable outputs clean and organised HTML and does perfect word wrapping so that when you view-source you don’t have to scroll sideways.

ReMarkable can also indent the whole output to your liking, so you can fit it into your blog template. <pre> blocks are intelligently unindented so that your code samples don’t break trying to fit ReMarkable output into your site design.
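
The idea of indenting everything except the contents of <pre> blocks can be sketched in a few lines of Python. This is an illustration, not ReMarkable’s PHP: the function name `indent_html` and its `prefix` parameter are hypothetical, and it assumes <pre> blocks are not nested.

```python
def indent_html(html: str, prefix: str = "\t") -> str:
    """Indent every line except those inside <pre>...</pre> blocks."""
    out, in_pre = [], False
    for line in html.splitlines():
        # Lines after an opening <pre> stay flush left so code is untouched;
        # empty lines are left empty rather than padded with whitespace.
        out.append(line if in_pre or not line else prefix + line)
        if "<pre" in line:
            in_pre = True
        if "</pre>" in line:
            in_pre = False
    return "\n".join(out)
```

The line holding the opening <pre> tag is still indented, since whitespace before the tag is not part of the preformatted text.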

Remarkable Code

Whilst not on exact feature parity with PHPMarkdown, ReMarkable achieves its design with very compact code.

PHPMarkdown is spread across 120 functions composing two classes.
ReMarkable is nearly 400 lines long in just one function.

Lists, definition lists and blockquotes are recursively converted (including all the conversion cases mentioned above) in two lines of code. Needless to say, the regex kills a kitten each time you run ReMarkable.

ReMarkable’s code has been a real labour of love. I have tried to make sure it is well commented, but the tricks being used to reduce the amount of looping, and to achieve all that it does within one function can be difficult to understand. A future Under the Hood article will do a more detailed breakdown of the PHP used.

What’s Next?

The goal with ReMarkable was to be able to publish my entire site’s contents. That has been achieved.

Now ReMarkable is in your hands. I’m sure you’ll find a million bugs I never noticed. I don’t write articles the same way you do. Send bugs and suggestions to kroccamen@gmail.com.

What’s Planned

There are a number of shortcomings of ReMarkable that I’m aware of and will be tackling at various stages of the future (as Camen Design’s needs demand, really).

Any suggestions for features are always welcome.


Priorities

That said however, ReMarkable was not designed to ever be all things to all people. In accepting suggestions and fixes into my version of the code, these are ReMarkable’s priorities:

HTML5, UTF-8. No classes
If you need every HTML tag to have a class or ID, you need to learn how to write better code.
ReMarkable doesn’t output any HTML5-specific tags, so it’s safe to use in HTML4.

Sloppy writing is not a fringe case
ReMarkable will never allow escaping of characters. That’s a cop out to avoid properly designing the syntax.
Any weird text you’re writing that’s firing off ReMarkable syntax is more than likely a computer term or technical quote and should be wrapped in `code` spans.

For example: “I copied the ~luser folder to ~root” would be recognised as a citation between the two tildes. Regardless of any literal readability, these folder names (including the tilde) are not part of the English language. Use “I copied the `~luser` folder to `~root`” instead.

If the bug can be solved by following the syntax documentation correctly, or by changing two or three letters in your source, then it’s not a bug.
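
The tilde example can be reproduced with a short Python sketch that protects code spans with placeholders before applying the citation rule. The changelog mentions placeholder tags, but the details here are assumptions: the function name `convert`, the NUL-delimited placeholder format, and the <cite> tag are all hypothetical choices for illustration. The <samp> tag follows the v0.3.2 changelog note.

```python
import re

def convert(text: str) -> str:
    """Convert ~citations~ while leaving `code` spans untouched."""
    spans = []

    def stash(match: re.Match) -> str:
        # Swap each `code` span for a numbered placeholder so later rules skip it.
        spans.append(match.group(1))
        return f"\x00{len(spans) - 1}\x00"

    text = re.sub(r"`(.+?)`", stash, text)
    text = re.sub(r"~(.+?)~", r"<cite>\1</cite>", text)  # ~citation~
    # Put the protected spans back, now wrapped in their own tag.
    return re.sub(
        r"\x00(\d+)\x00",
        lambda m: f"<samp>{spans[int(m.group(1))]}</samp>",
        text,
    )
```

Without backticks the two tildes pair up as a citation; with them, each folder name survives intact.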

Code is Art
If I can’t maintain elegance and beauty in the code then it can’t be worth it.

Enjoy.