Improved Title Case Function for PHP
Update 2: The script is failing the test suite. I’ll have to update again once I’ve solved this, but it looks tricky: PHP and Unicode issues.
Update: Fixed a couple of major problems:
1. HTML entities were not being removed before processing
2. Made a typo when porting from JavaScript that caused some failures
John Gruber originally published his script to Title Case text, which works around the fringe cases.
A number of ports of the script followed. Particularly noteworthy is David Gouch’s JavaScript port, which is smaller, simpler and handles more fringe cases.
I’ve ported this to PHP and put it to use on this site. My version is based on David Gouch’s JavaScript port, unlike the WordPress port which is, frankly, crap.
Code below.
//original Title Case script © John Gruber <daringfireball.net>
//javascript port © David Gouch <individed.com>
//PHP port of the above by Kroc Camen <camendesign.com>

//this is required for PHP to not break unicode characters in your titles when using `strtolower`/`strtoupper`
//you can place this near the top of your script, or within the function itself
mb_internal_encoding ("UTF-8");

function titleCase ($title) {
    //remove HTML, storing it for later
    //       HTML elements to ignore    | tags  | entities
    $regx = '/<(code|var)[^>]*>.*?<\/\1>|<[^>]+>|&\S+;/';
    preg_match_all ($regx, $title, $html, PREG_OFFSET_CAPTURE);
    $title = preg_replace ($regx, '', $title);

    //break by punctuation, find the start of words
    preg_match_all ('/[\w&`\'‘’"“\.@:\/\{\(\[<>_]+-? */', $title, $matches, PREG_OFFSET_CAPTURE);
    foreach ($matches[0] as &$m) $title = substr_replace ($title, $m[0]=
        //find words that should be lowercase
        $m[1]>0 && mb_substr ($title, $m[1]-2, 1) !== ':' && preg_match (
            '/^(a(nd?|s|t)?|b(ut|y)|en|for|i[fn]|o[fnr]|t(he|o)|vs?\.?|via)[ \-]/i', $m[0]
        //convert them to lowercase
        ) ? mb_strtolower ($m[0])
        //else: brackets and other wrappers
        : ( preg_match ('/[\'"_{(\[]/', mb_substr ($title, $m[1]-1, 3))
        //convert first letter within wrapper to uppercase
        ? mb_substr ($m[0], 0, 1).mb_strtoupper (mb_substr ($m[0], 1, 1)).mb_substr ($m[0], 2)
        //else: do not uppercase these cases
        : ( preg_match ('/[A-Z]+|&|[\w]+[._][\w]+/', mb_substr ($m[0], 1)) ||
            preg_match ('/[\])}]/', mb_substr ($title, $m[1]-1, 3))
        ? $m[0]
        : mb_strtoupper (mb_substr ($m[0], 0, 1)).mb_substr ($m[0], 1)
        )),
        $m[1], strlen ($m[0])
    );

    //restore the HTML
    foreach ($html[0] as &$tag) $title = substr_replace ($title, $tag[0], $tag[1], 0);
    return $title;
}
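The "remove HTML, storing it for later" step is easier to follow on its own. Here is a minimal sketch of just that part, using the same pattern as the function above. The helper names `stripHtml` and `restoreHtml` and the sample string are made up for illustration and are not part of the script.

```php
<?php
//same pattern as in titleCase: ignored elements | tags | entities
$regx = '/<(code|var)[^>]*>.*?<\/\1>|<[^>]+>|&\S+;/';

//hypothetical helper: returns the text without HTML, plus the removed pieces and their offsets
function stripHtml (string $text, string $regx): array {
    preg_match_all ($regx, $text, $html, PREG_OFFSET_CAPTURE);
    return [preg_replace ($regx, '', $text), $html[0]];
}

//hypothetical helper: re-inserts each piece at its original offset, in document order
function restoreHtml (string $text, array $html): string {
    foreach ($html as $tag) $text = substr_replace ($text, $tag[0], $tag[1], 0);
    return $text;
}

$original = 'Tom &amp; <em>Jerry</em> run <code>ls -l</code>';
[$plain, $html] = stripHtml ($original, $regx);
$restored = restoreHtml ($plain, $html);
```

Re-inserting in document order works because each piece goes back before the next offset is used, so every stored offset is valid again by the time it is needed. That only holds if the text in between keeps the same byte length while it is being processed.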
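I haven’t pinned down the Unicode problem yet, so treat this as a general pitfall rather than a diagnosis. `PREG_OFFSET_CAPTURE` reports offsets in bytes, while `mb_substr` counts characters, and the two drift apart as soon as a title contains a multi-byte character. A small sketch, assuming the file is saved as UTF-8; the sample string and variable names are only for illustration:

```php
<?php
mb_internal_encoding ('UTF-8');

$title = 'café au lait';

//offsets from PREG_OFFSET_CAPTURE are byte offsets, even with the /u modifier
preg_match ('/au/u', $title, $m, PREG_OFFSET_CAPTURE);
$byteOffset = $m[0][1];   //"é" takes two bytes in UTF-8, so this is one more than the character index

//mb_substr counts characters, so feeding it the byte offset lands one character late
$wrong = mb_substr ($title, $byteOffset, 2);

//one possible workaround: convert the byte offset to a character offset first
$charOffset = mb_strlen (substr ($title, 0, $byteOffset));
$right = mb_substr ($title, $charOffset, 2);
```

Here `$wrong` comes out as `"u "` and `$right` as `"au"`. Sticking to byte-based functions (`substr`, `strlen`) for anything driven by the regex offsets would be another way to keep the two consistent.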
Anything broken, please let me know.
Kind regards,