This is a document written using ReMarkable, a shorthand syntax for generating HTML.

{	"date"		:	200807121257,
	"updated"	:	200807121257,
	"licence"	:	"cc-by",
	"tags"		:	["code-is-art", "web-dev"],
	"enclosure"	:	["mime-types.php"]
}

<section>
# Under The Hood #4: ¬Getting A File’s Mime-Type From Apache #

*There is* no sane way to get a file’s mime-type in PHP.¬
The <``mime_content_type`` (//uk3.php.net/mime_content_type)> function is deprecated and not installed by default in PHP5.¬
The <FileInfo PECL extension (//uk3.php.net/manual/en/book.fileinfo.php)> is not installed by default and can be insanely difficult to install.¬
¬
Thirdly, you can use a Unix call:

~~~ PHP ~~~>
// escapeshellarg stops quotes or spaces in the path from breaking the command
$mime_type = exec ('file -i -b ' . escapeshellarg ($file_path));
<~~~

But that only works on Linux / BSD / Mac systems, not on Windows, and if you’re on a shared webhost it also requires that they allow you to use the ``exec`` function. This method also doesn’t always give accurate results.
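
As a rough sketch of how the approaches above might be combined, the function below tries the FileInfo extension first and only then falls back to the Unix ``file`` command. The name ``detectMimeType`` is made up for this example, and it assumes a version of ``file`` that understands ``--mime-type`` (which prints the type without the character set); treat it as an illustration rather than a tested solution.

~~~ PHP ~~~>
// hypothetical helper, not part of the original article
function detectMimeType ($file_path) {
	// 1. FileInfo, if the extension is loaded (FILEINFO_MIME_TYPE needs PHP 5.3+)
	if (function_exists ('finfo_open')) {
		$finfo = finfo_open (FILEINFO_MIME_TYPE);
		if ($finfo !== false) {
			$mime_type = finfo_file ($finfo, $file_path);
			if ($mime_type !== false) return $mime_type;
		}
	}
	// 2. the Unix `file` command: skipped on Windows or where `exec` is disabled
	if (function_exists ('exec') && DIRECTORY_SEPARATOR == '/') {
		$mime_type = exec ('file -b --mime-type ' . escapeshellarg ($file_path));
		if ($mime_type != '') return $mime_type;
	}
	// neither method was available
	return false;
}
<~~~

The function returns ``false`` when neither method is usable, so the caller still needs some other fallback.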

Lastly of course, you could write a simple function to look up a file-extension and return the appropriate mime-type:

~~~ PHP ~~~>
function mimeType ($extension) {
	// each case returns immediately, so no `break` is needed
	switch ($extension) {
		case 'gif':	return 'image/gif';
		case 'jpg':	return 'image/jpeg';
		case 'png':	return 'image/png';
		case 'pdf':	return 'application/pdf';
		case 'zip':	return 'application/zip';
		case 'exe':	return 'application/octet-stream';
	}
}
<~~~

But this is hardly automatic, requires constant maintenance whenever a new file-type enters your system, and looks plain ugly in your code. It lacks elegance.

                                                            * * *

*I found* a cute solution to this problem by looking at the headers returned by Apache. For every file you access, Apache sets various HTTP headers; here’s an example from a photo:

~~~>
Date: Sat, 12 Jul 2008 11:24:33 GMT
Server:	Apache/2
Last-Modified: Tue, 17 Jun 2008 13:34:34 GMT
Etag: "2678b24-5edfe-cdfaae80"
Accept-Ranges: bytes
Content-Length: 388606
Content-Type: image/jpeg
<~~~

And lo, Apache is returning the mime-type automatically for any file-type. Since PHP has the ability to <access remote URLs (//uk2.php.net/manual/en/features.remote-files.php)> with many of its functions, we could theoretically ask PHP to ping the local file we want to know the mime-type of and retrieve the relevant information from the HTTP headers.

This is indeed possible: the <``get_headers`` (//uk2.php.net/manual/en/function.get-headers.php)> function returns the HTTP headers from a URL request as an array (keyed by header name if the second parameter is “1”). This function does not work without a full URL (“http://”), so in order to get the mime-type of a local file, the file needs to be in a publicly available location (even if only temporarily). You just prepend your domain name to the file path.¬
¬
Here’s the final example:

~~~ PHP ~~~>
$domain = 'http://camendesign.com';
$file_path = 'data/content-media/photo/DSC00013.jpg';

$url_headers = get_headers ("$domain/$file_path", 1);
// `reset` takes its argument by reference, so the array needs a variable first
$parts = explode (';', $url_headers['Content-Type']);
$mime_type = reset ($parts);

echo ($mime_type);
<~~~

Which correctly outputs:

~~~>
image/jpeg
<~~~

Apache sometimes returns a Content-Type value with the character set appended, e.g. “``text/html;charset=UTF-8``”, so “``explode (';', …)``” is used to break this apart, and ``reset`` returns the first array element.
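
A slightly more defensive version of that clean-up step might look like the sketch below. The function name ``bareMimeType`` is made up for this example. It also assumes that when the request is redirected, ``get_headers`` with the second parameter set to “1” can return an array of values for a repeated header, in which case the last value is taken as belonging to the final response.

~~~ PHP ~~~>
// hypothetical helper: reduce a Content-Type header value to the bare mime-type
function bareMimeType ($content_type) {
	// a repeated header (e.g. after a redirect) may arrive as an array;
	// assume the last value is the one for the final response
	if (is_array ($content_type)) $content_type = end ($content_type);
	// drop any parameters such as ";charset=UTF-8"
	$parts = explode (';', $content_type);
	return strtolower (trim ($parts[0]));
}

echo (bareMimeType ('text/html;charset=UTF-8'));
<~~~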


Limitations
============================================================================================================================

There are a lot of issues with this practice, which is why it is not used on this site:

•	On some web-hosts, the firewall will block the loop-back call, in which the server requests a file from itself via an external route (the full domain URL). This will always work okay on your localhost and may work fine in many environments, but it’s not guaranteed, so always test first. [I had to fall back to the stupid look-up function `:/`]

•	This introduces lag! Remember that the server is loading an external URL, even if the URL ends up on the same machine. There is inherent lag due to the DNS lookup. Use an IP address if possible, and *always* cache the result.
	*Do not run this code every time a user views a page!* Use it once and once only for a file and then save the result somewhere!
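
One possible shape for that caching is sketched here. Everything in it is an assumption made for illustration: the function name ``cachedMimeType``, the JSON cache file, and the idea of passing the slow look-up in as a callback. It keys the cache on the file path alone, so a file replaced under the same name would keep its old mime-type.

~~~ PHP ~~~>
// hypothetical helper: remember each file's mime-type in a JSON file so that
// the slow look-up runs only once per file. $lookup is any callable that
// takes a file path and returns a mime-type string
function cachedMimeType ($file_path, $lookup, $cache_file) {
	// load whatever has been cached so far
	$cache = array ();
	if (is_file ($cache_file)) {
		$decoded = json_decode (file_get_contents ($cache_file), true);
		if (is_array ($decoded)) $cache = $decoded;
	}
	// only do the expensive look-up for files not seen before
	if (!isset ($cache[$file_path])) {
		$cache[$file_path] = call_user_func ($lookup, $file_path);
		file_put_contents ($cache_file, json_encode ($cache), LOCK_EX);
	}
	return $cache[$file_path];
}
<~~~

The callback could wrap the ``get_headers`` code from earlier, or any other method of finding the mime-type.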

                                                            * * *


Update: Using the ‘mime.types’ file
============================================================================================================================
Hiếu Hoàng writes:

|	Debian’s default lighttpd.conf executes a
|	“/usr/share/lighttpd/create-mime.assign.pl” file to get mime-type assignments.
|	The “/etc/mime.types” which it uses is from the package mime-support.
|	
|	This would alleviate writing the look-up function.
|
| Hiếu Hoàng

Sample output:

~~~>
$ /usr/share/lighttpd/create-mime.assign.pl
mimetype.assign = (
	".ez" => "application/andrew-inset",
	".anx" => "application/annodex",
	".atom" => "application/atom+xml",
	".atomcat" => "application/atomcat+xml",
	".atomsrv" => "application/atomserv+xml",
	[....]
	".avi" => "video/x-msvideo",
	".movie" => "video/x-sgi-movie",
	".mpv" => "video/x-matroska",
	".ice" => "x-conference/x-cooltalk",
	".sisx" => "x-epoc/x-sisx-app",
	".vrm" => "x-world/x-vrml",
)
<~~~

This script isn’t included in the Mac OS X PHP distribution, but I’ve copied it below with some small modifications (it outputs a PHP array, and Mac OS X’s ‘mime.types’ file is in ‘/etc/apache2/’ instead of ‘/etc/’).

~~~>
#!/usr/bin/perl -w
# (I don’t know Perl, and how to colour this right)
use strict;
open MIMETYPES, "/etc/apache2/mime.types" or exit;
print "\$mime_types = array (\n";
my %extensions;
while(<MIMETYPES>) {
	chomp;
	s/\#.*//;
	next if /^\w*$/;
	if(/^([a-z0-9\/+-.]+)\s+((?:[a-z0-9.+-]+[ ]?)+)$/) {
		foreach(split / /, $2) {
			# mime.types can have same extension for different
			# mime types
			next if $extensions{$_};
			$extensions{$_} = 1;
			print "\".$_\" => \"$1\",\n";
		}
	}
}
print ");\n";
<~~~

This could then be used to quickly put together a comprehensive (though massive) look-up function. I’ve saved a copy of the output of this script, enclosed at the bottom of this article.

The ‘mime.types’ file is interesting, as there may be a way to write a really small function (probably regex) to pull out a mime-type on request, which would be far more elegant (and practical) than my own solution. I’ll have to give that some thought.
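
As a starting point for that thought, here is a minimal sketch that scans the lines of a ‘mime.types’ file for an extension. It uses a plain line-by-line search rather than a single regex, the function name ``mimeTypeFromLines`` is made up, and the file path is only the Mac OS X location mentioned above, so adjust it for your own system.

~~~ PHP ~~~>
// hypothetical helper: find the mime-type for an extension in the lines of a
// 'mime.types' file (each line is a mime-type followed by its extensions)
function mimeTypeFromLines ($extension, $lines) {
	$extension = strtolower (ltrim ($extension, '.'));
	foreach ($lines as $line) {
		// strip comments, then split the line on whitespace
		$line = trim (preg_replace ('/#.*/', '', $line));
		if ($line == '') continue;
		$fields = preg_split ('/\s+/', $line);
		$mime_type = array_shift ($fields);
		// the first line listing the extension wins, as in the Perl script
		if (in_array ($extension, $fields, true)) return $mime_type;
	}
	// extension not listed
	return false;
}

// assumed path: on Debian the file is '/etc/mime.types' instead
$path = '/etc/apache2/mime.types';
$lines = is_readable ($path) ? file ($path) : array ();
echo (mimeTypeFromLines ('jpg', $lines));
<~~~

Reading the whole file on every call is wasteful, so in practice the result would still want caching.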

</section>