Be careful, files with BOM will not detected correctly!
(PHP >= 5.3.0, PECL fileinfo >= 0.1.0)
finfo_file -- finfo::file — 返回一个文件的信息
过程化风格
$finfo
, string $file_name
= NULL
[, int $options
= FILEINFO_NONE
[, resource $context
= NULL
]] ) : string面向对象风格
$file_name
= NULL
[, int $options
= FILEINFO_NONE
[, resource $context
= NULL
]] ) : string本函数用来获取一个文件的信息。
finfo
finfo_open() 函数所返回的 fileinfo 资源。
file_name
要检查的文件名。
options
一个 Fileinfo 常量 或多个 Fileinfo 常量 进行逻辑或运算。
context
关于 contexts 的更多描述,请参考 Stream 函数。
返回 file_name
参数指定的文件信息。
发生错误时返回 FALSE
。
Example #1 finfo_file() 例程
<?php
$finfo = finfo_open(FILEINFO_MIME_TYPE); // 返回 mime 类型
foreach (glob("*") as $filename) {
echo finfo_file($finfo, $filename) . "\n";
}
finfo_close($finfo);
?>
以上例程的输出类似于:
text/html image/gif application/vnd.ms-excel
Be careful, files with BOM will not detected correctly!
The way HOWTO get MIME-type of remote file.
<?php
class MimeStreamWrapper
{
const WRAPPER_NAME = 'mime';
public $context;
private static $isRegistered = false;
private $callBackFunction;
private $eof = false;
private $fp;
private $path;
private $fileStat;
private function getStat()
{
if ($fStat = fstat($this->fp)) {
return $fStat;
}
$size = 100;
if ($headers = get_headers($this->path, true)) {
$head = array_change_key_case($headers, CASE_LOWER);
$size = (int)$head['content-length'];
}
$blocks = ceil($size / 512);
return array(
'dev' => 16777220,
'ino' => 15764,
'mode' => 33188,
'nlink' => 1,
'uid' => 10000,
'gid' => 80,
'rdev' => 0,
'size' => $size,
'atime' => 0,
'mtime' => 0,
'ctime' => 0,
'blksize' => 4096,
'blocks' => $blocks,
);
}
public function setPath($path)
{
$this->path = $path;
$this->fp = fopen($this->path, 'rb') or die('Cannot open file: ' . $this->path);
$this->fileStat = $this->getStat();
}
public function read($count) {
return fread($this->fp, $count);
}
public function getStreamPath()
{
return str_replace(array('ftp://', 'http://', 'https://'), self::WRAPPER_NAME . '://', $this->path);
}
public function getContext()
{
if (!self::$isRegistered) {
stream_wrapper_register(self::WRAPPER_NAME, get_class());
self::$isRegistered = true;
}
return stream_context_create(
array(
self::WRAPPER_NAME => array(
'cb' => array($this, 'read'),
'fileStat' => $this->fileStat,
)
)
);
}
public function stream_open($path, $mode, $options, &$opened_path)
{
if (!preg_match('/^r[bt]?$/', $mode) || !$this->context) {
return false;
}
$opt = stream_context_get_options($this->context);
if (!is_array($opt[self::WRAPPER_NAME]) ||
!isset($opt[self::WRAPPER_NAME]['cb']) ||
!is_callable($opt[self::WRAPPER_NAME]['cb'])
) {
return false;
}
$this->callBackFunction = $opt[self::WRAPPER_NAME]['cb'];
$this->fileStat = $opt[self::WRAPPER_NAME]['fileStat'];
return true;
}
public function stream_read($count)
{
if ($this->eof || !$count) {
return '';
}
if (($s = call_user_func($this->callBackFunction, $count)) == '') {
$this->eof = true;
}
return $s;
}
public function stream_eof()
{
return $this->eof;
}
public function stream_stat()
{
return $this->fileStat;
}
public function stream_cast($castAs)
{
$read = null;
$write = null;
$except = null;
return @stream_select($read, $write, $except, $castAs);
}
}
$path = 'http://fc04.deviantart.net/fs71/f/2010/227/4/6/PNG_Test_by_Destron23.png';
echo "File: ", $path, "\n";
$wrapper = new MimeStreamWrapper();
$wrapper->setPath($path);
$fInfo = new finfo(FILEINFO_MIME);
echo "MIME-type: ", $fInfo->file($wrapper->getStreamPath(), FILEINFO_MIME_TYPE, $wrapper->getContext()), "\n";
?>
Just noting (because I ran into it!) that the current implementation of finfo_file has a known bug which causes PHP to allocate huge amounts of memory when certain strings are present in text files that it is examining.
See https://bugs.php.net/bug.php?id=69224 for more info.
I spent days looking and searching for a database with actual plain language descriptions for the media types, for example
finfo(.png) --> "image/png" --> "PNG image".
In Ubuntu based OS's you can find already translated database at /usr/share/mime
http://manpages.ubuntu.com/manpages/hardy/en/man5/gnome-mime.5.html
Here is an wrapper that will properly identify Microsoft Office 2007 documents. It's trivial and straightforward to use, edit, and to add more file extentions/mimetypes.
<?php
function get_mimetype($filepath) {
if(!preg_match('/\.[^\/\\\\]+$/',$filepath)) {
return finfo_file(finfo_open(FILEINFO_MIME_TYPE), $filepath);
}
switch(strtolower(preg_replace('/^.*\./','',$filepath))) {
// START MS Office 2007 Docs
case 'docx':
return 'application/vnd.openxmlformats-officedocument.wordprocessingml.document';
case 'docm':
return 'application/vnd.ms-word.document.macroEnabled.12';
case 'dotx':
return 'application/vnd.openxmlformats-officedocument.wordprocessingml.template';
case 'dotm':
return 'application/vnd.ms-word.template.macroEnabled.12';
case 'xlsx':
return 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet';
case 'xlsm':
return 'application/vnd.ms-excel.sheet.macroEnabled.12';
case 'xltx':
return 'application/vnd.openxmlformats-officedocument.spreadsheetml.template';
case 'xltm':
return 'application/vnd.ms-excel.template.macroEnabled.12';
case 'xlsb':
return 'application/vnd.ms-excel.sheet.binary.macroEnabled.12';
case 'xlam':
return 'application/vnd.ms-excel.addin.macroEnabled.12';
case 'pptx':
return 'application/vnd.openxmlformats-officedocument.presentationml.presentation';
case 'pptm':
return 'application/vnd.ms-powerpoint.presentation.macroEnabled.12';
case 'ppsx':
return 'application/vnd.openxmlformats-officedocument.presentationml.slideshow';
case 'ppsm':
return 'application/vnd.ms-powerpoint.slideshow.macroEnabled.12';
case 'potx':
return 'application/vnd.openxmlformats-officedocument.presentationml.template';
case 'potm':
return 'application/vnd.ms-powerpoint.template.macroEnabled.12';
case 'ppam':
return 'application/vnd.ms-powerpoint.addin.macroEnabled.12';
case 'sldx':
return 'application/vnd.openxmlformats-officedocument.presentationml.slide';
case 'sldm':
return 'application/vnd.ms-powerpoint.slide.macroEnabled.12';
case 'one':
return 'application/msonenote';
case 'onetoc2':
return 'application/msonenote';
case 'onetmp':
return 'application/msonenote';
case 'onepkg':
return 'application/msonenote';
case 'thmx':
return 'application/vnd.ms-officetheme';
//END MS Office 2007 Docs
}
return finfo_file(finfo_open(FILEINFO_MIME_TYPE), $filepath);
}
?>
Well, i have a great probleam with that, MS Office 2007 extensions (pptx, xlsx, docx) do not have a default Mime type, they have "application/zip" mime type, so, to fix that, i do one little function to verify the extension.
That function allow's you to be safe of fake extensions hack.
<?php
$arrayZips = array("application/zip", "application/x-zip", "application/x-zip-compressed");
$arrayExtensions = array(".pptx", ".docx", ".dotx", ".xlsx");
$file = 'path/to/file.xlsx';
$original_extension = (false === $pos = strrpos($file, '.')) ? '' : substr($file, $pos);
$finfo = new finfo(FILEINFO_MIME);
$type = $finfo->file($file);
if (in_array($type, $arrayZips) && in_array($original_extension, $arrayExtensions))
{
return $original_extension;
}
?>
I was getting application/octet-stream or "<= not supported" for all the files.
I found out that in PHP 5.3 the magic file is built-in into PHP and that is what should be used. The magic file found on the system may not always be what libmagic expects, hence the error.
While figuring out my problem using this new function, i had a brainwave in using the full path of the file instead of the relative path. For example:
<?php
$folder = "somefolder/";
$fileName "aFile.pdf";
$finfo = finfo_open(FILEINFO_MIME_TYPE);
finfo_file($finfo, $folder.$fileName);
?>
This will result in an error where it can't find the file specified.
This however fixxes that problem:
<?php
$folder = "somefolder/";
$fileName "aFile.pdf";
$finfo = finfo_open(FILEINFO_MIME_TYPE);
$mime = finfo_file($finfo, dirname(__FILE__)."/".$folder.$fileName);
?>
I thought to use fileinfo to check if a file was gzip or bzip2. However, the mime type of a compressed file is "data" because compression is an encoding rather than a type.
gzip files begin with binary 1f8b.
bzip2 files begin with magic bytes 'B' 'Z' 'h'.
e.g.
<?php
$s = file_get_contents("somefilepath");
if ( bin2hex(substr($s,0,2)) == '1f8b' ) {/* could be a gzip file */}
if( substr($s,0,3) == 'BZh' ){/* could be a bzip2 file */}
?>
I am not an encoding expert. My only testing was using a few of my own encoded files.
OO (bit improved) version of the same thing
<?php
$file = '<somefile>';
$ftype = 'application/octet-stream';
$finfo = @new finfo(FILEINFO_MIME);
$fres = @$finfo->file($file);
if (is_string($fres) && !empty($fres)) {
$ftype = $fres;
}
?>
Another interresting feature of finfo_file on Windows.
This function can return empty string instead of FALSE for some file types (ppt for example). Therefore to be sure do a triple check of output result and provide default type just in case. Here is a sample code:
$ftype = 'application/octet-stream';
$finfo = @finfo_open(FILEINFO_MIME);
if ($finfo !== FALSE) {
$fres = @finfo_file($finfo, $file);
if ( ($fres !== FALSE)
&& is_string($fres)
&& (strlen($fres)>0)) {
$ftype = $fres;
}
@finfo_close($finfo);
}
Just an improvement on the sample Ryan Day posted - slightly off topic since this method does not use finfo_file but in some cases this method might be preferable.
The main change is the -format %m parameters given to the identify call. I would suggest using the full system path to identify i.e. /usr/bin/identify to be a little safer (the location may change from server to server though).
<?php
function is_jpg($fullpathtoimage){
if(file_exists($fullpathtoimage)){
exec("/usr/bin/identify -format %m $fullpathtoimage",$out);
//using system() echos STDOUT automatically
if(!empty($out)){
//identify returns an empty result to php
//if the file is not an image
if($out == 'JPEG'){
return true;
}
}
}
return false;
}
?>
Tempting as it may seem to use finfo_file() to validate uploaded image files (Check whether a supposed imagefile really contains an image), the results cannot be trusted. It's not that hard to wrap harmful executable code in a file identified as a GIF for instance.
A better & safer option is to check the result of:
if (!$img = @imagecreatefromgif($uploadedfilename)) {
trigger_error('Not a GIF image!',E_USER_WARNING);
// do necessary stuff
}