Date: 2022-01-03
Detect the file type of a Buffer/Uint8Array/ArrayBuffer
The file type is detected by checking the magic number of the buffer.
This package is for detecting binary-based file formats, not text-based formats like .txt, .csv, .svg, etc.
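To make the idea of a magic number concrete, here is a minimal sketch that compares the leading bytes of a buffer against a few well-known signatures. It is not how file-type is implemented internally; the helper name `sniffMagicNumber` and the three-entry signature table are assumptions made purely for illustration.

```javascript
// Illustrative only: a tiny signature table, not the one file-type uses.
const signatures = [
	// PNG files start with these 8 bytes.
	{ext: 'png', mime: 'image/png', bytes: [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]},
	// JPEG files start with FF D8 FF.
	{ext: 'jpg', mime: 'image/jpeg', bytes: [0xFF, 0xD8, 0xFF]},
	// PDF files start with the ASCII characters "%PDF".
	{ext: 'pdf', mime: 'application/pdf', bytes: [0x25, 0x50, 0x44, 0x46]},
];

// Hypothetical helper: returns {ext, mime} for the first matching signature, or undefined.
function sniffMagicNumber(data) {
	// Accepts a Uint8Array (including a Node.js Buffer) or an ArrayBuffer.
	const view = data instanceof ArrayBuffer ? new Uint8Array(data) : data;
	for (const {ext, mime, bytes} of signatures) {
		const matches = view.length >= bytes.length
			&& bytes.every((byte, index) => view[index] === byte);
		if (matches) {
			return {ext, mime};
		}
	}

	return undefined;
}

const pngHeader = new Uint8Array([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A, 0x00, 0x00]);
console.log(sniffMagicNumber(pngHeader));
//=> {ext: 'png', mime: 'image/png'}

// Plain text has no magic number, so nothing matches.
console.log(sniffMagicNumber(new TextEncoder().encode('a,b,c\n1,2,3\n')));
//=> undefined
```

This also shows why text-based formats are out of scope: their content does not begin with a fixed byte sequence that could be looked up.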
npm install file-type
Determine file type from a file:
import {fileTypeFromFile} from 'file-type';
console.log(await fileTypeFromFile('Unicorn.png'));
//=> {ext: 'png', mime: 'image/png'}
Determine file type from a Buffer, which may be a portion of the beginning of a file:
import {fileTypeFromBuffer} from 'file-type';
import {readChunk} from 'read-chunk';
const buffer = await readChunk('Unicorn.png', {length: 4100});
console.log(await fileTypeFromBuffer(buffer));
//=> {ext: 'png', mime: 'image/png'}
Determine file type from a stream:
import fs from 'node:fs';
import {fileTypeFromStream} from 'file-type';
const stream = fs.createReadStream('Unicorn.mp4');
console.log(await fileTypeFromStream(stream));
//=> {ext: 'mp4', mime: 'video/mp4'}
The stream method can also be used to read from a remote location:
import got from 'got';
import {fileTypeFromStream} from 'file-type';
const url = 'https://upload.wikimedia.org/wikipedia/en/a/a9/Example.jpg';
const stream = got.stream(url);
console.log(await fileTypeFromStream(stream));
//=> {ext: 'jpg', mime: 'image/jpeg'}
Another stream example:
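import fs from 'node:fs';
import crypto from 'node:crypto';
import {fileTypeStream} from 'file-type';
// `alg`, `key`, and `iv` must match the values used to encrypt the file.
const read = fs.createReadStream('encrypted.enc');
const decipher = crypto.createDecipheriv(alg, key, iv);
// Pipe the encrypted data through the decipher and detect the type of the decrypted output.
// (`stream.pipeline()` is not used here because it requires a callback as its last argument.)
const streamWithFileType = await fileTypeStream(read.pipe(decipher));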
console.log(streamWithFileType.fileType);
//=> {ext: 'mov', mime: 'video/quicktime'}
const write = fs.createWriteStream(`decrypted.${streamWithFileType.fileType.ext}`);
streamWithFileType.pipe(write);
In the browser, the stream method also accepts the body of a fetch() response:
import {fileTypeFromStream} from 'file-type';
const url = 'https://upload.wikimedia.org/wikipedia/en/a/a9/Example.jpg';
const response = await fetch(url);
const fileType = await fileTypeFromStream(response.body);
console.log(fileType);
//=> {ext: 'jpg', mime: 'image/jpeg'}
Determine file type from a Blob:
import {fileTypeFromBlob} from 'file-type';
const blob = new Blob(['<?xml version="1.0" encoding="ISO-8859-1" ?>'], {
type: 'plain/text',
endings: 'native'
});
// Detection is based on the content, not on the type declared on the Blob.
console.log(await fileTypeFromBlob(blob));
//=> {ext: 'xml', mime: 'application/xml'}
fileTypeFromBuffer(buffer)
Detect the file type of a Buffer, Uint8Array, or ArrayBuffer.
The file type is detected by checking the magic number of the buffer.
If file access is available, it is recommended to use fileTypeFromFile() instead.
Returns a Promise for an object with the detected file type and MIME type:
ext - One of the supported file types
mime - The MIME type
Or undefined when there is no match.
buffer
Type: Buffer | Uint8Array | ArrayBuffer
A buffer representing file data. It works best if the buffer contains the entire file, but it may work with a smaller portion as well.
fileTypeFromFile(filePath)
Detect the file type of a file path.
The file type is detected by checking the magic number of the buffer.
Returns a Promise for an object with the detected file type and MIME type:
ext - One of the supported file types
mime - The MIME type
Or undefined when there is no match.
filePath
Type: string
The file path to parse.
fileTypeFromStream(stream)
Detect the file type of a Node.js readable stream.
The file type is detected by checking the magic number of the buffer.
Returns a Promise for an object with the detected file type and MIME type:
ext - One of the supported file types
mime - The MIME type
Or undefined when there is no match.
stream
Type: stream.Readable
A readable stream representing file data.
fileTypeFromTokenizer(tokenizer)
Detect the file type from an ITokenizer source.
This method is used internally, but can also be used with a special "tokenizer" reader.
A tokenizer exposes the internal read functions, which allows alternative transport mechanisms for accessing files to be implemented and used.
Returns a Promise for an object with the detected file type and MIME type:
ext - One of the supported file types
mime - The MIME type
Or undefined when there is no match.
An example is @tokenizer/http, which requests data using HTTP range requests. Unlike a conventional stream, a tokenizer can skip ahead (seek, fast-forward) in the stream. For example, you may only need to read the first 6 bytes and the last 128 bytes, which can be an advantage when reading the entire file would take longer.
import {makeTokenizer} from '@tokenizer/http';
import {fileTypeFromTokenizer} from 'file-type';
const audioTrackUrl = 'https://test-audio.netlify.com/Various%20Artists%20-%202009%20-%20netBloc%20Vol%2024_%20tiuqottigeloot%20%5BMP3-V2%5D/01%20-%20Diablo%20Swing%20Orchestra%20-%20Heroines.mp3';
const httpTokenizer = await makeTokenizer(audioTrackUrl);
const fileType = await fileTypeFromTokenizer(httpTokenizer);
console.log(fileType);
//=> {ext: 'mp3', mime: 'audio/mpeg'}
Or use @tokenizer/s3 to determine the file type of a file stored on Amazon S3:
import S3 from 'aws-sdk/clients/s3';
import {makeTokenizer} from '@tokenizer/s3';
import {fileTypeFromTokenizer} from 'file-type';
// Initialize the S3 client
const s3 = new S3();
// Initialize the S3 tokenizer.
const s3Tokenizer = await makeTokenizer(s3, {
Bucket: 'affectlab',
Key: '1min_35sec.mp4'
});
// Figure out what kind of file it is.
const fileType = await fileTypeFromTokenizer(s3Tokenizer);
console.log(fileType);
Note that only the minimum amount of data required to determine the file type is read (okay, just a bit extra to prevent too many fragmented reads).
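The benefit of seeking can be sketched without any network access. The example below is an assumption-laden illustration, not the ITokenizer interface: `CountingSource` and `readAt` are hypothetical names for an in-memory source that counts how many bytes were actually read when only the start and the end of a file are needed.

```javascript
// Hypothetical random-access source; it only mimics the idea of a tokenizer.
class CountingSource {
	constructor(bytes) {
		this.bytes = bytes;
		this.bytesRead = 0;
	}

	get size() {
		return this.bytes.length;
	}

	// Read `length` bytes starting at `position`, without touching anything in between.
	readAt(position, length) {
		const chunk = this.bytes.subarray(position, position + length);
		this.bytesRead += chunk.length;
		return chunk;
	}
}

// A 10,000-byte stand-in for a remote file.
const source = new CountingSource(new Uint8Array(10_000));

const header = source.readAt(0, 6); // The first 6 bytes
const trailer = source.readAt(source.size - 128, 128); // The last 128 bytes

console.log(header.length, trailer.length);
//=> 6 128
console.log(source.bytesRead);
//=> 134
```

A conventional stream would have to consume all 10,000 bytes to reach the trailer, whereas a source that can seek touches only 134 of them.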
tokenizer
Type: ITokenizer
A file source implementing the tokenizer interface.
fileTypeStream(readableStream, options)
Returns a Promise which resolves to the original readable stream argument, but with an added fileType property, which is an object like the one returned from fileTypeFromFile().
This method can be handy to put in the middle of a stream pipeline, but it comes at a price.
Internally, fileTypeStream() builds up a buffer of sampleSize bytes, which is used as a sample to determine the file type.
The sample size affects detection accuracy: a smaller sample size lowers the probability of detecting the correct file type.
Note: This method is only available when using Node.js.
Note: Requires Node.js 14 or later.
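Why the sample size matters can be shown with a format whose signature does not sit at the start of the file. The sketch below is illustrative only and does not reproduce the package's internals: `takeSample` and `looksLikeTar` are hypothetical helpers, and the only format fact relied on is that tar stores the ASCII string "ustar" at byte offset 257.

```javascript
// Hypothetical helper: collect at most `sampleSize` bytes from the leading chunks.
function takeSample(chunks, sampleSize) {
	const sample = new Uint8Array(sampleSize);
	let filled = 0;
	for (const chunk of chunks) {
		if (filled >= sampleSize) {
			break;
		}

		const part = chunk.subarray(0, sampleSize - filled);
		sample.set(part, filled);
		filled += part.length;
	}

	return sample.subarray(0, filled);
}

// The tar format stores the ASCII string "ustar" at byte offset 257.
const TAR_MAGIC_OFFSET = 257;
const TAR_MAGIC = [0x75, 0x73, 0x74, 0x61, 0x72];

// Hypothetical check: does the sample contain the tar signature?
function looksLikeTar(sample) {
	return sample.length >= TAR_MAGIC_OFFSET + TAR_MAGIC.length
		&& TAR_MAGIC.every((byte, index) => sample[TAR_MAGIC_OFFSET + index] === byte);
}

// Build a fake 512-byte tar header and deliver it in 64-byte chunks, as a stream might.
const tarHeader = new Uint8Array(512);
tarHeader.set(TAR_MAGIC, TAR_MAGIC_OFFSET);
const chunks = [];
for (let offset = 0; offset < tarHeader.length; offset += 64) {
	chunks.push(tarHeader.subarray(offset, offset + 64));
}

console.log(looksLikeTar(takeSample(chunks, 100)));
//=> false (the signature lies beyond the sample)
console.log(looksLikeTar(takeSample(chunks, 4100)));
//=> true
```

With a 100-byte sample the signature is never seen, so detection fails even though the data is valid. A larger sample costs more buffering but covers signatures stored further into the file.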
readableStream
Type: stream.Readable
options
Type: object
options.sampleSize
Type: number
Default: 4100
The sample size in bytes.
import got from 'got';
import {fileTypeStream} from 'file-type';
const url = 'https://upload.wikimedia.org/wikipedia/en/a/a9/Example.jpg';
const stream1 = got.stream(url);
const stream2 = await fileTypeStream(stream1, {sampleSize: 1024});
if (stream2.fileType && stream2.fileType.mime === 'image/jpeg') {
// stream2 can be used to stream the JPEG image (from the very beginning of the stream)
}
Type: stream.Readable
The input stream.
supportedExtensions
Returns a Set<string> of supported file extensions.
supportedMimeTypes
Returns a Set<string> of supported MIME types.
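Since these values are plain Set<string> objects, membership checks use has(). The sketch below shows one way a detection result could be combined with such a set; the helper `isAllowed` is hypothetical, and `knownMimeTypes` is a hand-written stand-in rather than the set exported by the package.

```javascript
// Stand-in for a Set<string> of MIME types; the real set comes from the package.
const knownMimeTypes = new Set(['image/png', 'image/jpeg', 'video/mp4']);

// Hypothetical helper: accept a detection result only if it matched and is in the set.
// `fileType` has the shape {ext, mime}, or is undefined when there was no match.
function isAllowed(fileType, allowedMimeTypes) {
	return fileType !== undefined && allowedMimeTypes.has(fileType.mime);
}

console.log(isAllowed({ext: 'png', mime: 'image/png'}, knownMimeTypes));
//=> true
console.log(isAllowed({ext: 'pdf', mime: 'application/pdf'}, knownMimeTypes));
//=> false
console.log(isAllowed(undefined, knownMimeTypes));
//=> false
```

Handling the undefined case explicitly matters because every detection method resolves to undefined when nothing matches.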
jpg - Joint Photographic Experts Group image
png - Portable Network Graphics
apng - Animated Portable Network Graphics
gif - Graphics Interchange Format
webp - Web Picture format
flif - Free Lossless Image Format
xcf - eXperimental Computing Facility
cr2 - Canon Raw image file (v2)
cr3 - Canon Raw image file (v3)
orf - Olympus Raw image file
arw - Sony Alpha Raw image file
dng - Adobe Digital Negative image file
nef - Nikon Electronic Format image file
rw2 - Panasonic RAW image file
raf - Fujifilm RAW image file
tif - Tagged Image file
bmp - Bitmap image file
icns - Apple Icon image
jxr - Joint Photographic Experts Group extended range
psd - Adobe Photoshop document
indd - Adobe InDesign document
zip - Archive file
tar - Tarball archive file
rar - Archive file
gz - Archive file
bz2 - Archive file
zst - Archive file
7z - 7-Zip archive
dmg - Apple Disk Image
mp4 - MPEG-4 Part 14 video file
mid - Musical Instrument Digital Interface file
mkv - Matroska video file
webm - Web video file
mov - QuickTime video file
avi - Audio Video Interleave file
mpg - MPEG-1 file
mp1 - MPEG-1 Audio Layer I
mp2 - MPEG-1 Audio Layer II
mp3 - Audio file
ogg - Audio file
ogv - Video file
ogm - Video file
oga - Audio file
spx - Audio file
ogx - Multiplexed Ogg file
opus - Audio file
flac - Free Lossless Audio Codec
wav - Waveform Audio file
qcp - Tagged and chunked data
amr - Adaptive Multi-Rate audio codec
pdf - Portable Document Format
epub - E-book file
mobi - Mobipocket
exe - Executable file
swf - Adobe Flash Player file
rtf - Rich Text Format
woff - Web Open Font Format
woff2 - Web Open Font Format
eot - Embedded OpenType font
ttf - TrueType font
otf - OpenType font
ico - Windows icon file
flv - Flash video
ps - PostScript
xz - Compressed file
sqlite - SQLite file
nes - Nintendo NES ROM
crx - Google Chrome extension
xpi - XPInstall file
cab - Cabinet file
deb - Debian package
ar - Archive file
rpm - Red Hat Package Manager file
Z - Unix Compressed File
lz - Archive file
cfb - Compound File Binary Format
mxf - Material Exchange Format
mts - MPEG-2 Transport Stream, both raw and Blu-ray Disc Audio-Video (BDAV) versions
wasm - WebAssembly intermediate compiled format
blend - Blender project
bpg - Better Portable Graphics file
docx - Microsoft Word
pptx - Microsoft PowerPoint
xlsx - Microsoft Excel
jp2 - JPEG 2000
jpm - JPEG 2000
jpx - JPEG 2000
mj2 - Motion JPEG 2000
aif - Audio Interchange file
odt - OpenDocument for word processing
ods - OpenDocument for spreadsheets
odp - OpenDocument for presentations
xml - eXtensible Markup Language
heic - High Efficiency Image File Format
cur - Icon file
ktx - OpenGL and OpenGL ES textures
ape - Monkey's Audio
wv - WavPack
asf - Advanced Systems Format
dcm - DICOM Image File
mpc - Musepack (SV7 & SV8)
ics - iCalendar
vcf - vCard
glb - GL Transmission Format
pcap - Libpcap File Format
dsf - Sony DSD Stream File (DSF)
lnk - Microsoft Windows file shortcut
alias - macOS Alias file
voc - Creative Voice File
ac3 - ATSC A/52 Audio File
3gp - Multimedia container format defined by the Third Generation Partnership Project (3GPP) for 3G UMTS multimedia services
3g2 - Multimedia container format defined by the 3GPP2 for 3G CDMA2000 multimedia services
m4v - MPEG-4 Visual bitstreams
m4p - MPEG-4 files with audio streams encrypted by FairPlay Digital Rights Management as were sold through the iTunes Store
m4a - Audio-only MPEG-4 files
m4b - Audiobook and podcast MPEG-4 files, which also contain metadata including chapter markers, images, and hyperlinks
f4v - ISO base media file format used by Adobe Flash Player
f4p - ISO base media file format protected by Adobe Access DRM used by Adobe Flash Player
f4a - Audio-only ISO base media file format used by Adobe Flash Player
f4b - Audiobook and podcast ISO base media file format used by Adobe Flash Player
mie - Dedicated meta information format which supports storage of binary as well as textual meta information
shp - Geospatial vector data format
arrow - Columnar format for tables of data
aac - Advanced Audio Coding
it - Audio module format: Impulse Tracker
s3m - Audio module format: ScreamTracker 3
xm - Audio module format: FastTracker 2
ai - Adobe Illustrator Artwork
skp - SketchUp
avif - AV1 Image File Format
eps - Encapsulated PostScript
lzh - LZH archive
pgp - Pretty Good Privacy
asar - Archive format primarily used to enclose Electron applications
stl - Standard Tessellated Geometry File Format (ASCII only)
chm - Microsoft Compiled HTML Help
3mf - 3D Manufacturing Format
jxl - JPEG XL image format
Pull requests are welcome for additional commonly used file types.
The following file types will not be accepted:
.doc - Microsoft Word 97-2003 Document
.xls - Microsoft Excel 97-2003 Document
.ppt - Microsoft PowerPoint 97-2003 Document
.msi - Microsoft Windows Installer
.csv - Text-based format, which this package does not detect.
.svg - Detecting it requires a full-blown parser. Check out is-svg for something that mostly works.
