I have a few different JPEG images I've been testing with. As far as I've seen, the 0th and first bytes are always 0xFF and 0xD8.
The s
Things are complicated here. Since I am currently writing a JavaScript file identifier, I'll answer with my JavaScript object for JPEG, especially because the question has a "javascript" tag.
The basic answer has already been given (the accepted one), but this one goes into more detail about how to check the different APP markers (with a fallback).
So far there are special APPn markers for JFIF, Exif, Adobe, Canon and Samsung (but we don't know what the future holds). So the logic for the JS object is:
If one of the SPECS[x].regex patterns matches, that entry wins (the first match wins). If nothing matches, the parent object (which checks only FFD8) wins.
The SPECS entries carry the corresponding PRONOM identifiers, which you can look up like so:
'http://apps.nationalarchives.gov.uk/pronom/fmt/'.concat(PUID) [official]
'http://apps.nationalarchives.gov.uk/pronom/x-fmt/'.concat(xPUID) [experimental]
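As a small illustration of the lookup above, here is a minimal sketch that builds both URLs from an entry's PUID and xPUID fields. The names `PRONOM_BASE` and `pronomUrls` are hypothetical helpers introduced only for this example; they are not part of the object in this answer.

```javascript
// Hypothetical helper (not part of the original object): builds the PRONOM
// lookup URLs for an entry that carries a PUID and/or an xPUID.
const PRONOM_BASE = 'http://apps.nationalarchives.gov.uk/pronom/';

function pronomUrls(spec) {
  const urls = {};
  // Official identifiers live under fmt/, experimental ones under x-fmt/.
  if (spec.PUID !== undefined) urls.official = PRONOM_BASE + 'fmt/' + spec.PUID;
  if (spec.xPUID !== undefined) urls.experimental = PRONOM_BASE + 'x-fmt/' + spec.xPUID;
  return urls;
}

console.log(pronomUrls({ PUID: 41, xPUID: 398 }));
```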
_FFD8: {
  SPECS: [
    {
      PUID: 112,
      regex: /^FFD8FFE8(.{2})53504946460001/,
      desc: 'jpeg: Still Picture Interchange Format file (SPIF)',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        version: '1.00'
      }
    },
    {
      PUID: 44,
      regex: /^FFD8FFE0(.{2})4A464946000102/,
      desc: 'jpeg: JPEG File Interchange Format file (JFIF), v. 1.02',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        version: '1.02',
      }
    },
    {
      PUID: 43,
      regex: /^FFD8FFE0(.{2})4A464946000101/,
      desc: 'jpeg: JPEG File Interchange Format file (JFIF), v. 1.01',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        version: '1.01',
      }
    },
    {
      PUID: 42,
      regex: /^FFD8FFE0(.{2})4A464946000100/,
      desc: 'jpeg: JPEG File Interchange Format file (JFIF), v. 1.00',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        version: '1.00',
      }
    },
    {
      PUID: 41,
      xPUID: 398,
      regex: /^FFD8FFE1(.{2})45786966000049492A00(.+)009007000400000030323030/,
      desc: 'jpeg: JPG Image File, using Exchangeable Image File Format (Exif), little endian, v. 2.0',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        endian: 'little',
        version: '2.0',
      }
    },
    {
      PUID: 41,
      xPUID: 398,
      regex: /^FFD8FFE1(.{2})4578696600004D4D002A(.+)900000070000000430323030/,
      desc: 'jpeg: JPG Image File, using Exchangeable Image File Format (Exif), big endian, v. 2.0',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        endian: 'big',
        version: '2.0',
      }
    },
    {
      PUID: 41,
      xPUID: 390,
      regex: /^FFD8FFE1(.{2})45786966000049492A00(.+)009007000400000030323130/,
      desc: 'jpeg: JPG Image File, using Exchangeable Image File Format (Exif), little endian, v. 2.1',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        endian: 'little',
        version: '2.1',
      }
    },
    {
      PUID: 41,
      xPUID: 390,
      regex: /^FFD8FFE1(.{2})4578696600004D4D002A(.+)900000070000000430323130/,
      desc: 'jpeg: JPG Image File, using Exchangeable Image File Format (Exif), big endian, v. 2.1',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        endian: 'big',
        version: '2.1',
      }
    },
    {
      PUID: 41,
      xPUID: 391,
      regex: /^FFD8FFE1(.{2})45786966000049492A00(.+)009007000400000030323230/,
      desc: 'jpeg: JPG Image File, using Exchangeable Image File Format (Exif), little endian, v. 2.2',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        endian: 'little',
        version: '2.2',
      }
    },
    {
      PUID: 41,
      xPUID: 391,
      regex: /^FFD8FFE1(.{2})4578696600004D4D002A(.+)900000070000000430323230/,
      desc: 'jpeg: JPG Image File, using Exchangeable Image File Format (Exif), big endian, v. 2.2',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: function(h){ return { value:parseInt(h, 16), _val:h.toString() }; } }
      ],
      valueCapture: {
        endian: 'big',
        version: '2.2',
      }
    },
    // specific JPEG (all begin with FFD8FF, map them to PUID 41)
    {
      PUID: 41,
      regex: /^FFD8FFED/,
      desc: 'jpeg: JPG Image File, Adobe JPEG, Photoshop CMYK buffer'
    },
    {
      PUID: 41,
      regex: /^FFD8FFE2/,
      desc: 'jpeg: JPG Image File, Canon JPEG, Canon EOS-1D'
    },
    {
      PUID: 41,
      regex: /^FFD8FFE3/,
      desc: 'jpeg: JPG Image File, Samsung JPEG, e.g. Samsung D500'
    },
    {
      PUID: 41,
      regex: /^FFD8FFDB/,
      desc: 'jpeg: JPG Image File, Samsung JPEG, e.g. Samsung D807'
    }
  ],
  ext: ['JPG', 'JPE', 'JPEG', 'SPF', 'SPIFF'],
  signature: [ 255, 216 ],
  desc: 'jpeg: JPEG File Interchange Format file, App0 marker not known',
  mime: 'image/jpeg',
  specifications: [
    { text:'Specification for the JFIF file format', href:'http://www.w3.org/Graphics/JPEG/jfif3.pdf', type:'W3', format:'pdf' },
    { text:'The JPEG compression specification', href:'http://www.w3.org/Graphics/JPEG/itu-t81.pdf', type:'W3', format:'pdf' },
    { text:'Exchangeable image file format for digital still cameras', href:'http://home.jeita.or.jp/tsc/std-pdf/CP3451C.pdf', type:'vendor', format:'pdf' }
  ],
  references: [
    { text:'JPEG JFIF W3 Info', href:'http://www.w3.org/Graphics/JPEG/', type:'W3', format:'html' },
    { text:'JPEG.org', href:'http://www.jpeg.org/', type:'info', format:'html' },
    { text:'JPEG Exif App markers', href:'http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/JPEG.html', type:'info', format:'html'}
  ]
}
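The "first match wins, otherwise the parent wins" rule could be sketched roughly as follows. This is not the matcher from my identifier, only a minimal illustration under stated assumptions: `toHex` and `identify` are hypothetical names, the object is trimmed to two entries, and the header is assumed to be encoded as uppercase hex with two characters per byte. Under that assumption the two-byte segment length spans four hex characters, so the sketch's pattern uses `(.{4})` rather than the `(.{2})` shown in the object.

```javascript
// Trimmed-down copy of the object above, with the pattern adapted as
// described in the lead-in (assumption: two hex characters per byte).
const JPEG = {
  SPECS: [
    {
      PUID: 43,
      regex: /^FFD8FFE0(.{4})4A464946000101/,
      desc: 'jpeg: JPEG File Interchange Format file (JFIF), v. 1.01',
      regexCapture: [
        { key: 'recordedSignature' },
        { key: 'segmentLength', fn: (h) => ({ value: parseInt(h, 16), _val: h }) }
      ],
      valueCapture: { version: '1.01' }
    },
    { PUID: 41, regex: /^FFD8FFED/, desc: 'jpeg: JPG Image File, Adobe JPEG, Photoshop CMYK buffer' }
  ],
  signature: [255, 216],
  desc: 'jpeg: JPEG File Interchange Format file, App0 marker not known',
  mime: 'image/jpeg'
};

// Hypothetical helper: hex-encode the first `count` bytes of an array-like.
function toHex(bytes, count = 64) {
  return Array.from(bytes.slice(0, count), (b) => b.toString(16).padStart(2, '0'))
    .join('')
    .toUpperCase();
}

// Hypothetical matcher: the first matching spec wins, otherwise the parent wins.
function identify(bytes, format) {
  // The parent signature (FF D8) must be present before any spec is tried.
  if (!format.signature.every((b, i) => bytes[i] === b)) return null;

  const hex = toHex(bytes);
  for (const spec of format.SPECS) {
    const m = spec.regex.exec(hex);
    if (!m) continue;
    const result = { desc: spec.desc, mime: format.mime, PUID: spec.PUID, ...spec.valueCapture };
    // regexCapture[i] names match group i (index 0 is the whole match).
    (spec.regexCapture || []).forEach((cap, i) => {
      if (m[i] !== undefined) result[cap.key] = cap.fn ? cap.fn(m[i]) : m[i];
    });
    return result;
  }
  // Nothing matched: fall back to the parent description.
  return { desc: format.desc, mime: format.mime };
}

// Start of a JFIF 1.01 file: SOI, APP0, length 0x0010, "JFIF\0", version 01 01.
const jfifHeader = [0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10, 0x4a, 0x46, 0x49, 0x46, 0x00, 0x01, 0x01];
console.log(identify(jfifHeader, JPEG));
```

Keeping the fallback inside `identify` means a file that starts with FF D8 but carries an unknown marker is still reported as a JPEG instead of being rejected.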
No, it certainly doesn't have to be that way. Reading Wikipedia.
As far as I can tell, the APPn segments are just a way for applications to embed arbitrary data in the image file. Applications commonly take advantage of this and write the bytes 0xFF 0xE0 or 0xFF 0xE1 into the header, but it would be entirely plausible for an application not to do this and to go straight on with the image data. The first two bytes (0xFF and 0xD8) are mandatory, as they are the SOI (start-of-image) marker.
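To make that concrete, here is a minimal sketch that checks the mandatory SOI marker and then lists the markers of the segments that follow, so you can see whether any APPn segment (0xE0 to 0xEF) is present. The function names are hypothetical, and the walk is simplified: it assumes every segment before the scan carries a two-byte big-endian length and it stops at SOS (0xDA).

```javascript
// Returns the marker bytes of the segments that precede the scan data,
// or null if the buffer does not start with the SOI marker (0xFF 0xD8).
function listHeaderMarkers(bytes) {
  if (bytes.length < 2 || bytes[0] !== 0xff || bytes[1] !== 0xd8) return null;

  const markers = [];
  let pos = 2;
  while (pos + 4 <= bytes.length && bytes[pos] === 0xff) {
    const marker = bytes[pos + 1];
    markers.push(marker);
    if (marker === 0xda) break; // SOS: compressed image data follows
    // The length field counts itself (2 bytes) plus the payload.
    const length = (bytes[pos + 2] << 8) | bytes[pos + 3];
    pos += 2 + length;
  }
  return markers;
}

// APP0 through APP15 occupy the marker range 0xE0..0xEF.
const isAppMarker = (marker) => marker >= 0xe0 && marker <= 0xef;

// A tiny synthetic header with no APPn segment: SOI, DQT (empty), SOS.
const bare = [0xff, 0xd8, 0xff, 0xdb, 0x00, 0x02, 0xff, 0xda, 0x00, 0x02];
console.log(listHeaderMarkers(bare).some(isAppMarker));
```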