How to extract file extension from byte array

你的背包 2020-12-05 05:17

I have a byte array stored in a database.

How can I determine the file extension (or MIME type) from a byte array in Java?

3 Answers
  • 2020-12-05 05:18

    If this is for storing a file that is uploaded:

    • create a column for the filename extension
    • create a column for the mime type as sent by the browser
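    As a minimal sketch of capturing those two values at upload time: the class and field names below (UploadMetadata, extension, mimeType) are hypothetical, and the browser-supplied MIME type is stored as received without being verified.

```java
import java.util.Locale;

/** Hypothetical holder for the values to persist alongside the uploaded bytes. */
final class UploadMetadata {
    final String extension; // e.g. "pdf", or "" when the filename has none
    final String mimeType;  // as sent by the browser; may be null or inaccurate
    final byte[] content;

    UploadMetadata(String originalFilename, String browserMimeType, byte[] content) {
        this.extension = extensionOf(originalFilename);
        this.mimeType = browserMimeType;
        this.content = content;
    }

    /** Returns the lower-cased text after the last dot of the file name, or "". */
    static String extensionOf(String filename) {
        if (filename == null) {
            return "";
        }
        // Drop any directory part so dots in folder names are ignored.
        int slash = Math.max(filename.lastIndexOf('/'), filename.lastIndexOf('\\'));
        String name = filename.substring(slash + 1);
        int dot = name.lastIndexOf('.');
        // No dot, a leading dot (".bashrc"), or a trailing dot: no usable extension.
        if (dot <= 0 || dot == name.length() - 1) {
            return "";
        }
        return name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }
}
```

    The extension and mimeType fields would then map onto the two extra columns, next to the column that holds the bytes.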

    If you don't have the original file, and you only have bytes, you have a couple of good solutions.

    If you're able to use a library, look at using mime-util to inspect the bytes:

    http://technopaper.blogspot.com/2009/03/identifying-mime-using-mime-util.html

    If you have to build your own byte detector, here are the starting bytes (magic numbers) of many common formats, written as a Ruby hash:

    "BC" => bitcode,
    "BM" => bitmap,
    "BZ" => bzip,
    "MZ" => exe,
    "SIMPLE"=> fits,
    "GIF8" => gif,
    "GKSM" => gks,
    [0x01,0xDA].pack('c*') => iris_rgb,
    [0xF1,0x00,0x40,0xBB].pack('c*') => itc,
    [0xFF,0xD8].pack('c*') => jpeg,
    "IIN1" => niff,
    "MThd" => midi,
    "%PDF" => pdf,
    "VIEW" => pm,
    [0x89].pack('c*') + "PNG" => png,
    "%!" => postscript,
    "Y" + [0xA6].pack('c*') + "j" + [0x95].pack('c*') => sun_rasterfile,
    "MM*" + [0x00].pack('c*') => tiff,
    "II*" + [0x00].pack('c*') => tiff,
    "gimp xcf" => gimp_xcf,
    "#FIG" => xfig,
    "/* XPM */" => xpm,
    [0x23,0x21].pack('c*') => shebang,
    [0x1F,0x9D].pack('c*') => compress,
    [0x1F,0x8B].pack('c*') => gzip,
    "PK" + [0x03,0x04].pack('c*') => pkzip,
    "MZ" => dos_os2_windows_executable,
    [0x7F].pack('c*') + "ELF" => unix_elf,
    [0x99,0x00].pack('c*') => pgp_public_ring,
    [0x95,0x01].pack('c*') => pgp_security_ring,
    [0x95,0x00].pack('c*') => pgp_security_ring,
    [0xA6,0x00].pack('c*') => pgp_encrypted_data,
    [0xD0,0xCF,0x11,0xE0].pack('c*') => docfile
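
    The table above is Ruby. A rough Java equivalent of the same prefix-matching idea might look like the sketch below. The class name MagicNumberSniffer, the choice of which signatures to include, and the extensions they map to are assumptions made for illustration. Only a handful of entries from the table are shown.

```java
import java.nio.charset.StandardCharsets;
import java.util.Optional;

final class MagicNumberSniffer {

    /** One known signature: the leading bytes and the extension assumed for them. */
    private static final class Signature {
        final byte[] magic;
        final String extension;

        Signature(byte[] magic, String extension) {
            this.magic = magic;
            this.extension = extension;
        }
    }

    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    private static byte[] bytes(int... values) {
        byte[] out = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (byte) values[i]; // values above 0x7F wrap to negative bytes
        }
        return out;
    }

    // A subset of the table; the extensions are illustrative choices.
    private static final Signature[] SIGNATURES = {
        new Signature(bytes(0x89, 'P', 'N', 'G'), "png"),
        new Signature(ascii("GIF8"), "gif"),
        new Signature(bytes(0xFF, 0xD8), "jpg"),
        new Signature(ascii("%PDF"), "pdf"),
        new Signature(bytes('P', 'K', 0x03, 0x04), "zip"),
        new Signature(bytes(0x1F, 0x8B), "gz"),
        new Signature(ascii("BM"), "bmp"),
    };

    /** Returns the extension of the first signature the content starts with, if any. */
    static Optional<String> guessExtension(byte[] content) {
        for (Signature s : SIGNATURES) {
            if (startsWith(content, s.magic)) {
                return Optional.of(s.extension);
            }
        }
        return Optional.empty();
    }

    private static boolean startsWith(byte[] content, byte[] prefix) {
        if (content == null || content.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (content[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

    Short signatures such as "BM" can match unrelated data by chance, and container formats share prefixes (a "PK" header also begins JAR and Office Open XML files), so a match is best treated as a hint rather than proof.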
    
  • 2020-12-05 05:26

    It turns out that the JDK's URLConnection class has a decent method for this; see the answer to "Getting A File's Mime Type In Java".

    To detect the type from a byte array rather than a file, wrap the array in a java.io.ByteArrayInputStream (which reads from a byte array) instead of using a java.io.FileInputStream (which reads from a file). Note that the method returns a MIME type rather than a file extension. For example:

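    // Needs: java.io.ByteArrayInputStream, java.io.InputStream, java.net.URLConnection
    byte[] content = ...; // the byte array loaded from the database
    String mimeType;
    // ByteArrayInputStream supports mark/reset, which guessContentTypeFromStream relies on
    try (InputStream is = new ByteArrayInputStream(content)) {
        mimeType = URLConnection.guessContentTypeFromStream(is); // null if not recognized
    }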
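
    For a version that compiles as-is, here is a small sketch wrapping the same call. The class name StreamMimeSniffer and the tiny MIME-type-to-extension map are assumptions added for illustration; the JDK method itself only returns a MIME type and recognizes a limited set of formats.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

final class StreamMimeSniffer {

    // Hypothetical, deliberately small lookup table; extend it for your own data.
    private static final Map<String, String> EXTENSIONS = new HashMap<>();
    static {
        EXTENSIONS.put("image/png", "png");
        EXTENSIONS.put("image/gif", "gif");
        EXTENSIONS.put("image/jpeg", "jpg");
    }

    /** Returns the guessed MIME type, or null when the JDK does not recognize the bytes. */
    static String guessMimeType(byte[] content) {
        try (InputStream is = new ByteArrayInputStream(content)) {
            return URLConnection.guessContentTypeFromStream(is);
        } catch (IOException e) {
            // Not expected for an in-memory stream, but the API declares it.
            throw new UncheckedIOException(e);
        }
    }

    /** Returns an extension for the guessed MIME type, or null when either step fails. */
    static String guessExtension(byte[] content) {
        String mimeType = guessMimeType(content);
        return mimeType == null ? null : EXTENSIONS.get(mimeType);
    }

    public static void main(String[] args) {
        byte[] gifHeader = "GIF89a".getBytes(StandardCharsets.US_ASCII);
        System.out.println(guessMimeType(gifHeader));
        System.out.println(guessExtension(gifHeader));
    }
}
```

    Callers should be prepared for a null result, since many formats are not covered by the JDK's built-in checks.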
    

    Hope this helps...

  • 2020-12-05 05:26

    "Maybe I need to save additional column in my DB for file extension."

    That is a better solution than attempting to deduce a MIME type from the database content, for (at least) the following reasons:

    • If you have a MIME type from the document source, you can store and use that.
    • You could (potentially) ask the user to specify a MIME type when they lodge the document.
    • If you have to use some heuristic-based scheme for figuring out a MIME type:
      • you can do the work once before creating the table row, rather than N times after extracting it, and
      • you can report cases where the heuristic gives no good answer, and maybe ask the user to say what the file type really is.
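
    A minimal sketch of that "decide once, before the row is created" flow follows. MimeTypeResolver and resolve are hypothetical names, and the JDK's URLConnection.guessContentTypeFromStream merely stands in for whatever heuristic you choose.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URLConnection;
import java.util.Optional;

final class MimeTypeResolver {

    /**
     * Decides the MIME type once, before the row is written.
     * An empty result means the heuristic gave no answer and the caller
     * should ask the user what the file type really is.
     */
    static Optional<String> resolve(String declaredMimeType, byte[] content) {
        // Prefer what the document source (or the user) supplied.
        if (declaredMimeType != null && !declaredMimeType.trim().isEmpty()) {
            return Optional.of(declaredMimeType.trim());
        }
        // Otherwise fall back to a heuristic on the leading bytes.
        try {
            return Optional.ofNullable(
                    URLConnection.guessContentTypeFromStream(new ByteArrayInputStream(content)));
        } catch (IOException e) {
            // Not expected for an in-memory stream, but the API declares it.
            throw new UncheckedIOException(e);
        }
    }
}
```

    The resolved value would be stored in its own column, so later reads never need to re-run the heuristic.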

    (I'm making some assumptions that may not be warranted, but the question doesn't give any clues on how the larger system is intended to work.)
