How can I get MIME type of an InputStream of a file that is being uploaded?

离开以前 2020-12-30 21:51

Simple question: how can I get the MIME type (or content type) of an InputStream, without saving the file, for a file that a user is uploading to my servlet?

8 Answers
  • 2020-12-30 22:20

    I'm a big proponent of "do it yourself first, then look for a library solution". Luckily, this case is just that.

    You have to know the file's "magic number", i.e. its signature. Let me give an example of detecting whether the InputStream represents a PNG file.

    The PNG signature is formed by concatenating the following bytes (values in hex):

    1) error-checking byte - 0x89 (a non-ASCII value that exposes transfers which strip the high bit)

    2) string "PNG" in ASCII:

         P - 0x50
         N - 0x4E
         G - 0x47
    

    3) CR (carriage return) - 0x0D

    4) LF (line feed) - 0x0A

    5) SUB (substitute) - 0x1A

    6) LF (line feed) - 0x0A

    So, the magic number is

    89   50 4E 47 0D 0A 1A 0A (hex)
    
    137  80 78 71 13 10 26 10 (decimal)
    -119 80 78 71 13 10 26 10 (in Java)
    

    Explanation of 137 -> -119 conversion

    An N-bit number can represent 2^N different values. For a byte (8 bits) that is 2^8 = 256 values, i.e. the range 0..255 when read as unsigned. Java treats byte primitives as signed, so their range is -128..127. Thus the unsigned value 137 is stored as the signed value -119 = 137 - 256.
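
    The same conversion can be checked in Java. This is only a small sketch; the class and method names (SignedByteDemo, toSignedByte, toUnsigned) are made up for illustration.

```java
final class SignedByteDemo {
    /** Narrows an unsigned value in 0..255 to the signed byte Java stores for it. */
    static byte toSignedByte(int unsigned) {
        return (byte) unsigned;
    }

    /** Recovers the unsigned value 0..255 from a Java byte by masking off the sign extension. */
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte first = toSignedByte(0x89);        // first byte of the PNG signature
        System.out.println(first);              // prints -119
        System.out.println(toUnsigned(first));  // prints 137
    }
}
```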

    Example in Kotlin

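    import java.io.InputStream

    // Consumes the first bytes of the stream and compares them with the PNG signature.
    private fun InputStream.isPng(): Boolean {
        val magicNumbers = byteArrayOf(-119, 80, 78, 71, 13, 10, 26, 10)
        val signatureBytes = ByteArray(magicNumbers.size)
        // A single read() may return fewer bytes than requested; a short read yields false here.
        val bytesRead = read(signatureBytes, 0, signatureBytes.size)
        return bytesRead == signatureBytes.size && signatureBytes.contentEquals(magicNumbers)
    }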
    

    Of course, in order to support many MIME types, you have to scale this solution somehow, and if you are not happy with the result, consider some library.
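
    One way to scale this is a small table of signatures checked against the first bytes of the stream. The Java sketch below is only an illustration under some assumptions: the class MagicNumberSniffer and its sniff method are hypothetical names, the table covers just four well-known formats, and the stream must support mark/reset so that the bytes can be read again afterwards.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class MagicNumberSniffer {
    /** One table row: the leading bytes of a format and the MIME type they indicate. */
    private static final class Signature {
        final String mimeType;
        final byte[] magic;

        Signature(String mimeType, int... magic) {
            this.mimeType = mimeType;
            this.magic = new byte[magic.length];
            for (int i = 0; i < magic.length; i++) {
                this.magic[i] = (byte) magic[i]; // e.g. 0x89 becomes -119
            }
        }
    }

    // Deliberately short table; extend it with the formats you need.
    private static final Signature[] SIGNATURES = {
        new Signature("image/png", 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A),
        new Signature("image/jpeg", 0xFF, 0xD8, 0xFF),
        new Signature("image/gif", 0x47, 0x49, 0x46, 0x38),        // "GIF8"
        new Signature("application/pdf", 0x25, 0x50, 0x44, 0x46),  // "%PDF"
    };
    private static final int MAX_MAGIC_LENGTH = 8;

    /** Returns the MIME type matching the start of the stream, or null; the stream is rewound. */
    static String sniff(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        byte[] head = new byte[MAX_MAGIC_LENGTH];
        int filled = 0;
        in.mark(head.length);
        try {
            int n;
            // read() may return fewer bytes than requested, so loop until full or EOF
            while (filled < head.length
                    && (n = in.read(head, filled, head.length - filled)) > 0) {
                filled += n;
            }
        } finally {
            in.reset(); // later readers see the stream from its first byte again
        }
        for (Signature s : SIGNATURES) {
            if (startsWith(head, filled, s.magic)) {
                return s.mimeType;
            }
        }
        return null;
    }

    private static boolean startsWith(byte[] data, int length, byte[] prefix) {
        if (length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] pngStart = {(byte) 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n', 0, 0};
        InputStream upload = new BufferedInputStream(new ByteArrayInputStream(pngStart));
        System.out.println(sniff(upload)); // prints image/png
        System.out.println(upload.read()); // prints 137: the stream was rewound
    }
}
```

    In a servlet, the upload stream could presumably be wrapped in a BufferedInputStream before calling sniff, since a raw request stream may not support mark/reset.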

  • 2020-12-30 22:23

    I wrote my own content-type detector for a byte[] because the libraries above weren't suitable or I didn't have access to them. Hopefully this helps someone out.

    // retrieve file as byte[]
    byte[] b = odHit.retrieve( "" );
    
    // copy the top 32 bytes (fewer if the file is shorter) and pass them to guessMimeType(byte[]);
    // needs java.util.Arrays
    byte[] topOfStream = Arrays.copyOf(b, Math.min(b.length, 32));
    String mimeGuess = guessMimeType(topOfStream);
    

    ...

    // needs: java.io.FileInputStream, java.io.IOException,
    //        java.nio.charset.StandardCharsets, java.util.Properties
    private static String guessMimeType(byte[] topOfStream) {
    
        Properties magicmimes = new Properties();
    
        // Read in the magicmimes.properties file (an example is listed below);
        // try-with-resources closes the stream even if load() fails
        try (FileInputStream in = new FileInputStream( "magicmimes.properties" )) {
            magicmimes.load(in);
        } catch (IOException e) {
            // also covers FileNotFoundException
            e.printStackTrace();
        }
    
        // Decode as ISO-8859-1 so that every byte maps to exactly one char (\u0000-\u00ff),
        // independent of the platform default charset
        String head = new String(topOfStream, StandardCharsets.ISO_8859_1);
    
        // loop over each file signature; if one matches the start of the data, return its MIME type
        for (String key : magicmimes.stringPropertyNames()) {
            if (head.startsWith(key)) {
                return magicmimes.getProperty(key);
            }
        }
    
        // no signature matched
        return null;
    }
    

    magicmimes.properties file example (not sure these signatures are correct, but they worked for my uses)

    # SignatureKey                  content/type
    \u0000\u201E\u00f1\u00d9        text/plain
    \u0025\u0050\u0044\u0046        application/pdf
    %PDF                            application/pdf
    \u0042\u004d                    image/bmp
    GIF8                            image/gif
    \u0047\u0049\u0046\u0038        image/gif
    \u0049\u0049\u002a\u0000        image/tiff
    \u004d\u004d\u0000\u002a        image/tiff
    \u0089\u0050\u004e\u0047        image/png
    \u00ff\u00d8\u00ff\u00e0        image/jpeg
    