Detecting a password-protected document

前端 未结 2 1445
花落未央
花落未央 2020-12-03 00:04

Is there any way to know whether a doc/ppt/xls file is password-protected even before opening the file?

相关标签:
2条回答
  • 2020-12-03 00:32

    I have created an utility method that tries to detect if a given office document is protected by a password or not. Here are the list of advantages:

    • supports Word, Excel and PowerPoint documents, both legacy (doc, xls, ppt) and new OpenXml version (docx, xlsx, pptx)
    • does not depend on COM or any other library
    • requires only System, System.IO and System.Text namespaces
    • pretty fast and reliable detection (respects legacy .doc, .ppt and .xls file formats)
    • low memory usage (maximum 64KB)

    Here is the code, hope someone will find it useful:

    public static class MsOfficeHelper
    {
        /// <summary>
        /// Detects if a given office document is protected by a password or not.
        /// Supported formats: Word, Excel and PowerPoint (both legacy and OpenXml).
        /// </summary>
        /// <param name="fileName">Path to an office document.</param>
        /// <returns>True if document is protected by a password, false otherwise.</returns>
        public static bool IsPasswordProtected(string fileName)
        {
            using (var stream = File.OpenRead(fileName))
                return IsPasswordProtected(stream);
        }
    
        /// <summary>
        /// Detects if a given office document is protected by a password or not.
        /// Supported formats: Word, Excel and PowerPoint (both legacy and OpenXml).
        /// </summary>
        /// <param name="stream">Office document stream.</param>
        /// <returns>True if document is protected by a password, false otherwise.</returns>
        public static bool IsPasswordProtected(Stream stream)
        {
            // minimum file size for office file is 4k
            if (stream.Length < 4096)
                return false;
    
            // read file header
            stream.Seek(0, SeekOrigin.Begin);
            var compObjHeader = new byte[0x20];
            ReadFromStream(stream, compObjHeader);
    
            // check if we have plain zip file
            if (compObjHeader[0] == 'P' && compObjHeader[1] == 'K')
            {
                // this is a plain OpenXml document (not encrypted)
                return false;
            }
    
            // check compound object magic bytes
            if (compObjHeader[0] != 0xD0 || compObjHeader[1] != 0xCF)
            {
                // unknown document format
                return false;
            }
    
            int sectionSizePower = compObjHeader[0x1E];
            if (sectionSizePower < 8 || sectionSizePower > 16)
            {
                // invalid section size
                return false;
            }
            int sectionSize = 2 << (sectionSizePower - 1);
    
            const int defaultScanLength = 32768;
            long scanLength = Math.Min(defaultScanLength, stream.Length);
    
            // read header part for scan
            stream.Seek(0, SeekOrigin.Begin);
            var header = new byte[scanLength];
            ReadFromStream(stream, header);
    
            // check if we detected password protection
            if (ScanForPassword(stream, header, sectionSize))
                return true;
    
            // if not, try to scan footer as well
    
            // read footer part for scan
            stream.Seek(-scanLength, SeekOrigin.End);
            var footer = new byte[scanLength];
            ReadFromStream(stream, footer);
    
            // finally return the result
            return ScanForPassword(stream, footer, sectionSize);
        }
    
        static void ReadFromStream(Stream stream, byte[] buffer)
        {
            int bytesRead, count = buffer.Length;
            while (count > 0 && (bytesRead = stream.Read(buffer, 0, count)) > 0)
                count -= bytesRead;
            if (count > 0) throw new EndOfStreamException();
        }
    
        static bool ScanForPassword(Stream stream, byte[] buffer, int sectionSize)
        {
            const string afterNamePadding = "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0";
    
            try
            {
                string bufferString = Encoding.ASCII.GetString(buffer, 0, buffer.Length);
    
                // try to detect password protection used in new OpenXml documents
                // by searching for "EncryptedPackage" or "EncryptedSummary" streams
                const string encryptedPackageName = "E\0n\0c\0r\0y\0p\0t\0e\0d\0P\0a\0c\0k\0a\0g\0e" + afterNamePadding;
                const string encryptedSummaryName = "E\0n\0c\0r\0y\0p\0t\0e\0d\0S\0u\0m\0m\0a\0r\0y" + afterNamePadding;
                if (bufferString.Contains(encryptedPackageName) ||
                    bufferString.Contains(encryptedSummaryName))
                    return true;
    
                // try to detect password protection for legacy Office documents
                const int coBaseOffset = 0x200;
                const int sectionIdOffset = 0x74;
    
                // check for Word header
                const string wordDocumentName = "W\0o\0r\0d\0D\0o\0c\0u\0m\0e\0n\0t" + afterNamePadding;
                int headerOffset = bufferString.IndexOf(wordDocumentName, StringComparison.InvariantCulture);
                int sectionId;
                if (headerOffset >= 0)
                {
                    sectionId = BitConverter.ToInt32(buffer, headerOffset + sectionIdOffset);
                    int sectionOffset = coBaseOffset + sectionId * sectionSize;
                    const int fibScanSize = 0x10;
                    if (sectionOffset < 0 || sectionOffset + fibScanSize > stream.Length)
                        return false; // invalid document
                    var fibHeader = new byte[fibScanSize];
                    stream.Seek(sectionOffset, SeekOrigin.Begin);
                    ReadFromStream(stream, fibHeader);
                    short properties = BitConverter.ToInt16(fibHeader, 0x0A);
                    // check for fEncrypted FIB bit
                    const short fEncryptedBit = 0x0100;
                    return (properties & fEncryptedBit) == fEncryptedBit;
                }
    
                // check for Excel header
                const string workbookName = "W\0o\0r\0k\0b\0o\0o\0k" + afterNamePadding;
                headerOffset = bufferString.IndexOf(workbookName, StringComparison.InvariantCulture);
                if (headerOffset >= 0)
                {
                    sectionId = BitConverter.ToInt32(buffer, headerOffset + sectionIdOffset);
                    int sectionOffset = coBaseOffset + sectionId * sectionSize;
                    const int streamScanSize = 0x100;
                    if (sectionOffset < 0 || sectionOffset + streamScanSize > stream.Length)
                        return false; // invalid document
                    var workbookStream = new byte[streamScanSize];
                    stream.Seek(sectionOffset, SeekOrigin.Begin);
                    ReadFromStream(stream, workbookStream);
                    short record = BitConverter.ToInt16(workbookStream, 0);
                    short recordSize = BitConverter.ToInt16(workbookStream, sizeof(short));
                    const short bofMagic = 0x0809;
                    const short eofMagic = 0x000A;
                    const short filePassMagic = 0x002F;
                    if (record != bofMagic)
                        return false; // invalid BOF
                    // scan for FILEPASS record until the end of the buffer
                    int offset = sizeof(short) * 2 + recordSize;
                    int recordsLeft = 16; // simple infinite loop check just in case
                    do
                    {
                        record = BitConverter.ToInt16(workbookStream, offset);
                        if (record == filePassMagic)
                            return true;
                        recordSize = BitConverter.ToInt16(workbookStream, sizeof(short) + offset);
                        offset += sizeof(short) * 2 + recordSize;
                        recordsLeft--;
                    } while (record != eofMagic && recordsLeft > 0);
                }
    
                // check for PowerPoint user header
                const string currentUserName = "C\0u\0r\0r\0e\0n\0t\0 \0U\0s\0e\0r" + afterNamePadding;
                headerOffset = bufferString.IndexOf(currentUserName, StringComparison.InvariantCulture);
                if (headerOffset >= 0)
                {
                    sectionId = BitConverter.ToInt32(buffer, headerOffset + sectionIdOffset);
                    int sectionOffset = coBaseOffset + sectionId * sectionSize;
                    const int userAtomScanSize = 0x10;
                    if (sectionOffset < 0 || sectionOffset + userAtomScanSize > stream.Length)
                        return false; // invalid document
                    var userAtom = new byte[userAtomScanSize];
                    stream.Seek(sectionOffset, SeekOrigin.Begin);
                    ReadFromStream(stream, userAtom);
                    const int headerTokenOffset = 0x0C;
                    uint headerToken = BitConverter.ToUInt32(userAtom, headerTokenOffset);
                    // check for headerToken
                    const uint encryptedToken = 0xF3D1C4DF;
                    return headerToken == encryptedToken;
                }
            }
            catch (Exception ex)
            {
                // BitConverter exceptions may be related to document format problems
                // so we just treat them as "password not detected" result
                if (ex is ArgumentException)
                    return false;
                // respect all the rest exceptions
                throw;
            }
    
            return false;
        }
    }
    
    0 讨论(0)
  • 2020-12-03 00:36

    Here is a crude version of a password detecter i made. Does not need to open any Office objects.

        public static bool IsPassworded(string file) {
            var bytes = File.ReadAllBytes(file);
            return IsPassworded(bytes);
            return false;
        }
        public static bool IsPassworded(byte[] bytes) {
            var prefix = Encoding.Default.GetString(bytes.Take(2).ToArray());
            if (prefix == "PK") {
                //ZIP and not password protected
                return false;
            }
            if (prefix == "ÐÏ") {
                //Office format.
    
                //Flagged with password
                if (bytes.Skip(0x20c).Take(1).ToArray()[0] == 0x2f) return true; //XLS 2003
                if (bytes.Skip(0x214).Take(1).ToArray()[0] == 0x2f) return true; //XLS 2005
                if (bytes.Skip(0x20B).Take(1).ToArray()[0] == 0x13) return true; //DOC 2005
    
                if (bytes.Length < 2000) return false; //Guessing false
                var start = Encoding.Default.GetString(bytes.Take(2000).ToArray()); //DOC/XLS 2007+
                start = start.Replace("\0", " ");
                if (start.Contains("E n c r y p t e d P a c k a g e")) return true;
                return false;
            }
    
            //Unknown.
            return false;
        }
    

    It might not be 100%. The flags I found by comparing several Excel and Word documents with and without password. To add for PowerPoint just do the same.

    0 讨论(0)
提交回复
热议问题