C# - Check if File is Text Based

爱一瞬间的悲伤 2020-11-29 09:19

How can I test whether a file that I'm opening in C# using FileStream is a "text type" file? I would like my program to open any file that is text based, for example, .t

6 Answers
  • 2020-11-29 09:56
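    public bool IsTextFile(string FilePath)
    {
      using (StreamReader reader = new StreamReader(FilePath))
      {
           int Character;
           // Read one decoded character at a time; Read() returns -1 at end of file.
           while ((Character = reader.Read()) != -1)
           {
               // Control characters 1-7 and 14-25 are taken as a sign of binary content.
               // Codes 8-13 (backspace, tab, line breaks) are allowed; NUL (0) is not flagged.
               if ((Character > 0 && Character < 8) || (Character > 13 && Character < 26))
               {
                        return false; 
               }
           }
      }
      return true;
    }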
    
  • 2020-11-29 09:59

    To get the real type of a file, you must check its header, which does not change even if the extension is modified. You can get the header list here, and use something like this in your code:

    using(var stream = new FileStream(fileName, FileMode.Open, FileAccess.Read))
    {
       using(var reader = new BinaryReader(stream))
       {
         // Read the first X bytes of the file (X depends on the format).
         // This example checks whether the file is a BMP, whose header
         // starts with 42 4D in hex ("BM", decimal 66 and 77).
         // ReadBytes returns fewer bytes if the file is shorter than requested.
         byte[] header = reader.ReadBytes(2);
         if (header.Length == 2 && header[0] == 0x42 && header[1] == 0x4D)
         {
            //it's a BMP file
         }
       }
    }
    
  • 2020-11-29 10:03

    In general: there is no way to tell.

    A text file stored in UTF-16 will likely look like binary if you open it with an 8-bit encoding. Equally, someone could save a text file as a .doc (it is a document).

    While you could open the file and look at some of the content, all such heuristics will sometimes fail. For example, Notepad tries to do this, and with a careful selection of a few characters it will guess wrong and display completely different content.

    If you have a specific scenario, rather than being able to open and process anything, you should be able to do much better.
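
    One narrow case where the UTF-16 problem above can be sidestepped is a file that starts with a byte order mark (BOM). The sketch below is only an illustration: `BomSniffer` and `DetectBom` are hypothetical names, not part of any library, and a missing BOM proves nothing because many text files have none. A UTF-16 LE file whose first character is NUL would also be misreported as UTF-32 LE.

```csharp
using System;
using System.IO;

public static class BomSniffer
{
    // Hypothetical helper: returns a label for the byte order mark found
    // at the start of the file, or null when no known BOM is present.
    public static string DetectBom(string filePath)
    {
        var head = new byte[4];
        int read;
        using (var stream = new FileStream(filePath, FileMode.Open, FileAccess.Read))
        {
            // May read fewer than 4 bytes for very short files.
            read = stream.Read(head, 0, head.Length);
        }

        if (read >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
            return "UTF-8";
        // UTF-32 LE is tested before UTF-16 LE because both start with FF FE.
        if (read >= 4 && head[0] == 0xFF && head[1] == 0xFE && head[2] == 0x00 && head[3] == 0x00)
            return "UTF-32 LE";
        if (read >= 4 && head[0] == 0x00 && head[1] == 0x00 && head[2] == 0xFE && head[3] == 0xFF)
            return "UTF-32 BE";
        if (read >= 2 && head[0] == 0xFF && head[1] == 0xFE)
            return "UTF-16 LE";
        if (read >= 2 && head[0] == 0xFE && head[1] == 0xFF)
            return "UTF-16 BE";

        return null; // no BOM: the file may still be text
    }
}
```

    If a BOM is found, the file can be opened with the matching encoding; if not, you are back to guessing.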

  • 2020-11-29 10:08

    I guess you could just check the first 1000 (an arbitrary number) characters and see whether any of them are unprintable, or whether they are all ASCII in a certain range. If the latter, assume that it is text?

    Whatever you do is going to be a guess.
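
    A minimal sketch of that idea, assuming "unprintable" means a control character other than tab, line feed, carriage return, or form feed. The names `TextHeuristic` and `LooksLikeText` are hypothetical, and treating the replacement character U+FFFD (which `StreamReader` emits for bytes it cannot decode) as suspicious is an extra assumption, not something the answer asks for.

```csharp
using System;
using System.IO;

public static class TextHeuristic
{
    // Hypothetical helper: samples up to maxChars decoded characters and
    // returns false as soon as one of them looks unprintable.
    public static bool LooksLikeText(string filePath, int maxChars = 1000)
    {
        using (var reader = new StreamReader(filePath))
        {
            var buffer = new char[maxChars];
            // ReadBlock returns the number of characters actually read.
            int read = reader.ReadBlock(buffer, 0, maxChars);

            for (int i = 0; i < read; i++)
            {
                char c = buffer[i];
                bool allowedWhitespace = c == '\t' || c == '\n' || c == '\r' || c == '\f';

                // Control characters other than common whitespace: guess "binary".
                if (char.IsControl(c) && !allowedWhitespace)
                    return false;

                // Assumption: a decoding failure also counts as "not text".
                if (c == '\uFFFD')
                    return false;
            }
        }

        return true; // nothing suspicious in the sampled prefix (or the file is empty)
    }
}
```

    This is still only a guess: a binary file whose first 1000 characters happen to be printable passes the check, and so does an empty file.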

  • 2020-11-29 10:13

    As others have pointed out, there is no absolute way to be sure. However, some implementations check for consecutive NUL characters. Git apparently just checks the first 8000 bytes for a NUL and, if it finds one, treats the file as binary. See here for more details.

    Here is a similar C# solution I wrote that looks for a given number of consecutive NUL characters:

    public bool IsBinary(string filePath, int requiredConsecutiveNul = 1)
    {
        const int charsToCheck = 8000;
        const char nulChar = '\0';
    
        int nulCount = 0;
    
        using (var streamReader = new StreamReader(filePath))
        {
            for (var i = 0; i < charsToCheck; i++)
            {
                if (streamReader.EndOfStream)
                    return false;
    
                if ((char) streamReader.Read() == nulChar)
                {
                    nulCount++;
    
                    if (nulCount >= requiredConsecutiveNul)
                        return true;
                }
                else
                {
                    nulCount = 0;
                }
            }
        }
    
        return false;
    }
    
  • 2020-11-29 10:17

    The solution below works for me. It is a general solution that checks for all types of binary files.

         /// <summary>
         /// Checks whether the given file is binary. Any byte outside the
         /// 7-bit ASCII range (0-127) is treated as a sign of binary content.
         /// </summary>
         public bool CheckForBinary(string filePath)
         {
                 // The using block disposes the stream even if a read throws.
                 using (Stream objStream = new FileStream(filePath, FileMode.Open, FileAccess.Read))
                 {
                     int a;

                     // Iterate through the stream and check the value of each byte.
                     // ReadByte returns -1 at the end of the stream.
                     while ((a = objStream.ReadByte()) != -1)
                     {
                         if (a > 127)
                         {
                             return true;      // Binary File
                         }
                     }
                 }

                 return false;                 // Text File (only ASCII bytes, or empty)
         }
    