Using Regex to extract table names from a file containing SQL queries

慢半拍i 2021-01-05 11:28

I have a text file containing a large number of queries. I want to get all the distinct tables used across all the queries in the file. The table name can come after a 'fro

5 answers
  • You can try this, but it doesn't work for all types of query:

    // Needs: using System.IO; using System.Text.RegularExpressions;
    public void Main()
    {
        // Read the whole query file in one call
        string text = File.ReadAllText(@"D:\ssis\queryfile.txt");

        // Strip the square brackets around quoted identifiers
        string text1 = text.Replace("[", string.Empty).Replace("]", string.Empty);

        // Capture the token that follows FROM or JOIN (whole words only)
        Regex r = new Regex(@"\b(?:from|join)\s+(?<table>\S+)", RegexOptions.IgnoreCase | RegexOptions.Compiled);

        // Open the output file once, in append mode, and write every match to it
        using (StreamWriter wr = new StreamWriter(@"D:\ssis\writefile.txt", true))
        {
            for (Match m = r.Match(text1); m.Success; m = m.NextMatch())
            {
                // Use the named group so the leading whitespace is not included,
                // then replace "." with " ," to get comma-separated values
                wr.WriteLine(m.Groups["table"].Value.Replace(".", " ,"));
            }
        }
    }
    
  • 2021-01-05 11:49

    It depends on the structure of your file. Try this:

    (?<=from|join)(\s+\w+\b)
    

    Also turn on the Multiline option if you do not split your file into an array (or something similar) of single-line strings. Also turn on the IgnoreCase option.
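
    A minimal sketch of how this pattern behaves on text that spans several lines. The class name, method name and sample query are made up for illustration. In .NET, Multiline only changes where ^ and $ match, and this pattern has no anchors, so IgnoreCase is the option that affects the result here; \s+ already crosses line breaks on its own.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Text.RegularExpressions;

    public static class OptionsDemo
    {
        // Pattern from this answer; IgnoreCase lets it match both "from" and "JOIN".
        private static readonly Regex Pattern =
            new Regex(@"(?<=from|join)(\s+\w+\b)", RegexOptions.IgnoreCase);

        public static List<string> Tables(string sql)
        {
            var result = new List<string>();
            foreach (Match m in Pattern.Matches(sql))
            {
                // Group 1 includes the leading whitespace, so trim it.
                result.Add(m.Groups[1].Value.Trim());
            }
            return result;
        }
    }

    public static class Program
    {
        public static void Main()
        {
            // Hypothetical input with the table name on its own line.
            string sql = "select a\nfrom\n  t1\nJOIN t2 on t1.id = t2.id";
            foreach (string name in OptionsDemo.Tables(sql))
            {
                Console.WriteLine(name); // prints t1, then t2
            }
        }
    }
    ```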

  • 2021-01-05 11:53

    I'd use:

    r = new Regex(@"(from|join)\s+(?<table>\S+)", RegexOptions.IgnoreCase);
    

    Once you have the Match object "m", you'll have the table name with

    m.Groups["table"].Value
    

    Example:

    string line = @"select * from tb_name join tb_name2 ON a=b WHERE x=y";
    Regex r = new Regex(@"(from|join)\s+(?<table>\S+)",
             RegexOptions.IgnoreCase|RegexOptions.Compiled);
    
    Match m = r.Match(line);
    while (m.Success) {
       Console.WriteLine (m.Groups["table"].Value);
       m = m.NextMatch();
    }
    

    It will print tb_name and tb_name2, one per line.
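
    Since the question asks for the distinct tables, the matches can be collected into a set instead of being printed directly. This is only a sketch: the class and method names are invented here, names are assumed to be case-insensitive, and a word boundary is added so that a word merely ending in "from" is not treated as a keyword.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Text.RegularExpressions;

    public static class TableNames
    {
        // Same idea as the pattern above, restricted to whole words.
        private static readonly Regex Pattern = new Regex(
            @"\b(?:from|join)\s+(?<table>\S+)", RegexOptions.IgnoreCase);

        // Returns each table name once, comparing names case-insensitively.
        public static SortedSet<string> Distinct(string sql)
        {
            var names = new SortedSet<string>(StringComparer.OrdinalIgnoreCase);
            foreach (Match m in Pattern.Matches(sql))
            {
                names.Add(m.Groups["table"].Value);
            }
            return names;
        }
    }

    public static class Program
    {
        public static void Main()
        {
            string sql = "select * from tb_name join tb_name2 on a=b; "
                       + "SELECT x FROM TB_NAME where y=1";
            foreach (string name in TableNames.Distinct(sql))
            {
                Console.WriteLine(name); // prints tb_name, then tb_name2
            }
        }
    }
    ```

    Because \S+ takes everything up to the next whitespace, a name written directly before ";" or ")" keeps that character, and a subquery such as "from (select" yields "(select", so the results may still need filtering.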

  • 2021-01-05 11:53
    (from|join)\s(\w+)
    
  • 2021-01-05 11:57

    Something like this maybe:

    /(from|join)\s+(\w*\.)*(?<tablename>\w+)/
    

    It won't match escaped table names though, and you need to make the regex evaluation case-insensitive.
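
    One possible way to cover escaped names as well, assuming SQL Server style square brackets (other quoting styles are not handled). The class name, method name and sample query are hypothetical. It relies on .NET allowing the same group name to be used in both alternatives.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Text.RegularExpressions;

    public static class EscapedNames
    {
        // Optional qualifiers such as [dbo]. or sales. are skipped, then the last
        // part is captured either from inside [brackets] or as a plain word.
        private static readonly Regex Pattern = new Regex(
            @"\b(?:from|join)\s+(?:(?:\[[^\]]+\]|\w+)\.)*" +
            @"(?:\[(?<tablename>[^\]]+)\]|(?<tablename>\w+))",
            RegexOptions.IgnoreCase);

        public static List<string> Tables(string sql)
        {
            var result = new List<string>();
            foreach (Match m in Pattern.Matches(sql))
            {
                result.Add(m.Groups["tablename"].Value);
            }
            return result;
        }
    }

    public static class Program
    {
        public static void Main()
        {
            string sql = "SELECT * FROM [dbo].[Order Details] od "
                       + "JOIN sales.Customers c ON od.cid = c.id";
            foreach (string name in EscapedNames.Tables(sql))
            {
                Console.WriteLine(name); // prints Order Details, then Customers
            }
        }
    }
    ```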
