I've a text file containing a large number of queries. I want to get all the distinct tables used in the entire file in all the queries. The table name can come after a 'fro
can try this, but it doesn't work for all types of query:
// Requires: using System.IO; using System.Text.RegularExpressions; using System.Windows.Forms;
public void Main()
{
    // Read the whole query file into a single string.
    string text;
    using (StreamReader sr = new StreamReader(@"D:\ssis\queryfile.txt"))
    {
        text = sr.ReadToEnd();
    }
    MessageBox.Show(text);

    // Remove the square brackets around identifiers.
    var text1 = text.Replace("[", string.Empty).Replace("]", string.Empty);
    MessageBox.Show(text1);

    // Regex for extracting the table name after FROM and JOIN.
    Regex r = new Regex(@"(?<=from|join)\s+(?<table>\S+)", RegexOptions.IgnoreCase | RegexOptions.Compiled);
    Match m = r.Match(text1);
    MessageBox.Show(m.Groups[1].Value);

    // Append every match to the output file.
    using (StreamWriter wr = new StreamWriter(@"D:\ssis\writefile.txt", true))
    {
        while (m.Success)
        {
            var v = m.Groups[0].Value;
            var text2 = v.Replace(".", " ,"); // turn the dots into comma-separated values
            wr.WriteLine(text2);
            m = m.NextMatch();
        }
    }
}
It depends on the structure of your file. Try this:
(?<=from|join)(\s+\w+\b)
Also turn on the Multiline option if you have not split the file into an array (or something similar) of single-line strings, and try turning on the IgnoreCase option.
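To see what the lookbehind pattern above returns, here is a minimal sketch. The class and method names (`LookbehindDemo`, `FindTables`) are made up for illustration. Note that group 1 includes the leading whitespace, so the sketch trims it. Since `\s` already matches line breaks, it runs on a multi-line string with only IgnoreCase set.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class LookbehindDemo
{
    // Pattern from the answer above; group 1 is the whitespace plus the name.
    private static readonly Regex TableAfterKeyword =
        new Regex(@"(?<=from|join)(\s+\w+\b)", RegexOptions.IgnoreCase);

    // Returns the captured names in order of appearance, trimmed of whitespace.
    public static List<string> FindTables(string sql)
    {
        var names = new List<string>();
        foreach (Match m in TableAfterKeyword.Matches(sql))
        {
            names.Add(m.Groups[1].Value.Trim());
        }
        return names;
    }
}
```

Calling `LookbehindDemo.FindTables("SELECT * FROM orders o\nJOIN customers c ON o.id = c.id")` should give `orders` and `customers`. For a schema-qualified name such as `dbo.orders`, `\w+` stops at the dot, so this pattern would return only `dbo`.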
I'd use:
r = new Regex(@"(from|join)\s+(?<table>\S+)", RegexOptions.IgnoreCase);
Once you have the Match object "m", you'll have the table name with
m.Groups["table"].Value
Example:
string line = @"select * from tb_name join tb_name2 ON a=b WHERE x=y";
Regex r = new Regex(@"(from|join)\s+(?<table>\S+)",
    RegexOptions.IgnoreCase | RegexOptions.Compiled);
Match m = r.Match(line);
while (m.Success) {
    Console.WriteLine(m.Groups["table"].Value);
    m = m.NextMatch();
}
It will print tb_name and tb_name2, each on its own line.
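Since the question asks for the distinct tables, the same loop can feed a set instead of printing each match. The following is only a sketch under a few assumptions: the names `DistinctTables` and `Extract` are invented here, table names are compared case-insensitively, and square brackets are stripped first, as the question's code does.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class DistinctTables
{
    // Same pattern as in the answer above.
    private static readonly Regex TablePattern =
        new Regex(@"(from|join)\s+(?<table>\S+)", RegexOptions.IgnoreCase);

    // Collects each table name once, ignoring case (an assumption), in sorted order.
    public static SortedSet<string> Extract(string sql)
    {
        var tables = new SortedSet<string>(StringComparer.OrdinalIgnoreCase);

        // Strip square brackets so [tb_name] and tb_name count as the same table.
        string cleaned = sql.Replace("[", string.Empty).Replace("]", string.Empty);

        for (Match m = TablePattern.Match(cleaned); m.Success; m = m.NextMatch())
        {
            tables.Add(m.Groups["table"].Value);
        }
        return tables;
    }
}
```

Because `\S+` takes everything up to the next whitespace, a name followed directly by `;` or `)` keeps that character, and a subquery after `from` is captured as `(select`. Input like that would need extra handling.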
(from|join)\s(\w+)
Something like this maybe:
/(from|join)\s+(\w*\.)*(?<tablename>\w+)/
It won't match escaped table names though, and you need to make the regex evaluation case-insensitive.
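A rough sketch of how this pattern could be used from C#, where the surrounding slashes are dropped and case-insensitivity is set through `RegexOptions.IgnoreCase`. The class and method names (`QualifiedNames`, `Extract`) are placeholders, not anything from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class QualifiedNames
{
    // (\w*\.)* skips any database/schema prefixes; "tablename" is the last part.
    private static readonly Regex Pattern =
        new Regex(@"(from|join)\s+(\w*\.)*(?<tablename>\w+)", RegexOptions.IgnoreCase);

    // Returns the unqualified table names in order of appearance.
    public static List<string> Extract(string sql)
    {
        var names = new List<string>();
        foreach (Match m in Pattern.Matches(sql))
        {
            names.Add(m.Groups["tablename"].Value);
        }
        return names;
    }
}
```

With `"SELECT * FROM dbo.Orders o JOIN Sales.dbo.Customers c ON o.id = c.id"` this should return `Orders` and `Customers`. A bracketed name such as `[dbo].[Orders]` produces no match, which is the escaped-name limitation mentioned above.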