Skip to content
Bruno Sonnino
Menu
  • Home
  • About
Menu

Parsing SQL Server code with C# – Part 2 – Advanced parsing

Posted on 28 August 2018

In the last post, I've shown how to parse SQL Server code with C# and get all tokens in it, showing their types. This is very nice, there is a lot you can do with that, but there is a pitfall: you don't have the context the token was used. For example, you have an identifier, but you don't know if it is a parameter or a variable declared in the procedure. You don't know its type and where it is used.

Sometimes you need more information about the token and you can't get it unless you analyze the code. Fortunately, Microsoft has already done that for us and provided a wonderful tool, so we can get the context of the tokens: you can create a new class that inherits from TSqlFragmentVisitor and override its methods, to visit the node types you want (if this doesn't make sense to you right now, keep on reading, you'll see it's very simple).

The TSqlFragmentVisitor class is a massive class that uses the Visitor Pattern to analyze the SQL tree and visit all the nodes, so you can take action on the nodes that come to your attention. If you take a look at it, you will see that there are a huge number of Visit and ExplicitVisit overrrides, one for each kind of node you can have in a SQL Server procedure. Using Visit is similar to ExplicitVisit, with the difference that with ExplicitVisit you can control if the children nodes are also visited.

To use it, you must parse the tree in the same way you did in the last article and then create a visitor instance and pass it in the Accept method. This code shows how this is done:

static void Main(string[] args)
{
    var sql = "Select * from customer";
    var parsed = ParseSql(sql);
    if (parsed.errors.Any())
        return;
    var visitor = new SelectVisitor();
    parsed.sqlTree.Accept(visitor);
    Console.ReadLine();
}

private static (TSqlFragment sqlTree, IList<ParseError> errors) ParseSql(string procText)
{
    var parser = new TSql150Parser(true);
    using (var textReader = new StringReader(procText))
    {
        var sqlTree = parser.Parse(textReader, out var errors);

        return (sqlTree, errors);
    }
}
C#

The visitor class is something like this:

internal class SelectVisitor : TSqlFragmentVisitor
{
    public override void Visit(SelectStatement node)
    {
        Console.WriteLine($"Visiting Select: {node}");
    }

    public override void Visit(QueryExpression node)
    {
        Console.WriteLine($"Visiting QueryExpression: {node}");
    }

    public override void Visit(QuerySpecification node)
    {
        Console.WriteLine($"Visiting QuerySpecification: {node}");
    }

    public override void Visit(SelectStarExpression node)
    {
        Console.WriteLine($"Visiting SelectStarExpression: {node}");
    }
}
C#

Running this code, you will get something like this:

As you can see, the node and its children are visited and we have no control on that. If you want to have some control, you must use the ExplicitVisit:

internal class SelectVisitor : TSqlFragmentVisitor
{
    public override void ExplicitVisit(SelectStatement node)
    {
        Console.WriteLine($"Visiting Select: {node}");
    }

    public override void Visit(QueryExpression node)
    {
        Console.WriteLine($"Visiting QueryExpression: {node}");
    }

    public override void Visit(QuerySpecification node)
    {
        Console.WriteLine($"Visiting QuerySpecification: {node}");
    }

    public override void Visit(SelectStarExpression node)
    {
        Console.WriteLine($"Visiting SelectStarExpression: {node}");
    }
}
C#

As you can see, the child nodes are not visited. To visit them, you must call the base method, like this:

internal class SelectVisitor : TSqlFragmentVisitor
{
    public override void ExplicitVisit(SelectStatement node)
    {
        Console.WriteLine($"Visiting Select: {node}");
        base.ExplicitVisit(node);
    }

    public override void ExplicitVisit(QueryExpression node)
    {
        Console.WriteLine($"Visiting QueryExpression: {node}");
    }

    public override void ExplicitVisit(QuerySpecification node)
    {
        Console.WriteLine($"Visiting QuerySpecification: {node}");
    }

    public override void ExplicitVisit(SelectStarExpression node)
    {
        Console.WriteLine($"Visiting SelectStarExpression: {node}");
    }
}
C#

Nice, no? With this simple code, you can write something to check if any procedure in your database has a "select *", which can be flagged as an error. Just override the Visit for the SelectStarExpression and flag the error if it's visited.

Generating statistics from the database procedures

With this knowledge, we can write a program that generates statistics from the database procedures. We want to know how many inserts, deletes and updates there are in the procedures and what are the tables used with these commands. We also can know how many tables are created and dropped in the procedures.

For that, create a new console application and name it DataStats Add the NuGet package Microsoft.SqlServer.TransactSql.ScriptDom and add this code to Program.cs:

     static void Main(string[] args)
     {
         using (var con = new SqlConnection("Server=.;Database=WideWorldImporters;Trusted_Connection=True;"))
         {
             con.Open();
             var procTexts = GetStoredProcedures(con)
                 .Select(n => new { ProcName = n, ProcText = GetProcText(con, n) })
                 .ToList();
             var procTrees = procTexts.Select(p =>
             {
                 var processed = ParseSql(p.ProcText);
                 var visitor = new StatsVisitor();
                 if (!processed.errors.Any())
                     processed.sqlTree.Accept(visitor);
                 return new { p.ProcName, processed.sqlTree, processed.errors, visitor };
             }).ToList();
             foreach (var procTree in procTrees)
             {
                 Console.WriteLine(procTree.ProcName);
                 if (procTree.errors.Any())
                 {
                     Console.WriteLine("   Errors found:");
                     foreach (var error in procTree.errors)
                     {
                         Console.WriteLine($"     Line: {error.Line}  Col: {error.Column}: {error.Message}");
                     }
                 }
                 else
                 {
                     var visitor = procTree.visitor;
                     Console.WriteLine($"  Inserts: {visitor.Inserts}");
                     foreach (var table in visitor.InsertTables)
                     {
                         Console.WriteLine($"      {table}");
                     }

                     Console.WriteLine($"  Updates: {visitor.Updates}");
                     foreach (var table in visitor.UpdateTables)
                     {
                         Console.WriteLine($"      {table}");
                     }

                     Console.WriteLine($"  Deletes: {visitor.Deletes}");
                     foreach (var table in visitor.DeleteTables)
                     {
                         Console.WriteLine($"      {table}");
                     }

                     Console.WriteLine($"  Creates: {visitor.Creates}");
                     foreach (var table in visitor.CreateTables)
                     {
                         Console.WriteLine($"      {table}");
                     }

                     Console.WriteLine($"  Drops: {visitor.Drops}");
                     foreach (var table in visitor.DropTables)
                     {
                         Console.WriteLine($"      {table}");
                     }
                 }
             }
         }

         Console.ReadLine();
     }

     private static List<string> GetStoredProcedures(SqlConnection con)
     {
         using (SqlCommand sqlCommand = new SqlCommand(
             "select s.name+'.'+p.name as name from sys.procedures p " + 
             "inner join sys.schemas s on p.schema_id = s.schema_id order by name",
             con))
         {
             using (DataTable procs = new DataTable())
             {
                 procs.Load(sqlCommand.ExecuteReader());
                 return procs.Rows.OfType<DataRow>().Select(r => r.Field<String>("name")).ToList();
             }
         }
     }

     private static string GetProcText(SqlConnection con, string procName)
     {
         using (SqlCommand sqlCommand = new SqlCommand("sys.sp_helpText", con)
         {
             CommandType = CommandType.StoredProcedure
         })
         {
             sqlCommand.Parameters.AddWithValue("@objname", procName);
             using (var proc = new DataTable())
             {
                 try
                 {
                     proc.Load(sqlCommand.ExecuteReader());
                     return string.Join("", proc.Rows.OfType<DataRow>().Select(r => r.Field<string>("Text")));
                 }
                 catch (SqlException)
                 {
                     return null;
                 }
             }
         }
     }

     private static (TSqlFragment sqlTree, IList<ParseError> errors) ParseSql(string procText)
     {
         var parser = new TSql150Parser(true);
         using (var textReader = new StringReader(procText))
         {
             var sqlTree = parser.Parse(textReader, out var errors);

             return (sqlTree, errors);
         }
     }
 }
C#

This code is very similar to the one in the last post, but it will create a visitor and will call the Accept method to visit its nodes. The visitor's code is:

internal class StatsVisitor : TSqlFragmentVisitor
{
    public int Inserts { get; private set; }
    public int Updates { get; private set; }
    public int Deletes { get; private set; }
    public int Creates { get; private set; }
    public int Drops { get; private set; }
    public List<string> InsertTables { get; }
    public List<string> UpdateTables { get; }
    public List<string> DeleteTables { get; }
    public List<string> CreateTables { get; }
    public List<string> DropTables { get; }

    public StatsVisitor()
    {
        InsertTables = new List<string>();
        UpdateTables = new List<string>();
        DeleteTables = new List<string>();
        CreateTables = new List<string>();
        DropTables = new List<string>();
    }

    public override void Visit(InsertStatement node)
    {
        Inserts++;
        InsertTables.Add((node.InsertSpecification.Target as NamedTableReference)?.
            SchemaObject.BaseIdentifier.Value);
    }

    public override void Visit(UpdateStatement node)
    {
        Updates++;
        UpdateTables.Add((node.UpdateSpecification.Target as NamedTableReference)?.
            SchemaObject.BaseIdentifier.Value);
    }

    public override void Visit(DeleteStatement node)
    {
        Deletes++;
        UpdateTables.Add((node.DeleteSpecification.Target as NamedTableReference)?.
            SchemaObject.BaseIdentifier.Value);
    }

    public override void Visit(CreateTableStatement node)
    {
        Creates++;
        CreateTables.Add(node.SchemaObjectName.BaseIdentifier.Value);
    }

    public override void Visit(DropTableStatement node)
    {
        Drops++;
        DropTables.AddRange(node.Objects.Select(o => o.BaseIdentifier.Value));
    }

}
C#

When there is an Insert, Delete, Update, Create Table or Drop Table, the corresponding method will be called and the properties referring to the node will be updated. At the end you will have the statistics for all the procedures in the database:

There is just a minor glitch in this code: as you can see in the Inserts, there is a blank line. That is due to the fact that the insert target isn't a table, but a variable. This code solves the issue:

public override void Visit(InsertStatement node)
{
    Inserts++;
    var targetName = (node.InsertSpecification.Target as NamedTableReference)?.
        SchemaObject.BaseIdentifier.Value ??
      (node.InsertSpecification.Target as VariableTableReference)?.Variable.Name;
    InsertTables.Add(targetName);
}
C#

Conclusions

As you can see, Microsoft has put a lot of work in this parser, you can have a lot of power when working with SQL Server code in C# with this parser. The options are endless: check quality of the code, refactor code, make statistics, and so on. In the next post we will see a simple way to reformat your SQL Server code, so you can ensure that the code follows your standards. See you then.

The code for this article is in https://github.com/bsonnino/SqlParsing

3 thoughts on “Parsing SQL Server code with C# – Part 2 – Advanced parsing”

  1. Rishi says:
    17 April 2020 at 06:49

    HI, I can get updating table from this. but I do have a scenario,

    1.) update statement with joins
    2.) Updating table is having alias
    3.) alias is written after Update Keyword

    I am getting alias name, but I want table name. Can I achieve that ? can you help plz?

    Reply
    1. bsonnino says:
      25 April 2020 at 12:56

      I’ll take a look and maybe (no promises :-)), I’ll write an article on that later

      Reply
  2. Pingback: Reformatting your SQL Server code with C# – Bruno Sonnino

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

  • May 2025
  • December 2024
  • October 2024
  • August 2024
  • July 2024
  • June 2024
  • November 2023
  • October 2023
  • August 2023
  • July 2023
  • June 2023
  • May 2023
  • November 2022
  • October 2022
  • September 2022
  • August 2022
  • June 2022
  • April 2022
  • March 2022
  • February 2022
  • January 2022
  • July 2021
  • June 2021
  • May 2021
  • April 2021
  • March 2021
  • February 2021
  • January 2021
  • December 2020
  • October 2020
  • September 2020
  • April 2020
  • March 2020
  • January 2020
  • November 2019
  • September 2019
  • August 2019
  • July 2019
  • June 2019
  • April 2019
  • March 2019
  • February 2019
  • January 2019
  • December 2018
  • November 2018
  • October 2018
  • September 2018
  • August 2018
  • July 2018
  • June 2018
  • May 2018
  • November 2017
  • October 2017
  • September 2017
  • August 2017
  • June 2017
  • May 2017
  • March 2017
  • February 2017
  • January 2017
  • December 2016
  • November 2016
  • October 2016
  • September 2016
  • August 2016
  • July 2016
  • June 2016
  • May 2016
  • April 2016
  • March 2016
  • February 2016
  • October 2015
  • August 2013
  • May 2013
  • February 2012
  • January 2012
  • April 2011
  • March 2011
  • December 2010
  • November 2009
  • June 2009
  • April 2009
  • March 2009
  • February 2009
  • January 2009
  • December 2008
  • November 2008
  • October 2008
  • July 2008
  • March 2008
  • February 2008
  • January 2008
  • December 2007
  • November 2007
  • October 2007
  • September 2007
  • August 2007
  • July 2007
  • Development
  • English
  • Português
  • Uncategorized
  • Windows

.NET AI Algorithms asp.NET Backup C# Debugging Delphi Dependency Injection Desktop Bridge Desktop icons Entity Framework JSON Linq Mef Minimal API MVVM NTFS Open Source OpenXML OzCode PowerShell Sensors Silverlight Source Code Generators sql server Surface Dial Testing Tools TypeScript UI Unit Testing UWP Visual Studio VS Code WCF WebView2 WinAppSDK Windows Windows 10 Windows Forms Windows Phone WPF XAML Zip

  • Entries RSS
  • Comments RSS
©2025 Bruno Sonnino | Design: Newspaperly WordPress Theme
Menu
  • Home
  • About