Formulas are the heart of Excel and are an integral part of creating an Excel file. Each formula in Excel is represented by a semantic model, letting users create or modify a formula.
Wouldn't it be interesting to get a hold of this semantic model to play around with the formula? This is possible by parsing a formula and generating its syntax tree to represent the semantic model.
Grapecity Documents for Excel supports the Formula Parsing feature, which allows parsing of formulas to generate a syntax tree representing the semantic model of the formulas and the ability to format and modify the formulas.
It even creates a syntax tree to generate a formula. For example, refer to the following image, which depicts a formula syntax tree generated for a formula:
GrapeCity.Documents.Excel.Expressions namespace has been added to GcExcel API*to provide a formula parsing feature. The FormulaSyntaxTree class is the entry of formula expressions API.
The Formula Parser topic in the documentation provides a deep dive into the technical details. The GcExcel demos showcase all the operations that can be performed by parsing a formula.
In this blog, we learn how to parse an existing formula in an Excel spreadsheet and modify it as per the requirements in C#.
Creating a sales report using Excel is a widespread scenario. In this blog, we will be making a straightforward sales report using the formula parsing feature offered in the Grapecity Documents for Excel API.
The snapshot below has been captured from an Excel file. The left displays raw sales data consisting of the sales representative's name, region, product, and units sold.
The right shows the sales analysis result corresponding to a specific sales representative who has been extracted from raw data and progress toward monthly sales targets in each product region combination. These targets are derived by categorizing the Units Sold field values into the following three brackets:
This sales analysis on the right has been made using an Excel formula, which combines different Excel functions to get the expected results. The formula makes use of IF, ISNUMBER, and FILTER functions, as described below:
=IF(ISNUMBER(FILTER(A2:D19,A2:A19="Fritz")),IFS(FILTER(A2:D19,A2:A19="Fritz")>5000,"Above Target",FILTER(A2:D19,A2:A19="Fritz")>3000,"On Target",FILTER(A2:D19,A2:A19="Fritz")<2500,"Below Target"),FILTER(A2:D19,A2:A19="Fritz"))
A similar sales analysis result for each sales representative is added to the Excel file to complete the sales report. This is accomplished by modifying the above formula. We need to replace the sales representative's name in the above formula with the name of another sales representative whose sales analysis is required.
This is how the completed sales report looks after adding sales analysis results for each representative along with data formatting:
To make such a modification to the formula, it is imperative to know where the name variable for the sales representatives is located in the formula. If one were to do this manually, it would be a tedious task and prone to errors.
This task can be simplified by parsing the formula and using the parsed syntax tree to replace the sales representative name easily.
This blog will learn how to accomplish formula parsing and modification using Grapecity Documents for Excel and C#. The steps ahead will guide you on using GcExcel API to parse the above-described sales analysis formula and modify it to get the expected result using C#.
To begin with, create a new C# console application (.Net Core) and install Grapecity Documents for Excel package using Nuget Package Manager to get started and follow the steps ahead. Refer to GcExcel quickstart for more details.
Instantiate an instance of the Workbook class and import the sample data from the Excel file, as shown below. The code below even enables Dynamic Array support in GcExcel by setting the AllowDynamicArray property of the Workbook class to true. This is required as the formula used for calculation uses the FILTER function, a dynamic array function.
//Create a new workbook
var workbook = new GrapeCity.Documents.Excel.Workbook();
//Load sample data from excel file
workbook.Open("SampleData.xlsx");
//Enable dynamic array formula
workbook.AllowDynamicArray = true;
After the Workbook is loaded with sample data and the expected formula, we extract the required formula from the worksheet for parsing and modification using the Formula property.
The Formula Parser provided by GcExcel API expects the formula to be passed without the '=' (equal to) operator for successful formula parsing. Therefore, take note of how the formula has been extracted below without the "=" operator.
//Fetch worksheet
var worksheet = workbook.Worksheets[0];
//Fetch the original formula which needs to be parsed.
var originalFormula = worksheet.Range["H3"].Formula.Substring(1);
Invoke the Parse method of FormulaSynatxTree class to parse a formula and generate a syntax tree that helps you understand all different types of values, operators, and functions the formula contains.
Each of these tokens of the formula syntax tree is represented by other classes in GcExcel API, such as FunctionNode for a function, OperatorNode for an operator, etc.
Refer to the following documentation, which lists all the available classes for Formula Parser.
The code below parses the sales analysis formula extracted in the last step. It then appends the values from the generated FormulaSyntaxTree to a workbook, which is later saved as an Excel file to help you understand the syntax tree of the formula.
//Method to parse a formula and print the syntax tree
public static void ParseAndPrint(IWorksheet worksheet, string formula)
{
// Get syntax tree
var syntaxTree = FormulaSyntaxTree.Parse(formula);
// Flatten nodes
var displayItems = new List<(string TypeName, int IndentLevel, string Content)>();
void flatten(SyntaxNode node, int level)
{
displayItems.Add((node.GetType().Name, level, node.ToString()));
foreach (var child in node.Children)
{
flatten(child, level + 1);
}
}
flatten(syntaxTree.Root, 0);
// Output
worksheet.ShowRowOutline = false;
worksheet.OutlineColumn.ColumnIndex = 1;
// Header
worksheet.Range["A1"].Value = "Formula";
worksheet.Range["A3"].Value = "Syntax node";
worksheet.Range["B3"].Value = "Part";
// Values
worksheet.Range["B1"].Value = "'=" + formula;
for (var i = 0; i < displayItems.Count; i++)
{
var item = displayItems[i];
var text = "'" + item.TypeName;
worksheet.Range[i + 4, 0].Value = text;
worksheet.Range[i + 4, 0].IndentLevel = item.IndentLevel;
worksheet.Range[i + 4, 1].Value = "'" + item.Content;
}
//Apply styling
worksheet.Range["A1:B3"].Interior.Color = System.Drawing.Color.FromArgb(68, 114, 196);
worksheet.Range["A1:B3"].Font.Color = System.Drawing.Color.White;
worksheet.Range["A1:B3"].Borders.Color = System.Drawing.Color.FromArgb(91, 155, 213);
worksheet.Range["A1:B3"].Borders.LineStyle = BorderLineStyle.Thin;
worksheet.Range["A1,A3,B3"].Font.Size = 14;
worksheet.Range["A1,A3,B3"].Font.Bold = true;
worksheet.Range["A:C"].EntireColumn.AutoFit();
}
Here is a quick view of the generated FormulaSyntaxTree. Please note this is just a part of the complete syntax tree:
From the syntax tree generated in the last step, you can see the sales representative name is represented as a TextNode and has multiple occurrences in the formula. We can replace all these occurrences with a simple find and replace operation as shown in the code below:
The code below includes some formatting code to format the sales report content.
//Method to parse and modify the formula
public static void ModifyFormula(IWorksheet worksheet, string originalFormula)
{
//Apply UNIQUE formula to get unique sales representatives list
worksheet.Range["F1"].Value = "Unique Rep";
worksheet.Range["F2"].Formula = "=UNIQUE(A2:A19)";
var uniqueRep = worksheet.Range["F2#"];
// Apply Styling
worksheet.Range["F:F"].EntireColumn.AutoFit();
worksheet.Range["F1"].Interior.Color = System.Drawing.Color.FromArgb(68, 114, 196);
worksheet.Range["F1"].Font.Color = System.Drawing.Color.White;
worksheet.Range["F2#"].Borders.Color = System.Drawing.Color.FromArgb(91, 155, 213);
worksheet.Range["F2#"].Borders.LineStyle = BorderLineStyle.Thin;
//Get syntax tree
var syntaxTree = FormulaSyntaxTree.Parse(originalFormula);
//Find
var findText = new TextNode("Fritz");
//Replacement
var replaceText = new TextNode("");
//Loop through names list to modify the formula for each sales representative
for (int r = 0, resultRow = 3; r < uniqueRep.Cells.Count; r++, resultRow = resultRow + 4)
{
//Get name to be replaced in the formula
var cval = uniqueRep.Cells[r].Value.ToString();
if (findText.Value != cval)
{
//Assign name to be replaced to Replace TextNode
replaceText.Value = cval;
//Invoke the recursive method to perform find and replace operation
replaceNode(syntaxTree.Root, findText, replaceText);
//Assign the modified formula to a cell in the worksheet
var resultRange = "H" + resultRow.ToString();
worksheet.Range[resultRange].Formula = "=" + syntaxTree.ToString();
worksheet.Range[resultRange + "#"].Borders.Color = System.Drawing.Color.FromArgb(91, 155, 213);
worksheet.Range[resultRange + "#"].Borders.LineStyle = BorderLineStyle.Thin;
//Update the value of Find node to perform find and replace operation for next sales representative name
findText = replaceText;
}
}
//Find and replace
void replaceNode(SyntaxNode lookIn, SyntaxNode find, SyntaxNode replacement)
{
var children = lookIn.Children;
for (var i = 0; i < children.Count; i++)
{
var child = children[i];
if (child.Equals(find))
{
children[i] = replacement;
}
else
{
replaceNode(child, find, replacement);
}
}
}
}
Here is one of the modified formulas:
=IF(ISNUMBER(FILTER(A2:D19,A2:A19="Xi")),IFS(FILTER(A2:D19,A2:A19="Xi")>5000,"Above Target",FILTER(A2:D19,A2:A19="Xi")>3000,"On Target",FILTER(A2:D19,A2:A19="Xi")<2500,"Below Target"),FILTER(A2:D19,A2:A19="Xi"))
Once all the modified formulas have been added to the worksheet, the Save method of the Workbook class is invoked to save the Excel file as depicted in the code below:
//Save modified Excel file
workbook.Save("ModifiedFormula.xlsx", SaveFileFormat.Xlsx);
Open the saved Excel file to observe the following output:
Download the sample to implement the aforementioned scenario and receive a deep understanding of the concept.
Refer to documentation and demos for more details.