In-Memory Data Analytics with Salesforce Data and C1DataEngine

The ComponentOne DataEngine (C1DataEngine) is a standalone library that lets you query, aggregate, slice & dice, pivot millions of records from large datasets with exceptional performance. It serves as an in-memory data manager that enables extraction, transformation, and loading (ETL) for large data sets. The data is imported directly from SQL Server or any enumerable collection of strongly-typed objects. You can also import data from large CSV or JSON file using a bit of custom code or any third-party package. The imported data can be filled in the base tables using either of the built-in connectors i.e., DbConnector or ObjectConnector. You may refer to C1DataEngine introductory blog and documentation, which can help you understand the basics for C1DataEngine in detail.

This blog will demonstrate how to fetch data from Salesforce using C1DataConnector, another service component provided by ComponentOne which lets you fetch data from varied data sources using ADO.NET technology. We will majorly be focusing on learning how to apply Pivot transformation on the fetched data using C1DataEngine. Refer to the following blog, which showcases the implementation of Pivot transformation on SQL data.

Fetch Data from Salesforce

We must begin by data extraction i.e. fetching the from Salesforce using C1DataConnector's provider for Salesforce.

Define Connection String

The code below depicts the connection string definition for connecting to Salesforce data, providing appropriate values for all the required attributes, including username, password, security token, and other OAuth settings.

private string GetConnectionString()
   const string GCSalesforceServerConnectionString = @"Username=*****; Password=*****; Security Token=*****; OAuth Client Id=*****; OAuth Client Secret=*****;
                                                            OAuth Token Endpoint=;
   return GCSalesforceServerConnectionString;

Import Data to DataEngine using C1Salesforce DataConnector

The code below initializes C1DataEngine by instantiating the Workspace class and defines an instance of C1SalesforceConnection class by passing in the connection string defined above to establish a connection with the Salesforce server. An instance of C1SalesforceCommand class has been created to define the query used to fetch data from a specific table Invoices__c. The connection and command objects are passed to an instance of DbConnector class to fetch the data from Salesforce and create a corresponding table in C1DataEngine's workspace with the same table name in C1DataEngine workspace.

this._workSpace = new Workspace();

C1SalesforceConnection conn = new C1SalesforceConnection(GetConnectionString());

var command = new C1SalesforceCommand(conn, "Select * from Invoices__c");

var connector = new DbConnector(this._workSpace, conn, command);

this.invoice = this._workSpace.table("Invoices__c");

Define Query for C1DataEngine and Fetch Columns

After populating the DataEngine workspace base tables, we will define a query to create a result set with a list of columns we would like to use when performing the data transformation to achieve in-memory data analytics. The code below defines and executes a query to create the result set, which is used for transformations ahead.

An instance of C1PivotEngine has also been initialized and connected to one of the C1DataEngine's table using the ConnectDataEngine method, based on the table name passed as a parameter to this method. This instance would help us apply Pivot transformation to data to perform in-memory data analytics.

var queryName = this.QueryName;
var settings = new
   Sales = Op.Sum(invoice.ExtendedPrice__c),
   Quantity = Op.Count(invoice.Quantity__c),
   QuantityTotal = Op.Sum(invoice.Quantity__c)

this.ExecuteQuery(queryName, settings);
this.engine = new C1PivotEngine();
this.engine.Workspace = this._workSpace;

Helper methods

The code below defines two helper methods, which have been invoked in the code snippet defined above, basically used to execute C1DataEngine query.

private void ExecuteQuery(string queryName, object settings)
   dynamic query = GetQuery(queryName, settings);

 private dynamic GetQuery(string queryName, object settings)
    return this._workSpace.query(queryName, settings);

Save Fetched Data to CSV

So, we are done with populating a base table in workspace and defining a query over the base table. The result set of the query or query data can be exported to JSON or CSV using the appropriate API members. For instance, the code below depicts saving the query data to CSV using the Write method of DataList class, a static method. You can alter the second parameter of this method i.e., OutputFormat, to save the same data to JSON.

string file = @"allquery.csv";
using (TextWriter writer = File.CreateText(file))
   IDataList dlist = this.GetQueryResults(queryName, settings);
   DataList.Write(dlist, writer, OutputFormat.Csv);

Apply Pivot Transformation to Salesforce data using C1PivotEngine

The sections above have completed the data extraction process for Salesforce data and defined the C1PivotEngine instance to apply the transformations to data. So, let's move forward by defining a common method that would perform different types of transformations depending upon the list of row fields, column fields, value fields, and query name passed to it. The query name specifies that the mentioned transformation should be applied to which query data. At the same time, the other fields determine the type of transformation such as single column, multi-column, or other operations such as sorting, grouping, or filtering.

private C1PivotEngine CreateFlexPivotEngine(string queryName, string[] colFields, string[] rowFields, string[] valFields)
   var fp = this.engine;          
   return fp;

Pivot Transformations

Let's explore few of the many transformations that can be applied to data using C1DataEngine.

Single Column

The single column pivot transform, applies transformation on a single column i.e. the data is aggregated based on a single column. An example for single column transformation from the Invoices table could be how much sales were done in each country. Applying the single column pivot transform will group the data by country and add all the sales in that particular country.

Here is a sample code snippet describing how this transformation can be applied:

public IBindingList SalesByCountry()
   var fp = this.CreateFlexPivotEngine(this.QueryName, new string[] { "" }, new string[] { "Country__c" }, new string[] { "Sales" });
   return fp.PivotDefaultView;

The data will be transformed based on the fields passed to the CreateFlexPivotEngine method. For the above code example, the values of country column will be displayed in a column since it is passed as a row field, and the aggregated sales data for each country will be displayed alongside each country in a different column. Here is a quick view of the transformation result displayed in a grid:


Multiple Column

The multi-column pivot transform groups the data by multiple columns and then applies the aggregation. The grouping begins at the top-level column followed by the column below this level, and so on. For example, in addition to total sales for a country, to know the total sales of each product inside a particular country, we can apply multi-column pivot transform with country as the top-level column, product as the second level and apply aggregation of sales for each item. The code below defines the same:

public IBindingList ProductSalesByCountry()
   var fp = this.CreateFlexPivotEngine(this.QueryName, new string[] { "Country__c" }, new string[] { "ProductName__c" }, new string[] { "Sales" });
   var fld = fp.ColumnFields[0].Filter;
   fld.Condition1.Operator = ConditionOperator.Contains;
   fld.Condition1.Parameter = "m";
   return fp.PivotDefaultView;

The data will be transformed based on the fields passed to the CreateFlexPivotEngine method. For the above code example, there will be a column containing each distinct product, followed by aggregated sales of each country in multiple column where in each distinct country will have its own sales column.



To understand the grouping transformation, consider a Date column, for example, OrderDate. As the column name implies, there will be a large collection of orders, each having a specific order date, however, each order date might not be unique. In this case, we try to evaluate the aggregate sales each day. Then, it would make sense only if the timespan considered is short. However, if it is a longer span, then analyzing data based on dates would seem difficult and make no sense.

In such a scenario, you can group the data by a specific attribute of the date such as year or month. This would minimize the data, and even the analyzed data would provide you valuable information. Hence, here we will consider applying grouping to dates based on the year attribute and aggregate quarterly sales for each fiscal year. For this, we will use the Format and RangeType property of pivot fields. Here is a code sample depicting the same:

public IBindingList ProductsSoldInYearQuarterly()
    var fp = this.CreateFlexPivotEngine(this.QueryName, new string[] { }, new string[] { "OrderDate", "OrderDate" }, new string[] { "Sales" });
    var year = fp.RowFields[0];
    year.Range.RangeType = RangeType.FiscalYear;
    year.Format = "yyyy";

    var month = fp.RowFields[1];
    month.Range.RangeType = RangeType.FiscalQuarter;

    return fp.FlexPivotDefaultView;

Here is a quick view of the expected output when viewed in a grid:



To apply the sorting transformation, we can use the Sort method of DataList class, a static method. It accepts an IDataList, column name, and a sorting order as parameters to sort the list according to the column name and the specified sorting order. The sorting order parameter is a boolean parameter, which default sets to True, implying an ascending sort, which when set to False sorts the data in descending order. For example, the code below applies a descending sort on the Sales column, assuming a SalesByCountry query, which returns the Sales aggregated by each country.

var result = this._workSpace.GetQueryData("SalesByCountry"); 
DataList.Sort(result, "Sales", false); // sort result in descending order by sales column

Here is a quick view of the resulting data displayed in a grid:



To apply the filtering transformation, you can set the Filter for a Pivot field by accessing its Filter property which is of type C1FlexPivotFilter. The C1FlexPivotFilter class properties can be used to apply different types of filters by varying the number of filter conditions and the type of operators used to define the filtering criteria.

For example, here, we are applying filtering to the ProductSalesByCountry query, which returns the aggregated sales of each product in a country. We would tweak the query to enhance the transformation to include the filtering criteria that filters the result set to return the data for a specific country Brazil, here is the sample code implementing the same:

public IBindingList FilterESalesByCountry()
   var fp = this.CreateFlexPivotEngine(this.QueryName, new string[] { }, new string[] { "Country__c", "ProductName__c" }, new string[] { "Sales" });
   var fld = fp.RowFields[0].Filter;
   fld.Condition1.Operator = ConditionOperator.Equals;
   fld.Condition1.Parameter = "Brazil";
   return fp.PivotDefaultView;

Here is a quick view of the resulting data displayed in a grid:


Similarly, you can use other operators such as NotEquals, GreaterThanOrEqualTo, LessThanOrEqualTo, etc., to achieve the required results.

The transformations explained above are just a part of the different types of transformations that can be applied using C1DataEngine. Download the sample implementing the transformations explained above, along with some other examples here. Try implementing different types of transformations on your data to perform in-memory data analytics using C1DataEngine to understand and explore the exceptional performance of C1DataEngine.

Manpreet Kaur

Senior Software Engineer
comments powered by Disqus