Aggregates are calculations performed on a group of data rows. For example, counting the number of rows in a group is an aggregate calculation. In Exago, this is done by using an aggregate formula in the Group Footer, Group Header, Report Header (as of v2019.2+) or Report Footer section of a report.
Exago can run aggregate formulas on any arbitrary data. However, asking an Exago execution engine—Web Application or Scheduler Service—to do so can be under-performant for large sets of data for two reasons:
- The application must pull every individual detail row from the database
- The calculations are done in the application, not the database
Thus both the network and the application server will be bottlenecks to higher performance. This inefficiency can be optimized away with Database Aggregation.
Database Aggregation pushes aggregate calculations to the database when possible, which allows for far fewer data rows to have to be sent across the network and aggregated on the web server.
Most databases have the ability to calculate aggregates themselves. This tends to be much faster than performing the calculations in memory. The Database Aggregation feature allows Exago to push these calculations to the database.
Supported database aggregate functions are AggCount(), AggDistinctCount(), AggSum(), AggAvg(), AggMax(), and AggMin().
Enabling Database Aggregation
NoteDatabase Aggregation is enabled by default in v2019.1+.
Database Aggregation can be enabled (or disabled) by either:
- changing Admin Console > General > Database Settings > Aggregate and Group in Database setting
- by editing the
<aggregateandgroupindatabase>
node in the config file.
A value of True enables Database Aggregation, False disables it.
WarningReview the following section to ensure that aggregates for One-to-Many joins are calculated correctly.
Identifying One-to-Many Joins
All One-to-Many joins must be identified as such in the application configuration. Exago calculates aggregates based on the unique key relationship between categories. But some databases calculate aggregates by simply counting the rows. This means Exago must know which joins are one-to-many, so that it can decide whether it is acceptable to push the calculation to the database, or whether it must do the calculation in the application.
NoteThe Data Source Metadata Discovery tool cannot identify whether joins are One-to-Many. By default it sets all joins as One-to-One. If you used this tool to autofill your Objects and Joins, you must take these precautions before enabling Database Aggregation.
Admin Console
To set your joins using the Admin Console, expand Data > Joins, and then double-click on the Join, or select the Join and click the Edit icon. This will open the Joins tab.
In the Relation Type drop-down menu, select One To Many. Then click Apply or Okay to save your changes. Do this for each relevant Join.
Config File
To set joins by editing the config file, open the file in a text or XML editor. Each join is a <join>
element. In each relevant join element, locate the <relation_type>
field and change the value from "11" (one-to-one) to "1M" (one-to-many). Save the config and restart the web server.
Exceptions
Database Aggregation will not work on every report. In the following cases aggregation occurs in memory even if Database Aggregation is enabled:
- ExpressViews (pre-v2017.3)
- Reports with Execution Caching enabled (v2017.3.19+)
- Reports with more than one data source pre-v2020.1.0and any of the following conditions are met:
- three or more data sources are included
- there is an aggregate value on a field from the same database as the innermost group field's database, AND there is not a direct join from the innermost group entity to the other database, OR there is a direct join but the innermost group is not on the join column of the direct join
- both of the data sources contain aggregates in the
SELECT
cause - none of the groups on the report are single field sorts, OR are formulas where all of the fields in the formula are from a single data source
- Reports with an aggregation on the ‘One’ side of a One-to-Many join (e.g.
AggCount({Categories.CategoryName})
on a report with both Categories and Products) pre-v2020.1.0 - Reports with an aggregation on the ‘One’ side of a One-to-Many join and the recordLevel argument is passed as False or is omitted and the Run Aggregate Functions at Record Level by Default setting is False. v2020.1.0+
- Reports with a visible Detail section or cells referencing a hidden Detail section
- Reports with Formula Sorts or Custom Columns pre-v2020.1.0
Note
As of v2020.1.13+, v2021.1.1+, reports with Custom Columns are no longer automatically disqualified from database aggregation.
- Reports with Custom Columns or Formula Sorts that cannot be converted to SQL, including Custom Aggregate Functions v2020.1.0+
Note
As of v2020.1.13+, v2021.1.1+, reports with Custom Columns are no longer automatically disqualified from database aggregation.
- Reports containing an
AggDistinctCount()
formula (pre-v2019.1.3) - Reports containing a CrossTab Report
- Aggregate formulas with a non-data field argument, such as another formula if the formula cannot be translated to SQL by Exago.
AggAvg() and AggDistinctCount()
need to be in the innermost group if there are multiple groups or a group and a report footer- Reports with one or more Interactive Filters, Column Sorts, Group Min/Max Filters, Top N Filters, Linked Report Formulas or Transformed Vertical Tables
- Reports that include Data Objects from an Elasticsearch Data Source
- Reports that include Data Objects from an OLAP Data Source
NoteGenerally, formulas can be translated to SQL if all functions used in a given formula have SQL mappings in the FormulaDictionary section of the dbconfigs.json (or dbconfigs.overrides.json) file.
Best Practices when Report Building
While Database Aggregation is designed to work invisibly without requiring any additional user input, it cannot work in all scenarios. Each best practice is designed to work around one of the inherent limitations that will prevent aggregation from occurring in the database. In summary they are:
- Separate visualization detail rows to a Linked Report
- Construct tables in the database, not the application
- Use pre-aggregated tables
- Combine functions that can be aggregated in the database
Separate visualization detail rows to a Linked Report
Limitation: Showing detail rows defeats the purpose of database aggregation.
Solution: For visualization reports based on group data, suppress or remove the Detail section. Create a secondary report with the detail rows and link it to the chart as a Linked Report. This way, the primary visualization runs quickly. Then when the database must be queried for detail rows, you will retrieve smaller, filtered subsets instead of the full data set.
Example
Instead of this:
Do this:
Construct tables in the database, not the application
Limitation: Certain types of joins—one-to-many and cross joins (Cartesian products)—and custom columns, which are created in the application rather than the data source, are incompatible with database aggregation.
Solution: Do join and formula logic in the database instead of in the application. As long as a data object generates SQL, then any constructed table will support aggregation in the database. Stored procedures, table-valued functions, user-defined functions, and custom SQL statements are all supported.
Use pre-aggregated tables
Aggregate in the database using custom SQL statements. Using the .NET API's ability to supply custom SQL objects for new reports, create data objects on the fly where the detail rows themselves are simply database-aggregated values for other tables.
Example
The following SQL statement, supplied in a custom SQL object through the API, joins two tables and returns three aggregated columns before ever reaching the Exago application.
SELECT Order.Id AS Id, COUNT(OrderDetail.Id) AS IdCount, MAX(OrderDetail.OrderId) AS OrderMax, MAX(OrderDetail.ProductId) AS ProdMax FROM OrderDetail INNER JOIN Order ON (OrderDetail.OrderId = Order.Id) GROUP BY Order.Id ORDER BY Order.Id ASC
Combine functions that can be aggregated in the database
Limitation: Custom aggregate functions cannot be written to the dbconfigs.json FormulaDictionary, so they cannot be aggregated in the database.
Solution: Instead combine functions that can be aggregated in the database together.
Example: The Custom Aggregate Function: AggCountIf() can instead be made to aggregate in the database by nesting AggSum() and If().
Instead of applying this formula with a single custom aggregate function:
=AggCountIf({Products.ProductName}, {Products.Discontinued})
Use a combination of these functions:
=AggSum(If({Products.Discontinued},1,0))
Logging
Database aggregation is transparent to users. With Admin Console > General > Other Settings > Enable Debugging set to DEBUG and database aggregation enabled, Exago BI will indicate whenever it was unable to use it, and the reason why. Just search the logfile for the following text:
SQL aggregate requirement failed due to:
Text after the colon (:) will indicate the reason why database aggregation could not occur. For example:
SQL aggregate requirement failed due to: cell [=AggCount({OrderDetails.Id})] contains aggregates that can be duplicated due to one-to-many relationships
SQL aggregate requirement failed due to: cell [=AggCountIf({Products.ProductName}, {Products.Discontinued})] contains an aggregate function other than AggSum, AggMin, AggMax, AggCount, AggDistinctCount
SQL aggregate requirement failed due to: detail section in report