
Break up Data Clusters in a Chart With a Logarithmic Scale

The FlexChart control allows you to create Excel-style data-bound charts. The control supports multiple chart types and rich styling, and it has extensions for interactions, analytics, and more. The FlexChartIntro sample is a great introduction to FlexChart's basic features (along with the documentation, of course). You'll see how to create charts; customize legends, axes, and titles; add tooltips; style series; and use selection modes.

Here we'll start with a simple bubble chart and improve it by changing the axis scales. Unlike the sample, we won't focus on the chart properties and features; instead, we'll focus on simple concepts you can use to improve the quality of your charts in common real-world scenarios. We'll start with a dataset containing the population, gross domestic product, and per-capita income for about 200 countries. Here's what the data looks like:

$scope.countries = [
  { country: 'Luxembourg', pop: 558514, gdp: 62395 },
  { country: 'Norway', pop: 5156463, gdp: 500244 },
  { country: 'Qatar', pop: 2234895, gdp: 210002 },
  …

After defining the data, we calculate the per-capita income and add that to each data point:

for (var i = 0; i < $scope.countries.length; i++) {
  var c = $scope.countries[i];
  // gdp is in US$ millions, so multiplying by 1e6 yields US$/year/capita
  c.pci = c.gdp / c.pop * 1e6;
}
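
As a quick sanity check on the units, the same calculation can be run on just the three sample rows shown above. This is a minimal standalone sketch, assuming (as the later axis discussion implies) that gdp is expressed in millions of US$; the `sample` array and the `perCapitaIncome` helper are illustrative names, not part of the original code.

```javascript
// Three rows copied from the dataset above; gdp is assumed to be in US$ millions.
var sample = [
  { country: 'Luxembourg', pop: 558514, gdp: 62395 },
  { country: 'Norway', pop: 5156463, gdp: 500244 },
  { country: 'Qatar', pop: 2234895, gdp: 210002 }
];

// Hypothetical helper: converts GDP in US$ millions to US$ per person per year.
function perCapitaIncome(c) {
  return c.gdp / c.pop * 1e6;
}

// Attach the per-capita income to each row and print it, rounded to whole dollars.
sample.forEach(function (c) {
  c.pci = perCapitaIncome(c);
  console.log(c.country + ': ' + Math.round(c.pci) + ' US$/year/capita');
});
```

All three values should land in the range of roughly 90,000 to 115,000 US$ per person, which is a plausible order of magnitude and suggests the 1e6 factor is the right one.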

Once the data is ready, we define the chart:

<wj-flex-chart  
  items-source="countries"
  chart-type="Bubble"
  binding-x="pop"
  tooltip-content="{country}:
{pci:n0} US$/year/capita">
  …
</wj-flex-chart>

The markup starts by setting the chart's data source and chart type, the name of the property used for the X axis, and the tooltip content, which is a template that contains the country name and per-capita income. It then defines a single series with the binding property set to a comma-delimited list of the fields used for the Y-axis value and bubble size, along with the style to be used for the series. Finally, it defines the titles for the axes and hides the chart legend. This is the result:

Figure: Wijmo FlexChart: Clustered Bubble Chart. Data outliers force the other points to cluster into an unreadable mess.

The chart rendered correctly, but it's not very clear. First of all, the values plotted against the axes are very large and hard to read. The chart would be clearer if it showed GDP in billions and population in millions. This can be achieved easily by changing the format property for the axes. Wijmo's Globalization class supports scale modifiers in format strings: each trailing comma in the format string divides the value being formatted by 1000.

We can achieve the scaling we want by changing the axes markup. The format property scales the population values to millions by adding two commas to the format (n0,,), and GDP values are scaled from millions to billions by appending a single comma to the format (n0,). We also changed the axis titles to reflect the scaling. The result is as follows:

Figure: Wijmo FlexChart Min Max Settings. Now we've lost important data.

The axes are more readable now, but most values are still clustered around the origin. (The three outliers are the United States, China, and India.) Many types of data show this pattern, including economic and demographic data. The data tends to cluster around the average, and outliers force the bulk of the data to bunch together.

There are two simple ways to address this problem. First, we could set the min and max properties of the axes to exclude the outliers. This would spread the cluster over a larger portion of the chart and make it more readable. The problems with this approach are:

  1. The outliers would be excluded from the chart, and they tend to be important data points. In this case, we'd deliberately exclude the countries with the largest GDP and population.
  2. Expanding the initial cluster would probably reveal more clusters, and we'd be back where we started—having to exclude more and more data points. We'd sacrifice accuracy for clarity.
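
The first drawback can be made concrete without a chart. The sketch below uses a small made-up set of population values (the `points` array is hypothetical, not the article's dataset) and a hypothetical `clipToRange` helper that mimics what restricting an axis range does: points outside the limits are simply no longer shown.

```javascript
// Hypothetical population values (in persons); not the article's real data.
var points = [
  { name: 'A', pop: 500000 },
  { name: 'B', pop: 5000000 },
  { name: 'C', pop: 60000000 },
  { name: 'D', pop: 320000000 },
  { name: 'E', pop: 1300000000 }
];

// Illustrative helper: splits items into those inside [min, max] and those outside,
// mirroring the effect of setting min and max on an axis.
function clipToRange(items, field, min, max) {
  var result = { visible: [], excluded: [] };
  items.forEach(function (item) {
    var v = item[field];
    if (v >= min && v <= max) {
      result.visible.push(item);
    } else {
      result.excluded.push(item);
    }
  });
  return result;
}

// Cap the axis at 100 million people and see which points survive.
var clipped = clipToRange(points, 'pop', 0, 100000000);
console.log('visible:', clipped.visible.map(function (p) { return p.name; }).join(', '));
console.log('excluded:', clipped.excluded.map(function (p) { return p.name; }).join(', '));
```

In this made-up example the two largest points disappear, and the three that remain still span more than two orders of magnitude, so the smallest ones would stay bunched near the origin on a linear axis.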

Change the Axes to a Logarithmic Scale

A better approach in this case is to change the axes to a logarithmic scale. With logarithmic axes, the value at each tick mark is a fixed multiple of the value at the previous one (the log base), rather than a constant increment. Because of this, logarithmic axes naturally compress the scale as values grow towards the maximum.

To enable a logarithmic scale in the FlexChart, we set the logBase property on the axes to a value such as 2, 5, or 10. Once we do this, each tick mark represents a value 2, 5, or 10 times larger than the previous one.

The revised markup sets the logBase property of both axes to 10 and changes the n0 format specifier to g4. The g4 specifier displays up to four decimal places, but removes trailing zeros. This provides a good representation for logarithmic axes showing values that may range from 0.001 to 1000. And here's the new chart:

Figure: Wijmo FlexChart BubbleChart with Graduated Axes. Presto! Accuracy AND clarity in a bubble chart.

After setting the logBase property on both axes, the data is spread over the entire chart, revealing an interesting pattern and making the points easy to inspect using the tooltips. In a real application, we could use the selectionMode property to make the points selectable and display detailed information for the selected country below the chart.

To summarize:

  • Scale formatters can display the axis labels in a clear and concise format.
  • Use logarithmic axes to spread clustered data and improve the clarity of your charts without sacrificing accuracy.
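
Both points in the summary can be checked with plain JavaScript, independent of FlexChart. The sketch below is an illustration only: `scaleByTrailingCommas`, `linearPosition`, `logPosition`, and `logTicks` are hypothetical helpers written for this example. They mimic the behavior described in the article and are not Wijmo's actual implementation.

```javascript
// Mimics the "each trailing comma divides by 1000" rule described above.
// Illustration only; this is NOT Wijmo's Globalization code.
function scaleByTrailingCommas(value, format) {
  var match = /,+$/.exec(format);            // trailing commas, if any
  var commas = match ? match[0].length : 0;
  return value / Math.pow(1000, commas);
}

// Relative position (0..1) of a value on a linear axis.
function linearPosition(value, min, max) {
  return (value - min) / (max - min);
}

// Relative position (0..1) of a value on a logarithmic axis (min must be > 0).
// The log base cancels out of the ratio, so it only affects where ticks fall.
function logPosition(value, min, max) {
  return (Math.log(value) - Math.log(min)) / (Math.log(max) - Math.log(min));
}

// Tick values for a log axis: each tick is `base` times the previous one.
// Assumes min > 0 and base > 1.
function logTicks(min, max, base) {
  var ticks = [];
  for (var t = min; t <= max; t *= base) {
    ticks.push(t);
  }
  return ticks;
}

// Scale formatters: population in millions, GDP (US$ millions) in billions.
console.log(scaleByTrailingCommas(1300000000, 'n0,,'));
console.log(scaleByTrailingCommas(62395, 'n0,'));

// Where does a population of 5 million sit on an axis from 1,000 to 1 billion?
console.log('linear:', linearPosition(5000000, 1000, 1e9));
console.log('log:', logPosition(5000000, 1000, 1e9));

// Base-10 ticks across the same range.
console.log(logTicks(1000, 1e9, 10));
```

On the linear axis the 5-million point sits within the first one percent of the axis length, while on the logarithmic axis it lands a little past the middle. That difference is why the clustered points spread out once logBase is set.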

MESCIUS inc.
