Histogram and ranged histogram charts (available with ComponentOne FlexChart for WinForms, WPF, and UWP) give you more flexibility to visualize the distribution and dispersion of statistical data.
- What are histogram charts?
- Fundamental differences between histogram and bar charts
- Configuring histograms using FlexChart
- What is a ranged histogram chart?
- Use of histogram in image processing
What are histogram charts?
One of the most common statistical charts, histograms visualize the underlying frequency distribution of a given set of continuous data. You can summarize a large range of values by splitting the entire data set into defined intervals or classes, commonly known as bins. Each bin holds the number of data values that fall within its interval. Each bin is plotted as a vertical (or horizontal) bar on the chart, with the height of the bar (its width when horizontal) representing the frequency of data values falling in that bin.
The presence of bars in a histogram chart can give the false impression that histograms are specialized versions of the widely used bar and column charts. In the next section, we'll look at how a histogram chart is fundamentally different from a bar chart.
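The binning step described above can be sketched in a few lines of Python. This is a minimal, library-free illustration rather than anything FlexChart does internally: the function name bin_counts, the sample prices, and the choice of half-open bins [lo, hi) are all assumptions made for the example.

```python
import math

def bin_counts(values, bin_width, start=None):
    """Count how many values fall in each fixed-width bin.

    Bins are half-open intervals [lo, lo + bin_width). That boundary rule
    is an assumption of this sketch, not something the article specifies.
    """
    if start is None:
        # Align the first bin to a multiple of the bin width.
        start = math.floor(min(values) / bin_width) * bin_width
    n_bins = int((max(values) - start) // bin_width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[int((v - start) // bin_width)] += 1
    # Return (lower edge, upper edge, frequency) for every bin, empty or not.
    return [(start + i * bin_width, start + (i + 1) * bin_width, c)
            for i, c in enumerate(counts)]

# Hypothetical item prices, used only for illustration.
prices = [105, 120, 149, 150, 210, 240, 245, 410]
for lo, hi, count in bin_counts(prices, 50):
    print(f"[{lo}, {hi}): {count}")
```

With these sample prices, the bins between 250 and 400 come back with a count of 0. On a chart, those empty bins are what appear as gaps between otherwise touching bars.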
Fundamental differences between bar and histogram charts
While how we visualize the data is important, interpreting the data is also a key aspect of working with charts. A chart fails its purpose if we cannot correctly interpret the data from it.
Because histograms also consist of bars placed side by side, users often mistake them for bar charts, which means they won't be interpreted correctly. The differences between bar and histogram charts affect data interpretation significantly, so the two should be read differently and carefully.
Here are some of those differences.
Bar charts are useful for comparing distinct values of data, e.g. sales data by country. Histograms, on the other hand, are useful for showing the distribution of continuous data, for instance the price distribution of items that have been sold across the countries.
Height of bars: actual value vs. frequency
While the height of a bar in a bar chart represents the actual value of an item, in a histogram it represents the frequency of items that fall in each bin.
X-axis values: qualitative vs. quantitative
Bar charts are used to plot categorical variables (the qualitative data on the x-axis), while histograms are used to plot numerical variables (the quantitative data on the x-axis).
Gap between bars
The x-axis in a histogram represents a continuous variable that has been grouped into multiple bins. To represent this continuity, bars in a histogram do not have gaps between them. If you see a gap between two consecutive bars it means that the bin(s) in between have 0 items, i.e., there are bars with 0 height at those gaps.
On the other hand, the x-axis in a bar chart represents a discrete variable. Each item on the axis is independent of the other item. To make this more apparent visually, the bars on a bar chart are usually separated by constant gaps between them.
Ordering of bars
The x-axis of a histogram is divided into ordered groups, so the bars in a histogram chart always appear in a specific order. For example, the bars representing the price ranges [100-150], [200-250], and [400-450] are ordered as per the groups. The x-axis items in a bar chart, on the other hand, are qualitative in nature and can be displayed in any order. In the country-wise sales chart shown earlier, it doesn't matter whether Denmark is placed before or after Greece in the chart.
Now that we're aware of the differences between a bar chart and a histogram and know when to use them, let’s look at how to create histograms for your charting needs using FlexChart.
Creating and configuring histograms using FlexChart
Taking histogram further: ranged histogram
While the categories in a histogram chart have traditionally been associated with quantitative data only, in some scenarios users want to plot text-based categories in the histogram. For example, the age distribution example taken at the start of this blog can also be visualized based on categories such as 'Youth', 'Young Adults', 'Adults', 'Middle Aged' and 'Older'. Over time, there's been a steady rise in demand for data visualization tools to support such non-numerical x-axis categorization in histograms. This was apparent when Microsoft introduced the histogram chart type with this new definition in Excel 2016.
Along similar lines, we introduced a new chart type, the ranged histogram, in the 2017 v3 release of FlexChart for WinForms, WPF, and UWP to give flexibility to developers who need to support Excel-like chart types in their applications.
The ranged histogram can be considered an advanced version of the histogram chart that allows users to work with a non-numeric x-axis. Each item on the x-axis is a bin that can represent either a text-based category or a numeric range of values. To understand how to display text-based categories or numeric ranges, let's divide the behavior of the ranged histogram chart into two modes: Category and Non-Category.
In Non-Category mode, the x-axis items on the histogram represent numeric ranges of values. The chart groups the data source into several ranges that are determined by a combination of ranged histogram properties in FlexChart, such as BinMode, BinWidth, NumberOfBins, OverflowBin, UnderflowBin, etc. The data points are then binned into these groups, and the frequency of items in each range is plotted on the y-axis. For example:
rangedHistogram.BinMode = HistogramBinning.BinWidth;
rangedHistogram.BinWidth = 100;
rangedHistogram.OverflowBin = 600;
rangedHistogram.UnderflowBin = 100;
rangedHistogram.ShowOverflowBin = true;
rangedHistogram.ShowUnderflowBin = true;
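To get a feel for what a bin width of 100 with an underflow bin at 100 and an overflow bin at 600 does to a data set, here is a rough Python sketch of that kind of grouping. It is not FlexChart code. The function name ranged_bins, the sample values, and in particular the boundary rules (whether a value sitting exactly on an edge goes into the lower or the upper bin) are assumptions made for illustration, since the article does not spell them out.

```python
def ranged_bins(values, bin_width, underflow, overflow):
    """Group values into an underflow bin, fixed-width bins, and an overflow bin.

    Assumed boundary rules for this sketch: values <= underflow go to the
    underflow bin, values > overflow go to the overflow bin, and each
    regular bin is the interval (lo, lo + bin_width].
    """
    labels = [f"<= {underflow}"]
    lo = underflow
    while lo < overflow:
        hi = min(lo + bin_width, overflow)
        labels.append(f"({lo}, {hi}]")
        lo = hi
    labels.append(f"> {overflow}")

    counts = dict.fromkeys(labels, 0)
    for v in values:
        if v <= underflow:
            counts[labels[0]] += 1
        elif v > overflow:
            counts[labels[-1]] += 1
        else:
            # Ceiling division, so a value on a bin's upper edge stays in that bin.
            index = -(-(v - underflow) // bin_width)
            counts[labels[int(index)]] += 1
    return counts

# Hypothetical order amounts, used only for illustration.
orders = [50, 100, 120, 250, 260, 599, 600, 601, 900]
for label, count in ranged_bins(orders, 100, underflow=100, overflow=600).items():
    print(label, count)
```

Everything at or below 100 collapses into a single bar and everything above 600 into another, which keeps a few extreme values from stretching the x-axis.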
In Category mode, the x-axis items on the histogram are text-based categories. The chart groups together the data belonging to the same category in the data source and sums their values on the y-axis. All you need to do in FlexChart is set the BindingX property of the ranged histogram series to the name of the property that contains the text-based category names. For example:
rangedHistogramSeries.BindingX = "AgeGroup";
Please note that when the BindingX has been set, the chart ignores all other properties of the ranged histogram series such as BinMode, BinWidth, NumberOfBins, OverflowBin, UnderflowBin, etc.
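The group-and-sum behavior of this text-based mode can be pictured with a short Python sketch. It only mirrors the idea and is not FlexChart code: the function category_totals, the "Count" field, and the sample records are hypothetical, while "AgeGroup" is borrowed from the BindingX example above.

```python
def category_totals(records, category_key, value_key):
    """Sum the values of all records that share the same category.

    Categories keep the order in which they first appear in the data.
    """
    totals = {}
    for record in records:
        category = record[category_key]
        totals[category] = totals.get(category, 0) + record[value_key]
    return totals

# Hypothetical data source, used only for illustration.
survey = [
    {"AgeGroup": "Youth", "Count": 4},
    {"AgeGroup": "Adults", "Count": 7},
    {"AgeGroup": "Youth", "Count": 3},
    {"AgeGroup": "Older", "Count": 2},
    {"AgeGroup": "Adults", "Count": 1},
]
print(category_totals(survey, "AgeGroup", "Count"))
# prints {'Youth': 7, 'Adults': 8, 'Older': 2}
```

Each key of the result corresponds to one bar on the x-axis, and its total to the bar's height.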
Use of histogram in image processing
Image processing is a complicated field that involves using different techniques to analyze and manipulate digitized images, especially to improve their quality. Histograms, or image histograms to be specific, are one such technique that is widely used in image processing. An image histogram is a graphical representation of an image in which the x-axis represents the pixel intensities and the y-axis represents the count of pixels at each intensity value.
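Counting pixels per intensity level is simple enough to sketch directly. The Python below assumes an 8-bit grayscale image stored as a nested list of integers from 0 to 255. That representation and the function name image_histogram are choices made for this example, not part of any particular imaging library.

```python
def image_histogram(pixels, levels=256):
    """Return a list where index i holds the number of pixels with intensity i.

    Assumes every pixel is an integer in the range 0 .. levels - 1.
    """
    counts = [0] * levels
    for row in pixels:
        for intensity in row:
            counts[intensity] += 1
    return counts

# A tiny 2 x 3 "image", used only for illustration.
image = [
    [0, 0, 128],
    [128, 128, 255],
]
hist = image_histogram(image)
print(hist[0], hist[128], hist[255])  # prints 2 3 1
```

Plotting the returned list with the index on the x-axis and the count on the y-axis gives the image histogram described above.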
One can draw conclusions about the brightness and contrast of an image just by looking at its image histogram. Let's look at this more closely.
Histograms with values concentrated towards the left (low intensity) reflect dark images, as can be seen in Fig 1, while histograms with values concentrated towards the right (high intensity) reflect bright images, as can be seen in Fig 2.
Histograms in which the distribution of pixels is restricted to a small range of intensities indicate an image with low contrast. On the other hand, histograms in which the pixels are distributed over a broad range of intensities indicate an image with high contrast.
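These two readings can also be approximated numerically. As a rough sketch, the mean intensity hints at brightness, and the span between the lowest and highest occupied intensity levels hints at contrast. The function describe_histogram and the sample histogram below are hypothetical, and real tools often use more robust measures (percentiles, for instance) than the plain minimum and maximum used here.

```python
def describe_histogram(hist):
    """Return (mean intensity, lowest occupied level, highest occupied level).

    hist[i] is the number of pixels with intensity i; at least one count
    must be non-zero.
    """
    total = sum(hist)
    mean = sum(level * count for level, count in enumerate(hist)) / total
    occupied = [level for level, count in enumerate(hist) if count > 0]
    return mean, occupied[0], occupied[-1]

# A hypothetical histogram: all pixels sit between levels 10 and 30.
hist = [0] * 256
hist[10] = 3
hist[30] = 1
mean, low, high = describe_histogram(hist)
print(mean, low, high)  # prints 15.0 10 30
```

A mean of 15 on a 0 to 255 scale points to a dark image, and a span of only 20 intensity levels points to low contrast.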
Multiple image enhancement algorithms, such as contrast stretching and histogram equalization, make use of histograms to improve the contrast of an image. The image below shows the contrast-stretching algorithm in action. You can observe the distribution of pixels of the x-ray image before and after the algorithm is applied.
Before the algorithm is applied, we can see that the pixels are clustered around the center of the histogram and there aren't many pixels in the high-intensity range (> 200). The histogram thus tells us that the contrast of the x-ray can be improved. In fields like medical diagnosis, this small piece of information allows medical experts to opt for image enhancement techniques to get a sharper image rather than exposing patients to another round of radiation. Once the contrast-stretching algorithm applies a linear scaling function to the pixel values in the image, they occupy the full range of intensity values and produce a much sharper image.
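A basic form of that linear scaling maps the darkest pixel in the image to 0, the brightest to 255, and everything else proportionally in between. The Python sketch below shows this min-max variant under the assumption of an 8-bit grayscale image stored as nested lists. The function name stretch_contrast and the sample values are hypothetical, and production implementations commonly clip a small percentage of outliers before scaling.

```python
def stretch_contrast(pixels, out_min=0, out_max=255):
    """Linearly rescale pixel intensities so they span out_min .. out_max."""
    flat = [p for row in pixels for p in row]
    in_min, in_max = min(flat), max(flat)
    if in_min == in_max:
        # A flat image has no contrast to stretch; return an unchanged copy.
        return [row[:] for row in pixels]
    span_in = in_max - in_min
    span_out = out_max - out_min
    return [
        [round((p - in_min) * span_out / span_in) + out_min for p in row]
        for row in pixels
    ]

# Hypothetical low-contrast pixel values clustered between 100 and 185.
xray = [
    [100, 120],
    [150, 185],
]
print(stretch_contrast(xray))  # prints [[0, 60], [150, 255]]
```

After stretching, the same four pixels occupy the whole 0 to 255 range, which is the widening of the histogram described above.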
To summarize, grouping the data items into bins using histograms gives a true picture of data as it exists. It provides answers to a wide range of questions about data, such as:
- Where is the center of the data located?
- How far is the data distributed, i.e. what is its range?
- Are there any outliers?
- How is the data skewed?
- Is the data unimodal, bimodal or multimodal?
Answers to questions like these make histograms useful in a wide variety of applications, from analyzing the distribution of student grades to something as complex as digital image processing, as we saw above.