Plotting Data in C++

In most of the work I do, the ability to easily plot data, as Python allows, is a major determinant when choosing a programming language for a project. A while back I was looking into what it would take to replace my Python machine learning workflow with C++ tools. Unfortunately, there were not a lot of great options to replace things like Keras or Pandas. This led me to develop a very primitive version of the latter to help work with tabular data structures in C++.

One thing the project was lacking was a way to plot and visualize the data. Fortunately, others saw this as a gap in C++ as well and created an API that exposes much of the functionality of matplotlib.pyplot. The library is essentially a set of C++ functions in a single header file that uses C++/Python bindings to call the matplotlib functions. In this post, I will provide example usage of matplotlibcpp by recreating some plots from a previous post about finding undervalued stock market sectors.

Including matplotlibcpp in a Project

matplotlibcpp is a header-only plotting API for C++. The source can be found in the project’s GitHub repository and can be included in any C++ project via #include "matplotlibcpp.h", assuming the header file is in the same directory as the C++ source file.

For convenience, the matplotlibcpp namespace is aliased to plt with namespace plt = matplotlibcpp;. This allows a call such as matplotlibcpp::plot(x, y) to be written as plt::plot(x, y).

With the alias, the code is not only easier to read and write but also draws a direct parallel to the common practice of importing matplotlib.pyplot as plt in Python. The header file itself is omitted here since it is ~3,000 lines of code and would be as long as this post. Instructions for installing the dependencies and compiling on different operating systems can be found on the project’s GitHub. Python is required, since the C++ bindings simply call the Python functions. The instructions use Python 2.7, but Python 3.x works as well with similar install steps and compiler include paths. In my case, I used Python 3.8 by more or less replacing “python2.7” with “python3.8” wherever applicable in the usage documentation.

Basic Usage

To demonstrate some basic usage of matplotlibcpp, I will recreate plots from a previous post in which price divergence is used as a quick comparison to judge how fairly valued a stock market sector is. The method won’t be described here, but I encourage anyone interested to check out that post and its sister post, which focuses on individual stocks. These plots aren’t overly complicated relative to what can be done in matplotlib, but they do demonstrate the basic functionality along with some extra, less commonly used function calls.

Ticker Data Preprocessing

For starters, we will need a way to retrieve and store daily stock ticker data for index funds/ETFs. In two other posts, I created a library to work with tabular data structures in C++ and a small set of functions to retrieve stock market data from Yahoo Finance; the latter depends (loosely) on the former. I won’t include details about these libraries either, as they are not within the purview of this post. Below are some helper functions along with all of the API/library includes used by this program. There are some comments to describe what’s going on, but this functionality only exists to get us to the main goal of plotting the data, so a thorough explanation is omitted.
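The actual helpers depend on the two libraries mentioned above. As a rough stand-in, here is a minimal sketch of the same idea, assuming the daily data arrives as CSV text with a header row, the date in the first column, and the closing price in a known column. The names PriceSeries and parse_price_csv are hypothetical and are not part of either library.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container: parallel vectors are easy to hand to plotting code.
struct PriceSeries {
    std::vector<std::string> dates;  // e.g. "2021-01-04"
    std::vector<double> closes;      // closing price for each date
};

// Parse CSV text that starts with a header row.
// ASSUMPTION: column 0 holds the date and `close_column` is the zero-based
// index of the closing price. Rows that are too short or that hold a
// non-numeric price (such as "null") are skipped.
PriceSeries parse_price_csv(const std::string& csv, std::size_t close_column) {
    assert(close_column > 0);  // column 0 is reserved for the date
    PriceSeries series;
    std::istringstream lines(csv);
    std::string line;
    std::getline(lines, line);  // discard the header row
    while (std::getline(lines, line)) {
        // Split the row on commas.
        std::vector<std::string> fields;
        std::istringstream cells(line);
        std::string cell;
        while (std::getline(cells, cell, ',')) fields.push_back(cell);
        if (fields.size() <= close_column) continue;
        try {
            std::size_t used = 0;
            const double close = std::stod(fields[close_column], &used);
            if (used != fields[close_column].size()) continue;  // trailing junk
            series.dates.push_back(fields[0]);
            series.closes.push_back(close);
        } catch (const std::exception&) {
            continue;  // std::stod throws on text that is not a number
        }
    }
    return series;
}
```

Calling parse_price_csv(text, 2) on text whose header is Date,Open,Close would return one date and one closing price per valid row, ready to be passed to the plotting code.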

Plotting the Data

With the data available and processed, matplotlibcpp functions can be called to visualize the dataset.

In the main function above, the helper functions are used to retrieve the ETF ticker data in a format compatible with matplotlibcpp. Next, some vectors are filled with 1, -1, and 0; these are used to fill sections of the graph with red or green. The divergence measure is calculated in the same loop. Finally, the matplotlibcpp functions are called to plot the data and save it to a plots/ directory. Below is example output from running the program as shown above.
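As a minimal sketch of the steps just described (not the exact program that produced the plots below), the loop and the plotting calls might look like the following. Several things here are assumptions: the divergence is taken to be the sector's fractional return since the first day minus the benchmark's, which may differ from the measure in the earlier post; the green band is drawn above 0 and the red band below; and the names DivergencePlotData, build_plot_data, and save_divergence_plot are hypothetical. The plotting function is wrapped in a __has_include check so the data preparation still compiles when matplotlibcpp.h is not on the include path.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Only pull in the plotting code if the header is available. matplotlibcpp
// additionally needs Python and its development headers at build time.
#if __has_include("matplotlibcpp.h")
#include "matplotlibcpp.h"
namespace plt = matplotlibcpp;
#define HAVE_MATPLOTLIBCPP 1
#endif

// Hypothetical bundle of everything the plot needs.
struct DivergencePlotData {
    std::vector<double> x;           // trading-day index used as the x axis
    std::vector<double> divergence;  // sector return minus benchmark return
    std::vector<double> upper;       // constant  1: top of the upper band
    std::vector<double> zero;        // constant  0: boundary between the bands
    std::vector<double> lower;       // constant -1: bottom of the lower band
};

// ASSUMPTION: divergence = (sector return since day 0) - (benchmark return
// since day 0). Both inputs must be the same length with a non-zero first price.
DivergencePlotData build_plot_data(const std::vector<double>& sector,
                                   const std::vector<double>& benchmark) {
    assert(sector.size() == benchmark.size());
    DivergencePlotData d;
    for (std::size_t i = 0; i < sector.size(); ++i) {
        const double sector_return = sector[i] / sector[0] - 1.0;
        const double benchmark_return = benchmark[i] / benchmark[0] - 1.0;
        d.x.push_back(static_cast<double>(i));
        d.divergence.push_back(sector_return - benchmark_return);
        d.upper.push_back(1.0);
        d.zero.push_back(0.0);
        d.lower.push_back(-1.0);
    }
    return d;
}

#ifdef HAVE_MATPLOTLIBCPP
// Shade the two bands, draw the divergence line on top, and save the figure.
void save_divergence_plot(const DivergencePlotData& d, const std::string& ticker,
                          const std::string& path) {
    const std::map<std::string, std::string> green_band{{"color", "lightgreen"}};
    const std::map<std::string, std::string> red_band{{"color", "lightcoral"}};
    plt::figure_size(1200, 600);
    plt::fill_between(d.x, d.zero, d.upper, green_band);
    plt::fill_between(d.x, d.lower, d.zero, red_band);
    plt::named_plot(ticker + " divergence", d.x, d.divergence, "g-");
    plt::title(ticker + " vs. S&P 500");
    plt::legend();
    plt::save(path);
    plt::close();
}
#endif
```

With the header available, a call such as save_divergence_plot(build_plot_data(sector, benchmark), "SOXX", "plots/SOXX.png") would write one figure per ticker. The output directory has to exist beforehand, since matplotlib does not create it.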

The divergences of SOXX, a semiconductor industry ETF, versus the S&P 500.
The divergences of VDE, an energy industry ETF, versus the S&P 500.
The divergences of VHT, a healthcare industry ETF, versus the S&P 500.
The divergences of XRT, a retail industry ETF, versus the S&P 500.

As a quick overview: hypothetically, when the green divergence line is in the red section of the graphs, the sector is undervalued; at 0 the sector is fairly valued; and above 0 the sector is overvalued. This is just meant to be an initial view of these sectors and should not be used to make investing decisions. However, finding a sector that appears to be undervalued might warrant further research.

A Minor Change to matplotlibcpp

When creating these plots I found a small issue in the matplotlibcpp::fill_between(…) function. When passing a parameter map that used numeric values as the parameter values (e.g. alpha=0.2), the value would be passed along as a string and would not take effect. To fix this, I made a small change to the function to special-case this parameter. This is part of a GitHub issue that had other recommended fixes, such as using the C++17 std::any class. I went with the simplest solution since my case doesn’t need to be generic. The fix is in the fill_between method of the library, around line 842 at the time of writing.

This fix special-cases the alpha parameter and parses it as a double rather than as a string value. This seemed to correct the issue in my case.
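The logic of that special case can be sketched in plain C++, independent of the Python C API. This is an illustration of the idea rather than the patched library code: KeywordValue and convert_keywords are hypothetical names, and inside the header the numeric branch would presumably build a Python float (for example with PyFloat_FromDouble) where the other branch builds a Python string.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one converted keyword argument.
struct KeywordValue {
    std::string name;
    bool is_number;
    double number;     // meaningful only when is_number is true
    std::string text;  // meaningful only when is_number is false
};

// Walk the keyword map the way fill_between does, but treat "alpha" as a
// number instead of passing it through as a string.
std::vector<KeywordValue> convert_keywords(
        const std::map<std::string, std::string>& keywords) {
    std::vector<KeywordValue> converted;
    for (const auto& entry : keywords) {
        KeywordValue value{entry.first, false, 0.0, ""};
        if (entry.first == "alpha") {
            value.is_number = true;
            value.number = std::stod(entry.second);  // throws if not numeric
            // matplotlib expects alpha to lie in [0, 1].
            assert(value.number >= 0.0 && value.number <= 1.0);
        } else {
            value.text = entry.second;  // every other keyword stays a string
        }
        converted.push_back(value);
    }
    return converted;
}
```

Because only alpha is special-cased, any other numeric keyword would still be passed as a string, which is the trade-off of choosing the simplest fix over a generic one.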

Conclusion

In this post, matplotlibcpp, a C++ API that calls the Python matplotlib library, was introduced. Sample usage was provided by reproducing plots that were created as part of another project of mine. This is just the tip of the iceberg for this library, and almost anything that can be done with matplotlib can be done with matplotlibcpp. The intention of this post was to introduce the library to those who might not have known it was available.

Full Code

Yahoo Finance API (yfapi.h)

Plotting Logic
