Finding Undervalued Sectors in the Stock Market

For an investor who makes decisions based on a company's underlying fundamentals (i.e., fundamental analysis), finding companies to buy can be a daunting task. With ~3,600 publicly listed companies in the United States alone, it can be nearly impossible to narrow the field to a shortlist worth analyzing. Thankfully, many brokerages offer stock screeners to help with this, but even then you can end up with far too many companies to look through, and many screeners lack important criteria.

In this post, I will introduce a simple way to pre-screen the market by sector: compare the price of an overall sector to the S&P 500. The stocks in the S&P 500 index are said to represent about 82% of the total U.S. equity market value. Because of this, this post assumes that, in general, all sectors should trade more or less together, and that a divergence means a sector has fallen out of favor and has been discounted by the market. For example, the retail sector was taking a beating before the 2020 Coronavirus Pandemic due to the rise in e-commerce sales. Many companies in the sector, however, were still healthy, growing, and stable but were sold off anyway as the sector took a hit. Finding this divergence can be the first step in finding these undervalued companies.

There are two things to note before diving further into this post. The first is that the S&P 500 index may not be the best way to measure the market as a whole, since the index is “top-heavy” (i.e., it is weighted by market capitalization, so the largest stocks in the index have a much greater effect on its price movements than the smaller ones). One way to mitigate this is to also compare the sector to other market indexes, such as the DJIA or the Russell 2000. The second is that a sector may be beaten down but not necessarily undervalued. As new technologies come to light, older companies that fail to pivot will perform worse and worse year after year. The railroad industry is a good example of this. For decades, the amount of freight transported by rail continued to decline. During this decline, it would have been easy to assume that the sector was just in a slump and thus undervalued. But even today the industry has failed to bounce back to its peak. Therefore, the technique described below is meant as a starting point for your research and analysis; you shouldn’t skip the research altogether.

Determining Sector Divergence

A short Python script was written to determine the divergence of a particular market sector compared to the S&P 500. Below are the dependencies of the project, including the Yahoo Finance API I created and wrote about in a different blog post:

from yfapi import YahooFinanceAPI
import pandas as pd
import datetime
import matplotlib.pyplot as plt

Next, a dictionary is created which maps the tickers that we’re interested in comparing with the S&P 500 to a short description of which sector the ETF was created to track. After this dictionary is created, S&P 500 data from the last ~11 years is retrieved from Yahoo Finance. This data isn’t actually from the S&P 500 but rather the SPY ETF which closely tracks the index.

sector_tickers = {
    "XRT": "Retail",
    "SOXX": "Semiconductors",
    "VDE": "Energy",
    "VHT": "Health Care"
}
price_column = "Close"
date_column = "Date"

api = YahooFinanceAPI()
sp_data = api.get_ticker_data("spy", datetime.datetime(2010, 1, 1), datetime.datetime(2020, 12, 31))

One immediate problem is the scaling of the data. Some ETFs trade at considerably lower prices than others. For example, two popular ETFs that track the S&P 500 index are SPLG and SPY. The former has a price of $45 at the time of writing and the latter a price of $382.88. Simply subtracting these two values probably won’t yield any interesting results due to the large discrepancy in prices. Because of this, the price data for the two ETFs must be normalized so that all the data lie between 0 and 1. Note that, if we actually normalized SPLG and SPY and found the divergence it should be very near 0 at all times since they both track the S&P 500. To squeeze the values between 0 and 1 the following formula is used:

    \[price_{normal} = \frac{price_{actual} - price_{min}}{price_{max} - price_{min}}\]

This is implemented for the S&P data using Pandas data frames in the code below. The same thing will be done later to the price data for the individual sectors.

normal_sp_close = (sp_data[price_column] - sp_data[price_column].min())/ \
                  (sp_data[price_column].max() - sp_data[price_column].min())
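
As a quick sanity check of the SPLG/SPY claim above, here is a minimal sketch of the same min-max formula wrapped in a helper. The function name `min_max_normalize` and the prices are made up for illustration and are not part of the original script. Two series that differ only by a constant price multiple normalize to the same curve, so their divergence is zero.

```python
import pandas as pd


def min_max_normalize(prices: pd.Series) -> pd.Series:
    """Rescale a price series so its minimum maps to 0 and its maximum to 1."""
    low, high = prices.min(), prices.max()
    return (prices - low) / (high - low)


# Made-up prices: fund_b trades at exactly one tenth of fund_a's price.
fund_a = pd.Series([300.0, 330.0, 360.0, 390.0])
fund_b = fund_a / 10

normal_a = min_max_normalize(fund_a)
normal_b = min_max_normalize(fund_b)

# Largest absolute gap between the two normalized curves.
max_gap = (normal_a - normal_b).abs().max()
print(max_gap)  # effectively zero, up to floating-point rounding
```

Note that a series whose price never changes has `max == min`, so this formula divides by zero and yields NaN values; real price histories over several years should not hit that case.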

To compute the divergence, the data for each ticker is read from Yahoo Finance (over the same time frame), normalized, and differenced from the SPY data:

for key in sector_tickers:
    ticker, desc = key, sector_tickers[key]
    comp_data = api.get_ticker_data(ticker, datetime.datetime(2010, 1, 1), datetime.datetime(2020, 12, 31))
    normal_comp_data = (comp_data[price_column] - comp_data[price_column].min())/ \
                       (comp_data[price_column].max() - comp_data[price_column].min())
    ## Divergence = 0 -> no divergence between S&P and Sector (or stock)
    ## Divergence < 0 -> sector gains less than S&P gains (sector may be undervalued)
    ## Divergence > 0 -> sector gains greater than S&P gains (sector may be overvalued)
    divergence = normal_comp_data - normal_sp_close

As the comments in this section of the script note, the S&P data is subtracted from the sector data, so the sector can be considered “undervalued” if the divergence variable is less than 0, “overvalued” if it is greater than 0, and “fairly valued” if it is 0. Again, this doesn’t ACTUALLY mean that the sector is undervalued, just that it might be worth digging into the data surrounding the sector or some of its leading companies, since they may be unpopular with market participants and oversold. The divergence is plotted with the code below so it can be visualized.

div, = plt.plot(comp_data[date_column], divergence, label="Divergence")
sp, = plt.plot(sp_data[date_column], normal_sp_close, label="SPY - S&P 500")
comp, = plt.plot(comp_data[date_column], normal_comp_data, label="{} - {}".format(ticker, desc))
plt.legend(handles=[div, sp, comp])
plt.axhline(0, color="black")
plt.savefig("DiscountedSectors/plots/{}.png".format(ticker))
plt.clf()
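
One caveat about the subtraction step: pandas subtracts two Series by index label, not by date. If the two downloads ever have a different number of rows (for example, an ETF that started trading after 2010) and carry a plain integer index, rows from different dates would be paired up. The sketch below is one possible safeguard, not part of the original script: it merges the two frames on the date column first. The helper name `aligned_divergence` is hypothetical, and it assumes each frame has the "Date" and "Close" columns used above. It also normalizes over the shared dates only, which is a design choice that differs slightly from the script.

```python
import pandas as pd


def aligned_divergence(sector: pd.DataFrame, benchmark: pd.DataFrame,
                       date_column: str = "Date",
                       price_column: str = "Close") -> pd.DataFrame:
    """Normalize both price series over their shared dates, then difference them."""
    # Keep only the dates present in both frames.
    merged = sector[[date_column, price_column]].merge(
        benchmark[[date_column, price_column]],
        on=date_column,
        how="inner",
        suffixes=("_sector", "_benchmark"),
    )
    # Min-max normalize each price column over the shared window.
    for side in ("sector", "benchmark"):
        col = f"{price_column}_{side}"
        merged[f"normal_{side}"] = (merged[col] - merged[col].min()) / (
            merged[col].max() - merged[col].min()
        )
    merged["divergence"] = merged["normal_sector"] - merged["normal_benchmark"]
    return merged


# Made-up data: the sector fund is missing the first trading day.
benchmark = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06", "2020-01-07"]),
    "Close": [100.0, 110.0, 120.0, 130.0],
})
sector = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-03", "2020-01-06", "2020-01-07"]),
    "Close": [50.0, 50.0, 60.0],
})

result = aligned_divergence(sector, benchmark)
print(result[["Date", "divergence"]])
```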

Examples

Below are a few examples of this divergence analysis. I chose XRT which tracks the US retail segment, SOXX which tracks semiconductors, and VDE for the energy sector.

The green and red regions below indicate when the divergence line (blue line) is in a potentially profitable or potentially unprofitable region, respectively.
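
The plotting code shown earlier does not draw these shaded regions. A minimal sketch of one way to add them with matplotlib's `fill_between` follows. The helper name `plot_shaded_divergence` and the sample values are made up, and the color mapping (green where the divergence is below zero, red where it is above) is my reading of the divergence definition rather than the exact code behind the charts.

```python
import matplotlib
matplotlib.use("Agg")  # render without a display; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np


def plot_shaded_divergence(dates, divergence, ax=None):
    """Plot the divergence line and shade the areas below and above zero."""
    if ax is None:
        _, ax = plt.subplots()
    divergence = np.asarray(divergence, dtype=float)
    ax.plot(dates, divergence, label="Divergence")
    ax.axhline(0, color="black")
    # Green: sector lags the benchmark (potentially undervalued).
    ax.fill_between(dates, divergence, 0, where=divergence < 0,
                    color="green", alpha=0.3, interpolate=True)
    # Red: sector leads the benchmark (potentially overvalued).
    ax.fill_between(dates, divergence, 0, where=divergence > 0,
                    color="red", alpha=0.3, interpolate=True)
    ax.legend()
    return ax


# Made-up divergence values over six periods.
days = np.arange(6)
sample_divergence = [0.10, 0.05, -0.05, -0.20, -0.10, 0.15]
ax = plot_shaded_divergence(days, sample_divergence)
ax.figure.savefig("shaded_divergence_demo.png")
```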

The divergence line and normalized price data for the S&P and retail sector (XRT) are plotted in the graph below.

The divergence between the US retail sector (XRT) and the S&P 500 index (SPY)

As seen in this chart, the sector may have become undervalued in the 2018-2019 timeframe. In hindsight, purchasing during this time would have paid off, since the sector went on to gain more than the S&P 500 (percentage-wise). In this timeframe, e-commerce (Amazon in particular) was becoming more and more prevalent. Many thought it was the end of the brick-and-mortar store, and it still might prove to be. But finding quality, well-capitalized companies that were either sustaining and growing their brick-and-mortar operations or building their e-commerce presence would have paid off through 2020.

The next sector analyzed is the semiconductor industry (SOXX). This sector has experienced massive growth (in terms of market capitalization) in 2020 after lagging somewhat behind the S&P 500 for a few years.

The divergence between the semiconductor industry (SOXX) and the S&P 500 (SPY).

The final sector is the energy sector, represented by VDE. Since the outbreak of the Covid-19 pandemic, oil prices have taken a big hit due to entire economies being more or less shut down. This caused a sell-off of many companies in the industry (sometimes rightfully so).

The divergence between the energy sector (VDE) and the S&P 500 (SPY).

As seen here, the sector has diverged from the overall market (pretty significantly). Again, this doesn’t mean that VDE or any stocks in the sector should be bought. It might be the case that oil prices stay low indefinitely (however unlikely) and that many of the companies in this sector go bankrupt as a result. But the divergence indicates that some companies in the sector, or the sector itself, might warrant further research to determine whether there are any inefficiently valued securities.

Individual Stocks

The method described here can be used to find the divergence between any two assets that normally trade in lockstep. This is the basis of pairs trading, in which two securities that are typically highly correlated are traded together when a divergence (like the one illustrated in this post) opens up between them. In pairs trading, a short position is opened on the security that has become “overvalued” (by our divergence definition), and a long position is taken in the security that has become “undervalued” (again by the divergence definition). This could be done with the S&P 500 index as one of the two securities and an individual stock as the other, much like what was done in this post with sectors, but it’s not advisable: individual stocks are subject to changes in their industries, financial well-being, corporate mismanagement, and similar factors. Because of this, individual stocks should not be expected to trade with the overall market, which would make this type of divergence analysis useless. With an entire sector, as long as the sector is healthy (i.e., not being phased out by new technology, although it may be unpopular at the time), I think we can reasonably expect that the sector will “bounce back”, or at least that its best-performing and most well-managed companies will.
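
To make the long/short rule concrete, here is a toy sketch that turns a divergence series into positions for the two legs of a pair. The function name `pair_positions` and the 0.2 threshold are arbitrary assumptions chosen for illustration; this is not a tested trading strategy and it ignores costs, position sizing, and exits.

```python
import pandas as pd


def pair_positions(divergence: pd.Series, threshold: float = 0.2) -> pd.DataFrame:
    """Map divergence (asset A minus asset B, both normalized) to positions.

    +1 means long, -1 means short, 0 means no position.
    """
    asset_a = pd.Series(0, index=divergence.index)
    asset_a[divergence > threshold] = -1   # A looks "overvalued": short A
    asset_a[divergence < -threshold] = 1   # A looks "undervalued": long A
    # The other leg always takes the opposite side.
    return pd.DataFrame({"asset_a": asset_a, "asset_b": -asset_a})


# Made-up divergence values.
sample = pd.Series([0.05, 0.25, -0.30, 0.0])
positions = pair_positions(sample)
print(positions)
```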

Full Code

from yfapi import YahooFinanceAPI
import pandas as pd
import datetime
import matplotlib.pyplot as plt

# ticker -> description dict (place new tickers here)
sector_tickers = {
    "XRT": "Retail",
    "SOXX": "Semiconductors",
    "VDE": "Energy",
    "VHT": "Health Care"
}
# just in case these change
price_column = "Close"
date_column = "Date"

api = YahooFinanceAPI()
sp_data = api.get_ticker_data("spy", datetime.datetime(2010, 1, 1), datetime.datetime(2020, 12, 31))
normal_sp_close = (sp_data[price_column] - sp_data[price_column].min())/ \
                  (sp_data[price_column].max() - sp_data[price_column].min())

for key in sector_tickers:
    ticker, desc = key, sector_tickers[key]
    comp_data = api.get_ticker_data(ticker, datetime.datetime(2010, 1, 1), datetime.datetime(2020, 12, 31))
    normal_comp_data = (comp_data[price_column] - comp_data[price_column].min())/ \
                       (comp_data[price_column].max() - comp_data[price_column].min())
    divergence = normal_comp_data - normal_sp_close

    ## Divergence = 0 -> no divergence between S&P and Sector (or stock)
    ## Divergence < 0 -> sector gains less than S&P gains (sector may be undervalued)
    ## Divergence > 0 -> sector gains greater than S&P gains (sector may be overvalued)
    div, = plt.plot(comp_data[date_column], divergence, label="Divergence")
    sp, = plt.plot(sp_data[date_column], normal_sp_close, label="SPY - S&P 500")
    comp, = plt.plot(comp_data[date_column], normal_comp_data, label="{} - {}".format(ticker, desc))
    plt.legend(handles=[div, sp, comp])
    plt.axhline(0, color="black")
    plt.savefig("DiscountedSectors/plots/{}.png".format(ticker))
    plt.clf()
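
Note that `plt.savefig` does not create missing folders, so the script fails with a `FileNotFoundError` if `DiscountedSectors/plots` does not already exist. A minimal sketch of one way around this, using only the standard library, is below. The helper name `plot_path` is hypothetical and not part of the script above, and the demo writes into a temporary folder just to avoid cluttering the working directory.

```python
import tempfile
from pathlib import Path


def plot_path(ticker: str, out_dir: str = "DiscountedSectors/plots") -> Path:
    """Return the PNG path for a ticker, creating the output folder if needed."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return folder / f"{ticker}.png"


# Demo in a throwaway directory.
demo_root = tempfile.mkdtemp()
demo_path = plot_path("XRT", out_dir=f"{demo_root}/DiscountedSectors/plots")
print(demo_path.name, demo_path.parent.is_dir())
```

In the script, the save line would then become `plt.savefig(plot_path(ticker))`.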

2 thoughts on “Finding Undervalued Sectors in the Stock Market”

  1. Hi,
    In regards to your Undervalued Sectors script, I received the following error: FileNotFoundError: [Errno 2] No such file or directory: 'DiscountedSectors/plots/XRT.png'. How do I fix this error? I am using Windows and Python 3.9.2.

    1. Hi Ton,

      It looks like this is coming from the creation of the output files. One easy fix is to modify line 37 of the script by changing:

      plt.savefig("DiscountedSectors/plots/{}.png".format(ticker))

      to

      plt.savefig("{}.png".format(ticker))

      That way the images will be saved to the directory in which your Python script is located (that depends on how you’re running the script too since some IDEs manage the file system differently). Alternatively, you could add a “DiscountedSectors” folder and a “plots” subfolder to that newly created folder so that the file path exists. That is, you would need a directory structure as follows:

      – discounted_sectors.py
      – DiscountedSectors
      —- plots

      The former option of modifying the script is probably the easiest.

      I hope that helps,
      Anthony
