### Summary

In this article, we adopt hedgecraft’s approach to portfolio management. Unlike hedgecraft, however, we use subgraph centrality to measure risk, and this is what makes our approach unique. Using insights from network science, we build a centrality-based risk model for generating portfolio asset weights. The model is trained on the daily prices of 31 stocks from 2006-2014 and validated in the years 2015, 2016, 2017, 2018, and 2019. As a benchmark, we compare the model with a portfolio constructed with Modern Portfolio Theory (MPT). Our proposed asset allocation algorithm significantly outperformed both the Sensex30 and Nifty50 indexes in every validation year, with an average annual return rate of 26.51%, a 13.54% annual volatility, a 1.59 Sharpe ratio, a -21.22% maximum drawdown, a return over maximum drawdown of 6.56, and a growth-risk-ratio of 1.86. In comparison, the MPT portfolio had a 9.63% average annual return rate, an 18.07% annual standard deviation, a Sharpe ratio of 0.41, a maximum drawdown of -22.59%, a return over maximum drawdown of 2.2, and a growth-risk-ratio of 0.63.

### Background

In this series, we play the part of an Investment Data Scientist at Bridgewater Associates performing a go/no go analysis on a new idea for risk-weighted asset allocation. Our aim is to develop a network-based model for generating asset weights such that the probability of losing money in any given year is minimized. We’ve heard down the grapevine that all go-decisions will be presented to Dalio’s inner circle at the end of the week and will likely be subject to intense scrutiny. As such, we work with a few highly correlated assets with strict go/no go criteria. We build the model using the daily prices of each stock in the Sensex. If our recommended portfolio either (1) loses money in any year, (2) does not outperform the market every year, or (3) does not outperform the MPT portfolio—the decision is no go.

### Asset Diversification and Allocation

The building blocks of a portfolio are assets (resources whose economic value is expected to increase over time). Each asset belongs to one of seven primary asset classes: cash, equity, fixed income, commodities, real estate, alternative assets, and, more recently, digital (such as cryptocurrency and blockchain). Within each class are different asset types. For example, stocks, index funds, and equity mutual funds all belong to the equity class, while gold, oil, and corn belong to the commodities class. An emerging consensus in the financial sector is this: a portfolio containing assets of many classes and types hedges against potential losses by increasing the number of revenue streams. In general, the more diverse the portfolio, the less likely it is to lose money. Take stocks, for example: a diversified stock portfolio contains positions in multiple sectors. We call this asset diversification, or more simply diversification. Below is a table summarizing the asset classes and some of their respective types.

An investor solves the following (asset allocation) problem: given X rupees and N assets, find the best possible way of breaking X into N pieces. By “best possible” we mean maximizing our returns while minimizing the risk to our initial investment. In other words, we aim to consistently grow X irrespective of the overall state of the market. In what follows, we explore provocative insights by Ray Dalio and others on portfolio construction.

The above chart depicts the behavior of a portfolio with increasing diversification. Along the x-axis is the number of asset types. Along the y-axis is how “spread out” the annual returns are. A lower annual standard deviation indicates smaller fluctuations in each revenue stream, and in turn a diminished risk exposure. The “Holy Grail”, so to speak, is to (1) find the largest number of assets that are the least correlated and (2) allocate X rupees to those assets such that the probability of losing money in any given year is minimized. The underlying principle is this: the portfolio most robust against large market fluctuations and economic downturns is a portfolio with assets that are the most independent of each other.
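The shape of such a diversification curve can be illustrated with a textbook simplification. The sketch below assumes N equally weighted assets that all share one volatility and one pairwise correlation; neither assumption comes from this article’s data, and the names `portfolio_volatility`, `asset_vol`, and `avg_corr` are purely illustrative.

```python
import math


def portfolio_volatility(n_assets: int, asset_vol: float, avg_corr: float) -> float:
    """Volatility of an equal-weight portfolio under two simplifying assumptions:
    every asset has the same volatility and every pair has the same correlation.

    Variance = vol^2 * (1/N + (1 - 1/N) * corr)
    """
    if n_assets < 1:
        raise ValueError("n_assets must be at least 1")
    variance = asset_vol ** 2 * (1.0 / n_assets + (1.0 - 1.0 / n_assets) * avg_corr)
    return math.sqrt(variance)


if __name__ == "__main__":
    # Compare highly correlated assets with uncorrelated ones as N grows.
    for n in (1, 5, 10, 20):
        high = portfolio_volatility(n, asset_vol=0.10, avg_corr=0.6)
        none = portfolio_volatility(n, asset_vol=0.10, avg_corr=0.0)
        print(n, round(high, 4), round(none, 4))
```

Under these assumptions, adding correlated assets quickly stops helping (the volatility levels off near `asset_vol * sqrt(avg_corr)`), while adding uncorrelated assets keeps shrinking it. That is the intuition behind searching for the least correlated assets.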

### Visualizing How A Portfolio is Correlated with Itself (with Physics)

The following visualizations are rendered with the Kamada-Kawai method, which treats each vertex of the graph as a mass and each edge as a spring. The graph is drawn by finding the vertex positions that minimize the total energy of the ball-spring system. The method uses the edge weights of the graph as the spring lengths, which are given by 1 - cor_matrix, where cor_matrix is the distance correlation matrix. Nodes separated by large distances reflect smaller correlations between their time-series data, while nodes separated by small distances reflect larger correlations. The minimum energy configuration consists of vertices with few connections experiencing a repulsive force and vertices with many connections feeling an attractive force. As such, nodes with a larger degree (more correlations) fall towards the center of the visualization, whereas nodes with a smaller degree (fewer correlations) are pushed outwards. For an overview of physics-based graph visualizations, see the Wikipedia article on force-directed graph drawing.
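As a rough sketch of where the spring lengths come from, the code below computes a sample distance correlation between two series and assembles the 1 - cor_matrix distances. It assumes NumPy and a 2-D array with one column per asset; the function names `distance_correlation` and `distance_matrix` are hypothetical, and the article’s own implementation may differ.

```python
import numpy as np


def distance_correlation(x, y) -> float:
    """Sample distance correlation of two equally long 1-D series (value in [0, 1])."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise absolute differences within each series.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    # Double-center each distance matrix.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0.0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def distance_matrix(returns) -> np.ndarray:
    """Spring lengths 1 - cor_matrix for a (time x assets) array of returns."""
    returns = np.asarray(returns, dtype=float)
    n_assets = returns.shape[1]
    cor_matrix = np.eye(n_assets)
    for i in range(n_assets):
        for j in range(i + 1, n_assets):
            dcor = distance_correlation(returns[:, i], returns[:, j])
            cor_matrix[i, j] = cor_matrix[j, i] = dcor
    return 1.0 - cor_matrix
```

The resulting matrix could then be handed to a graph library as edge weights for a Kamada-Kawai layout; that step is omitted here.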

In the above visualization, the sizes of the vertices are proportional to the number of connections they have. The color bar to the right indicates the degree of dissimilarity (the distance) between the stocks. The larger the value (the lighter the color), the less similar the stocks are. Keeping this in mind, several stocks jump out. Bajaj Finance, ITC, HUL, and HeroMotoCorp all lie on the periphery of the network with the fewest correlations above Pc = 0.325. On the other hand, ICICI Bank, Axis Bank, SBI, and Yes Bank sit in the core of the network with the greatest number of connections above Pc = 0.325. It is clear from the closing prices network that our asset allocation algorithm needs to reward vertices on the periphery and punish those nearing the center. In the next code block we build a function to visualize how the edges of the distance correlation network are distributed.
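The degree counts behind such a plot can be sketched as follows. This is a minimal illustration rather than the article’s plotting function: it assumes an edge exists whenever a pairwise correlation exceeds the threshold Pc, and the helper names and the toy matrix are made up for the example.

```python
def threshold_degrees(cor_matrix, names, threshold=0.325):
    """Count, for each asset, how many other assets it is correlated with above the threshold."""
    degrees = {}
    for i, name in enumerate(names):
        degrees[name] = sum(
            1
            for j in range(len(names))
            if j != i and cor_matrix[i][j] > threshold
        )
    return degrees


def connected_fraction(degrees):
    """Express each degree as a fraction of the other nodes in the network."""
    others = len(degrees) - 1
    return {name: degree / others for name, degree in degrees.items()}


if __name__ == "__main__":
    # Toy correlation matrix, not taken from the article's data.
    names = ["A", "B", "C"]
    cor_matrix = [
        [1.0, 0.5, 0.2],
        [0.5, 1.0, 0.4],
        [0.2, 0.4, 1.0],
    ]
    degrees = threshold_degrees(cor_matrix, names)
    print(degrees)
    print(connected_fraction(degrees))
```

A histogram of these degrees is what a degree-distribution plot would display.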

### Observations

- The degree distribution is left-skewed.
- The average node is connected to 86.6% of the network.
- Very few nodes are connected to less than 66.6% of the network.
- The kernel density estimation is not a good fit.
- By eyeballing the plot, the degrees appear to follow an *inverse power-law* distribution. (This would be consistent with the findings of Tse *et al.* (2010).)

### Intraportfolio Risk

We read an intraportfolio risk plot like this: ICICI Bank is **0.091/0.084 = 4.94** times riskier than Maruti Suzuki (MSPL). Intuitively, the assets that cluster in the center of the network are most susceptible to impacts, whereas those further from the cluster are the least susceptible. The logic from here is straightforward: take the inverse of the relative risk (which we call the “relative certainty”) and normalize it such that it adds to 1. These are the asset weights. Formally, if $r_i$ is the relative risk of asset $i$ and $N$ is the number of assets, the weight of asset $i$ is

$$w_i = \frac{1/r_i}{\sum_{j=1}^{N} 1/r_j}$$

Next, let’s visualize the allocation of INR 100,000 in our portfolio.
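The inverse-and-normalize step, and the split of INR 100,000 that follows from it, could look roughly like the sketch below. It assumes NumPy and a symmetric adjacency matrix for the thresholded network. Whether the article works with a binary or a weighted adjacency matrix is not stated, so that choice, and every function name here, is an assumption.

```python
import numpy as np


def subgraph_centrality(adjacency) -> np.ndarray:
    """Subgraph centrality: the diagonal of the matrix exponential exp(A).

    For a symmetric A with eigenpairs (lambda_k, v_k), the i-th diagonal entry
    equals sum_k v_k[i]**2 * exp(lambda_k).
    """
    A = np.asarray(adjacency, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eigh(A)
    return (eigenvectors ** 2) @ np.exp(eigenvalues)


def centrality_weights(adjacency) -> np.ndarray:
    """Asset weights: inverse relative risk, normalized to sum to 1."""
    centrality = subgraph_centrality(adjacency)
    relative_risk = centrality / centrality.sum()
    relative_certainty = 1.0 / relative_risk
    return relative_certainty / relative_certainty.sum()


def allocate_capital(weights, capital: float) -> np.ndarray:
    """Split a capital amount across assets according to the weights."""
    return np.asarray(weights, dtype=float) * capital


if __name__ == "__main__":
    # Toy star network: node 0 is the hub, nodes 1-3 sit on the periphery.
    star = [
        [0, 1, 1, 1],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
    ]
    weights = centrality_weights(star)
    print(weights)
    print(allocate_capital(weights, 100_000))
```

In the toy star network the hub receives the smallest weight and the peripheral nodes share the rest equally, which mirrors the idea of rewarding the periphery and penalizing the core.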

### Subgraph Centrality-Based Asset Allocation

Bajaj Finance receives nearly 12.58%, Bajaj Auto gets about 12.58%, HUL 8.15%, Infosys 4.52%, and the remaining assets receive less than 0.5% of our capital. To the traditional investor, this strategy may appear “risky” since 60% of our investment is with 5 of our 31 assets. While it’s true that we’ll lose a substantial amount of money if Bajaj Finance is hit hard, our algorithm predicts that Bajaj Finance is the least likely to take a hit if and when our other assets get in trouble. Bajaj Finance is clearly the winning pick in our portfolio.

It’s worth pointing out that the methods we’ve used to generate the asset allocation weights differ dramatically from the contemporary methods of MPT and its extensions. The approach taken in this project makes no assumptions about the future outcomes of a portfolio, i.e., the algorithm doesn’t require us to predict the expected returns (as MPT does). What’s more, we’re not solving an optimization problem: there’s nothing to be minimized or maximized. Instead, we observe the topology (interrelatedness) of our portfolio, use subgraph centrality to predict which assets are the most susceptible to the spread of volatile behavior, and allocate capital accordingly.

### Alternative Allocation Strategy: Allocate Capital in the Maximum Independent Set

The maximum independent set (MIS) is the largest set of vertices such that no two are adjacent. Applied to our asset correlation network, the MIS is the greatest number of assets such that every pair has a correlation below Pc = 0.325. The size of the MIS grows with the threshold Pc: larger values of Pc produce a sparser network (more edges are removed), and therefore the MIS tends to be larger. An optimized portfolio would therefore correspond to maximizing the size of the MIS subject to minimizing Pc. The best way to do this is to increase the universe of assets we’re willing to invest in. By further diversifying the portfolio with many asset types and classes, we can isolate the largest number of minimally correlated assets and allocate capital inversely proportional to their relative risk. While generating the asset weights remains a non-optimization problem, generating the asset correlation network becomes one. We’re really solving two separate problems: determining how to build the asset correlation network (there are many ways) and determining which graph invariants (there are many) extract the asset weights from the network. As such, one can easily imagine a vast landscape of portfolios beyond that of MPT and an enormous amount of wealth to create. **Unfortunately, solving the MIS problem is NP-hard. The best we can do is find an approximation**.
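To make the definition concrete, here is a brute-force sketch that finds an exact MIS for a tiny universe of assets. It assumes an edge exists whenever a pairwise correlation exceeds Pc, tries every subset from largest to smallest, and is therefore only usable for a handful of assets (the running time grows exponentially, in line with the NP-hardness noted above). The function name and the toy matrix are illustrative.

```python
from itertools import combinations


def exact_mis(cor_matrix, names, threshold):
    """Largest set of assets in which every pairwise correlation is at most the threshold.

    Exhaustive search: exponential in the number of assets, for illustration only.
    """
    n = len(names)
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if all(cor_matrix[i][j] <= threshold for i, j in combinations(subset, 2)):
                return [names[i] for i in subset]
    return []


if __name__ == "__main__":
    # Toy correlation matrix, not taken from the article's data.
    names = ["A", "B", "C", "D"]
    cor_matrix = [
        [1.00, 0.70, 0.50, 0.20],
        [0.70, 1.00, 0.40, 0.35],
        [0.50, 0.40, 1.00, 0.10],
        [0.20, 0.35, 0.10, 1.00],
    ]
    print(exact_mis(cor_matrix, names, threshold=0.325))
    print(exact_mis(cor_matrix, names, threshold=0.6))
```

On this toy matrix the MIS has two assets at a threshold of 0.325 and three at 0.6, because the higher threshold removes more edges.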

### Using Expert Knowledge to Approximate the Maximum Independent Set

We have two options: randomly generate a list of maximal independent sets (sets of vertices in which no two share an edge and to which no further vertex can be added) and select the largest one, or use expert knowledge to reduce the number of sets to generate and then do the same. Both methods are imperfect, but the former is far more computationally expensive than the latter. Suppose we do fundamentals research and conclude Bajaj Finance and HUL must be in our portfolio. How could we imbue the algorithm with this knowledge? Can we make the algorithm flexible enough for portfolio managers to fine-tune with good old-fashioned research, while at the same time keeping it rigid enough to prevent poor decisions from producing terrible portfolios? We confront this problem in the code block below by extracting an approximate MIS from 100 random maximal independent sets containing Bajaj Finance and HUL.

The generate_mis function generates a maximal independent set that approximates the true maximum independent set. As an option, the user can pick a list of assets they want in their portfolio and generate_mis will return the safest assets to complement the user’s choice. Picking Bajaj Finance and HUL left us with Sun Pharma and Hero MotoCorp, amongst others. The weights of these assets remain inversely proportional to the subgraph centrality.
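The idea of sampling maximal independent sets around a list of must-have assets can be sketched with the standard library alone. This is not the article’s generate_mis function; `approximate_mis`, `random_maximal_independent_set`, and the adjacency format (a dict mapping each asset to the set of its neighbours) are assumptions made for illustration.

```python
import random


def random_maximal_independent_set(adjacency, required=(), rng=None):
    """Greedily grow a maximal independent set that contains the required nodes.

    adjacency: dict mapping each node to the set of its neighbours (no self-loops).
    """
    rng = rng or random.Random()
    chosen = set(required)
    for node in chosen:
        if adjacency[node] & chosen:
            raise ValueError("required assets are adjacent to each other")
    # Nodes that can no longer be added: the chosen ones and all their neighbours.
    blocked = set(chosen)
    for node in chosen:
        blocked |= adjacency[node]
    candidates = [node for node in adjacency if node not in blocked]
    rng.shuffle(candidates)
    for node in candidates:
        if node not in blocked:
            chosen.add(node)
            blocked.add(node)
            blocked |= adjacency[node]
    return chosen


def approximate_mis(adjacency, required=(), trials=100, seed=0):
    """Keep the largest of several randomly generated maximal independent sets."""
    rng = random.Random(seed)
    best = set()
    for _ in range(trials):
        candidate = random_maximal_independent_set(adjacency, required, rng)
        if len(candidate) > len(best):
            best = candidate
    return best
```

Fixing the required assets up front shrinks the pool of candidates, which is why the expert-knowledge variant is cheaper than sampling over the whole network. The result is maximal (nothing more can be added) but only an approximation of the true maximum.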

### Allocating Shares to the Deep Learning Network Portfolio

In this section we write production (almost) ready code for portfolio analysis and include our own risk-adjusted returns score. The section looks something like this:

We obtain the cumulative returns and returns on investment, extract the end of year returns and annual return rates, calculate the average annual rate of returns and annualized portfolio standard deviation, compute the Sharpe Ratio, Maximum Drawdown, Returns over Maximum Drawdown, and our own unique measure: the Growth-Risk Ratio.
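A minimal sketch of several of these metrics is given below, using only the standard library. It assumes 252 trading days per year and a risk-free rate supplied by the caller; the article does not state either choice. The return over maximum drawdown is computed as total return divided by the absolute maximum drawdown, which appears consistent with the reported figures (139.3% growth over a 21.22% maximum drawdown gives roughly 6.56). The Growth-Risk Ratio is not defined in the text, so it is left out.

```python
import math

TRADING_DAYS = 252  # assumed number of trading days per year


def annualized_volatility(daily_returns):
    """Sample standard deviation of daily returns, scaled to one year."""
    n = len(daily_returns)
    mean = sum(daily_returns) / n
    variance = sum((r - mean) ** 2 for r in daily_returns) / (n - 1)
    return math.sqrt(variance * TRADING_DAYS)


def sharpe_ratio(daily_returns, risk_free_rate=0.0):
    """Annualized excess return divided by annualized volatility."""
    annual_return = sum(daily_returns) / len(daily_returns) * TRADING_DAYS
    return (annual_return - risk_free_rate) / annualized_volatility(daily_returns)


def max_drawdown(values):
    """Largest peak-to-trough decline of a portfolio value series (a negative number)."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


def return_over_max_drawdown(values):
    """Total return over the period divided by the absolute maximum drawdown."""
    total_return = values[-1] / values[0] - 1.0
    return total_return / abs(max_drawdown(values))
```

For example, a value series of 100, 120, 90, 130 has a maximum drawdown of -25% (from 120 to 90) and a total return of 30%, so its return over maximum drawdown is 1.2.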

Finally, we visualize the returns, drawdowns, and returns distribution of each model and analyze the performance of each portfolio.

### Visualizing the Returns

Pictured above are the daily returns for Deep Learning Network MIS (solid green curve), Deep Learning Network (solid blue curve), and the Efficient Frontier portfolio (solid red curve) from 2015 to 2019. The color-coded dashed curves represent the 50-day rolling averages of the respective portfolios. Several observations pop out: (1) Deep Learning Network MIS significantly outperformed Deep Learning Network Portfolio, (2) Deep Learning Network Portfolio substantially outperformed the Efficient Frontier, (3) Deep Learning Network MIS grew 158.4% larger, falling below 0% returns on 0 of the trading days, (4) Deep Learning Network grew 139.3% larger, and (5) the Efficient Frontier grew 49.8% larger. Next, let’s observe the annual returns for each portfolio and compare them with the market.

In comparison, the Nifty50 had annual return rates of -4.1%, 3%, 28.6%, 3.2%, and -1.03% (YTD) in 2015, 2016, 2017, 2018, and 2019 (YTD), respectively. Deep Learning Network Portfolio and Deep Learning Network MIS substantially outperformed both the market and the Efficient Frontier. Both Deep Learning Network portfolios grew at an impressive rate. Deep Learning Network MIS grew 19.1 percentage points more than Deep Learning Network Portfolio and 108.6 percentage points more than the Efficient Frontier, while Deep Learning Network Portfolio grew 89.5 percentage points more than the Efficient Frontier. What’s more, Deep Learning Network MIS’s return rates consistently increased about 25% every year, whereas the return rates of Deep Learning Network Portfolio and the Efficient Frontier were less consistent. Deep Learning Network MIS clearly has the most consistent rate of growth. We’d expect this rapid growth to be accompanied by a large burden of risk, manifested as a large degree of volatility, steep and frequent maximum drawdowns, or both. As we explore below, the Deep Learning Network portfolios sustained their growth rates with significantly less risk exposure than the Efficient Frontier.

### Visualizing Drawdowns

Illustrated above is the daily rolling 252-day drawdown for Deep Learning Network MIS (filled sea green curve), Deep Learning Network (filled royal blue curve), and the Efficient Frontier (filled dark salmon curve) along with the respective rolling maximum drawdowns (solid curves). Several observations stick out: (1) the Deep Learning Network portfolios have significantly smaller drawdowns than the portfolio generated from the Efficient Frontier, (2) both Deep Learning Network portfolios have roughly the same maximum drawdown (about 22%), (3) Deep Learning Network on average lost the least amount of returns, and (4) Deep Learning Network’s rolling maximum drawdowns are, on average, less pronounced than those of Deep Learning Network MIS. These results suggest that subgraph centrality has predictive power as a measure of relative or intraportfolio risk, and more generally, that network-based portfolio construction is a promising alternative to more traditional approaches like MPT.

Deep Learning Network and its MIS variant dramatically outperformed the Efficient Frontier on every metric (save Deep Learning Network MIS’s annual volatility). These results give credence to the possibility that we are on to something substantial here, as we have passed the criteria of our go/no go test. Outperforming MPT by these margins is no simple feat, but the real test is whether or not Deep Learning Network Portfolio can consistently beat MPT on many randomly generated portfolios. To wrap up this notebook, let’s take a look at how the returns for each portfolio are distributed and move to the conclusion of Deep Learning Network Portfolio Optimization.

### Visualizing the Distribution of Returns

Above are the returns distributions for each portfolio: Efficient Frontier (in red), Deep Learning Network (in blue), and Deep Learning Network MIS (in green). The Efficient Frontier algorithm produced a portfolio with a roughly normal distribution of returns; the same can’t be said of the Deep Learning Network portfolios, as they’re heavily right-skewed. The right-skewness of the Deep Learning Network portfolios is caused by their strong upward momentum, that is, their consistent growth. In general, we’d expect a strong correlation between the right-skewness of the returns distribution and the growth-risk-ratio.
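One simple way to put a number on right-skewness is the moment-based sample skewness, sketched below with the standard library. Whether the article quantifies skewness this way is not stated; the function name `sample_skewness` is illustrative.

```python
def sample_skewness(returns):
    """Moment-based skewness: third central moment over the 1.5 power of the second.

    Positive values indicate a longer right tail, zero indicates symmetry.
    """
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    return m3 / m2 ** 1.5
```

A symmetric sample such as -1, 0, 1 has zero skewness, while a sample with a few large positive values comes out positive.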

It’s important to emphasize that deviation-based measures of risk-adjusted performance implicitly assume the distribution of returns follows a normal distribution. As such, the Sharpe ratio isn’t a suitable measure of performance since the standard deviation isn’t a suitable measure of risk for the Deep Learning Network portfolios.

While Deep Learning Network had less pronounced maximum drawdowns, it was more frequently below 0% returns (1.59% of the time) than its MIS variant (0.53% of the time). These values are dwarfed by that of the Efficient Frontier, which painfully experienced negative returns a third of the time. It’s interesting to note that the maximum losses of the Deep Learning Network portfolios are an order of magnitude smaller than their maximum drawdowns. This relationship is in contrast to the Efficient Frontier’s maximum loss, which is on the same order of magnitude as its maximum drawdown. It’s also interesting to point out that Deep Learning Network has a lower probability of falling below its 30-, 50-, and 90-day rolling averages than its MIS variant. Taken together, Deep Learning Network’s smaller average rolling maximum drawdown and smaller probabilities of falling below the above rolling averages indicate its growth is more consistent than that of its MIS variant. On the one hand, Deep Learning Network’s growth is more consistent than that of its MIS variant, while on the other hand, the MIS variant has a more consistent growth rate. Stated another way: Deep Learning Network’s “velocity of returns” is more consistent than that of the MIS variant, whereas Deep Learning Network MIS’s “acceleration of returns” is more consistent than that of the Deep Learning Network portfolio.
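The two frequency measures discussed above can be sketched as follows. This is an assumed reading of the text: the share of days with cumulative returns below 0%, and the share of days on which a series sits below its own trailing rolling average. The function names are illustrative.

```python
def fraction_below_zero(cumulative_returns):
    """Share of observations with a cumulative return below 0%."""
    below = sum(1 for r in cumulative_returns if r < 0)
    return below / len(cumulative_returns)


def fraction_below_rolling_mean(series, window):
    """Share of observations that fall below their trailing rolling average.

    Only positions with a full window of history are counted.
    """
    below = 0
    total = 0
    for t in range(window - 1, len(series)):
        rolling_mean = sum(series[t - window + 1 : t + 1]) / window
        total += 1
        if series[t] < rolling_mean:
            below += 1
    return below / total if total else 0.0
```

Calling `fraction_below_rolling_mean` with windows of 30, 50, and 90 would give the three probabilities compared in the paragraph above.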

### Future Portfolio Allocation

An analysis similar to the one above was repeated to generate the optimal portfolio and the subsequent allocation. The results were as follows:

- HUL: 11.73%
- ITC: 11.73%
- Bajaj Auto: 11.73%
- Sun Pharma: 11.73%
- ONGC: 11.73%
- Asian Paints: 11.73%
- NTPC: 7.6%
- PowerGrid: 7.6%
- Tech Mahindra: 3.88%
- Infosys: 3.88%
- TCS: 2.76%
- HCL Tech: 2.76%
- HeroMotoCorp: 0.87%

Thus, the sector allocation proposed by our Deep Learning Network algorithm is as follows:

- FMCG: 23.46%
- Automobile: 12.6%
- Pharma: 11.73%
- IT: 13.28%
- Energy: 26.93%
- Paints & Varnishes: 11.73%
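The sector figures follow from summing the stock weights within each sector. The short sketch below reproduces that arithmetic; the stock-to-sector mapping is inferred from the reported sector totals rather than stated explicitly in the article, so treat it as an assumption.

```python
from collections import defaultdict

# Stock weights (in percent) as reported above.
weights = {
    "HUL": 11.73, "ITC": 11.73, "Bajaj Auto": 11.73, "Sun Pharma": 11.73,
    "ONGC": 11.73, "Asian Paints": 11.73, "NTPC": 7.6, "PowerGrid": 7.6,
    "Tech Mahindra": 3.88, "Infosys": 3.88, "TCS": 2.76, "HCL Tech": 2.76,
    "HeroMotoCorp": 0.87,
}

# Assumed mapping, chosen so that the sums match the reported sector totals.
sectors = {
    "HUL": "FMCG", "ITC": "FMCG",
    "Bajaj Auto": "Automobile", "HeroMotoCorp": "Automobile",
    "Sun Pharma": "Pharma",
    "Tech Mahindra": "IT", "Infosys": "IT", "TCS": "IT", "HCL Tech": "IT",
    "ONGC": "Energy", "NTPC": "Energy", "PowerGrid": "Energy",
    "Asian Paints": "Paints & Varnishes",
}


def sector_totals(stock_weights, stock_sectors):
    """Sum the stock weights within each sector, rounded to two decimals."""
    totals = defaultdict(float)
    for stock, weight in stock_weights.items():
        totals[stock_sectors[stock]] += weight
    return {sector: round(total, 2) for sector, total in totals.items()}


if __name__ == "__main__":
    for sector, total in sector_totals(weights, sectors).items():
        print(f"{sector}: {total}%")
```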

### Conclusion

In this article, we built a novel algorithm for generating asset weights of a minimally correlated portfolio with tools from network science. Our approach is twofold: we first construct an asset correlation network with energy statistics (i.e., the distance correlation) and then extract the asset weights with a suitable centrality measure. As an intermediate step, we interpret the centrality score (in our case the subgraph centrality) as a measure of relative risk, as it quantifies the influence of each asset in the network. Recognizing the need for a human-in-the-loop variation of our proposed method, we modified the asset allocation algorithm to allow a user to pick assets subject to the constraints of the maximal independent set.

Both algorithms (Deep Learning Network and Deep Learning Network MIS), along with the benchmark Efficient Frontier, were trained on a dataset of daily historical prices for thirty-one stocks from 2000-2014 and tested from 2015-2019. The portfolios were evaluated by cumulative returns, return rates, volatility, maximum drawdowns, risk-adjusted return metrics, and downside risk-adjusted performance metrics. On all performance metrics, the Deep Learning Network algorithm significantly outperformed both the portfolio generated by the Efficient Frontier and the market, passing our go/no go criteria.