*Disclaimer: This article is less about finance and more on the nerdy side.
Today, we have another respectable green candle on the SP500 that brings us back slightly above our entry point. We are now arriving at the end of a decision triangle in which we have consistently made higher lows but also lower highs. Tomorrow is "quad witching", which should reintroduce volatility, but its direction is usually not indicative of the upcoming trend, so we'll have to wait until next week to truly know what the market will do. The SKEW has cooled down but could still have room to fall, and most metrics are neither in the buy zone nor in the hedge zone. I still maintain that, given the current low volatility and the market's timid reaction to bad news, nothing tells us now that we shouldn't be in the market.
BTC, which has often led the market since the end of 2021, is also recovering from its own move down. Things could change rapidly, but the good news is that sitting between the buy zone and the sell zone of the hedge signal keeps us close to many of its sell thresholds. This again greatly limits the potential downside.
Now, since there's not much to say about the current market state, I wanted to use these lines to post on another topic related to methodology, especially around finding optimal thresholds during the design phase. This is a topic I have mentioned in some of our latest posts since we had a hedge signal that happened right on the threshold and then stayed there for a while.
How to find a strategy threshold?
A strategy is always about three things: 1) the data we use as input, 2) the processing we do on these data and 3) the thresholds that trigger the buy or sell of the asset. For instance, someone using a simple MACD signal uses stock price SMA or EMA as input (and some calculations) and employs the crossing point between the MACD line and the signal line as the threshold. The WU hedge signal utilizes about 11 different data feeds as input. Some of them are post-processed, and then there's a condition on each for buying or selling.
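To make the three ingredients concrete, here is a minimal Python sketch of the MACD example: closing prices are the input, the EMAs are the processing, and the crossing of the MACD line and the signal line is the threshold. The 12/26/9 spans are the conventional MACD defaults, and the function names and the toy price series are assumptions made for illustration. None of this is the WU hedge signal.

```python
from typing import List, Tuple


def ema(values: List[float], span: int) -> List[float]:
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd_signals(closes: List[float], fast: int = 12, slow: int = 26,
                 signal: int = 9) -> List[Tuple[int, str]]:
    """Return (bar index, 'buy' | 'sell') at each MACD / signal-line crossing."""
    # Processing: MACD line = fast EMA - slow EMA, signal line = EMA of MACD.
    macd = [f - s for f, s in zip(ema(closes, fast), ema(closes, slow))]
    sig = ema(macd, signal)
    diff = [m - s for m, s in zip(macd, sig)]
    # Threshold: the sign change of (MACD - signal line) triggers the trade.
    events = []
    for i in range(1, len(diff)):
        if diff[i - 1] <= 0 < diff[i]:
            events.append((i, "buy"))
        elif diff[i - 1] >= 0 > diff[i]:
            events.append((i, "sell"))
    return events


# Toy input (made up): a steady decline followed by a steady recovery.
toy_closes = [100.0 - i for i in range(40)] + [60.0 + 2 * i for i in range(40)]
print(macd_signals(toy_closes))
```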
When designing a strategy, a significant part of the process is to optimize these thresholds for performance on past data. In doing this, I usually make the hit rate (success rate) of the strategy over a long period the primary objective. The hit rate is crucial since it demonstrates that what you designed is superior to flipping a coin. Also, if 50% of your moves end up being wrong, it becomes psychologically challenging to apply the strategy, even if it is profitable. I understand that what truly matters when it comes to investment is profit, so perhaps I should prioritize strategy return? The issue with focusing too much on strategy return is that a poor strategy could, by luck (or design), perfectly hedge big moves like the 2020 and 2008 crashes and create outstanding returns. Successfully hedging these dramatic events is important, but the point is that the returns from these moves could mask the "actual performance" of the algorithm. Thus, it's vital to consider both. When testing a strategy, my priority order is typically hit rate first, then return. I have various other objectives, like number of trades, max drawdown, and performance on shorts, depending on what I'm designing.
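A small Python sketch may help show why both numbers matter. It assumes a "hit" is simply a trade that closes above its entry price, which is one possible definition and not necessarily the one used for the WU signals. The two trade lists are made up for illustration.

```python
from typing import List, Tuple

Trade = Tuple[float, float]  # (entry_price, exit_price) of one long position


def hit_rate(trades: List[Trade]) -> float:
    """Fraction of trades that closed above their entry price."""
    if not trades:
        return 0.0
    wins = sum(1 for entry, exit_price in trades if exit_price > entry)
    return wins / len(trades)


def compounded_return(trades: List[Trade]) -> float:
    """Total return from reinvesting everything in each successive trade."""
    growth = 1.0
    for entry, exit_price in trades:
        growth *= exit_price / entry
    return growth - 1.0


# Made-up examples: one lucky outlier versus a steadier strategy.
lucky = [(100, 90), (100, 95), (100, 92), (100, 300)]
steady = [(100, 110), (100, 108), (100, 95), (100, 112)]

for name, trades in (("lucky", lucky), ("steady", steady)):
    print(name, hit_rate(trades), round(compounded_return(trades), 3))
```

In this toy case the "lucky" list has the higher return but is wrong three times out of four. That is the situation where return alone hides the actual performance of the algorithm.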
To help understand what I've just explained, here's an example with the current indicator I'm actively working on: Intrinsic Entropy. Intrinsic Entropy is a novel concept imported from particle physics and has been applied in various areas of finance over the last few years. More recently, a team of researchers from two different universities decided to extend this concept to measure realized volatility in the stock markets. If you're unfamiliar with realized volatility, it essentially measures the actual volatility of the price action over a past period. This contrasts with implied volatility, like the VIX, which predicts future volatility. Though it might seem simple to measure past volatility since it has already occurred, it's not that straightforward. The method originally proposed had fundamental flaws, and many scientists and quant analysts have since suggested different approaches to address them. In 2022, the consensus was that the most accurate way to measure realized volatility in the stock market was the Yang-Zhang variation of the Garman-Klass realized volatility estimation. The novel "Intrinsic Entropy" method is a variation of the Yang-Zhang equation that considers trading volume, recognizing that a given price change means something different if triggered by low volume than by high volume. I will delve deeper into the issues with prior realized volatility algorithms and how Intrinsic Entropy addresses most of them when I publish our WU version of this algorithm as an investing tool. Returning to the main topic of methodology, here's a preview of the current version of this novel indicator.
The signal oscillates somewhere between -2 and 2, depending on the trend and volatility. The way it's displayed here is that when it's above 0, the signal appears in green; otherwise, it's in red. Is this a good threshold that would allow us to successfully hedge against the bumps along the road? It may seem so at first glance, but we need to be sure. The way to validate this is by backtesting. TradingView has a special type of script for that called “Strategy”, which allows users to easily set conditions for buying and selling and see the financial return along with all the related stats. This is what I use for calculating the hedge signal or margin signal statistics. The challenge with this is that it doesn't allow for the automatic adjustment of certain parameters. For instance, if you want to compare the strategy return of hedging when the “Intrinsic Entropy signal” drops below zero with another scenario where you would hedge later at -0.1 instead of zero, you'd have to run two different simulations and manually compare the results. What if you wanted a different buy threshold than the sell threshold? For example, selling when the Intrinsic Entropy signal goes below 0 but buying back when it rises above 0.1, creating a "dead zone" to avoid bounces? This would require even more simulations. To address this, we need to move to a programming environment that provides full flexibility to sweep the algorithm's parameter space. Here are the results of different sell and buy thresholds applied to the Intrinsic Entropy signal on the QQQ from 2003 to 2023:
This graph represents the strategy's hit rate or success rate over that period. The hit rate is indicated by the color of the graph. The success rate oscillates between 0.4 (which is worse than flipping a coin) and 0.75 (while 75% might not sound impressive if you think in terms of school grades, it's commendable in the investing world over a prolonged period). This graph shows us that the maximum hit rate is achieved when we set the sell threshold at -0.75 and buy back at around 0.1. However, there's another favorable zone with a high hit rate around selling at -0.8 and buying back quickly at -0.5.
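For readers who want to see what such a sweep looks like outside TradingView, here is a minimal Python sketch. It makes several assumptions: the strategy is either fully invested or fully in cash, trades execute at the price of the bar where the signal crosses, and a hedge counts as a "hit" when the buy-back price is lower than the exit price. The function names, the toy signal and the toy prices are all made up. This is not the WU implementation or the Intrinsic Entropy indicator.

```python
from typing import Dict, List, Tuple


def backtest(signal: List[float], prices: List[float],
             sell_th: float, buy_th: float) -> Tuple[float, float, int]:
    """Return (hit rate, return ratio vs. buy-and-hold, number of hedges)."""
    in_market, entry, exit_price = True, prices[0], 0.0
    growth, hedges, hits = 1.0, 0, 0
    for s, p in zip(signal, prices):
        if in_market and s < sell_th:
            # Signal fell below the sell threshold: leave the market.
            growth *= p / entry
            in_market, exit_price = False, p
        elif not in_market and s > buy_th:
            # Signal rose above the buy-back threshold: re-enter.
            hedges += 1
            hits += 1 if p < exit_price else 0
            in_market, entry = True, p
    if in_market:
        growth *= prices[-1] / entry  # mark the open position to the last price
    buy_and_hold = prices[-1] / prices[0]
    hit_rate = hits / hedges if hedges else 0.0
    return hit_rate, growth / buy_and_hold, hedges


def sweep(signal: List[float], prices: List[float],
          sell_grid: List[float], buy_grid: List[float]
          ) -> Dict[Tuple[float, float], Tuple[float, float, int]]:
    """Run the backtest for every (sell, buy) pair with buy >= sell."""
    return {(sell_th, buy_th): backtest(signal, prices, sell_th, buy_th)
            for sell_th in sell_grid for buy_th in buy_grid
            if buy_th >= sell_th}


# Toy data (made up) standing in for the indicator and the QQQ closes.
toy_signal = [0.5, 0.2, -0.9, -0.6, -0.3, 0.2, 0.6]
toy_prices = [100.0, 102.0, 95.0, 90.0, 92.0, 97.0, 103.0]
grid = [-1.0, -0.75, -0.5, -0.25, 0.0, 0.25]
for thresholds, stats in sweep(toy_signal, toy_prices, grid, grid).items():
    print(thresholds, stats)
```

Each entry of the resulting dictionary corresponds to one pixel of a map like the one described above: the key is the (sell, buy) pair and the value holds the quantities that get color-coded.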
Now, let's examine a similar graph that presents the strategy return compared to the buy-and-hold value over the same period.
The zone that had the highest hit rate is not actually associated with a high return. This means that this strategy doesn't fail often but probably re-enters too late to be considerably profitable. So in this example, I would pick the point associated with the maximum Return ratio since it is also connected to a very good zone in terms of hit rate. This would configure a QQQ Intrinsic Entropy strategy with a Sell threshold set at -0.78 and a buyback at -0.54. This strategy would have a hit rate of 73% and would yield about the same return as buying and holding. This might not sound impressive, but this spans over 20 years during which the Nasdaq has changed significantly, and includes two major bear markets. Bear markets have a very distinct behavior regarding volatility. If we exclude them, the return is more than 2X the buy and hold. Also, over a period closer to the present, this return is significantly higher. Another advantage of having a signal with a high hit rate that matches the buy and hold is that by filtering out the major drawdowns, it could make us more comfortable investing in a leveraged play that would greatly outperform the buy and hold while avoiding the pain of being invested during a massive downtrend. I'll revisit how to use this indicator when we release it. At the moment, it's still just an example for methodology.
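The selection rule used here ("hit rate first, then return") can be sketched in a few lines of Python. In the toy table below, the hit rates 0.75 and 0.73 and the return ratio of about 1.0 echo the figures quoted above. The other values, including the whole third row, are invented placeholders. The 0.70 hit-rate floor is also an assumption chosen for illustration.

```python
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]   # (sell threshold, buy-back threshold)
Stats = Tuple[float, float]   # (hit rate, return ratio vs. buy-and-hold)


def pick_point(results: Dict[Point, Stats],
               min_hit_rate: float) -> Optional[Point]:
    """Among points meeting the hit-rate floor, take the best return ratio."""
    eligible = {p: s for p, s in results.items() if s[0] >= min_hit_rate}
    if not eligible:
        return None
    return max(eligible, key=lambda p: eligible[p][1])


toy_results = {
    (-0.75, 0.10): (0.75, 0.60),   # best hit rate; the 0.60 ratio is invented
    (-0.78, -0.54): (0.73, 1.00),  # figures quoted in the text
    (-0.20, 0.00): (0.45, 1.10),   # entirely invented low-hit-rate point
}
print(pick_point(toy_results, min_hit_rate=0.70))
```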
Now, let's return to the hit rate graph mentioned earlier. Suppose the maximum hit rate was also associated with a high return. I still would not have chosen that point of operation for the strategy. Remember how the hedge signal was recently triggered by a data point that ended up being precisely on the threshold. This means that if the data had been 1-2% lower, we would not have exited the market. We also bought back right on a threshold before a reversal, which explains why the QQQ IOFund strategy is still short and the SPY is in the market. One thing I mentioned back then is that we shouldn't be overly concerned about that, since the thresholds we use aren't necessarily at the absolute maximum, but rather are good ones that provide similar results in their neighborhood. What do I mean by that? Returning to our example here, look at the 3D view of the hit rate:
You see how the zone with the maximum hit rate is not a stable one. A small deviation of the threshold results in the strategy's hit rate falling from a 75% success rate to 45%. I tend to stay away from these zones since they tell me that a minor change in the behavior of the stock market could have a tremendous negative impact on the strategy's success rate. If we look at the other zone with a good hit rate that we highlighted before, this zone, although not as high as the other one, slopes very gradually around the threshold, with a large plateau around the local top. These are the kinds of thresholds that I prefer. This tells me that even if the QQQ behavior evolves over the years, we may not achieve exactly the same success rate, but it should remain relatively similar.
This is the kind of analysis we did for each algorithm threshold. This implies that our SPY hedge signal is not the algorithm version that yields the highest historical results and hit rate, but the one that represents the best compromise between stability and return/hit rate, with the expectation that this stability will bring robustness in the face of never-before-seen data. This way of designing the strategy, combined with our backtesting methodology borrowed from AI (3 buckets of data), explains why the performance of our 3 strategies (BTC, SPY, and QQQ) since they went live on TradingView has been very similar to what we observed in backtesting.
Another aspect I always consider is the number of trades that the strategy generates. This one is more personal. I am an investor first, not a trader. I usually discard strategies that involve a significant amount of trading. Others might find frequent trading more appealing; it all comes down to investor type. For the QQQ example, here is the same parameter mapping with regard to the number of trades involved over that 20-year period.
The parameters we chose would generate around 40 trades, which is 2 trades per year over that period.
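The preference for a plateau over an isolated peak, described in the hit-rate discussion above, can be turned into a simple score. The Python sketch below rates each grid cell by the worst hit rate found in its immediate neighborhood. This is one possible stability measure chosen for illustration, not necessarily the criterion used for the WU signals. The 6x6 grid is invented: it contains a lone 0.75 spike surrounded by 0.45 values and a broader plateau around 0.70.

```python
from typing import List, Tuple

Grid = List[List[float]]  # hit rate indexed by [sell index][buy index]


def neighbourhood_min(grid: Grid, i: int, j: int, radius: int = 1) -> float:
    """Worst hit rate in the square window centred on cell (i, j)."""
    rows, cols = len(grid), len(grid[0])
    return min(grid[r][c]
               for r in range(max(0, i - radius), min(rows, i + radius + 1))
               for c in range(max(0, j - radius), min(cols, j + radius + 1)))


def most_stable_cell(grid: Grid, radius: int = 1) -> Tuple[int, int]:
    """Cell whose whole neighbourhood keeps the highest hit rate."""
    cells = [(i, j) for i in range(len(grid)) for j in range(len(grid[0]))]
    return max(cells, key=lambda ij: neighbourhood_min(grid, ij[0], ij[1], radius))


def raw_best_cell(grid: Grid) -> Tuple[int, int]:
    """Cell with the single highest hit rate, ignoring its neighbours."""
    cells = [(i, j) for i in range(len(grid)) for j in range(len(grid[0]))]
    return max(cells, key=lambda ij: grid[ij[0]][ij[1]])


# Invented hit-rate map: an unstable spike at (1, 1), a plateau around (3, 3).
toy_hit = [
    [0.45, 0.46, 0.44, 0.45, 0.46, 0.45],
    [0.44, 0.75, 0.45, 0.46, 0.47, 0.46],
    [0.46, 0.45, 0.68, 0.70, 0.69, 0.50],
    [0.45, 0.47, 0.70, 0.73, 0.71, 0.52],
    [0.46, 0.48, 0.69, 0.71, 0.70, 0.50],
    [0.45, 0.46, 0.50, 0.52, 0.51, 0.48],
]
print("raw maximum:", raw_best_cell(toy_hit))
print("most stable:", most_stable_cell(toy_hit))
```

Cells on the edge of the grid have smaller windows, so in practice the swept range should extend past the region of interest. A cap on the number of trades could be applied as an additional filter before ranking.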
I hope this helps you understand our process a bit more. This was a simple example since it was only two parameters. With two parameters, it's easy to sweep all possible combinations. On 20 years of data, it takes about 45 seconds to run this algo on my MacBook Pro M1 Pro with 10 cores. If I were to add another parameter, it would take something like an hour with the same resolution. For more parameters, I could end up having to run the algo for days, so in such cases, we need to use methods other than simple parameter sweeping. One method that I like is Evolutionary Algorithms, a topic on which I've co-published a few scientific works over the years. But that is an entirely different topic.
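The timings quoted above are consistent with a grid of roughly 80 values per parameter, since one hour divided by 45 seconds is 80. That figure is an inference from the two timings, not a number stated here, and the short Python sketch below simply extrapolates from it to show how quickly an exhaustive sweep grows.

```python
def sweep_seconds(n_params: int, steps_per_axis: int,
                  seconds_per_run: float) -> float:
    """Time for an exhaustive grid sweep: steps ** n_params single runs."""
    return steps_per_axis ** n_params * seconds_per_run


# Assumption: 80 steps per axis, calibrated so that 2 parameters take 45 s.
STEPS = 80
SECONDS_PER_RUN = 45.0 / STEPS ** 2

for n in (2, 3, 4, 5):
    hours = sweep_seconds(n, STEPS, SECONDS_PER_RUN) / 3600
    print(f"{n} parameters: {hours:.2f} hours")
```

Under this assumption a fourth parameter already pushes the sweep to about 80 hours, which is why exhaustive sweeping stops being practical and search methods such as evolutionary algorithms become attractive.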
To conclude this article, here are the Apple (AAPL) hit rate and return ratio charts over the last 10-year period. I assume that if you've followed this post, you can now easily spot for yourself what the best parameters would be. What would you choose?