One of the perennial problems in building trading models is the sparseness of data and the attendant danger of overfitting. Fortunately, there are systematic methods of dealing with both ends of the problem. These methods are well-known in machine learning, though most traditional machine learning applications have a lot more data than we traders are used to. (E.g. Google used 10 million YouTube videos to train a deep learning network to recognize cats' faces.)
To create more training data out of thin air, we can resample (perhaps more vividly, oversample) our existing data. This is called bagging. Let's illustrate this using a fundamental factor model described in my new book. It uses 27 factor loadings such as P/E, P/B, Asset Turnover, etc. for each stock. (Note that I call cross-sectional factors, i.e. factors that depend on each stock, "factor loadings" instead of "factors" by convention.) These factor loadings are collected from the quarterly financial statements of S&P 500 companies, and are available from Sharadar's Core US Fundamentals database (as well as from more expensive sources like Compustat). The factor model is very simple: it is just a multiple linear regression model with the next quarter's return of a stock as the dependent (target) variable, and the 27 factor loadings as the independent (predictor) variables. Training consists of finding the regression coefficients of these 27 predictors. The trading strategy based on this predictive factor model is equally simple: if the predicted next-quarter return is positive, buy the stock and hold for a quarter. Vice versa for shorts.
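For concreteness, here is a minimal sketch in Python of what such a model and trading rule could look like. The arrays X, y and X_latest are placeholders I made up to stand in for the actual factor loadings and next-quarter returns (which you would pull from Sharadar or Compustat); only the structure is meant to be illustrative.

import numpy as np
from sklearn.linear_model import LinearRegression

# Placeholder pooled panel: each row is one (stock, quarter) observation,
# the 27 columns are the factor loadings (P/E, P/B, Asset Turnover, ...),
# and y is that stock's return over the following quarter.
rng = np.random.default_rng(0)
X = rng.normal(size=(630_000, 27))    # stand-in for the real factor loadings
y = rng.normal(size=630_000)          # stand-in for next-quarter returns

model = LinearRegression().fit(X, y)  # training = estimating the 27 coefficients

# Trading rule: long a stock if its predicted next-quarter return is positive,
# short it otherwise, and hold for one quarter.
X_latest = rng.normal(size=(500, 27))  # latest quarter's loadings for 500 stocks
signals = np.where(model.predict(X_latest) > 0, 1, -1)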
Note that there is already one step taken toward curing data sparseness: we do not try to build a separate model with a different set of regression coefficients for each stock. We constrain the model such that the same regression coefficients apply to all the stocks. Otherwise, the training data that we use from 200701-201112 would only have 1,260 rows, instead of 1,260 x 500 = 630,000 rows.
The result of this baseline trading model isn't bad: it has a CAGR of 14.7% and a Sharpe ratio of 1.8 in the out-of-sample period 201201-201401. (Caution: this portfolio is not necessarily market or dollar neutral. Hence the return could be due to a long bias enjoying the bull market in the test period. Interested readers can certainly test a market-neutral version of this strategy hedged with SPY.) I plotted the equity curve below.
Next, we resample the data by randomly picking n (=630,000) data points with replacement to form a new training set (a "bag"), and we repeat this K (=100) times to form K bags. For each bag, we train a new regression model. At the end, we average over the predicted returns of these K models to serve as our official predicted returns. This results in a marginal improvement of the CAGR to 15.1%, with no change in the Sharpe ratio.
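In code, the bagging step might look something like the following sketch (reusing the placeholder X, y and X_latest from above); scikit-learn's BaggingRegressor would do essentially the same thing out of the box.

import numpy as np
from sklearn.linear_model import LinearRegression

def bagged_predictions(X, y, X_new, K=100, seed=0):
    # Train K regressions, each on a bootstrap "bag" of n rows drawn with
    # replacement, and average their predicted returns.
    rng = np.random.default_rng(seed)
    n = len(y)
    preds = np.zeros((K, len(X_new)))
    for k in range(K):
        rows = rng.integers(0, n, size=n)   # resample n rows with replacement
        preds[k] = LinearRegression().fit(X[rows], y[rows]).predict(X_new)
    return preds.mean(axis=0)               # the official predicted returns

predicted_returns = bagged_predictions(X, y, X_latest)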
Now, we try to reduce the predictor set. We use a method called "random subspace": we randomly pick half of the original predictors to train a model, and repeat this K=100 times. Once again, we average over the predicted returns of all these models. Combined with bagging, this results in a further marginal improvement of the CAGR to 15.1%, again with little change in the Sharpe ratio.
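A sketch of the random subspace step, combined with bagging as just described, might look like this (again using the placeholder arrays from above):

import numpy as np
from sklearn.linear_model import LinearRegression

def random_subspace_predictions(X, y, X_new, K=100, seed=0):
    # Each of the K models sees a bootstrap bag of the rows and a random
    # half of the 27 predictors; the K predictions are averaged at the end.
    rng = np.random.default_rng(seed)
    n, p = X.shape
    preds = np.zeros((K, len(X_new)))
    for k in range(K):
        rows = rng.integers(0, n, size=n)                 # bagging, as before
        cols = rng.choice(p, size=p // 2, replace=False)  # random half of the predictors
        model = LinearRegression().fit(X[rows][:, cols], y[rows])
        preds[k] = model.predict(X_new[:, cols])
    return preds.mean(axis=0)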
The improvements from either method may not look large so far, but at least they show that the original model is robust with respect to randomization.
But there is another method of reducing the number of predictors. It is called stepwise regression. The idea is simple: we pick one predictor from the original set at a time, and add it to the model only if the BIC (Bayesian Information Criterion) decreases. The BIC is essentially the negative log likelihood of the training data under the regression model, plus a penalty term proportional to the number of parameters (concretely, BIC = k*ln(n) - 2*ln(L), where k is the number of parameters, n the number of data points, and L the maximized likelihood). That is, if two models have the same log likelihood, the one with the larger number of parameters will have a larger BIC and is hence penalized. Once we have reached the minimum BIC, we then try to remove one predictor from the model at a time, until the BIC cannot decrease any further. Applying this to our fundamental factor loadings, we achieve a quite significant improvement of the CAGR over the base model: 19.1% vs. 14.7%, with the same Sharpe ratio.
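Here is a minimal sketch of such a forward-then-backward stepwise procedure, using statsmodels to compute the BIC of each candidate OLS fit. This is my own illustration of the idea, not the exact code behind the results quoted above.

import numpy as np
import statsmodels.api as sm

def bic(X, y, cols):
    # BIC of an OLS fit (with intercept) using only the predictor columns in cols.
    return sm.OLS(y, sm.add_constant(X[:, cols])).fit().bic

def stepwise_by_bic(X, y):
    selected, remaining = [], list(range(X.shape[1]))
    best = np.inf
    # Forward passes: add the predictor that lowers the BIC the most, if any does.
    improved = True
    while improved and remaining:
        improved = False
        score, j = min((bic(X, y, selected + [j]), j) for j in remaining)
        if score < best:
            best, improved = score, True
            selected.append(j)
            remaining.remove(j)
    # Backward passes: remove a predictor if doing so lowers the BIC further.
    improved = True
    while improved and len(selected) > 1:
        improved = False
        score, j = min((bic(X, y, [c for c in selected if c != j]), j) for j in selected)
        if score < best:
            best, improved = score, True
            selected.remove(j)
    return selected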
It is also satisfying that the stepwise regression model picked only two variables out of the original 27. Let that sink in for a moment: just two variables account for all of the predictive power of a quarterly financial report! As to which two variables these are - I will reveal that in my talk at QuantCon 2017 on April 29.
===
My Upcoming Workshops
March 11 and 18: Cryptocurrency Trading with Python
I will be moderating this online workshop for my friend Nick Kirk, who taught a similar course at CQF in London to wide acclaim.
May 13 and 20: Artificial Intelligence Techniques for Traders
I will discuss in detail AI techniques such as those described above, with other examples and in-class exercises. As usual, nuances and pitfalls will also be covered.