Showing posts sorted by relevance for the query artificial-intelligence-and-stock.

Monday, 23 April 2007

Paradox Resolved: Why Risk Decreases Expected Log Return But Not Expected Wealth

I have been troubled by the following paradox for the past few years. If a stock's log returns (i.e. change in log price per unit time) follow a Gaussian distribution, and if its net returns (i.e. percent change in price per unit time) have mean m and standard deviation s, then many finance students know that the mean log return is m - s²/2. That is, the compound growth rate of the stock is m - s²/2. This can be derived by applying Ito's lemma to the log price process (see e.g. Hull), and is intuitively satisfying because it says that the expected compound growth rate is lowered by risk ("volatility"). OK, we get that - risk is bad for the growth of our wealth.
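For readers who want the missing step, here is a sketch of the Ito's lemma argument. It assumes the price S follows a continuous-time geometric Brownian motion with drift m and volatility s per unit time, and W denotes a standard Brownian motion (the symbols S and W are introduced here for illustration):

```latex
\begin{aligned}
dS &= m\,S\,dt + s\,S\,dW, \qquad (dS)^{2} = s^{2}S^{2}\,dt \\
d(\log S) &= \frac{1}{S}\,dS - \frac{1}{2}\,\frac{1}{S^{2}}\,(dS)^{2}
           = \left(m - \frac{s^{2}}{2}\right)dt + s\,dW
\end{aligned}
```

The drift of the log price, m - s²/2, is the mean log return per unit time quoted above.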

However, let's find out what the expected price of the stock is at time t. If we invest our entire wealth in one stock, that is really asking what our expected wealth is at time t. To compute that, it is easier to first find out what the expected log price of the stock is at time t, because that is just the expected value of the sum of the log returns in each time interval, and is of course equal to the sum of the expected values of the log returns when we assume a geometric random walk. So the expected value of the log price at time t (taking the initial price to be 1) is just t * (m - s²/2). But what is the expected price (not log price) at time t? It isn't correct to say exp(t * (m - s²/2)), because the expected value of the exponential function of a normal variable is not equal to the exponential function of the expected value of that normal variable, or E[exp(x)] ≠ exp(E[x]). Instead, E[exp(x)] = exp(μ + σ²/2), where μ and σ are the mean and standard deviation of the normal variable (see Ruppert). In our case, the normal variable is the log price, and thus μ = t * (m - s²/2) and σ² = t * s². Hence the expected price at time t is exp(t * m). Note that it doesn't involve the volatility s. Risk doesn't affect the expected wealth at time t. But we just argued in the previous paragraph that the expected compound growth rate is lowered by risk. What gives?
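The arithmetic above can be checked numerically. The following is a minimal sketch, assuming an initial price of 1 and illustrative values for m, s and t that are not taken from the post; the function name `expected_price` is hypothetical.

```python
import numpy as np

def expected_price(m, s, t):
    """Closed-form expected price at time t for a geometric random walk (initial price 1)."""
    mu = t * (m - s**2 / 2)   # mean of the log price at time t
    sigma2 = t * s**2         # variance of the log price at time t
    return np.exp(mu + sigma2 / 2)

m, s, t = 0.10, 0.30, 5.0     # illustrative values only
closed_form = expected_price(m, s, t)

# Monte Carlo check: sample the log price directly from its normal distribution.
rng = np.random.default_rng(0)
log_price = rng.normal(t * (m - s**2 / 2), s * np.sqrt(t), size=1_000_000)
mc_mean_price = np.exp(log_price).mean()   # estimate of E[exp(x)]
naive = np.exp(log_price.mean())           # exp(E[x]), the incorrect shortcut

print(closed_form, np.exp(t * m), mc_mean_price, naive)
```

The closed form collapses to exp(t * m) because the -t * s²/2 in the mean cancels the +t * s²/2 from the variance term, while the naive exp(E[x]) comes out lower.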

This brings us to a famous recent paper by Peters and Gell-Mann. (For the physicists among you, this is the Gell-Mann who won the Nobel prize in physics for inventing quarks, the fundamental building blocks of matter.) This happens to be the most read paper in the Chaos Journal in 2016, and basically demolishes the use of the utility function in economics, in agreement with John Kelly, Ed Thorp, Claude Shannon, Nassim Taleb, etc., and against the entire academic economics profession. (See Fortune's Formula for a history of this controversy. And just to be clear which side I am on: I hate utility functions.) To make a long story short, the error we made in computing the expected stock price (or wealth) at time t is that the expectation value there is ill-defined. It is ill-defined because wealth is not an "ergodic" variable: its finite-time average is not equal to its "ensemble average". The finite-time average of wealth is what a specific investor would experience up to time t, for large t. The ensemble average is the average wealth of many millions of similar investors up to time t. Naturally, since we are just one specific investor, the finite-time average is much more relevant to us. What we computed above, unfortunately, is the ensemble average. Peters and Gell-Mann exhort us (and other economists) to only compute expected values of ergodic variables, and log return (as opposed to log price) is happily an ergodic variable. Hence our average log return is computed correctly - risk is bad. Paradox resolved!
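The gap between the ensemble average and what a typical investor experiences can be illustrated with a small simulation. This is a sketch under stated assumptions, not the analysis in the Peters and Gell-Mann paper: log returns per period are drawn as independent normals with mean m - s²/2 and standard deviation s, and the parameter values are illustrative.

```python
import numpy as np

m, s = 0.10, 0.30              # illustrative per-period mean net return and volatility
t, n_investors = 20, 200_000   # horizon in periods, size of the simulated ensemble
rng = np.random.default_rng(1)

# Each row is one investor's sequence of log returns over t periods.
log_returns = rng.normal(m - s**2 / 2, s, size=(n_investors, t))
wealth = np.exp(log_returns.sum(axis=1))        # terminal wealth, starting from 1

ensemble_mean = wealth.mean()                   # close to exp(t * m)
median_wealth = np.median(wealth)               # close to exp(t * (m - s**2 / 2))
frac_below = (wealth < ensemble_mean).mean()    # share of investors below the ensemble mean

# Time-average log return of a single investor followed for a very long time.
one_investor = rng.normal(m - s**2 / 2, s, size=100_000)
time_avg_growth = one_investor.mean()           # close to m - s**2 / 2

print(ensemble_mean, median_wealth, frac_below, time_avg_growth)
```

With these settings the ensemble mean is pulled up by a minority of very lucky paths, so most simulated investors end below it, while the long-run growth rate of a single path settles near m - s²/2.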

===

My Upcoming Workshops

May 13 and 20: Artificial Intelligence Techniques for Traders

I will discuss in detail AI techniques as applied to trading strategies, with plenty of in-class exercises, and with emphasis on the nuances and pitfalls of these techniques.

June 5-9: London in-person workshops

I will teach three courses there: Quantitative Momentum, Algorithmic Options Strategies, and Intraday Trading and Market Microstructure.

(The London courses may qualify for continuing education credits for CFA Institute members.)


Thursday, 30 July 2020

Artificial Intelligence and Stock Picking

There was an article in the New York Times a short while ago about a new hedge fund launched by Mr. Ray Kurzweil, a pioneer in the field of artificial intelligence. (Thanks to my fellow blogger Yaser Anwar who pointed it out to me.) The stock picking decisions in this fund are supposed to be made by machines that "... can observe billions of market transactions to see patterns we could never see". While I am certainly a believer in algorithmic trading, I have become a skeptic when it comes to trading based on "artificial intelligence".

At the risk of over-simplification, we can characterize artificial intelligence as trying to fit past data points to a function with many, many parameters. This is the case for some of the favorite tools of AI: neural networks, decision trees, and genetic algorithms. With many parameters, we can for sure capture small patterns that no human can see. But do these patterns persist? Or are they random noise that will never replay again? Experts in AI assure us that they have many safeguards against fitting the function to transient noise. And indeed, such tools have been very effective in consumer marketing and credit card fraud detection. Apparently, the patterns of consumers and thefts are quite consistent over time, allowing such AI algorithms to work even with a large number of parameters. However, from my experience, these safeguards work far less well in financial markets prediction, and over-fitting to the noise in historical data remains a rampant problem. As a matter of fact, I have built financial predictive models based on many of these AI algorithms in the past. Every time a carefully constructed model that seemed to work marvels in backtest came up, it inevitably performed miserably going forward. The main reason for this seems to be that the amount of statistically independent financial data is far more limited compared to the billions of independent consumer and credit transactions available. (You may think that there is a lot of tick-by-tick financial data to mine, but such data is serially correlated and far from independent.)
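The over-fitting problem can be shown in miniature. The sketch below is an illustration only, not any of the models mentioned above: it uses an ordinary least-squares fit with many parameters as a stand-in for a flexible AI model, and both the "returns" and the "predictors" are pure random noise, so there is no pattern to find.

```python
import numpy as np

rng = np.random.default_rng(2)
n_train, n_test, n_params = 250, 250, 100   # few observations relative to parameters

# Pure-noise predictors and pure-noise "returns": any fitted pattern is spurious.
X = rng.normal(size=(n_train + n_test, n_params))
y = rng.normal(size=n_train + n_test)
X_train, y_train = X[:n_train], y[:n_train]
X_test, y_test = X[n_train:], y[n_train:]

# Fit all 100 parameters on the training ("backtest") period only.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

def r_squared(X, y, coef):
    """Fraction of variance explained; negative means worse than predicting the mean."""
    resid = y - X @ coef
    return 1 - np.sum(resid**2) / np.sum((y - y.mean()) ** 2)

r2_in_sample = r_squared(X_train, y_train, coef)
r2_out_of_sample = r_squared(X_test, y_test, coef)
print(r2_in_sample, r2_out_of_sample)
```

The in-sample fit looks respectable even though nothing is predictable, and the out-of-sample R² is negative, which is the backtest-versus-live pattern described above.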

This is not to say that quantitative models do not work in prediction. The ones that work for me are usually characterized by these properties:

• They are based on a sound econometric or rational basis, and not on random discovery of patterns;
• They have few or even no parameters that need to be fitted to past data;
• They involve linear regression only, and not fitting to some esoteric nonlinear functions;
• They are conceptually simple.

Only when a trading model is philosophically constrained in such a manner do I dare to allow testing on my small, precious amount of historical data. Apparently, Occam’s razor works not only in science, but in finance as well.

Tuesday, 01 May 2007

Time Series Analysis and Data Gaps

Most time series techniques, such as the ADF test for stationarity, the Johansen test for cointegration, or the ARIMA model for returns prediction, assume that our data points are collected at regular intervals. In traders' parlance, they assume bar data with fixed bar length. It is easy to see that this mundane requirement immediately presents a problem even if we were just to analyze daily bars: how are we to deal with weekends and holidays?

You can see that the statistics of return bars over weekdays can differ significantly from those over weekends and holidays. Here is a table of comparison for SPY daily returns from 2005/05/04-2015/04/09:

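| SPY daily returns      | Number of bars | Mean Returns (bps) | Mean Absolute Returns (bps) | Kurtosis (3 is “normal”) |
|------------------------|----------------|--------------------|-----------------------------|--------------------------|
| Weekdays only          | 1,958          | 3.9                | 80.9                        | 13.0                     |
| Weekends/holidays only | 542            | 0.3                | 82.9                        | 23.7                     |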

Though the absolute magnitude of the returns over a weekday is similar to that over a weekend, the mean returns are much more positive on the weekdays. Note also that the kurtosis of returns is almost doubled on the weekends. (Much higher tail risks on weekends with much lower expected returns: why would anyone hold a position over weekends?) So if we run any sort of time series analysis on daily data, we are force-fitting a model on data with heterogeneous statistics, and that won't work well.
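A comparison like the weekday/weekend one above could be computed along the following lines. This is a sketch, not the code behind the published numbers: the function name `gap_stats` is hypothetical, it assumes a pandas price series indexed by timestamps, and it labels a bar as a weekend/holiday bar whenever more than one calendar day has elapsed since the previous bar. The prices in the example are synthetic.

```python
import numpy as np
import pandas as pd

def gap_stats(close: pd.Series) -> pd.DataFrame:
    """Return statistics for bars that follow the previous bar directly vs. bars spanning a gap."""
    returns = close.pct_change().dropna()
    # Calendar days elapsed since the previous bar, aligned to the return series.
    days_elapsed = close.index.to_series().diff().dt.days.loc[returns.index]
    label = np.where(days_elapsed > 1, "Weekends/holidays only", "Weekdays only")

    def summarize(r: pd.Series) -> pd.Series:
        centered = r - r.mean()
        # Non-excess kurtosis, so a normal distribution gives 3.
        kurtosis = (centered**4).mean() / (centered**2).mean() ** 2
        return pd.Series({
            "Number of bars": len(r),
            "Mean Returns (bps)": 1e4 * r.mean(),
            "Mean Absolute Returns (bps)": 1e4 * r.abs().mean(),
            "Kurtosis": kurtosis,
        })

    rows = {name: summarize(group) for name, group in returns.groupby(label)}
    return pd.DataFrame(rows).T

# Synthetic example: 15 business days of made-up closing prices (three weeks, two weekends).
dates = pd.bdate_range("2021-01-04", periods=15)
noise = np.random.default_rng(3).normal(0, 0.01, size=15)
close = pd.Series(100 * np.exp(np.cumsum(noise)), index=dates)
stats = gap_stats(close)
print(stats)
```

For intraday bars the one-day threshold would need to be replaced by a threshold on the elapsed time relative to the bar length.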

The problem is, of course, much worse if we attempt time series analysis on intraday bars. Not only are we faced with the weekend gap; in the case of stocks or ETFs we are faced with the overnight gap as well. Here is a table of comparison for AUDCAD 15-min returns vs weekend returns from 2009/01/01-2015/06/16:

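| AUDCAD 15-min returns  | Number of bars | Mean Returns (bps) | Mean Absolute Returns (bps) | Kurtosis (3 is “normal”) |
|------------------------|----------------|--------------------|-----------------------------|--------------------------|
| Weekdays only          | 158,640        | 0.01               | 4.5                         | 18.8                     |
| Weekends/holidays only | 343            | -2.06              | 15.3                        | 4.6                      |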

In this case, every important statistic is different (and it is noteworthy that kurtosis is actually lower on the weekends here, illustrating the mean-reverting character of this time series).

So how should we predict intraday returns with data that has weekend gaps? (The same solution should apply to overnight gaps for stocks, and so is omitted in the following discussion.) Let's consider several proposals:

1) Just delete the weekend returns, or set them as NaN in Matlab, or missing values NA in R.

This won't work because the first few bars of a week aren't properly predicted by the last few bars of the previous week. We shouldn't use any linear model built with daily or intraday data to predict the returns of the first few bars of a week, whether or not that model contains data with weekend gaps. As for how many bars constitute the "first few bars", it depends on the lookback of the model. (Notice I emphasize linear model here because some nonlinear models can deal with large jumps during the weekends appropriately.)

2) Just pretend the weekend returns are no different from the daily or intraday returns when building/training the time series model, but do not use the model for predicting weekend returns. I.e. do not hold positions over the weekends.

This has been the default, and perhaps simplest (naive?) way of handling this issue for many traders, and it isn't too bad. The predictions for the first few bars in a week will again be suspect, as in 1), so one may want to refrain from trading then. The model built this way isn't the best possible one, but then we don't have to be purists.

3) Use only the most recent period without a gap to train the model. So for an intraday FX model, we would be using the bars in the previous week, sans the weekends, to train the model. Do not use the model for predicting weekend returns nor the first few bars of a week.

This sounds fine, except that there is usually not enough data in just a week to build a robust model, and the resulting model typically suffers from severe data snooping bias.

You might think that it should be possible to concatenate data from multiple gapless periods to form a larger training set. This "concatenation" does not mean just piecing together multiple weeks' time series into one long time series - that would be equivalent to 2) and wrong. Concatenation just means that we maximize the total log likelihood of a model over multiple independent time series, which in theory can be done without much fuss since the log likelihoods (i.e. log probabilities) of independent data are additive. But in practice, most pre-packaged time series model programs do not have this facility. (Do add a comment if anyone knows of such a package in Matlab, R, or Python!) Instead of modifying the guts of a likelihood-maximization routine of a time series fitting package, we will examine a shortcut in the next proposal.
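To make the additivity argument concrete, here is a minimal sketch of maximizing a summed log likelihood over independent gapless segments. It is an illustration under assumptions rather than a reference to any existing package: the model is a Gaussian AR(1) with the conditional likelihood (the first bar of each segment is treated as given), the function names are hypothetical, the data are simulated, and `scipy.optimize.minimize` does the maximization.

```python
import numpy as np
from scipy.optimize import minimize

def ar1_loglik(params, segment):
    """Conditional Gaussian log likelihood of one gapless segment under AR(1)."""
    c, phi, log_sigma = params
    sigma = np.exp(log_sigma)                      # parameterized on the log scale to stay positive
    resid = segment[1:] - c - phi * segment[:-1]   # no residual ever spans two segments
    n = resid.size
    return -0.5 * n * np.log(2 * np.pi * sigma**2) - 0.5 * np.sum(resid**2) / sigma**2

def total_loglik(params, segments):
    """Log likelihoods of independent segments simply add."""
    return sum(ar1_loglik(params, seg) for seg in segments)

rng = np.random.default_rng(4)

def simulate_ar1(n, phi=0.3, sigma=1.0):
    """Simulated stand-in for one gapless period (e.g. one week of bars)."""
    x = np.zeros(n)
    for i in range(1, n):
        x[i] = phi * x[i - 1] + sigma * rng.normal()
    return x

segments = [simulate_ar1(500) for _ in range(20)]
result = minimize(lambda p: -total_loglik(p, segments), x0=[0.0, 0.0, 0.0], method="Nelder-Mead")
c_hat, phi_hat, log_sigma_hat = result.x
print(c_hat, phi_hat, np.exp(log_sigma_hat))
```

Because each segment contributes its own term, no lag ever reaches across a gap, which is the difference from naively stitching the weeks into one long series.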

4) Rather than using a pre-packaged time series model with maximum likelihood estimation, just use an equivalent multiple linear regression (LR) model. Then just fit this LR model to all the data in the training set except the weekend bars, and use it for predicting all future bars except the weekend bars and the first few bars of a week.

This conversion of a time series model into a LR model is fairly easy for an autoregressive model AR(p), but may not be possible for an autoregressive moving average model ARMA(p, q). This is because the latter involves a moving average of the residuals, creating a dependency which I don't know how to incorporate into a LR. But I have found that the AR(p) model, due to its simplicity, often works better out-of-sample than ARMA models anyway. It is, of course, very easy to just omit certain data points from a LR fit, as each data point is presumed independent.
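One way proposal 4) might look in code is sketched below. It is not the model behind the results reported in this post: the function names are hypothetical, the returns are assumed to be already split into gapless segments (for example one array per week, with the weekend bar removed), and the example data are simulated. Regression rows are built only inside a segment, so the first p bars of each segment are never used as targets and no lag reaches across a gap.

```python
import numpy as np

def lagged_design(segment, p):
    """Rows [1, x[t-1], ..., x[t-p]] with targets x[t], all taken from one gapless segment."""
    segment = np.asarray(segment, dtype=float)
    targets = segment[p:]
    lags = np.column_stack([segment[p - k: len(segment) - k] for k in range(1, p + 1)])
    return np.column_stack([np.ones(len(targets)), lags]), targets

def fit_ar_by_regression(segments, p):
    """Fit AR(p) by ordinary least squares on rows pooled across segments."""
    pairs = [lagged_design(seg, p) for seg in segments if len(seg) > p]
    X = np.vstack([design for design, _ in pairs])
    y = np.concatenate([target for _, target in pairs])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef   # [intercept, phi_1, ..., phi_p]

rng = np.random.default_rng(5)

def simulate_ar2(n, phi1=0.4, phi2=-0.2):
    """Simulated stand-in for one week of intraday returns."""
    x = np.zeros(n)
    for i in range(2, n):
        x[i] = phi1 * x[i - 1] + phi2 * x[i - 2] + rng.normal()
    return x

weeks = [simulate_ar2(400) for _ in range(30)]
intercept_hat, phi1_hat, phi2_hat = fit_ar_by_regression(weeks, p=2)
print(intercept_hat, phi1_hat, phi2_hat)
```

Pooling the rows is what makes omitting bars trivial: dropping a data point is just dropping a row from the regression.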

Here is a plot of the out-of-sample cumulative returns of one such AR model built for predicting 15-minute returns of NOKSEK, assuming midpoint executions and no transaction costs.

Whether or not one decides to use this or the other techniques for handling data gaps, it is always a good idea to pay some attention to whether a model will work over these special bars.

===

My Upcoming Workshop


This is a new online workshop focusing on the practical use of AI techniques for identifying predictive indicators for asset returns.

===

Managed Accounts Update

Our FX Managed Account program returned 6.02% in June (YTD: 31.33%).

===

Industry Update
  • I previously reported on a fundamental stock model proposed by Lyle and Wang using a linear combination of just two firm fundamentals: book-to-market ratio and return on equity. Professor Lyle has posted a new version of this model.
  • Charles-Albert Lehalle, Jean-Philippe Bouchaud, and Paul Besson reported that "intraday price is more aligned to signed limit orders (cumulative order replenishment) rather than signed market orders (cumulative order imbalance), even if order imbalance is able to forecast short term price movements." Hat tip: Mattia Manzoni. (I don't have a link to the original paper: please ask Mattia for that!)
  • A new investment competition to help you raise capital is available at hedgefol.io.
  • Enjoy an Outdoor Summer Party with fellow quants benefiting the New York Firefighters Burn Center Foundation on Tuesday, July 14th, with great food and cool drinks on a terrace overlooking Manhattan. Please RSVP to join quant fund managers, systematic traders, algorithmic traders, quants and high frequency sharks for a great evening. This is a complimentary event (donations are welcomed).
===

Follow me on Twitter: @chanep