Transaction Cost Analysis using Machine Learning
by Tom Finke, OneMarketData
Two hot topics in quantitative finance today are machine learning and transaction cost analysis, and this article will cover them both. With transaction cost analysis (TCA), the goal is to determine whether trades are executed at favorable prices: low for purchases, high for sales. Machine learning techniques can help with this analysis, both pre-trade and post-trade.
Take implementation shortfall (also known as slippage), a post-trade component of TCA that measures the difference between the prevailing price when a trade decision is made and the final net execution price. By joining orders with their fills and comparing order, execution, and market prices, one can calculate slippage. An appropriate machine learning (ML) algorithm can take this analysis a step further. A support vector machine (SVM), for example, is an ML algorithm that classifies data into discrete categories. Given a set of historical training data, input features such as order placement time, execution time, order duration, and quantity, along with the volume-weighted average price (VWAP) of the execution, can be used to train an SVM model to predict whether placing an order for a particular security at market open or at market close will minimize slippage.
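The shortfall calculation and SVM classification just described can be sketched roughly as follows, using scikit-learn. Everything here is synthetic and illustrative: the feature names, the distributions, and the assumed open-versus-close slippage difference are invented for the example, not observed market behavior.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)
n = 500

# Synthetic order/fill data (all names and distributions are illustrative):
decision_px = rng.uniform(99.0, 101.0, n)   # prevailing price at decision time
placed_at_open = rng.integers(0, 2, n)      # 1 = placed at market open, 0 = at close
side = rng.choice([1, -1], n)               # 1 = buy, -1 = sell
duration_min = rng.uniform(1.0, 120.0, n)   # order duration in minutes

# Assume (purely for illustration) open placements slip ~1 bp less on average.
slip_bps = 3.0 - 1.0 * placed_at_open + rng.normal(0.0, 1.0, n)
exec_vwap = decision_px * (1 + side * slip_bps / 1e4)

# Implementation shortfall in basis points, signed so that positive = cost
# for both buys and sells.
shortfall = side * (exec_vwap - decision_px) / decision_px * 1e4

# Binary label: did this order beat the median shortfall?
y = (shortfall < np.median(shortfall)).astype(int)
X = np.column_stack([placed_at_open, duration_min / 120.0])

# Train an SVM classifier and check in-sample accuracy.
clf = SVC(kernel="rbf").fit(X, y)
train_acc = clf.score(X, y)
```

Because the synthetic data gives open placements lower slippage, the fitted model tends to associate `placed_at_open` with favorable executions, which is the kind of open-versus-close prediction described above.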
Another type of post-trade TCA is searching an order book for anomalous orders, for example to detect toxic order flow, where trading counterparties manipulate prices in their favor; market participants may want their execution algos to avoid trading against this sort of liquidity. One could perform such analysis using supervised machine learning, where explicit training is required, following these general steps: first, generate a training data set in which orders are classified as toxic based on a proprietary set of input features; next, train a machine learning model (such as an SVM) on those features; then acquire an out-of-sample test data set and predict each order's type using the trained classification model. Predictions can be evaluated for accuracy, and the input parameters adjusted until accuracy reaches a desired level. Part of the art of setting up a machine learning analysis is selecting the right features for training the model; in this case, likely candidates include how frequently an order was amended, the average time between amendments, the average order size, and the order-to-cancel ratio. Other ML algorithms could also perform the classification, such as decision trees or deep learning methods like multi-layer neural networks.
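The steps above can be sketched end-to-end on synthetic data. The labels stand in for a proprietary toxicity classification, and the feature distributions are invented so that "toxic" orders amend more often, amend faster, and cancel more; none of this reflects real order-flow statistics.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(7)
n = 1000

# Step 1: a labeled training universe. ~30% "toxic" (illustrative stand-in
# for a proprietary labeling rule).
y = (rng.random(n) < 0.3).astype(int)

# Candidate features from the article, with invented distributions:
amend_count = rng.poisson(5 + 20 * y)                          # amendments per order
avg_amend_gap_s = rng.exponential(30.0 / (1 + 4 * y))          # seconds between amendments
cancel_ratio = np.clip(rng.normal(0.4 + 0.4 * y, 0.1), 0, 1)   # order-to-cancel ratio
avg_size = rng.lognormal(6.0, 0.5, n)                          # average order size (uninformative here)

# Crude rescaling so no single feature dominates the SVM's distance calculations.
X = np.column_stack([
    amend_count / 40.0,
    avg_amend_gap_s / 60.0,
    cancel_ratio,
    avg_size / 1000.0,
])

# Steps 2-4: train on one sample, hold out another, predict, and score.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)
clf = SVC(kernel="rbf").fit(X_train, y_train)
test_acc = clf.score(X_test, y_test)
```

The held-out accuracy is the evaluation step described above; in practice one would iterate on the feature set and model parameters until that figure is acceptable.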
When discussing machine learning it is important to remember normalization: the practice of rescaling data values measured on different scales to a common scale, so that no single input overwhelms the others during the ML calculations. In the examples above, before training and prediction, each input feature would be run through a normalization function that might, for example, map its values into the range between -1 and +1.
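A minimal min-max normalization function mapping each feature into [-1, +1] might look like the sketch below (the sample quantities and durations are invented; a production system might instead use a library scaler such as scikit-learn's MinMaxScaler):

```python
import numpy as np

def normalize_minmax(x, lo=-1.0, hi=1.0):
    """Linearly rescale a 1-D feature array into the range [lo, hi]."""
    xmin, xmax = x.min(), x.max()
    if xmax == xmin:
        # A constant feature carries no information; map it to the midpoint.
        return np.full_like(x, (lo + hi) / 2.0, dtype=float)
    return lo + (x - xmin) * (hi - lo) / (xmax - xmin)

# Two features on very different scales (illustrative values):
quantities = np.array([100.0, 2500.0, 50000.0, 900.0])   # shares
durations = np.array([0.5, 12.0, 3.0, 45.0])             # minutes

q_n = normalize_minmax(quantities)
d_n = normalize_minmax(durations)
```

After normalization both features span the same [-1, +1] range, so a 50,000-share quantity no longer dwarfs a 45-minute duration in the model's arithmetic.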
Quant shops today are experimenting with ML analytics for TCA, but the application of these techniques is still in its infancy. As this specialized field matures, we can expect ever more sophisticated implementations. For example, the output of post-trade TCA could feed into the inputs of pre-trade TCA to minimize transaction costs and help achieve best execution, perhaps even with intraday data in real time, using an applicable ML algorithm such as stochastic sub-gradient descent or a convolutional neural network. The future of ML in TCA will certainly be exciting and worth watching for quant finance professionals.
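As a rough sketch of the real-time idea, scikit-learn's SGDClassifier trains a linear model by stochastic (sub-)gradient descent and supports incremental updates via partial_fit, which suits streaming intraday data. The mini-batches below are synthetic, and the relationship being learned is invented purely for illustration.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(1)

# Linear SVM objective (hinge loss) optimized by stochastic sub-gradient descent.
clf = SGDClassifier(loss="hinge")

# Simulate intraday mini-batches arriving throughout the session; in a real
# system each batch would hold fresh post-trade TCA features and outcomes.
for _ in range(50):
    X_batch = rng.normal(size=(32, 3))
    # Invented rule: outcome driven by the first feature plus a little noise.
    y_batch = (X_batch[:, 0] + 0.1 * rng.normal(size=32) > 0).astype(int)
    clf.partial_fit(X_batch, y_batch, classes=[0, 1])

# Evaluate the incrementally trained model on fresh synthetic data.
X_eval = rng.normal(size=(200, 3))
y_eval = (X_eval[:, 0] > 0).astype(int)
acc = clf.score(X_eval, y_eval)
```

Each partial_fit call updates the model without revisiting earlier data, which is what makes this style of training feasible on an intraday stream.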
To learn more about OneTick please visit us HERE