JQ Investments Stats Evaluation

Note that you are forbidden from distributing or referencing this document or any of its attachments to any third parties.
1 Introduction
Please address the questions below in a clear and concise write-up. Feel free to use tables and graphs to illustrate your findings. Please note that though you should attach your code, the write-up remains your most important presentation.
mdLog.csv contains snapshots of market data for a single security. This means you are not provided with new information each time the market changes; instead, you are given periodic, regular updates. The allocation mechanism of this market follows price-time priority: incoming orders must interact first with the opposing order at the best price and, among orders at that price, with the one inserted earliest.
Here is a list of the meanings of the columns in mdLog.csv:
• updateCount: a consecutive counter.
• lastprice: the execution price of the most recent trade in the market from the perspective of the current update.
• volume: the cumulative volume traded in the security from the beginning of the dataset to the current update.
• bid/ask: the price at the best bid / best ask.
• bidsize/asksize: the size at the best bid / best ask.
• ticksize: the minimum price increment. The ticksize is 0.01 in this case.
• spread: defined as (ask price – bid price).
• mid price: defined as (ask price + bid price) / 2.
• market size: defined as (bidsize + asksize) / 2.
• BBO: best bid and offer; in this case, the bid and ask prices.
2 Questions
How many times did the bid-ask spread widen in this dataset? What proportion of those times did it widen on both sides (bid decreased and ask increased)? What proportion of those times did it widen on one side (bid decreased xor ask increased)? How many times did the bid-ask spread tighten? Report the distribution of market size by figures and statistics.
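The classification asked for above can be sketched as follows. This is only an illustrative outline, not a reference solution: the function names, the choice to compare prices as integer tick counts, and the sample quotes are all assumptions made for the example.

```python
from collections import Counter

TICK = 0.01  # tick size given in the column list


def to_ticks(price):
    """Express a price as an integer number of ticks to avoid float noise."""
    return round(price / TICK)


def classify_spread_changes(quotes):
    """Count spread widenings and tightenings between consecutive updates.

    quotes: list of (bid, ask) tuples in update order.
    """
    counts = Counter()
    for (b0, a0), (b1, a1) in zip(quotes, quotes[1:]):
        b0, a0, b1, a1 = (to_ticks(p) for p in (b0, a0, b1, a1))
        old_spread, new_spread = a0 - b0, a1 - b1
        if new_spread > old_spread:
            counts["widen"] += 1
            bid_down, ask_up = b1 < b0, a1 > a0
            if bid_down and ask_up:
                counts["widen_both"] += 1
            elif bid_down != ask_up:  # exactly one of the two (xor)
                counts["widen_one"] += 1
        elif new_spread < old_spread:
            counts["tighten"] += 1
    return counts


# Made-up quotes, purely for illustration (not taken from mdLog.csv).
quotes = [(10.00, 10.01), (9.99, 10.02), (9.99, 10.01),
          (9.99, 10.03), (10.00, 10.03)]
counts = classify_spread_changes(quotes)
print(dict(counts))
```

The requested proportions would then be counts["widen_both"] / counts["widen"] and counts["widen_one"] / counts["widen"].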
Focus on instances in which the mid price changes but the bid-ask spread does not. For an increase (decrease) of the mid price, define the new bid (ask) as the aggressive side, and the other as the defensive side. Report the distribution of the sizes of the defensive side and the aggressive side immediately following a change in mid price, and plot both histograms in a single figure. Can you intuitively explain your results?
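One possible way to collect the two size samples is sketched below. It assumes each update is available as a dict keyed by the column names from the introduction, and that "immediately following" means the first update in which the new prices appear. The helper name and the sample rows are hypothetical.

```python
TICK = 0.01  # tick size given in the column list


def to_ticks(price):
    """Express a price as an integer number of ticks to avoid float noise."""
    return round(price / TICK)


def aggressive_defensive_sizes(rows):
    """Return (aggressive_sizes, defensive_sizes) for mid moves with unchanged spread.

    rows: list of dicts with keys bid, ask, bidsize, asksize, in update order.
    """
    aggressive, defensive = [], []
    for prev, cur in zip(rows, rows[1:]):
        pb, pa = to_ticks(prev["bid"]), to_ticks(prev["ask"])
        cb, ca = to_ticks(cur["bid"]), to_ticks(cur["ask"])
        if (ca - cb) != (pa - pb):
            continue  # spread changed, so this transition is out of scope
        mid_move = (ca + cb) - (pa + pb)  # twice the mid change, in ticks
        if mid_move > 0:    # mid rose: the new bid is the aggressive side
            aggressive.append(cur["bidsize"])
            defensive.append(cur["asksize"])
        elif mid_move < 0:  # mid fell: the new ask is the aggressive side
            aggressive.append(cur["asksize"])
            defensive.append(cur["bidsize"])
    return aggressive, defensive


# Made-up rows, purely for illustration (not taken from mdLog.csv).
rows = [
    {"bid": 10.00, "ask": 10.01, "bidsize": 100, "asksize": 200},
    {"bid": 10.01, "ask": 10.02, "bidsize": 30, "asksize": 500},
    {"bid": 10.01, "ask": 10.03, "bidsize": 30, "asksize": 80},
    {"bid": 10.00, "ask": 10.02, "bidsize": 400, "asksize": 20},
]
print(aggressive_defensive_sizes(rows))
```

The two returned lists can be passed to any plotting library to overlay the histograms in one figure.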
If, from one update to the next, volume increases from 100 to 150, we know 50 was traded between the two updates. Many trades of varying size at various prices could have happened between the two updates; we only know that the most recent trade happened at the lastprice.
1. A naive volume allocation method is to allocate all traded volume to lastprice. Please calculate the volume allocated to the last update's bid and ask. Report the distribution of trade volume allocated to the last bid and last ask by figures and statistics.
2. Come up with a better volume allocation method that allocates the traded volume to price levels. Please explain your method and report the distribution of trade volume allocated to the last bid and last ask. Notice: though only the volume allocated to the last bid and last ask is calculated, you are not constrained to allocate volume only to the last bid and last ask. For example, if the trade volume update is 50, you can allocate 20 to the last bid, 20 to the last ask and 10 to another price level.
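A minimal sketch of the naive method in item 1 follows. It reads "last update" as the previous snapshot, which is an interpretation rather than something the text states outright. The function name and the sample rows are hypothetical, except for the 100 to 150 volume step, which mirrors the example above.

```python
TICK = 0.01  # tick size given in the column list


def to_ticks(price):
    """Express a price as an integer number of ticks to avoid float noise."""
    return round(price / TICK)


def naive_allocation(rows):
    """Allocate each update's traded volume entirely to lastprice.

    rows: list of dicts with keys volume, lastprice, bid, ask, in update order.
    Returns one (volume_at_last_bid, volume_at_last_ask) pair per transition.
    """
    out = []
    for prev, cur in zip(rows, rows[1:]):
        traded = cur["volume"] - prev["volume"]  # volume is cumulative
        last = to_ticks(cur["lastprice"])
        at_bid = traded if last == to_ticks(prev["bid"]) else 0
        at_ask = traded if last == to_ticks(prev["ask"]) else 0
        out.append((at_bid, at_ask))
    return out


# Made-up rows, purely for illustration (not taken from mdLog.csv).
rows = [
    {"volume": 100, "lastprice": 20.06, "bid": 20.05, "ask": 20.06},
    {"volume": 150, "lastprice": 20.05, "bid": 20.05, "ask": 20.06},
    {"volume": 150, "lastprice": 20.05, "bid": 20.05, "ask": 20.06},
    {"volume": 180, "lastprice": 20.06, "bid": 20.05, "ask": 20.06},
]
print(naive_allocation(rows))
```

Under this reading, volume traded at a lastprice that matches neither the previous bid nor the previous ask is allocated to neither side.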
Use the naive volume allocation from Q3-(1) for all following questions. We define sizeDelta as the size added or cancelled on a level from one update to the next, net of any traded volume. For example, on update 1, the bidsize was 100. On update 2, the bid price is unchanged and the bidsize is now 70. Suppose, from part 3, that we believe a total of size 20 was traded at the bid. Then we have sizeDeltaAtBid = -10, meaning that orders totaling size 10 were cancelled between update 1 and update 2 on the bid. Note that sizeDelta is only defined for price levels that are unchanged from the previous update. Report the distribution of sizeDeltaAtBid and sizeDeltaAtAsk by figures and statistics.
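The definition above amounts to sizeDelta = current size - previous size + volume traded at the level. The sketch below checks this against the worked example. The function name and the 20.05 price are assumptions for illustration, since the example does not state a price.

```python
TICK = 0.01  # tick size given in the column list


def size_delta(prev_price, prev_size, cur_price, cur_size, traded_at_level):
    """Size added (+) or cancelled (-) at a level, net of traded volume.

    Returns None when the price level changed, since sizeDelta is undefined there.
    """
    if round(prev_price / TICK) != round(cur_price / TICK):
        return None
    return cur_size - prev_size + traded_at_level


# Worked example from the text: bidsize 100 -> 70 with 20 traded at the bid.
print(size_delta(20.05, 100, 20.05, 70, 20))   # -10
# Price level changed, so sizeDelta is undefined.
print(size_delta(20.05, 100, 20.06, 70, 20))   # None
```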
Now we will focus on the aggressive side, which is defined in Q2. Assume the size on the aggressive side we see immediately after a price change is from a single limit order, which we call the top order. We wish to track the performance of top orders. Assume cancellations always happen from the back of the queue and the order may be partially cancelled.
update | bid price | bidsize | volume allocated at 20.05 | sizeDeltaAtBid | comment
1 |  |  |  |  | assume this is the aggressive side, so the top order is of size 50
2 | 20.05 | 100 | 10 | 60 | the top order was filled 10
3 |  |  |  |  | the top order was filled another 10 for a total of 20 traded, and there must have been 10 cancelled from the top order (because the bidsize is now 20)
4 |  |  |  |  | 20 more was filled at the bid. The rest of the top order is now completely filled.
Table 1: top order lifespan

Note that all top orders are inserted at the BBO, as in a buy (sell) top order is always inserted at the best bid (ask). However, not all top orders spend their entire lifespan on the BBO. Report the total number of top orders, the number that do not spend their entire lifespan on the BBO, and the number that do. Out of those that do spend their entire lifespan on the BBO, how many are filled for their original size?
Table 1 is an example of top order lifespan.
1. Though sizeDeltaAtBid and sizeDeltaAtAsk are only defined for price levels that are unchanged from the previous update, you need to devise a method that uses the allocated volume to determine what happens to a top order when the BBO changes.
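The queue assumptions stated for top orders (fills reach the top order first, cancellations come from the back) can be sketched as a small state tracker. Everything here is illustrative: the class and field names are hypothetical, trades are assumed to be applied before size changes within an update, and the replayed numbers come from the Table 1 comments. The sizeDelta of -70 on the third update is derived from the comment that the bidsize becomes 20, and the 0 on the fourth update is an assumption.

```python
from dataclasses import dataclass


@dataclass
class TopOrder:
    remaining: int      # unfilled, uncancelled size of the top order
    behind: int = 0     # size queued behind the top order at the same price
    filled: int = 0
    cancelled: int = 0

    def step(self, traded, size_delta):
        """Apply one update: trades hit the front, then size is added or cancelled."""
        fill = min(traded, self.remaining)
        self.remaining -= fill
        self.filled += fill
        # Traded volume beyond the top order consumes the queue behind it.
        self.behind = max(self.behind - (traded - fill), 0)
        if size_delta >= 0:
            self.behind += size_delta        # new orders join the back
        else:
            cancel = -size_delta
            from_behind = min(cancel, self.behind)
            self.behind -= from_behind       # cancel from the back first
            hit = min(cancel - from_behind, self.remaining)
            self.remaining -= hit            # then from the top order itself
            self.cancelled += hit


order = TopOrder(remaining=50)           # update 1: top order of size 50
order.step(traded=10, size_delta=60)     # update 2: filled 10, 60 added behind
order.step(traded=10, size_delta=-70)    # update 3: derived from bidsize 20
order.step(traded=20, size_delta=0)      # update 4: size_delta assumed to be 0
print(order)
```

Handling a BBO change would need an extra rule on top of this tracker, which is the method the hint asks you to design.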
We define return as the signed difference between the execution price of an order and the mid price 40 market updates after the time of the final trade. Denote the mid price 40 market updates later as mid F40 and the execution price as exec price. For the bid side, the return is mid F40 – exec price; for the ask side, the return is exec price – mid F40. By convention, we normalize these returns by the ticksize. Therefore, if a buy order is all traded at 19.20, the bid 40 updates later is 19.20, and the ask 40 updates later is 19.21, then the return is 0.5. Report the average return of top orders that spend their entire lives at the BBO and are completely filled. Please round your answer to 4 decimals.
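The return definition can be checked against the worked example with a short sketch. The function and parameter names are hypothetical, and the caller is assumed to have already looked up the bid and ask 40 updates after the final trade.

```python
def normalized_return(side, exec_price, bid_f40, ask_f40, tick=0.01):
    """Tick-normalized return of a filled order, rounded to 4 decimals.

    side: "buy" for a bid-side order, "sell" for an ask-side order.
    bid_f40, ask_f40: best bid and ask 40 updates after the final trade.
    """
    mid_f40 = (bid_f40 + ask_f40) / 2
    if side == "buy":
        raw = mid_f40 - exec_price
    else:
        raw = exec_price - mid_f40
    return round(raw / tick, 4)


# Worked example from the text: buy at 19.20, later bid 19.20 and ask 19.21.
print(normalized_return("buy", 19.20, 19.20, 19.21))  # 0.5
```

For an order filled at several prices, a single exec price would have to be chosen first (a volume-weighted average is one option), which is an assumption left to the write-up.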
Shift focus to top orders that do not spend their entire lives at the BBO. Report the average returns of these orders. Please round your answer to 4 decimals. Explicitly state your assumptions regarding your treatment of these orders while they are not at the BBO, and explain why you make these assumptions.
At this point, you may tackle none, some or all of the following questions. You may answer qualitatively or quantitatively.
Focusing on top orders that spend their entire lifespan at the BBO and are completely filled, can you write a model for the return of these orders based on the size of such orders and the sizeDelta variables you see on the second update after the order was inserted? Can you improve your model via other predictors with information taken on or before the second update after the order was inserted?
Discuss the above definition of returns.
1. What are the pros and cons of using the above definition of returns as a metric for order attractiveness?
2. What are some (potentially better) alternatives?
Any other interesting observations you want to point out? What assumptions in the problems struck you as unrealistic or overly simplistic?