Caml Trading
Talk given at Carnegie Mellon University by Yaron Minsky in 2011 (but apparently posted on YouTube in 2016).
Jane Street started out using industry-standard tools like Visual Basic, Java, and C++. But after a while, they decided to use OCaml. This talk explains some of the reasons why.
Most projects that use functional languages are linked to academia, but OCaml’s use at Jane Street is entirely industrial. Incidentally, because of this connection, the pool of OCaml programmers tends to have an academic background, which can be good for recruiting purposes.
What does Jane Street do?
Jane Street is a trading company. In particular, they are a market maker. To explain what a market maker does, it helps to know the four main players in the trading market.
First, there is the investor. The market was originally created for the investor. An investor puts capital at risk in order to make money. The rational investor will typically put money in an index fund. They do not speculate about changing prices.
Second, there are speculators. They make money by trying to buy undervalued stocks and selling stocks once the market brings the stock price up to its “expected” price. The speculator relies on the market to bring the price up in a timely manner so that they can execute these buy-low sell-high transactions multiple times.
Investing doesn’t require a professional. Speculation is mixed: you can do it yourself (day trader) or have someone else do it for you (hedge fund).
The third player is the market maker. This is largely a professional role. The market maker makes money by facilitating the trades of others. To me, this is somewhat similar to the speculator because the market maker is also taking advantage of stocks that will eventually rise. They try to capture the “spread”, the positive difference between the buying price and the selling price, and they hope to do this many times. The spread is similar to the markup in the retail world or the vigorish (vig, juice, etc.) in professional gambling.
The tighter the spread, the more liquid the market. And the more liquid the market, the more transactions the market maker is able to make. The market maker doesn’t try to predict prices, at least long term. They want to collect as many spreads as possible by doing many transactions.
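The role of the spread can be made concrete with a little arithmetic. The sketch below is only an illustration: the `quote` type, the function names, and every price and quantity are made up, and it ignores fees and price movements.

```ocaml
(* Hypothetical quote: all prices below are made-up illustration values. *)
type quote = { bid : float; ask : float }

(* The spread is the gap between the selling price (ask)
   and the buying price (bid). *)
let spread q = q.ask -. q.bid

(* Gross profit from buying at the bid and selling at the ask
   [round_trips] times, [shares] at a time. Ignores fees and price moves. *)
let gross_profit q ~shares ~round_trips =
  spread q *. float_of_int (shares * round_trips)

let () =
  let q = { bid = 10.00; ask = 10.25 } in
  Printf.printf "spread = %.2f, gross profit = %.2f\n"
    (spread q)
    (gross_profit q ~shares:100 ~round_trips:40)
```

With these assumed numbers, a spread of 0.25 collected on 40 round trips of 100 shares adds up to 1000.00, which is why the number of transactions matters more than the size of any single spread.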
The fourth and final player is the arbitrageur. This is a professional who makes money by executing an arbitrage. The most straightforward is the currency arbitrage. The arbitrageur starts with money in a single currency then exchanges the money to a different currency. They may do this multiple times, but at the end of all these exchanges, they can, and hope to, have more money than they started with. That is an arbitrage.
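As a rough sketch of the currency arbitrage described above, the following chains three exchanges and compares the final amount with the starting amount. The exchange rates and the `convert_through` helper are hypothetical and chosen only so that the round trip comes out ahead.

```ocaml
(* Hypothetical exchange rates, chosen only for illustration. *)
let usd_to_eur = 0.90
let eur_to_gbp = 0.90
let gbp_to_usd = 1.25

(* Convert an amount through a chain of exchange rates, one after another. *)
let convert_through rates amount =
  List.fold_left (fun acc rate -> acc *. rate) amount rates

let () =
  let start = 1000.0 in
  let finish = convert_through [ usd_to_eur; eur_to_gbp; gbp_to_usd ] start in
  Printf.printf "start: %.2f USD, end: %.2f USD\n" start finish
```

An arbitrage exists in this toy setup because the product of the rates around the loop (0.90 × 0.90 × 1.25 = 1.0125) is greater than 1.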
Market making and arbitrage are often linked, and these two roles often play off each other to make markets more efficient (bringing prices to their expected values).
The time scale of Jane Street’s trades ranges from seconds to hours, and rarely extends beyond days. Jane Street is also proprietary. This means that they do not have customers or investors. They don’t have customers because they don’t provide a service to any particular person. And they don’t have investors because the firm provides its own capital to trade. However, Jane Street’s service is felt through the market when they trade and influence prices.
This freedom from investors and customers played a big role in allowing Jane Street to use OCaml, a relatively obscure programming language. And they try to write everything in OCaml.
Technology requirements
Broadly, Jane Street requires correctness, agility, and performance.
Correctness is extremely important. Because of the time scales, software that makes bad trading decisions can lose money very quickly. Trades also need to be fast to be most effective.
Agility is the ability to change your code quickly when circumstances change. New ways to make money, new regulations, new securities, new markets: all require changes to code.
Performance is needed because data rates are so high. You need to react, and react quickly. You need to store data for various purposes (1 TB per day in 2011). Software needs to be efficient so that you can minimize the amount of hardware and infrastructure you need. The reason isn’t that hardware is expensive; it’s that managing that hardware is. At least in 2011, they wanted to minimize the people needed to manage infrastructure and maximize the people doing other things.
Why OCaml?
Again, correctness. Jane Street has always valued high-quality software, and reading code in particular. At the start, the founders were committed to reading every line of code, partly out of paranoia. But code is also a way to express ideas and how systems work. This affected important technology decisions.
Readability is therefore important. OCaml is well suited to this because of its brevity and its types.
OCaml is able to express patterns and abstractions very concisely. This helps during code reviews. OCaml also allows you to minimize boilerplate. Boilerplate code is a sign that the language you are using somehow can’t quite capture a certain pattern or idea effectively. It gets copied and pasted. It is also very dull to read. You can’t pay people enough to review dull code. It gets glossed over. You need to review everything carefully.
Types contribute to readability and also help verify the correctness of your code. When you read code, you are informally proving that the code is doing what you think it is doing. Types allow you to offload some of that proving responsibility in an automated way. Some OCaml type features and tricks: parametric polymorphism, algebraic data types, type inference, modules, functors, phantom types, type-indexed values.
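To give a flavor of one item on that list, here is a minimal sketch of a phantom type. The `Amount` module and its functions are hypothetical and not taken from the talk; the point is only that a type parameter that never appears in the underlying data can still let the compiler reject mixed-up values.

```ocaml
(* Hypothetical module: the names below are illustrative only. *)
module Amount : sig
  type usd
  type eur

  (* The type parameter is a phantom: it appears in the type
     but not in the underlying representation. *)
  type 'currency t

  val usd : float -> usd t
  val eur : float -> eur t
  val add : 'c t -> 'c t -> 'c t
  val to_float : 'c t -> float
end = struct
  type usd
  type eur
  type 'currency t = float

  let usd x = x
  let eur x = x
  let add = ( +. )
  let to_float x = x
end

(* Adding two amounts in the same currency type-checks. *)
let total = Amount.(add (usd 10.0) (usd 2.5))

(* Amount.(add (usd 10.0) (eur 2.5)) would be rejected at compile time,
   because usd t and eur t are different types. *)
```

At run time both currencies are plain floats, so the check costs nothing; the distinction exists only for the type checker.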
An example of how types can help prove correctness is algebraic data types. You need two kinds of types: products and sums. A product type is something like a struct, where a single value holds several other values that can have different types. A sum type is a union of various types: a value of a sum type is exactly one of its cases at a time. But one key language feature makes the type system more effective: the match expression. It allows you to do case analysis with static type checks. This allows OCaml to automatically check for missed, redundant, and impossible cases. And this kind of check is needed for everything, not just complex operations.
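A small sketch of the two kinds of types and of case analysis with match follows. The `order` and `event` types are hypothetical examples, not code shown in the talk.

```ocaml
(* Hypothetical order model, for illustration only. *)

(* Product type: a record holds several fields at once. *)
type order = { symbol : string; shares : int; limit_price : float }

(* Sum type: a value is exactly one of these cases. *)
type event =
  | Submitted of order
  | Filled of order * float (* the order and its execution price *)
  | Cancelled of string (* symbol of the cancelled order *)

(* Case analysis with match. If a case were missing, the compiler would
   warn that the pattern matching is not exhaustive. *)
let describe = function
  | Submitted o -> Printf.sprintf "submitted %d %s" o.shares o.symbol
  | Filled (o, price) ->
    Printf.sprintf "filled %d %s at %.2f" o.shares o.symbol price
  | Cancelled symbol -> Printf.sprintf "cancelled %s" symbol
```

If a fourth case were later added to `event`, every match that does not handle it would be flagged by the compiler, which is the kind of automatic check described above.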
The type system also allows you to be more agile simply because the compiler can check things. It avoids boilerplate where you can miss things when copying and pasting.
Performance. For a declarative language, OCaml has good enough performance for most of Jane Street’s use cases. It has good code generation and a good garbage collector.
Problems
- Building user interfaces. Used curses.
- Concurrency. In 2011, OCaml had no physical parallelism within a single process, so using multiple cores meant running separate processes that communicate by message passing.
- External libraries. OCaml is not widely used, so there are fewer third-party libraries.
- Programming in the large. A lack of package managers and a smaller community.