Exchange order book data is one of the most foundational data types in the crypto asset industry; arguably, it is even more foundational than trades data, since two orders must be matched for a trade to occur. Order book data is useful to a variety of entities, including market makers, systematic or quantitative traders, and funds studying trade execution patterns. The Coin Metrics Market Data Feed offering includes several API endpoints that allow users to retrieve order book snapshots and updates across a collection of top crypto exchanges.
Resources
This notebook demonstrates basic functionality offered by the Coin Metrics Python API Client and Market Data Feed.
Coin Metrics offers a vast assortment of data for hundreds of cryptoassets. The Python API Client allows for easy access to this data using Python without needing to create your own wrappers using requests and other such libraries.
To understand the data that Coin Metrics offers, feel free to peruse the resources below.
The Coin Metrics API v4 website contains the full set of endpoints and data offered by Coin Metrics.
Notebook Setup
import os
from os import environ
import sys
import json
import ast
import logging
from datetime import date, datetime, timedelta
from datetime import timezone as timezone_info

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as mticker
from matplotlib.ticker import ScalarFormatter
import plotly.express as px
from plotly import graph_objects as go
from pytz import timezone as timezone_conv
from tqdm import tqdm

from coinmetrics.api_client import CoinMetricsClient
%matplotlib inline
logging.basicConfig(
    format='%(asctime)s %(levelname)-8s %(message)s',
    level=logging.INFO,
    datefmt='%Y-%m-%d %H:%M:%S'
)
now = datetime.utcnow()
last_day_date_time = now - timedelta(hours=24)
# We recommend privately storing your API key in your local environment.
try:
    api_key = environ["CM_API_KEY"]
    logging.info("Using API key found in environment")
except KeyError:
    api_key = ""
    logging.info("API key not found. Using community client")
client = CoinMetricsClient(api_key)
2024-09-16 16:02:07 INFO Using API key found in environment
Order Book Depth
Coin Metrics collects and serves three types of order book snapshots.
The first type (depth_limit=100) is a snapshot of the top 100 bids and top 100 asks, taken once every 10 seconds for major markets.
The second type (depth_limit=10pct_mid_price) includes all levels where the price is within 10 percent of the mid price, taken once every 10 seconds.
The third type (depth_limit=full_book) is a full order book snapshot (every bid and every ask), taken once every hour for all markets for which we collect order book data. All of these snapshots are served through our HTTP API endpoint /timeseries/market-orderbooks.
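The 10pct_mid_price semantics can be illustrated on a toy book. The snapshot below is synthetic (not API output), but mirrors the API's shape, where bids and asks arrive as lists of {"price", "size"} entries with string values:

```python
import pandas as pd

# Synthetic snapshot: prices and sizes arrive as strings, as in the API response
bids = pd.DataFrame([{"price": "99.0", "size": "1.5"}, {"price": "95.0", "size": "2.0"}])
asks = pd.DataFrame([{"price": "101.0", "size": "1.0"}, {"price": "120.0", "size": "3.0"}])
bids["price"] = bids["price"].astype(float)
asks["price"] = asks["price"].astype(float)

# Mid price sits halfway between the best bid and the best ask
mid = (bids["price"].max() + asks["price"].min()) / 2  # 100.0

# depth_limit=10pct_mid_price keeps only levels within 10% of the mid price
asks_within = asks[asks["price"] <= mid * 1.10]
bids_within = bids[bids["price"] >= mid * 0.90]
print(len(asks_within), len(bids_within))  # 1 2  (the 120.0 ask falls outside the window)
```

This is why the 10pct_mid_price variant can return a different number of levels on each side: the cutoff is symmetric in price, not in level count.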
def get_depth(df_orderbook, within=2):
    """
    Takes an order book snapshot as returned by the API and returns the
    cumulative quantity bid/offered at each snapshot, along with where
    the liquidity sits: how far each level is from the best bid/ask.
    """
    dfs = []
    for row in df_orderbook.itertuples():
        timestamp_ = row.time
        # asks
        asks = pd.DataFrame(ast.literal_eval(row.asks))
        asks["price"] = asks.price.apply(float)
        best_ask = float(asks.price.min())
        asks["size"] = asks["size"].apply(float) * -1
        asks["percent_from_best"] = ((asks.price / best_ask) - 1) * 100
        asks["time"] = timestamp_
        asks["side"] = "ask"
        asks["position"] = range(len(asks))
        asks["cumulative_vol"] = asks["size"].cumsum()
        asks["size_usd"] = asks["size"] * asks["price"]
        asks["cumulative_vol_usd"] = asks.size_usd.cumsum()
        # bids
        bids = pd.DataFrame(ast.literal_eval(row.bids))
        bids["price"] = bids.price.apply(float)
        best_bid = float(bids.price.max())
        bids["size"] = bids["size"].apply(float)
        bids["percent_from_best"] = abs(((bids.price / best_bid) - 1) * 100)
        bids["time"] = timestamp_
        bids["side"] = "bid"
        bids["position"] = range(len(bids))
        bids["cumulative_vol"] = bids["size"].cumsum()
        bids["size_usd"] = bids["size"] * bids["price"]
        bids["cumulative_vol_usd"] = bids.size_usd.cumsum()
        # keep only levels within the depth limit - default 2%
        asks = asks[asks.percent_from_best <= within]
        bids = bids[bids.percent_from_best <= within]
        # group into 20 equal-width bins of 0.1% (10 bps) each
        bids["grouping"] = pd.cut(bids.percent_from_best, bins=20, include_lowest=True, precision=1)
        asks["grouping"] = pd.cut(asks.percent_from_best, bins=20, include_lowest=True, precision=1)
        # collapse each bin to its total native and USD size
        bids = bids.groupby("grouping").agg({"size": [sum], "size_usd": [sum]})
        bids.index = [x / 100 for x in range(1, 201, 10)]
        bids["side"] = "bid"
        asks = asks.groupby("grouping").agg({"size": [sum], "size_usd": [sum]})
        asks.index = [x / 100 for x in range(1, 201, 10)]
        asks["side"] = "asks"
        # concat the two sides together
        bids_asks = pd.concat([bids, asks])
        dfs.append(bids_asks)
    df_liquidity = pd.concat(dfs)
    df_liquidity["time"] = df_orderbook.time.iloc[0]
    df_liquidity.columns = ["size_ntv", "size_usd", "side", "time"]
    return df_liquidity
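The core of `get_depth` is the binning step: levels within 2% of the best price are cut into 20 equal-width buckets and summed per bucket. A minimal, self-contained sketch of that step, using synthetic percent-from-best distances rather than real order book data:

```python
import pandas as pd

# Synthetic levels: distance from the best price (%) and USD size at each level
levels = pd.DataFrame({
    "percent_from_best": [0.05, 0.08, 0.35, 1.20, 1.95],
    "size_usd": [100.0, 50.0, 200.0, 75.0, 25.0],
})

# Same pattern as get_depth: cut into 20 equal-width bins; note that with an
# integer `bins`, pd.cut spans the observed min-max range of the data
levels["grouping"] = pd.cut(levels["percent_from_best"], bins=20, include_lowest=True, precision=1)
binned = levels.groupby("grouping", observed=False)["size_usd"].sum()

# The two tightest levels (0.05% and 0.08%) land in the first bin together
print(binned.iloc[0])  # 150.0
```

Collapsing levels into fixed-width buckets makes snapshots comparable over time even when the exact price levels quoted change between snapshots.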
# collapse into depth by distance from best bid/ask
print("Getting order book data for {}...".format(market))
dfs = []
for i in tqdm(range(len(df))):
    dfs.append(get_depth(df.iloc[i:i + 1]))
# get rolling 3 hour window
df_aggregated = pd.concat(dfs)
df_aggregated["pct_from_best"] = df_aggregated.index
df_aggregated.sort_values(["side", "pct_from_best", "time"], inplace=True)
df_aggregated["rolling_3hr_usd"] = (
    df_aggregated.reset_index().groupby(["side", "pct_from_best"]).size_usd.rolling(3).mean().values
)
df_aggregated = df_aggregated[df_aggregated["rolling_3hr_usd"].notnull()].copy()
Getting order book data for coinbase-btc-usd-spot...
100%|█████████████████████████████████████████████████████████████████████████████| 168/168 [02:21<00:00, 1.19it/s]
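The rolling-window step above can be sketched in isolation. With synthetic numbers (assuming hourly snapshots, as in the full_book feed), a 3-snapshot rolling mean is computed independently for each (side, pct_from_best) bucket:

```python
import pandas as pd

# Four hourly snapshots of USD depth for a single (side, bucket) pair
demo = pd.DataFrame({
    "side": ["bid"] * 4,
    "pct_from_best": [0.01] * 4,
    "size_usd": [100.0, 110.0, 120.0, 130.0],
})

# Rolling mean over 3 consecutive snapshots within each group;
# the first two rows are NaN because the window is not yet full
demo["rolling_3hr_usd"] = (
    demo.groupby(["side", "pct_from_best"])["size_usd"].rolling(3).mean().values
)
print(demo["rolling_3hr_usd"].tolist())  # [nan, nan, 110.0, 120.0]
```

Grouping before rolling ensures the window never mixes depth from different sides or distance buckets, which is why the NaN rows are then dropped with `notnull()` in the cell above.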