# Acyr Locatelli

## model.fit for a living

Recently I ran into the trouble of having to mimic the behaviour of pandas.merge_asof in SQL. Other databases like kdb, for example, already have this functionality built-in.

## ASOF joins

The asof joins are really useful when dealing with time-series. A typical scenario where this comes up:

We have two tables:

• Table-1 contains a collection of timestamped trades in GBP.
• Table-2 contains collection timestamped GBP/USD fx-rates.

The timestamps in the two tables might not necessarily match. If we want to merge these two tables so that we know the rate that each trade was executed at, then for each trade with timestamp t, we need to find the most recent rate that has occurred before time t in the fx-rates table.

Using pandas, this is straight-forward:

 pandas.merge_asof(trades, fx_rates, on='ts', by='currency')


To get this to work in SQL, we will need to do a bit more work. My first attempt was a bit clumsy but gets the job done – just about.

select *
from (
select f.*, t.*,
rank() over (partition by t.id order by
abs(extract(epoch from (t.ts - f.ts))) ASC, t.ts DESC
) rnk
from fx_rates f
inner join trades t on f.currency = t.currency
where t.ts > f.ts
) e
where e.rnk=1


This is not so offending if the database is very small, but even for small databases sizes we can run into trouble. The main troubles are:

1. It is very slow.
2. It cause large amounts of data to be written to disk by the subquery.

Lateral joins to the rescue.

## Lateral joins

Lateral joins were introduced in postgresql 9.3. We can use it to write a much simpler query.

The basic idea behind lateral joins is that we “tell” the query planner that the sub-query will be run many times; this allows the it to be optimised.

select *
from (
select *