Summing time series with different dates?

Christine_Russo · January 6, 2024, 11:45am

Hello! Just found out about Grist and am trying a couple simple exercised to see if it’d work in our workflow.

In brief: my question appears to be the same as generating a chart in the Portfolio Performance template that tracks over time the total portfolio value across all securities, i.e. summing the three history charts into one.

I have a table containing dated entries for different categories referenced from another table; each entry records a few measurements for a different category. For example:

Table 1 contains two categories, Alpha and Beta.
On January 1st, Alpha’s quantity attribute is measured to be 5 and Beta’s is measured to be 9.
On January 4th, Beta’s quantity attribute is measured to be 2.
On January 10th, Alpha is measured to be 6.

Generating a timeseries chart for a given category and measurement is easy; but I’m having trouble how to view one summed timeseries for a given measurement among all categories, since each category does not necessarily have an entry for each timestamp. In the above example, on January 10th the chart should show a summed total of 8 (2 for Beta and 6 for Alpha), but instead it shows 6 (i.e. 0 for Beta) because Beta has no January 10th entry. I’ve fiddled with it a bit but I’m having trouble getting out of the mindset of just generating a table on the fly, which doesn’t seem doable here (unless there’s something with triggers?).

Is this possible? Ideally, I’d like to retain the ability to filter the summed chart by category since I’m interested in the sums for various category combinations. But baby steps while I get a handle on things!

Much appreciated!

dmitry-grist · January 7, 2024, 1:50am

Hello @Christine_Russo , welcome to the community!

This is indeed a bit tricky. If I understood your description correctly, you’d need to collect all dates, and for each date find the latest value of each category. The sum of those values is what should be shown.

To accomplish this in Grist, you’d need to collect all dates available for a given category (it’s a list that may have gaps), and then for any date D (including for a date inside a gap), find the nearest date in the list that’s <= D, and use the measurement point from that date.

I put together an example that takes this approach here:

https://public.getgrist.com/8GJf1cSBPfDV/Summing-Time-Series/m/fork

Categories and Measurements are tables like you describe. MEASUREMENTS [by Category] produces a list of measurement records for each category, and sorts it by date, in the Sorted by Date column. The sorting is important to keep this efficient (otherwise it’s very easy to get into quadratic complexity, where 1000 dates would require 1000^2 operations).

The MEASUREMENTS [by Date] table is where the magic happens, in this formula for All:

import bisect
latest = []
for c in Measurements_summary_Category.all:
  index = bisect.bisect(c.Sorted_by_Date, $Date, key=lambda r: r.Date)
  if index > 0:
    latest.append(c.Sorted_by_Date[index - 1])
latest

For each Date, and for each category, it searches for the nearest record with r.Date <= $Date using the Python bisect module (that’s where I take advantage of the efficiency of searching in a sorted list).

The result is that All contains all measurements to account for on a given date. Then Value, which is simply SUM($All.Value), adds up all of those. The corresponding chart is below that table. The chart that shows the original measurements for each category is in the bottom right.

Christine_Russo · January 15, 2024, 8:20pm

Thank you @dmitry-grist ! Your suggested solution works perfectly. Using intermediate tables with Python manipulation like this was counterintuitive at first but is making more sense with practice, especially with your explanation - it makes this all the more exciting! I’ll continue to play around and then see what the team thinks.