Performance Optimisation

I’ve got a performance problem with a particular calculated column on a table. It now takes at least two seconds to add a row, and it’s getting worse. It’s fairly obvious why it’s slow, but I wondered if anyone had ideas for a more efficient way of getting the same result, or whether there are back-end improvements that could be made?

The general idea is to be able to allocate assets to jobs and know if there are enough assets to cover all the jobs that are out at once.

The referenced tables (summarised, with most of the columns removed):

# This is a table of general types of assets, eg `MacBook Pro`,  `Mouse`, `IEC Power Cable` 
# ~150 rows
@grist.UserTable
class Asset_Types:
  Description_Formal = grist.Text()

  # This is how many of this asset type we own, calculated by counting Assets
  def Have(rec, table):
    return len(Assets.lookupRecords(Type=rec.id, Disposed=None))

# This is a table of instances of Asset Types, one row for each physical instance
# ~500 rows
@grist.UserTable
class Assets:

  Disposed = grist.Date()
  Type = grist.Reference('Asset_Types')
 
# This table describes load lists. A load list is a list of asset types and quantities to be taken to a job. 
# The load lists have an out date and an in date, and multiple load lists might overlap
# ~15 rows (and growing)
@grist.UserTable
class Load_Lists:

  Out_Date = grist.Date()
  In_Date = grist.Date()

  # This is the set of load list item rows that make up this list
  def Items(rec, table):
    return Load_Items.lookupRecords(List=rec.id)

  # This is the set of lists whose date range overlaps this one's (including this one)
  def Overlapping(rec, table):
    return list(r for r in Load_Lists.all if (min(rec.In_Date - r.Out_Date, r.In_Date - rec.Out_Date).days + 1) > 0)


This is the problem table (most of the columns removed); Stock is the problem column:

# These are the items for the load lists. One row describes an asset type and quantity
# ~650 rows and growing quickly
@grist.UserTable
class Load_Items:
  List = grist.Reference('Load_Lists')
  Asset = grist.Reference('Asset_Types')
  Quantity = grist.Int()

  # This compares the number of assets of this type that we have (the count of matching rows
  # in the Assets table) to the number of assets of that type in load lists whose date range
  # overlaps this one's, including this list itself.
  def Stock(rec, table):
    if not rec.Asset:
      return None
    return rec.Asset.Have - sum(r.Quantity for r in Load_Items.all if r.Asset == rec.Asset and rec.List in r.List.Overlapping)

Obviously adding a Load_Items row invalidates Stock for all other rows because it references Load_Items.all.

It’s worth mentioning that I view the Load_Items table with a SELECT BY on a Load_Lists card. Performance was acceptable when there were only one or two load lists in the system (i.e. about 100 item rows), but now it’s slow even when I’m viewing a list with only a few items. I suppose it’s recalculating all the rows in the table; it would be quicker if I could make it recalculate only the ones in view.

I also thought about hiding the Stock column and just checking it at the end, but that doesn’t make it faster; only deleting the Stock column makes it fast again.

Any suggestions?

Thanks,
Mark Daniel

Hi @Mark_Daniel, I’d say whittling away at any uses of Table.all could be the easiest win. For example here:

Rather than depending on all and then filtering it for matching assets, let Grist know that is what you want by using lookupRecords:

for r in Load_Items.lookupRecords(Asset=rec.Asset) if (rec.List in r.List.Overlapping)

As soon as you do that, Grist will know to update this cell only when assets that could affect that lookup change.
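
Put into the full formula, Stock would then look something like this (just a sketch using the column names above, not tested against your document):

  def Stock(rec, table):
    if not rec.Asset:
      return None
    # lookupRecords narrows the scan to items of this asset type up front;
    # the date-overlap test then runs over that much smaller set.
    return rec.Asset.Have - sum(
      r.Quantity
      for r in Load_Items.lookupRecords(Asset=rec.Asset)
      if rec.List in r.List.Overlapping
    )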

If you added another column Overlapping to Load_Items whose value was =$List.Overlapping (or =rec.List.Overlapping, they are the same), then you may be able to go further and replace the for ... code entirely with:

Load_Items.lookupRecords(Asset=rec.Asset, Overlapping=CONTAINS(rec.List))

I’m not sure if adding this column would be a net benefit or not, but it is an option.
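
With that hypothetical Overlapping helper column in place on Load_Items, the whole Stock formula might reduce to something like this (again untested, and the column name is just an example):

  # Helper column on Load_Items:  Overlapping = $List.Overlapping
  def Stock(rec, table):
    if not rec.Asset:
      return None
    return rec.Asset.Have - sum(
      r.Quantity
      for r in Load_Items.lookupRecords(Asset=rec.Asset, Overlapping=CONTAINS(rec.List))
    )
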
Apologies in advance if I’ve made typos or misunderstood the document structure.

Thanks for the hint. Just using lookupRecords on Asset is already hugely faster. I’ll experiment with the other part later.

It occurs to me that it’s actually rare for there to be more than one or two lists that need to be considered, and as the number of rows in the load list items table grows, the number of items in overlapping lists will stay low. So if I can trim the items to be considered early in the query (using a lookup, which seems very fast), we should be able to keep the performance fairly flat with respect to the table size, rather than it growing O(n²).
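
Something along these lines is what I have in mind (only a sketch so far, I haven’t tried it):

  def Stock(rec, table):
    if not rec.Asset:
      return None
    # Start from the (usually small) set of overlapping lists, then let
    # lookupRecords pick out only the items of this asset type in each one.
    booked = sum(
      item.Quantity
      for lst in rec.List.Overlapping
      for item in Load_Items.lookupRecords(List=lst.id, Asset=rec.Asset)
    )
    return rec.Asset.Have - booked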

Thanks,
Mark
