First, this is the wrong section of the community; you want this one: monday Apps & Developers - monday Community Forum
One thing you can do: make the first request with items_page, then for subsequent requests use next_items_page, which skips resolving the whole board and returns items directly. This is faster than re-resolving the board each time.
Depending on the size of your boards, you could also return just the item ID for each item (fetching the full limit of 500 is fine when it's just the ID; low risk of errors there), then spin up workers that fetch batches of 100 of those 500 IDs with a simple items
query (not inside board.items_page). You can fetch the second page of IDs while the workers are fetching item details. Since you don't want an infinite pool of workers anyway, your only performance "cost" is the short delay of fetching the first page of IDs. I suspect getting your second page of IDs will take less time than fetching 100 items, so at that point your queue is unlikely to drain.
The advantage of the second strategy over your current one is that you also don't need any of your logic for junk requests or for cancelling at the end - you'll only be fetching valid items, so it may be simpler in that regard.
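A rough sketch of that two-stage idea (shown here in Python asyncio; the fetch functions are fake stand-ins for the real monday.com API calls, and the cursor arithmetic only simulates pagination - adapt to your platform):

```python
# Sketch of the two-stage pipeline: one producer walks the ID cursor,
# a pool of workers fetches item details in batches of 100.
# fetch_id_page / fetch_items are HYPOTHETICAL stand-ins for real API calls.
import asyncio

PAGE_SIZE = 500   # IDs returned per cursor page
BATCH_SIZE = 100  # items requested per details query

ALL_IDS = [str(n) for n in range(1, 4001)]  # simulate a ~4000-item board

async def fetch_id_page(cursor):
    """Stand-in for items_page / next_items_page: returns (ids, next_cursor)."""
    start = cursor or 0
    page = ALL_IDS[start:start + PAGE_SIZE]
    nxt = start + PAGE_SIZE
    return page, (nxt if nxt < len(ALL_IDS) else None)

async def fetch_items(ids):
    """Stand-in for the items(ids: [...]) query: returns item dicts."""
    return [{"id": i} for i in ids]

async def run_pipeline(workers: int = 5):
    queue: asyncio.Queue = asyncio.Queue()
    results = []

    async def worker():
        while True:
            batch = await queue.get()
            if batch is None:        # sentinel: no more work
                queue.task_done()
                return
            results.extend(await fetch_items(batch))
            queue.task_done()

    tasks = [asyncio.create_task(worker()) for _ in range(workers)]

    # Producer: queue 100-ID batches as each page of IDs arrives, so the
    # workers fetch details while the next page of IDs is being fetched.
    cursor = None
    while True:
        ids, cursor = await fetch_id_page(cursor)
        for i in range(0, len(ids), BATCH_SIZE):
            queue.put_nowait(ids[i:i + BATCH_SIZE])
        if cursor is None:
            break

    for _ in tasks:
        queue.put_nowait(None)       # one sentinel per worker
    await queue.join()
    await asyncio.gather(*tasks)
    return results

fetched = asyncio.run(run_pipeline())
```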
This is a great idea, thanks. The primary board is not absurdly big, but it's still fairly large at ~4000 items. I definitely want to try your second idea, and I like that it eliminates the extraneous requests.
I much appreciate you finding this in the wrong section and helping out.
Glad that makes sense to you.
I'd combine both strategies: next_items_page with just IDs will be faster and also costs less API complexity than items_page, so you will hopefully be able to make more requests per minute. (I assume you may run into complexity limits.)
Yes, that is what I ended up doing. The initial board query spawns a request for each items_page, fetching item IDs only and marching through the cursors 500 IDs at a time. These requests then spawn batches of items requests, each asking for 100 explicit item IDs. All of these requests are fed into a worker pool asynchronously. The cursor requests stay just ahead of the item-details requests.
Overall, it is faster than the method I had before, and I think much cleaner as well logically. Thanks for the suggestion!
Was this in Node.js? Just curious about platform details. I also couldn't tell if you used next_items_page to get subsequent pages.
It's in Python, all async/await, no threads. Yes, I'm using next_items_page. The first query requests a board_id with info about groups, columns, and 500 items via items_page:
boards (ids: [{boards}]) {{
name
id
columns{{
id
title
type
}}
groups {{
id
title
}}
items_page (limit: {limit}){{
cursor
items {{
id
}}
}}
}}
Then subsequent requests are spawned off via next_items_page:
next_items_page (limit:{limit}, cursor: "{cursor}") {{
cursor
items {{
id
}}
}}
Meanwhile, the item IDs retrieved from items_page in the first request, and from the next_items_page requests, get fed directly into queries to the items endpoint:
items (ids: [{items}], limit: {limit}) {{
id
name
group {{ id }}
column_values {{
id
text
value
}}
}}
Ignore the double curly braces; they come from the string templates I made for my queries in Python. They're needed to escape the braces, since Python uses single braces for template variables.
Yup, I get the {{}} part. That said, I'd research GraphQL variables. You can create a plain (not template) string and just pass a variables object along with the query, and the server takes care of substitution. Saves you all the heartache of trying to escape and quote things.
But I'm not sure your use case is big enough to warrant this, since your only variable data is the boards and items, and boards is just a single ID.
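For illustration, here's roughly what the variables approach looks like (the query shape mirrors the items query above; the `[ID!]` variable type and payload layout are assumptions to check against the monday docs - only the JSON body construction is shown, not the HTTP call):

```python
# Sketch: the items query rewritten with GraphQL variables instead of
# Python string templates. No brace escaping needed; the server does the
# substitution. The [ID!] type is an assumption to verify against the docs.
import json

ITEMS_QUERY = """
query ($ids: [ID!], $limit: Int) {
  items (ids: $ids, limit: $limit) {
    id
    name
    group { id }
    column_values { id text value }
  }
}
"""

def build_payload(item_ids, limit):
    # POST this dict as JSON; $ids and $limit are filled in server-side.
    return {"query": ITEMS_QUERY, "variables": {"ids": item_ids, "limit": limit}}

payload = build_payload(["123", "456"], 100)
body = json.dumps(payload)  # the JSON body you'd send to the API endpoint
```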
FYI, right now IDs are numeric strings from the API and can be provided as an Int in the query. But I wouldn't count on this forever - I suspect IDs will eventually be required to be true strings. May as well write with that assumption in mind today.