No Events firing by Asana

Hi everyone, sorry for the issues!

Yes, this was caused by problems on our side :frowning_face: here’s the underlying issue:

When objects change state in Asana, the new data is written to our main databases, which hold the canonical object state. As part of this, a separate table holds a log of all objects which have had any changes since [whenever], and a request is made to a Redis cluster to record, for the API’s use, what changed on that object. After this, an asynchronous job plays forward all the events since the last time it ran: it looks up all the changed objects in the object change table, cross-references them with the information stored in Redis, and pushes these events to every event/webhook subscriber of that object that hasn’t been sent them yet. This somewhat complex flow is pretty typical for long-running jobs that we don’t want to handle in real time in our servers, so that the servers stay responsive. Other such jobs include sending out emails, sending the “consider updating your project progress” tasks, updating Inboxes, and so on.
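To make the flow above a bit more concrete, here is a toy in-memory sketch. It is not Asana's implementation: the class name `ToyEventPipeline`, its methods, and the use of plain Python dicts in place of the change table and the Redis cluster are all assumptions made for illustration.

```python
from collections import defaultdict


class ToyEventPipeline:
    """Hypothetical in-memory model of the write -> change log -> async job flow."""

    def __init__(self):
        self.objects = {}                      # canonical object state
        self.change_log = []                   # (sequence number, object id) pairs
        self.change_details = {}               # sequence number -> what changed (stand-in for Redis)
        self.subscribers = defaultdict(list)   # object id -> list of callbacks
        self.last_processed = 0                # last sequence number the job has handled
        self._seq = 0

    def subscribe(self, object_id, callback):
        self.subscribers[object_id].append(callback)

    def write(self, object_id, field, value):
        # 1. Update the canonical state.
        self.objects.setdefault(object_id, {})[field] = value
        # 2. Log that this object changed, and record what changed.
        self._seq += 1
        self.change_log.append((self._seq, object_id))
        self.change_details[self._seq] = {"field": field, "value": value}

    def run_job(self):
        """Play forward every change since the last run; return deliveries made."""
        delivered = 0
        for seq, object_id in self.change_log:
            if seq <= self.last_processed:
                continue  # already handled in an earlier run
            event = {"resource": object_id, **self.change_details[seq]}
            for callback in self.subscribers[object_id]:
                callback(event)
                delivered += 1
            self.last_processed = seq
        return delivered
```

In this toy model, if `run_job` is not called for a while, writes keep accumulating in the change log and are all delivered on the next run, which is the "play forward" behavior described above.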

What happened was that one of our background job processors got stuck due to local state (a local database on it hit an integer overflow, which put us into a strange place), and so jobs on it got “stuck”. When this happens, the job gets rescheduled, but it gets rescheduled on the same machine, which was still in a bad state, so it stayed stuck. Unfortunately, one of these jobs was the distributor job that fans out some (but not all) of our events/webhooks. The “some” part would affect only some users or apps, but this “some” is consistent, so if you were seeing problems you would consistently see problems, even with new webhooks, since your subscription would keep getting scheduled for the same stuck job.

We’ve since forcibly kicked these jobs to a new machine, which means we should be sending out events and webhooks again. Sorry for the issues here; this is one of those “We didn’t plan for this failure mode” things that we’ll run our postmortem process on and try to fix for the future.

Thanks for the update. Will the webhooks that have been missed be “re-played” at some point?

If I understand correctly what exactly went wrong, I believe they will be. There are a lot of moving parts here, so there are things that might cause this to be untrue, but, well, perhaps the folks on this topic can confirm/deny that they’re getting a flood of fast-forward events recently/now.

If this isn’t true, then I’m afraid the only recourse is to re-scan the state of all of the resources you might have missed - that is, for every task or project you had a webhook/event attached to, re-get its state and compare that to the old state to see what has changed, which isn’t great :cry: but is one way to recover.
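As a rough illustration of the "compare to the old state" step, here is a minimal sketch. It assumes you already keep a snapshot of each resource as a dict keyed by resource ID; the function name `diff_snapshots` and the snapshot shape are hypothetical, not part of any Asana API.

```python
def diff_snapshots(old, new):
    """Compare two {resource_id: {field: value}} snapshots.

    Returns (added_ids, removed_ids, changed), where changed maps a
    resource ID to {field: (old_value, new_value)} for differing fields.
    """
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = {}
    for resource_id in set(old) & set(new):
        before, after = old[resource_id], new[resource_id]
        fields = {
            field: (before.get(field), after.get(field))
            for field in set(before) | set(after)
            if before.get(field) != after.get(field)
        }
        if fields:
            changed[resource_id] = fields
    return added, removed, changed
```

For example, if a stored task had `completed` set to `False` and the freshly fetched copy has it set to `True`, the task's ID shows up in `changed` with `{"completed": (False, True)}`, which can then be handled the same way the missed webhook would have been.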

So far I am not getting a flood of back-logged webhooks. But my webhooks are working again since about 12:46pm CST. I’ll update if I see a change.

Matt,
Thanks for the very descriptive response. It goes a long way to build understanding and empathy. I appreciate the quick response to the thread and it’s good to know you were able to reconcile why “some” were and “others” were not working.

Thanks for the update, Matt.

Thanks, @dannyramirez11. It sounds like everyone who was missing webhooks might need to re-scan the state of anything they might have missed. Sorry about that, everyone :confounded:

Hi Team,

I faced the same problem again today. However, it only lasted 15-20 minutes.

Hi,
Events are not triggering. Please share an update on the downtime so that we can schedule accordingly.

Thanks

Unfortunately, the official response (and the same is stated in the docs) is that webhooks are never 100% reliable. You will always have to do a manual sync if you want to be sure that you didn’t miss any events.

Like but not like :stuck_out_tongue:

Hi
Can you tell me how I can scan to get events from the past? The sync token used in the Events API only gives events from after the time the sync token was generated.

Hey @Arun2,

You cannot get events from the past.

You will need to refetch the tasks of the hooked project and update them as well as the project itself.
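A refetch could look roughly like the sketch below. It assumes the REST endpoint `GET /projects/{project_gid}/tasks`, offset-based pagination via `next_page.offset`, and the `opt_fields` names shown; please check these against the current API reference before relying on them. The helper names (`http_get_json`, `fetch_project_tasks`) and the injectable `get_json` parameter are my own additions for illustration.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://app.asana.com/api/1.0"


def http_get_json(url, token):
    """GET a URL with a bearer token and return the decoded JSON body."""
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_project_tasks(project_gid, token, get_json=http_get_json):
    """Return {task_gid: task} for every task in the project, following pagination."""
    tasks = {}
    offset = None
    while True:
        # Field names are assumptions; request whichever fields you need to compare.
        params = {"limit": 100, "opt_fields": "name,completed,modified_at"}
        if offset:
            params["offset"] = offset
        query = urllib.parse.urlencode(params)
        page = get_json(f"{API_BASE}/projects/{project_gid}/tasks?{query}", token)
        for task in page["data"]:
            tasks[task["gid"]] = task
        next_page = page.get("next_page")
        if not next_page:
            return tasks
        offset = next_page["offset"]
```

Calling `fetch_project_tasks("<project gid>", "<personal access token>")` would give you the current tasks keyed by GID, which you can then use to overwrite or reconcile your local copies.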
