find_or_create_by race conditions

In my application, I communicate with a third party API to build up user records. I also use the same API to update users.
As such, I use find_or_create_by to either create a new record or update an existing one. However, it’s possible (as I’m sure some of you are aware) to mistakenly create multiple records if multiple requests are made simultaneously for a user which does not yet exist.

How would you guys recommend handling such a situation? I can obviously add a unique index on the column to prevent duplicates, but how do you recover from a situation where multiple requests are sent? Do you rescue from the error and try again?

What’s the best practice here? If I just leave it, the user will get an ugly error page, which is far from ideal.

I use a background worker to process the API calls and handle the creation/updating of new users. What about keeping track of which jobs are running - so we avoid duplicate jobs? But would there still be a race condition even then?

Thoughts? This is a little frustrating.

If you can enforce a uniqueness constraint (a unique index) on the same attributes that you use as your find criteria, you can rescue from ActiveRecord::RecordNotUnique and retry. It’s fairly safe to rescue from that exception, since it is only raised when another process saved a matching record between your find and your create. On the retry the exception shouldn’t recur, because .find_or_create_by now finds that record.
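Here is a rough sketch of that rescue-and-retry pattern. The helper name `find_or_create_with_retry` and the `max_attempts` cap are my own inventions for illustration, not anything from Rails; it assumes `model` is an ActiveRecord class with a unique index on the columns in `attrs`.

```ruby
# Hypothetical helper: retry find_or_create_by! when a competing process
# wins the race and the unique index rejects our INSERT.
def find_or_create_with_retry(model, attrs, max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    model.find_or_create_by!(attrs)
  rescue ActiveRecord::RecordNotUnique
    # Someone else inserted the row between our SELECT and our INSERT.
    # On the next pass the SELECT should find it, so try again.
    retry if attempts < max_attempts
    raise # give up after max_attempts so we never loop forever
  end
end
```

You would call it as something like `find_or_create_with_retry(User, external_id: 42)`, where `User` and `external_id` stand in for your own model and unique column. The cap is only there so that an unexpected, persistent failure surfaces as an error instead of spinning.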

If I want my process to always update the record with some non-finder attributes even if it already exists in the database, I use .find_or_initialize_by followed by #update! (or #update_attributes! on older Rails; it was removed in Rails 6.1). That way a new record is written once, instead of being created and then immediately updated.
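As a minimal sketch of that approach, assuming a unique column called `external_id` and a hash of fields from the API called `api_attrs` (both names are made up for the example, as is the helper itself):

```ruby
# Hypothetical helper: look the record up by its unique key, then write the
# remaining attributes in a single save.
def upsert_user(model, external_id, api_attrs)
  # Returns the existing row, or an unsaved instance with external_id set.
  user = model.find_or_initialize_by(external_id: external_id)
  # One INSERT for a new record, one UPDATE for an existing one.
  user.update!(api_attrs)
  user
end
```

The INSERT here can still lose the race and raise ActiveRecord::RecordNotUnique, so the same rescue and retry applies around this call.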

If you can’t enforce uniqueness on the thing you’re creating, it gets harder. Maybe the would-be unique value is a timestamp rather than a date, for example, so the second submission ends up with a timestamp one second after the first and slips past the constraint. In that case you might have to check at the end of your method whether a matching record was already saved in the last X minutes, or build a job that looks for double submissions. Either way, dealing with that problem is probably going to be messy.

That sounds like an excellent solution - and I am definitely able to enforce uniqueness on the thing I’m creating.

But it also has an association. I was originally taking the parent object, and if the association exists (if parent.child), we update the association. If it doesn’t, then we create a new association.

It seems like there would also be a race condition there.

Would you recommend wrapping and rescuing on RecordNotUnique there as well? The only downside I see is that if this were to occur, then the parent object would need to be updated to reflect that the child association exists, or it would just sit there throwing RecordNotUnique all day.

I would rescue and retry from a method further up the chain, the one that triggers the create/update of both the parent and the association. Each retry then looks everything up again from the database, so you won’t get stuck in a loop.
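Something like the following is what I have in mind. It is only a sketch: `sync_user_and_profile`, the `profile` has_one association, `external_id`, and the `max_attempts` cap are all hypothetical names, and it assumes unique indexes on both the parent's key and the child's foreign key so that a lost race actually raises.

```ruby
# Hypothetical outer method: the rescue wraps BOTH the parent and the child
# work, so every retry starts from a fresh lookup.
def sync_user_and_profile(user_model, external_id, user_attrs, profile_attrs, max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    # Fresh parent instance on each attempt, so nothing stale is cached.
    user = user_model.find_or_initialize_by(external_id: external_id)
    user.update!(user_attrs)

    # The association is re-read here, so a child created by a competing
    # job is found on the retry instead of being built a second time.
    profile = user.profile || user.build_profile
    profile.update!(profile_attrs)
    user
  rescue ActiveRecord::RecordNotUnique
    retry if attempts < max_attempts
    raise # surface the error instead of looping indefinitely
  end
end
```

Because the parent is looked up again inside the retried block, there is no in-memory parent left believing the child is missing, which is the situation that would otherwise keep throwing RecordNotUnique.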