kiramclean
Thanks for the awesome video! I was just wondering what is a good/safe way to get some production data out of a production database, if we don’t want a full backup of all the data?
Thanks for the awesome video! I was just wondering what is a good/safe way to get some production data out of a production database, if we don’t want a full backup of all the data?
Hey Kira, glad you enjoyed the video. Personally I tend to pull the full Upcase production DB regularly using Parity (I sum up this approach in the Parity section of the Heroku Weekly Iteration). If this is an option I highly recommend it to get a real picture. At a minimum, perhaps you could backup production → staging and then tinker there (after some local experimentation).
If that is not an option, you might consider building a script to generate the data. We have an example of this in the dev:prime task in Upcase, which uses FactoryGirl’s methods to aid in building structured data (with a few helper methods).
I’ve also heard of folks applying an automated anonymization script to work around compliance / privacy concerns. With this, you’d work from a copy of the production data set, but scramble any identifying data, for example replacing names with “Jane M Doe”. I don’t have any solid examples of this that I can point you to, but wanted to point it out as a third option.
@christoomey @jferris
Baller episode, thanks for putting this together.
@christoomey, @jferris - When would you want to add a NoSQL db to the mix?
I would add another database only when there’s a production issue that can’t be reasonably fixed in your primary database and that you’re sure will be fixed by introducing a new tool.
The cost of having two databases is higher than is generally appreciated:
I would suggest this litmus test: if you’re not sure whether or not to introduce another database, stick with Postgres.