I'd like your opinion on when to upgrade a Heroku database. We are putting tons of money into Heroku dynos, but things are still not good.
New Relic screenshot: http://cl.ly/image/32423E2K2S2A
Every day we have to perform a big database operation: around 300k rows are updated or created. We use a very powerful server (dedicated to just this one job) for it, but while the operation runs our website barely works, and I see tons of request timeout errors. I have also included the New Relic screenshot; the database consistently takes a long time. Is this normal?
- Should I consider upgrading my database?
- If so, why? What is the thing that's missing? We are on the Heroku Crane plan.
- How would you handle 300k transactions every day?
We are using Rails 4 and Postgres 9.3.2. For big operations like bulk updates and deletes, though, we tend to use raw SQL.
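As an aside, one common way to keep a raw-SQL bulk delete from blocking the rest of the site is to chunk it into many short statements. A minimal sketch (the `events` table and `id` column are hypothetical; each statement would be run via `ActiveRecord::Base.connection.execute`):

```ruby
# Build batched DELETE statements so each one holds row locks only
# briefly instead of one huge DELETE that locks for the whole run.
def batched_delete_sql(table, ids, batch_size: 1000)
  ids.each_slice(batch_size).map do |slice|
    "DELETE FROM #{table} WHERE id IN (#{slice.join(', ')})"
  end
end

stmts = batched_delete_sql("events", (1..2500).to_a)  # 3 statements
```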
I am using Crane, the smallest of the Heroku legacy production database plans; it costs $50 per month.
Yes, we are getting about 100x more performance. Also, ActiveRecord doesn't support upsert, so I manually build the query and execute it. For the upsert, I build the kind of query described in "Insert, on duplicate update in PostgreSQL?" on Stack Overflow. This allows us to process 1,000 records per transaction.
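For reference, one common pre-9.5 pattern for this (PostgreSQL 9.3 has no `INSERT ... ON CONFLICT`) is a writable CTE that updates matching rows first and inserts the rest. A sketch that just builds the SQL; the `items` table and its columns are hypothetical, and real code must also escape values (e.g. with ActiveRecord's quoting helpers):

```ruby
# Sketch of a writable-CTE upsert: UPDATE the rows that already
# exist, then INSERT only the rows that matched nothing.
def upsert_sql(table, rows)
  values = rows.map { |r| "(#{r[:id]}, '#{r[:name]}')" }.join(", ")
  <<-SQL
    WITH new_values (id, name) AS (
      VALUES #{values}
    ),
    upsert AS (
      UPDATE #{table} t
         SET name = nv.name
        FROM new_values nv
       WHERE t.id = nv.id
      RETURNING t.id
    )
    INSERT INTO #{table} (id, name)
    SELECT id, name FROM new_values
    WHERE id NOT IN (SELECT id FROM upsert)
  SQL
end
```

Batching ~1,000 rows into each `VALUES` list keeps per-transaction overhead low; note that the Stack Overflow thread also warns this pattern can race under concurrent writers unless you lock or retry.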
Given that you are “putting tons of money in Heroku dynos” I would just test another database plan and measure if it makes a difference.
Are you executing your transactions sequentially? If you are, can / have you tried to run them in parallel?
Unfortunately that is not possible. Since I am on a legacy plan, if I upgrade my database I can't revert to my old plan, and the old plan gives us what we want for less money. Also, upgrading the database is still not an easy process; I wish they would add scaling like we have for workers and dynos. All I need is more memory, which I think is the root of the problem. One more thing: database upgrades aren't billed by the hour or day, so just a test would cost me a flat $200.
For the upsert we tried multithreading, and it wasn't good; it was much slower. Since we can now insert/update 1,000 rows per second, we don't have any concern about speed.
Sorry, I understood that you were spending more than $200 on dynos. If you eventually decide to try a new plan, you can take a snapshot of your current database and copy it to a new database without losing the current one.
With respect to parallelizing your insertions, the New Relic screenshot shows that the insertions take ~1 hour. I would expect Postgres to perform far better than that, even on your current plan.
In this post the author ends up inserting 1M records in under two minutes using 20 connections. Of course you'll get different results, due to the size of your records versus the timestamps used in the post, network latency, and the capacity of the database server. But I think it's worth experimenting on your local server to see whether it improves on your current insertion strategy.
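The shape of that experiment can be sketched as below. In the real test each thread would check out its own Postgres connection and run the insert there; `insert_slice` is a stand-in for that, stubbed here so the sketch runs anywhere:

```ruby
# Sketch: split the rows into N slices and run each slice on its own
# thread (one connection per thread in the real experiment).
def parallel_insert(rows, workers: 4, &insert_slice)
  slice_size = (rows.size / workers.to_f).ceil
  rows.each_slice(slice_size)
      .map { |slice| Thread.new { insert_slice.call(slice) } }
      .each(&:join)
end

inserted = Queue.new  # thread-safe collector standing in for the DB
parallel_insert((1..100).to_a, workers: 4) do |slice|
  slice.each { |row| inserted << row }
end
```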
I’d be interested what happens when you run it on your local development machine as well.
I never really tried that with my database, since we create/insert the data from external datasets that are several GBs, and on the Amazon server those GBs download almost instantly. I don't have internet that fast, but it is worth trying.
Thanks for such a brilliant idea; I will surely try it out.