Testing with a legacy database

Hi there,

I’m building an app that needs to be integrated with a legacy database where there are a lot of tables and
some of them are database views.

I configured ActiveRecord to prepend table names with the app name to keep things organized.

So we have something like this in the development database:

  • legacyTable1
  • legacyTable2
  • legacyTable30
  • legacyView1
  • legacyView2
  • app_name_foos
  • app_name_bars

The problem is when the test database is created, Rails tries to create a schema of the whole legacy database and I only need my namespaced tables and some legacy ones.

@marcelojnk I think if you have all the tables in your db/schema.rb but the app only needs a subset, there’s no way to stop Rails from creating all of them (either through rake db:schema:load or rake db:test:prepare). I’d be interested to hear a solution to this but nothing comes to mind to me personally.

@jferris or @cpytel, ever dealt with anything like this?

Actually its no only one rails app but a couple more… and they all face this challenge.

I could have created a separate database for each app but I would have to relate data across databases (I don’t think that is the right thing to do?).

My workaround is to config every app with this:

prefix = "#{Rails.application.class.parent_name.underscore}_"
ActiveRecord::Base.table_name_prefix = prefix
ActiveRecord::SchemaDumper.ignore_tables = [/^(?!#{prefix})/, /schema_migrations$/]
  1. It prepends the app name in tables created by migrations;
  2. It ignores every table that doesn’t start with the app name (legacy ones and schema_migrations) when dumping the schema.rb;

Right now I’m having problems because Rails also adds the prefix to the shema_migrations table but fails to use it when comparing migrations before running tests, so I get a message saying that I have pending migrations and my tests don’t run :frowning:

I think given what you are doing, your workaround is the best way I know.

In order to fix the schema_migrations I believe you may need to only add a table_name_prefix to the models, not ActiveRecord::Base.

That being said, what is the goal of the system you are building? Generally, multiple applications should not share a database and they should expose API’s used to integrate the applications. I would only recommend doing what you’re doing if it is temporary as part of a migration from a legacy system to Rails.

These apps share database because they share some models. How would you share models coupled to the legacy database between apps that have their own database (assuming the separate database approach)?

I started with separate databases for each app with the shared models configured to prepend their database name to their table_name in AR, but I had some trouble with associations and is it right to relate data in different databases?

Thank you all for the quick responses.

I would expose the legacy applications data as an API/service to be integrated against.

How closely are the systems integrated? What data are they reusing between them?

The data used between the apps is part of our domain model (Banking) so we end up having Client, Branch and Employee models duplicated across all of our apps because all apps are modeled around these Entities.

Lets say for example that we start an app to manage Loans. We’d probably have something like this:

class Loan < ActiveRecord::Base
  belongs_to :client
  belongs_to :branch
  belongs_to :author, class_name: Employee
end

…and another app that manages tickets of our IT department:

class Ticket < ActiveRecord::Base
  belongs_to :branch
  belongs_to :author, class_name: Employee
end

Those two apps deal with completely different stuff but they have two models in common.

I’ve moved this topic to the Rails category, as its no longer specific to testing, and is more about Rails architecture and database configuration.

Based on your description, I definitely believe that the better way to architect overall system would be to expose the data in the remote systems in an API that your other systems would use.

Obviously, that would be a big architectural change from where you’re at now, but one I would highly recommend. To get back to your original question, I think you will probably need to more explicitly only set a table prefix on your models.

You should be able to do this in your Rails config by looping through the models and setting it just on them instead of ActiveRecord::Base.

Understood.

In that case when I have, for example, a list of 30 comments to display in a view like this:

<% @comments.each do |comment| %>
  <%= comment.author_name %>
<% end %>

Then that view would make 30 API calls to get the names of the authors (assuming they are different)?

Because right now I can do this:

Controller: @comments = Comment.includes(:author)

@marcelojnk it wouldn’t make 30 API calls, the user information would likely be included in the API response for comments and/or the client would do caching of any information like that. The proper solution would likely depend on the usage characteristics of the applications.

@cpytel I guess this is one of those situations that we don’t usually find in books and examples out there. Would recommend any resources that might help me implement this solution? I’m taking the Test Driven Rails workshop and after that I’ll take the Advanced Rails one.

Also I was reading through the “Rails Recipes - Rails 3 ed” and the third recipe is about a problem exactly like mine, where you need to establish relationships with data in an external database. The book explains pretty well how ActiveRecord connections work and says that, when possible, we should consider replicating the data that is needed in the app’s database or, like you said, make it available as an API.

Do your guys face this problem a lot?

@marcelojnk: What database are you using?

I’ve worked in similar environments in the past, but I can’t say that I remember all of the specifics. The last time I did something similar, the database was Postgres. I stuck all of the data I was solely responsible for in its own postgres schema that I could blow away and recreate at will. While other applications could read this data, it was mine to maintain. I could add columns, remove columns, add tables, etc. The postgres driver understands schemas.

In my case, the legacy data was not mine to modify. I could read it, but could not write to it. I added migrations that created views of this legacy, read-only data in my postgres schema. This meant I had to switch to a SQL dump of schema rather than schema.rb, which isn’t great, but it worked.

On that project, shared data that was meant to be writeable was all handled via API calls. This allows the system most responsible for that data to implement all the business logic. Without an API, you run the risk of varying business logic implementations amongst the different apps that write to a given table.

@derekprior Thanks for your comment. I’ve heard about Posgres’s schemas but I’m using MySQL. I Think I’ll go with the Services/API approach. Today I discovered a great book on the topic: “Service-Oriented design with Ruby and Rails”. Started to read and it seem to be just what I need.