How would you archive a subset of your db if you had to?

I have a database with 6 tables that are all joined together in one way or another. At certain points in time I want to take a subset of the records in one table(based on a criteria in that table) and 'catalog' them somehow along with the related records in the other tables.
It's somewhat hard to explain. I want to preserve a point in time in the db so that at a later point in time I can reinstate the records as they were in the past.
Is this even possible
I thought one way to do it would be to create a view(with all 6 tables) that accepts a parameter and then exports all of the records that are returned into another table. Not sure how I would rehydrate this back into the original schema.

Is there even a name for this kind of operation I'm having trouble googling it because I don't even know what a name is for what I am trying to do.

Another way I thought I could get this done would be to add a flag field to each table and then modify the flag somehow.

Yet another way could be to roll transactions back and forth but I only want to do this for a subset. I do keep my own audit table that lists changes so that might work.

Yet even another way could be to keep a lookup table telling me which version some record belongs to and then roll back based on that value. (I figure any schema I come up with would have to have a way to keep track of versions).

Any suggestions are very much appreciated




Answer this question

How would you archive a subset of your db if you had to?

  • bnn

    Sorry, I should have stated I am using SqlServer 2000

  • starnamer

    Are you using SQL Server 2005 The way to do what you want is through Integration Services.

    You can transform data between Databases, flat files, almost anything you can imagine...well mostly.

    I can provide some pointers to get you started if you'd like.



  • Jessica Mamer

    First thing you need to do is get a book on relational database theory. That theoretical basis will provide you the tools you need to do the kind of work you are talking about.  This type of database work is not really that relevant to VB coding skills, it is a different world entirely.

    For your specific problem here is one possible architecture approach:

     

    Create target data structures that are an exact match for the ones that house your source records, with one addition.  Add a datetime column to each row of each of the target structures which will be used to record the batch date and time of the extract.  Use sql in a stored procecure to extract the records and insert into the cloned data structures along with the batch date and time.  Make sure you have unique identifiers for each row of each table and make sure to extract them with each copies row.

    I would also probably normalize the target structures so that I coud easily persist info about each batch run.

    So when that is all working you now have a system to copy a "batch" of related data whenever you want. 

    Step 2 is to write another stored procedure that updates the original tables with the data from a particular "batch" of extracted data.  Obviously it will use the unique identifiers for each row that were stored when the extract is created.  This is good time to remind you to use permanently unchangeable unique idntifiers for each source row.  If they don't exist in the source data structures you will need to create them (surrogate keys).  Those identifiers will be the only way to  later match up the rows between the original and copied rows. The update sql will update every column of every target table with the respective columns from the copied "batch", using the unique identifiers to match up the rows.

    If you have multiple different sets of tables for which you want to perform snapshots, you will need one "extraction"stored procedure and one "restoring" stored procedure- unless you want to use dynamic sql which sounds like it would be tough for you.  Even if I were to use dynamic sql, I would probaby do at least one using hard coded sql first.

    Good luck!

     

     

     



  • jdarias

    I had to look at this just recently, and found an article in the Dec 2005 issue (#379) of Dr Dobb's called 'Table patterns and changing data' that identified the landscape of strategies for managing historical data.

    Hope this helps.


  • AmyKutty

    Well SQL Server 2000 has DTS ( Data Transformation Services the predecessor to Integration Services). It doesn't include as many transformations out of the box, but what you want to do seems straightforward enough to make it a trivial task.

    You can access the DTS designer in SQL Server Enterprise Management Console.



  • cfjgrind

    Thank you for that response, it was insightful.
    You chose what I what refer to as the 'extraction' methodology - copy out the records into a mirrored db structure and flag them for easy retrieval. Sounds doable but definitely results in db bloat. One of my tables has 1.2 million records and the others are less than 500 each (6 tables total). Each extraction instance is going to increase the size of the db considerably.
    Another method that I have been kicking around is what i would call the 'audit' method. I maintain a table that records edits to any of the 6 tables including datetime, former value and new value. When I want to roll back to a point in time I don't concern myself with all of the possible changes that could have occured between the current time and the time I wish to roll back to, i only concern myself with which records have changed and what was their value at the point in time I want to roll back to.
    This method does not copy records out, the only bloat comes in the audit table. The rehydration would be a series of update statements to get to a state in the past. These statements would occur on the 'outside' tables first (the one in the one to many) to preserve ref integrity.

    Does that sound like a plausible alternative also




  • How would you archive a subset of your db if you had to?