Query:Source from multiple tables to a Fact table

Greetings,

Iam new to SQLl2005. Iam using DTS to transfer data from my source to the warehouse. I have a couple of tables in my source whein I have to join these to tables fields and insert the same in teh warehouse fact table. I have used a Join query in my Oledb source component, What other component needs to be used to insert the data into the fact table.
I also need to extract same data with aggregation and insert the same into an another Fact table.

Kindly help.


Answer this question

Query:Source from multiple tables to a Fact table

  • Alex Filipovici

    Thanks for your input. I did not want to use a SCD wizard. But Iam getrting teh fopllowing error.

    [DTS.Pipeline] Warning: The output column "PHONE_ID" (6428) on output "Sort Output" (6275) and component "Sort 1" (6273) is not subsequently used in the Data Flow task. Removing this unused output column can increase Data Flow task performance.


  • VamsiKiran

    Thansk for your input, I did not want to use the scd wizard for thsi as it has to read 500,000 records, but Iam getting the following error

    DTS.Pipeline] Warning: The output column "PHONE_ID" (6428) on output "Sort Output" (6275) and component "Sort 1" (6273) is not subsequently used in the Data Flow task. Removing this unused output column can increase Data Flow task performance.


  • J. Amend

    I could figure out the above problem is for performance. I foolwed the example from the link.

    I have used one source from my source table and a second source from my destination table. Provided a sort functionality, which is an input to merge join and have given a left join, which is supposed to give me 1 row for insert into my destinatio, but it provides me all the rows and results in violation of primary key

    But after the merge join with left join, Iam still getting all the rows for my destination, I am supposed to get only one row.


  • anjin

    I recheked and the columns are mapped correctly with corresponding datatypes. About 70,000 rows are going to oledb command, out of which only one row and one particular column needs updation and thats not being acheived. I placed a data viewer displaying all columns on the connector connecting the default split and oledb command. It does not display any rows. Surprised, it even while inserting, Data viewer does not display any record, but displays buffered 1 record in the viewer.
  • Archie

    Then you need to find out which of the rows are new rows and which are not. There is a method for doing that documented here: http://www.sqlis.com/default.aspx 311

    Some people will suggest using the SCD Wizard to do this. My own personal opinion is that you should understand and try and use the core components (i.e. see link above) before resorting to wizards which hide you from the detail in some respects.

    -Jamie



  • unbob

    Use an OLE DB Destination Adapter for the insertion.

    Use the Aggregate component to aggregate the data.

    -Jamie



  • Deepak Sivaraman

    Are records actually going to the OLE Command If they are, then it could be that you have your parameters mapped incorrectly.

    The annoying part about performing updates via the OLE Command is that you can't used named parameter holders. They are sequentially named starting from the first instance of a . Therefore you have to be very careful about how you map your parameters. I have many Fact/Dimension updates where some 30+ columns are being updated and it is quite painfull to make sure that the right columns are mapped to the right parameters.

    You can also put a dataviewer on the connector between the conditional split and the OLE Command so that you can view the data. It might help you trace down what the error is.

    Larry Pope

  • Andreas Ames

    Iam a surprised, that the default split is not updating teh records with the sql command, it is neither giving an error
  • BenK

    Thanks Larry. It worked. With the same setup of components, how do we update a fact table
  • Slaxa

    I have used a Innerjoin sql in my source adapter and mapped the columns in my destination adapter, But there destination table already has the same data as the source, I just want a new row to be inserted, but it is trying to insert all the rows and throwing me a violation primary key error
  • Yenen

    Have you implemented a Conditional Split after the merge join with the test of ISNULL(DestKey) and labeling it NewInserts

    The default output of the Conditional Split will be the records that are in the destination table which you don't need to connect to any other transform. The NewInserts output is what you should connect to your OLE Destination.

    Larry Pope



  • RameshKandukuri

    I guess I need to go with option 1, as I cannot afford to have a dup of fact table.

    Thanks


  • Charles Levy

    Perhaps I wasn't entirely clear in my explanation of Option 2.

    It is only duplicate in the sense of schema. It is merely a temporary table in the database to hold data for a Transact-SQL bulk update. Once the real fact table is updated with the values in the temporary table, the temporary table's contents would be irrelevant.

    How you manage the temporary table is really up to you. You could create a temporary table, populate it, perform the update and drop the table. Or you could persist the temporary staging table and manage it with truncates/deletes so that you don't need to incorporate DDL statements in your Fact Processing package.

    For large numbers of updates, it is more performant to bulk load a temporary table and bulk update a persistent table than it is to perform each update one-by-one.

    Larry Pope




  • frozen12

    Using the same setup you have two options. Which option you use is really going to be a function of how many updates you need to perform.

    Option 1.
    a. Pipe the default output of the conditional split (existing records) to a OLE Command transform to perform a UPDATE statement
    b. The syntax of the update would be
    Update MyTable Set
    Col1 = ,
    Col2 =
    Where
    Key =
    c. Then you would map the parameters to your input columns. The parameter for Col1 would be the first labelled Param_0. Col2 would be Param_1 and so on.
    d. The OLE Command performs updates row by row, so for many updates it may take a long time.

    Option 2.
    a. Pipe the default output of the conditional split to a OLE Destination linked to a staging table (exact dup of fact table).
    b. Perform an SQL task in the control flow that Updates the real fact table with the values in the staging fact table.
    c. This will force the updates to be performed in bulk and for a large no of updates, will be much more performant.

    What is a large no of records Testing and performance goals will determine that. I generally go with option 2 when I'm dealing with more than 100k updates, although I'm not necessarily concerned about performance in my current project.

    Larry Pope



  • Query:Source from multiple tables to a Fact table