Merge Replication - Weird Conflicts for no reason

Hi. I am using Merge Replication on Sql Server 2005 (which works much better than 2000 by the way) and everything is working great except occassionally I will get some conflict rows for no reason.

The same column(s) of the same row was updated at both 'PublishingServerName.DatabaseName' and 'SubscribingServerName.DatabaseName'. The resolver chose the update from 'PublishingServerName.DatabaseName' as the winner.

I checked our audit table and in every case there was no one updating the conflicting row(s) on the publisher that day. I have had to submit the losing row (which is the suscbriber) in every case. I am using COLUMN level tracking which means that even if someone DID update the same row on the publisher and the subscriber at the same time, it should not result in a conflict unless they updated the same columns. As I mentioned earlier, no one is updating these rows but yet we are sometimes getting column conflicts anyway. Does anyone know a possible reason why We don't have any triggers on the table at all.

Is there some adjustment I can make to the merge profiles to make it better

Thanks,
 Nate



Answer this question

Merge Replication - Weird Conflicts for no reason

  • Marius T.

    Ok, just to recap:

    • A subscriber update is conflicting with an update at the publisher.
    • The values being updated at the publisher are default values that would be used on an insert.
    • This only happens on one table.

    Since merge replication doesn't know anything about default values for a column, my only guess is that it would be replicating the subscriber update as some sort of insert/update, i don't see how that's possible, but i can ask around anyways.

    Last question before I ask you to email me a bunch of info:

    Let's assume for a sec someone else is making changes at the publisher.  Can you run a trace at the publisher to capture who is connecting   To make it non-intrusive so it doesn't affect performance, you can save the trace to a table, and capture maybe just the audit:login, audit:logout events, or whatever events display the Application Name.  Replication has specific Application Name that should distinguish itself from other connections.

    ALso, I'm guessing your data is partitioned   Would you be updating a primary key column that could move it out of one partition and into another

     


  • nunuk

    So far we have 22 subscribers. Yes it is possible but unlikely that should ever happen. I know that has not been the cause so far because our audit table saves all the sql commands and the server that it was originally ran against. In each case only 1 subscriber is updating the row(s) in question.

  • nougat

    Good question Greg. We are doing our auditing with our vb application. That being the case, it is possible that there is some update occuring that is not being audited by our application. If that is the case, I have no idea what it could be! The only other process accessing this database is replication itself!

  • Deepak_SQL

    There are a couple things you can try to do, although I imagine it'll be very hard to do since the problem happens sporadically:
     
    - Check the merge history table on the distributor for other error messages.
    - Profile the sync and look for user errors applying change on the publisher/subscriber
    - Increase debug log severity for the merge agent to 4 and send us the log files or try to trace the rowguid in the log file
     
    The above can help pinpoint where the problem is, although it will involve some hands-on investigation.  Should the original problem cause non-convergence or loss of data, then I suggest calling CSS, otherwise the problem you're experience is not one we've seen yet.
     

  • Alex Chertov

    Thank you for your comment Greg. Yes, the conflict viewer shows the values for the columns that are conflicting. There are two bigint columns and a datetime column that supposedly have been updated both on the publisher and the subscriber. However, in each case the publisher values are 0, 0 and 9/9/9999 which are the default values that those columns are initialized with when the row is created. The subscriber values are whatever value the user inputted to our vb application at the subscribing server (which I verified using our audit table).

    Only one update was really made, and that was on the subscriber. All three columns are updated in one update statement on the subscriber.

    Any ideas

    P.S. We used merge replication with this application with Sql Server 2000 and didn't encounter this specific problem. (We had other problems related to dri (declared referential integrity) and merge processing order. Those have thankfully been fixed in Sql Server 2005.) I bring this up to make the point that the problem doesn't lie with our application. By the way, compatibility level of the database and the publication is set to 90RTM. We are using Sql Express on our subscribers.

  • ZSTRCZ

    Hi Nate, I'd like to get more information about your scenario, could you email me the following information:

    1. script out your publication
    2. script out the schema of the user table in question (the one with the majority of the conflicts)

    You can send it to the address in my profile, just take out the "online" and obvious thing at the end.

    There's a good chance this is a new bug in SQL 2005, and I'd like to get to the bottom of this problem.


  • DotFrammie

    How many subscribers do you have   Is it possible that two subscribers are trying to update the same row
  • enid1229

     Nate Cook wrote:
    I checked our audit table and in every case there was no one updating the conflicting row(s) on the publisher that day.

    ...

     We don't have any triggers on the table at all.

    Just out of curiousity, if you don't have triggers on the tables in question, how are you doing your audits

     


  • Steve Wertz

    The conflict viewer should show you the values for the columns for both publisher and subscriber.  What columns are conflicting, and what values are being submitted for both the publisher and subscriber
  • Roberto B

    To help any others seeing this problem I wanted to update this forum:

    Thanks to the diligent help from Greg and his team @ Microsoft: The problem/bug is using getdate() in your subset_filterclause of your table filters. Use a workaround, such as having a column called Active in your table that you update manually with a batch job.

  • Aryam

    I just sent you the email. Please let me know if you don't get it.

    Thank you!
    Nate

  • MJC_Eagle

    I should have mentioned that all of the "false alarm" conflicts are occuring on the same table. Also, I just got a conflict a minute ago on that table where all the values on the publisher (winning side) and subscriber (losing side) were identical! Strange...

  • UtterMan

    I used this query on the distribution database to get the history where conflicts occurred. I have these results saved in a csv file that I can happily email to you or anyone else that would be willing to examine it.< xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" />

     

    select * from MSmerge_history where comments like '%conflict%' and comments not like '%0 conflict%'

     

    Is this the right query or should I not restrict it to conflicts only

     

    -----

    You’re right profiling is going to be tough since it is intermittent.

    ---------

     

     When you say “increase debug log severity to 4” I’m not sure what you mean. I looked at the merge agent profile I see HistoryVerboseLevel (set to 2 – the default) and OutputVerboseLevel (which is not specified in the profile – so uses default of 2 which is the highest level). How do I create the merge log file (I’ve never done that before)

     

    By the way I need to correct myself: The conflicts are NOT limited to a single table. However, the MAJORITY of the conflicts thus far have been on one table.

     

    So far the problem has not caused non-convergence or loss of data! Every problem so far seems to be showing up in the conflict viewer and is easily resolved by clicking “submit loser”. It is more of an annoyance at this point, but as the number of subscribers grow it is going to become a maintenance headache if we don’t resolve it.

     

    Thanks for your continued help on this important problem!

     

    Here is a subset of the merge history table (limited to a single session rather than to rows with the word conflict in them). As you can see it's not very helpful unfortunately. (Also it's hard to read in this format. I would be glad to email this and more using excel.)

     

    session_id agent_id comments error_id timestamp updateable_row time
    70238 19 Merge completed after processing 463 data change(s) (206 insert(s), 257 update(s), 0 delete(s), 1 conflict(s)). 0 0x000000000020090B 0 41:39.8
    70238 19 Enumerating deletes in all articles (generation batch 2) 0 0x00000000002007CD 1 40:57.2
    70238 19 Downloaded 6 change(s) in 'PatientAccounts' (1 insert, 5 updates, 1 conflict): 6 total 0 0x0000000000200711 0 40:30.3
    70238 19 Enumerating inserts and updates in article 'PatientAccounts' (generation batch 1) 0 0x0000000000200690 1 40:16.1
    70238 19 Enumerating deletes in all articles (generation batch 1) 0 0x0000000000200649 1 39:58.4
    70238 19 Enumerating changes in filtered articles using Subscriber's assigned partition ID. 0 0x0000000000200647 0 39:50.7
    70238 19 Downloading data changes to the Subscriber 0 0x0000000000200645 1 39:49.5
    70238 19 Enumerating inserts and updates in article 'PatientAccounts' 0 0x0000000000200629 1 39:31.7
    70238 19 Enumerating deletes in all articles 0 0x00000000002005F8 1 39:24.3
    70238 19 Uploading data changes to the Publisher 0 0x00000000002005CC 1 39:06.6
    70238 19 Connecting to Subscriber 'Smithfield-PC1' 0 0x0000000000200593 0 39:06.3
    70238 19 Connecting to Publisher 'RMISQL' 0 0x0000000000200584 0 39:01.2

  • Seth Maffey

     Sorry about the delay in the response. I typed all of this up and lost it so I had to retype it.

    • That’s correct.
    • Let me clarify with an example:
      1. Row is inserted. Some of the columns default to zero (for example column A = 0, column B = 0).
      2. Row sits there for a month and is fine. (col A = 0, col B = 0)
      3. Row is updated at the subscriber. (col A at subscriber = 7, col B at subscriber = 4, columns at publisher still zero)
      4. When update is merge, a conflict is raised. The merge agent thinks someone explicitly updated the columns at the publisher in that row to their original values, when in fact, the values at the publisher have not changed at all. (col A at subscriber = 7, col B at subscriber = 4, columns at publisher still zero)
    • At first I thought it was happening with just one table, but I’ve seen it happen with a few tables now. The vast majority are on one table.

    See these screenshots. I think they will be helpful.

    This is the typical example (similar to my word example above):

    http://home.comcast.net/~nathanielcook/PatientDischarged.jpg

    Here’s one where it says that the row was deleted at the publisher; when I queried the table though, it was still there! You can see that in this screenshot. You can also see that the only update that was performed on that table was at a subscriber by my query on the audit table.

    http://home.comcast.net/~nathanielcook/BogusConflict3_DeletedAndUpdated.jpg

    Here’s one where the values at the publisher and subscriber are exactly the same; why would that ever be a conflict

    http://home.comcast.net/~nathanielcook/BogusConflict2.jpg

    I assume that you are talking about replication partitioning (and not physical partitioning). We are not using dynamic filtering (or parametized filtering, as it’s now called in Sql Server 2005), so every subscriber gets the same data. However, we are using static row filtering on some tables; that filtering affects other tables by means of join filters. I will also mention that the tables that I have seen bogus conflicts on so far have all been tables that are involved in the join filtering. The filters have not changed since I used them on Sql Server 2000, and they worked fine there. The only problem we had with Sql Server 2000 was the merge processing order, which thankfully has been corrected in 2005. I’ll say again that I didn’t see the specific problem that we are talking about in 2000.

    I am running profiler now to audit the logins. So far everything looks normal, with just our application, replication processes, Backup Exec and WSUS connecting to the server. I’ll continue to run profiler and report the results.


  • Merge Replication - Weird Conflicts for no reason