Socket, detect message boundaries for objects - working code?

Hi,

I'm stuck with a problem that has been bugging me for days. I'm custom-serializing objects and sending them across async TCP sockets. However, I have a problem deserializing messages when the received data is split into chunks of different sizes, and I just can't figure out how to get it right. Everything works fine for a couple of messages, but then the header size is read incorrectly (I'm sending a header with each message that gives the length of the incoming message, so I can detect the message boundary):

2005-11-21 16:13:12.0468|INFO|Test.Server.ClientCommunicator.Server|Received connection from: 127.0.0.1:1990

2005-11-21 16:13:12.1093|TRACE|Test.Server.ClientCommunicator.Server|335 bytes received from 127.0.0.1:1990

2005-11-21 16:13:12.1250|DEBUG|Test.Common.Networking.StreamManager|Header message size: 327

2005-11-21 16:13:12.1406|TRACE|Test.Common.Networking.StreamManager|Received message: Authentication

2005-11-21 16:13:12.2500|TRACE|Test.Server.ClientCommunicator.Server|1024 bytes received from 127.0.0.1:1990

2005-11-21 16:13:12.2500|DEBUG|Test.Common.Networking.StreamManager|Header message size: 327

2005-11-21 16:13:12.2500|TRACE|Test.Common.Networking.StreamManager|Received message: Authentication

2005-11-21 16:13:12.2500|DEBUG|Test.Common.Networking.StreamManager|Header message size: 327

2005-11-21 16:13:12.2500|TRACE|Test.Common.Networking.StreamManager|Received message: Authentication

2005-11-21 16:13:12.2500|DEBUG|Test.Common.Networking.StreamManager|Header message size: 327

2005-11-21 16:13:12.2500|TRACE|Test.Common.Networking.StreamManager|Received message: Authentication

2005-11-21 16:13:12.2500|DEBUG|Test.Common.Networking.StreamManager|Header message size: 327

2005-11-21 16:13:12.2500|TRACE|Test.Server.ClientCommunicator.Server|1024 bytes received from 127.0.0.1:1990

2005-11-21 16:13:12.2500|TRACE|Test.Common.Networking.StreamManager|Received message: Authentication

2005-11-21 16:13:12.4531|DEBUG|Test.Common.Networking.StreamManager|Header message size: 4116423536034590317

2005-11-21 16:13:12.4531|TRACE|Test.Server.ClientCommunicator.Server|967 bytes received from 127.0.0.1:1990

So, as you can see in that trace, at the end it reads Header message size: 4116423536034590317. That's a bit high ;) and thus all my following messages are read incorrectly. However, I just can't find what's wrong in the code ... it must be a buffer problem somewhere, but I can't figure it out. I sniffed the network traffic, and the client sends everything out correctly.
Maybe someone has some working code that properly detects message boundaries? I'd greatly appreciate it.

Thanks,

Tom



  • HForcht

     JOshLewis wrote:
    Another question I just thought of: I mentioned the technique of posting multiple BeginReceive()s in another post (http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=132847&SiteID=1).
    If I use the scheme mentioned in my immediately previous post (receiving the message-size message async, then blocking on receive for the actual message), what are the consequences of posting multiple BeginReceive()s? For example, say two async receives are active. The message-size message is received on one thread, and while the message size is being processed, another thread receives the actual message.
    How likely is this scenario, and can it be caught?
    Am I missing the point of posting multiple BeginReceive()s completely to begin with?



    You want to make sure that you have only one call to BeginReceive outstanding per socket at a time. Otherwise you'll run into problems, because you'll get several EndReceive callbacks on several threads at the same time.
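One way to keep a single receive outstanding is to let one loop own every read on the connection and start the next read only after the previous one has completed. The sketch below shows that idea with Python's asyncio rather than .NET's BeginReceive/EndReceive. The 8-byte little-endian length header and the names `read_messages` and `demo` are assumptions made for illustration, not something taken from the thread's code.

```python
import asyncio
import struct

# Assumed wire format: 8-byte signed little-endian length, then the payload.
HEADER = struct.Struct("<q")


async def read_messages(reader):
    """Read frames strictly one after another from a single coroutine.

    Each read is awaited to completion before the next one is issued,
    so there is never more than one pending read on the stream.
    """
    messages = []
    while True:
        try:
            header = await reader.readexactly(HEADER.size)
        except asyncio.IncompleteReadError:
            # Stream ended; a partial trailing header is dropped in this sketch.
            break
        (length,) = HEADER.unpack(header)
        messages.append(await reader.readexactly(length))
    return messages


async def demo():
    reader = asyncio.StreamReader()
    wire = b"".join(HEADER.pack(len(m)) + m for m in (b"first", b"second"))
    # Deliver the bytes in awkward 3-byte pieces to imitate split receives.
    for i in range(0, len(wire), 3):
        reader.feed_data(wire[i:i + 3])
    reader.feed_eof()
    return await read_messages(reader)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because only one coroutine touches the stream, the per-connection buffer never needs a lock, which is the property the advice above is after.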

  • otgoo

    Performance is a loaded term. I would not write a single line of code just because of a performance issue. You can buy perf by adding RAID, more memory, or more CPUs for a significantly lower price than software development. Sorry to give you this lecture, but have you tried other alternatives and really benchmarked the perf? In the end you may be surprised to see that the code you write is actually not that performant compared to the code that is already written, tested, and shipped.



  • Amit Chadha

    There basically were two problems:
    1. I accidentally had BeginReceive called in the send callback. Thus, I got multiple callbacks at once in the ReceiveCallback, which messed up my memory stream.
    2. The other problem was a simple bug in my boundary detection algorithm. I simply forgot to append the incoming buffer from the new packet to the already existing buffer in the StateObject.
    Maybe post your code and I can have a look at what's wrong.

    Regards,

    Tom

  • moodyj

    In The Name of God

    hi

    This is what you said in response to a question on networking. Please explain how to force EndReceive to be repeated to get the complete message (the red part below):

    The way this works is actually not too hard: when serializing the message to be sent, I serialize to a MemoryStream, get the length of the stream, prepend 8 bytes (an Int64) holding that length, convert the result to a byte array, and send it off.

    The receiving side is the tricky part, since you don't know how much data each receive will deliver. So you need a state object that stores the header information and tells you how much of a particular message you have already received and how much is still to come.
    When the first bytes arrive, you need to get the message size from the header. You read the first 8 bytes and then you know how much more data to expect (but beware that you might receive only 1 byte, so only mark the header as valid once you have received all 8 bytes). Then you continue reading the rest of the available bytes into a MemoryStream in the state object. When you have received all bytes of the message, you simply deserialize it. If your message is, e.g., 300 bytes long but you have only received 100, store those 100 bytes in the state object's MemoryStream, and when the next call to EndReceive occurs, append the incoming buffer to the existing 100 bytes of the stream, then continue reading until you have the full message.



  • s0ulburn24

    Hi Josh,

    You don't need to worry about ordering, because TCP guarantees that the data will arrive, and in the correct order. However, it does NOT guarantee how much data you'll receive for each call to EndReceive. So messages can be split up across receives, and one receive can contain more than one message.

    The way this works is actually not too hard: when serializing the message to be sent, I serialize to a MemoryStream, get the length of the stream, prepend 8 bytes (an Int64) holding that length, convert the result to a byte array, and send it off.
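A minimal sketch of that sending side, written in Python rather than the .NET API used in this thread. The little-endian byte order, the use of `pickle` as a stand-in for the custom serializer, and the names `frame` and `serialize_and_frame` are all assumptions for illustration.

```python
import pickle
import struct

# 8-byte signed integer, mirroring the Int64 header described above.
# Little-endian byte order is an assumption; both sides must agree on it.
HEADER = struct.Struct("<q")


def frame(payload: bytes) -> bytes:
    """Prefix the payload with its length so the receiver can find the boundary."""
    return HEADER.pack(len(payload)) + payload


def serialize_and_frame(obj) -> bytes:
    """Serialize an object (pickle stands in for the custom serializer) and frame it."""
    return frame(pickle.dumps(obj))


if __name__ == "__main__":
    wire = frame(b"x" * 327)
    print(len(wire))  # 327 payload bytes plus the 8-byte header
```

This matches the trace in the question, where a 335-byte receive carried a message whose header reported a size of 327.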

    The receiving side is the tricky part, since you don't know how much data each receive will deliver. So you need a state object that stores the header information and tells you how much of a particular message you have already received and how much is still to come.
    When the first bytes arrive, you need to get the message size from the header. You read the first 8 bytes and then you know how much more data to expect (but beware that you might receive only 1 byte, so only mark the header as valid once you have received all 8 bytes). Then you continue reading the rest of the available bytes into a MemoryStream in the state object. When you have received all bytes of the message, you simply deserialize it. If your message is, e.g., 300 bytes long but you have only received 100, store those 100 bytes in the state object's MemoryStream, and when the next call to EndReceive occurs, append the incoming buffer to the existing 100 bytes of the stream, then continue reading until you have the full message.

    Hope this makes a lil bit of sense ;)
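The receiving side described above can be sketched as a small per-connection state object. This is Python, not the .NET code from the thread, and it uses one growing buffer instead of a separate header flag plus MemoryStream. The class name `FrameDecoder`, the little-endian header, and the `MAX_MESSAGE` sanity cap are assumptions added for illustration.

```python
import struct

HEADER = struct.Struct("<q")      # assumed: 8-byte signed little-endian length
MAX_MESSAGE = 16 * 1024 * 1024    # hypothetical cap to catch a corrupted header early


class FrameDecoder:
    """Per-connection state: keeps leftover bytes between receives."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Append a received chunk and return every complete message in it."""
        # Always append to what is already buffered; dropping the leftover
        # bytes is what desynchronizes the header parsing.
        self._buffer += chunk
        messages = []
        while True:
            if len(self._buffer) < HEADER.size:
                break  # header itself is still incomplete
            (length,) = HEADER.unpack_from(self._buffer, 0)
            if length < 0 or length > MAX_MESSAGE:
                raise ValueError(f"implausible message size: {length}")
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # body is still incomplete, wait for the next chunk
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]  # keep only the unconsumed tail
        return messages


if __name__ == "__main__":
    decoder = FrameDecoder()
    wire = HEADER.pack(5) + b"hello" + HEADER.pack(3) + b"abc"
    print(decoder.feed(wire[:10]))  # header plus part of the first body
    print(decoder.feed(wire[10:]))  # rest of the first message and the second one
```

A size check like the one above would have flagged the 4116423536034590317 header from the question's trace as soon as it was parsed, instead of letting every later message be misread.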

  • VincentITA

    Can you post what the problem was, Tom?
    I'm also having issues with message framing, and deserializing structs.
    Thanks,
    -JOsh

  • P Glenn

    Another question I just thought of: I mentioned the technique of posting multiple BeginReceive()s in another post (http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=132847&SiteID=1).
    If I use the scheme mentioned in my immediately previous post (receiving the message-size message async, then blocking on receive for the actual message), what are the consequences of posting multiple BeginReceive()s? For example, say two async receives are active. The message-size message is received on one thread, and while the message size is being processed, another thread receives the actual message.
    How likely is this scenario, and can it be caught?
    Am I missing the point of posting multiple BeginReceive()s completely to begin with?


  • glt

    Yeah, that's what I'm guessing; I just can't figure out where ;) That's why it'd be great to have some reference for this.

    I'm using sockets because performance is a really big issue here. The way I'm doing it now allows me to compress my objects and easily deserialize them on the receiving side. As far as authentication, connection management, etc. go, I have all this built already.

  • hero281

    I haven't written any framing code yet. We're currently using a fixed buffer size of 2048 bytes for sending and receiving while we're developing the rest of the server, and hoping against hope that it will be sufficient. It's been working OK for us. (We're serializing structs, then sending.)
    I'll probably end up sending the message size and then the message, but I'm a bit scared of writing the code, to be candid.
    The scheme I was thinking of using is:
        -   async receive the message size.
        -   unpack the message-size message
        -   set up a buffer of the correct size
        -   call Receive() (blocking) for the actual message.
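The steps above can be sketched with blocking sockets as follows. This is Python rather than .NET, and for simplicity it reads the size header with a blocking call too, where the list above receives it asynchronously. The point it illustrates is that a single blocking receive may return fewer bytes than requested, so both the header and the body need a read loop. The names `recv_exact` and `recv_message` and the little-endian 8-byte header are assumptions.

```python
import socket
import struct

HEADER = struct.Struct("<q")  # assumed: 8-byte signed little-endian length


def recv_exact(sock, count):
    """Block until exactly `count` bytes have been read from the socket."""
    buf = bytearray()
    while len(buf) < count:
        # recv may return fewer bytes than asked for, so keep asking
        # for the remainder until the buffer is full.
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        buf += chunk
    return bytes(buf)


def recv_message(sock):
    """Read the size header, then read exactly that many payload bytes."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


if __name__ == "__main__":
    sender, receiver = socket.socketpair()
    sender.sendall(HEADER.pack(5) + b"hello")
    print(recv_message(receiver))
    sender.close()
    receiver.close()
```

Since each call consumes exactly one header and one body, the next message's bytes simply stay queued in the socket until the following `recv_message` call.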
    One of my concerns is what happens if, in between receiving the message-size message and the actual message, the client (or server) receives another message, possibly the next message before the first one, due to congestion or something. This might be mitigated by TCP (I'm not sure whether it guarantees the order in which above-TCP-layer messages, i.e. not TCP packets, will be received). If the above case happens (an out-of-sequence message), it will cause the algorithm to bomb out...
    Another problem is that a message-size scheme increases the traffic.
    Anyway, I know I'm not the first developer to come across this problem, and I'd like to hear from other guys about how they solved it.
    Taking the advice given earlier, I'm not sure something like HTTP will suit us, but I'll look into it.

  • Deken

    Ha, finally figured out what was wrong. Works smoothly now.

  • mightymoe

    Writing this kind of code is not easy, but it's not very difficult either. Obviously there is a logic error somewhere in your code. Perhaps you are reading a byte too few or too many, and it eventually gets out of sync.

    To answer your question, this is exactly the problem that higher-level protocols are designed to solve. You could, for example, very easily use HTTP as your transport.
    Then use HttpWebRequest on the client side and HttpListener on the server side.
    You will have a ton of stuff already implemented, including authentication, header parsing, connection management, etc. Think again about why you need to use sockets directly...



  • Larry Cleeton

    It makes a lot of sense, and I was planning on doing the same thing.
    I s'pose the fact that TCP guarantees the order of packets helps a lot :P
    Question: do you use blocking sockets once you've got the message size? Or do you use a higher-scope state variable and async receiving?
    I guess I'll have to do some testing somehow.
    Thanks, Tom


  • JohnT

    Well, I see your point ... but what are my options?
    The requirements I have are that transmitted packets are small, because transmission must be as fast as possible (we're talking about an execution platform for stocks and options). Also, I need bidirectional transmission and I need to transfer objects. So, one option I have is remoting, which I tested, and it didn't meet my performance requirements. I'm not sure what other options I have, given that sockets actually work very nicely apart from my one problem of detecting message boundaries correctly, which I'm pretty sure is a little bug somewhere that can be fixed quite easily.
