{{Quickfixn}} [quickfixn] Heartbeat not sent during constant traffic (#57)

Chris Busbey cbusbey at connamara.com
Fri Apr 27 10:50:00 PDT 2012


Hey Matt,

Based on those numbers that behavior is definitely suspect.  I'm going to
re-open the issue.  The TestRequest is meant for slow consumers, but I
wouldn't expect those messaging rates to particularly hammer the receiver.

I dug around the C++ QuickFIX code a bit to see if the Session event loop
differs noticeably from the .NET impl, and lo, it does.  In the C++ code,
after processing a message, the Session always checks whether a heartbeat is
necessary (this is done by a call to next() at the very bottom of the
Session::next( const Message& message, const UtcTimeStamp& timeStamp, bool
queued ) method).  The .NET Session code, on the other hand, does no such
check.  It returns to the caller immediately.
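
To make the difference concrete, here is a rough Python sketch of the two
behaviors.  It is not the QuickFIX or QuickFIX/n code: ToySession, next_timer,
next_message and simulate are made-up names, and it assumes the inbound stream
is so constant that the timer-driven check never gets a turn between messages.
The 68 msgs/sec rate and the 30 second HeartBtInt are the figures Matt reported.

```python
class ToySession:
    """Hypothetical stand-in for a FIX session (illustration only)."""

    def __init__(self, heart_bt_int, check_after_message):
        self.heart_bt_int = heart_bt_int
        self.check_after_message = check_after_message
        self.last_sent_time = 0.0
        self.sent = []  # list of (time, msg_type) for outbound messages

    def next_timer(self, now):
        """Send a Heartbeat if nothing has been sent for HeartBtInt seconds."""
        if now - self.last_sent_time >= self.heart_bt_int:
            self.sent.append((now, "Heartbeat"))
            self.last_sent_time = now

    def next_message(self, message, now):
        """Handle one inbound message, then optionally run the heartbeat check."""
        # Application-level processing of `message` would happen here.
        if self.check_after_message:
            # Mirrors the behavior described for the C++ Session: always
            # re-check the heartbeat condition after processing a message.
            self.next_timer(now)
        # Otherwise return immediately, as described for the .NET Session.


def simulate(check_after_message, seconds=90, rate=68, heart_bt_int=30):
    """Feed a constant inbound stream; return the times heartbeats were sent."""
    session = ToySession(heart_bt_int, check_after_message)
    for i in range(seconds * rate):
        now = i / rate  # simulated clock, in seconds
        session.next_message("IOI", now)
    return [t for t, _ in session.sent]
```

Comparing simulate(False) with simulate(True) shows the gap: without the
post-message check no heartbeat goes out during the stream, while with it one
goes out roughly every HeartBtInt seconds.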

I've got a good idea of where the code needs to be changed, so I should be
able to get a pull request in to fix the issue soon.  Once it's in master,
please verify that it solves your issue.  Thanks again for following up on
this!

Chris.

On Fri, Apr 27, 2012 at 8:26 AM, Matt Wood <mjwood7 at gmail.com> wrote:

> And James, to your point: Why don't you think the app (my side) should
> be sending a heartbeat during that time? From the perspective of the
> other end (vendor side) they haven't heard from me for several
> heartbeat intervals. Plus, what are your thoughts on what the protocol
> spec said (that I quoted previously) with regard to this? Perhaps it's
> open to a bit of interpretation so I'm curious to hear your thoughts.
>
> Thanks,
>
> Matt
>
> On Fri, Apr 27, 2012 at 11:16 AM, Matt Wood <mjwood7 at gmail.com> wrote:
> > Yes I have seen it in our production environment. At the start of the
> > trading day today I saw ~68 IOIs/second. This can go on for several
> > minutes until we reach the typical market inventory of about 30,000
> > offerings. I calculate that at that rate it should take just over 7
> > minutes, which seems about right when I'm watching it. Before I
> > applied any logic to force a heartbeat every 30 seconds, we'd get
> > logged off around (2x HeartBtInt), if I remember correctly.
> >
> > -Matt
> >
> > On Fri, Apr 27, 2012 at 10:18 AM, Chris Busbey <cbusbey at connamara.com> wrote:
> >> Hey Matt,
> >>
> >> Thanks for re-posting to the mailing list.  Just curious, have you seen this
> >> as an issue in a production environment?  What is the saturation point
> >> (msgs/sec) in constant traffic where the receiving app doesn't get a chance
> >> to send a heartbeat?
> >>
> >> On Fri, Apr 27, 2012 at 7:03 AM, James Downs <jcdowns at connamara.com> wrote:
> >>>
> >>> Matt,
> >>> IMHO, I don't think the receiving app in this case should send a
> >>> heartbeat. Obviously, it is necessary for the sending app to send the
> >>> TestRequest and the receiving app to respond with a heartbeat.
> >>> I have seen the situation where the receiving app is a slow consumer and
> >>> fails to process the TestRequest and send the heartbeat in time to prevent
> >>> the sending app from disconnecting.
> >>>
> >>> Jim
> >>>
> >>>
> >>> On Fri, Apr 27, 2012 at 8:14 AM, Matt Wood <mjwood7 at gmail.com> wrote:
> >>>>
> >>>> I have also included the distribution for any commentary on this:
> >>>>
> >>>> I did some further reading, specifically from the sections you quoted.
> >>>> Right before the section you included was this line:
> >>>>
> >>>> "When either end of a FIX connection has not *sent* any data for
> >>>> [HeartBtInt] seconds, it will transmit a Heartbeat
> >>>> message. "
> >>>>
> >>>> source:
> >>>> http://fixprotocol.org/documents/346/fix-44_w_Errata_20030618_PDF.zip
> >>>> Volume2, Page 16, Section “Administrative Messages”:
> >>>>
> >>>>
> >>>> So with the current code, my scenario where I receive constant traffic
> >>>> (perhaps a HeartBtInt+ second long stream of IOI messages) causes me
> >>>> to skip a heartbeat. You are right that the other end should send a test
> >>>> message, but in the same sense, I should be sending a heartbeat.
> >>>>
> >>>> any thoughts?
> >>>>
> >>>> Thanks,
> >>>>
> >>>> -Matt
> >>>>
> >>>> On Tue, Apr 24, 2012 at 10:42 AM, Chris Busbey
> >>>> <reply+i-3929507-2099e5ccd1f624f6f2364dd2edaa1dde4ad196cb-1503061 at reply.github.com>
> >>>> wrote:
> >>>> > I don't think this is necessarily a bug.  The FIX protocol handles both
> >>>> > sides of the heartbeat transaction.  Take a look at this:
> >>>> >
> >>>> > http://fixwiki.fixprotocol.org/fixwiki/Heartbeat
> >>>> >
> >>>> > Specifically the part about when a side has not received a heartbeat in
> >>>> > the heartbeat interval.
> >>>> >
> >>>> >> When either end of the connection has not received any data for
> >>>> >> (HeartBtInt + "some reasonable transmission time")
> >>>> >> seconds, it will transmit a Test Request message. If there is still no
> >>>> >> Heartbeat message received after
> >>>> >> (HeartBtInt + "some reasonable transmission time") seconds then the
> >>>> >> connection should be considered lost and corrective
> >>>> >> action be initiated.
> >>>> >
> >>>> > In the scenario you gave, the counter party sending the constant
> >>>> > traffic should notice that no heartbeat has been sent and send a test
> >>>> > message.  Assuming the test message is properly received and responded to
> >>>> > (which it will be, see Session.Next and Session.NextTestRequest), all should be
> >>>> > good.  It won't kill the connection until after the test message is sent +
> >>>> > heartbt int + "some reasonable transmission time"
> >>>> >
> >>>> > ---
> >>>> > Reply to this email directly or view it on GitHub:
> >>>> > https://github.com/connamara/quickfixn/issues/57#issuecomment-5305992
> >>>> _______________________________________________
> >>>> Quickfixn mailing list
> >>>> Quickfixn at lists.quickfixn.com
> >>>> http://lists.quickfixn.com/listinfo.cgi/quickfixn-quickfixn.com
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>>
> >>> Connamara Systems, LLC
> >>> Made-To-Measure Trading Solutions.
> >>> Exactly what you need. No more. No less.
> >>> http://www.connamara.com
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >> --
> >> Chris Busbey
> >> Connamara Systems, LLC
> >>
> >>
>



-- 
Chris Busbey
Connamara Systems, LLC