{{Quickfixn}} subsequent calls to lock(sync_) are blocked indefinitely following Socket Exception socket_.Send(rawData)

Thu Jan 5 16:43:18 PST 2017

I am hunting down a problem with our own service, which sometimes gets 
"stuck".

I am testing locally with a modified Executor that acts as the 
brokers replying to our trading application.

I am running both locally, using 127.0.0.1 as my SocketHost.

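Things normally work fine, but after a socket exception thrown from 
socket_.Send(rawData), subsequent calls to lock(sync_) block indefinitely.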

The socket exception says "An existing connection was forcibly closed by 
the remote host".

No matter why that condition occurred, I would have expected the 
lock(sync_) to have been released.
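
For reference, a C# lock statement is released when an exception propagates out of its block, so an exception by itself should not leave sync_ held. The sketch below illustrates that expectation; LockReleaseDemo and its method are hypothetical names for illustration and are not QuickFIX/n code. If this holds, one possibility worth checking (an assumption, not something confirmed here) is that a thread is still blocked inside the lock rather than the lock having been leaked.

```csharp
using System;
using System.Threading;

// Hypothetical demo class, not part of QuickFIX/n.
public static class LockReleaseDemo
{
    private static readonly object sync_ = new object();

    // Throws from inside lock(sync_), then checks whether a second
    // thread can still acquire the same lock within one second.
    public static bool LockIsFreeAfterException()
    {
        try
        {
            lock (sync_)
            {
                // Stand-in for a failing socket send.
                throw new InvalidOperationException("simulated send failure");
            }
        }
        catch (InvalidOperationException)
        {
            // The lock has already been released by the time we get here.
        }

        bool acquired = false;
        var other = new Thread(() =>
        {
            if (Monitor.TryEnter(sync_, TimeSpan.FromSeconds(1)))
            {
                acquired = true;
                Monitor.Exit(sync_);
            }
        });
        other.Start();
        other.Join();
        return acquired;
    }
}
```

A thread dump taken while the service is stuck would show which thread, if any, currently owns sync_.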

The only fix that works is to stop and restart both the executor and 
my application.

This is very similar to a rare (but painful) condition we've seen 
in production, where our Fix Processor suddenly seems to stop doing 
anything. Restarting it fixes the problem: the other side is fine, and 
once our service is restarted it reconnects and processes all of the 
requests that had backed up until we noticed.

I can't say that I've noticed any similar exception in our production 
logs (but I will look again, now that I've seen this).

Part of me assumes this local problem is not the same as what happened 
in production, but the symptoms are spot on.

We are using version 1.2.0.0 of QuickFix.dll.

We are building our code using .NET 4.6.1.

.....

I've searched recent emails on this list and saw one mentioning these 
locks, but I haven't seen anyone mention getting stuck like this.

I need to come up with a safe solution quickly. I am thinking of wrapping 
our calls to Session.SendToTarget in code that applies a timeout and 
alerts us if the timeout is hit, so that we could at least restart our 
sessions.
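
A minimal sketch of such a wrapper, assuming the send is passed in as a delegate. GuardedSender, TrySend, and the onTimeout callback are hypothetical names, not QuickFIX/n APIs. Note that a timeout only lets the caller give up and raise an alert; it does not unblock the thread that is stuck, so the restart step is still needed.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of QuickFIX/n.
public static class GuardedSender
{
    // Runs `send` on a thread-pool thread and waits up to `timeout`.
    // Returns true only if `send` finished in time and returned true.
    // Calls `onTimeout` when the wait expires, so the caller can alert
    // or schedule a session restart.
    public static bool TrySend(Func<bool> send, TimeSpan timeout, Action onTimeout)
    {
        Task<bool> task = Task.Run(send);
        try
        {
            if (task.Wait(timeout))
            {
                return task.Result;
            }
        }
        catch (AggregateException)
        {
            // The send itself threw: a failed send, not a timeout.
            return false;
        }

        // Observe any exception the abandoned task throws later.
        task.ContinueWith(
            t => { var ignored = t.Exception; },
            TaskContinuationOptions.OnlyOnFaulted);

        onTimeout();
        return false;
    }
}
```

A call site might look like `GuardedSender.TrySend(() => Session.SendToTarget(message, sessionId), TimeSpan.FromSeconds(5), RaiseAlert)`, where `message`, `sessionId`, the five-second limit, and `RaiseAlert` are placeholders. Each timed-out call leaves one thread-pool thread blocked, so repeated timeouts should trigger a restart rather than further retries.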

Has anyone else reported similar issues? And does anyone have advice for 
short term/long term remediation?