Be careful using anonymous delegates across threads

[UPDATE]: It has been pointed out to me that the premise of this article is wrong. The compiler specifically avoids this problem by hoisting the captured variables into an inner class of which the generated method is also an instance member, and creating a new instance of this class for each call. I'm not sure what bug I ran into that causes the behaviour I denounced, or whether it was simply sloppy handling of reference types on my part -- but I feel particularly stupid having vaguely remembered "inner classes" from reading Raymond Chen's fine explanation of this over a year ago, yet not having checked it before posting this article. Nonetheless, I shall let the article remain as a reminder of the code that this wonderful feature saves you from having to write. Also, the queue/lock-based code may still be useful when one needs data passed to the GTK thread in a guaranteed order.

Anonymous delegates are incredibly useful, especially in that they can "capture" variables from a parent scope. Used within a single thread, they are very easy to understand. However, if you're using them to pass data across threads, you need to understand how the variable capture works.

Consider the following code snippet, in which the "data" variable is captured from the delegate's parent scope.

void ProcessIncomingData (string data)
{
	Gtk.Application.Invoke (delegate (object sender, EventArgs e) {
			textBuffer.AddText (data);
		});
}

What's wrong with this?

Let's expand the anonymous method into something resembling the compiler-generated code.

void ProcessIncomingData (string data)
{
	compilerHoistedVariable = name;
	Gtk.Application.Invoke (new EventHandler (compilerGeneratedMethod));
}
 
string compilerHoistedVariable = null;
 
void compilerGeneratedMethod (object sender, EventArgs e)
{
	textBuffer.AddText (compilerHoistedVariable);
}

Notice that the variable has been "hoisted" into the class scope, so that the generated method used for the anonymous delegate can access it. (The actual compiler-generated code is a bit different, encapsulating the hoisted variables in an inner class, but the principle is the same).

Now consider what happens if ProcessIncomingData gets called by the data-processing thread several times before the GTK+ loop gets run and can handle the invocations. Only the most recent value remains in the compiler-hoisted variable, and the earlier data strings are lost.

This isn't a problem if you're passing transient state, for example the visibility state of a button. But if you're passing data, think carefully about the internal mechanics. You may have to implement locking on the shared data structure.

void ProcessIncomingData (string data)
{
	lock (textBufferQueue) {
		textBufferQueue.Enqueue (data);
		if (!queueHandlerRunning) {
			queueHandlerRunning = true;
			Gtk.Application.Invoke (
				delegate (object sender, EventArgs e) {
					lock (textBufferQueue) {
						while (textBufferQueue.Count > 0) {
							textBuffer.AddText (textBufferQueue.Dequeue ());
						}
						queueHandlerRunning = false;
					}
				});
		}
	}
}
bool queueHandlerRunning = false;
Queue<string> textBufferQueue = new Queue<string> ();

This could hardly be described as "pretty", and the anonymous delegate has grown so big that we may as well move its definition out into a conventional method.

Things are easier if the delegate consumer offers an alternate call allowing you to pass arguments along with the delegate, such as Gtk.Application.Invoke and its Invoke (object sender, System.EventArgs args, System.EventHandler d) overload.

void ProcessIncomingData (string data)
{
	Gtk.Application.Invoke (
		this,
		new dataEventArgs (data),
		delegate (object sender, dataEventArgs e) {
			textBuffer.AddText (dataEventArgs.Data);
		});
}
 
class dataEventArgs : EventArgs
{
	public string Data;
	public dataEventArgs (string data)
	{
		this.Data = data;
	}
}

This is much more verbose than the method with which we started, but at least you won't lose any data and don't have to worry about locking.

Sadly, the anonymous types in C# 3.0 can't be used to simplify this. For a start, they don't allow inheritance. One might think that the "object sender" argument could be abused to pass an anonymous type, however in order to leave the scope of a method, the anonymous type must be cast to "object". The only way to access its properties would be via reflection, which would not only be verbose but would also carry a performance penalty.

Comments

This is wrong...

This is wrong:

Notice that the variable has been "hoisted" into the class scope, so that the generated method used for the anonymous delegate can access it. (The actual compiler-generated code is a bit different, encapsulating the hoisted variables in an inner class, but the principle is the same).

The inner class is very important - it stops the problem. The body of ProcessIncomingData ends up looking like:

InnerClass ic = new InnerClass();
ic.data = data;
Gtk.Application.Invoke(new EventHandler(ic.func));

Note that a new instance of InnerClass is created each time through - meaning that all of the receivers will have a new instance of the delegate and it's bound to the actual value ic.data was before. You can also test it by QueueUserWorkItem'ing over and over again passing a counter and insuring the values never get duplicated in a dictionary.

Now consider what happens if ProcessIncomingData gets called by the data-processing thread several times before the GTK+ loop gets run and can handle the invocations. Only the most recent value remains in the compiler-hoisted variable, and the earlier data strings are lost.

Yeah, so that's wrong. And all the work arounds seem to be trying to get past this "bug"; finally coming down to dataEventArgs which is effectively what C# has done for you in the very first example. The only differenece is that C# anonymous delegate is closed over an instance to store the data (so you can think of it's signature as innerClass, object, EventArgs) and you're closed over a static method w/ no instance (think object, dataEventArgs).

Maybe you were hitting some other bug or were using something more complicated than a string as data?

seems OK on my end too

yep, I also created a simple winform app on my end to see if calling the method hosting the anonymous delegate multiple times before the UI thread can execute the invoked anonymous delegate results in wrong values being accessed by the delegate.

it seems OK on my end, no problem.

Thanks for the correction

Oops, yes. Thanks for the explanation; I've added a correction to the top of the article. It was actually quite a long time ago that I ran into this problem (no idea what really caused it), but I felt like writing an article and this came to mind. I should have researched it better :-/

Dejavu

I also experienced a problem like that, but my case is a bit more nested :)
I am not sure if I use anonymous methods here incorrectly, or if the compiler generates incorrect IL.
I have to re-fetch the server model in the anonymous method, else it's always the first one.

Here is my snippet:

foreach (ServerModel server in servers) {
	...
	if (server.OnConnectCommands != null && server.OnConnectCommands.Count > 0) {
		// BUG: server is always the same object!
		protocolManager.Connected += delegate {
			// HACK: getting the server model with the data of the protocol manager 
			ServerModel ser = serverCon.GetServer(protocolManager.Protocol, protocolManager.Host);
			foreach (string command in ser.OnConnectCommands) {
				...
			}
		};
	}
}

This is an interesting subtlety

Miguel pointed out to me last night that there are some interesting effects related to the scope in which the variable is captured, which is quite unintuitive. It's described in an interesting CodeProject article called Inside C# 2.0 Anonymous Methods.

Basically, the "ServerModel server" is captured in the scope that contains the foreach, so it's shared between all of the anonymous delegates within that scope. If you redeclare it within the loop's scope, each delegate gets its own copy, hence you should find that this would work:

 foreach (ServerModel server in servers) {
 	//...
 	ServerModel serverCaptured = server; //captures the reference in the loop's scope
 	if (server.OnConnectCommands != null && server.OnConnectCommands.Count > 0) {
 	 	 protocolManager.Connected += delegate {
 	 	 	foreach (string command in serverCaptured.OnConnectCommands) {
 	 	 	 	//...
 	 	 	}
 	 	};
 	}
}