Grasshopper

algorithmic modeling for Rhino

Hello,

as per discussion title, I am beginning to delve into multi-threading (mainly to understand how it works and where and when I might use it to speed up calculations of repetitive tasks). So far I took a look around and mainly took advantage of the examples at the following links:

http://www.grasshopper3d.com/forum/topics/c-multithreading-failure

http://james-ramsden.com/multithreading-a-foreach-loop-in-a-grassho...

As well as studying the code that Vicente Soler wrote for this: http://www.grasshopper3d.com/video/differential-growth, which contains several multi-threaded forEach loops, like this one for instance:

System.Threading.Tasks.Parallel.ForEach(springs, spring => spring.restLength += growth);

That updates the rest length of the springs with a growth factor.

I tried with a very simple operation of adding a quantity to an array of numbers and doing the same operation on the Y coordinate for a list of points (using ConcurrentBag as variable structure for the list).

My main question is: why does my code not work? More specifically, the code performs the loop and the values are changed inside it, but the output looks as if the loop never happened (see attached .gh file). There are 3 components: the first uses a simple list of points from the main script context, the second does the same from within a class, and the third is a working example. In the third, the list I loop over is not the one containing the data I am manipulating. What I don't understand is why Vicente's example works, given that there the list used for the forEach is the same one in which the data is manipulated.

Like I said, I am a beginner at multi-threading in general and at multi-threading in C# in particular, so any help is appreciated in understanding why my cases do not work, why the working examples I cited do work, and whether there are more general rules that I did not consider when coding multi-threaded loops. Any hints on introductory reading on the matter (multi-threading in C# and within GH) would also be really appreciated. There is a ton of material out there and I could very well start from that, but advice from people who have already gone through the same struggle would be a huge bonus.

Thank you in advance!

Replies to This Discussion

I don't have time right now to delve into your code, but these are a few things to keep in mind when writing parallel algorithms:

  • There are different ways to multi-thread code, and which one is right for you depends on the specific problem. If you wish to simultaneously perform distinct, long-running processes you will need a different approach (probably System.Threading.Task based) than if you need to perform loads of really short calculations.
  • Data which is accessed for reading from multiple threads may get duplicated by the runtime if it feels that is required. This will result in (potentially very significant) overhead.
  • Data which is accessed for writing from multiple threads is always a huge problem. Either the threads access the data simultaneously which can result in missing or corrupt data, or the threads must lock the collection during each write, which tanks performance. The collection types in the Concurrent namespace may well resort to locking and may thus be slow to use.
  • The single most useful design pattern for multi-threaded code is immutability. Design your classes in such a way that they cannot be changed once constructed. Consider the System.String type as a prime example of this pattern. Classes which are immutable are much safer to use because they cannot be corrupted after the fact.
  • Finally, writing multi-threaded code is hard. It's difficult to make it safe. It's difficult to debug. It's difficult to actually make it fast. Besides infinite recursion, there's no more efficient way to completely crash your program than writing MT code.
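To make the point about write contention a little more concrete, here is a minimal sketch (not taken from anyone's code in this thread; the class and method names are made up for illustration). Instead of locking a shared variable on every iteration, each worker accumulates into its own local subtotal and the shared total is only touched once per worker, at the end.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class ParallelSum
{
    // Sums the squares of 0..n-1 in parallel.
    public static long SumOfSquares(int n)
    {
        long total = 0;
        Parallel.For<long>(0, n,
            // localInit: every worker task starts with its own subtotal
            () => 0L,
            // body: only the local subtotal is written, so no lock is needed
            (i, state, subtotal) => subtotal + (long)i * i,
            // localFinally: one atomic merge per worker task
            subtotal => Interlocked.Add(ref total, subtotal));
        return total;
    }
}
```

Calling ParallelSum.SumOfSquares(10) gives 285, the same as a plain loop. Whether this is actually faster than the single-threaded version depends on how much work each iteration does.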

David, thank you for the tips. I'll keep on searching and looking into tutorials, while keeping them in mind!

"Are all of the new concurrent collections lock-free?"

Answer: mostly, but not quite.

http://blogs.msdn.com/b/pfxteam/archive/2010/01/26/9953725.aspx

I attempted to 'fix' your first component, giving the code below, only to later realise that this is almost exactly what you'd made in your third component anyway! Here it is for argument's sake:

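// n (the number of points) is assumed to be a component input
var pts = new List<Point3d>();
for (int i = 0; i < n; i++)
{
    var temppt = new Point3d(i, 0, 0);
    pts.Add(temppt);
}

// Thread-safe, unordered collection that receives the new points
var ptsbag = new System.Collections.Concurrent.ConcurrentBag<Point3d>();
// Note: this single Random instance is shared by all threads (see the caveat further down)
var rnd = new System.Random();

System.Threading.Tasks.Parallel.ForEach(pts, pt =>
{
    // Add a new point to the bag rather than trying to modify pt in place
    ptsbag.Add(new Point3d(pt.X + rnd.Next(0, 1000) * 0.01, pt.Y + 0.1, 0));
});

A = ptsbag;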

The main difference between your example and my own earlier one is that you have used a ConcurrentBag, whereas I used a ConcurrentDictionary. When I wrote my own example, I'm pretty sure I tried the Bag, but didn't end up with a solution I liked.

The reason for this, as I understand it, is that the Bag is unordered, and therefore unindexed. In other words, you can't access an item in a Bag using the list[i] notation or similar.

The foreach loop allows us to go through the bag one item at a time, which is why we're able to read the data. But when we want to edit the pt and write it back to pts, the code no longer knows from where within pts pt came from, so it's unable to write. That's my guess anyway :)

The fix works because, when we save to ptsbag, we are now adding a new item to the Bag, rather than attempting to overwrite an existing one.
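If the order of the points matters, one alternative to the Bag is to write into a pre-sized array by index, so every iteration writes to its own slot and nothing needs locking. A minimal sketch under stated assumptions: Pt is a hypothetical stand-in for Point3d so the snippet compiles without RhinoCommon, and OrderedParallel and ShiftY are made-up names.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for Rhino.Geometry.Point3d (illustration only).
public struct Pt
{
    public readonly double X, Y, Z;
    public Pt(double x, double y, double z) { X = x; Y = y; Z = z; }
}

public static class OrderedParallel
{
    // Returns a new array in which every point is moved by dy along Y.
    // output[i] always corresponds to input[i], so the order is preserved.
    public static Pt[] ShiftY(Pt[] input, double dy)
    {
        var output = new Pt[input.Length];
        Parallel.For(0, input.Length, i =>
        {
            Pt p = input[i];                         // read-only access to the source
            output[i] = new Pt(p.X, p.Y + dy, p.Z);  // each index is written by exactly one iteration
        });
        return output;
    }
}
```

In a script component the returned array could be assigned to the output directly, in the same way as A = ptsbag above.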

One possible issue with the 'fix' is that I'm not sure whether accessing the single instance of rnd with multiple threads is threadsafe, and I would guess it isn't. Since creating a new instance of Random for every loop is inefficient, and crucially can cause non-randomness, I'm not sure what the best answer is here.
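One commonly used way around this is to give every thread its own Random instance through System.Threading.ThreadLocal<T>, so no instance is ever shared and a new one is not created on every iteration. The sketch below is only an illustration: PerThreadRandom, Jitter and baseSeed are made-up names, and offsetting the seed per thread is an assumption about how one might keep the per-thread sequences different.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class PerThreadRandom
{
    // Fills an array with n random offsets between 0.00 and 9.99.
    public static double[] Jitter(int n, int baseSeed)
    {
        int seedOffset = 0;
        var result = new double[n];

        // The factory runs once per thread, the first time that thread reads rnd.Value.
        using (var rnd = new ThreadLocal<Random>(
            () => new Random(baseSeed + Interlocked.Increment(ref seedOffset))))
        {
            Parallel.For(0, n, i =>
            {
                // rnd.Value is private to the current thread, so Next() is never called concurrently
                result[i] = rnd.Value.Next(0, 1000) * 0.01;
            });
        }
        return result;
    }
}
```

Note that the output is not reproducible from run to run even with a fixed baseSeed, because which thread handles which index can change between runs.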

To echo David's word of warning, multithreading is hard, and we have to start asking many more questions about how our code is working. It can be very rewarding to get it right, but do start small, and test thoroughly. Read up on the data types you are using on MSDN, especially checking for thread safety, and have a look at the various multithreading tutorials there too.

James, thank you very much for your answer and advice. Indeed I know multi-threading is difficult, which is why I started with baby steps. At this stage my concern was not so much performance (but it soon will be), but rather to understand why the things that work are working and why the things that do not work aren't. Apparently, a Rhino class such as Point3d cannot be modified directly in a multi-threaded loop but a custom class created within a C# component can (this is why Vicente's algorithm works - Spring is a custom class he created).

To test this I created an updated version of the component that selectively uses a ConcurrentBag of Point3d (and, as expected, it does not work in updating the fields within members of the collection), a ConcurrentDictionary of Point3d (which gives... odd results, but I guess it has to do with some other complicated aspects of multithreading that I still need to understand), and finally updates a List of a custom Dot class and then populates a Concurrent collection of Point3d. This last method successfully updates fields in the custom class inside the loop.

I was just curious to know why Vicente's loop was working, and now I have found out that (so far) fields of custom classes can be updated in a parallel loop, but I still have no idea why this is not possible with native classes (such as Point3d). I'll keep studying and doing exercises and tutorials!

Again, thanks to both you and David for the advice!
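A likely explanation for the difference described above, offered as a sketch rather than a confirmed diagnosis of the attached file: in RhinoCommon, Point3d is a struct (a value type), so the item handed to the loop body is a copy, and changes to that copy never reach the collection. A class (a reference type) such as a custom Spring or Dot is passed as a reference, so the loop body changes the original object. This is the same in a plain single-threaded loop and has nothing to do with threading as such. StructDot, ClassDot and CopySemantics below are made-up names for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public struct StructDot { public double Y; }  // value type, behaves like Point3d here
public class ClassDot { public double Y; }    // reference type, like a custom class

public static class CopySemantics
{
    public static double StructResult()
    {
        var dots = new List<StructDot> { new StructDot(), new StructDot() };
        // d is a copy of the list element; the change is lost when the lambda returns
        Parallel.ForEach(dots, d => { d.Y += 1.0; });
        return dots[0].Y; // still 0
    }

    public static double ClassResult()
    {
        var dots = new List<ClassDot> { new ClassDot(), new ClassDot() };
        // d refers to the same object that is stored in the list
        Parallel.ForEach(dots, d => { d.Y += 1.0; });
        return dots[0].Y; // now 1
    }
}
```

For value types, the usual workaround is to write a new value back into the collection (by index or by key) instead of modifying the loop variable.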

Hi again Alessio,

Just had a look at your most recent gh file. My thoughts aren't conclusive, but here's what I've found:

1) It seems you can update your dots (case 2) as you are updating a List, not a ConcurrentDictionary. Certainly, when I try to update a dictionary directly through an index (i.e. dictionary[key]) I get an error along the lines of the dictionary being read-only. Try testing a dictionary with a custom class, and a list with a Rhino class, and see what happens.

In any case, without further testing and some scientific rigour, I don't think it's right to decisively say that only custom classes can be updated in parallel loops.

2) I was able to get your case1 working with a slight modification. The second loop is:

System.Threading.Tasks.Parallel.ForEach(pts2, pt =>
{
    double height = rnd1.Next(1000) * 0.01;
    pts2.TryUpdate(pt.Key, new Point3d(Math.Cos(height), pt.Value.Y, height), pt.Value);
});

The 'strange results' you were getting looked like a classic case of parallel read/write conflicts. Not quite sure why you got this though. I found it strange that you looped through the list of indices, rather than the dictionary of points directly, as I have done. This might have something to do with why yours failed perhaps.

Anyway, keep your investigations coming! We're all learning through this together :)
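For reference, a minimal sketch of replacing values in a ConcurrentDictionary from a parallel loop. It uses a hypothetical Vec3 struct in place of Point3d so that it compiles without RhinoCommon; DictionaryUpdate and RaiseAll are made-up names. Assigning a whole new value through the indexer compiles and works. What does not compile for a struct value is changing a single member through the indexer, and that may possibly be the "read-only" style error mentioned in point 1, though that is a guess.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical stand-in for Rhino.Geometry.Point3d (illustration only).
public struct Vec3
{
    public double X, Y, Z;
    public Vec3(double x, double y, double z) { X = x; Y = y; Z = z; }
}

public static class DictionaryUpdate
{
    // Builds n points along X and raises each one by dz in parallel.
    public static ConcurrentDictionary<int, Vec3> RaiseAll(int n, double dz)
    {
        var pts = new ConcurrentDictionary<int, Vec3>();
        for (int i = 0; i < n; i++) pts[i] = new Vec3(i, 0, 0);

        // pts.Keys returns a snapshot, so the loop does not enumerate the
        // dictionary while it is being written to.
        Parallel.ForEach(pts.Keys, key =>
        {
            Vec3 old = pts[key];
            // pts[key].Z += dz;  // would not compile (CS1612): the indexer returns a copy
            pts[key] = new Vec3(old.X, old.Y, old.Z + dz); // replace the whole value instead
        });
        return pts;
    }
}
```

Each key is handled by exactly one iteration here, so the read-then-write on pts[key] does not race with another thread. If several threads could touch the same key, TryUpdate or AddOrUpdate would be the safer choice.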
