Grasshopper

algorithmic modeling for Rhino

Hi all,

First of all, David has already responded to some of the ideas I was planning to raise here.

With 1.0 getting close, I thought I'd come out of lurking on these forums and give a bit of feedback:

Last year a group of graduate students at McGill did this: http://web.farmmresearch.com/pavilion/

I was a member of this group and was one of the people involved in creating the Grasshopper file (see the screenshot at the bottom of the post - please excuse the spelling errors if you find them). There were effectively no drawings: only the Grasshopper script, a bit of post-processing in SolidWorks, CNC fabrication, and hand assembly using labels etched in at the time of CNC fabrication.

As you can see from the screenshot, the GH file is a hellish nightmare, right on the edge of the memory limit of Rhino 4 (which we had to use because of a minor difference in the way GH on Rhino 5 handled sweeping). Towards the end, it took about 15 minutes of computer time just to get to the point where we could work on the script, because of all the calculations needed in the stack.

Here is a list of things that would have really helped us:

1. More programming experience. I had a bit; most others didn't. David can't help too much with this in a direct way.

2. Clusters: at the time we started, clusters didn't work well; by the time they did, we were almost done - oh well. I do have some ideas about how it would be nifty for clusters to have version control, so that the master assembly could revert changes that cause other things to break. I'm sure this can be done manually right now, but I have never tried it. Could GH be plumbed into something like SVN?

3. Concurrent editing - like a video game. It would, I imagine, require some sort of cloud file storage and server solution, but there were MANY hours where 3 or more people were crowded around one computer trying to solve a problem. Allowing everyone to have their own view of the file that stayed in sync would just have been more comfortable...

4. Progress bars, and the UI and display running in a separate thread from the solver. David's addition of Esc allowed us to sometimes save ourselves when we connected a wire wrong, but often not. It would be cool to be able to interact with the Grasshopper file right away and still know that the solver was working away (especially when a single run of the file took 15 minutes). Progress bars would be nice on any component taking more than a second or two to run (which was the case with many of our components, especially the ones we wrote ourselves), but they only make sense to add after the threads for the solver and the UI have been separated.

So, anyone else done a really huge GH file? Thoughts?


Replies to This Discussion

I think another huge problem here is the difficulty of extending definitions on the canvas beyond the single line that runs from left to right; it's very annoying that we have a 2D canvas that is used almost in 1D when the definition becomes big. In my opinion, this is the result of distributing inputs and outputs from left to right on rectangular components, together with the behaviour of wires and connections (designed to match that component layout).

I'll stay tuned :)

Best Regards.

Hi Dieter,

I agree it is difficult to work on a large file on your own, let alone with multiple people. Unfortunately these are really difficult issues to solve.

  1. Yeah, not much I can do about this one. Though I suppose providing a help file which lists some useful tricks for some operations would be a good place to start.
  2. It would be possible to add persistent undo to Clusters, and it wouldn't even be that difficult. Adding undo data into the GH file is something I've been meaning to add since the first day of undo/redo, and the plumbing is in fact there, but it was never fully hooked up. I will definitely try this for GH2. And I'll also have a think about how to implement version history for clusters.
  3. Phew, my brain hurts even just to think about this. I suppose step one would be to write a clever merge algorithm for two files that have some things in common and some not. But even that will be tricky as heck.
  4. This is a major problem. First of all, running the solver in a thread and keeping the UI alive will only slow things down even more. On a file which takes 15 minutes to solve that's no big deal, but you certainly don't want to be adding a 20 millisecond delay to a solution which only takes 30 milliseconds.
    Multi-threading will be something I'm going to try and implement in GH2, but there's only so much I can do. If you run a solid boolean operation on a boatload of shapes, it's a single operation that is performed inside Rhino and there's nothing I can do to make it run on multiple threads. This is a general issue: sometimes things take a long time because there are many operations to perform, like offsetting 2500 curves. I can probably multi-thread that, provided the Rhino curve offsetter is thread-safe. However, things may also take a long time because of a single expensive operation (like the aforementioned huge solid boolean).
    Lastly, I have no way to predict how long a component is going to take. I can probably work out how far along in steps a component is, but not how far along in time.
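
As a rough illustration of the "many operations" case: farming the work out to a thread pool is conceptually simple, as in the Python sketch below, but it assumes the per-item operation really is thread-safe, which is exactly the caveat above. offset_one is a hypothetical stand-in for the real offset call, not anything in Rhino or GH.

    from concurrent.futures import ThreadPoolExecutor

    def offset_one(curve):
        # hypothetical stand-in for the real (possibly not thread-safe) offset call
        return curve

    def offset_all(curves, workers=4):
        # each curve is independent, so the work farms out naturally;
        # map() preserves input order, so results line up with their inputs
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(offset_one, curves))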

What would you do with a solver which runs in the background? How does it differ from only running solutions when you want to? Let's say the solver is threaded and the canvas remains responsive. As soon as you make a change to the GH file, the solver needs to be terminated as it is now computing stale data. Wouldn't it be just as effective to disable the solver, make all the changes you want to make, then press F5?

Just because something runs in a thread doesn't mean you can shoot it in the head any time you want without consequences. Aborting threads typically means setting a boolean somewhere and then letting the thread commit suicide, while performing all the necessary cleanup. If you just destroy a thread, there's no telling what state the memory is left in.
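
In code, that cooperative-abort pattern might look like the minimal sketch below (names are illustrative): the UI sets a flag, and the solver checks it between units of work, so it always exits at a safe point with memory in a known state.

    import threading

    cancel = threading.Event()          # the "boolean somewhere"

    def solve(jobs):
        results = []
        for job in jobs:
            if cancel.is_set():         # checked between units of work
                break                   # clean exit; no half-finished state
            results.append(job())      # one unit of work, never killed mid-way
        return results

    # from the UI thread, request termination at the next safe point:
    # cancel.set()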

I think a good place to start with these sorts of problems is to keep on improving clusters, add more flexible structuring UI such as Layers or Filters or Pages or whatever to the canvas, add ways to share data between remote parts of a file without suffocating the display with wires, and to provide easy ways to temporarily disable parts of a file (think of it as Clipping planes for GH). That way you can make local changes and see local effects before solving the entire file again.

I'm certainly impressed by the sheer extent of the file you people made, it will be a lovely test case for UI improvements.

--

David Rutten

david@mcneel.com

Tirol, Austria

In response to 3:

- Could you use Git (or GitHub) on the XML version of the GH file? The idea being that you probably want some kind of rhyme and reason to how the definition gets diffed/merged, and Git is pretty good at that (see the sketch after these bullets).

- Also, I'm quite interested in this, and I wrote a little peer-to-peer data-sharing plugin (Ortoo) that allows peers to share mesh data. I don't think there is any reason not to be able to share other kinds of native GH data as well.
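
On the first bullet, a minimal sketch of one way to get that rhyme and reason, assuming the XML flavour of the GH file (*.ghx) is what lives in the repository: re-serializing it to one element per line makes line-based git diff and merge far more meaningful. The file names here are hypothetical.

    import xml.dom.minidom

    def normalize(src, dst):
        # pretty-print so each element sits on its own line and diffs stay small
        dom = xml.dom.minidom.parse(src)
        with open(dst, "w") as f:
            f.write(dom.toprettyxml(indent="  "))

    normalize("pavilion.ghx", "pavilion.normalized.ghx")

The same normalization could presumably be hooked into Git as a clean filter, so every commit stores the canonical form.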

Cool, I guess there are two different types of concurrency possible.

Repositories like Git (with merging) would offer a lot of utility (IMHO).

Real-time collaboration - like the now-defunct Google Wave. This is the thing that makes David's brain hurt, but it would be very cool - I'm not holding my breath...

Data sharing is also very cool, although I think it addresses a different issue than problem solving in groups. I could imagine the ability to share arbitrary GH data between computers being used to split up heavy processing tasks...

1. Actually, I think you have probably taught many people some fundamentals of programming just by creating such an accessible environment - or maybe "caused them to learn" is a better way of thinking about it.

2. Nick's comment below got me thinking about unit testing for clusters. Being able to work with data flowing in from outside the cluster, or having multiple states to test against, could be really cool. Creating definitions that were valid across a general cross-section of possible input parameters was a significant issue for us. It was all too easy to write the definition as if we were drawing (often we were working from sketches) and then have it fail when the input parameters changed slightly.

4. I wasn't thinking about threading the solver itself. I was thinking along the lines of some IDEs I've seen which compile your project while you type it. I know that threading within components and at the RhinoCommon level is a freaking hard problem that has been discussed at length already. (Although when it's finished, 5-10 years from now, it will be very cool.)

Let's say the solver is threaded and the canvas remains responsive. As soon as you make a change to the GH file, the solver needs to be terminated as it is now computing stale data.

What if the solver were a little more atomic, like a server? A GH file is just a list of jobs to do, with the order of the jobs and the info needed to do them rigidly defined - right? The UI could pass the solver things to do and store the results back in the components on a component-by-component basis (I have no idea what the most efficient way to do this is in reality - I'm just talking conceptually). This might even allow running multiple solvers, so that at least the parallelism built into a given GH file could be exploited (not within components, but rather by solving non-interdependent branches of components simultaneously). This type of parallelism would more than make up for the performance hit you alluded to for separating the UI and the solver (at least for most of the definitions I write).
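
Conceptually, that job-server idea might look like the sketch below: each component is a job, and any job whose upstream dependencies have finished gets handed to a worker, so non-interdependent branches solve simultaneously. Everything here is illustrative, not anything that exists in GH.

    from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

    def solve_graph(components, upstream, run, workers=4):
        # components: list of component ids
        # upstream:   id -> set of ids whose results this component needs
        # run:        id -> zero-argument callable that computes the component
        done, running = set(), {}
        with ThreadPoolExecutor(max_workers=workers) as pool:
            while len(done) < len(components):
                for c in components:
                    # submit every component whose inputs are all computed
                    if c not in done and c not in running and upstream[c] <= done:
                        running[c] = pool.submit(run[c])
                if not running:
                    raise ValueError("cycle or missing dependency in the graph")
                finished, _ = wait(running.values(), return_when=FIRST_COMPLETED)
                for c in [k for k, f in running.items() if f in finished]:
                    del running[c]
                    done.add(c)

    # toy usage: B and C both depend only on A, so they solve simultaneously
    solve_graph(["A", "B", "C"],
                {"A": set(), "B": {"A"}, "C": {"A"}},
                {"A": lambda: None, "B": lambda: None, "C": lambda: None})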

I was imagining a couple of scenarios:

a) Writing a parallel module: the solver starts chewing away - you see it working - you know it's done 1/3 of the work - and if you have something to do at that point, you could connect to some of the already-calculated parameters and write something in parallel to the main trunk, which is still being solved.

b) Skipping modifications: you need to make a series of interventions at different intervals along a section of code. Sure, you could freeze a bit of the downstream code and make modifications so you can observe the effects more quickly, then unfreeze a bit more and repeat, etc., until you're done, and then unfreeze the big chunk at the end to make sure you haven't blown anything up. Just letting it resolve as far as it can while you sit there waiting for inspiration seems a lot more intuitive to me, though.

On a file which takes 15 minutes to solve that's no big deal, but you certainly don't want to be adding a 20 millisecond delay to a solution which only takes 30 milliseconds.

You also wouldn't notice it at that point :-) Perhaps for things where it would really make a difference, like Galapagos interactivity, it could be disabled - or could the existing "speed" setting just absorb this need? Since the vast majority of the time GH spends solving is on files under active development, not on finished code, I think qualitative performance is probably more important than quantitative performance (again, with cases like Galapagos needing to be accommodated). In our case the code only had to "work" once, since its output went to a CNC machine to make a one-off project, and it didn't really matter whether the final run took 15 seconds or 15 hours.

Lastly, I have no way to predict how long a component is going to take. I can probably work out how far along in steps a component is, but not how far along in time.

That's OK; from a user's point of view, just seeing a percentage tick along once in a while would be nice reassurance that the thing is just slow and has not, in fact, crashed. Maybe there could be two modes of display: the simple percentage version for unpredictable code, and, for those of us able to estimate the time our algorithm takes from the number of input parameters, a countdown in seconds or minutes or whatever.
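
A step-based tick could be as simple as the sketch below, where the component counts items rather than guessing time; report is a hypothetical UI callback, not an existing GH hook.

    def process_with_progress(items, work, report):
        # step-based progress: we know how many items, not how long each takes
        total = len(items)
        results = []
        for i, item in enumerate(items, start=1):
            results.append(work(item))
            if i % max(1, total // 100) == 0 or i == total:
                report(100.0 * i / total)   # percentage tick, no time estimate
        return results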

I think a good place to start with these sort of problems is to keep on improving clusters, ... etc etc

I totally agree.

I'm not entirely sure what's going on in your file, as I am at work and can't really dig deep into it, but couldn't it have been separated into smaller parts? I.e. reaching a certain stage, baking, and then continuing in a new file from that result?

Correct me if I am wrong, but big files such as the one shown in this post only occur in an academic context or for research pavilions, i.e. where the entire project is run by the same person or the same group (I'd like to see examples from the "real world"). It is still a problem as much for visualisation as for computation when you change something early in the definition.

Separating a file into smaller parts via baking and importing is a good solution, but it is a nightmare to recover the data structure afterwards, and it only works with proper naming of each baked element or group of elements (it's easy to get the name back with the Human plugin or similar via the ID of an object, and to use it to sort data or as key values).

About the horizontal extension: I have the same problem, but the main reason I lay things out that way is to use the "alt drag" tool without messing up the surroundings. "Named views" can also be useful to jump from one group to another.

The Data Dam is also very useful to avoid recomputing the entire definition when you change something in the first steps... It's just easy to forget one in the middle of a big definition.

I am really interested in the "parameter highway" inside the yellow groups in the middle of your file. How do you use it?

Very interesting topic!

...but it is a nightmare to recover the data structure...

That's why you'd use [Geometry Cache]. It bakes and re-imports while maintaining data structures.

--

David Rutten

david@mcneel.com

Tirol, Austria

The parameter highway was inspired by computer buses (when you look at a motherboard and see a whole bunch of parallel traces, chances are it's a bus). In fact, we called it "the bus". I really don't like invisible wires, or wires that cross the canvas randomly, so we adopted the bus for parameters that were used in multiple places or had to cross large distances. I use the "Data" component to route things around for this purpose. Some sort of special graphic object that could keep itself organized and named, and ease tracing for debugging, would be nifty, but it's probably pretty low on DR's priorities at the moment.

When we started, GH was in the 0.6.x stage, so there was only geometry baking, not state saving directly into the Grasshopper file like there is now.

We could have baked the geometry, but that would have destroyed the complex way we structured our data to keep everything straight. Also, the design process was not linear, and since all the detailing was encoded in code rather than drawings, we often had to go back to earlier points in the code to revise things: thus even state saves would have had limitations for us.

There is also something conceptually clean, to me, about the whole thing generating one last time before it goes into production.

Lastly, there were points where some things were baked out, and these things caused enormous problems for us because we had to accommodate them in later downstream code (thus I would strongly recommend against baking out anything from a project you intend to fabricate until your program is finished, tested, and fully debugged).

I figured as much. Baking does defeat the purpose of the parametric algorithm, but it is sometimes a necessary evil...

Another option might be to divide the pavilion into different modules, although looking at the generated form, I am unsure how that would work.

In light of David's comments, having a huge file that is increasingly difficult in terms of time and workability might be the only option at this point for certain designs.

...thus even state saves would have had limitations for us.

How about being able to save data to a file, rather than bake it? First off, you could store all sorts of data, not just geometry, and I could make sure all the data trees remain intact.

Let's say I introduce a file-type called *.ghcache, and this file is written by a component with a flexible number of inputs. Then you can import that cache from any other GH file and get all your data as outputs again. And of course you can generate more than one cache file, in case you want to 'save' some solutions without having to recompute them all the time if you want to switch between them.
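
For what it's worth, such a cache might conceptually amount to little more than serializing branch paths with their items, as in the sketch below. The format and names are purely illustrative, not a proposal for the actual *.ghcache layout.

    import pickle

    def write_cache(path, trees):
        # trees: {output_name: {branch_path_tuple: [items]}}, mirroring data trees
        with open(path, "wb") as f:
            pickle.dump(trees, f)

    def read_cache(path):
        with open(path, "rb") as f:
            return pickle.load(f)   # branch paths and item order come back intact

    # stash one solution, reload it later without recomputing:
    write_cache("solution_v1.ghcache", {"A": {(0,): [1.0, 2.0], (1,): [3.0]}})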

It wouldn't be particularly difficult to implement, but I do want to know if it solves an actual problem before I start doing this.

--

David Rutten

david@mcneel.com

Tirol, Austria
