I want to take advantage of all the unused desktops that my company has sitting around at night. I'm searching for a way to create something like a render farm or cluster computing through GH.
When using GH as a translator to run multiple iterations through 3rd party programs like Radiance or E+ (honeybee modules), the calc doesn't live in GH/rhino. GH simply calls these programs. Unfortunately, every iteration is called in series. Once one ends, the next begins. I'd like to send these iterations to multiple computers on a server so that I can run in parallel and reduce my wall-clock time.
I'd like to generate all 500 batch files from one machine, then somehow call those routines on other computers on my network, say 100 runs on 5 machines. Or 5 runs on 100 machines (even better!). I hope this is possible because we are not splitting any single computation between multiple machines. We only want to call unique runs across a network.
Has anyone explored this? If we figure this out, companies or students could take advantage of all the unused desktops in offices or computer labs to calculate large sets overnight. Which would be amazing.
Thanks for your help.
Sarith Subramaniam and Dr. Mistrick from Penn State had some great input via email. I've copied their response below.
Dr. Mistrick and I had a discussion about your email. Here are a few things to consider:
Finally, unless there is a pre-existing platform to handle such parallel processing, some scripting effort would be required to direct calculation files outwards into different systems/processors and then fetch and consolidate results from those calculations into a single location and then visualize those results on an interface like Mostapha’s Design Explorer.
Identify the aspect of calculations that consumes the most amount of time and resources:
Based on past projects, I expect the following computation time breakdown. This all depends on how complex the models are. If we are running multi-room E+ studies, that will take far longer to calculate.
My first instinct is to avoid this problem by running GH on one computer only. Creating the batch files is very fast. The trick will be sending the Radiance and E+ batch files to multiple computers. Perhaps a “round-robin” approach could send each iteration to the next node on the network until all iterations are assigned. I have no idea how to do that, but I hope it is something that can be executed within Grasshopper, perhaps through a custom code module. I think GH can set the directory where Radiance and E+ save all final files. We can point this to a local server location so that all runs output to the same place. It will likely run slower than it would on the C: drive, but those losses are acceptable if we can get parallelization to work.
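To make the round-robin idea concrete, here is a small Python sketch. It only decides which batch file goes to which machine; it does not send anything over the network. The file names (run_000.bat and so on) and node names (node1 and so on) are made up for illustration, and the 500 runs on 5 machines are the numbers from the original question.

```python
from itertools import cycle


def assign_round_robin(batch_files, nodes):
    """Assign each batch file to the next node in turn, wrapping around."""
    if not nodes:
        raise ValueError("at least one node is required")
    assignments = {node: [] for node in nodes}
    # cycle() repeats the node list forever; zip() stops when the files run out.
    for batch_file, node in zip(batch_files, cycle(nodes)):
        assignments[node].append(batch_file)
    return assignments


# Hypothetical names: 500 batch files spread over 5 machines.
batch_files = ["run_{:03d}.bat".format(i) for i in range(500)]
nodes = ["node{}".format(i) for i in range(1, 6)]
plan = assign_round_robin(batch_files, nodes)
```

A fixed assignment like this is simple, but it cannot rebalance if one machine is slower than the others or gets switched off overnight.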
I’m concerned about post-processing of the Radiance/E+ runs. For starters, Honeybee calculates DA after it runs the .ill files. This doesn’t take very long, but it is a separate process that is not included in the original Radiance batch file. Any other data manipulation we intend to automatically run in GH will be left out of the batch file as well. Consolidating the results into a format that Design Explorer or Pollination can read also takes a bit of post-processing. So, it seems to me that we may want to split up the GH automation as follows:
The above workflow avoids having to parallelize GH. The consequence is that we can’t parallelize any post-processing routines. This may be easier to implement in the short term, but long term we should try to parallelize everything.
I agree that the best way to enable large numbers of iterations is to set up multiple unique runs of Radiance and E+ on separate computers. I don’t see the incentive to split individual runs between multiple processors because the modular nature of the iterative parametric models does this for us. Multiple unique runs will simplify the post-processing as well.
It seems that the advantages of optimizing matrix-based calculations (3-5 phase methods) are most beneficial when iterations are run in series. Is it possible for multiple iterations running on different CPUs to reference the same matrices stored in a common location? Will that enable parallel computation to also benefit from reusing pre-calculated information?
Clustering computers and GPU based calculations:
Clustering unused computers seems like a natural next step for us. Our IT guru told me that we need some kind of software to make this happen, but he didn’t know what that would be. Do you know what Penn State uses? You mentioned it is a text-only, Linux-based system. Can you please elaborate so I can explain it to our IT department?
Accelerad is a very exciting development, especially for rpict and annual glare analysis. I’m concerned that the high-quality GPUs required might limit our ability to implement it on a large scale within our office. Does it still work well on standard GPUs? The computer cluster method can tap into resources we already have, which is a big advantage. Our current workflow uses image-based calcs sparingly, because grid-based simulations gather the critical information much faster. The major exception is glare. Accelerad would make luminance-based glare metrics, especially annual glare metrics, more feasible within fast-paced projects. All of that is a good thing.
So, both clusters and GPU-based calcs are great steps forward. Combining both methods would be amazing, especially if it is further optimized by the computational methods you are working on.
Moving forward, I think I need to explore if/how GH can send iterations across a cluster network of some kind and see what it will take to implement Accelerad. I assume some custom scripting will be necessary.
Hi all and sorry for being late to this discussion.
One thing to remember here is that, until we implement the 3-phase method, annual analyses run using Daysim, which introduces limitations in using multiple CPUs.
If you're fine with a little bit of coding, a quick solution that I have used before is to use the shared drive in the office and ask people to run a batch file on their systems in the background (which can be automated). The way it works is that the script generates all the batch files in the same folder. The background batch files on the other systems then look for newly available batch files and execute them. You need to manage the results so that they are saved in different folders. It's similar to how computer clusters work. You can read more about it in this paper. I used this approach to run annual daylight simulations for a master plan project overnight: http://www.ibpsa.org/proceedings/asim2012/0097.pdf
For long-term solutions, I hope the OpenStudio team sets up their cloud-based services generically enough that we can also use them for Honeybee.
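A minimal sketch of the worker side of this shared-folder approach, written in Python rather than as a batch file. It is not the script from the paper: the folder layout, the function names, and the claim-by-rename trick are all assumptions for illustration. Each worker claims a batch file by moving it into its own results folder (named after the host), so two machines should not pick up the same run, provided a rename on the shared drive either fully succeeds or fails.

```python
import os
import socket
import subprocess
import time


def claim_next(queue_dir, claimed_dir):
    """Try to claim one pending .bat file by moving it; return its new path or None."""
    for name in sorted(os.listdir(queue_dir)):
        if not name.lower().endswith(".bat"):
            continue
        src = os.path.join(queue_dir, name)
        dst = os.path.join(claimed_dir, name)
        try:
            os.rename(src, dst)  # raises OSError if another worker moved it first
        except OSError:
            continue
        return dst
    return None


def run_batch(path):
    """Run a Windows batch file from its own folder (assumes cmd.exe is available)."""
    return subprocess.call(["cmd", "/c", path], cwd=os.path.dirname(path))


def run_worker(queue_dir, results_root, run=run_batch,
               poll_seconds=30, max_idle_polls=None):
    """Claim and run batch files; stop after max_idle_polls empty polls in a row.

    With max_idle_polls=None the worker keeps polling forever.
    """
    claimed_dir = os.path.join(results_root, socket.gethostname())
    os.makedirs(claimed_dir, exist_ok=True)
    idle = 0
    done = []
    while max_idle_polls is None or idle < max_idle_polls:
        path = claim_next(queue_dir, claimed_dir)
        if path is None:
            idle += 1
            time.sleep(poll_seconds)
            continue
        idle = 0
        run(path)
        done.append(path)
    return done
```

On each office machine this could be started with something like run_worker(r"\\server\share\queue", r"\\server\share\results"), where both paths are placeholders. Unlike a fixed assignment, faster machines simply end up claiming more runs.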
Hi! Interesting discussion.
I'm about to implement a clustering solution using Octopus E, some Excel writers/readers and Flux.io.
Each generation of chromosomes is going to be pushed to the cloud through Flux.io, where the slave computers read it and simulate the parameters contained in the chromosomes. I will try to write a function that distributes these values across the cluster so that no two machines calculate the same input, and so that when a simulation completes, the corresponding chromosome is removed from the list and added to a list of solutions. In a subsequent step, when the list of solutions is full, these values are fed to the genetic algorithm, then mutated and crossed over into a new set of chromosomes to be checked, and the cycle continues.
Maybe I'm stating the obvious here. Or being naive in my intentions. :) Though Flux.io was not very well known when the posts here were written.
Best of luck and please share your progress, and I'll let you know how my system works out.
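The bookkeeping described in this post (hand each chromosome to exactly one machine, move it to a solutions list when done, and start the next generation once that list is full) can be sketched in Python as below. This is only an in-memory illustration under stated assumptions: it does not talk to Flux.io or Octopus, the class and method names are invented, and chromosomes are assumed to be unique tuples of parameter values.

```python
import threading


class GenerationQueue:
    """Tracks one generation: hands out each chromosome once and collects results."""

    def __init__(self, chromosomes):
        self._lock = threading.Lock()
        self._pending = list(chromosomes)
        self._in_progress = {}   # chromosome -> worker name
        self._solutions = []     # (chromosome, fitness) pairs
        self._size = len(self._pending)

    def claim(self, worker):
        """Give the next unclaimed chromosome to a worker, or None if none are left."""
        with self._lock:
            if not self._pending:
                return None
            chromosome = self._pending.pop(0)
            self._in_progress[chromosome] = worker
            return chromosome

    def complete(self, chromosome, fitness):
        """Move a finished chromosome from in-progress to the solutions list."""
        with self._lock:
            del self._in_progress[chromosome]
            self._solutions.append((chromosome, fitness))

    def is_full(self):
        """True once every chromosome of this generation has a result."""
        with self._lock:
            return len(self._solutions) == self._size

    def solutions(self):
        """Return a copy of the collected (chromosome, fitness) pairs."""
        with self._lock:
            return list(self._solutions)
```

Once is_full() returns True, the solutions would be passed to the genetic algorithm to produce the next set of chromosomes, and a new queue would be created for that generation.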
Hello, Ludvig. I have the intention to do the very same thing that you have explained above. Did you manage to make it work?
Hi. This has been a very interesting read.
I think that you need to start using systems that support concurrent processing in the cloud for these batch processes, if generating the Radiance .ill files seems to be the issue here. Do the batch files need to be run sequentially, or can they be run concurrently?
If they can be run concurrently, you would be able to define their dependencies and package them into a container to run on multiple nodes in a cloud compute cluster, such as on AWS or Google Cloud. This would also enable you to scale resources according to how much money you want to spend.
Do you think you could share a diagram of the typical workflow here? Maybe with details of the runtime requirements?
If you are running Radiance on Unix or OS X, even multiple .ill files aren't required, as most (if not all) of the ray-tracing components within Radiance support parallel processes.
To answer your question about concurrency, batch files can be run concurrently, and it is something that Mostapha has already implemented in Honeybee. The primary dependencies for running Radiance are just its binary and library files, which can be loaded into any environment using the PATH and (custom) RAYPATH variables, respectively. In the highly unlikely scenario that the user cannot set their environment variables (or set their .bashrc or .profile file in Unix), it is also possible to hardcode the pathnames, with something like /home/user/radiance/bin/oconv geometry.rad > geo.oct.
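As a rough illustration of the two options above, the Python sketch below builds an environment with PATH and RAYPATH set for a child process and then calls oconv by its full path. The helper names are invented, and the library folder (/home/user/radiance/lib in the usage note) is an assumed location; only the bin path and the oconv command come from the post.

```python
import os
import subprocess


def radiance_env(bin_dir, lib_dir, base_env=None):
    """Return a copy of the environment with Radiance's bin and lib folders added."""
    env = dict(os.environ if base_env is None else base_env)
    # Put the Radiance binaries first on PATH for anything the child process spawns.
    env["PATH"] = bin_dir + os.pathsep + env.get("PATH", "")
    # RAYPATH tells Radiance where to look for its library files.
    env["RAYPATH"] = "." + os.pathsep + lib_dir
    return env


def build_octree(bin_dir, rad_file, oct_file, env):
    """Equivalent of: <bin_dir>/oconv geometry.rad > geo.oct"""
    oconv = os.path.join(bin_dir, "oconv")  # hardcoded path, as in the fallback above
    with open(oct_file, "wb") as out:
        subprocess.check_call([oconv, rad_file], stdout=out, env=env)
```

For example, env = radiance_env("/home/user/radiance/bin", "/home/user/radiance/lib") followed by build_octree("/home/user/radiance/bin", "geometry.rad", "geo.oct", env). Because the environment is passed per call, nothing has to be changed in the user's .bashrc or system settings on each machine.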
Having run Radiance on Unix (workstations and remote clusters/cloud) and Windows systems for the past 3-4 years, the biggest issue to me is that of portability between different operating system environments.