Benchmarks comparing Fractal Domains 2.0b5 and 2.0

Fractal Domains v2.0 detects the number of available cores and allocates one calculation thread per core, so that calculations run in parallel. This improves performance on all dual-CPU PPC Macs and almost all Intel Macs (I cannot say "all" because Apple briefly sold a single-core Intel Mac mini).
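
To illustrate the one-thread-per-core idea, here is a minimal sketch in Python. It is not Fractal Domains code: the function names, the row-band partitioning, and the Mandelbrot stand-in calculation are all assumptions made for the example. Note also that in standard CPython, threads running pure-Python arithmetic do not actually execute in parallel, so this shows how the work is divided rather than reproducing the speed-up.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def split_rows(height, workers):
    """Split rows 0..height-1 into `workers` contiguous bands of near-equal size."""
    base, extra = divmod(height, workers)
    bands, start = [], 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)
        bands.append(range(start, start + size))
        start += size
    return bands

def escape_time(c, max_iter=50):
    """Mandelbrot escape-time count for the point c (a stand-in calculation)."""
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return n
    return max_iter

def render_band(rows, width, height):
    """Compute every pixel in one band of rows."""
    return [
        [escape_time(complex(-2.0 + 3.0 * x / width, -1.0 + 2.0 * y / height))
         for x in range(width)]
        for y in rows
    ]

def render(width, height, workers=None):
    """Render the image with one worker thread per detected core (hypothetical helper)."""
    workers = workers or os.cpu_count() or 1
    bands = split_rows(height, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda rows: render_band(rows, width, height), bands)
        # Bands come back in order, so concatenating them rebuilds the image.
        return [row for part in parts for row in part]
```

Calling render(640, 480) would use as many workers as os.cpu_count() reports. The result does not depend on the worker count; only the way the rows are divided changes.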

I ran some informal benchmarks on the "Christmas Tree" fractal (the parameter file can be found on the web site under the Fractal of the Week for December 16, 2006). I opened the parameter file itself and noted the generation time in the Statistics window. I then rendered the same image with 4x4 anti-aliasing. The following results were achieved on the available machines:

Note:
FD 2.0b5 was compiled with CodeWarrior and is single-threaded.
FD 2.0 is compiled with gcc 4.0 and is multi-threaded.

PowerMac MDD Dual G4 1GHz
FD 2.0b5: No Anti-Alias, 36 s; With Anti-Alias, 9 m 33 s (573 s)
FD 2.0: No Anti-Alias, 18 s; With Anti-Alias, 5 m 32 s (332 s)

MacBook Pro (Core Duo 1.67 GHz)
FD 2.0b5: No Anti-Alias, 37 s; With Anti-Alias, 9 m 57 s (597 s)
FD 2.0: No Anti-Alias, 6.7 s; With Anti-Alias, 2 m 1 s (121 s)

Mac Pro (Xeon 2.66 GHz x 4)
FD 2.0b5: No Anti-Alias, 19 s; With Anti-Alias, 5 m 9 s (309 s)
FD 2.0: No Anti-Alias, 2.4 s; With Anti-Alias, 45 s

The results are shown below in graphical form. After the graphs I have some additional remarks.

[Bar graph: elapsed time to render fractal image (shorter is better)]

Let's look for a moment at the results for G4 only -- since both the old and new FD (Fractal Domains) run natively there, we can compare "apples to apples" so to speak and see the effects of the implementation of multithreading.

For the "no anti-aliasing" case, the ratio of rendering times for FD 2.0b5 and FD 2.0 is 36 seconds to 18 seconds, or exactly two to one. This is what you would expect, since 2.0 uses two calculation threads where 2.0b5 uses one, and the computer has two processors.

For the "anti-aliasing" case, however, the ratio of 573 s to 332 s is only about 1.7 to 1.
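
The two G4 ratios can be checked with a few lines of Python. The "efficiency" figure (speed-up divided by number of processors) is my own framing for this sketch and is not a number reported by Fractal Domains; the timings are the ones listed above.

```python
def speedup(old_seconds, new_seconds):
    """How many times faster the new version is than the old one."""
    return old_seconds / new_seconds

def efficiency(old_seconds, new_seconds, cores):
    """Speed-up as a fraction of the ideal one-per-core speed-up."""
    return speedup(old_seconds, new_seconds) / cores

# Dual G4 timings from the table: (FD 2.0b5 seconds, FD 2.0 seconds)
g4_plain = speedup(36, 18)
g4_aa = speedup(573, 332)

print(f"no anti-alias: {g4_plain:.2f}x, efficiency {efficiency(36, 18, 2):.0%}")
print(f"anti-alias:    {g4_aa:.2f}x, efficiency {efficiency(573, 332, 2):.0%}")
# prints:
# no anti-alias: 2.00x, efficiency 100%
# anti-alias:    1.73x, efficiency 86%
```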

In both cases, optimum performance is not actually being achieved, because Fractal Domains is based on a framework in which threading was inherently "cooperative", following the old pre-OS X way of doing things. Preemptive threads were added for computation, but the design could not be made entirely preemptive because of the way the old framework is built. Supporting this old design therefore carries some overhead, and that overhead happens to have a greater effect in the code that implements anti-aliased rendering.

Moving on to the Intel machines, we can see that for the MacBook Pro, non-anti-alias case, the ratio is 37 s to 6.7 s, or 5.5 to 1. The speed-up is due to two factors, because 2.0b5 is handicapped both by being single-threaded and by needing to run in emulation under Rosetta. If we assume that the speed-up due to multi-threading is 2 to 1, there is an additional speed-up of about 2.8 to 1 due to running natively on the Intel processor.
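
The split into a threading factor and a native-code factor is just a division, sketched below. The 2 to 1 threading factor is the assumption stated above, not a measured value, and the function name is made up for the example.

```python
def native_factor(old_seconds, new_seconds, threading_factor):
    """Speed-up left over after dividing out the assumed threading factor."""
    return (old_seconds / new_seconds) / threading_factor

# MacBook Pro, no anti-aliasing: 37 s under Rosetta vs. 6.7 s native
total = 37 / 6.7
native = native_factor(37, 6.7, threading_factor=2.0)

print(f"total {total:.2f}x = threading 2.00x * native {native:.2f}x")
# prints: total 5.52x = threading 2.00x * native 2.76x
```

If the real threading factor on this machine were below 2, the share attributed to running natively would be correspondingly larger.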

On the Mac Pro, non-anti-alias case, the ratio is 19 s to 2.4 s, or 7.9 to 1. Although the Mac Pro has twice as many cores as the MacBook Pro, this ratio did not itself double. This is due to the limitations of the old design, as mentioned above, which prevent Fractal Domains from driving all cores close to 100%. In Activity Monitor, which shows the CPU usage of individual processes, Fractal Domains 2.0 never exceeds 300% (the maximum for a Mac Pro would be 4 x 100% = 400%).
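
As a rough check on how far the ratio falls short of doubling, the sketch below compares the two Intel speed-ups and expresses the observed CPU ceiling as a fraction of the maximum. It uses only the figures quoted above; the variable names are mine, and the comparison ignores the clock-speed difference between the two machines.

```python
# Speed-up of FD 2.0 over FD 2.0b5, no anti-aliasing
macbook_pro = 37 / 6.7   # 2 cores
mac_pro = 19 / 2.4       # 4 cores

# How much the ratio grew when the core count doubled (2.0 would be "doubled")
scaling = mac_pro / macbook_pro

# Peak CPU usage seen in Activity Monitor relative to the 4-core maximum
utilization = 300 / 400

print(f"ratio grew by {scaling:.2f}x; peak CPU utilization {utilization:.0%}")
# prints: ratio grew by 1.43x; peak CPU utilization 75%
```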

Note that for the anti-alias case the speed-up of 2.0 over 2.0b5 is 4.9 to 1 for the MacBook Pro and 6.9 to 1 for the Mac Pro. As in the case of the G4, the additional oomph from multiple processors is somewhat less for anti-alias rendering.

The redesigned program Fractal Domains X will not suffer from these limitations and should achieve higher absolute performance numbers and higher CPU utilization rates. This will be especially important for the anti-alias case where you need the extra speed the most.

Finally, we can look at the performance of the G4 vs. Intel, but this is not very meaningful since we are comparing a four-year-old computer to the latest and greatest Intel chips. It would be more interesting to run the benchmark on a dual G5 tower, but unfortunately I don't have one available at the time of this writing.

Since FD 2.0b5 had almost identical rendering times on the dual 1 GHz G4 and the Core Duo MacBook Pro, and roughly half that rendering time on the Mac Pro, you could at least conclude that for someone with a machine in the class of the dual 1 GHz G4, running PowerPC-only apps on a MacBook Pro will not result in much of a slowdown, and running them on a Mac Pro will actually be faster.

Even this conclusion is probably not justified based on this one benchmark, since rendering fractals is far from an ordinary application -- it is much more CPU-intensive than most tasks and generally does very little in the way of memory access. Of course, the latter fact would actually favor the G4, since it suffers a large handicap compared to the newer chips in accessing external memory, and Fractal Domains doesn't make it do that much compared with more typical applications.



Copyright © 1998-2010 Fractal Domains. All Rights Reserved.
info/benchmarks.html last modified 05:47:46 13-Feb-2007