For those of you interested in grid computing I found an older, but great post about scalability of ec2 for grid based applications. The thing that caught my eye was the final test using Windows HPC and Velocity. The tests were not comparable to each other, but the final test shows how much degradation you suffer when you're data is stored away from your computations. In there tests 31x reduction in performance when your data is stored "out of the cloud". I think this really shows the importance for good redundant storage at the point of computation.
http://highscalability.com/your-cloud-scalable-you-think-it
The good news is for GridGain is the near linear scalability up to 512 nodes in pure CPU tests. Not as high as 2000 nodes for Hadoop, but that's the only real numbers I've seen anywhere on it. Does hint that GridGain's network overhead is really pretty light.
3/12/2009
Grid Computing: Intro To GridGain Talk is Online
I finally got some time to put up the slides, and source code for my talk I gave at the Devnexus conference in Atlanta. Here is the link to the slides, and the source code is here.
Labels:
devnexus,
gridgain,
java,
presentation
2/21/2009
Actionscript and Concurrency (III of III)
In the previous article we covered techniques for breaking up our long running job, but the performance was 40x slower than if we just ran our algorithm straight out. The problem is our algorithm spends very little time doing work, and a lot of time waiting for the next frame. Actionscript's performance is really quite high. We need to increase the time spent running our algorithm and minimize the time we spend doing nothing. We can do that by doing many iterations per frame instead of just one. Using getTimer() we measure how much time we spent looping and back off right before the next frame. Let's look at the code:
public function start() : void {
Application.application.addEventListener( Event.FRAME_ENTER, onCycle );
}
private function onCycle( event : Event ) : void {
var cycle : Boolean = true;
var start : Number = getTimer();
var milliseconds = 1000 / Application.application.stage.frameRate - DELTA;
while( cycle && (getTimer() - start) < milliseconds ) {
cycle = doLongWork();
}
if( cycle == false ) {
Application.application.removeEventListener( Event.FRAME_ENTER, doLongWork );
}
}
public function doLongWork() : Boolean {
// do some work
i++;
return i < total;
}
Now we broken up our algorithm into an extra method. First the start() method which we're already seen. The new method is the onCycle which calculates how long a frame is in milliseconds. The loop continues until either the doLongWork method returns false, or we run out of time. Notice the DELTA constant is some constant that keeps us from eating up the entire frame. We need to give a little breathing room for Flash to drain the queue. Notice how our doLongWork method is just the code pertaining to our job. This makes it's easier to build a general purpose solution that we can reuse.
We can't use true OS threads in Actionscript, but any language can emulate threads. This technique is often called Green Threads. Lots of languages have used this in the past. Threads in Ruby are still green, and early versions of Java were green as well. Now Actionscript can too. I should pause and give credit to Drew Cummins who implemented a version of this for Flash player 10. I've rewritten this to remove the dependency of Flash 10, and changed some of the API so event dispatch is more natural, added easy progress events, and optional progress tracking. Let's see how our Mandelbrot algorithm changes when we use this.
In order to use GreenThreads create a subclass of GreenThread, override run method, and optionally override initialize method to add code that runs at the start. Here is an example:
public class Mandelbrot extends GreenThread {
private var _bitmap : BitmapData;
private var _maxIteration : uint = 100;
private var _realMin : Number = -2.0;
private var _realMax : Number = 1.0;
private var _imaginaryMin : Number = -1.0;
private var _imaginaryMax : Number = 1.0;
private var _shader : Shader;
private var _realStep : Number;
private var _imaginaryStep : Number;
private var screenx : int = 0;
private var screeny : int = 0;
override protected function initialize( ) : void {
_bitmap = new BitmapData( width, height, false, 0x020202 );
screenx = screeny = 0;
_realStep = (_realMax - _realMin) / Number(_bitmap.width);
_imaginaryStep = ( _imaginaryMax - _imaginaryMin ) / Number( _bitmap.height );
}
override protected function run():Boolean {
if( screenx > _bitmap.width ) {
screenx = 0;
screeny++;
}
if( screeny < _bitmap.height ) {
var x : Number = screenx * _realStep + _realMin;
var y : Number = screeny * _imaginaryStep + _imaginaryMin;
var x0 : Number = x;
var y0 : Number = y;
var iteration : int = 0;
while( x * x + y * y <= (2 * 2) && iteration < _maxIteration ) {
var xtemp : Number = x * x - y * y + x0;
y = 2 * x * y + y0;
x = xtemp;
iteration = iteration + 1;
}
if( iteration == _maxIteration ) {
_bitmap.setPixel( screenx, screeny, 0x000000 );
} else {
_bitmap.setPixel( screenx, screeny, shader.lookup( Number(iteration) / Number(maxIteration) ) );
}
screenx++;
return true;
} else {
return false;
}
}
}
The run() method is the body our of loop. The intialize() method is called once after the user calls the start() method. After that run() method is called repeatedly until it returns false. It's perfectly acceptable to call start() more than once to kick off the thread again after it's finished. That means you can calculate the Mandelbrot set from different zoom levels without needing to recreate new instances. The initialize() method will be called every time start() is called. Check out the results here.
You can also add optional progress tracking by setting maximum, and progress members. This will automatically dispatch ProgressEvents so that your instance can be a source to a ProgressBar. It makes tracking your job easy. GreenThread also subclasses EventDispatcher so you can dispatch events from within the run method.
By in large we've solved the performance problems or we've gotten very close. What's holding us back is the resolution of getTimer(). Since we only have precision of millisecond we really can't run the risk of going smaller than 1 millisecond for our DELTA. That costs us a few iterations on our run() method which can make a difference over 1000 cycles. We could be a full second behind Actionscript that just ran the job straight through. There are a few things we can do to squeeze a little more performance out of GreenThreads.
Frame rate governs everything we do, and by default Flex applications run at 24 frames/s, but really most Flex applications don't do that much animation so if we dropped the frame rate in half to 12 frames/s we would be able to run for longer periods uninterrupted. The fewer interruptions we have, the faster we'll go.
GreenThreads also allows you to configure how much of the frame's time you dedicate to running your thread. By default it's set at 0.99 that roughly leaves 1 ms to update the UI. Under some experimentation this has proven to work quite well without creating lots of timeouts, but if you want to tweak it just provide a new value in the start method like so:
public function go() : void {
start( 0.5 );
}
If the delta is less than 1 then it means a percentage of the length of a frame. If it's >=1 then it means the number of milliseconds to subtract from the length of a frame. Some more thought needs to go into this so that as you run your application on different machines with different CPUs so the pause is appropriate for the CPU. In the future it might need to be dynamically adjusted as the algorithm runs.
GreenThreads supports runtime statistics for tracking your job. To turn on thread statistics pass true to the GreenThread constructor. Thread statistics collects total time the job took, number of timeouts, min and max iteration times, average time a single iteration took, how many cycles it took, etc. There is a fair amount of information that can be gathered to help tune your thread. You can access that information by doing the following:
public class SomeJob extends GreenThread {
public function SomeJob() {
super( true ); // turn on debug statisitics
addEventListener( Event.COMPLETE, function( event : Event ) : void {
trace( statistics.print() );
});
}
}
There are some drawbacks to doing concurrency this way. One is algorithm have to be cooperative, and stop processing in the middle to let Flash do its thing. That means your algorithm normally have to be rewritten to conform with this approach. That can be particularly difficult for recursive algorithms. There needs to be more research done into how you might fix this with the callLater() technique. The biggest draw back is that we cannot take advantage of multi-processors. For all the code you write Flash runs on a single OS thread. This is a serious disadvantage for us going forward because as Actionscript developers we cannot access boosts in hardware performance as cores are added.
It's been a lot of information but hopefully you now understand the theory behind concurrency in Actionscript, and you have a new library that helps you optimize your code. You can access the source code here, and download the GreenThread's library here. I look forward to hearing about what sorts of long running jobs you create.
Full source code of the Mandelbrot set is here.
Download GreenThread's library here.
public function start() : void {
Application.application.addEventListener( Event.FRAME_ENTER, onCycle );
}
private function onCycle( event : Event ) : void {
var cycle : Boolean = true;
var start : Number = getTimer();
var milliseconds = 1000 / Application.application.stage.frameRate - DELTA;
while( cycle && (getTimer() - start) < milliseconds ) {
cycle = doLongWork();
}
if( cycle == false ) {
Application.application.removeEventListener( Event.FRAME_ENTER, doLongWork );
}
}
public function doLongWork() : Boolean {
// do some work
i++;
return i < total;
}
Now we broken up our algorithm into an extra method. First the start() method which we're already seen. The new method is the onCycle which calculates how long a frame is in milliseconds. The loop continues until either the doLongWork method returns false, or we run out of time. Notice the DELTA constant is some constant that keeps us from eating up the entire frame. We need to give a little breathing room for Flash to drain the queue. Notice how our doLongWork method is just the code pertaining to our job. This makes it's easier to build a general purpose solution that we can reuse.
Green Threads
We can't use true OS threads in Actionscript, but any language can emulate threads. This technique is often called Green Threads. Lots of languages have used this in the past. Threads in Ruby are still green, and early versions of Java were green as well. Now Actionscript can too. I should pause and give credit to Drew Cummins who implemented a version of this for Flash player 10. I've rewritten this to remove the dependency of Flash 10, and changed some of the API so event dispatch is more natural, added easy progress events, and optional progress tracking. Let's see how our Mandelbrot algorithm changes when we use this.
In order to use GreenThreads create a subclass of GreenThread, override run method, and optionally override initialize method to add code that runs at the start. Here is an example:
public class Mandelbrot extends GreenThread {
private var _bitmap : BitmapData;
private var _maxIteration : uint = 100;
private var _realMin : Number = -2.0;
private var _realMax : Number = 1.0;
private var _imaginaryMin : Number = -1.0;
private var _imaginaryMax : Number = 1.0;
private var _shader : Shader;
private var _realStep : Number;
private var _imaginaryStep : Number;
private var screenx : int = 0;
private var screeny : int = 0;
override protected function initialize( ) : void {
_bitmap = new BitmapData( width, height, false, 0x020202 );
screenx = screeny = 0;
_realStep = (_realMax - _realMin) / Number(_bitmap.width);
_imaginaryStep = ( _imaginaryMax - _imaginaryMin ) / Number( _bitmap.height );
}
override protected function run():Boolean {
if( screenx > _bitmap.width ) {
screenx = 0;
screeny++;
}
if( screeny < _bitmap.height ) {
var x : Number = screenx * _realStep + _realMin;
var y : Number = screeny * _imaginaryStep + _imaginaryMin;
var x0 : Number = x;
var y0 : Number = y;
var iteration : int = 0;
while( x * x + y * y <= (2 * 2) && iteration < _maxIteration ) {
var xtemp : Number = x * x - y * y + x0;
y = 2 * x * y + y0;
x = xtemp;
iteration = iteration + 1;
}
if( iteration == _maxIteration ) {
_bitmap.setPixel( screenx, screeny, 0x000000 );
} else {
_bitmap.setPixel( screenx, screeny, shader.lookup( Number(iteration) / Number(maxIteration) ) );
}
screenx++;
return true;
} else {
return false;
}
}
}
The run() method is the body our of loop. The intialize() method is called once after the user calls the start() method. After that run() method is called repeatedly until it returns false. It's perfectly acceptable to call start() more than once to kick off the thread again after it's finished. That means you can calculate the Mandelbrot set from different zoom levels without needing to recreate new instances. The initialize() method will be called every time start() is called. Check out the results here.
You can also add optional progress tracking by setting maximum, and progress members. This will automatically dispatch ProgressEvents so that your instance can be a source to a ProgressBar. It makes tracking your job easy. GreenThread also subclasses EventDispatcher so you can dispatch events from within the run method.
By in large we've solved the performance problems or we've gotten very close. What's holding us back is the resolution of getTimer(). Since we only have precision of millisecond we really can't run the risk of going smaller than 1 millisecond for our DELTA. That costs us a few iterations on our run() method which can make a difference over 1000 cycles. We could be a full second behind Actionscript that just ran the job straight through. There are a few things we can do to squeeze a little more performance out of GreenThreads.
Frame rate governs everything we do, and by default Flex applications run at 24 frames/s, but really most Flex applications don't do that much animation so if we dropped the frame rate in half to 12 frames/s we would be able to run for longer periods uninterrupted. The fewer interruptions we have, the faster we'll go.
GreenThreads also allows you to configure how much of the frame's time you dedicate to running your thread. By default it's set at 0.99 that roughly leaves 1 ms to update the UI. Under some experimentation this has proven to work quite well without creating lots of timeouts, but if you want to tweak it just provide a new value in the start method like so:
public function go() : void {
start( 0.5 );
}
If the delta is less than 1 then it means a percentage of the length of a frame. If it's >=1 then it means the number of milliseconds to subtract from the length of a frame. Some more thought needs to go into this so that as you run your application on different machines with different CPUs so the pause is appropriate for the CPU. In the future it might need to be dynamically adjusted as the algorithm runs.
Thread Statistics
GreenThreads supports runtime statistics for tracking your job. To turn on thread statistics pass true to the GreenThread constructor. Thread statistics collects total time the job took, number of timeouts, min and max iteration times, average time a single iteration took, how many cycles it took, etc. There is a fair amount of information that can be gathered to help tune your thread. You can access that information by doing the following:
public class SomeJob extends GreenThread {
public function SomeJob() {
super( true ); // turn on debug statisitics
addEventListener( Event.COMPLETE, function( event : Event ) : void {
trace( statistics.print() );
});
}
}
Conclusion
There are some drawbacks to doing concurrency this way. One is algorithm have to be cooperative, and stop processing in the middle to let Flash do its thing. That means your algorithm normally have to be rewritten to conform with this approach. That can be particularly difficult for recursive algorithms. There needs to be more research done into how you might fix this with the callLater() technique. The biggest draw back is that we cannot take advantage of multi-processors. For all the code you write Flash runs on a single OS thread. This is a serious disadvantage for us going forward because as Actionscript developers we cannot access boosts in hardware performance as cores are added.
It's been a lot of information but hopefully you now understand the theory behind concurrency in Actionscript, and you have a new library that helps you optimize your code. You can access the source code here, and download the GreenThread's library here. I look forward to hearing about what sorts of long running jobs you create.
Full source code of the Mandelbrot set is here.
Download GreenThread's library here.
2/19/2009
Actionscript and Concurrency (II of III)
Now that we understand more about how Flash works internally we can begin to talk about strategies to chop up our large running job into smaller pieces. We'll render the Mandelbrot set. It's a time consuming algorithm, and it's fun when your demos make pretty pictures too. Let's look at a simple implementation.
public function calculate( width : int, height : int ) : void {
_bitmap = new BitmapData( width, height, false, 0x020202 );
var realStep : Number = (_realMax - _realMin) / Number(_bitmap.width);
var imaginaryStep : Number = ( _imaginaryMax - _imaginaryMin ) / Number( _bitmap.height );
for( var screeny : int = 0; screeny < _bitmap.height; screeny++ ) {
for( var screenx : int = 0; screenx < _bitmap.width; screenx++ ) {
var x : Number = screenx * realStep + _realMin;
var y : Number = screeny * imaginaryStep + _imaginaryMin;
var x0 : Number = x;
var y0 : Number = y;
var iteration : int = 0;
while( x * x + y * y <= (2 * 2) && iteration < _maxIteration ) {
var xtemp : Number = x * x - y * y + x0;
y = 2 * x * y + y0;
x = xtemp;
iteration = iteration + 1;
}
if( iteration == _maxIteration ) {
_bitmap.setPixel( screenx, screeny, 0x000000 );
} else {
_bitmap.setPixel( screenx, screeny, shader.lookup( Number(iteration) / Number(maxIteration) ) );
}
}
}
var evt : Event = new Event( Event.COMPLETE );
dispatchEvent( evt );
}
Click here to see it in action. Notice how there was a pause before it actually drew the Mandelbrot set. Maybe you got the pinwheel of death or not responding. That is what happens when you hold up the Event Queue.
At a high level this algorithm calculates whether or not a pixel at (screenx,screeny) is inside the Mandelbrot set (i.e. stays below 4). If it stays below 4 after maxIterations of the loop then it colors the pixel black. If not it's color is based on how many times the inner while loop ran before going past 4. It's not as important you understand what the algorithm is doing as so much the parts of the algorithm. The parts that make this a long job are the three loops inside.
In order to split this job up and run across many frames we'll need to break up those top two loops. Before we jump into that let's talk in a little more general terms. Say we want to bust up a general purpose loop something like:
public function doLongWork( arg1 : String ) : void {
for( var i : int = 0; i < 1000000; i++ ) {
// do some work
}
}
We could use the UIComponent.callLater() method to trigger our loop that might look like the following:
public function doLongWork( arg1 : String, i : int, total : int ) : void {
// do work
if( i < total ) {
uicomponent.callLater( doLongWork, [ arg1, i + 1, total ] );
}
}
This is nice. We've removed the for loop and replaced it with what looks like a recursive call, but it's not. Actually what we're doing is doing a single iteration of the loop, and then scheduling the Event Queue to call us back later to do the next iteration of our loop. We do this until i == total, and that point we stop.
Another way we could restructure our code is to us a timer. Here is another way to do this:
public function start() : void {
i = 0; total = 1000000;
var milliseconds = 1000 / Application.application.stage.frameRate;
_runner = new Timer( milliseconds );
_runner.addEventListener( TimerEvent.TIMER, doLongWork );
}
public function doLongWork() : void {
// do some work
i++;
if( i >= total ) {
_runner.stop();
_runner.removeEventListener( TimerEvent.TIMER, doLongWork );
}
}
A little more code, but this works too. Now we're scheduling a timer to call us at an interval of a single frame. That's the first line where we calculate in milliseconds the length of a single frame. Then we register our doLongWork method to get called back by the timer. We then remove the listener and stop the timer once i reaches total. Notice that in this method we have to move i and total into instance variables which means we have to initialize those in some sort of start method.
The final option we could use is FRAME_ENTER event. That looks like this:
public function start() : void {
i = 0; total = 1000000;
Application.application.addEventListener( Event.FRAME_ENTER, doLongWork );
}
public function doLongWork( event : Event ) : void {
// do some work
i++;
if( i >= total ) {
Application.application.removeEventListener( Event.FRAME_ENTER, doLongWork );
}
}
This is nice because we don't have to fiddle with math in order to call us back at the frame rate. We just register a listener, when we're done we unregister our listener. Not as clean as callLater, but this works in both Flash and Flex. What's important to remember is that all of these techniques are equivalent. There is no discernible difference in performance between them.
So now we can restructure our Mandelbrot algorithm to match one of these patterns. In our case we'll move screenx, screeny, _realStep, _imaginaryStep, and our BitmapData outside of our calculate() method, and put them inside calculateAsync(). Here's our Mandelbrot algorithm restructured:
public function calculateAsync( width : int, height : int ) : void {
_bitmap = new BitmapData( width, height, false, 0x020202 );
screenx = screeny = 0;
_realStep = (_realMax - _realMin) / Number(_bitmap.width);
_imaginaryStep = ( _imaginaryMax - _imaginaryMin ) / Number( _bitmap.height );
Application.application.addEventListener( Event.FRAME_ENTER, calculate );
}
private function calculate( event : Event ) : void {
if( screenx > _bitmap.width ) {
screenx = 0;
screeny++;
}
if( screeny < _bitmap.height ) {
var x : Number = screenx * _realStep + _realMin;
var y : Number = screeny * _imaginaryStep + _imaginaryMin;
var x0 : Number = x;
var y0 : Number = y;
var iteration : int = 0;
while( x * x + y * y <= (2 * 2) && iteration < _maxIteration ) {
var xtemp : Number = x * x - y * y + x0;
y = 2 * x * y + y0;
x = xtemp;
iteration = iteration + 1;
}
if( iteration == _maxIteration ) {
_bitmap.setPixel( screenx, screeny, 0x000000 );
} else {
_bitmap.setPixel( screenx, screeny, shader.lookup( Number(iteration) / Number(maxIteration) ) );
}
screenx++;
} else {
Application.application.removeEventListener( Event.FRAME_ENTER, calculate );
}
}
Now if we ran this version of it you'd see it's 40x slower! Why is that? Well we're executing a single iteration every start of a frame (~40ms). If you timed the running of the calculate method. You'd see that it probably takes less than a millisecond to complete a single iteration. So we've just take a single iteration that ran in <1ms and now we're running it in 40ms! Yikes!
Remember when I said Flash is all about animation? Running jobs at the frame rate is great for animation, but horrible for general purpose concurrency. We need a better way. So our final part we'll discuss optimizing our technique and demonstrate a general purpose solution so that you don't have recreate the wheel when you need to run jobs for long periods of time. See you in the next installment.
public function calculate( width : int, height : int ) : void {
_bitmap = new BitmapData( width, height, false, 0x020202 );
var realStep : Number = (_realMax - _realMin) / Number(_bitmap.width);
var imaginaryStep : Number = ( _imaginaryMax - _imaginaryMin ) / Number( _bitmap.height );
for( var screeny : int = 0; screeny < _bitmap.height; screeny++ ) {
for( var screenx : int = 0; screenx < _bitmap.width; screenx++ ) {
var x : Number = screenx * realStep + _realMin;
var y : Number = screeny * imaginaryStep + _imaginaryMin;
var x0 : Number = x;
var y0 : Number = y;
var iteration : int = 0;
while( x * x + y * y <= (2 * 2) && iteration < _maxIteration ) {
var xtemp : Number = x * x - y * y + x0;
y = 2 * x * y + y0;
x = xtemp;
iteration = iteration + 1;
}
if( iteration == _maxIteration ) {
_bitmap.setPixel( screenx, screeny, 0x000000 );
} else {
_bitmap.setPixel( screenx, screeny, shader.lookup( Number(iteration) / Number(maxIteration) ) );
}
}
}
var evt : Event = new Event( Event.COMPLETE );
dispatchEvent( evt );
}
Click here to see it in action. Notice how there was a pause before it actually drew the Mandelbrot set. Maybe you got the pinwheel of death or not responding. That is what happens when you hold up the Event Queue.
At a high level this algorithm calculates whether or not a pixel at (screenx,screeny) is inside the Mandelbrot set (i.e. stays below 4). If it stays below 4 after maxIterations of the loop then it colors the pixel black. If not it's color is based on how many times the inner while loop ran before going past 4. It's not as important you understand what the algorithm is doing as so much the parts of the algorithm. The parts that make this a long job are the three loops inside.
Breaking Apart Algorithms
In order to split this job up and run across many frames we'll need to break up those top two loops. Before we jump into that let's talk in a little more general terms. Say we want to bust up a general purpose loop something like:
public function doLongWork( arg1 : String ) : void {
for( var i : int = 0; i < 1000000; i++ ) {
// do some work
}
}
callLater() Technique
We could use the UIComponent.callLater() method to trigger our loop that might look like the following:
public function doLongWork( arg1 : String, i : int, total : int ) : void {
// do work
if( i < total ) {
uicomponent.callLater( doLongWork, [ arg1, i + 1, total ] );
}
}
This is nice. We've removed the for loop and replaced it with what looks like a recursive call, but it's not. Actually what we're doing is doing a single iteration of the loop, and then scheduling the Event Queue to call us back later to do the next iteration of our loop. We do this until i == total, and that point we stop.
Timer Technique
Another way we could restructure our code is to us a timer. Here is another way to do this:
public function start() : void {
i = 0; total = 1000000;
var milliseconds = 1000 / Application.application.stage.frameRate;
_runner = new Timer( milliseconds );
_runner.addEventListener( TimerEvent.TIMER, doLongWork );
}
public function doLongWork() : void {
// do some work
i++;
if( i >= total ) {
_runner.stop();
_runner.removeEventListener( TimerEvent.TIMER, doLongWork );
}
}
A little more code, but this works too. Now we're scheduling a timer to call us at an interval of a single frame. That's the first line where we calculate in milliseconds the length of a single frame. Then we register our doLongWork method to get called back by the timer. We then remove the listener and stop the timer once i reaches total. Notice that in this method we have to move i and total into instance variables which means we have to initialize those in some sort of start method.
FRAME_ENTER Technique
The final option we could use is FRAME_ENTER event. That looks like this:
public function start() : void {
i = 0; total = 1000000;
Application.application.addEventListener( Event.FRAME_ENTER, doLongWork );
}
public function doLongWork( event : Event ) : void {
// do some work
i++;
if( i >= total ) {
Application.application.removeEventListener( Event.FRAME_ENTER, doLongWork );
}
}
This is nice because we don't have to fiddle with math in order to call us back at the frame rate. We just register a listener, when we're done we unregister our listener. Not as clean as callLater, but this works in both Flash and Flex. What's important to remember is that all of these techniques are equivalent. There is no discernible difference in performance between them.
Restructuring Our Demo
So now we can restructure our Mandelbrot algorithm to match one of these patterns. In our case we'll move screenx, screeny, _realStep, _imaginaryStep, and our BitmapData outside of our calculate() method, and put them inside calculateAsync(). Here's our Mandelbrot algorithm restructured:
public function calculateAsync( width : int, height : int ) : void {
_bitmap = new BitmapData( width, height, false, 0x020202 );
screenx = screeny = 0;
_realStep = (_realMax - _realMin) / Number(_bitmap.width);
_imaginaryStep = ( _imaginaryMax - _imaginaryMin ) / Number( _bitmap.height );
Application.application.addEventListener( Event.FRAME_ENTER, calculate );
}
private function calculate( event : Event ) : void {
if( screenx > _bitmap.width ) {
screenx = 0;
screeny++;
}
if( screeny < _bitmap.height ) {
var x : Number = screenx * _realStep + _realMin;
var y : Number = screeny * _imaginaryStep + _imaginaryMin;
var x0 : Number = x;
var y0 : Number = y;
var iteration : int = 0;
while( x * x + y * y <= (2 * 2) && iteration < _maxIteration ) {
var xtemp : Number = x * x - y * y + x0;
y = 2 * x * y + y0;
x = xtemp;
iteration = iteration + 1;
}
if( iteration == _maxIteration ) {
_bitmap.setPixel( screenx, screeny, 0x000000 );
} else {
_bitmap.setPixel( screenx, screeny, shader.lookup( Number(iteration) / Number(maxIteration) ) );
}
screenx++;
} else {
Application.application.removeEventListener( Event.FRAME_ENTER, calculate );
}
}
Now if we ran this version of it you'd see it's 40x slower! Why is that? Well we're executing a single iteration every start of a frame (~40ms). If you timed the running of the calculate method. You'd see that it probably takes less than a millisecond to complete a single iteration. So we've just take a single iteration that ran in <1ms and now we're running it in 40ms! Yikes!
Remember when I said Flash is all about animation? Running jobs at the frame rate is great for animation, but horrible for general purpose concurrency. We need a better way. So our final part we'll discuss optimizing our technique and demonstrate a general purpose solution so that you don't have recreate the wheel when you need to run jobs for long periods of time. See you in the next installment.
Labels:
actionscript,
concurrency,
flash,
flex
Actionscript and Concurrency (Part I of III)
This is a three part series from a talk I gave at the Atlanta Flex and Flash User Group meeting in February. Original slides are here.
I can hear it now Actionscript can't do that. And some of you might say what is concurrency? Concurrency goes by a lot of names, but it simply means doing two things at once. More accurately stated doing two things simultaneously. Now I can hear what you might say "But, Flash already does that? I mean I can animate two objects on the screen at once." While that's true, Flash is optimized for animation not for general purpose concurrency. In fact the design of animation is so foundational it governs everything that happens in Flash.
If you're a Flash developer I'm sure you're familiar with the timeline. The timeline is divided into a series of frames, and Flash executes those frames at a particular rate one after the other. For flex applications the rate is 24 frames / second. If you do the math that means each frame lasts a little more than 40ms. The timeline is very natural for designers. But, for developers this concept is a little strange because it's hard to understand where my code is executed?
For developers I think a thought experiment helps clarify the timeline. Say you were hired to implement the timeline concept. What data structure would help you in doing this task? The answer is surprisingly simple and bears a lot of resemblance to most UI toolkits out there. At the heart of the Flash platform there exists a queue of events. The Event Queue, as it's known, is where all code is triggered. Every piece of code in Flex and Flash is related back to some event being triggered. So when you move your mouse, click a button, type on the keyboard, set timers they all go onto the queue. Flash then pops off those events from the queue and executes them one by one. When one event is done processing the next event is processed. There's no way any two events can be processed at the same time. It all happens one at a time.
Now in Flash even those frames from the timeline are modeled as events. The difference between them and other events is that frame events must occur at certain points in time. Unlike mouse or keyboard which can wait till the next frame to be processed. Frame events must be processed every ~40ms (or 1000 ms / 24).
That means if one event takes too long to process Flash can't update the screen, process other events, or doing anything. What happens is Flash locks up and you get a pinwheel of death or a Not Responding next to your application. In fact Flash punishes such acts and stops processing mouse, keyboard, or button clicks till it catches up. Why? Well that because if you block the queue from processing it won't update that nice animation, and Flash prioritizes that over anything else.
Rule number one is that any code you write must execute within the time span of a frame. If not you run the risk of having Flash coming down on you. This leads us to the second rule which is Flash is not concurrent. There is no way for Flash to update the UI and run your job at the same time.
So what do you do if you have a long running job? Well the answer lies in how animation works in Flash. Animating several objects at once requires a series of many smaller steps. Say you want to move a ball from one side of the screen to the other. That means you need to move it a little, redraw, move it some more, redraw, etc until the ball is at the other side of the screen. In a way animation is chopping up a long running job into several smaller jobs that run once per frame. We can do that too by chopping up our job into many smaller jobs, and executing them a little at time until we're done!
So how might you do this? Well there are several tricks you can use, and they all perform the same. It's more a matter of taste in which one you choose, but remember technically there is no real difference between these. Our next part we'll look at each of these techniques and discuss unique problems when doing concurrent actions in Flash.
- Part I Intro to the Event Queue
- Part II Techniques for Long running Jobs
- Part III Increasing Performance and a General Purpose Solution
I can hear it now Actionscript can't do that. And some of you might say what is concurrency? Concurrency goes by a lot of names, but it simply means doing two things at once. More accurately stated doing two things simultaneously. Now I can hear what you might say "But, Flash already does that? I mean I can animate two objects on the screen at once." While that's true, Flash is optimized for animation not for general purpose concurrency. In fact the design of animation is so foundational it governs everything that happens in Flash.
Timeline
If you're a Flash developer I'm sure you're familiar with the timeline. The timeline is divided into a series of frames, and Flash executes those frames at a particular rate one after the other. For flex applications the rate is 24 frames / second. If you do the math that means each frame lasts a little more than 40ms. The timeline is very natural for designers. But, for developers this concept is a little strange because it's hard to understand where my code is executed?
Event Queue
For developers I think a thought experiment helps clarify the timeline. Say you were hired to implement the timeline concept. What data structure would help you in doing this task? The answer is surprisingly simple and bears a lot of resemblance to most UI toolkits out there. At the heart of the Flash platform there exists a queue of events. The Event Queue, as it's known, is where all code is triggered. Every piece of code in Flex and Flash is related back to some event being triggered. So when you move your mouse, click a button, type on the keyboard, set timers they all go onto the queue. Flash then pops off those events from the queue and executes them one by one. When one event is done processing the next event is processed. There's no way any two events can be processed at the same time. It all happens one at a time.
Now in Flash even those frames from the timeline are modeled as events. The difference between them and other events is that frame events must occur at certain points in time. Unlike mouse or keyboard which can wait till the next frame to be processed. Frame events must be processed every ~40ms (or 1000 ms / 24).
That means if one event takes too long to process Flash can't update the screen, process other events, or doing anything. What happens is Flash locks up and you get a pinwheel of death or a Not Responding next to your application. In fact Flash punishes such acts and stops processing mouse, keyboard, or button clicks till it catches up. Why? Well that because if you block the queue from processing it won't update that nice animation, and Flash prioritizes that over anything else.
Rule number one is that any code you write must execute within the time span of a frame. If not you run the risk of having Flash coming down on you. This leads us to the second rule which is Flash is not concurrent. There is no way for Flash to update the UI and run your job at the same time.
Timeslicing
So what do you do if you have a long running job? Well the answer lies in how animation works in Flash. Animating several objects at once requires a series of many smaller steps. Say you want to move a ball from one side of the screen to the other. That means you need to move it a little, redraw, move it some more, redraw, etc until the ball is at the other side of the screen. In a way animation is chopping up a long running job into several smaller jobs that run once per frame. We can do that too by chopping up our job into many smaller jobs, and executing them a little at time until we're done!
So how might you do this? Well there are several tricks you can use, and they all perform the same. It's more a matter of taste in which one you choose, but remember technically there is no real difference between these. Our next part we'll look at each of these techniques and discuss unique problems when doing concurrent actions in Flash.
Labels:
actionscript,
concurrency,
flash,
flex
1/07/2009
Fun with Fluent Interfaces and Java
I've written about fluent interfaces before, but I thought I'd share this one I use quite a bit. You never know how much you like something until it's gone. I never thought I really liked Java's InputStream and OutputStream that much until I had to do a lot of streaming work in Actionscript. They have no abstraction for doing stream manipulations. But, let's be honest Java's io first settler's haven't changed much since their introduction. In fact their interfaces have not changed one bit. Sad really because they are so ubiquitous. I find I'm always copying data from one stream the other, dealing with IOExceptions, remembering to close streams, etc. And I got really tired of doing it over and over. What started out as static methods has evolved into a very simple object called ExtendInputStream. Extended as in extending the interface to add more rich functionality rather than the use of inheritance.
The greatest single thing about InputStream and OutputStream is that it's the quintessential example of a decorator. Decorator is one of the foundational software patterns. What I love about decorators is the ability to encapsulate related classes behind a new interface while still retaining interoperability with other decorators.
ExtendedInputStream is a InputStream so it can interact just as plain old InputStream would, but it adds methods like copy, closeQuietly, copyAndClose, and integration with File objects which has always been a pet peeve of mine with InputStream. Let's look at some examples:
Here is copying a file to a directory.
new ExtendedInputStream( someFile ).copyAndClose( dir );
One liner! It's amazing how File object doesn't have these methods already implemented, but then again this approach is much more flexible because we can copy this file to any OutputStream. Here is copying a set of files to a zip.
ZipOutputStream zout = new ZipOutputStream( out );
for( File myfile : files ) {
ZipEntry entry = new ZipEntry( myfile.getName() );
zout.putNextEntry( entry );
new ExtendedInputStream( myFile ).copyAndClose( zout );
}
Five lines of code! Not bad given that 4 of those lines is just to work with ZipOutputStream. Notice how I'm not saving the reference to the ExtendedInputStream here. The copyAndClose() method copies the contents of the file to the OutputStream and closes the InputStream. Closing the OutputStream is your responsibility.
And the more general case of copying an plain old InputStream to any OutputStream.
URLConnection remote = new URL("...").openConnection();
new ExtendedInputStream( new URL("...").openStream() ).copyAndClose( remote.openOutputStream() );
Here is a more advanced version. Say we want to pull down a URL and save it to a file on our local filesystem.
File someDirectory = ...;
new ExtendedInputStream( new URL("...").openStream() ).name( "SavedUrl.txt" ).copyAndClose( someDirectory );
Here we use the optional method name() to set the name of the stream so when we save something to a directory it will use this name as the filename. You could have just as easily done new File( someDirectory, "SaveUrl.txt" ), but it's not always convenient.
You can use a similar pattern for increasing the buffer size used when copying as well.
new ExtendedInputStream( new URL("...").openStream() ).bufferSize( 8096 * 2 ).copyAndClose( someDir );
While I have enjoyed writing this simple class I think I've enjoyed using it more so. I really can't start a new Java project without it now. It's a lot of fun to use. I'd be interested in hearing other features people might want to see added.
package com.wrongnotes.util;
import java.io.*;
public class ExtendedInputStream extends InputStream {
private InputStream delegate;
private String name;
private int bufferSize = 8096;
public ExtendedInputStream( InputStream stream ) {
this( "no_name_file", stream );
}
public ExtendedInputStream(String name, InputStream delegate) {
this.name = name;
this.delegate = delegate;
}
public ExtendedInputStream( File src ) throws FileNotFoundException {
name = src.getName();
delegate = new BufferedInputStream( new FileInputStream( src ) );
}
public int read() throws IOException {
return delegate.read();
}
public int read(byte b[]) throws IOException {
return delegate.read(b);
}
public int read(byte b[], int off, int len) throws IOException {
return delegate.read(b,off,len);
}
public long skip(long n) throws IOException {
return delegate.skip(n);
}
public int available() throws IOException {
return delegate.available();
}
public void close() throws IOException {
delegate.close();
}
public synchronized void mark(int readlimit) {
delegate.mark(readlimit);
}
public synchronized void reset() throws IOException {
delegate.reset();
}
public long copy(File dest) throws IOException {
if( dest.isDirectory() ) {
dest = new File( dest, name );
}
FileOutputStream out = new FileOutputStream(dest);
try {
return copy( out );
} finally {
out.close();
}
}
public long copy(OutputStream out) throws IOException {
long total = 0;
byte[] buffer = new byte[bufferSize];
int len;
while ((len = this.read(buffer)) >= 0) {
out.write(buffer, 0, len);
total += len;
}
out.flush();
return total;
}
public void closeQuietly() {
try {
close();
} catch( IOException ioe ) {
// ignore
}
}
public void copyAndClose( File file ) throws IOException {
try {
copy( file );
} finally {
close();
}
}
public void copyAndClose(OutputStream out) throws IOException {
try {
copy( out );
} finally {
close();
}
}
public ExtendedInputStream bufferSize( int size ) {
bufferSize = size;
return this;
}
public ExtendedInputStream name( String newName ) {
name = newName;
return this;
}
}
The greatest single thing about InputStream and OutputStream is that it's the quintessential example of a decorator. Decorator is one of the foundational software patterns. What I love about decorators is the ability to encapsulate related classes behind a new interface while still retaining interoperability with other decorators.
ExtendedInputStream is a InputStream so it can interact just as plain old InputStream would, but it adds methods like copy, closeQuietly, copyAndClose, and integration with File objects which has always been a pet peeve of mine with InputStream. Let's look at some examples:
Here is copying a file to a directory.
new ExtendedInputStream( someFile ).copyAndClose( dir );
One liner! It's amazing how File object doesn't have these methods already implemented, but then again this approach is much more flexible because we can copy this file to any OutputStream. Here is copying a set of files to a zip.
ZipOutputStream zout = new ZipOutputStream( out );
for( File myfile : files ) {
ZipEntry entry = new ZipEntry( myfile.getName() );
zout.putNextEntry( entry );
new ExtendedInputStream( myFile ).copyAndClose( zout );
}
Five lines of code! Not bad given that 4 of those lines is just to work with ZipOutputStream. Notice how I'm not saving the reference to the ExtendedInputStream here. The copyAndClose() method copies the contents of the file to the OutputStream and closes the InputStream. Closing the OutputStream is your responsibility.
And the more general case of copying an plain old InputStream to any OutputStream.
URLConnection remote = new URL("...").openConnection();
new ExtendedInputStream( new URL("...").openStream() ).copyAndClose( remote.openOutputStream() );
Here is a more advanced version. Say we want to pull down a URL and save it to a file on our local filesystem.
File someDirectory = ...;
new ExtendedInputStream( new URL("...").openStream() ).name( "SavedUrl.txt" ).copyAndClose( someDirectory );
Here we use the optional method name() to set the name of the stream so when we save something to a directory it will use this name as the filename. You could have just as easily done new File( someDirectory, "SaveUrl.txt" ), but it's not always convenient.
You can use a similar pattern for increasing the buffer size used when copying as well.
new ExtendedInputStream( new URL("...").openStream() ).bufferSize( 8096 * 2 ).copyAndClose( someDir );
While I have enjoyed writing this simple class I think I've enjoyed using it more so. I really can't start a new Java project without it now. It's a lot of fun to use. I'd be interested in hearing other features people might want to see added.
package com.wrongnotes.util;
import java.io.*;
public class ExtendedInputStream extends InputStream {
private InputStream delegate;
private String name;
private int bufferSize = 8096;
public ExtendedInputStream( InputStream stream ) {
this( "no_name_file", stream );
}
public ExtendedInputStream(String name, InputStream delegate) {
this.name = name;
this.delegate = delegate;
}
public ExtendedInputStream( File src ) throws FileNotFoundException {
name = src.getName();
delegate = new BufferedInputStream( new FileInputStream( src ) );
}
public int read() throws IOException {
return delegate.read();
}
public int read(byte b[]) throws IOException {
return delegate.read(b);
}
public int read(byte b[], int off, int len) throws IOException {
return delegate.read(b,off,len);
}
public long skip(long n) throws IOException {
return delegate.skip(n);
}
public int available() throws IOException {
return delegate.available();
}
public void close() throws IOException {
delegate.close();
}
public synchronized void mark(int readlimit) {
delegate.mark(readlimit);
}
public synchronized void reset() throws IOException {
delegate.reset();
}
public long copy(File dest) throws IOException {
if( dest.isDirectory() ) {
dest = new File( dest, name );
}
FileOutputStream out = new FileOutputStream(dest);
try {
return copy( out );
} finally {
out.close();
}
}
public long copy(OutputStream out) throws IOException {
long total = 0;
byte[] buffer = new byte[bufferSize];
int len;
while ((len = this.read(buffer)) >= 0) {
out.write(buffer, 0, len);
total += len;
}
out.flush();
return total;
}
public void closeQuietly() {
try {
close();
} catch( IOException ioe ) {
// ignore
}
}
public void copyAndClose( File file ) throws IOException {
try {
copy( file );
} finally {
close();
}
}
public void copyAndClose(OutputStream out) throws IOException {
try {
copy( out );
} finally {
close();
}
}
public ExtendedInputStream bufferSize( int size ) {
bufferSize = size;
return this;
}
public ExtendedInputStream name( String newName ) {
name = newName;
return this;
}
}
12/17/2008
AC/DC Black Ice Tour: Atlanta
So my ears are still ringing even as I rock out to AC/DC at work. Last night AC/DC rocked Atlanta to a sold out crowd. The set list was the typical set list they've been playing in other cities. The concert started with a mini-movie segment of Angus driving a train, and two very slutty women come into the cab to derail the train. It's chocked full of not-so subtle innuendo and general male humor. Of course I'm not so sure why the women felt it necessary to try and stop the train. They looked like groupies so why they were stopping the train is unexplained. The overt sexual references and general absurdity of the piece had me laughing. Of course it ended in a huge on stage explosion and pyrotechnics display, a huge train smashes through the stage and out pops AC/DC playing their latest "Rock 'N Roll Train". It was an awesome entrance.
Then they followed it up a Bon Scott original "Hell Ain't A Bad Place To Be". While those are some my favorite AC/DC tunes. I really wanted them to play "It's a Long Way To the Top (If you wanna Rock 'N' Roll)", and Rock 'N' Roll Singer which are probably my favorites. Well that's not true I always have trouble choosing my favorites, but I do play those a lot.
Then they broke out "Back in Black" which I thought they played too early. This is their anthem get your damn hands up, but it definitely got the crowd going. It's hard to pick what they could have shuffled around because it's a solid set, but "Back in Black" is just such a powerful song it's got to be deeper in the set list.
Then "Big Jack" off their new album. Followed by "Dirty Deeds Done Dirt Cheap" followed by "Thunderstruck" which was my wife's favorite. I do have a warm spot in my heart for "Thunderstruck". Thank god her favorite wasn't "Dirty Deeds".
Then it was back to the new album for "Black Ice", and at this point I thought I'd have to go to the bathroom. But, I stuck it out, and was treated to another movie with "Warmachine". My favorite part was the parachuting women walking on the tank treads. Hilariously.
Then it was back to the hits with "Anything Goes" and "You Shook Me All Night Long" with a flaming model dancing on the screens. Cue the 5 ladies in the audience to jump up and start dancing. While a lot of AC/DC songs are about women I can tell you there aren't many who listen to it. Then it was into T.N.T. which rocked.
Other highlights of the night were Angus' guitar solo a top an elevated stage, and the gigantic inflatable Rosie that road the train during "Whole Lotta Rosie". Finally it was real cannon fire with "For Those About to Rock We Salute You". There was one long encore ending with "Highway to Hell" finishing the nite. I don't know if my ears could take anymore. It rocked them off.
Then they followed it up a Bon Scott original "Hell Ain't A Bad Place To Be". While those are some my favorite AC/DC tunes. I really wanted them to play "It's a Long Way To the Top (If you wanna Rock 'N' Roll)", and Rock 'N' Roll Singer which are probably my favorites. Well that's not true I always have trouble choosing my favorites, but I do play those a lot.
Then they broke out "Back in Black" which I thought they played too early. This is their anthem get your damn hands up, but it definitely got the crowd going. It's hard to pick what they could have shuffled around because it's a solid set, but "Back in Black" is just such a powerful song it's got to be deeper in the set list.
Then "Big Jack" off their new album. Followed by "Dirty Deeds Done Dirt Cheap" followed by "Thunderstruck" which was my wife's favorite. I do have a warm spot in my heart for "Thunderstruck". Thank god her favorite wasn't "Dirty Deeds".
Then it was back to the new album for "Black Ice", and at this point I thought I'd have to go to the bathroom. But, I stuck it out, and was treated to another movie with "Warmachine". My favorite part was the parachuting women walking on the tank treads. Hilariously.
Then it was back to the hits with "Anything Goes" and "You Shook Me All Night Long" with a flaming model dancing on the screens. Cue the 5 ladies in the audience to jump up and start dancing. While a lot of AC/DC songs are about women I can tell you there aren't many who listen to it. Then it was into T.N.T. which rocked.
Other highlights of the night were Angus' guitar solo a top an elevated stage, and the gigantic inflatable Rosie that road the train during "Whole Lotta Rosie". Finally it was real cannon fire with "For Those About to Rock We Salute You". There was one long encore ending with "Highway to Hell" finishing the nite. I don't know if my ears could take anymore. It rocked them off.
Subscribe to:
Posts (Atom)