For a few weeks now, a new beta version of scratch for the RPi has been available, announced on raspberrypi.org.
The work done by Tim Rowledge focuses on performance. My first impression is that it is faster, both in editing and at runtime, so it is time to measure some performance numbers.
I measured timings for three systems:
- RPi-1.4-scratch is the current scratch/squeak as shipped on Raspbian, with the RPi clocked at 1GHz.
- RPi-1.4-beta is the current version of the beta scratch (2014-06-13).
- win-1.4-scratch: for comparison with a more powerful system, I ran some of the tests on a laptop running scratch 1.4 from scratch.mit.edu on Windows 7, with a 4-core 2.2GHz processor.
Update: jamesh asked me to repeat the tests with the xf86-video-fbturbo video driver (‘HW cursor implementation for X’). It sounds complicated, but installation was straightforward. The tests executed with this modified X system are marked with ‘X’:
- RPi-1.4-scratch-X, modified X running RPi-1.4-scratch.
- RPi-1.4-beta-X, modified X running RPi-1.4-beta.
Results
In loops and calculations, the new scratch version on the RPi even outperforms my Windows machine running legacy scratch 1.4 from mit.edu. On the Pi, it needs only 50% of the execution time of the current pi-scratch. This is impressive.
For the other tests, execution time is down to roughly 80 to 85%.
One exceptional improvement is in cases where variables are displayed on stage. This slows down the current scratch, but the beta with the modified X driver executes about 10 times faster (move2_presentation).
Especially for the graphic operations, improvements are noticeable.
The results are shown in blue and dark yellow for the stock raspbian system, and in light blue and light yellow for the modified driver.
The modified X driver gives better performance: in most cases, execution time drops to about 0.8 times the original. The quite simple rotate and move examples do not benefit much, but whenever the stage gets crowded the difference is noticeable.
For scratch remote sensor connections, the improvements are less impressive, and I assume they come from the overall performance optimizations. Still, the tests show that remote connections for broadcasts or variables need about 40 ms for sending or receiving, which is not bad. The large improvement in pingpong_remote comes from comparing presentation mode operations: here, RPi-1.4-scratch is much slower in presentation mode. Compared with full-stage mode, the result is in the same 80% range as the other results.
The scratch projects are in performance.zip.
For the scratchClient, see download page.
Graphic system (performance_rotate.sb)
Rotating sprites needs quite a lot of computing power: the sprite has to be rotated by an angle and the graphics redisplayed. To avoid possible caching of calculated sprite graphics, I have chosen to apply extra ‘one degree’ rotations in between.
RPi-1.4-scratch 10.6 sec
RPi-1.4-beta 8.5 sec
Graphic system 2 (performance_rotate_say.sb)
Displaying the ‘say’-bubble is a challenge. The system needs to look for the solid icon inside the alpha background, and adjust the bubble accordingly.
RPi-1.4-scratch 22.5 sec
RPi-1.4-beta 14.0 sec. This is impressive.
win-1.4-scratch 6.0 sec
Graphic system move, move2
The move sample is one I usually explain to the kids in school as a scratch anti-pattern: while true; goto x,y; inc x; inc y; endwhile; This works, but the movement speed is limited by CPU usage. The second sample moves two sprites with a variable displayed on stage. This slows down execution drastically in RPi-1.4-scratch. In this area, the beta is in a different class.
The presentation mode timings are
RPi-1.4-scratch 159.4 sec
RPi-1.4-scratch-X 119.2 sec using the modified X driver
RPi-1.4-beta 17.6 sec.
RPi-1.4-beta-X 14.3 sec using the modified X driver
win-1.4-scratch 12.1 sec
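To put the presentation mode timings above into relation, the ratios can be worked out with a few lines of Python. This is only a small sketch: the numbers are copied from the list above, and the names `timings` and `speedup` are made up for this illustration.

```python
# Presentation mode timings in seconds, copied from the list above.
timings = {
    "RPi-1.4-scratch": 159.4,
    "RPi-1.4-scratch-X": 119.2,
    "RPi-1.4-beta": 17.6,
    "RPi-1.4-beta-X": 14.3,
    "win-1.4-scratch": 12.1,
}


def speedup(baseline, other):
    """Return how many times faster `other` runs compared with `baseline`."""
    return timings[baseline] / timings[other]


if __name__ == "__main__":
    # Print every system relative to the legacy scratch on the stock X driver.
    for name in timings:
        factor = speedup("RPi-1.4-scratch", name)
        print("%-20s %6.1f sec  %5.1fx" % (name, timings[name], factor))
```

With these values, the beta comes out roughly 9 times faster than the legacy scratch, and roughly 11 times faster when the modified X driver is used as well.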
Scratch Sensor Network performance (performance_pingpong_remote.sb)
There are many assumptions about the timing of remote access to scratch, so I took the opportunity to measure some values.
It is not possible to measure the time from a broadcast in scratch until it arrives in a remote system; that would need software ‘instrumentation’ inside scratch. But it is possible to send out a broadcast and wait for a response coming back, using a remote scratchClient.
For the test, a scratch script sends the broadcast “ping”, and my scratchClient software responds with “pong”. In scratch, this is repeated 200 times and the total time is recorded.
RPi-1.4-scratch fullscreen 30.3 sec (!)
RPi-1.4-scratch edit mode 18.4 sec
RPi-1.4-beta fullscreen 16 sec.
The legacy scratch needs much longer in fullscreen than in edit mode, even though the script animations in edit mode cost some time. Strange.
The new scratch is 10 percent faster.
What does this mean for IO performance? One event out and one event in take 16 sec / 200 = 80 ms, or 40 ms one way. This is much faster than reported elsewhere. Keep in mind that nothing else was running: no animations or the like.
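The same arithmetic can be applied to all three measurements. A small sketch follows. The function names are made up for this illustration, and the one-way figure assumes that sending and receiving take equally long, which the test itself cannot confirm.

```python
REPETITIONS = 200  # number of ping/pong cycles in the test


def round_trip_ms(total_seconds, repetitions=REPETITIONS):
    """Average time for one 'ping' out plus one 'pong' back, in milliseconds."""
    return total_seconds * 1000.0 / repetitions


def one_way_ms(total_seconds, repetitions=REPETITIONS):
    """Half of the round trip, assuming both directions cost the same."""
    return round_trip_ms(total_seconds, repetitions) / 2.0


if __name__ == "__main__":
    # Total times in seconds, copied from the measurements above.
    measurements = [
        ("RPi-1.4-scratch fullscreen", 30.3),
        ("RPi-1.4-scratch edit mode", 18.4),
        ("RPi-1.4-beta fullscreen", 16.0),
    ]
    for label, seconds in measurements:
        print("%-28s round trip %6.1f ms, one way %6.1f ms"
              % (label, round_trip_ms(seconds), one_way_ms(seconds)))
```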
If you want to run this test on your machine, load my scratchClient software and use the command line
cd ~/scratchClient
sudo python src/scratchClient.py -c config/config_pingpong.py
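To illustrate what the responding side of such a test has to do, here is a minimal sketch of a ping/pong responder. It is not the scratchClient code. It assumes the scratch 1.4 remote sensor protocol as I understand it: a TCP connection to port 42001, where each message is prefixed by its length as a 4-byte big-endian integer. The function names are made up for this example.

```python
import socket
import struct

SCRATCH_PORT = 42001  # assumed port of scratch 1.4 remote sensor connections


def encode_message(text):
    """Prefix the message with its length as a 4-byte big-endian integer."""
    payload = text.encode("utf-8")
    return struct.pack(">I", len(payload)) + payload


def decode_messages(buffer):
    """Split a byte buffer into complete messages; return (messages, rest)."""
    messages = []
    while len(buffer) >= 4:
        (size,) = struct.unpack(">I", buffer[:4])
        if len(buffer) < 4 + size:
            break  # message is incomplete, wait for more data
        messages.append(buffer[4:4 + size].decode("utf-8"))
        buffer = buffer[4 + size:]
    return messages, buffer


def respond_forever(host="localhost"):
    """Answer every broadcast "ping" from scratch with a broadcast "pong"."""
    sock = socket.create_connection((host, SCRATCH_PORT))
    pending = b""
    try:
        while True:
            data = sock.recv(4096)
            if not data:
                break  # scratch closed the connection
            messages, pending = decode_messages(pending + data)
            for message in messages:
                if message == 'broadcast "ping"':
                    sock.sendall(encode_message('broadcast "pong"'))
    finally:
        sock.close()
```

To try it, enable remote sensor connections in scratch and call respond_forever() from a Python prompt on the same machine. The framing functions can be checked without a running scratch.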
Scratch Sensor Network performance (performance_pingpong_sensor_remote.sb)
The setup is similar to the broadcast example, but here variable values are sent over the network.
When scratchClient receives the ‘a’ value, it increments it by 1 and sends it back.
RPi-1.4-scratch 77.1 sec
RPi-1.4-beta 67.2 sec
win-1.4-scratch 25.0 sec
These values are very close to the broadcast-timings.
The scratchClient setup is the same as for the broadcast test.
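The increment step can be sketched in a few lines of Python. This is not the scratchClient implementation. It assumes that variables travel as text messages of the form sensor-update "a" 5, and that the incremented value goes back under the same name. The regular expression handles only numeric values, a simplification for this example, and all names are made up.

```python
import re

# One quoted variable name followed by a numeric value (simplified pattern).
SENSOR_PATTERN = re.compile(r'"([^"]+)"\s+(-?\d+(?:\.\d+)?)')


def parse_sensor_update(message):
    """Return {name: number} for a sensor-update message, else an empty dict."""
    if not message.startswith("sensor-update"):
        return {}
    return {name: float(value)
            for name, value in SENSOR_PATTERN.findall(message)}


def reply_for(message):
    """Build the answer for an incoming message, or None if 'a' is missing."""
    values = parse_sensor_update(message)
    if "a" not in values:
        return None
    # Increment by 1 and send back under the same name (assumption).
    return 'sensor-update "a" %g' % (values["a"] + 1)


if __name__ == "__main__":
    print(reply_for('sensor-update "a" 5'))
```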
Scratch Calculations (performance_calculations.sb)
Simply a loop, and a few calculations.
RPi-1.4-scratch 41.3 sec
RPi-1.4-beta 20.7 sec (checked twice, really fast)
win-1.4-scratch 25.0 sec
Scratch Broadcasts (performance_pingpong.sb)
This test sends broadcasts inside scratch. Remote sensor connections are disabled, and the code is executed in presentation mode to avoid the script animations during execution.
RPi-1.4-scratch 83.0 sec presentation screen
RPi-1.4-scratch 75.0 sec full stage screen
RPi-1.4-beta 41.9 sec presentation screen
RPi-1.4-beta 84.5 sec full stage screen
win-1.4-scratch 50.0 sec