Raspberry Pi 2
John Sandgrounder (1650) 574 posts |
10 REM >!SpeedTest
15 PRINT "Takes 25.7 seconds on A410"
20 TIME=0
30 FOR A%=0 TO 10000000: NEXT A%
40 PRINT TIME/100;" seconds"
50 GOTO 20

This very simple program has shown a steady decrease in time with each new computer since my first A410 (which is when I put the ‘printed comment’ at line 15).
|
David Pitt (102) 743 posts |
REM >!SpeedTest

Oh yes!

RPi        1.26
RPi2       1.12
VRPC iMac  0.56
|
Kuemmel (439) 384 posts |
…okay, that routine should theoretically scale just like the MHz increase (for the same CPU architecture) if I think in machine code. Question is, what is the BASIC interpreter doing…if I scale the 1.26 they should give something like 0.98 on the RPi2 (700/900). Weird, as it’s just totally non-fancy integer stuff…on my Panda I get 0.24s at 1500 MHz. As the NEON/VFP isn’t running yet, could somebody give my fixed point frac from that archive a run on Pi/Pi 2 for further investigation? It uses an 800×600×8bit screen, hope that works. |
David Pitt (102) 743 posts |
RPi 5.85s RPi2 4.85s |
Kuemmel (439) 384 posts |
Thanks! So clock by clock there’s no increase here either, more like a little less than expected (clock by clock, going from 700 to 900, we get -6%)…either something is still wrong with the setup or that CPU just isn’t that much better when it comes to integer stuff without any memory access. At least we should get major speedups when it comes to heavy memory access apps. I guess stuff like sorting should benefit from that. I’ll try to adjust my sorting code stuff later in the day for a test… |
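The scaling argument in the last few posts can be checked with a few lines of arithmetic. This is only a sketch: it assumes run time scales purely with clock speed and that both boards really run at their nominal 700 and 900 MHz. The function names are made up for illustration.

```python
def expected_time(measured_s, old_mhz, new_mhz):
    """Time expected at new_mhz if run time scales purely with clock speed."""
    return measured_s * old_mhz / new_mhz


def shortfall(expected_s, actual_s):
    """Fractional gain (+) or loss (-) of the actual time against the expectation."""
    return expected_s / actual_s - 1.0


# !SpeedTest: 1.26s on the Pi 1, scaled from 700 to 900 MHz
print(round(expected_time(1.26, 700, 900), 2))

# FixFrac: 5.85s on the Pi 1, 4.85s measured on the Pi 2
fixfrac_expected = expected_time(5.85, 700, 900)
print(f"{shortfall(fixfrac_expected, 4.85):+.0%}")
```

The first figure reproduces the 0.98 quoted above, the second the roughly -6% shortfall for FixFrac.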
Chris Gransden (337) 1148 posts |
With a stable overclock using these settings at a resolution of 2048×1152, 16M colours:

arm_freq=1100

wget to a ramdisc averages about 10.4 MiB/s download.
SunFish, copying a file to and from ramdisc: 5 MiB/s download and 6 MiB/s upload.

FixFrac_64BitMUL 2.86s
!SpeedTest 0.58s

Hard disk is a USB SSD.
|
John Sandgrounder (1650) 574 posts |
@David, interesting figures in your “Oh yes!” post, but not what I am seeing. I am getting
Same SD card in both, with build files from last night and no arm_freq entry in config.txt. A RISC OS 5.19 from ages ago also gives 1.07 on a Pi 1 (won’t run on Pi 2). Yes, please. Your test would be useful. |
David Pitt (102) 743 posts |
Just to confirm my earlier results I have repeated the test. This time I used the same card in both machines. The one difference is that the ROM has been updated to today’s ROM (07-Feb-15), which slowed the RPi2 a bit in this test.

RPi 1  1.25
RPi 2  1.16
|
Kuemmel (439) 384 posts |
@Chris: Now your results at 1100 MHz make total sense…clock by clock the Pi 2 is 30% better at the FixFrac, and the gain is also there at the !SpeedTest…so what’s the difference to David’s and John’s setup!? It can’t be only the extra 200 MHz…maybe you can check again with the stock 900 MHz speed!? Core and DRAM settings should not make any difference for those tests… |
Chris Gransden (337) 1148 posts |
It looks like ‘force_turbo=1’ is the crucial bit.

With just this set (results are the same as after adding ‘arm_freq=900’):
!SpeedTest 0.74s
FixFrac_64BitMUL 3.61s

As defaults:
!SpeedTest 1.15s
FixFrac_64BitMUL 5.54s
|
David Pitt (102) 743 posts |
That will be why I could not get overclocking to work with just _freq entries.

With only force_turbo=1:

John S's SpeedTest  0.74s (was 1.16s)
FixFrac_64BitMUL    3.14s (was 4.85s)

P.S. Adding
John S's SpeedTest  0.59s
FixFrac_64BitMUL    2.53s

Stability might be an issue though.
|
George T. Greenfield (154) 716 posts |
Certainly on a Pi 1, the ‘force_turbo’ setting controls whether or not upclocking is applied: if set to =1, it is; if set to =0 then the default speeds apply regardless of what is entered in arm_freq, core_freq and sdram_freq, IME. I guess it would not be a surprise if the Pi 2 worked similarly. As to the stability of overclocking the Pi 2, 900>1100 is relatively less than 700>900, and the latter seems to work stably on the Pi 1. |
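Pulling the settings mentioned so far into one place, a minimal config.txt sketch might look like the fragment below. The values are the ones reported in the posts above; nothing here is verified beyond that, and other settings people used are not shown.

```
# config.txt sketch, values taken from the posts above
# force_turbo=1 is reported as the setting that makes the clock entries take effect
force_turbo=1
# 900 was reported to give the same results as force_turbo=1 alone on the Pi 2;
# 1100 was reported as a stable overclock on one board
arm_freq=900
```

Whether a higher arm_freq is stable will depend on the individual board, as noted above.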
Kuemmel (439) 384 posts |
@David: Could you check that force_turbo setting with your Pi 1 and post the results from that again? Then we could finally see the clock by clock comparison of the two different CPUs. Interesting that your new results are even better than the ones from Chris at the same arm_freq. |
David Pitt (102) 743 posts |
@George. Thanks for the explanation.
From the old Raspberry Pi :-

                     force_turbo=0  force_turbo=1  force_turbo=1  force_turbo=1
                                                   arm_freq=700   arm_freq=900
John S's SpeedTest   1.23s          1.23s          1.25s          0.96s
FixFrac_64BitMUL     5.81s          5.91s          5.86s          4.51s

There does seem to be a difference in the two machines when the default
The RPi 2 in these tests is faster than the RPi 1 at 900MHz.
|
David Pitt (102) 743 posts |
Romark from Raspberry Pi 2 at 1100MHz

RISCOSmark 1.01 (14 May 2003)
Comparison with RiscPC SA 202MHz running RISC OS 4.02 x600,256
(HD benchmarks are in kilobytes/sec)

OS/Machine/Processor: ??
Graphics Resoloution: 1680x1050,16M colours

Test                                           Benchmark
Processor - Looped instructions (cache)          1250809    703%
Memory - Multiple register transfer                 6994   4317%
Rectangle Copy - Graphics acceleration test         1113    459%
Icon Plotting - 16 colour sprite with mask         19625    981%
Draw Path - Stroke narrow line                      5211    334%
Draw Fill - Plot filled shape                       4727    323%
HD Read - Block load 1MB file                      11484    385%
HD Write - Block save 1MB file                      1402     46%
FS Read - Byte stream file in                        456    220%
FS Write - Byte stream file out                      157     81%

N.B. core_freq & sdram_freq not set.
|
Kuemmel (439) 384 posts |
Nice! So for FixFrac we really end up 45% better clock by clock, and due to the extra 200 MHz it’s of course even more…the efficiency of the Cortex-A7 within that test is on the same level as a Cortex-A9 in the Panda. If you’re still keen on testing you could run my sorting algorithm collection, found here. For a full run you need a copy of ARMSort from here. It would be interesting to see Pi1/Pi2 here too. I guess my MemSpeed results would also be better now that force_turbo is fixed, especially that level 1 cache obscurity. |
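One way to look at a clock by clock figure is to convert each time into clock cycles and compare those. The sketch below uses FixFrac_64BitMUL times posted earlier in the thread and assumes the Pi 2 with only force_turbo=1 runs at 900 MHz. That assumption rests on the report that adding arm_freq=900 gave the same results. The helper names are hypothetical.

```python
def mega_cycles(seconds, mhz):
    """Clock cycles consumed, in millions, assuming the CPU ran at mhz throughout."""
    return seconds * mhz


def per_clock_gain(old_s, old_mhz, new_s, new_mhz):
    """Fraction by which the new CPU needs fewer cycles for the same work."""
    return mega_cycles(old_s, old_mhz) / mega_cycles(new_s, new_mhz) - 1.0


PI1_AT_900 = 4.51          # Pi 1, force_turbo=1, arm_freq=900
PI2_AT_900_DAVID = 3.14    # Pi 2, force_turbo=1 only (assumed 900 MHz)
PI2_AT_900_CHRIS = 3.61    # Pi 2, force_turbo=1 only (assumed 900 MHz)

print(f"{per_clock_gain(PI1_AT_900, 900, PI2_AT_900_DAVID, 900):.0%}")
print(f"{per_clock_gain(PI1_AT_900, 900, PI2_AT_900_CHRIS, 900):.0%}")
```

With David's Pi 2 figure this lands close to the 45% quoted above; with Chris's it comes out lower, so the exact number depends on whose run is taken.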
David Pitt (102) 743 posts |
MemSpeedPi from RPi2 at 1100MHz, ROM (07-Feb-15)

Testing RAM->RAM Transfer with ARM Rx LDM/STM instructions
Size [KByte];Speed [MByte/s]
1;3685
2;3583
4;3338
8;2325
16;1860
24;1548
32;1457
36;1456
40;1455
48;1451
64;1446
128;1436
256;1388
512;914
1024;654
1152;642
1280;633
1536;614
2048;606
2176;610
2304;598
2560;595
4096;588
8192;588
16384;577
32768;578

Testing RAM->VRAM Transfer with ARM Rx LDM/STM instructions
Size [KByte];Speed [MByte/s]
1;554
2;561
4;561
8;554
16;610
32;634
64;651
128;641
256;649
512;574
1024;396
1152;389
2048;364
2176;362
4096;364
|
John Sandgrounder (1650) 574 posts |
I have come to the conclusion that I have something wrong with my Pi 2 build because I am seeing different results to everybody else (although functionally it seems to work). I am out tomorrow but will rebuild it on Monday and try it all again. Is there a link to a complete SD card image? EDIT: I have now rebuilt the SD card and get results in line with everybody else. |
David Pitt (102) 743 posts |
Results from both Pis at 700MHz and 900MHz.

                                 RPi1    RPi2    RPi1    RPi2
                                  700     700     900     900
OS_HeapSort.............: [s]    4.35    2.73    3.84    2.29
Radix ASMv1.............: [s]    1.00    0.37    0.88    0.32
Radix ASMv2.............: [s]    0.95    0.32    0.85    0.28
Radix BASIC (Steve).....: [s]   13.62   12.29   10.86    9.42
ArmSort V4.08 (Martin)..: [s]    1.70    1.24    1.53    1.10
Quicksort non recursive.: [s]    0.67    0.39    0.57    0.30
Quicksort recursive (Ja.: [s]    0.69    0.40    0.59    0.32
|
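To make the sort results easier to compare, the sketch below turns them into Pi 1 to Pi 2 time ratios at equal clock speed. The numbers are copied from the results above. The column order (RPi1 at 700, RPi2 at 700, RPi1 at 900, RPi2 at 900) is an assumption read off the header row.

```python
# Times in seconds: (RPi1 @ 700, RPi2 @ 700, RPi1 @ 900, RPi2 @ 900)
RESULTS = {
    "OS_HeapSort": (4.35, 2.73, 3.84, 2.29),
    "Radix ASMv1": (1.00, 0.37, 0.88, 0.32),
    "Radix ASMv2": (0.95, 0.32, 0.85, 0.28),
    "Radix BASIC": (13.62, 12.29, 10.86, 9.42),
    "ArmSort V4.08": (1.70, 1.24, 1.53, 1.10),
    "Quicksort non recursive": (0.67, 0.39, 0.57, 0.30),
    "Quicksort recursive": (0.69, 0.40, 0.59, 0.32),
}


def speedups(row):
    """Return the RPi1/RPi2 time ratios at 700 MHz and at 900 MHz."""
    p1_700, p2_700, p1_900, p2_900 = row
    return p1_700 / p2_700, p1_900 / p2_900


for name, row in RESULTS.items():
    at_700, at_900 = speedups(row)
    print(f"{name:<24} {at_700:4.2f}x {at_900:4.2f}x")
```

A ratio above 1 means the Pi 2 finished faster at the same clock. The assembler radix sorts show the largest ratios and the BASIC one the smallest.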
Kuemmel (439) 384 posts |
Very good results :-) Except for the Radix BASIC it’s clock by clock almost overall better than the Panda, with the Radix ASM even much better…so the Cortex-A7 looks like an architecture in between A9 and A15, just lacking the maximum MHz. …just the results from !MemSpeed (though much better overall now) are still a bit weird regarding the 1st level cache decline, but if there’s no config thingy for that, it is what it is… …now I count on Jeffrey regarding VFP/NEON ;-) If that’s sufficient too, I might send my Panda into retirement. |
Kuemmel (439) 384 posts |
In the meantime we could prove the existence of the hardware integer divide instruction on the Pi 2 (until now only available on IGEPv5). You could download my Basic Assembler Extension here, go to the “Example” directory and run !Divide (it divides 10000000 random integers). On my IGEPv5 at 1500 MHz I get:
Maybe if we have that on the Pi 2 now it would be cool to integrate this into the BASIC interpreter…it would speed up any integer division by far…
|
David Pitt (102) 743 posts |
From the RPi2 running at 900MHz.

Unsigned integer divide
Software 1.40s
Hardware 0.36s

Signed integer divide
Software 1.46s
Hardware 0.35s
|
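For a feel of what these divide timings mean per operation, here is a small sketch. It assumes the whole measured time is spread evenly over the 10000000 divisions mentioned above, so loop overhead and anything else the example does are included in the per-divide figure.

```python
DIVIDES = 10_000_000  # divisions performed by the !Divide example, per the post above


def ns_per_divide(total_s, count=DIVIDES):
    """Average wall time per division in nanoseconds (loop overhead included)."""
    return total_s / count * 1e9


def speedup(software_s, hardware_s):
    """How many times faster the hardware divide run is than the software one."""
    return software_s / hardware_s


for label, sw, hw in (("unsigned", 1.40, 0.36), ("signed", 1.46, 0.35)):
    print(f"{label}: {ns_per_divide(sw):.0f} ns -> {ns_per_divide(hw):.0f} ns, "
          f"{speedup(sw, hw):.1f}x")
```

On these figures the hardware divide runs come out roughly four times faster than the software ones.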
David R. Lane (77) 728 posts |
What files do I need for my Raspberry-Pi 2 model B? I have downloaded from github the files BOOTCODE/BIN, FIXUP/DAT and START/ELF (yesterday), ignored all the other files, and got the latest RISCOS ROM (7/2/15). So do I just use these together with CONFIG/TXT for the bootloader partition? |
Dave Higton (1515) 3404 posts |
It doesn’t matter. |
Jeffrey Lee (213) 6046 posts |
VFPSupport has now been updated to support the Raspberry Pi 2. VFP short vectors are emulated in software, so expect them to be very slow. At the moment it’s using the basic full software emulation, I’m not sure if I’ll ever replace it with something faster (e.g. use a sequence of VFP scalar ops). It starts to get a bit messy when you consider the same code needs to run on the Pi 1 and 2. VFPSupport_Features 2 can be used to detect whether short vectors are implemented in hardware, software, or neither. |