The new Raspberry Pi 2 has a quad-core CPU (BCM2836), but it costs the same as the previous B+. All that extra CPU power isn’t a completely free lunch though. If you use it, it’ll cost you slightly more electrical power than the B+. But how much? That’s what we’re here to look at. In my testing, hammering different numbers of cores in parallel, it worked out at roughly 50 mA (250 mW) per core.
But before I show you the full results, let me back up and show you how we got them because the method is as interesting as the results.
Today we’re testing how much faster the quad-core Cortex-A7 BCM2836 processor is than the single-core BCM2835. I’ve also done some performance measurements on an Allwinner A20-based board (dual-core Cortex-A7) for fun. And while testing these performance gains, I’ve measured the power usage as well.
How Did I Hammer The CPUs?
Back in early 2012, inspired by Pi, I started learning Python while waiting for the Pi to arrive. Everyone said “choose a project – it’s the best way to learn”. So after I’d learnt the basics, I decided to write a program to help me beat my friends at “Words With Friends”. The idea was you punch in the letters you have and it would search through the entire word list and give you a list of possible words you could use.
I got it working, but it was horribly slow and inefficient, even on a PC. It took a couple of minutes to run. I eventually came across the idea of pre-sorting the list of 172,820 words so that the computer could scan it more efficiently. This pre-processing only needed to happen once and it sped up the actual search to just a handful of seconds. It even ran acceptably fast on the Pi, which is why I’ve chosen this pre-sorting program to test the difference in CPUs. 20-30 seconds is a nice easy time period to measure.
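The pre-sorting script itself isn't shown here, so the following is only a guess at the general idea, not the actual presort.py. It assumes that "pre-sorting" means keying each word by its letters in alphabetical order, so that anagrams land in the same bucket and a lookup becomes a single dictionary access instead of a scan of the whole list. The names `presort` and `lookup` and the tiny word list are made up for illustration, and this version only finds words that use exactly the letters you give it.

```python
from collections import defaultdict


def presort(words):
    """Group words by their sorted letters so anagrams share one key."""
    index = defaultdict(list)
    for word in words:
        key = "".join(sorted(word.lower()))
        index[key].append(word)
    return index


def lookup(index, letters):
    """Return every word made from exactly the given letters."""
    key = "".join(sorted(letters.lower()))
    return index.get(key, [])


# Tiny stand-in word list (hypothetical); the real list has 172,820 words.
words = ["tea", "eat", "ate", "tan", "ant"]
index = presort(words)
print(lookup(index, "aet"))  # ['tea', 'eat', 'ate']
```

The expensive part (sorting the letters of every word) happens once in `presort`, which is the part worth timing on different CPUs.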
But that left just one issue. To test the quad-core processor properly we need to be able to run this process 1, 2, 3, & 4 times at once, preferably at will. To do this, we need to be able to run multiple threads simultaneously. I’d used threading once before, but found it a bit fiddly. The online documentation sucked (why so sparse with the examples – documenters?), but I eventually found an example here that I was able to adapt for my own needs. I love examples. Please include lots of examples if you ever document anything. :) Here’s the code I used to drive the threading…
#!/usr/bin/env python2.7
# script by Alex Eames https://raspi.tv/?p=7684
import time
from subprocess import call
from threading import Thread

cmd = "python /home/pi/presort.py"

def process_thread(i):
    print "Thread: %d" % i
    start_time = time.time()
    call([cmd], shell=True)
    end_time = time.time()
    elapsed_time = end_time - start_time
    print "Thread %s took %.2f seconds" % (i, elapsed_time)

how_many = int(raw_input("How many threads?\n>"))

for i in range(how_many):
    t = Thread(target=process_thread, args=(i,))
    t.start()
All it does is ask you how many threads you want (how_many) and then call the presorting script how_many times, running each one in a different thread. It isn’t bullet-proof (there is no error-checking on the input), but I don’t care! It does what it’s meant to.
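If you're on Python 3, a rough equivalent might look like the sketch below. This is not the script I ran for the measurements. The function name `run_parallel` is made up, and the default command is a stand-in workload so the example runs anywhere; on a Pi you'd swap in something like `["python", "/home/pi/presort.py"]`. It also collects the timings and waits for every thread with `join()`, which the original doesn't bother to do.

```python
import subprocess
import sys
import time
from threading import Thread


def run_parallel(cmd, how_many):
    """Run cmd how_many times at once; return elapsed seconds per run."""
    elapsed = [None] * how_many

    def worker(i):
        start = time.time()
        subprocess.call(cmd)  # each call launches a separate OS process
        elapsed[i] = time.time() - start

    threads = [Thread(target=worker, args=(i,)) for i in range(how_many)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every run to finish before reporting
    return elapsed


if __name__ == "__main__":
    # Stand-in CPU-bound workload (hypothetical), not the real presort.py
    cmd = [sys.executable, "-c", "sum(i * i for i in range(10**6))"]
    for i, secs in enumerate(run_parallel(cmd, 2)):
        print("Thread %d took %.2f seconds" % (i, secs))
```

Passing the command as a list avoids needing `shell=True`, and returning the timings makes it easy to log them alongside current readings.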
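From what I can gather from the results and the current measurements, each thread is running on a different CPU when we use multi-core processors. That makes sense, because each thread launches presort.py as a separate process, which the operating system is free to schedule on its own core. So this is a great way to hammer as many or as few CPUs at a time as you choose. :)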
What About The Results?
The B+ can run this pre-sorting script once in 26 seconds or twice simultaneously in about 53 seconds.
The new Pi2 can run it once in about 7.4 seconds, but because it’s quad-core it can actually run it 4 times in parallel in 7.7 seconds as well.
So not only is the ARMv7 BCM2836 about 3.5 times faster per core than the BCM2835’s old ARMv6 (26 seconds vs 7.4 seconds), but there are four cores, so for this example it can get through roughly 3.5 x 4 = 14 times as much work as the B+ in the same time.
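Here's that arithmetic as a quick sketch, using the timings quoted above. The one assumption is that four runs on the single-core B+ would take four times as long as one run, which fits the measured 53 seconds for two runs. The variable names are just for illustration.

```python
b_plus_single = 26.0  # seconds, one run on the B+
pi2_single = 7.4      # seconds, one run on the Pi 2
pi2_four = 7.7        # seconds, four runs in parallel on the Pi 2

# Speed of one Pi 2 core relative to the B+'s single core
per_core_speedup = b_plus_single / pi2_single

# Assumption: the single-core B+ scales linearly, so four runs take 4 x 26 s
b_plus_four = 4 * b_plus_single
four_job_speedup = b_plus_four / pi2_four

print("Per-core speedup: %.1fx" % per_core_speedup)  # 3.5x
print("Four-job speedup: %.1fx" % four_job_speedup)  # 13.5x
```

The four-job figure comes out slightly under 14 because running four copies at once took 7.7 seconds rather than 7.4.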
While I was at it, I ran the same program on an A20-based board (nana). One thread took 7.5 seconds. Two threads took 9.5 seconds. Four threads took roughly double that (it’s dual-core).
As a point of reference, my MacBook Pro ran a single thread in 0.63s (gotta love the cross-platform nature of Python).
The measured current when using the presort.py script to hammer varying CPUs on the Pi2 at 5.18-5.19V
So you can see that each CPU uses ~40-50 mA when being pushed.
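To turn those currents into power, multiply by the supply voltage. A minimal sketch, assuming the 5.18 V figure from the measurements and treating the per-core current as a simple 40-50 mA range (the function name is made up):

```python
def core_power_mw(current_ma, volts):
    """Power in milliwatts for a current in mA at a given voltage."""
    return current_ma * volts


SUPPLY_VOLTS = 5.18  # lower end of the measured 5.18-5.19 V

for current_ma in (40, 50):
    power = core_power_mw(current_ma, SUPPLY_VOLTS)
    print("%d mA at %.2f V is about %.0f mW" % (current_ma, SUPPLY_VOLTS, power))
```

That gives roughly 207 mW to 259 mW per core, which is where the "about 250 mW per core" figure comes from.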
I Also Did My Standard Pi Test Series
The ‘standard’ set of measurements that I do on new Pis does not incorporate multi-core, but I did them anyway, so we can have an accurate comparison of all Pis. The lack of a multi-core test is why I created one above.
Looking at these data, the Pi 2 appears to use a little more than a B+. But since these data are all single-core tests, if we added in another 150 mA to simulate the hammering of all four cores, the numbers would be very similar to those of the model B. I will have to devise a test which shoots 1080p video while hammering all four cores. But that’s for another occasion.
If You Want To Try It
I’ve shared the code for this on GitHub if you want to have a go. To run this on your Pi…
git clone https://github.com/raspitv/pi2test/