image by Ray Reyes
Calculate the standard deviation of a Ruby array
Standard deviation is a measure of the amount of variation within a group of values. A high value indicates a wide spread of values around the mean, whereas a low value indicates tight clustering of values.
The calculation gives additional context to the range of values around the mean value of a series of data. Wikipedia has some good examples of real-world applications.
Ruby doesn’t provide a native method to generate the standard deviation of an array of integers. Its built-in Math library focuses on trigonometry and logarithmic calculations. This isn’t surprising given there’s not even a way to calculate the mean of an array in Ruby.
Ensure you use…
…Array#sum when calculating both the mean and the standard deviation from an array of integers:
a = [1, 2, 3, 4, 5, 6, 7, 8]
mean = a.sum(0.0) / a.size
#=> 4.5
sum = a.sum(0.0) { |element| (element - mean) ** 2 }
#=> 42.0
variance = sum / (a.size - 1)
#=> 6.0
standard_deviation = Math.sqrt(variance)
#=> 2.449489742783178
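The calculation above divides by a.size - 1, which gives the sample standard deviation. If the array holds the whole population rather than a sample of it, the divisor would be a.size instead. A minimal sketch of both, wrapped in methods; the method names here are hypothetical helpers for illustration, not part of Ruby:

```ruby
# Hypothetical helper methods, not part of Ruby's standard library.

def mean(values)
  values.sum(0.0) / values.size
end

# Sum of the squared differences between each value and the mean
def sum_of_squares(values)
  m = mean(values)
  values.sum(0.0) { |element| (element - m)**2 }
end

# Sample standard deviation: divides by n - 1, as in the example above
def sample_standard_deviation(values)
  Math.sqrt(sum_of_squares(values) / (values.size - 1))
end

# Population standard deviation: divides by n
def population_standard_deviation(values)
  Math.sqrt(sum_of_squares(values) / values.size)
end

a = [1, 2, 3, 4, 5, 6, 7, 8]
sample_standard_deviation(a)
#=> 2.449489742783178
population_standard_deviation(a)
# roughly 2.2913 (the square root of 42.0 / 8)
```

Which divisor is appropriate depends on whether the values are a sample or the full set of data you care about.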
Why?
Using the #sum method from Array is many, many times faster than using the alternative, inject.
The #sum method was only added to Array in Ruby 2.4, which is why you might see alternative implementations in other places on the Internet.
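For comparison, the older style you might come across looks something like the sketch below, with inject standing in for each call to #sum. This is an illustration of the alternative, not a recommendation:

```ruby
a = [1, 2, 3, 4, 5, 6, 7, 8]

# inject-based equivalents of the Array#sum calls, for Ruby versions before 2.4
mean = a.inject(0.0) { |total, element| total + element } / a.size
sum_of_squares = a.inject(0.0) { |total, element| total + (element - mean)**2 }
standard_deviation = Math.sqrt(sum_of_squares / (a.size - 1))
```

For this array the result matches the #sum version. With floating point input the two can differ very slightly, because Array#sum applies a compensated summation that a plain inject does not.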
I compared the performance of Ruby’s native methods vs. implementing the algorithms yourself when I wrote about calculating the mean, and the same principles apply: native implementations (in C) are much faster.
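A rough way to check the difference on your own machine is Ruby’s Benchmark module. The array size and repeat count below are arbitrary choices for illustration, and the timings will vary with your Ruby version and hardware:

```ruby
require "benchmark"

# 100,000 integers; the size is an arbitrary choice for this sketch
values = Array.new(100_000) { |i| i + 1 }

Benchmark.bm(8) do |x|
  x.report("sum:")    { 20.times { values.sum(0.0) } }
  x.report("inject:") { 20.times { values.inject(0.0) { |total, element| total + element } } }
end
```

Both approaches produce the same total here, so any difference in the report comes down to how the summing is done.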
Anything else?
In all honesty, if you’re doing a lot of statistical number-crunching work you probably want to reach a little closer to “the metal”.
A version of the standard deviation calculation done in Ruby is much slower than if it were done natively in C.
If you’re doing a lot of this sort of calculation, or you’re in a situation where performance is key, you might want to look at the enumerable-statistics gem. It has natively implemented versions of several statistical summary methods mixed in directly to Ruby’s Array and Enumerable classes.
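A minimal sketch of what using the gem might look like. It assumes the gem is installed, that it is loaded with require "enumerable/statistics", and that its stdev method returns the sample (n - 1) standard deviation by default; check the gem’s README for the version you install. The rescue is only there so the snippet still runs without the gem:

```ruby
a = [1, 2, 3, 4, 5, 6, 7, 8]

begin
  # Assumption: the gem is installed (gem install enumerable-statistics)
  require "enumerable/statistics"

  gem_mean = a.mean
  gem_standard_deviation = a.stdev
rescue LoadError
  # Gem not available: leave the results empty rather than failing
  gem_mean = nil
  gem_standard_deviation = nil
end
```

If those assumptions hold, the values should match the hand-rolled calculation for the same array.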
Last updated on June 28th, 2021