Metrics Matter (Part 1)

Zachary Greenberg
3 min read · Aug 6, 2021


In data science, we can build predictive models for both continuous and categorical data. These are the two general camps of data, and for the models we build around each of them we have various metrics for evaluation. It is of paramount importance that we choose the best one, and the best one for a given model depends heavily on the context of the problem we are trying to solve. For the first part of this two-part series, I will focus on continuous data and its metrics.

What is continuous data?

‘Continuous data is data that can take any value. Height, weight, temperature and length are all examples of continuous data.’ — as defined by OpenLearn

Continuous data has a nearly infinite range of possible values. For example, the weight of a fruit could be 3.456… pounds. Two common settings where continuous data is predicted are regression analysis and time series forecasting: a regression problem might be predicting the box office revenue of a movie, while a time series problem might be predicting the price of a movie ticket over a specified period of time. Both of these problems have continuous outcomes.

What metrics are used to evaluate models involving continuous data?

Common metrics for continuous data include mean absolute error, mean squared error, and root mean squared error.

Mean absolute error (MAE)

Mean absolute error is the average of the absolute differences between the actual and predicted values. It gives you an easily interpretable number for the error, in the same units as the target, and it is less affected by outliers than the two methods below. A disadvantage is that, because of the absolute value, there is no indication of directionality, that is, whether the predictions were overshooting or undershooting the actual values.

#MAE example
actual = [1, 2, 3, 4]
predicted = [1, 1, 2, 1]
total_error = 0
for i in range(len(actual)):
    # accumulate the absolute differences
    total_error += abs(actual[i] - predicted[i])

print(total_error / len(actual))  # 1.25

Mean squared error (MSE)

Mean squared error is the average of the squared differences between the actual and predicted values. Because of the squaring, outliers carry heavier weight in the error: the larger a difference is, the larger it becomes after being squared. In many cases it is important to account for outliers, and this metric is well suited to those situations. It is important to note that the result of this metric is NOT in the same units as the variable we are trying to predict; it is in squared units.

#MSE example
actual = [1, 2, 3, 4]
predicted = [1, 1, 2, 1]
total_error = 0
for i in range(len(actual)):
    # accumulate the squared differences
    total_error += (actual[i] - predicted[i]) ** 2

print(total_error / len(actual))  # 2.75

Root mean squared error (RMSE)

Root mean squared error is simply the square root of the mean squared error. Like mean squared error, it is sensitive to outliers, though taking the square root dampens their effect somewhat. The benefit over mean squared error is that it converts the error back into the original units of the dependent variable, making it interpretable. Generally, this particular metric is favored over mean squared error.

#RMSE example
from math import sqrt
actual = [1, 2, 3, 4]
predicted = [1, 1, 2, 1]
total_error = 0
for i in range(len(actual)):
    # accumulate the squared differences
    total_error += (actual[i] - predicted[i]) ** 2

print(sqrt(total_error / len(actual)))  # ~1.658

In summation, these are the top three evaluation metrics used on continuous data. In practice it is really a choice between two: in my experience, we usually decide between mean absolute error and root mean squared error, and the decision should be based on what you are trying to accomplish. If outliers do not matter to you and you want something easily interpretable, mean absolute error is a quick error calculation. If outliers need to be considered, root mean squared error is the better alternative.
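To make the outlier sensitivity concrete, here is a small sketch comparing the two metrics on the same data with and without an extreme point. The helper functions and the outlier values are my own illustration, not from the original examples:

from math import sqrt

def mae(actual, predicted):
    # average of the absolute differences
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    # square root of the average squared difference
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

actual = [1, 2, 3, 4]
predicted = [1, 1, 2, 1]
print(mae(actual, predicted))   # 1.25
print(rmse(actual, predicted))  # ~1.658

# add one observation that the model badly misses
actual_out = actual + [50]
predicted_out = predicted + [10]
print(mae(actual_out, predicted_out))   # 9.0  -> grows moderately
print(rmse(actual_out, predicted_out))  # ~17.95 -> blows up

Notice how a single large miss moves RMSE far more than MAE, which is exactly why the choice between them should follow from how much you care about outliers.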

This is probably the simpler task when describing the performance metrics of continuous versus categorical data, and as we get into categorical data, this will matter even more. Overall, it is important to consider the circumstances and choose the right metric for the problem at hand.

References

Continuous data — https://www.open.edu/openlearn/ocw/mod/oucontent/view.php?id=85587&section=1

Metrics — https://www.analyticsvidhya.com/blog/2021/05/know-the-best-evaluation-metrics-for-your-regression-model/
