Single Number Evaluation

A single evaluation number greatly simplifies the development process. Most real-life problems require multi-factor evaluation. This is not specific to machine learning: any solution is judged by how fast it runs and by how good its values are across all the output parameters.

How do we compare two solutions that trade one output off against another, where one is more accurate on the first output and the other is more accurate on the second? And how do we compare either of them with a third solution that has similar, but slightly higher, error on both outputs?

These are questions we have to ask ourselves, and we must translate the answers into numbers. This keeps us from getting stuck on adjectives like "similar" or "better". Once we have a single numeric value for evaluation, development becomes much easier and far less dependent on human monitoring.

Metrics come in different types. Sometimes we have multiple outputs, each of them very important, and we want to optimize every one. At other times, improving a given output is no longer meaningful once it crosses a given limit.

Thus, we have satisfying metrics (often called satisficing metrics) and optimizing metrics. A satisfying metric only needs to cross a given threshold. An optimizing metric is one we want to keep improving. We can also have situations where satisfying some metrics matters more than satisfying others.
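The split between the two kinds of metrics can be sketched in a few lines of Python. This is only an illustration under assumed names and numbers: the Candidate class, the pick_best function, the 100 ms limit and the three candidates are all hypothetical, with run time treated as the satisfying metric and accuracy as the optimizing one.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    accuracy: float     # optimizing metric: higher is better
    runtime_ms: float   # satisfying metric: only needs to stay under a limit


def pick_best(candidates, max_runtime_ms):
    """Return the most accurate candidate whose run time meets the limit."""
    # Step 1: keep only the candidates that pass the satisfying threshold.
    feasible = [c for c in candidates if c.runtime_ms <= max_runtime_ms]
    if not feasible:
        return None  # nothing meets the limit
    # Step 2: among those, maximize the optimizing metric.
    return max(feasible, key=lambda c: c.accuracy)


# Hypothetical candidates, made up purely for illustration.
candidates = [
    Candidate("A", accuracy=0.90, runtime_ms=80.0),
    Candidate("B", accuracy=0.92, runtime_ms=95.0),
    Candidate("C", accuracy=0.95, runtime_ms=1500.0),
]
best = pick_best(candidates, max_runtime_ms=100.0)
print(best.name)
```

Here candidate C has the highest accuracy but fails the run time limit, so it is never considered. Once the limit is met, run time no longer influences the choice, which is what makes it a satisfying metric.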

Common examples of multiple evaluation parameters are:

  • Positive and negative errors
  • Short term and long term errors
  • Output accuracy and run time

Based on the problem at hand, we need to identify the category, importance and relationship of each evaluation metric. From these, we should derive a single numeric formula that can be used to evaluate any given model. Until we have this single number, we will need human intervention at each step to judge how a model affects the output.
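One possible shape for such a formula is a weighted sum, sketched below. The function name single_score, the weights and the example numbers are all assumptions chosen for illustration; in practice the weights have to come from the problem itself, for example from how much more an under-estimate costs than an over-estimate.

```python
def single_score(over_error, under_error, runtime_ms,
                 over_weight=1.0, under_weight=3.0, runtime_weight=0.001):
    """Collapse three metrics into one cost value; lower is better.

    The default weights are hypothetical: under-estimates are assumed to be
    three times as costly as over-estimates, and each millisecond of run
    time adds a small penalty.
    """
    return (over_weight * over_error
            + under_weight * under_error
            + runtime_weight * runtime_ms)


# Two hypothetical models that trade one kind of error for the other.
model_x = single_score(over_error=4.0, under_error=1.0, runtime_ms=200.0)
model_y = single_score(over_error=2.0, under_error=2.0, runtime_ms=100.0)

print("model_x cost:", model_x)
print("model_y cost:", model_y)
print("preferred:", "model_x" if model_x < model_y else "model_y")
```

With these assumed weights, model_x comes out at roughly 7.2 and model_y at roughly 8.1, so model_x is preferred even though it is slower and has the larger over-estimate. Changing the weights changes the winner, which is why the importance of each metric has to be decided before any models are compared.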