Edit detail for ThresholdsAndLimitsAreBad revision 1 of 1

Editor: DonovanBaarda
Time: 2010/08/27 10:22:53 GMT+0

There seems to be a human habit of trying to divide everything into black and white, high and low, and then making hard decisions based on that division. This works OK when there really are only two possible options: do or don't, stop or go. However, in most cases there are not two possible options but a continuous range of possibilities: shades of grey, degrees of height, how much you do, and how fast you go. People still insist on treating these continuous ranges as if they were on/off binary problems by introducing arbitrary thresholds and limits: blacker than grey is black, more than 2m is high, do lots or none, stop or go fast. They seem to think that transforming a continuous problem into a binary problem makes it simpler. Unfortunately, it usually makes it far more complicated.

The first complication is deciding where to set your arbitrary limits and thresholds: how black is black, how high is high, how much is lots, how fast is fast? No matter how well you guess these values, they will be wrong when the situation changes and you encounter a blacker black, a higher high, more than lots, or faster than fast. As systems grow, the numbers they deal with get bigger, and your arbitrary limits and threshold values will not scale with the system.

The second complication is that the consequences of getting these limits and thresholds wrong are much worse. A tiny tweak of the threshold can produce the extreme opposite reaction to the same situation. The threshold becomes a cliff where everything changes dramatically, causing things that are close to the edge to flip wildly when they go over it. Imagine trying to drive a car where the steering wheel only gave you hard-left and hard-right.
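
As a rough illustration of the cliff, here is a minimal sketch comparing a thresholded reaction with a continuous one for two inputs that sit either side of the "more than 2m is high" line. The function names and the particular smooth curve are assumptions made up for this example, not taken from any real system.

```python
def step_response(x, threshold=2.0):
  """Binary decision: full reaction above the threshold, none below it."""
  return 1.0 if x > threshold else 0.0


def smooth_response(x, midpoint=2.0):
  """Continuous alternative: the reaction grows gradually with x."""
  return x / (x + midpoint)


# Two inputs that differ by only 0.02, one on each side of the threshold.
below, above = 1.99, 2.01
step_jump = step_response(above) - step_response(below)
smooth_jump = smooth_response(above) - smooth_response(below)
print(step_jump)    # the full swing from nothing to everything
print(smooth_jump)  # a tiny fraction of the output range
```

The same 0.02 change in input flips the stepped version from one extreme to the other, while the continuous version barely moves.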

When people realize the problems with their simple on/off solution, their first reaction is to introduce more thresholds and limits in the form of "bands". If one doesn't fix it, just add more! You now have many arbitrary thresholds and limits that you need to set right. And each one of these thresholds becomes a mini-cliff to fall off. The fall is not as bad as a single big cliff, but it is still bad for things that are on the edge, and because you have more edges, you have more things on those edges.

From a control-systems point of view, systems with steps and bands are very complicated to make stable. They are mathematically difficult to model, and each step change is like a lump of crap in the system that makes it skip and wobble every time it crosses one. Each skip and wobble can have surprising interactions with other parts of the system, and can be amplified into an increasing oscillation that makes the whole system go unstable and crash. This applies to all sorts of systems, whether they are aircraft control systems, taxation systems, or general software systems.
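
The wobble is easy to reproduce in a toy simulation. The sketch below is only an assumed model (a single value nudged towards a target once per step, with made-up gains), and it shows a sustained oscillation rather than a growing one, but it gives the flavour: the stepped controller never settles, while the continuous one does.

```python
def simulate(controller, steps=50, target=20.0, start=15.0):
  """Run a toy first-order system and return the values it passes through."""
  value = start
  history = []
  for _ in range(steps):
    value += controller(target - value)
    history.append(value)
  return history


def bang_bang(error):
  """Stepped control: full push one way or the other, nothing in between."""
  return 2.0 if error > 0 else -2.0


def proportional(error):
  """Continuous control: push in proportion to the remaining error."""
  return 0.5 * error


stepped = simulate(bang_bang)
smooth = simulate(proportional)

# Peak-to-peak movement over the last ten steps of each run.
stepped_wobble = max(stepped[-10:]) - min(stepped[-10:])
smooth_wobble = max(smooth[-10:]) - min(smooth[-10:])
print(stepped_wobble, smooth_wobble)
```

The stepped controller keeps bouncing back and forth across the target forever, while the proportional one halves its error every step and comes to rest.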

Describing and implementing bands also becomes increasingly complicated the more you try to refine them to make them work. What might have started out as a simple on/off solution rapidly becomes more tangled as you add more bands and try to tune them. Take `Individual income tax rates <http://www.ato.gov.au/individuals/content.asp?doc=/content/12333.htm>`_ for 2010 of the AustralianTaxationSystem as an example. The Python code for this looks like this::

  def GetTax(income):
    """Return the income tax payable for the provided income."""
    if income <= 6000.00:
      return 0.0
    elif income <= 35000.0:
      return 0.15 * (income - 6000.0)
    elif income <= 80000.0:
      return 4350.0 + 0.30 * (income - 35000.0)
    elif income <= 180000.0:
      return 17850.0 + 0.38 * (income - 80000.0)
    else:
      return 55850.0 + 0.45 * (income - 180000.0)

Check out all the "magic numbers" in this implementation. For each band there is a base tax figure that is calculated from the accumulated taxes in the lower bands, and it will need to be recalculated if any of the thresholds or rates are adjusted. Imagine trying to write an Excel spreadsheet cell formula for this. If we calculate and graph the overall taxation rate vs income for this, it looks like the red line in this graph. Note how it kicks and bucks along its way as it hits the different thresholds:

.. image:: IncomeTaxCalculation.PNG
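
One way to avoid recalculating the base figures by hand is to derive them from a table of thresholds and rates. The following is a sketch only: ``BANDS`` and ``GetBandedTax`` are hypothetical names, and the table simply restates the thresholds and rates from the function above. It removes four of the magic numbers, but it still needs a table, a loop, and special handling for the top band just to describe the bands.

```python
# (lower threshold, marginal rate) for each band, taken from GetTax above.
BANDS = [
  (0.0, 0.00),
  (6000.0, 0.15),
  (35000.0, 0.30),
  (80000.0, 0.38),
  (180000.0, 0.45),
]


def GetBandedTax(income, bands=BANDS):
  """Return banded tax, accumulating the tax owed in each band."""
  tax = 0.0
  for i, (lower, rate) in enumerate(bands):
    if income <= lower:
      break
    # The band ends where the next one starts; the top band has no end.
    upper = bands[i + 1][0] if i + 1 < len(bands) else income
    tax += rate * (min(income, upper) - lower)
  return tax


for income in (6000.0, 35000.0, 80000.0, 180000.0):
  print(income, GetBandedTax(income))
```

Evaluated at each threshold, this reproduces the hard-coded base figures of the original function to within floating-point rounding.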

For all the complication in this function, what is it trying to achieve? The objective is to have a total tax rate that goes from 0% at $0 income, and goes towards 45% as income goes above $180000. At about $80000 income the tax rate should be about half the top tax rate of 45%. The yellow line in the above graph was calculated using this formula in Python::

  def GetTax(income):
    """Return the income tax payable for the provided income."""
    taxrate = 0.45 * income / (income + 80000.0)
    return income * taxrate

Notice how much simpler this is than the banded monstrosity before it. This is a simple function you can easily put in a spreadsheet cell. There are only two magic numbers: the 0.45 peak tax rate, and the 80000 income at which the overall tax rate is half the peak tax rate. It also does a better job of smoothly increasing the taxation rate between the desired points, without any weird wobbling along the way. Tuning and adjusting this function to achieve the desired result will be much easier than adjusting the many different variables in the banded solution. Note that this is used as a simple example, but I think there are even better ways to simplify the AustralianTaxationSystem than this.
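
To make the tuning point concrete, here is a sketch of the same formula with its two magic numbers pulled out as parameters. The names ``GetSmoothTax``, ``OverallRate``, ``peak_rate``, and ``half_income`` are hypothetical, chosen only for this example.

```python
def GetSmoothTax(income, peak_rate=0.45, half_income=80000.0):
  """Return tax where the overall rate approaches peak_rate as income grows."""
  taxrate = peak_rate * income / (income + half_income)
  return income * taxrate


def OverallRate(income, **kwargs):
  """Return tax as a fraction of income (0.0 for zero income)."""
  return GetSmoothTax(income, **kwargs) / income if income else 0.0


# Overall rate at a few incomes using the default parameters.
for income in (20000.0, 80000.0, 180000.0, 1000000.0):
  print(income, round(OverallRate(income), 4))

# Retuning is a one-parameter change, with nothing else to recalculate.
print(round(OverallRate(50000.0, half_income=50000.0), 4))
```

With the defaults, the overall rate at $80000 is exactly half the peak rate, and it keeps climbing towards 45% without ever reaching it. Moving the half-way point or the peak rate means changing one argument.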

So if you ever find yourself about to add some arbitrary limit, threshold, or banding to something you are designing, stop! Try to figure out a continuous function instead... it will turn out simpler to implement, will be much easier to tune and make stable, and will scale better as your system grows.

Arbitrary limits and thresholds don't scale; continuous functions do.

