I. Uncertain Numbers: Random Variables

The most basic building block of uncertainty is the concept of an uncertain number. Examples include next month's sales, this afternoon's temperature, or today's closing price of your favorite stock. In statistics courses the closely related term random variable is used in the discussion of uncertain numbers. However, in my experience, if you use this term in the workplace few people will understand you, so I will stick with uncertain number.

As uncertain numbers go, one of the simplest is the number pointed to by a game board spinner as shown on the right, which can vary between 0 and 1.

The RAND() function in your spreadsheet outputs a random number which is indistinguishable from the outcome of such a spinner. If you are not already aware of this function, you may be wondering why on earth the designers of spreadsheets felt it necessary to build in the simulation of a spinner. The answer is that it can help model uncertainty in a wide range of situations. Spinners, for example played an important role in the famous 1959 book on Portfolio Selection by Harry Markowitz [Markowitz 1959]. Today, spinners are at the heart of most simulations including those used by traders on Wall Street. We will start with a simple example that models a business uncertainty with a spinner.

A Risky Business

Suppose that you are considering the start up of a new business. Revenues in the first year are quite uncertain. Imagine, for example, that the uncertainty in revenue is the same as if God spins the spinner, and multiplies by $1Million. The downside is that if revenues are less than $200,000, you will go out of business and face personal bankruptcy.

Uncertainty and Risk

Uncertainty and Risk go hand in hand, but they are not identical. Uncertainty is an objective feature of the universe while risk is in the eye of the beholder. In this case, for example, the uncertainty, determined by the physics of the spinner, is the same for all observers. The risk to you is that revenues will be less than $200,000, forcing you out of business. The risk to your competitors is that revenues will be greater than $200,000, forcing them to continue to contend with you.

The Shapes of Uncertain Numbers

Associated with every uncertain number is a shape known as a histogram, which displays the likelihood that the number falls within various ranges of values referred to as bins. Histograms are an important output of simulations.

PUZZLE 1:
In general it is difficult to estimate these shapes, but can you guess the shape of the histogram for the spinner? Give it a try by opening the file histogram.xls. Now adjust the bars to show the percentage of spins that you expect to lie in each of the regions (0.0 - 0.2, 0.2 - 0.4 etc.)


Histogram.xls

When you have made your guess, go on to the following tutorial in which we will simulate the spinner.

Tutorial 1a - Simulating Uncertain Revenue

To gain a better understanding of this situation, we will perform a Monte Carlo simulation of the spinner, that is, repeat this random situation many times, and analyze the results.

To simulate the revenue of your proposed business, proceed as follows.

  1. On a blank worksheet place =RAND() in cell A2 as shown below. Press the F9 (calculate key) a few times to see how its output changes. Type "Revenue (Millions)" in cell A1.

  1. Invoke the Simulate, Simulation command.

The Simulation Settings dialog box will appear as shown below.

  1. Click the Add Outputs button. An output is a cell whose outcome you are interested in examining. Specify cell A2 as the output cell and cell A1 as the output name then click OK. You can also type in a name for the output in the Names box.

  1. Specify the number of trials on the Simulation Settings box, then click OK. The greater the number of trials the more reliable the results.
  2. The statistics of the simulation appear in a new workbook (Simstats.xls). This indicates that the average revenue is about .5, but since this was measured in Millions, this is really $500,000.

  1. Go on to Tutorial 1b before turning off your computer. Do not close Simstats.xls as you will need it for the next lesson.

Tutorial 1b - Histograms and Cumulative Distributions

Histograms

  1. Invoke the Simulate, Graphs command. Check off the "Revenue (Millions)" box on the Outputs tab.

Then go to the Graphs tab. Specify 5 bins and 2 decimal places, then click the Histogram button.

You should get a graph as shown below. Note two important features of this graph:

A. All the bars are about the same height because any number between 0 and 1 is equally likely. That is, the spinner is not more likely to point to some numbers than others. When presented with Puzzle 1, above, many graduates of statistics courses specify bars of different heights (see Reference [3]). In the words of Mark Twain, they have had their schooling interfere with their education. If you got this wrong, make sure you understand why the correct graph is flat across.

B. The bars add up to 100% because there is a 100% chance that the spinner will point to some number between 0 and 1. This assumes that there is no chance the arrow will fall off and land on the floor! The bars of all histograms must add up to 100%.

  1. Now specify 10 bins on the Graph tab and click the Histogram button again (if you closed the graph dialog box re-open it using Simulate, Graphs). Notice that the basic shape is the same, but now the bars are all roughly half as high as they were before (10% instead of 20%). This is because each bin is now half as big a target for the arrow of the spinner to hit.

Mean, Mode and Median

The mean, mode and median are often misunderstood concepts, which may all be grasped in terms of histograms. For symmetric histograms these three numbers are the same. But for asymmetric ones they will be different as shown in the figure below.

The Mean, Average or Expected Value
If the histogram were made of a solid material, this is the point at which it would balance. It is calculated by the sum of the height of each bar multiplied by its location on the x axis.

The Median
This is the point at which half the sum of all the bar heights is to the left, and half to the right. The median is also known as the 50th percentile.

The Mode
This is the location of the highest bar. If more than one bar is higher than both its neighbors, the distribution is said to be multi-modal.

Cumulative Distribution

A related shape associated with every uncertain number is its Cumulative Distribution, as discussed below.

  1. Click the Cumulative button on the Graphs page (if you closed the graph dialog box re-open it using Simulate, Graphs). You should see the Cumulative Distribution graph below. This graph shows the likelihood that the number falls below the value on the x axis.

For example, the percentage of spins less than 0.2 is 20%, so that is the chance of financial ruin in the business example. The vertical line displays the Mean (average) of Revenue at about 0.5.

Note that the percentiles shown on the Statistics tab of Simstats.xls are the numerical equivalent of the cumulative graph.

Summary

Uncertain numbers may be viewed as shapes that depict the likelihood of the possible outcomes.

Monte Carlo simulation, the process of bombarding a model with random inputs, can help us estimate these shapes.