Monte Carlo Methods and
|
Probability models are common in a variety of areas, ranging from physics and chemistry to sociology and finance. The simplest probability models involve systems with a finite number (N) of distinct outcomes and non-zero probabilities which can be assigned to these events:
The system could be very simple. For example, there are 36 possible outcomes from tossing a pair of dice and, if the dice are fair, the probabilities are P=1/36 for each of the outcomes. Or, the system could be incredibly complicated, such as the Ising lattice systems of this unit, with an almost incomprehensible number of possible states and state probabilities which vary over many orders of magnitude.
Consider some probability system (that is, a set of outcomes and values for the probabilities of these outcomes), and let F be some numerical function of these outcomes. For example, in the dice model, one could consider F(A) to be the total number of spots showing for outcome A. As a more complicated system, the set {A} could be the set of possible future price movements for some financial security, and F(A) could be a payoff function depending on the price of the stock at some "option expiration date". A particularly important quantity in such situations is the Expected Value of the function F (e.g., an investor will be really interested in knowing the expected income from a purchase). This quantity is given by a weighted sum over all possible outcomes:
This sum considers each possible outcome, evaluates the function F for that outcome, and then "weights" the function value by the probability that the selected outcome actually occurs. The expected value is thus an average of sorts, with the "of sorts" part meaning that really rare events (i.e., P(A) is very tiny) don't count much in evaluating the "average".
The Expected Value is a key concept in statistical physics, and many measurable quantities for a macroscopic chunk of matter are simply related to expected values of "atomic level" functions. Probabilities of the various microscopic states are given by the Boltzmann function:
In principle, then, the "averaging" problem for statistical physics would seem to be solved. Given a physical model, we could evaluate the energies E(S) for each of the possible (microscopic) states, and Equation 2 would then give the probabilities of these states. To evaluate any "average value" function of the system, we need only perform the sum in Equation 1.
However, as emphasized in the Ising Model Introduction document, the number of terms in summation in Equation 1 is incomprehensibly large. (If we could somehow use all computing resources on the planet for one full year we could only evaluate Equation 1 for "trivially simple" problems.)
In practice, the "formal" solution in Equation 1 is worthless!
Due to the "exponential explosion" in the number of possible states, it is simply impossible to calculate expected values from the definition in Equation 1. On the other hand, for any real system, if is fairly simple to estimate this quantity from a number of measurements.
To have a definite example in mind, let's pretend that the 2-dimensional Ising model described in the Ising Model Introduction document is a "real" system and, more importantly, let's assume that one can do actual experiments to measure the total magnetization of the system. The theoretical model for the average total magnetization is simply
For any particular state S, N(UP) and N(DOWN) are the number of individual sites with spins pointing up and down. The sum in Equation 4 is over all possible states, and the probabilities for each state are given by Equation 2.
We cannot evaluate the predicted magnetization in Equation 3, but we can imagine doing an experiment to measure the magnetization of the (assumed) real system. In an absolutely idealized world (which we assume), such a measurement provides an instantaneous "snapshot" of the fluctuating physical system. For the particular state at the time of the measurement, we actually measure the relative number of Up and Down spins.
A good experiment is generally not limited to a single measurement, and instead it is common to average the the values from a number of separate measurements.
Unlike the computationally overwhelming sum in Equation 1, one often gets reasonable values from the "experimental sum" in Equation 4 with a modest number of measurements.
"Status Report:"
That is,
The Law of Large Numbers is so important that it is worth rephrasing it in a slightly poetic fashion:
Nature freely (or at least cheaply) gives what man and all his computers cannot accomplish.The solution to the "summation problem" of the first section is now clear: the exact but impossible expression in Equation 1 must be replaced by an algorithm which approximates the "natural sampling" of Equation 2.(The sum on the left hand side of Equation 5 has a "godzillion" terms, while that on the right hand side might well have only a few hundred.)
There are, in fact, two significant differences between the sums in Equation 1 and Equation 4:
Thinking about these differences a bit leads to the identification of two obvious problems with the "theory sum" in Equation 1:
That is: Equation 1 leads to a lot of effort "computing zero".
That is: Equation 1 can be highly redundant.
Alternative Evaluation Strategy
- Replace the complete summation in Equation 1 by a (vastly) restricted sum over "representative" states.
- Select these representative states carefully according to the full probability function of Equation 2, meaning:
The frequency with which some class of events is included in the representative sum must be the same as the actual probability for that class.
Ignoring, for the moment, the question of how the selection of representative states is actually done, this leads to a fairly simple "flow chart" of a computer algorithm which simulates the "experimental evaluation" of Equation 4:
Each pass down through the three boxes amounts to a single computer simulation of an "experimental" event. In the first (and trickiest) step, random numbers are used to generate a state - such as the specification of all spins in an Ising model. The next two steps are "data analysis", performing the same calculations that might be done in a real experiment. The event generation loop is simply repeated until one has produced a large enough sample size "L" for the evaluation of Equation 4.
The usefulness of this procedure clearly depends on the ability of the computer code to generate random events "just like nature". That is, the frequency with which the algorithm generates any particular type of events must be the same as that specified by the Boltzmann distribution:
This correspondence is required in order to justify the use of a small sum to evaluate predicted expectations (as in Equation 4).
There are many different ways to design computer algorithms which satisfy Equation 6. Most of these algorithms share two common features:
Suppose that the algorithm as gotten to the box labelled "Old State". This means that all details of the Old State are specified (and "known" within the computer program). The algorithm must now decide between two alternatives for the next step. The procedure used is straightforward:
A Monte Carlo Calculation is simply a sequence of random choices. This reliance on chance is the origin of the name of the procedure, referring to the famous casino in Monaco.
There are many, many forms for Monte Carlo calculations depending on the details of the problem at hand. The standard technique for Statistical Physics simulations is the Metropolis Algorithm, which can be described as follows:
This question is answered using the standard "Decision Process" at each site. The appropriate probability function for the random selection is presented below.
The decision to flip a particular spin in the second step is done by using the Boltzmann probability function of Equation 2 and the Ising Model for the system energy, as described in the Ising Model Introduction document. The ratio of the probabilities before and after the possible spin flip is given by
The "Random Spin Flipping" algorithm uses this ratio as follows:
And that's it! The random decision in the third step is the key to ensuring that the constraint in Equation 6 is (eventually) satisfied.
The Ising Model code included as part of this unit is ultimately a simple package for probabilistic decisions based on Equation 7. Indeed, most of the actual code is devoted to extraneous "junk", such as checking bad entries in user requests and producing graphics output files.