DEN Discussion List Archive

Re: R-Bar/d2



ajhansen2001@hotmail.com asked, "Where does d2 come from?" (d2 as used
in SPC).

Our respected lawyer friend, John David Kromkowski, provided a good
response but expressed dissatisfaction with his own answer.

This exchange gives me an excuse to introduce my own interpretation of
why SPC works.  It also touches on the ONLY area in which Don Wheeler
and I have agreed to disagree.  Don has provided extensive evidence, by
simulation, that even though the numbers we use (such as d2, etc.) are
derived from the Gaussian Distribution, the underlying distribution of
the process we are analyzing does not have to be Gaussian; the methods
of SPC still work.  Hence his advice: "Quit worrying."
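
For readers wondering what "derived from the Gaussian Distribution"
means in practice: d2 is the expected range of n independent draws from
a Gaussian with unit standard deviation, so R-bar/d2 estimates sigma.
What follows is only a minimal sketch in Python, assuming standard
normal draws.  The function names estimate_d2 and sigma_from_rbar are
illustrative (they are not from the original exchange), and 1.128 and
2.326 are the commonly tabulated d2 values for n = 2 and n = 5.

```python
import random


def estimate_d2(n, trials=100_000, seed=0):
    """Monte Carlo estimate of d2: the mean range of n standard normal draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += max(sample) - min(sample)
    return total / trials


def sigma_from_rbar(subgroups, d2):
    """Estimate sigma as R-bar / d2, where R-bar is the average subgroup range."""
    ranges = [max(g) - min(g) for g in subgroups]
    r_bar = sum(ranges) / len(ranges)
    return r_bar / d2


if __name__ == "__main__":
    # Compare the simulated values with the tabulated constants.
    print("n=2:", round(estimate_d2(2), 3), "(tabulated 1.128)")
    print("n=5:", round(estimate_d2(5), 3), "(tabulated 2.326)")
```

Swapping rng.gauss for another generator (uniform, exponential, and so
on) is one simple way to see for yourself how much the constant shifts
when the process is not Gaussian.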

My take is a little different.  I argue from the principle of maximum
entropy, which I have discussed extensively in my book, "Rational
Descriptions, Decisions and Designs" (Pergamon Press, First Ed. 1969).

This principle begins with the question:  "How do you tell someone
neither more nor less than you truly know?"  The answer is: "You must
speak in the language of probability theory."  (See Chapter I of the
book.)

This then raises the question, "What is probability?"  The answer, based
on the work of Cox and Jaynes (plus many others in history) is:
"Probability is a number, assigned to allow you to describe incomplete
knowledge."  

This then begets another question: "Then how do you go about assigning
probabilities?"

The final answer:  "Assign the probability distribution that agrees
with your (incomplete) information and maximizes the entropy of the
distribution."

But you, being an intelligent person, ask: "But why do you choose the
entropy defined by Claude Shannon (and others)?"  The answer is:
"Because entropy is a unique measure of what you do not know when all
you know is a probability distribution."

Now, how does all this apply to SPC?  The answer:  If all you intend to
record about a data stream is the average and the average squared
deviation, then the Gaussian Distribution will be found to have the
maximum entropy, subject to the constraint that it reproduce that mean
and average squared deviation.  Thus, encoding this limited knowledge in
a Gaussian provides a description of your knowledge without assuming
anything beyond it.
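
The argument can be sketched as a short variational calculation.  This
is the standard textbook outline, not a quotation from the book; the
symbols p, mu, sigma and the Lagrange multipliers lambda_0, lambda_1,
lambda_2 are notation chosen here for illustration.

```latex
% Maximize the entropy of a density p(x) on the real line
\begin{align*}
  H[p] &= -\int p(x)\,\ln p(x)\,dx \\
\intertext{subject to normalization, a given mean, and a given average
squared deviation:}
  \int p(x)\,dx &= 1, \qquad
  \int x\,p(x)\,dx = \mu, \qquad
  \int (x-\mu)^2\,p(x)\,dx = \sigma^2 .
\intertext{Setting the variation of the Lagrangian to zero gives}
  -\ln p(x) - 1 - \lambda_0 - \lambda_1 x - \lambda_2 (x-\mu)^2 &= 0
  \;\Longrightarrow\;
  p(x) = \exp\!\bigl[-1 - \lambda_0 - \lambda_1 x - \lambda_2 (x-\mu)^2\bigr].
\intertext{The three constraints then force $\lambda_1 = 0$ and
$\lambda_2 = 1/(2\sigma^2)$, so that}
  p(x) &= \frac{1}{\sqrt{2\pi\sigma^2}}
          \exp\!\left[-\frac{(x-\mu)^2}{2\sigma^2}\right],
  \qquad
  H[p] = \tfrac{1}{2}\ln\!\bigl(2\pi e\,\sigma^2\bigr).
\end{align*}
```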

Use of the maximum entropy distribution does not mean that the
underlying distribution IS a Gaussian.  Rather it means that the ONLY
justifiable incomplete description you can give is a Gaussian.
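
One way to check the maximum entropy claim numerically is to compare
the differential entropy of a few familiar distributions that share the
same standard deviation.  The sketch below uses the standard closed-form
entropies; the choice of the uniform and Laplace distributions as
comparisons, and the function names, are assumptions made here purely
for illustration.

```python
import math


def gaussian_entropy(sigma):
    """Differential entropy (in nats) of a Gaussian with std dev sigma."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)


def uniform_entropy(sigma):
    """Entropy of a uniform distribution whose std dev is sigma.

    A uniform of width w has variance w**2 / 12 and entropy ln(w).
    """
    width = sigma * math.sqrt(12.0)
    return math.log(width)


def laplace_entropy(sigma):
    """Entropy of a Laplace distribution whose std dev is sigma.

    A Laplace with scale b has variance 2 * b**2 and entropy 1 + ln(2 * b).
    """
    b = sigma / math.sqrt(2.0)
    return 1.0 + math.log(2.0 * b)


if __name__ == "__main__":
    sigma = 1.0
    for name, fn in [("Gaussian", gaussian_entropy),
                     ("Laplace", laplace_entropy),
                     ("Uniform", uniform_entropy)]:
        print(f"{name:8s} entropy at sigma={sigma}: {fn(sigma):.4f} nats")
```

For any fixed sigma the Gaussian value comes out largest of the three,
which is consistent with (though of course no proof of) the claim that
the Gaussian assumes the least beyond the mean and average squared
deviation.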

I guess the fact that the book is out of print testifies that it was not
well written.  However, the International Workshops on Maximum Entropy
have been held every year for 24 years, attended by scientists,
engineers and statisticians from all over the world. I take it that this
means that the maximum entropy principle is alive and well.  This
principle has been applied to numerous problems, from sunspots to public
health.  SPC is one of the simplest examples.

The book, "Probability Theory: The Logic of Science", by Edwin T.
Jaynes (edited posthumously by Larry Bretthorst), Cambridge Univ. Press
(2003), is the most comprehensive and authoritative source.

Myron Tribus


