Randomness and Determinism in daily life

Where is the universe heading?

Supriyo Banerjee
13 min read · Dec 15, 2018

After I read the books ‘Fooled by Randomness’ and ‘The Black Swan’, I became obsessed with understanding the concept of randomness. The former book lit the fire in me and the latter acted as the fuel. Strangely enough, though, I was diverted to a new field highly coupled with my major, i.e. Computer Science.

I began wondering: is randomness, and in particular the random bit flips in a computer, related to the uncertainty principle of the quantum world and the uncertainty that we see around us, especially in the stock markets?

To tell you the truth, I thought this was an original idea at first. But then I found out (sadly!) that a lot of research is going on around these ideas. Nobody has the definitive answer, but researchers have come up with beautiful ideas which I want to share with all of you. So let us start.

Inspiration

Let me first take an anecdote from ‘The Black Swan’. The author, NNT, poses a question to a businessman named Fat Tony and a mathematician, Dr. John.

NNT (the author): Assume that a coin is fair, i.e., has an equal probability of coming up heads or tails when flipped. I flip it ninety-nine times and get heads each time. What are the odds of my getting tails on my next throw?
Dr. John: Trivial question. One half, of course, since you are assuming 50 percent odds for each and independence between draws.
NNT: What do you say, Tony?
Fat Tony: I’d say no more than 1 percent, of course.
NNT: Why so? I gave you the initial assumption of a fair coin, meaning that it was 50 percent either way.
Fat Tony: You are either full of crap or a pure sucker to buy that “50 percent” business. The coin gotta be loaded. It can’t be a fair game.
(Translation: It is far more likely that your assumptions about the fairness are wrong than the coin delivering ninety-nine heads in ninety-nine throws.)
NNT: But Dr. John said 50 percent.
Fat Tony (whispering in my ear): I know these guys with the nerd examples from the bank days. They think way too slow. And they are too commoditized. You can take them for a ride.

So, who do you think is correct in this story?

The laws of probability tell us that Dr. John is correct, but something deep inside us is forcing us to accept Tony’s answer. It just does not sit well with our intuition.

So is our intuition about chance events wrong? Should we trust our intuitions or probability theory? Let’s find out.

The two psychologists.

This kind of problem bugged two psychologists, Daniel Kahneman and Amos Tversky, back in 1982. As the paper ‘The Information-theoretic and Algorithmic Approach to Human, Animal and Artificial Cognition’ puts it:

The famous work of Kahneman, Slovic and Tversky (1982) aimed at understanding how people reason and make decisions in the face of uncertain and noisy information sources. They showed that humans were prone to many errors about randomness and probability. For instance, people tend to claim that the sequence of heads or tails “HTTHTHHHTT” is more likely to appear when a coin is tossed than the series “HHHHHTTTTT”.
In the “heuristics and bias” approach advocated by Kahneman and Tversky, these ‘systematic’ errors were interpreted as biases inherent in human psychology, or else as the result of using faulty heuristics. For instance, it was believed that people tend to say that “HHHHHTTTTT” is less random than “HTTHTHHHTT” because they are influenced by a so-called representativeness heuristic, according to which a sequence is more random the better it conforms to prototypical examples of random sequences. Human reasoning, it has been argued, works like a faulty computer. Although many papers have been published about these biases, not much is known about their causes.

So now the question we ask is: is human intuition faulty, as Kahneman and Tversky described it, or should we trust our intuitions?

The ideas of Kahneman and Tversky (henceforth K&T) prevailed after Kahneman was awarded the Nobel prize in economics in 2002. Tversky had passed away before that (of cancer), else he would have got the prize too.

Quite recently, however, these findings of K&T are being questioned, and a new theory of probability is being applied to this problem that fits human intuition better than the classical theory. That is, instead of saying that human intuition is at fault, researchers are trying to find out the problems in the classical theory of probability and the reasons why it is not able to match the way we think.

The authors of the paper ‘The Information-theoretic and Algorithmic Approach to Human, Animal and Artificial Cognition’ have called it a new paradigm:

“During the last decades, a paradigm shift has occurred in cognitive science. The ‘new paradigm’ — or Bayesian approach — suggests that the human (or animal) mind is not a faulty machine, but a probabilistic machine of a certain type. According to this understanding of human cognition, we all estimate and constantly revise probabilities of events in the world, taking into account any new pieces of information, and more or less following probabilistic (including Bayesian) rules. Studies along these lines often try to explain our probabilistic errors in terms of a sound intuition about randomness or probability applied in an inappropriate context.
For instance, a mathematical and psychological reanalysis of the equiprobability bias was recently published [Gauvrit & Morsanyi 2014]. The mathematical theory of randomness, based on algorithmic complexity (or on entropy, as it happens) does in fact imply uniformity. Thus, claiming that the intuition that randomness implies uniformity is a bias does not fit with mathematical theory. On the other hand, if one follows the mathematical theory of randomness, one must admit that a combination of random events is, in general, not random anymore. Thus, the equiprobability bias (which indeed is a bias, since it yields frequent faulty answers in the probability class) is not, we argue, the result of a misconception regarding randomness, but a consequence of the incorrect intuition that random events can be combined without affecting their property of randomness.
We now believe that when we have to compare the probability that a fair coin produces either “HHHHHTTTTT” or any other 10-item-long series, we do not really do so. One reason is that the question is unnatural: our brain is built to estimate the probabilities of the causes of observed events, not the a priori probability of such events. Therefore, say researchers, when we have participants rate the probability of the string s = “HHHHHTTTTT” for instance (or any other), they do not actually estimate the probability that such a string will appear on tossing a fair coin, which we could write as P(s|R) where R stands for “random process”, but the reverse probability P(R|s), that is, the probability that the coin is fair (or that the string is genuinely random), given that it produced s.

‘A priori’ means before the event has occurred. For example, before a coin is tossed, I assume that if the coin is fair, the chance of a head or a tail is one half, i.e. P(Head) = P(Tail) = 0.5. However, if the coin is not fair, then these probabilities are not 0.5, and we need to alter our a priori probabilities.

Q - How do we do that ?

A - We need to toss the coin several times and look at the ratio of heads to tails. Suppose after tossing the coin 500 times we see that the ratio of heads to tails is 4:1. An experimenter who uses that coin thereafter holds a different a priori belief from that of a fair coin.
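To make this concrete, here is a minimal sketch (my own illustration, not from the paper) of such an update using a Beta-Binomial model in Python with scipy. We start from a prior belief that the coin is roughly fair and revise it after seeing 400 heads in 500 tosses, i.e. the 4:1 ratio above. The choice of Beta(50, 50) as the prior is an assumption made purely for illustration.

```python
from scipy import stats

# Prior belief about the probability of heads: Beta(a, b).
# Beta(50, 50) encodes a fairly strong initial belief that the coin is fair.
a_prior, b_prior = 50, 50

# Observed data: 500 tosses with a 4:1 ratio of heads to tails.
heads, tails = 400, 100

# Bayesian update for a Beta-Binomial model: simply add the observed counts.
a_post, b_post = a_prior + heads, b_prior + tails

prior = stats.beta(a_prior, b_prior)
posterior = stats.beta(a_post, b_post)

print("Prior mean P(heads):     %.3f" % prior.mean())      # 0.500
print("Posterior mean P(heads): %.3f" % posterior.mean())  # ~0.750
print("Posterior P(0.45 < p < 0.55): %.6f"
      % (posterior.cdf(0.55) - posterior.cdf(0.45)))        # essentially 0
```

After the update, the belief that the coin is anywhere near fair is essentially gone, which is exactly the sense in which the experimenter’s a priori probabilities have changed.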

Now, let us recap some basic notation of probability theory to understand what the paper means.

A recap

P(A|B) means the probability of event A happening given that event B has happened. For example, let A be the event ‘It is raining’ and B be the event ‘The author is carrying an umbrella’. Then P(A|B) is interpreted as ‘the probability that it is raining, given that the author is seen carrying an umbrella’, and P(B|A) is interpreted vice versa.
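The two directions are connected by Bayes’ rule: P(A|B) = P(B|A) * P(A) / P(B). Here is a small sketch with made-up numbers (the rain and umbrella probabilities below are purely illustrative assumptions, not data):

```python
# Hypothetical numbers, chosen only to illustrate Bayes' rule.
p_rain = 0.2                   # P(A): it is raining
p_umbrella_given_rain = 0.9    # P(B|A): umbrella is carried when it rains
p_umbrella_given_dry = 0.1     # P(B|not A): umbrella is carried when it is dry

# Total probability of seeing the umbrella: P(B)
p_umbrella = (p_umbrella_given_rain * p_rain
              + p_umbrella_given_dry * (1 - p_rain))

# Bayes' rule: P(A|B) = P(B|A) * P(A) / P(B)
p_rain_given_umbrella = p_umbrella_given_rain * p_rain / p_umbrella
print("P(rain | umbrella) = %.2f" % p_rain_given_umbrella)  # ~0.69
```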

So what the authors imply is that human intuition assesses P(R|s), i.e. the probability that a given string is random (the coin fair), rather than P(s|R), i.e. the probability that a given string s will occur given that the coin is fair (random). Please take a moment to let all this sink in. Also, try calculating the a priori probabilities of the strings “HHHHHTTTTT” and “HTTHTHHHTT”. Hint: their probabilities are the same!
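To check that hint, here is a quick sketch: under a fair coin, every specific sequence of 10 flips has probability (1/2)^10 = 1/1024, so P(s|R) is identical for both strings.

```python
def prob_given_fair_coin(s):
    """P(s | R): probability of one specific sequence under a fair coin."""
    return 0.5 ** len(s)

for s in ["HHHHHTTTTT", "HTTHTHHHTT"]:
    print(s, prob_given_fair_coin(s))   # both print 0.0009765625, i.e. 1/1024
```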

So, is there a better way to find the probability that a given string is random or not? Turns out, there is.

This new theory uses a set of ideas called ‘Algorithmic Information Theory’ (AIT), proposed by Ray Solomonoff in 1960, Andrey Kolmogorov in 1964 and Gregory Chaitin in 1965. Chaitin was an undergraduate and still a teenager back in 1965!

He has written many brilliant papers thereafter, which got me interested in this topic. So now, let us understand the concepts of this theory from an article written by Chaitin himself.

Algorithmic information theory

Let us look at what Chaitin has to say. This is taken from his article ‘Randomness and Mathematical Proof’, published in Scientific American in May 1975.

The new definition of randomness has its heritage in information theory, the science developed mainly since World War II that studies the transmission of messages. Suppose you have a friend who is visiting a planet in another galaxy and that sending him telegrams is very expensive. He forgot to take along his tables of trigonometric functions and he has asked you to supply them. You could simply translate the numbers into an appropriate code such as the binary numbers and transmit them directly, but even the most modest tables of the six functions have a few thousand digits, so that the cost would be high. A much cheaper way to convey the same information would be to transmit instructions for calculating the tables from the underlying trigonometric formulas, such as Euler’s equation e^(ix) = cos x + i sin x. Such a message could be relatively brief, yet inherent in it is all the information contained in even the largest tables.
Suppose on the other hand your friend is interested not in trigonometry but in baseball. He would like to know the scores of all the major league games played since he left the earth some thousands of years before. In this case it is most unlikely that a formula could be found for compressing the information into a short message. In such a series of numbers each digit is essentially an independent item of information, and it cannot be predicted from its neighbors or from some underlying rule. There is no alternative to transmitting the entire list of scores.
In this pair of whimsical messages is the germ of a new definition of randomness. It is based on the observation that the information embodied in a random series of numbers cannot be compressed or reduced to a more compact form. In formulating the actual definition it is preferable to consider communication not with a distant friend but with a digital computer. The friend might have the wit to make inferences about numbers or to construct a series from partial information or from vague instructions. The computer does not have that capacity, and for our purposes that deficiency is an advantage. Instructions given the computer must be complete and explicit, and they must enable it to proceed step by step without requiring that it comprehend the result of any part of the operations it performs. Such a program of instructions is an algorithm. It can demand any finite number of mechanical manipulations of numbers, but it cannot ask for judgments about their meaning.

Let us try to understand what Chaitin is saying. Suppose we have two strings: one consisting of the letter ‘H’ repeated 33 times, i.e. “HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH”, and another generated by flipping a fair coin (a random series), e.g. “THHTTHHTTTHTTHHHTTHHTHTHTHHHTTHHTTHH”. I want to write instructions (a program) telling a machine (computer) how to print these strings.

In the first case, I will write — Print ‘H’ 33 times.

In the second case, there is no repeating pattern, so we will have to write — print “THHTTHHTTTHTTHHHTTHHTHTHTHHHTTHHTTHH”, i.e. print the whole string. Now imagine if the strings had been millions or billions of ‘H’ and ‘T’ characters!

So what Chaitin is saying is that if we are able to write a short instruction for a certain string, then it is random to a lesser degree than a string which has no patterns and is purely random. Is this consistent with our intuitions?
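The true algorithmic (Kolmogorov) complexity of a string is uncomputable, but as a rough sketch of the idea we can use an ordinary compressor as a stand-in for the “length of the shortest description”: a patterned string compresses to almost nothing, while coin flips barely compress at all. (Using zlib this way is my own illustration, not a method from Chaitin’s article.)

```python
import random
import zlib

random.seed(42)  # only so the example is reproducible

patterned = "H" * 10_000                                      # "Print 'H' 10000 times"
random_flips = "".join(random.choice("HT") for _ in range(10_000))

for name, s in [("patterned", patterned), ("random flips", random_flips)]:
    compressed = len(zlib.compress(s.encode()))
    print(f"{name:13s} length={len(s)}  compressed={compressed} bytes")

# Typical output: the all-H string shrinks to a few dozen bytes,
# while the coin-flip string stays above a thousand bytes.
# This is a crude proxy for the "short program vs. whole string" idea above.
```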

Let us again look at the strings used by K&T, i.e. “HHHHHTTTTT” and “HTTHTHHHTT”. Classical probability theory says both are equally likely, but according to AIT the second string is more likely to be random, and correspondingly more likely to have been generated by a fair coin.

To understand things better, let us go back to the anecdote given by NNT in ‘The Black Swan’ and rethink who was right, Tony or John. I would say Tony, because he knows from experience that an event like ninety-nine successive heads is all but impossible from the tosses of a fair coin, so he employs a somewhat ‘out of the box’ technique and says that the coin cannot be fair. John, however, stays ‘inside the box’, goes with probability theory alone and concludes that any outcome is equally likely.
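We can even put a number on Tony’s reasoning with a small Bayesian sketch; the “loaded coin” model and the prior below are my own assumptions, not NNT’s. Even if we start out 99.9% convinced the coin is fair, ninety-nine heads in a row all but destroy that belief.

```python
# Two hypotheses about the coin:
#   "fair"   : P(heads) = 0.5
#   "loaded" : P(heads) = 0.95   (an assumed model of a rigged coin)
p_heads = {"fair": 0.5, "loaded": 0.95}

# Dr. John's (and NNT's) starting point: almost certainly a fair coin.
prior = {"fair": 0.999, "loaded": 0.001}

# Likelihood of observing 99 heads in a row under each hypothesis.
likelihood = {h: p_heads[h] ** 99 for h in p_heads}

# Bayes' rule: posterior is proportional to likelihood * prior.
unnormalized = {h: likelihood[h] * prior[h] for h in p_heads}
total = sum(unnormalized.values())
posterior = {h: unnormalized[h] / total for h in p_heads}

print(posterior)
# The posterior probability of "fair" comes out around 2.5e-25:
# Fat Tony's "the coin gotta be loaded" is also the Bayesian answer.
```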

What bothers me is that in an MBA entrance exam, a student needs to go with John and think ‘inside the box’. However, during the course, and even after it, the same student is required to think ‘out of the box’ like Tony.

Implication for our lives

Predictive modelling:

There is an old cliché, ‘cause precedes effect’, which still holds, but there might be no understandable cause behind the effects. A cause is basically a short way (a set of rules) of describing an effect. But as we saw above, in some cases the cause of an effect is the ‘effect’ itself! Think about it for a moment before proceeding.

A random string cannot be compressed into a smaller string, so the only way to describe the string is the string itself. There is no other way! The cause might also be the merger of a few (or several) random events.

But can we say to ourselves, or to others for that matter, that there is no cause for a certain event and that some random process was responsible? Nobody will buy that.

When we build predictive models, we almost never get 100% accuracy. If we get 90% accuracy using all our best efforts, we say that some part of the problem is random. What AIT shows us is that randomness comes in shades of grey, i.e. some things can be more random than others.

Indeed, if you go through the biggest competitions held on Kaggle for predictive model building, you will see that in some problems the highest accuracy reached is ~60% (or even less), while in one case, the first challenge ‘Titanic: Machine Learning from Disaster’, people have achieved 100% accuracy as per the public leaderboard, i.e. they have managed to knock randomness entirely out of the picture! But that is only one problem; all the other problems do not have 100% accuracy. There are also data sets where achieving accuracy just above 50% is considered a big achievement. Consider this paper by researchers at the Alibaba group: if you look at table 4 in that paper, you will see that for the data set SST-1 with 5 labels, the highest accuracy reported is only 51.67%!

If the world is random, why are there objects like stars, planets and living beings, which are filled with beautiful patterns?

Questions of this type are mostly answered by appealing to intelligent design, which may be true, but AIT gives us an alternative explanation.

Things which have repeating patterns, like conch shells or sunflowers, require fewer instructions, i.e. a small program, to be created. Think about the number pi: what is the probability that we roll a fair (purely random) ten-sided die, labelled 0–9, a hundred times and it produces the first 100 digits of pi? The chance of that happening is slim, i.e. (1/10)¹⁰⁰, a decimal point followed by 99 zeroes and then a 1.

The reader can also think about this experiment with coins instead of the die, by considering the value of pi in binary digits. Either way, the argument stays the same.

Now, if I wanted a computer program to produce pi, I would not need to store the digits themselves. The familiar fraction 22/7 is a very short description, but it only matches pi to two decimal places; a slightly longer program based on a classical series can generate 100, or a million, correct digits, as sketched below. Either way, the program is enormously shorter than its output and does far better than the random rolls above. If you need better approximations, some more methods are discussed here.
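As a sketch of such a short program (my own, not from the article), here is Machin’s formula, pi = 16*arctan(1/5) - 4*arctan(1/239), implemented with Python’s decimal module. It prints pi to 100 decimal places and is vastly shorter than the digits it could just as well keep producing:

```python
from decimal import Decimal, getcontext

def arctan_recip(x, prec):
    """arctan(1/x) via its Taylor series, to roughly `prec` digits."""
    getcontext().prec = prec + 10          # extra guard digits
    x = Decimal(x)
    power = Decimal(1) / x                 # 1/x^(2k+1) for k = 0
    total = power
    k = 0
    tiny = Decimal(10) ** -(prec + 5)
    while power > tiny:
        k += 1
        power /= x * x
        term = power / (2 * k + 1)
        total += -term if k % 2 else term  # alternating series
    return total

def pi_digits(prec=100):
    """Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)."""
    getcontext().prec = prec + 10
    pi = 16 * arctan_recip(5, prec) - 4 * arctan_recip(239, prec)
    getcontext().prec = prec + 1           # round to the requested precision
    return +pi                             # unary plus applies the new precision

print(pi_digits(100))
```

A couple of dozen lines generate as many digits of pi as we care to ask for, which is exactly the “short description of a long output” that AIT is talking about.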

So even the number pi, which shows no repeating pattern after being computed to billions of digits, is less random than, say, a string generated from random coin flips.

What I want to say is that regularity, or order, is easier to create if we think in terms of AIT. I am not claiming this is a watertight argument, but it is a new way of thinking about the world and how it works, grounded in empirical evidence.

Conclusion

The proper conclusion would be me telling you whether there is any relation between randomness at the quantum level and at the most macro level, i.e. in markets. But I do not have the answer, and as far as I know, neither does anyone else. We do, however, have tools to advance our knowledge in this direction.

Some good work is being done in this area, for example in this paper by Hector Zenil et al. and another paper by Paul M.B. Vitanyi.

Let us conclude by quoting Hector Zenil et al. from this article.

[Images (top and bottom) taken from the same article]

The new approach suggests and quantifies the way in which the mind can behave more algorithmically than statistically. An obvious example from memorization that displays the algorithmic power of the mind involves the task of learning a rather long sequence that has, however, a short description (or formula) that can generate the longer sequence. In this case (top), a sequence of consecutive digits in base 10 can easily be remembered because each digit is the sum of the previous digits plus 1, something that statistical approaches (middle) were unable to characterize, but that the new methods (bottom) could. This algorithmic nature of the mind is a reflection of the algorithmic nature of the world (Zenil et al. 2012) that the mind harnesses to navigate the world, taking advantage of both its statistical and algorithmic regularities. A computer program is like a mathematical formula. Our suggestions do not imply, however, that the mind is a number cruncher trying to find the mathematical formula behind the data, but that the mind does not only attempt to make statistical sense but also algorithmic sense of the world. That is, we attempt to learn patterns that either display statistical patterns or are algorithmically compressible, but this does not imply that the mind always finds the shortest, a shorter, or even an efficient algorithm. In fact, the problem is uncomputable even for Turing machines, and it is an open question whether it is for the human mind (Zenil and Hernandez-Quiroz 2007), but the mind is equipped to do so, unlike previous assumptions in practice that would overlook such capabilities beyond simple statistical pattern recognition.

Written by Supriyo Banerjee

Lead Data Scientist at Merck Life Science. Philosopher, introvert, avid reader, amateur photographer and musician.
