Today I'm sharing a topic contributed by a member from Melbourne, nicknamed "Paper Machine", who is known as the little prince of mathematics!
Special note: this article is not original; it is published with the contributor's consent.
01. Algorithm introduction
Expectation: In probability theory and statistics, the mathematical expectation (or mean) of a random variable is the sum, over every possible outcome of an experiment, of that outcome multiplied by its probability. It is one of the most basic numerical characteristics of a random variable and reflects its average value.
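In symbols (a standard formulation, added here for reference): for a discrete random variable $X$ that takes the value $x_i$ with probability $p_i$,

$$E[X] = \sum_i p_i\, x_i.$$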
Problem: three points are scattered uniformly at random in a 1×1 square. Each pair of points forms the opposite corners of an axis-aligned rectangle, giving three rectangles in total. Find the expectation of the area of the second-largest rectangle.

Algorithm: repeatedly sample three random points, compute the second-largest of the three rectangle areas, and finally average the results to estimate the expectation.
02. Monte Carlo
The Monte Carlo method, also known as the statistical simulation method or statistical test method, is a numerical simulation technique that takes probabilistic phenomena as its object of study. It estimates unknown characteristic quantities by means of sampling surveys. The method is named after Monte Carlo, Monaco's famous gambling town, because of its reliance on random sampling, and it is well suited to computing and simulating discrete systems. In computational simulation, by constructing a probability model analogous to the system's behavior and running random experiments on a computer, the stochastic characteristics of the system can be simulated.
The Monte Carlo method refers to a class of methods that use random variables to solve probabilistic problems. The most common applications are the computation of integrals, probabilities, expectations, and so on.
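As a minimal sketch (added for illustration, not from the original post), the following plain Monte Carlo program estimates the integral of x² over [0,1] (true value 1/3) as the sample mean of f(U) over uniform random samples U:

```cpp
#include <cstdio>
#include <random>

// Plain Monte Carlo: estimate the integral of x^2 over [0,1]
// (true value 1/3) as the mean of f(U) for uniform samples U.
int main() {
    std::mt19937_64 rng(12345);  // fixed seed: a deterministic pseudo-random generator
    std::uniform_real_distribution<double> u(0.0, 1.0);
    const long long N = 10000000;
    double sum = 0;
    for (long long i = 0; i < N; i++) {
        double x = u(rng);
        sum += x * x;
    }
    printf("estimate = %.6f (true value = %.6f)\n", sum / N, 1.0 / 3);
    return 0;
}
```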
The common Monte Carlo method relies on the "randomness" of random variables: events that have not yet occurred cannot be predicted from existing information, as with coin flips or dice rolls. In computers, however, random numbers are typically generated by deterministic algorithms, and are therefore called pseudo-random numbers. Because computational precision is limited, and because these numbers are "not random enough" in the statistical sense, predictable repeating sequences can occur, which limits the attainable convergence accuracy.
Unlike the common Monte Carlo method, the quasi-Monte Carlo method uses low-discrepancy sequences (such as the Halton sequence and the Sobol sequence) rather than ordinary (pseudo-)random numbers, and therefore converges faster: with $N$ samples, quasi-Monte Carlo can reach a convergence rate of $O((\log N)^d / N)$ (where $d$ is the dimension), while the ordinary Monte Carlo method only achieves $O(1/\sqrt{N})$. Another important property of quasi-Monte Carlo is that the low-discrepancy sequence it uses is reproducible: it does not change with the environment, and there is no random seed. By contrast, the pseudo-random numbers used in ordinary Monte Carlo produce different results under different random seeds, and the convergence behavior differs as well.
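To make the construction concrete, here is a minimal sketch (added for illustration, not from the original post) of the 1-D Halton sequence, also known as the van der Corput sequence: the base-b digits of the index are mirrored across the radix point.

```cpp
#include <cstdio>

// 1-D Halton (van der Corput) sequence in base b: reflect the
// base-b digits of the index i about the radix point.
double halton1d(long long i, int b) {
    double r = 0, f = 1.0 / b;
    while (i > 0) {
        r += (i % b) * f;  // next base-b digit, scaled into (0,1)
        f /= b;
        i /= b;
    }
    return r;
}

int main() {
    // Base 2 yields 1/2, 1/4, 3/4, 1/8, 5/8, ... : evenly spread, not random.
    for (long long i = 1; i <= 8; i++)
        printf("%lld: %.4f\n", i, halton1d(i, 2));
    return 0;
}
```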
03. Topic analysis
This problem is solved with the quasi-Monte Carlo method.
The C++ code is as follows:
```cpp
#include <cmath>
#include <cstdio>
#include <vector>
#include <cassert>
#include <algorithm>
#include <omp.h>

const int UP = 100;
bool sieve[UP + 100];
int primes[UP], top = 0;

// Sieve of Eratosthenes: collect the primes up to UP as Halton bases.
void init() {
    for (int i = 2; i <= UP; i++)
        if (!sieve[i]) {
            primes[top++] = i;
            for (int j = i; j <= UP / i; j++) sieve[i * j] = true;
        }
}

// The i-th point of the dim-dimensional Halton sequence,
// one prime base per dimension.
std::vector<double> halton(long long i, const int &dim) {
    assert(dim <= top);
    std::vector<double> prime_inv(dim, 0), r(dim, 0);
    std::vector<long long> t(dim, i);
    for (int j = 0; j < dim; j++) prime_inv[j] = 1.0 / primes[j];
    auto f = [](const std::vector<long long> &t) -> long long {
        long long ret = 0;
        for (const auto &e : t) ret += e;
        return ret;
    };
    while (f(t) > 0)
        for (int j = 0; j < dim; j++) {
            long long d = t[j] % primes[j];
            r[j] += d * prime_inv[j];
            prime_inv[j] /= primes[j];
            t[j] /= primes[j];
        }
    return r;
}

// One trial: 6 coordinates give 3 points; return the middle
// of the three rectangle areas (sum minus max minus min).
double experiment(long long idx) {
    std::vector<double> li = halton(idx, 6);
    double area1 = fabs((li.at(0) - li.at(2)) * (li.at(1) - li.at(3)));
    double area2 = fabs((li.at(0) - li.at(4)) * (li.at(1) - li.at(5)));
    double area3 = fabs((li.at(2) - li.at(4)) * (li.at(3) - li.at(5)));
    double w = area1 + area2 + area3
             - std::max(std::max(area1, area2), area3)
             - std::min(std::min(area1, area2), area3);
    return w;
}

const int BATCH = 100000;
const int THREADS = 40;

int main() {
    init();
    double total = 0;
    for (long long trial = 0;;) {
        std::vector<double> li(THREADS, 0);
        omp_set_dynamic(0);
        omp_set_num_threads(THREADS);
        #pragma omp parallel for
        for (long long thread = 0; thread < THREADS; thread++) {
            for (long long i = 0; i < BATCH; i++)
                li.at(thread) += experiment(trial + thread * BATCH + i);
        }
        for (const auto &d : li) total += d;
        trial += THREADS * BATCH;
        printf("%lld: %.10f\n", trial, total / trial);
        fflush(stdout);
    }
    return 0;
}
```
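For reference, the program can be built with any compiler that supports OpenMP, e.g. `g++ -O2 -fopenmp` (assuming GCC; flags may vary by toolchain). It runs indefinitely, printing the running trial count and running average after each batch.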
Analysis: the program uses parallel computation, running the random experiments in batches, which greatly improves speed. The halton function generates the Halton low-discrepancy sequence, with each coordinate in [0,1); the parameter i denotes the i-th sample, and dim the dimension of the generated point (each experiment here needs 3 points, i.e. 6 coordinates, so 6-dimensional points suffice). The second-largest area is obtained as the sum of the three areas minus the largest and the smallest. Since different samples do not affect one another, parallel computation can be used to speed up the process.
In the table below, # is the number of random trials in units of 10^7, Avg is the running average of the second-largest area, and Err is the absolute error from the true value in units of 10^(-10).
# | Avg | Err | # | Avg | Err |
---|---|---|---|---|---|
1 | 0.1017786804 | 55 | 2 | 0.1017786707 | 152 |
3 | 0.1017786905 | 46 | 4 | 0.1017786889 | 30 |
5 | 0.1017786809 | 50 | 6 | 0.1017786836 | 23 |
7 | 0.1017786849 | 10 | 8 | 0.1017786868 | 9 |
9 | 0.1017786799 | 60 | 10 | 0.1017786837 | 22 |
11 | 0.1017786845 | 14 | 12 | 0.1017786839 | 20 |
13 | 0.1017786874 | 15 | 14 | 0.1017786839 | 20 |
15 | 0.1017786848 | 11 | 16 | 0.1017786868 | 9 |
17 | 0.1017786851 | 8 | 18 | 0.1017786863 | 4 |
19 | 0.1017786854 | 5 | 20 | 0.1017786887 | 28 |
21 | 0.1017786858 | 1 | 22 | 0.1017786844 | 15 |
23 | 0.1017786841 | 18 | 24 | 0.1017786852 | 7 |
25 | 0.1017786849 | 10 | 26 | 0.101778684 | 19 |
27 | 0.1017786838 | 21 | 28 | 0.1017786852 | 7 |
29 | 0.1017786838 | 21 | 30 | 0.1017786846 | 13 |
31 | 0.1017786859 | 0 | 32 | 0.1017786862 | 3 |
33 | 0.1017786859 | 0 | 34 | 0.1017786853 | 6 |
35 | 0.1017786854 | 5 | 36 | 0.1017786859 | 0 |
37 | 0.101778685 | 9 | 38 | 0.1017786854 | 5 |
39 | 0.1017786853 | 6 | 40 | 0.1017786858 | 1 |
41 | 0.1017786848 | 11 | 42 | 0.1017786851 | 8 |
43 | 0.1017786847 | 12 | 44 | 0.1017786841 | 18 |
45 | 0.101778685 | 9 | 46 | 0.1017786842 | 17 |
47 | 0.1017786852 | 7 | 48 | 0.1017786848 | 11 |
49 | 0.1017786854 | 5 | 50 | 0.1017786851 | 8 |
51 | 0.1017786842 | 17 | 52 | 0.1017786844 | 15 |
As the table shows, after these experiments the result has converged to about 9 decimal places, which is very precise. Because its random numbers are "not random enough", ordinary Monte Carlo converges to only about five decimal places with the same number of experiments.
The above method can be extended to other randomized problems; it is very practical and efficient. Discussion is welcome!
So, have you mastered today's problem? Leave your thoughts in the comments section!
I’ve compiled all the problems I’ve written into an e-book, complete with illustrations for each problem. Click to download