factorial randomness

The article below addresses the question: is there factorial randomness in the sequence of natural numbers? The answer is: yes, for now, because the described results also indicate that a mathematical formula may be possible from which you can calculate the factors of a given number. With such a formula it would be possible to guess better than chance what the next number of the factorial sequence will be.

 


 

 

 SATOCONOR.COM

J.G. van der Galiën 'Factorial Randomness' 2.3. (2003)

Communication to the editor

SATOCONOR.COM Journal of RANDOMICS

 

 

Factorial Randomness

The Laws of Benford and Zipf with respect to the first digit distribution of the factor sequence from the natural numbers.

By: Johan Gerard van der Galien (M.Sc. in Chemistry)

(For Comments: johan.van.der.galien@satoconor.com)

 


 


 

Six first-digit distributions of the factors of the natural numbers up to $n = 10^d$ ($d \in \{2,3,4,5,6,7\}$) were calculated by means of a self-written PASCAL program. For each distribution a Chi-Square Goodness-Of-Fit test was conducted, with expected values calculated from the Benford formula $f_D = \log_{10}(1 + 1/D)$.

The Chi-Square increased rapidly, from 6.91 for $n = 10^2$ and 52.6 for $n = 10^3$ to 222435 for $n = 10^7$ (see Table 3), indicating that Benford's Law is followed only for $n = 10^2$. (It is a well-known fact, referenced in this article, that the Chi-Square statistic will always deviate if the sample size is large enough!)

These results are compared to the ones calculated from 20 Benford distributions found on the internet.

However, the relative frequencies of the Factorial First Digit Distribution ($f_D$) correlate well with $f_D = cD^{-a}$, i.e. Zipf's Law. Empirical and mathematical evidence is given that Benford's formula is an (approximate) special case of the more general formula of Zipf's Law. The Correlation Coefficient ($R$) from Linear Regression on the scatter plot $\log_{10}(f_D) = -a\log_{10}(D) + b$ goes asymptotically to $-1$ with sample size, from $R = -0.996$ for $n = 10^2$ to $R = -0.9995$ for $n = 10^7$: a nearly perfect fit. Since the Benford formula also fits Zipf's Law nearly perfectly ($R = -0.9992$), it is reasonable to say that the Factorial First Digit Distribution follows both Benford's and Zipf's Law. This is also confirmed by comparing the Linear Regression results of the Factorial First Digit Distribution with the regression formula $\log_{10}(1+1/D) \approx 0.3135080577\,D^{-0.8636655870}$.

The consequences of these results for our understanding of randomness are discussed.

 

It is a widespread opinion that the randomness in our universe is a consequence of Heisenberg's uncertainty principle from Quantum Mechanics [1]. It is also the last hurdle mankind must take to obtain a deterministic and holistic theory of everything, since there may be "hidden" variables in Quantum Mechanics [3-7,9,14].

The definition of a truly random sequence used in this paper is that no known deterministic process can do better than chance at guessing, from a given sequence, what the next element of this sequence will be [2]. (This then also holds for the sequence of factors of the natural numbers, although there are some kinds of patterns in it, since there is no mathematical formula yet [17], or in other words no efficient deterministic process, for calculating the [prime] factors of a natural number. If there were, present-day encryption based on prime factors, like RSA, would be obsolete. There are only trial division and sieve-based algorithms, all of which are an enormous burden on present-day computers.) This implies that true randomness is a temporary phenomenon. As we learn more about nature and science evolves, we might develop deterministic processes which are, for instance, able to predict better than chance what the next element of a sequence from a Quantum Mechanical source will be. It looks like randomness is the Achilles heel of traditional physics, and in my opinion it is at present the only way to tackle the deterministic properties of nature.

The way the PASCAL program FACBEN21LR generates the factorial sequence of the natural numbers is based on trial division. It takes about one hour to factorize all natural numbers up to $10^7$ on a COMPAQ Deskpro with a 220 MHz processor and 96 MB RAM. The output of FACBEN21LR is shown in Table 3. It incorporates a Chi-Square Goodness-Of-Fit analysis with the expected values of the cumulative frequencies of the digits according to the Benford formula [18]. A Linear Regression on the scatter plot of the $\log_{10}$ data is also performed for correlation with Zipf's formula [23].
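The counting step described above can be sketched in a few lines. This is a minimal Python sketch, not the FACBEN21LR source: it assumes, following the header of Table 3, that both members of every divisor pair (d, n/d) found by trial division up to the square root of n are counted, including 1 and n, so that the root of a perfect square is counted twice. The function names are illustrative.

```python
from math import isqrt

def first_digit(k):
    """Return the leading decimal digit of a positive integer."""
    while k >= 10:
        k //= 10
    return k

def factor_first_digit_counts(limit):
    """Count first digits of all factors of 1..limit found by trial division.

    Assumption (from the Table 3 header): both members of each pair
    (d, n // d) are counted, so the root of a perfect square counts twice.
    """
    counts = [0] * 10  # index 0 is unused
    for n in range(1, limit + 1):
        for d in range(1, isqrt(n) + 1):
            if n % d == 0:
                counts[first_digit(d)] += 1
                counts[first_digit(n // d)] += 1
    return counts[1:]  # counts for the digits 1..9

counts = factor_first_digit_counts(100)
print(sum(counts), counts)
```

For a limit of 100 this gives 492 factors in total, of which 171 start with the digit 1, in agreement with the first block of Table 3.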

The output of the Chi-Square Goodness-Of-Fit [8] analysis of FACBEN21LR can be interpreted as follows. According to this test there is a fit only for the smallest sample, $n = 10^2$. The Chi-Square (6.91) then falls between the 5% and 95% critical values for 8 degrees of freedom, 2.73 and 15.51 [10,15,22]. With increasing sample size the Chi-Square also increases, to 222435 for $n = 10^7$. But it is a well-known fact that Chi-Square will always deviate when the sample size is large enough [25], and the sample sizes of my experiment are very large indeed (see Table 3). So the deviation does not have to be significant. I compared these results with the Chi-Squares of other suspected Benford distributions which can be found on the internet [18]. These results are shown in Table 4. The Chi-Squares of these distributions range from 1.27 to 441. Two are below the 5% criterion and 10 are above the 95% criterion. According to reference 19, when a Chi-Square is below the 5% level or above the 95% level, the distribution is suspect (in other words, reject the H0-hypothesis; in the case of reference 19 the sequence is not random, and in my case Benford's Law does not apply). The average distribution is definitely within this range (3.50), and this makes these 12 distributions less suspect. To come back to the sample size: that of the average of the 20 distributions from reference 18 is only 20229; compare this to 162728526 for the Factorial First Digit Distribution for $n = 10^7$.
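The Chi-Square value for the smallest sample can be recomputed from the counts in Table 3. The following Python sketch assumes that the expected counts are rounded to whole numbers, which is what the "Expected cumulative amount" lines of Table 3 suggest; the function name is illustrative.

```python
from math import log10

# Observed first-digit counts for n = 10^2, copied from Table 3.
observed = [171, 88, 58, 46, 32, 27, 25, 23, 22]

def benford_chi_square(observed):
    """Chi-Square of observed digit counts against Benford's expected counts.

    Assumption: expected counts are rounded to whole numbers, as the
    'Expected cumulative amount' lines in Table 3 suggest.
    """
    total = sum(observed)
    chi = 0.0
    for digit, obs in enumerate(observed, start=1):
        expected = round(total * log10(1 + 1 / digit))
        chi += (obs - expected) ** 2 / expected
    return chi

chi = benford_chi_square(observed)
print(f"Chi-Square = {chi:.4f}")
```

With these counts the sum of the nine terms is about 6.92, the value reported in Table 3, and the largest single contribution comes from the digit 1.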

Now about the hypothesis that Benford's Law is in fact a special case of the more general Law of Zipf. There is evidence from the Linear Regression analysis of the results from Benford's formula (see Table 1).

 

| D (= digit) | $\log_{10}(1+1/D)$ | $\log_{10}(D)$ | $\log_{10}(\log_{10}(1+1/D))$ |
|---|---|---|---|
| 1 | 0.3010299957 | 0 | -0.5213902277 |
| 2 | 0.1760912591 | 0.3010299957 | -0.7542622013 |
| 3 | 0.1249387366 | 0.4771212547 | -0.9033028900 |
| 4 | 0.0969100130 | 0.6020599913 | -1.0136313480 |
| 5 | 0.0791812460 | 0.6989700043 | -1.1013776680 |
| 6 | 0.0669467896 | 0.7781512504 | -1.1742702440 |
| 7 | 0.0579919470 | 0.8450980400 | -1.2366323100 |
| 8 | 0.0511525224 | 0.9030899870 | -1.2911329450 |
| 9 | 0.0457574906 | 0.9542425094 | -1.3395378010 |

Linear Regression results for $\log_{10}(\log_{10}(1+1/D)) = -a\log_{10}(D) + b$:

Correlation Coefficient R = -0.9992296195

|Slope| = a = 0.8636655870

Intersection with Y-axis b = -0.5037512926

 

Table 1: Evidence that the Benford first digit distribution is a special case of Zipf's law.

 

From the good fit ($R = -0.9992$) in Table 1 it can be deduced that the Benford first digit distribution is, to a good approximation, a special case of Zipf's law. (A perfect fit has $|R| = 1$ [20].) The approximate formula of Zipf's law for the Benford distribution follows from the result of Table 1, since:

 

$$\log_{10}(\log_{10}(1+1/D)) = -a\log_{10}(D) + b \;\Rightarrow\; \log_{10}(1+1/D) \approx 10^{b}D^{-a} = cD^{-a} \tag{1}$$

$$f_D = \log_{10}(1+1/D) \approx 0.3135080577\,D^{-0.8636655870} \tag{2}$$
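The regression behind Table 1 and equation (2) can be reproduced with ordinary least squares. This is a minimal Python sketch under the assumption that the program uses a standard least-squares fit on the nine points $(\log_{10}D, \log_{10}\log_{10}(1+1/D))$; the helper name `linear_regression` is illustrative.

```python
from math import log10, sqrt

def linear_regression(xs, ys):
    """Ordinary least squares; returns (slope, intercept, correlation R)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / sqrt(sxx * syy)

digits = range(1, 10)
xs = [log10(d) for d in digits]                 # log10(D)
ys = [log10(log10(1 + 1 / d)) for d in digits]  # log10 of the Benford value

slope, intercept, r = linear_regression(xs, ys)
c = 10 ** intercept  # prefactor of the Zipf form c * D**(-a), with a = -slope
print(slope, intercept, r, c)
```

The fitted slope is negative, so the exponent in the Zipf form is $a = -\text{slope}$, and $c = 10^b$ is obtained from the intercept.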

Below is the mathematical proof that Benford's Law is an (approximate) special case of Zipf's Law.

 

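Theorem:

$$\log_{10}(1+1/D) \approx cD^{-a}, \qquad D \in \{1,2,3,4,5,6,7,8,9\} \ \text{or} \ 1 \le D \le 9 \tag{3}$$

$$x = 1 + 1/D \tag{4}$$

$$\log_{10}(x) = \ln(x)/\ln(10) \tag{5}$$

Taylor series for $x \ge 1/2$ [21]:

$$\ln(x) = \frac{x-1}{x} + \frac{1}{2}\left(\frac{x-1}{x}\right)^{2} + \frac{1}{3}\left(\frac{x-1}{x}\right)^{3} + \cdots \tag{6}$$

Substituting (4), with $(x-1)/x = (1/D)/(1+1/D) = 1/(D+1)$:

$$\ln(1+1/D) = \frac{1}{D+1} + \frac{1}{2}\left(\frac{1}{D+1}\right)^{2} + \frac{1}{3}\left(\frac{1}{D+1}\right)^{3} + \cdots \tag{7}$$

$$\frac{1}{D+1} > \frac{1}{2}\left(\frac{1}{D+1}\right)^{2} > \frac{1}{3}\left(\frac{1}{D+1}\right)^{3} > \cdots \tag{8}$$

Now some gross approximations:

$$\ln(1+1/D) \approx \frac{1}{D+1} \approx \frac{1}{D} \tag{9}$$

$$\log_{10}(1+1/D) = \frac{\ln(1+1/D)}{\ln(10)} \approx \frac{1}{D\ln(10)} \;\Rightarrow\; f_D \approx \frac{1}{\ln(10)}D^{-1} = 0.4343\,D^{-1} \tag{10}$$

So $a = 1$ and $c \approx 0.4343$, which is of the same magnitude as the values found by Linear Regression (2).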
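The series (7) and the gross approximation (10) can be checked numerically. The sketch below is illustrative only (the function name `ln_series` and the choice of 60 terms are assumptions): it sums the series for each digit and prints the Benford value next to the $0.4343\,D^{-1}$ estimate.

```python
from math import log

def ln_series(D, terms=60):
    """Partial sum of series (7): sum over k of (1/k) * (1/(D+1))**k."""
    y = 1 / (D + 1)
    return sum(y ** k / k for k in range(1, terms + 1))

rows = []
for D in range(1, 10):
    exact = log(1 + 1 / D)            # ln(1 + 1/D)
    benford = exact / log(10)         # log10(1 + 1/D)
    estimate = 1 / (D * log(10))      # approximation (10)
    rows.append((D, ln_series(D), exact, benford, estimate))
    print(D, round(benford, 4), round(estimate, 4))
```

The first term of the series, $1/(D+1)$, lies below $\ln(1+1/D)$ and $1/D$ lies above it, so (9) brackets the true value. The estimate is coarse for D = 1 (about 0.434 against 0.301) and improves as D grows.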

 

I also performed Linear Regression on the Factorial First Digit Distribution. The absolute value of the Correlation Coefficient, |R|, goes asymptotically to 1 as the number of natural numbers tested increases. This is shown in Fig. 1.

 

 

Fig. 1: The development of the Correlation Coefficient (R = |Correlation Coefficient|) of the Factorial First Digit Distribution with Zipf's Law with an increasing number of natural numbers (n) tested.

 

The constants derived from the Linear Regression analysis of the data for $n = 10^7$, equation (11), are also in good agreement with those found for the Linear Regression analysis of Benford's formula (2):

$$f_D \approx 0.3237122232\,D^{-0.8996241844} \tag{11}$$
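How closely the fitted Zipf form follows the observed frequencies can be checked directly. The Python sketch below copies the slope, the intercept and the observed relative frequencies for $n = 10^7$ from Table 3 (rounded to nine decimals) and evaluates $cD^{-a}$ for each digit; the variable names are illustrative.

```python
# Regression constants for n = 10^7, copied from Table 3.
slope = -0.899624184409583
intercept = -0.489840901540702

# Observed relative frequencies of the first digits 1..9 (n = 10^7, Table 3).
observed = [0.315517102, 0.177055644, 0.123173174,
            0.096052416, 0.074210246, 0.063493693,
            0.055662712, 0.049757306, 0.045077708]

a = -slope           # Zipf exponent
c = 10 ** intercept  # Zipf prefactor, c = 10**b

model = [c * d ** (-a) for d in range(1, 10)]
max_dev = max(abs(m - o) for m, o in zip(model, observed))

for d, (m, o) in enumerate(zip(model, observed), start=1):
    print(d, round(m, 4), round(o, 4))
print("c =", c, "largest absolute deviation =", max_dev)
```

The largest absolute deviation between the fitted curve and the observed frequencies occurs at the digit 1 and stays below 0.01.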

 

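| D | $\log_{10}(1+1/D)$ | $0.3135080577\,D^{-0.8636655870}$ | Factorial First Digit Distribution |
|---|---|---|---|
| 1 | 0.30 | 0.31 | 0.32 |
| 2 | 0.18 | 0.17 | 0.18 |
| 3 | 0.12 | 0.12 | 0.12 |
| 4 | 0.097 | 0.095 | 0.096 |
| 5 | 0.079 | 0.078 | 0.074 |
| 6 | 0.067 | 0.067 | 0.063 |
| 7 | 0.058 | 0.058 | 0.056 |
| 8 | 0.051 | 0.052 | 0.050 |
| 9 | 0.046 | 0.047 | 0.045 |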

 

Table 2: Comparison of the outcome of the Benford formula, the Benford/Zipf regression formula and the Factorial First Digit Distribution for $n = 10^7$. Rounded to two significant digits.

 

The results from the Linear Regression of the average of the 20 distributions from reference 18 also confirm Zipf's Law. Compare these results, $a = 0.89874592240$, $b = -0.48940098622$ and $|R| = 0.99661451810$, with the results for the Factorial First Digit Distribution for $n = 10^7$ in Table 3 and for Benford's formula in Table 1. You will find that they come very close. This indicates that there is also empirical evidence for the hypothesis that all Benford distributions follow Zipf's Law.

From these results the following conclusions can be deduced:

 

  • Benford's Law is an (approximate) special case of Zipf's Law. This is shown empirically and mathematically.
  • All Benford distributions also follow Zipf's Law.
  • If a First Digit Distribution correlates well with Zipf's Law, then Benford's Law also applies.
  • From the above conclusions it follows that the Factorial First Digit Distribution follows Zipf's AND Benford's Law.
  • It is a well-known (empirical?) fact that Chi-Square will always deviate if the sample size is large enough. I wonder whether this has been mathematically proven; can someone tell me, and give me a reference? At least my data supports this 'hypothesis'. So it is highly likely that the Factorial First Digit Distribution follows Newcomb-Benford's Law.
  • The factorial sequence is still a random sequence, since there is no mathematical formula yet to calculate the factors of a given integer.
  • Benford's Law is an approximate law. Zipf's Law is more exact.
  • Benford's Law applies when "unbiased" distributions are selected at random and random samples are taken from each of these distributions [24]. The way FACBEN21LR gathers data seems to fulfil this criterion; in other words, the factor sequence of the natural numbers is random.
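One elementary observation bears on the sample-size remark in the list above. As a sketch, with symbols introduced here only for illustration: write $N$ for the sample size, $p_D$ for the observed relative frequency of digit $D$ and $q_D = \log_{10}(1+1/D)$ for the Benford value, so that the observed and expected counts are $Np_D$ and $Nq_D$ (ignoring the rounding of expected counts).

```latex
\chi^2 \;=\; \sum_{D=1}^{9} \frac{(N p_D - N q_D)^2}{N q_D}
       \;=\; N \sum_{D=1}^{9} \frac{(p_D - q_D)^2}{q_D}
```

If the relative frequencies $p_D$ stay roughly fixed at values that differ even slightly from $q_D$, the sum on the right is a positive constant and the Chi-Square grows in proportion to $N$, while the critical values for 8 degrees of freedom do not change. This describes the behaviour of the statistic only; it does not by itself decide whether a given deviation is of practical importance.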

 

It looks like there are at least two kinds of randomness. The first is pure randomness from Quantum Mechanical sources, which according to a Chi-Square analysis fully concurs with an equidistribution of the digits of the number system used (for instance by transforming random bits to byte integers [octal number system], or word integers [hexadecimal number system], etc.). The second is randomness from an interaction between chaos and order, for which the first digit distribution concurs with Newcomb-Benford's and Zipf's Law, but which at large sample sizes does not pass the Chi-Square test for accepting the H0-hypothesis that the distribution follows the $\log_{10}(1+1/D)$ formula. In other words, this kind of randomness has a deterministic component and, for lack of a better word, a still "undeterministic" component. It should be possible to develop experiments with which one can isolate the deterministic and "undeterministic" parts of this kind of randomness. (The factor sequence of the natural numbers is in my opinion the most accessible, and a limitless, resource of Benford/Zipf randomness for such experiments.) Finding the mathematical formula for the factors of natural numbers will also bring us closer to understanding pure randomness.

 

Epilogue:

As a matter of fact, dividing the Factorial Distribution into a deterministic and, for lack of a better word, "undeterministic" part has recently been done by myself. One can extract the analytically pure chaos (= the "undeterministic" part), meaning that it passes critical randomness test suites like DIEHARD and NIST. These extraction methods, secret for the time being, are applied in the FRAG Pseudo Random Number Generator, which is described in this journal and available on this site [26]. An example of the deterministic part that leads from order to a Newcomb-Benford-Zipf distribution is the effect of the power-of-two calculation from bits in the formula needed to generate computer reals with different numbers of mantissa bits. This research is also described in this journal [27].

 


 

Notes & References:

1a) Walker J. 'Index Librorum Liberorum’

http://www.fourmilab.ch

1b) Anonymous 'Generating random numbers' http://www.randomnumbers.info/content/Generating.htm

1c) Anonymous 'A fast and compact quantum random generator'

http://www.quantum.at/research/photonentangle/rng/

1d) Davies R. 'Hardware random number generators'

http://www.robertnz.net/hwrng.htm

2) Inspired by: Paul P. Budnik 'What is and what will be'

http://www.mtnmath.com/book.html

3) Anonymous 'The system at work' webpage.

4) Not used.

5) Van der Galien J.G. 'Are the prime numbers randomly distributed?' SATOCONOR.COM 1.2. http://www.satoconor.com/

6) Eric Weisstein's World of Physics 'Einstein Rosen Podolsky paradox'

http://scienceworld.wolfram.com/physics/Einstein-Podolsky-RosenParadox.html

7) Anonymous, 'Outline: Bohm, Bell - and Boom! The end of modern dualism'

http://www.drury.edu/ess/philsci/bell.html

8) Knuth, D.E 'The art of computer programming, Volume 2 / Seminumerical algorithms' Reading MA: Addison-Wesley (1969)

9) Anonymous 'A prime case of chaos'

http://www.ams.org/featurecolumn/archive/prime-chaos.html

10) Kreyszig E. 'Advanced engineering mathematics: Table A12 chi-square distribution' 4th Edition, John Wiley & Sons (1979)

11) Not used.

12) Not used.

13) Not used.

14) Anonymous 'The hidden-variable theory of David Bohm'

http://www.meta-library.net/ghc-obs/hidvar-frame.html

15) Walker J. 'Chi-square calculator'

http://www.fourmilab.ch/rpkp/experiments/analysis/chiCalc.html

16a) Lowry R. 'Chapter 8: Chi-square procedures for the analysis of categorical frequency data. Part 1'

http://faculty.vassar.edu/lowry/PDF/c8p1.pdf

16b) Lowry R. 'Chapter 8: Chi-square procedures for the analysis of categorical frequency data. Part 2'

http://faculty.vassar.edu/lowry/PDF/c8p2.pdf

16c) Lowry R. 'Chapter 8: Chi-square procedures for the analysis of categorical frequency data. Part 3' http://faculty.vassar.edu/lowry/PDF/c8p3.pdf

17) Raiter B. 'Prime number hide-and-seek: How the RSA cipher works'

http://www.muppetlabs.com/~breadbox/txt/rsa.html

18) Eric Weisstein's Mathworld 'Benford's Law'

http://mathworld.wolfram.com/BenfordsLaw.html

19) Walker J. 'Ent: A pseudorandom number sequence test program'

http://www.fourmilab.ch/random/

20) Hays W.L. 'Statistics' 5th Edition, Harcourt Brace College Publishers (1994)

21) Efunda Engineering Fundamentals 'Taylor series expansions of logarithmic functions'

http://www.efunda.com/math/taylor_series/logarithmic.cfm

22) Kreyszig E. 'Advanced engineering mathematics' 4th Edition John Wiley & Sons (1979)

23) Wikipedia 'Zipf's law' http://en.wikipedia.org/wiki/Zipfs_law

24) Hill T.P. 'A statistical derivation of the significant-digit law' Stat. Sci. 10 354-363 (1995)

25) P.D. Scott, M. Fasli 'Benford's law: An empirical investigation and a novel explanation' CSM Technical Report 349

http://cswww.essex.ac.uk/technical-reports/2001/CSM-349.pdf

26) Van der Galien J.G. 'A factorial randomness generator (FRAG PRNG)' SATOCONOR.COM 3.1.

http://www.satoconor.com

27) Van der Galien J.G. 'Sample space for reals follow Newcomb-Benford and Zipf' SATOCONOR.COM 4.1.

http://www.satoconor.com

 

Appendix

 

Table 3: The output of PASCAL program FACBEN21LR ($a$ = |slope|, intersection = $b$ and $c = 10^b$)

 

First digit distribution of all factors of natural numbers

including 1 and n, squares double. With Linear Regression and Chi-Square analysis.

 

Cumulative amount of factors under 100=492

e(n) is the observed cumulative amount of factors per digit

part1= 3.47560975609756E-0001 e1=171

part2= 1.78861788617886E-0001 e2=88

part3= 1.17886178861788E-0001 e3=58

part4= 9.34959349593496E-0002 e4=46

part5= 6.50406504065041E-0002 e5=32

part6= 5.48780487804878E-0002 e6=27

part7= 5.08130081300813E-0002 e7=25

part8= 4.67479674796748E-0002 e8=23

part9= 4.47154471544715E-0002 e9=22

Total= 1.00000000000000E+0000

Expected cumulative amount of factors digit 1=148

Expected cumulative amount of factors digit 2=87

Expected cumulative amount of factors digit 3=61

Expected cumulative amount of factors digit 4=48

Expected cumulative amount of factors digit 5=39

Expected cumulative amount of factors digit 6=33

Expected cumulative amount of factors digit 7=29

Expected cumulative amount of factors digit 8=25

Expected cumulative amount of factors digit 9=23

CHI h1= 3.57432432432432E+0000

CHI h2= 1.14942528735632E-0002

CHI h3= 1.47540983606557E-0001

CHI h4= 8.33333333333333E-0002

CHI h5= 1.25641025641026E+0000

CHI h6= 1.09090909090909E+0000

CHI h7= 5.51724137931034E-0001

CHI h8= 1.60000000000000E-0001

CHI h9= 4.34782608695652E-0002

CHI-square= 6.91921464025773E+0000

slope =-9.72906230000316E-0001

intersection =-4.64032106346377E-0001

Correlation coefficient =-9.95978305301610E-0001

 

Cumulative amount of factors under 1000=7100

e(n) is the observed cumulative amount of factors per digit

part1= 3.34366197183099E-0001 e1=2374

part2= 1.79295774647887E-0001 e2=1273

part3= 1.20845070422535E-0001 e3=858

part4= 9.46478873239437E-0002 e4=672

part5= 6.77464788732394E-0002 e5=481

part6= 5.87323943661972E-0002 e6=417

part7= 5.23943661971831E-0002 e7=372

part8= 4.78873239436620E-0002 e8=340

part9= 4.40845070422535E-0002 e9=313

Total= 1.00000000000000E+0000

Expected cumulative amount of factors digit 1=2137

Expected cumulative amount of factors digit 2=1250

Expected cumulative amount of factors digit 3=887

Expected cumulative amount of factors digit 4=688

Expected cumulative amount of factors digit 5=562

Expected cumulative amount of factors digit 6=475

Expected cumulative amount of factors digit 7=412

Expected cumulative amount of factors digit 8=363

Expected cumulative amount of factors digit 9=325

CHI h1= 2.62840430510061E+0001

CHI h2= 4.23200000000000E-0001

CHI h3= 9.48139797068771E-0001

CHI h4= 3.72093023255814E-0001

CHI h5= 1.16743772241992E+0001

CHI h6= 7.08210526315789E+0000

CHI h7= 3.88349514563107E+0000

CHI h8= 1.45730027548209E+0000

CHI h9= 4.43076923076923E-0001

CHI-square= 5.25678307028779E+0001

slope =-9.49139954699959E-0001

intersection =-4.71479877812393E-0001

Correlation coefficient =-9.98054367296489E-0001

 

Cumulative amount of factors under 10000=93768

e(n) is the observed cumulative amount of factors per digit

part1= 3.25974746182066E-0001 e1=30566

part2= 1.77704547393567E-0001 e2=16663

part3= 1.21853937377356E-0001 e3=11426

part4= 9.54376759662145E-0002 e4=8949

part5= 7.06104427949834E-0002 e5=6621

part6= 6.09802917839775E-0002 e6=5718

part7= 5.39843016807440E-0002 e7=5062

part8= 4.87906321986179E-0002 e8=4575

part9= 4.46634246224725E-0002 e9=4188

Total= 1.00000000000000E+0000

Expected cumulative amount of factors digit 1=28227

Expected cumulative amount of factors digit 2=16512

Expected cumulative amount of factors digit 3=11715

Expected cumulative amount of factors digit 4=9087

Expected cumulative amount of factors digit 5=7425

Expected cumulative amount of factors digit 6=6277

Expected cumulative amount of factors digit 7=5438

Expected cumulative amount of factors digit 8=4796

Expected cumulative amount of factors digit 9=4291

CHI h1= 1.93818719665568E+0002

CHI h2= 1.38087451550388E+0000

CHI h3= 7.12940674349125E+0000

CHI h4= 2.09574116870254E+0000

CHI h5= 8.70593939393939E+0001

CHI h6= 4.97819021825713E+0001

CHI h7= 2.59977933063626E+0001

CHI h8= 1.01836947456213E+0001

CHI h9= 2.47238405965975E+0000

CHI-square= 3.79919910326875E+0002

slope =-9.25143717743863E-0001

intersection =-4.80373761375731E-0001

Correlation coefficient =-9.98895639943961E-0001

 

Cumulative amount of factors under 100000=1167066

e(n) is the observed cumulative amount of factors per digit

part1= 3.21242329054227E-0001 e1=374911

part2= 1.77480108237238E-0001 e2=207131

part3= 1.22468652158489E-0001 e3=142929

part4= 9.56972442004137E-0002 e4=111685

part5= 7.22409872278003E-0002 e5=84310

part6= 6.21241643574571E-0002 e6=72503

part7= 5.47372642164196E-0002 e7=63882

part8= 4.92037296948073E-0002 e8=57424

part9= 4.48055208531480E-0002 e9=52291

Total= 1.00000000000000E+0000

Expected cumulative amount of factors digit 1=351322

Expected cumulative amount of factors digit 2=205510

Expected cumulative amount of factors digit 3=145812

Expected cumulative amount of factors digit 4=113100

Expected cumulative amount of factors digit 5=92410

Expected cumulative amount of factors digit 6=78131

Expected cumulative amount of factors digit 7=67680

Expected cumulative amount of factors digit 8=59698

Expected cumulative amount of factors digit 9=53402

CHI h1= 1.58384877975191E+0003

CHI h2= 1.27859520217994E+0001

CHI h3= 5.70027775491729E+0001

CHI h4= 1.77031388152078E+0001

CHI h5= 7.09988096526350E+0002

CHI h6= 4.05400980404705E+0002

CHI h7= 2.13132446808511E+0002

CHI h8= 8.66205903045328E+0001

CHI h9= 2.31137597842777E+0001

CHI-square= 3.10959652196646E+0003

slope =-9.13937169064273E-0001

intersection =-4.84462468715366E-0001

Correlation coefficient =-9.99231411946420E-0001

 

Cumulative amount of factors under 1000000=13971034

e(n) is the observed cumulative amount of factors per digit

part1= 3.17889141204581E-0001 e1=4441240

part2= 1.77206139502631E-0001 e2=2475753

part3= 1.22882100208187E-0001 e3=1716790

part4= 9.59136596475250E-0002 e4=1340013

part5= 7.33948539528284E-0002 e5=1025402

part6= 6.29282700192412E-0002 e6=879173

part7= 5.52830234326250E-0002 e7=772361

part8= 4.95320532467389E-0002 e8=692014

part9= 4.49707587856418E-0002 e9=628288

Total= 1.00000000000000E+0000

Expected cumulative amount of factors digit 1=4205700

Expected cumulative amount of factors digit 2=2460177

Expected cumulative amount of factors digit 3=1745523

Expected cumulative amount of factors digit 4=1353933

Expected cumulative amount of factors digit 5=1106244

Expected cumulative amount of factors digit 6=935316

Expected cumulative amount of factors digit 7=810207

Expected cumulative amount of factors digit 8=714654

Expected cumulative amount of factors digit 9=639279

CHI h1= 1.31914049028699E+0004

CHI h2= 9.86155776596562E+0001

CHI h3= 4.72973022412194E+0002

CHI h4= 1.43113728670473E+0002

CHI h5= 5.90776443894837E+0003

CHI h6= 3.37002301788914E+0003

CHI h7= 1.76784416328173E+0003

CHI h8= 7.17227637430141E+0002

CHI h9= 1.88966133722522E+0002

CHI-square= 2.58579326228841E+0004

slope =-9.05478543219127E-0001

intersection =-4.87634509667732E-0001

Correlation coefficient =-9.99384864491395E-0001

 

Cumulative amount of factors under 10000000=162728526

e(n) is the observed cumulative amount of factors per digit

part1= 3.15517102391747E-0001 e1=51343633

part2= 1.77055644196027E-0001 e2=28812004

part3= 1.23173173706495E-0001 e3=20043789

part4= 9.60524155426812E-0002 e4=15630468

part5= 7.42102463338235E-0002 e5=12076124

part6= 6.34936925563991E-0002 e6=10332235

part7= 5.56627115272955E-0002 e7=9057911

part8= 4.97573056121703E-0002 e8=8096933

part9= 4.50777081333607E-0002 e9=7335429

Total= 1.00000000000000E+0000

Expected cumulative amount of factors digit 1=48986167

Expected cumulative amount of factors digit 2=28655071

Expected cumulative amount of factors digit 3=20331096

Expected cumulative amount of factors digit 4=15770024

Expected cumulative amount of factors digit 5=12885047

Expected cumulative amount of factors digit 6=10894152

Expected cumulative amount of factors digit 7=9436944

Expected cumulative amount of factors digit 8=8323975

Expected cumulative amount of factors digit 9=7446049

CHI h1= 1.13453374320060E+0005

CHI h2= 8.59462762768935E+0002

CHI h3= 4.06005225930761E+0003

CHI h4= 1.23499350007330E+0003

CHI h5= 5.07841702035701E+0004

CHI h6= 2.89835055439836E+0004

CHI h7= 1.52237859087645E+0004

CHI h8= 6.19272279938371E+0003

CHI h9= 1.64339294570852E+0003

CHI-square= 2.22435460243621E+0005

slope =-8.99624184409583E-0001

intersection =-4.89840901540702E-0001

Correlation coefficient =-9.99457780214643E-0001

 

Table 4: The Linear Regression and Chi-Square Tests performed on some suspected Benford distributions found on the internet [18]. ($a$ = |slope|, intersection = $b$ and $c = 10^b$)

 

Suspected Benford Distribution = Benford formula (Program Test)

F1 = 3.0102999570E-01

F2 = 1.7609125910E-01

F3 = 1.2493873660E-01

F4 = 9.6910013000E-02

F5 = 7.9181246000E-02

F6 = 6.6946789600E-02

F7 = 5.7991947000E-02

F8 = 5.1152522400E-02

F9 = 4.5757490600E-02

slope =-8.6366558707E-01

intersection =-5.0375129256E-01

Correlation coefficient =-9.9922961953E-01

 

Suspected Benford Distribution = Rivers, Area

slope =-8.5007232486E-01

intersection =-5.1356871625E-01

Correlation coefficient =-9.7179279069E-01

Chi-Square = 4.9617226594E+00

 

Suspected Benford Distribution = Population

slope =-1.1823811930E+00

intersection =-3.7250056401E-01

Correlation coefficient =-9.7607292076E-01

Chi-Square = 1.1862939846E+02

 

Suspected Benford Distribution = Constants

slope =-1.0157812073E+00

intersection =-5.2066747111E-01

Correlation coefficient =-6.9676575170E-01

Chi-Square = 2.4440656854E+01

 

Suspected Benford Distribution = Specific Heat

slope =-9.5386971754E-01

intersection =-4.7117207520E-01

Correlation coefficient =-8.8848255381E-01

Chi-Square = 1.1121293917E+02

 

Suspected Benford Distribution = Pressure

slope =-8.8961353901E-01

intersection =-4.9167746996E-01

Correlation coefficient =-9.9371644466E-01

Chi-Square = 1.2703741165E+00

 

Suspected Benford Distribution = H.P. Lost

slope =-9.2398277231E-01

intersection =-4.7632655922E-01

Correlation coefficient =-9.8644255296E-01

Chi-Square = 3.4605628180E+00

 

Suspected Benford Distribution = Molecular Weigt

slope =-1.1411996629E+00

intersection =-3.8973413084E-01

Correlation coefficient =-9.5494997221E-01

Chi-Square = 1.2575708356E+02

 

Suspected Benford Distribution = Drainage

slope =-1.2128593722E+00

intersection =-3.5741388937E-01

Correlation coefficient =-9.2944807224E-01

Chi-Square = 1.1142218800E+01

 

Suspected Benford Distribution = Atomic Weight

slope =-1.0642120481E+00

intersection =-4.8758981189E-01

Correlation coefficient =-8.8867625323E-01

Chi-Square = 1.7245691826E+01

 

Suspected Benford Distribution = n^-1 and sqrt(n)

slope =-5.9540500961E-01

intersection =-6.4345712127E-01

Correlation coefficient =-8.5045516020E-01

Chi-Square = 4.4076412324E+02

 

Suspected Benford Distribution = Design

slope =-6.6096834428E-01

intersection =-5.9960139086E-01

Correlation coefficient =-9.6027583379E-01

Chi-Square = 1.9212693139E+01

 

Suspected Benford Distribution = Reader's Digest

slope =-9.4238206892E-01

intersection =-4.7591090015E-01

Correlation coefficient =-9.9364779402E-01

Chi-Square = 3.2271432117E+00

 

Suspected Benford Distribution = Cost Data

slope =-9.8005651446E-01

intersection =-4.5804015636E-01

Correlation coefficient =-9.6766307822E-01

Chi-Square = 1.5601254902E+01

 

Suspected Benford Distribution = X-Ray Volta

slope =-8.2429874377E-01

intersection =-5.2065454636E-01

Correlation coefficient =-9.8620626973E-01

Chi-Square = 5.4256271820E+00

 

Suspected Benford Distribution = American League

slope =-9.8581811852E-01

intersection =-4.5294564119E-01

Correlation coefficient =-9.8110932937E-01

Chi-Square = 1.4595355311E+01

 

Suspected Benford Distribution = Black Body

slope =-8.7963246096E-01

intersection =-5.0068542909E-01

Correlation coefficient =-9.8472190189E-01

Chi-Square = 9.5229019643E+00

 

Suspected Benford Distribution = Addresses

slope =-8.5696179253E-01

intersection =-5.0719069791E-01

Correlation coefficient =-9.9332558555E-01

Chi-Square = 1.2966139294E+00

 

Suspected Benford Distribution = n^1,n^2....n!

slope =-6.4992448601E-01

intersection =-6.0023561887E-01

Correlation coefficient =-9.9052764917E-01

Chi-Square = 2.4993708303E+01

 

Suspected Benford Distribution = Death Rate

slope =-8.6699419233E-01

intersection =-5.0216308616E-01

Correlation coefficient =-9.7115965914E-01

Chi-Square = 7.5549793791E+00

 

Suspected Benford Distribution = Average over 20 distributions

Samplesize = 20229

slope =-8.9874592240E-01

intersection =-4.8940098622E-01

Correlation coefficient =-9.9661451810E-01

Chi-Square = 3.5050637915E+01

 

Suspected Benford Distribution = Factorial First Digit Distribution

Samplesize = 162728526

F1 = 3.1551710239E-01

F2 = 1.7705564420E-01

F3 = 1.2317317371E-01

F4 = 9.6052415543E-02

F5 = 7.4210246334E-02

F6 = 6.3493692557E-02

F7 = 5.5662711527E-02

F8 = 4.9757305612E-02

F9 = 4.5077708133E-02

slope =-8.9962418441E-01

intersection =-4.8984090154E-01

Correlation coefficient =-9.9945778023E-01

Chi-Square = 2.2243549463E+05