Orthography Statistics (Part 1: The Count Tables)

Many comments about the results from the Orthographic Password Creator indicate that the passwords are a bit thick and not much better than random characters. Given that, I am rethinking how the process should operate.

My original thought was to include every possible orthographic combination. On examining the list, though, many of the combinations seem unlikely or unfamiliar, so I decided to run an experiment. The Random Word Password Creator uses a mostly unabridged list of English words: 109,462, to be exact. I loaded the entire dictionary into a text editor (the RAD Studio IDE) and used its search operation to count how many times each orthographic element occurs in the dictionary.

I made a table with three columns. The first column is the letters of the orthographic element. The second is the number of letters in that element, which provides a convenient sort key. The third is the number of times that letter pattern was found in the dictionary. The search is a simple pattern match, so if a pattern occurs twice within a word, that counts as two occurrences.
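For anyone who would rather script the count than run searches by hand, here is a minimal Python sketch of the same idea. It is not what I actually ran (that was the editor's search), and the function name, the sample words, and the sample elements are made up for illustration. It also assumes the editor counts non-overlapping matches, which is how Python's str.count behaves.

```python
def count_elements(words, elements):
    """Return (element, length, occurrences) rows sorted by length, then count.

    Hypothetical helper: it mirrors the three-column table described above.
    """
    rows = []
    for element in elements:
        # str.count tallies non-overlapping matches, so a pattern that
        # appears twice in one word contributes two occurrences.
        occurrences = sum(word.count(element) for word in words)
        rows.append((element, len(element), occurrences))
    # Sort by element length first, then by how often the element was found.
    rows.sort(key=lambda row: (row[1], row[2]))
    return rows


# Tiny stand-in for the real 109,462-word dictionary (illustrative only).
sample_words = ["though", "weigh", "banana", "eerie"]
sample_elements = ["ough", "eigh", "an", "ee", "a"]

for element, length, occurrences in count_elements(sample_words, sample_elements):
    print(element, length, occurrences)
```

With the real word list, the words would be read from the dictionary file, one per line, and the elements would be the full vowel and consonant lists.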

The idea is to see which constructs are the most and least common. Below are two tables with the results.

VOWEL ORTHOGRAPHY COUNTS

Vowel      Characters  Occurrences
y          1           15850
u          1           31227
o          1           57402
a          1           71192
i          1           81941
e          1           108571
uy         2           22
aa         2           38
ii         2           38
ao         2           63
ez         2           140
oh         2           140
uo         2           261
yr         2           263
ah         2           281
ae         2           338
oy         2           352
eh         2           389
eu         2           569
ye         2           578
oe         2           602
ey         2           639
aw         2           733
eo         2           822
ew         2           842
ei         2           1013
wo         2           1018
gh         2           1057
ay         2           1072
oa         2           1080
ue         2           1153
ui         2           1158
eg         2           1219
ua         2           1231
oi         2           1232
au         2           1303
ow         2           1715
ir         2           2155
ig         2           2167
ai         2           2205
oo         2           2455
ee         2           2561
ut         2           2664
ot         2           2784
ia         2           3321
ur         2           3846
et         2           4207
ou         2           4500
ea         2           4588
ie         2           6374
ic         2           6893
or         2           7630
ar         2           7866
is         2           8144
al         2           8993
es         2           18963
er         2           21015
aae        3           1
eie        3           1
iee        3           1
aoh        3           2
uye        3           3
aah        3           4
aow        3           4
aar        3           5
aie        3           5
aue        3           5
oea        3           5
oeh        3           7
oeu        3           8
eah        3           9
aor        3           10
awy        3           11
uoy        3           13
ieu        3           21
ooe        3           37
eir        3           38
iew        3           44
aig        3           47
oye        3           55
awe        3           57
eor        3           71
oup        3           81
aer        3           84
eau        3           84
uet        3           90
aur        3           91
oul        3           106
oor        3           116
ewe        3           145
ais        3           149
eou        3           149
eye        3           162
aye        3           164
eur        3           168
uar        3           176
irr        3           177
eig        3           180
oar        3           189
urr        3           237
eer        3           256
air        3           339
owe        3           360
err        3           411
ach        3           460
arr        3           464
our        3           490
are        3           522
oll        3           538
ert        3           572
ete        3           671
igh        3           686
ear        3           744
ore        3           753
ure        3           774
olo        3           935
ere        3           1445
ier        3           1449
ers        3           5161
ayre       4           0
eyre       4           0
aahe       4           1
aowe       4           1
eaue       4           1
ioux       4           1
uoye       4           1
eure       4           2
ueur       4           3
yrrh       4           4
oare       4           7
ueue       4           7
oore       4           9
ayor       4           10
arrh       4           13
owar       4           14
aigh       4           25
aure       4           25
ighe       4           27
iere       4           30
eere       4           46
aire       4           49
oure       4           58
augh       4           69
eare       4           69
arre       4           79
ayer       4           84
erre       4           92
urre       4           96
irre       4           97
eigh       4           123
ough       4           172
aughe      5           3
ougha      5           4
eighe      5           11
aille      5           12
oughe      5           28

CONSONANT ORTHOGRAPHY COUNTS

Consonant  Characters  Occurrences
q          1           1712
j          1           1758
x          1           2718
z          1           4102
w          1           8020
k          1           8143
v          1           9726
f          1           12791
y          1           15850
b          1           18293
h          1           20751
m          1           25675
p          1           26414
g          1           26924
d          1           34307
c          1           37832
l          1           50285
t          1           62149
n          1           64901
r          1           69025
s          1           83342
vv         2           0
zh         2           5
xs         2           14
cn         2           15
mh         2           16
cz         2           26
kk         2           27
bh         2           31
pb         2           49
lh         2           63
cq         2           64
pn         2           64
kh         2           79
dh         2           117
dj         2           118
tz         2           129
bt         2           133
gm         2           178
xc         2           200
mn         2           223
ln         2           225
zz         2           249
dn         2           253
rh         2           254
lf         2           260
kn         2           288
wr         2           292
lm         2           302
nh         2           304
lk         2           327
tw         2           351
dg         2           418
xe         2           429
cs         2           431
sw         2           516
wh         2           551
ve         2           578
cc         2           653
dd         2           674
bb         2           694
gn         2           728
ld         2           746
zi         2           759
gg         2           846
ks         2           962
gu         2           1006
pt         2           1013
gh         2           1057
mm         2           1081
nn         2           1092
ps         2           1127
ff         2           1180
mb         2           1293
we         2           1328
pp         2           1398
rr         2           1529
fe         2           1570
qu         2           1690
ph         2           1945
ze         2           1962
tt         2           1963
cu         2           1974
gi         2           2164
mp         2           2288
ke         2           2372
rt         2           2437
be         2           2497
sc         2           2497
ck         2           2534
ct         2           2800
ci         2           2915
th         2           3225
ge         2           3504
nc         2           3643
sh         2           3694
ts         2           3791
ce         2           3859
nd         2           4178
si         2           4447
ch         2           4457
pe         2           4458
me         2           4913
ll         2           5071
di         2           5150
ss         2           5721
se         2           5807
de         2           6913
ne         2           7635
le         2           9900
st         2           10327
ed         2           11866
ng         2           12011
te         2           12027
re         2           12695
ti         2           13497
xsc        3           1
xsw        3           2
pph        3           8
tth        3           8
cht        3           36
ngh        3           38
rrh        3           40
tsh        3           41
chs        3           46
lks        3           52
cqu        3           64
chm        3           89
sth        3           98
lve        3           163
kes        3           176
gne        3           181
gue        3           181
sci        3           216
sch        3           236
mbe        3           261
dge        3           291
gge        3           291
phe        3           311
sce        3           315
ffe        3           318
mme        3           357
rre        3           420
nne        3           421
sne        3           427
cks        3           459
que        3           501
ppe        3           503
tch        3           517
ght        3           671
shi        3           796
she        3           848
sse        3           871
chi        3           888
ssi        3           909
tte        3           910
lle        3           933
the        3           1090
che        3           1234
chsi       4           2
tsch       4           9
chth       4           11
cque       4           13
ngue       4           25
phth       4           30
ques       4           114
tche       4           225
cques      5           0
