Most Frequent Words - Corpus of Research Articles 2007


The Corpus of Research Articles (CRA) contains 5,609,417 words in total, 126,569 of them unique.


RANK  WORD          INSTANCES  PERCENTAGE
1.    the           390393     7.2398 %
2.    of            225146     4.1753 %
3.    and           163511     3.0323 %
4.    in            132492     2.4571 %
5.    to            132060     2.4490 %
6.    a             108201     2.0066 %
7.    is            73715      1.3670 %
8.    that          69240      1.2840 %
9.    for           56687      1.0513 %
10.   as            45985      0.8528 %
11.   with          37685      0.6989 %
12.   are           35361      0.6558 %
13.   on            34533      0.6404 %
14.   by            34350      0.6370 %
15.   this          33380      0.6190 %
16.   be            32800      0.6083 %
17.   it            25317      0.4695 %
18.   from          24857      0.4610 %
19.   was           23763      0.4407 %
20.   not           23134      0.4290 %
21.   or            22583      0.4188 %
22.   an            21886      0.4059 %
23.   at            20228      0.3751 %
24.   we            18570      0.3444 %
25.   were          18236      0.3382 %
26.   have          18122      0.3361 %
27.   which         17646      0.3272 %
28.   their         15307      0.2839 %
29.   can           13720      0.2544 %
30.   more          13711      0.2543 %
31.   these         13683      0.2537 %
32.   but           12368      0.2294 %
33.   they          12196      0.2262 %
34.   one           12128      0.2249 %
35.   has           11527      0.2138 %
36.   between       11375      0.2109 %
37.   also          11335      0.2102 %
38.   than          10947      0.2030 %
39.   such          10649      0.1975 %
40.   other         10304      0.1911 %
41.   all           9904       0.1837 %
42.   two           8936       0.1657 %
43.   may           8884       0.1648 %
44.   there         8715       0.1616 %
45.   if            8692       0.1612 %
46.   been          8667       0.1607 %
47.   only          8309       0.1541 %
48.   I             8285       0.1536 %
49.   its           8183       0.1518 %
50.   when          8029       0.1489 %
51.   time          7623       0.1414 %
52.   would         7509       0.1393 %
53.   some          7169       0.1329 %
54.   data          7073       0.1312 %
55.   used          6994       0.1297 %
56.   will          6955       0.1290 %
57.   about         6849       0.1270 %
58.   both          6771       0.1256 %
59.   each          6753       0.1252 %
60.   et            6606       0.1225 %
61.   different     6510       0.1207 %
62.   however       6452       0.1197 %
63.   al            6426       0.1192 %
64.   model         6396       0.1186 %
65.   first         6338       0.1175 %
66.   into          6334       0.1175 %
67.   where         6278       0.1164 %
68.   no            6260       0.1161 %
69.   use           6131       0.1137 %
70.   had           6088       0.1129 %
71.   his           6069       0.1125 %
72.   new           6060       0.1124 %
73.   who           6029       0.1118 %
74.   because       6005       0.1114 %
75.   most          5910       0.1096 %
76.   our           5833       0.1082 %
77.   results       5823       0.1080 %
78.   study         5567       0.1032 %
79.   then          5524       0.1024 %
80.   using         5364       0.0995 %
81.   social        5275       0.0978 %
82.   number        5208       0.0966 %
83.   so            5138       0.0953 %
84.   research      5075       0.0941 %
85.   those         5025       0.0932 %
86.   what          4974       0.0922 %
87.   over          4889       0.0907 %
88.   through       4827       0.0895 %
89.   system        4825       0.0895 %
90.   information   4802       0.0891 %
91.   case          4770       0.0885 %
92.   he            4768       0.0884 %
93.   analysis      4722       0.0876 %
94.   how           4704       0.0872 %
95.   any           4634       0.0859 %
96.   could         4612       0.0855 %
97.   same          4526       0.0839 %
98.   see           4522       0.0839 %
99.   do            4438       0.0823 %
100.  within        4392       0.0814 %
101.  thus          4376       0.0812 %
102.  example       4361       0.0809 %
103.  should        4359       0.0808 %
104.  state         4322       0.0802 %
105.  work          4297       0.0797 %
106.  while         4256       0.0789 %
107.  given         4151       0.0770 %
108.  value         4142       0.0768 %
109.  even          4098       0.0760 %
110.  process       4090       0.0758 %
111.  well          4072       0.0755 %
112.  level         4063       0.0753 %
113.  control       4009       0.0743 %
114.  many          3981       0.0738 %
115.  after         3912       0.0725 %
116.  based         3909       0.0725 %
117.  fig           3858       0.0715 %
118.  table         3805       0.0706 %
119.  out           3760       0.0697 %
120.  important     3692       0.0685 %
121.  during        3669       0.0680 %
122.  found         3668       0.0680 %
123.  them          3668       0.0680 %
124.  set           3581       0.0664 %
125.  three         3566       0.0661 %
126.  very          3543       0.0657 %
127.  values        3496       0.0648 %
128.  effect        3450       0.0640 %
129.  people        3397       0.0630 %
130.  significant   3391       0.0629 %
131.  high          3388       0.0628 %
132.  being         3347       0.0621 %
133.  does          3327       0.0617 %
134.  order         3323       0.0616 %
135.  less          3300       0.0612 %
136.  studies       3292       0.0610 %
137.  group         3274       0.0607 %
138.  second        3232       0.0599 %
139.  Since         3227       0.0598 %
140.  change        3183       0.0590 %
141.  might         3132       0.0581 %
142.  development   3123       0.0579 %
143.  although      3107       0.0576 %
144.  point         3075       0.0570 %
145.  function      3074       0.0570 %
146.  way           3041       0.0564 %
147.  effects       3013       0.0559 %
148.  higher        2996       0.0556 %
149.  rather        2971       0.0551 %
150.  part          2951       0.0547 %
151.  following     2945       0.0546 %
152.  particular    2937       0.0545 %
153.  states        2935       0.0544 %
154.  much          2923       0.0542 %
155.  therefore     2915       0.0541 %
156.  under         2890       0.0536 %
157.  did           2862       0.0531 %
158.  whether       2854       0.0529 %
159.  performance   2849       0.0528 %
160.  approach      2824       0.0524 %
161.  terms         2816       0.0522 %
162.  among         2815       0.0522 %
163.  up            2794       0.0518 %
164.  made          2783       0.0516 %
165.  shown         2768       0.0513 %
166.  her           2750       0.0510 %
167.  problem       2749       0.0510 %
168.  form          2731       0.0506 %
169.  figure        2709       0.0502 %
170.  period        2709       0.0502 %
171.  models        2707       0.0502 %
172.  possible      2706       0.0502 %
173.  often         2692       0.0499 %
174.  individual    2688       0.0498 %
175.  large         2668       0.0495 %
176.  evidence      2662       0.0494 %
177.  power         2660       0.0493 %
178.  similar       2620       0.0486 %
179.  general       2610       0.0484 %
180.  theory        2607       0.0483 %
181.  years         2607       0.0483 %
182.  present       2605       0.0483 %
183.  knowledge     2600       0.0482 %
184.  systems       2584       0.0479 %
185.  section       2582       0.0479 %
186.  Further       2577       0.0478 %
187.  must          2573       0.0477 %
188.  groups        2560       0.0475 %
189.  result        2546       0.0472 %
190.  market        2513       0.0466 %
191.  rate          2492       0.0462 %
192.  method        2474       0.0459 %
193.  likely        2449       0.0454 %
194.  here          2436       0.0452 %
195.  like          2435       0.0452 %
196.  make          2425       0.0450 %
197.  test          2422       0.0449 %
198.  conditions    2413       0.0447 %
199.  variables     2409       0.0447 %
200.  participants  2407       0.0446 %
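
For readers who want to build a comparable list from their own texts, the Python sketch below shows one way a rank / word / instances / percentage table of this kind could be computed. It is an illustration only, not the procedure used for the CRA: the function name `frequency_table`, the lowercase letters-and-apostrophes tokenizer, and the choice of the total token count as the percentage denominator are all assumptions. The published percentages appear to rest on a slightly different denominator than the 5,609,417 total given at the top of the list (390393 / 5,609,417 is about 6.96 %, not 7.2398 %), so the corpus's exact counting rules remain unknown here.

```python
import re
from collections import Counter


def frequency_table(text, top_n=10):
    """Return (rank, word, instances, percentage) rows for the top_n words.

    Hypothetical helper for illustration; the CRA's own tokenization and
    percentage denominator are not documented in the list above.
    """
    # Assumed tokenizer: lowercase the text, keep runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    counts = Counter(tokens)

    rows = []
    # most_common() sorts by descending count; ties keep first-seen order.
    for rank, (word, instances) in enumerate(counts.most_common(top_n), start=1):
        percentage = 100.0 * instances / total
        rows.append((rank, word, instances, percentage))
    return rows


if __name__ == "__main__":
    sample = "The cat and the dog and the bird"
    for rank, word, instances, percentage in frequency_table(sample, top_n=3):
        print(f"{rank}. {word:<10} {instances:>6} {percentage:.4f} %")
```

Lowercasing before counting merges forms such as "The" and "the" into a single entry; a case-sensitive count would list them separately.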
