Wednesday, 23 October 2019

Take a sample of 10 phishing e-mails and find the most common words.

# Write a program to find the most common words in a file.


import collections

# Read the whole file; the with-block closes it automatically.
with open('D:\\email.txt', 'r') as fin:
    text = fin.read()

# Punctuation characters to remove from every word.
strip_chars = str.maketrans("", "", '.,:"!&*')

# Lowercase the text, split it on whitespace and clean each word.
# (The original loop reassigned a local variable, so the list was never cleaned.)
words = [word.translate(strip_chars) for word in text.lower().split()]
words = [word for word in words if word]  # drop words that were only punctuation

# Count every word in a single pass instead of calling list.count() per word.
word_counter = collections.Counter(words)

n_print = int(input("How many most common words to print: "))
print("\nOK. The {} most common words are as follows\n".format(n_print))

for word, count in word_counter.most_common(n_print):
    print(word, ":", count)


OUTPUT:

How many most common words to print: 5
OK. The 5 most common words are as follows

the : 505
a : 297
is : 247
in : 231
to : 214
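
The top five words above are all everyday function words, which say little about phishing in particular. One possible refinement, shown here only as a minimal sketch, is to skip a list of stop words before counting. The names STOP_WORDS, STRIP_CHARS and most_common_words, the stop-word list and the sample sentence are all hypothetical and are not part of the original program.

```python
import collections

# Hypothetical stop-word list, chosen only for illustration; extend as needed.
STOP_WORDS = {"the", "a", "an", "is", "in", "to", "of", "and", "for", "on"}

# Same punctuation characters the program above removes.
STRIP_CHARS = str.maketrans("", "", '.,:"!&*')


def most_common_words(text, n, stop_words=STOP_WORDS):
    """Return the n most common words in text, ignoring stop words."""
    words = (w.translate(STRIP_CHARS) for w in text.lower().split())
    counter = collections.Counter(w for w in words if w and w not in stop_words)
    return counter.most_common(n)


# Made-up sample text, used only to demonstrate the function.
sample = "Verify your account. The account is locked, verify the account!"
for word, count in most_common_words(sample, 2):
    print(word, ":", count)
```

To use it on the e-mail file, pass the string returned by fin.read() as the text argument.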

