I am trying to create a simple implementation of the Flajolet-Martin algorithm in Python. The assignment says: The stream will be the contents of a text file, and you will produce an approximation of the number of unique words in the file as given by the algorithm. You will need to process the file one line at a time and may not store any part of the file. You can obtain words by splitting the lines on whitespace. Your code will be run from a terminal according to the following command
The text file is:
this is a fun file
this is the second line of the file
this is the third line of the file
this is the fourth and final line of the file
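Here is my code so far:
import sys

# Read the stream one line at a time; nothing from the file is kept.
for line in sys.stdin:
    words = line.split()  # split on whitespace
    for word in words:
        # hash() of a str is salted per process, so the bits change between runs.
        # bin() also keeps a "0b" prefix (and a "-" sign for negative hashes).
        bin_string = bin(hash(word))
        print(bin_string)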
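One way the attempt above could be completed is sketched below. It assumes the simplest textbook variant of Flajolet-Martin: hash every word, track the largest number of trailing zero bits R seen so far, and report 2**R. The names hash32, trailing_zeros, fm_estimate and main are made up for this illustration, and using MD5 truncated to 32 bits is only an assumption, chosen because the built-in hash() gives different values for strings on every run. Only one integer is kept while reading, so no part of the file is stored.

```python
import hashlib
import sys


def hash32(word, seed=0):
    """Deterministic 32-bit hash of a word (hypothetical helper, MD5-based)."""
    digest = hashlib.md5(f"{seed}:{word}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def trailing_zeros(n, width=32):
    """Count trailing zero bits of n; an all-zero hash counts as `width`."""
    if n == 0:
        return width
    # n & -n isolates the lowest set bit; its position is the zero count.
    return (n & -n).bit_length() - 1


def fm_estimate(lines, seed=0):
    """Return 2**R, where R is the largest trailing-zero count over all words."""
    max_zeros = 0  # the only state kept while streaming
    for line in lines:
        for word in line.split():
            max_zeros = max(max_zeros, trailing_zeros(hash32(word, seed)))
    return 2 ** max_zeros


def main():
    # Assumes the file arrives on standard input, as in the attempt above.
    print(fm_estimate(sys.stdin))
```

Calling main() at the bottom of the script would print the estimate. With a single hash function the result is always a power of two, so it can be well off the true count; a common refinement is to run several seeds and combine the estimates (for example, with a median of averages).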