The volume of data accessible on the internet has increased dramatically. This growth is expected to accelerate as more devices are connected to the network. Part of this data consists of documents from various sources. As digital sources multiply, it becomes difficult to identify the relevant information that is most needed for further use. goa...