Many downstream NLP tasks have shown significant improvement through continual pre-training, transfer learning and multi-task learning. State-of-the-art approaches in Word Sense Disambiguation today benefit from some of these conjunction with information sources such as semantic relationships gloss definitions contained within WordNet. Our work builds upon systems uses data augmentation along e...