This work describes a self-supervised data augmentation approach for improving the performance of learning models when only a moderate amount of labeled data is available.
We present ETC-NLG, an approach that leverages topic modeling annotations to enable fully unsupervised End-to-end Topic-Conditioned Natural Language Generation over emergent topics in unlabeled document collections.