Semantic Transformation Network: Improving Dialogue Summarization Through Contrastive Learning and Attention

Authors

  • Zheng Ren, College of Computing, Georgia Institute of Technology, North Avenue, Atlanta, GA 30332

Keywords

Dialogue Summarization, Semantic Transformation Network, seq2seq models

Abstract

Dialogue summarization plays a crucial role in natural language processing by generating concise summaries from multi-turn conversations. Traditional extractive methods often fail to capture the full conversational context, which has driven a shift toward abstractive approaches. This paper proposes a Semantic Transformation Network (STN) integrated with the seq2seq framework to enhance the semantic understanding of dialogues. Building on models such as the pointer-generator network (PGN), BERT, and BART, our approach transforms the latent space to generate accurate, semantically enriched summaries. We evaluate the model on the SAMSum and DialogSum datasets, where it outperforms state-of-the-art methods in lexical accuracy and semantic coherence, as measured by ROUGE, BLEU, BERTScore, and MoverScore. The results demonstrate the effectiveness of the STN in capturing key dialogue content while maintaining summary fluency and relevance.
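
The abstract describes the architecture only at a high level. As a rough illustration of how a latent-space transformation layer and a contrastive objective could be combined on top of a seq2seq encoder, the PyTorch sketch below is a minimal reading under stated assumptions: the SemanticTransformation module, its gated residual design, and the pairing scheme in contrastive_loss are hypothetical stand-ins, not the paper's actual implementation.

```python
# Minimal sketch (assumptions, not the paper's implementation): a latent-space
# transformation layer over encoder states plus a supervised contrastive loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticTransformation(nn.Module):
    """Re-encodes encoder hidden states with self-attention and a gated residual."""
    def __init__(self, d_model: int = 768, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.proj = nn.Linear(d_model, d_model)
        self.gate = nn.Linear(2 * d_model, d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq_len, d_model) hidden states from a seq2seq encoder (e.g., BART).
        attn_out, _ = self.attn(h, h, h)
        transformed = torch.tanh(self.proj(attn_out))
        g = torch.sigmoid(self.gate(torch.cat([h, transformed], dim=-1)))
        return self.norm(g * transformed + (1 - g) * h)

def contrastive_loss(z: torch.Tensor, labels: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    # z: (batch, d_model) pooled latent vectors; labels: (batch,) pairing IDs so that
    # vectors sharing a label (e.g., a dialogue and its reference summary) are positives.
    z = F.normalize(z, dim=-1)
    sim = z @ z.t() / tau                                    # temperature-scaled cosine similarity
    mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)).float()
    mask.fill_diagonal_(0)                                   # a vector is not its own positive
    logits = sim - torch.eye(len(z), device=z.device) * 1e9  # exclude self-pairs from the softmax
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    return -(mask * log_prob).sum(1).div(mask.sum(1).clamp(min=1)).mean()
```

In such a setup the transformed states would replace the raw encoder outputs fed to the decoder, and the contrastive term would be added to the standard cross-entropy summarization loss with a small weighting factor.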

References

Nallapati, R., Zhai, F., & Zhou, B. (2017). SummaRuNNer: A recurrent neural network based sequence model for extractive summarization of documents. Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 3075-3081.

Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. Proceedings of the 27th International Conference on Neural Information Processing Systems (NIPS), 3104-3112.

Rush, A. M., Chopra, S., & Weston, J. (2015). A neural attention model for abstractive sentence summarization. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP), 379-389.

Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. Proceedings of the International Conference on Learning Representations (ICLR).

See, A., Liu, P. J., & Manning, C. D. (2017). Get to the point: Summarization with pointer-generator networks. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL), 1073-1083.

Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 4171-4186.

Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., … & Zettlemoyer, L. (2020). BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), 7871-7880.

Chen, Y., & Bansal, M. (2021). “Keep it simple”: Unsupervised simplification of multi-paragraph text. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics, 155-168.

Gliwa, B., Mochol, I., Biesek, M., & Wawer, A. (2019). SAMSum corpus: A human-annotated dialogue dataset for abstractive summarization. Proceedings of the 2nd Workshop on New Frontiers in Summarization, 70-79.

Chen, S., Zhou, W., & He, Y. (2021). DialogSum: A large-scale dialogue summarization dataset. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 4494-4504.

Zhao, Y., Li, J., Zhang, K., & Zhang, J. (2020). Hierarchical attention mechanism for dialogue summarization. Proceedings of the 28th International Conference on Computational Linguistics (COLING), 4488-4498.

Liu, Y., & Chen, F. (2019). Abstractive dialogue summarization with hierarchical reinforcement learning. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2079-2089.

Chen, Q., & Yang, S. (2020). Multi-view summarization networks for semantic-aware dialogue summarization. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 4236-4246.

Li, Z., Cao, Y., He, R., & Li, W. (2020). A dual-encoder model with attention for improved dialogue summarization. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 4339-4348.

Gunel, B., Du, J., Conneau, A., & Stoyanov, V. (2020). Supervised contrastive learning for pre-trained language model fine-tuning. arXiv preprint arXiv:2011.01403.

Zhang, T., Kishore, V., Wu, F., Weinberger, K., & Artzi, Y. (2020). BERTScore: Evaluating text generation with BERT. Proceedings of the 8th International Conference on Learning Representations (ICLR).

Zhao, W., Peyrard, M., Liu, F., Gao, Y., Meyer, C. M., & Eger, S. (2019). MoverScore: Text generation evaluating with contextualized embeddings and earth mover distance. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 563-578.

Lin, C. Y. (2004). ROUGE: A package for automatic evaluation of summaries. Proceedings of the ACL-04 Workshop on Text Summarization Branches Out, 74-81.

Papineni, K., Roukos, S., Ward, T., & Zhu, W. J. (2002). BLEU: A method for automatic evaluation of machine translation. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), 311-318.

Xu, Y., Liu, L., & Lu, H. (2020). Graph-based dialogue summarization with multi-view attention mechanisms. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 3762-3772.

Chen, X., Chen, X., & Liu, W. (2020). Reinforcement learning for dialogue summarization: A framework and case study. Proceedings of the 2020 Conference on Neural Information Processing Systems (NeurIPS).

Published

2024-10-10

How to Cite

Ren, Z. (2024). Semantic Transformation Network: Improving Dialogue Summarization Through Contrastive Learning and Attention. Journal of Theory and Practice in Engineering and Technology, 1(3), 1–8. Retrieved from https://woodyinternational.com/index.php/jtpet/article/view/59