Tscan: Dialog Structure Discovery Using Scan, Adaptation of Scan to Text Data

Apurba Nath; Aayush Kubba

doi:10.11648/j.eas.20210605.11

Peer-Reviewed

Published in Engineering and Applied Sciences (Volume 6, Issue 5)

Received: 1 August 2021; Accepted: 18 August 2021; Published: 7 September 2021

Abstract

Can we learn dialog structure from existing dialogs without an ontology or domain assumptions? Understanding the structure of existing task-oriented human-human dialogs can help us automate such dialogs more effectively. Traditionally, dialog structures have been built from ontologies created by domain experts. However, in our experience, getting the ontology right is difficult and time-consuming. As with other such tasks, an unsupervised approach may do better than hand-crafted rules. We propose an unsupervised dialog structure discovery approach based on SCAN (Semantic Clustering by Adopting Nearest neighbors). Our approach comprises two steps: first, clustering the utterances, and second, building a structure from inter-cluster transition probabilities. Our main contribution in this paper is the adaptation of SCAN to text data. Unlike the SCAN approach for images, for text we did not train a separate pretext model and were able to use BERT for that purpose. Similarly, for neighbor discovery, instead of augmentation we were able to leverage the variety already present in the data. Evaluation of dialog structures is somewhat subjective, so we have used statistical measures as proxies for structure quality. We also report results on an internal dataset of 100k task-oriented human-human dialogs. We think SCAN-like approaches are very promising for problems that rely on embedding similarities and should be explored further.
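
The second step described in the abstract, building a structure from inter-cluster transition probabilities, can be illustrated with a minimal sketch. It assumes each utterance has already been assigned a cluster id by the first step. The function names, the toy cluster labels, and the `min_prob` pruning threshold are hypothetical choices made for illustration; the abstract does not specify how the authors estimate or prune transitions.

```python
from collections import Counter, defaultdict


def transition_probabilities(dialogs):
    """Estimate P(next cluster | current cluster) from cluster-labelled dialogs.

    `dialogs` is a list of dialogs; each dialog is a list of cluster ids,
    one per utterance, in the order the utterances were spoken.
    """
    counts = defaultdict(Counter)
    for dialog in dialogs:
        # Count every adjacent (current, next) pair of cluster ids.
        for current, nxt in zip(dialog, dialog[1:]):
            counts[current][nxt] += 1

    probs = {}
    for current, successors in counts.items():
        total = sum(successors.values())
        probs[current] = {nxt: n / total for nxt, n in successors.items()}
    return probs


def structure_edges(probs, min_prob=0.3):
    """Keep transitions whose probability reaches the (assumed) threshold."""
    return sorted(
        (current, nxt, p)
        for current, successors in probs.items()
        for nxt, p in successors.items()
        if p >= min_prob
    )


# Toy cluster-id sequences, made up for illustration (not the paper's data).
toy_dialogs = [
    ["greet", "ask_issue", "give_details", "resolve", "close"],
    ["greet", "ask_issue", "give_details", "escalate", "close"],
    ["greet", "ask_issue", "resolve", "close"],
]

probs = transition_probabilities(toy_dialogs)
edges = structure_edges(probs)
for current, nxt, p in edges:
    print(f"{current} -> {nxt}: {p:.2f}")
```

Each retained edge can be read as one arc of a dialog flow graph whose nodes are utterance clusters. Raising `min_prob` yields a sparser structure that keeps only the dominant paths.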

Page(s)	82-85
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2024. Published by Science Publishing Group

Keywords

Unsupervised Learning, Dialog Structure Discovery, Text Classification, Clustering

Cite This Article

APA Style

Apurba Nath, Aayush Kubba. (2021). Tscan: Dialog Structure Discovery Using Scan, Adaptation of Scan to Text Data. Engineering and Applied Sciences, 6(5), 82-85. https://doi.org/10.11648/j.eas.20210605.11

ACS Style

Apurba Nath; Aayush Kubba. Tscan: Dialog Structure Discovery Using Scan, Adaptation of Scan to Text Data. Eng. Appl. Sci. 2021, 6(5), 82-85. doi: 10.11648/j.eas.20210605.11

AMA Style

Apurba Nath, Aayush Kubba. Tscan: Dialog Structure Discovery Using Scan, Adaptation of Scan to Text Data. Eng Appl Sci. 2021;6(5):82-85. doi: 10.11648/j.eas.20210605.11

@article{10.11648/j.eas.20210605.11,
  author = {Apurba Nath and Aayush Kubba},
  title = {Tscan: Dialog Structure Discovery Using Scan, Adaptation of Scan to Text Data},
  journal = {Engineering and Applied Sciences},
  volume = {6},
  number = {5},
  pages = {82-85},
  doi = {10.11648/j.eas.20210605.11},
  url = {https://doi.org/10.11648/j.eas.20210605.11},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.eas.20210605.11},
  year = {2021}
}

TY - JOUR
T1 - Tscan: Dialog Structure Discovery Using Scan, Adaptation of Scan to Text Data
AU - Apurba Nath
AU - Aayush Kubba
Y1 - 2021/09/07
PY - 2021
N1 - https://doi.org/10.11648/j.eas.20210605.11
DO - 10.11648/j.eas.20210605.11
T2 - Engineering and Applied Sciences
JF - Engineering and Applied Sciences
JO - Engineering and Applied Sciences
SP - 82
EP - 85
PB - Science Publishing Group
SN - 2575-1468
UR - https://doi.org/10.11648/j.eas.20210605.11
VL - 6
IS - 5
ER -

Author Information

Apurba Nath

Voicezen India Pvt Ltd, Gurugram, India

Aayush Kubba

Voicezen India Pvt Ltd, Gurugram, India
