Context
This dataset is based on my experience working with Customer Service teams at several popular Telecom companies. Companies almost never share conversational data between their Service Agents and customers, because it exposes many areas for improvement and criticism that, if made public, could damage the company's reputation.
Content
This dataset contains Telecom Customer Center conversation data. The main column, "CustomerIntercationRawText", holds the notes the agent jots down while speaking with the customer about an issue. Because the notes are written manually, they are bound to contain casual shorthand, spelling mistakes, and so on. The column "AgentAssignedTopic" holds the topic the agent picks for each call from a list of Topics, based on their experience handling similar calls. This assignment may itself be wrong. Location and call duration are also recorded at the call level.
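As a rough illustration of how these columns might be read, here is a minimal Python sketch. Only "CustomerIntercationRawText" and "AgentAssignedTopic" are column names taken from the description above; the "Location" and "CallDuration" headers, the helper names, and the three sample rows are assumptions invented for the example and are not taken from the dataset.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical sample rows, invented for illustration only (not real dataset rows).
# "Location" and "CallDuration" are assumed column names; check the actual file header.
SAMPLE_CSV = """CustomerIntercationRawText,AgentAssignedTopic,Location,CallDuration
"cust  says bill too HIGH this mnth",Billing,CityA,310
"no signal since yday, wants callbk",Network,CityB,185
"chrgd twice for same plan",Billing,CityA,240
"""


def normalize_note(text):
    """Lowercase the agent's note and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text).strip().lower()


def load_rows(handle):
    """Read call-level rows and attach a normalized copy of the raw note."""
    rows = []
    for row in csv.DictReader(handle):
        row["NormalizedText"] = normalize_note(row["CustomerIntercationRawText"])
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))

# Count how often each agent-assigned topic appears.
topic_counts = Counter(row["AgentAssignedTopic"] for row in rows)
print(topic_counts.most_common())
print(rows[0]["NormalizedText"])
```

To try this on the real file, the `io.StringIO(SAMPLE_CSV)` argument could be swapped for an opened file handle. The normalization here is deliberately light; it does not expand shorthand or correct spelling, and since the agent-assigned topics may be wrong, the counts are best treated as noisy labels.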
Acknowledgements
Contributors to the conversation corpus.
Inspiration
It is very difficult to obtain a large "Conversational Corpus" that can be leveraged for Interactive Voice Response (IVR) containment and Conversational AI purposes. I hope a dataset like this can help to some extent.