An exploratory study into automated real-time categorisation of engineering e-mail

Research output: Contribution to conference › Paper

Abstract

For large, spatially and temporally distributed engineering projects, e-mail is a central means of discussing engineering work and sharing the digital assets that define the product and its production process. The importance of communication and the value of its content for resolving issues post facto are universally accepted. More recently, the potential value of its content to predict events, issues and states a priori has been explored with some success. However, whereas in the post facto context trends and patterns can be established through iteration and refinement over time, prediction requires heuristics to be established in advance, and analysis must be performed closer to real time because of the critical and often very short timescales. This paper considers the challenge of making predictions from the content of e-mail. In particular, the paper deals with engineering e-mail and the ability to automatically predict its purpose from its content rather than relying solely on the subject line. The work builds upon previous studies by the authors concerning the characterisation of the content of e-mails: what they are about, why they were sent and how their content is expressed. The paper summarises the previous work and examines the potential for identifying the purpose of an e-mail using Naive Bayes and an adapted Latent Semantic Analysis approach. Although the techniques have so far been applied only in an initial exploratory study of 98 e-mails, the accuracy of 66% achieved suggests the potential for automated real-time categorisation of engineering e-mails. Such a capability would support both the prioritisation of e-mail for engineers and the macro-level characterisation of project e-mail dynamics. The latter provides the opportunity for real-time analysis of an engineering project's status and, correspondingly, modes of management intervention.
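
As an illustration of the kind of content-based purpose prediction the abstract describes, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing. It is not the authors' implementation: the category labels (`request`, `inform`), the four training sentences and the tokeniser are all hypothetical, chosen only to show how word counts per category can be turned into a predicted purpose.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase, split on whitespace and strip surrounding punctuation."""
    words = (w.strip(".,;:!?()\"'") for w in text.lower().split())
    return [w for w in words if w]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # e-mails per category
        self.word_counts = defaultdict(Counter)  # word counts per category
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: invented sentences and labels, not the study's corpus.
train = [
    ("Please can you send me the latest CAD drawing", "request"),
    ("Could you confirm the tolerance on the bracket", "request"),
    ("The test rig has been installed and calibrated", "inform"),
    ("Attached is the updated stress analysis report", "inform"),
]
clf = NaiveBayes().fit([t for t, _ in train], [l for _, l in train])
print(clf.predict("Can you send the report please"))
```

Because prediction needs only the stored counts and one pass over the words of the incoming message, a classifier of this form can score each e-mail as it arrives, which is the property that makes it a candidate for near real-time use.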
Original language: English
Pages: 4806-4811
Number of pages: 6
DOI: https://doi.org/10.1109/SMC.2013.818
Publication status: Published - 2013
Event: 2013 IEEE International Conference on Systems, Man, and Cybernetics (SMC 2013) - Manchester, United Kingdom
Duration: 13 Oct 2013 to 16 Oct 2013

Conference

Conference: 2013 IEEE International Conference on Systems, Man, and Cybernetics (SMC 2013)
Country: United Kingdom
City: Manchester
Period: 13/10/13 to 16/10/13

Keywords

  • e-mail
  • engineering communication
  • latent semantic analysis
  • naive bayes

Cite this

Gopsill, J. A., Payne, S. J., & Hicks, B. J. (2013). An exploratory study into automated real-time categorisation of engineering e-mail. 4806-4811. Paper presented at 2013 IEEE International Conference on Systems, Man, and Cybernetics (SMC 2013), Manchester, United Kingdom. https://doi.org/10.1109/SMC.2013.818

@conference{526ef778a49848d0b4fd38297f6a5922,
title = "An exploratory study into automated real-time categorisation of engineering e-mail",
abstract = "For large, spatially and temporally distributed engineering projects, e-mail is a central means of discussing engineering work and sharing the digital assets that define the product and its production process. The importance of communication and the value of its content for resolving issues post facto are universally accepted. More recently, the potential value of its content to predict events, issues and states a priori has been explored with some success. However, whereas in the post facto context trends and patterns can be established through iteration and refinement over time, prediction requires heuristics to be established in advance, and analysis must be performed closer to real time because of the critical and often very short timescales. This paper considers the challenge of making predictions from the content of e-mail. In particular, the paper deals with engineering e-mail and the ability to automatically predict its purpose from its content rather than relying solely on the subject line. The work builds upon previous studies by the authors concerning the characterisation of the content of e-mails: what they are about, why they were sent and how their content is expressed. The paper summarises the previous work and examines the potential for identifying the purpose of an e-mail using Naive Bayes and an adapted Latent Semantic Analysis approach. Although the techniques have so far been applied only in an initial exploratory study of 98 e-mails, the accuracy of 66{\%} achieved suggests the potential for automated real-time categorisation of engineering e-mails. Such a capability would support both the prioritisation of e-mail for engineers and the macro-level characterisation of project e-mail dynamics. The latter provides the opportunity for real-time analysis of an engineering project's status and, correspondingly, modes of management intervention.",
keywords = "e-mail, engineering communication, latent semantic analysis, naive bayes",
author = "Gopsill, {J A} and Payne, {S J} and Hicks, {B J}",
year = "2013",
doi = "10.1109/SMC.2013.818",
language = "English",
pages = "4806--4811",
note = "2013 IEEE International Conference on Systems, Man, and Cybernetics (SMC 2013) ; Conference date: 13-10-2013 Through 16-10-2013",

}

TY - CONF

T1 - An exploratory study into automated real-time categorisation of engineering e-mail

AU - Gopsill, J A

AU - Payne, S J

AU - Hicks, B J

PY - 2013

Y1 - 2013

AB - For large, spatially and temporally distributed engineering projects, e-mail is a central means of discussing engineering work and sharing the digital assets that define the product and its production process. The importance of communication and the value of its content for resolving issues post facto are universally accepted. More recently, the potential value of its content to predict events, issues and states a priori has been explored with some success. However, whereas in the post facto context trends and patterns can be established through iteration and refinement over time, prediction requires heuristics to be established in advance, and analysis must be performed closer to real time because of the critical and often very short timescales. This paper considers the challenge of making predictions from the content of e-mail. In particular, the paper deals with engineering e-mail and the ability to automatically predict its purpose from its content rather than relying solely on the subject line. The work builds upon previous studies by the authors concerning the characterisation of the content of e-mails: what they are about, why they were sent and how their content is expressed. The paper summarises the previous work and examines the potential for identifying the purpose of an e-mail using Naive Bayes and an adapted Latent Semantic Analysis approach. Although the techniques have so far been applied only in an initial exploratory study of 98 e-mails, the accuracy of 66% achieved suggests the potential for automated real-time categorisation of engineering e-mails. Such a capability would support both the prioritisation of e-mail for engineers and the macro-level characterisation of project e-mail dynamics. The latter provides the opportunity for real-time analysis of an engineering project's status and, correspondingly, modes of management intervention.

KW - e-mail

KW - engineering communication

KW - latent semantic analysis

KW - naive bayes

UR - http://dx.doi.org/10.1109/SMC.2013.818

UR - http://www.smc2013.org/

U2 - 10.1109/SMC.2013.818

DO - 10.1109/SMC.2013.818

M3 - Paper

SP - 4806

EP - 4811

ER -