TY - JOUR
T1 - A computational framework for retrieval of document fragments based on decomposition schemes in engineering information management
AU - Liu, S
AU - McMahon, C A
AU - Darlington, M J
AU - Culley, S J
AU - Wild, P J
N1 - ID number: ISI:000242322900005
PY - 2006
Y1 - 2006
N2 - Retrieval of document fragments has a great potential for application in engineering information management. Frequently engineers have neither the time nor inclination to sift through long documents for small pieces of useful information. Yet it is frequently in the form of one or more long documents that the information that they seek is presented. Supporting the delivery of the right information, in the right format and in the right quantity motivates the search for better ways of handling document sub-components or fragments. Document fragment retrieval can be facilitated using modern computational technologies. This paper proposes a novel framework for information access utilising state-of-the-art computational technologies and introducing the use of multiple document structure views through decomposition schemes. The framework integrates document structure study, mark-up technologies, automated fragment extraction, faceted classification and a document navigation mechanism to achieve the target of retrieval of specific document fragments using precise, complex queries. These disparate elements have been brought together in an exploratory Engineering Document Content Management System (EDCMS). Using this, investigations using representative engineering documents have shown that information users can access and retrieve document content - at fragment level rather than at document level - both through data in a document and document metadata, through different perspectives and at different granularities, and simultaneously across multiple documents as well as within a single document. (c) 2006 Elsevier Ltd. All rights reserved.
AB - Retrieval of document fragments has a great potential for application in engineering information management. Frequently engineers have neither the time nor inclination to sift through long documents for small pieces of useful information. Yet it is frequently in the form of one or more long documents that the information that they seek is presented. Supporting the delivery of the right information, in the right format and in the right quantity motivates the search for better ways of handling document sub-components or fragments. Document fragment retrieval can be facilitated using modern computational technologies. This paper proposes a novel framework for information access utilising state-of-the-art computational technologies and introducing the use of multiple document structure views through decomposition schemes. The framework integrates document structure study, mark-up technologies, automated fragment extraction, faceted classification and a document navigation mechanism to achieve the target of retrieval of specific document fragments using precise, complex queries. These disparate elements have been brought together in an exploratory Engineering Document Content Management System (EDCMS). Using this, investigations using representative engineering documents have shown that information users can access and retrieve document content - at fragment level rather than at document level - both through data in a document and document metadata, through different perspectives and at different granularities, and simultaneously across multiple documents as well as within a single document. (c) 2006 Elsevier Ltd. All rights reserved.
U2 - 10.1016/j.aei.2006.05.008
DO - 10.1016/j.aei.2006.05.008
M3 - Article
SN - 1474-0346
VL - 20
SP - 401
EP - 413
JO - Advanced Engineering Informatics
JF - Advanced Engineering Informatics
IS - 4
ER -