Although data integration has long been studied,
integrating e-Health data remains an open research challenge.
Many studies discuss data integration approaches in
isolation, which fragments the available information and narrows
the understanding of alternative solutions when
choosing an appropriate approach for integrating e-Health data.
This problem can be addressed by undertaking a comprehensive
review of the approaches and techniques used in data integration,
as well as the paradigms used in e-Health frameworks. The main
contribution of this paper is a narrative description of the most
widely used integration architectures, together with a comparative
analysis of them. The review reveals that, whatever business need
an integration approach serves, each approach has limiting factors
related to the size of the data, whether the data set is structured
or unstructured, and the business agility of the organization.
Moreover, some technologies were found to have been designed not
for data integration but for document management or information
integration. It is concluded that data integration approaches, and
the fields in which they are mostly applied, are so widely spread
that integration problems always result. It is also noted that
designing a framework for integrating e-Health data using a web
service approach in a distributed network characterized by
short-lived connections is still a challenge.
Vitalis Ndume : School of Computational, Communication Science and Engineering,
Nelson Mandela-African Institute of Science and Technology, Arusha, Tanzania
Yaw Nkansah-Gyekye : School of Computational, Communication Science and Engineering,
Nelson Mandela-African Institute of Science and Technology, Arusha, Tanzania
Jesuk Ko : Department of Healthcare Management, Gwangju University
Gwangju, Korea
Heterogeneous data
distributed architecture
integration techniques
e-Health paradigm
It is concluded that data integration is primarily concerned
with combining data residing at different sources and
providing users with a unified view of the data.
Integration may also involve managing the
relationships among data from different independent systems.
The degree to which the data are coupled may vary
depending on the organization's business needs. In some
cases a tightly integrated solution is required, while
in others a loosely integrated data model is preferred,
depending on the organization's agility. Therefore, a
practitioner should consider the need for a better
conceptualization of, and method for, implementing partial
integration of data between and within
organizations. This can be achieved by using an
architecture which is loosely integrated while enforcing
standardization. Additionally, choosing the right approach
for data integration requires a practitioner to consider the
IT infrastructure alongside that conceptualization and
method for partial integration.
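
As an illustration only, a loosely integrated architecture that still enforces standardization can be sketched in Python as follows. The two sources, their field names, the sample values and the shared schema are all hypothetical and are not taken from any system reviewed in this paper. Each source keeps its own format, and a small per-source mapping function translates its records into the common schema when the unified view is read.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Record = Dict[str, object]
Mapper = Callable[[Record], Record]

# Hypothetical records held by two independent systems (assumed formats).
CLINIC_A: List[Record] = [{"pid": "A-17", "dob": "1980-02-01", "hb_g_dl": 13.2}]
LAB_B: List[Record] = [
    {"patient": "B-03", "birth_date": "01/07/1975", "haemoglobin": 11.8}
]


def from_clinic_a(row: Record) -> Record:
    """Map a clinic record to the shared schema (dates are already ISO)."""
    return {
        "patient_id": row["pid"],
        "birth_date": row["dob"],
        "hemoglobin_g_dl": row["hb_g_dl"],
        "source": "clinic_a",
    }


def from_lab_b(row: Record) -> Record:
    """Map a laboratory record, converting DD/MM/YYYY dates to ISO format."""
    day, month, year = str(row["birth_date"]).split("/")
    return {
        "patient_id": row["patient"],
        "birth_date": f"{year}-{month}-{day}",
        "hemoglobin_g_dl": row["haemoglobin"],
        "source": "lab_b",
    }


def unified_view(
    sources: Iterable[Tuple[Iterable[Record], Mapper]]
) -> Iterator[Record]:
    """Yield every source record in the shared schema without copying the
    sources into a central store; the sources stay independent."""
    for rows, mapper in sources:
        for row in rows:
            yield mapper(row)


if __name__ == "__main__":
    for record in unified_view([(CLINIC_A, from_clinic_a), (LAB_B, from_lab_b)]):
        print(record)
```

In this sketch the coupling between systems is limited to the mapping functions, so a source can change its internal format by updating only its own mapper. A tightly integrated alternative would instead require every system to store data in the shared schema directly.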