International Journal of Artificial Intelligence - Savana
INTERNATIONAL JOURNAL OF INTERACTIVE MULTIMEDIA AND ARTIFICIAL INTELLIGENCE
ISSN: 1989-1660

IMAI RESEARCH GROUP COUNCIL
Director - Dr. Rubén González Crespo, Universidad Internacional de La Rioja (UNIR), Spain
Office of Publications - Lic. Ainhoa Puente, Universidad Internacional de La Rioja (UNIR), Spain
Latin-America Regional Manager - Dr. Carlos Enrique Montenegro Marín, Francisco José de Caldas District University, Colombia

EDITORIAL TEAM

Editor-in-Chief
Dr. Rubén González Crespo, Universidad Internacional de La Rioja – UNIR, Spain

Associate Editors
Dr. Óscar Sanjuán Martínez, CenturyLink, USA
Dr. Jordán Pascual Espada, ElasticBox, USA
Dr. Juan Pavón Mestras, Complutense University of Madrid, Spain
Dr. Alvaro Rocha, University of Coimbra, Portugal
Dr. Jörg Thomaschewski, Hochschule Emden/Leer, Emden, Germany
Dr. Carlos Enrique Montenegro Marín, Francisco José de Caldas District University, Colombia
Dr. Vijay Bhaskar Semwal, National Institute of Technology, Jamshedpur, India
Dr. Elena Verdú, Universidad Internacional de La Rioja (UNIR), Spain

Editorial Board Members
Dr. Rory McGreal, Athabasca University, Canada
Dr. Jesús Soto, SEPES, Spain
Dr. Nilanjan Dey, Techno India College of Technology, India
Dr. Abelardo Pardo, University of Sydney, Australia
Dr. Hernán Sasastegui Chigne, UPAO, Perú
Dr. Lei Shu, Osaka University, Japan
Dr. Roberto Recio, Cooperative University of Colombia, Colombia
Dr. León Welicki, Microsoft, USA
Dr. Enrique Herrera, University of Granada, Spain
Dr. Francisco Chiclana, De Montfort University, United Kingdom
Dr. Luis Joyanes Aguilar, Pontifical University of Salamanca, Spain
Dr. Ioannis Konstantinos Argyros, Cameron University, USA
Dr. Juan Manuel Cueva Lovelle, University of Oviedo, Spain
Dr. Pekka Siirtola, University of Oulu, Finland
Dr. Francisco Mochón Morcillo, National Distance Education University, Spain
Dr. Peter A. Henning, Karlsruhe University of Applied Sciences, Germany
Dr. Manuel Pérez Cota, University of Vigo, Spain
Dr. Walter Colombo, Hochschule Emden/Leer, Emden, Germany
Dr. Javier Bajo Pérez, Polytechnic University of Madrid, Spain
Dr. Jinlei Jiang, Dept. of Computer Science & Technology, Tsinghua University, China
Dr. B. Cristina Pelayo G. Bustelo, University of Oviedo, Spain
Dr. Cristian Iván Pinzón, Technological University of Panama, Panama
Dr. José Manuel Sáiz Álvarez, Nebrija University, Spain
Dr. Masao Mori, Tokyo Institute of Technology, Japan
Dr. Daniel Burgos, Universidad Internacional de La Rioja - UNIR, Spain
Dr. JianQiang Li, NEC Labs, China
Dr. David Quintana, Carlos III University, Spain
Dr. Ke Ning, CIMRU, NUIG, Ireland
Dr. Alberto Magreñán, Real Spanish Mathematical Society, Spain
Dr. Monique Janneck, Lübeck University of Applied Sciences, Germany
Dr. Carina González, La Laguna University, Spain
Dr. David L. La Red Martínez, National University of North East, Argentina
Dr. Juan Francisco de Paz Santana, University of Salamanca, Spain
Dr. Héctor Fernández, INRIA, Rennes, France
Dr. Yago Saez, Carlos III University of Madrid, Spain
Dr. Andrés G. Castillo Sanz, Pontifical University of Salamanca, Spain
Dr. Pablo Molina, Autonoma University of Madrid, Spain
Dr. José Miguel Castillo, SOFTCAST Consulting, Spain
Dr. Sukumar Senthilkumar, University Sains Malaysia, Malaysia
Dr. Juan Antonio Morente, University of Granada, Spain
Dr. Holman Diego Bolivar Barón, Catholic University of Colombia, Colombia
Dr. Sara Rodríguez González, University of Salamanca, Spain
Dr. José Javier Rainer Granados, Universidad Internacional de La Rioja - UNIR, Spain
Dr. Elpiniki I. Papageorgiou, Technological Educational Institute of Central Greece, Greece
Dr. Edward Rolando Nuñez Valdez, Open Software Foundation, Spain
Dr. Luis de la Fuente Valentín, Universidad Internacional de La Rioja - UNIR, Spain
Dr. Paulo Novais, University of Minho, Portugal
Dr. Giovanny Tarazona, Francisco José de Caldas District University, Colombia
Dr. Javier Alfonso Cedón, University of León, Spain
Dr. Sergio Ríos Aguilar, Corporate University of Orange, Spain
Dr. Mohamed Bahaj, Settat, Faculty of Sciences & Technologies, Morocco
Dr. Madalena Riberio, Polytechnic Institute of Castelo Branco, Portugal
Dr. Edgar Henry Caballero Rúa, Inforfactory SRL, Bolivia
International Journal of Interactive Multimedia and Artificial Intelligence

Data are becoming increasingly important in health management. Just think of the advantages that could derive from monitoring any person's vital signs and symptoms and feeding them into an online platform that, upon proper authorization, doctors could access at any time and from any place. This is just one example of the kind of transformation that health management is undergoing. In this improvement of health services triggered by new technologies, Big Data is playing a prominent role. The benefits derived from Big Data are becoming a reality in health fields as diverse as: medical services, synthesis of data from medical histories and clinical analysis, management of health centers, hospital administration, distribution of material (especially relevant to specific epidemic needs), detection and prevention of possible side effects of drugs and treatments, scientific documentation (generation, storage and exploitation), medical research, the fight against cancer and pandemic prevention.
Big Data allows structured and unstructured data to be integrated effectively. Big Data brings the most value in the analysis of unstructured data, where there is more knowledge to be discovered and exploited. In addition to all this, there are data coming from social networks and data generated by the Internet of Things: devices, sensors, medical instruments, fitness equipment and so on. The important thing, however, is not to have a lot of data, but the fact that Big Data tools contribute to the design and implementation of efficient processes that help us carry out health care policies based not only on the available data, but also on their interpretation and understanding.
This is how it can effectively contribute to improving health care, saving lives, expanding access to health systems and optimizing costs. In this regard, the important role played by Big Data in genomic research and genome sequencing should be mentioned. Looking to the future the challenge is how to efficiently manage the growing amount of data that is being generated. Medicine and health are undergoing profound changes. Technological innovation combined with automation and miniaturization has triggered an explosion in data production, which represents an important potential for improvement in health.
At the same time, we face a wide range of challenges. Exploitation of available data through progress in genomic medicine, imaging, and a wide range of mobile health applications or connected devices is hampered by numerous historical, technical, legal and political barriers. The lack of harmonization of data formats, processing, analysis and data transfer is a source of incompatibilities and lost opportunities that society cannot afford. This special issue is designed with the primary objective of showing what we have just pointed out: the diversity of fields where big data is used and, consequently, how it is increasingly gaining importance as a tool for analysis and research in the field of health.
In this sense there are papers related to the following topics: re-using electronic health records with artificial intelligence, big data analytics solution for intelligent healthcare management, development of a predictive model for successful induction of labour, big data and the efficient management of outpatient visits, development of injury prevention policies following a big data approach, generating big data sets from knowledge-based decision support systems to pursue value-based healthcare, the use of administrative records of health information both for diagnoses and patients, and an analysis of the European public health system model and the corresponding healthcare and management-related information systems, the challenges that these health systems are currently facing, and the possible contributions of big data solutions to this field.
The paper by Ignacio Hernández Medrano, Jorge Tello Guijarro, Cristóbal Belda, Alberto Ureña, Ignacio Salcedo, Horacio Saggion and Luis Espinosa Anke, “Savana: re-using Electronic Health Records with Artificial Intelligence”, focuses on the fact that health information grows exponentially, thus generating more knowledge than we can apply. Unlike in the past, doctors today no longer have time to keep up to date. This explains why only one in five medical decisions is strictly based on evidence, which leads to variability. A possible solution can be found in clinical decision support systems based on big data analysis.
As the processing of large amounts of information gains relevance, big data analytics can see and correlate further than the human mind can. This is where healthcare professionals count on a new tool to deal with growing information. Savana uses natural language processing and neural networks to expand medical terminologies, allowing the reuse of natural language directly from clinical reports. This automated and precise digital extraction allows the generation of a real time information engine, to be applied to care, research and management. “DataCare: Big Data Analytics Solution for Intelligent Healthcare Management” is the research carried out by Alejandro Baldominos, Fernando De Rada, and Yago Saez.
This paper presents DataCare, a solution for intelligent healthcare management. This tool is able not only to retrieve and aggregate data from different key performance indicators in healthcare centers, but also to estimate future values for these key performance indicators and, as a result, fire early alerts when undesirable values are about to occur or provide recommendations to improve the quality of service. The architecture built in this research ensures high scalability, which enables the processing of very high data volumes arriving at high speed from a large set of sources.
This article describes the architecture designed for this project and the results obtained after conducting a pilot in a healthcare center.
Useful conclusions have been drawn regarding how key performance indicators change in different situations, and how they affect patients’ satisfaction. The paper by Cristina Pruenza, María Teulón, Luis Lechuga, Julia Díaz and Ana González, “Development of a predictive model for induction success of labour”, focuses on a relevant issue for obstetricians: the induction procedure. Obstetricians face the need to end a pregnancy, usually for medical reasons or, less frequently, for social reasons. The success of the induction procedure is conditioned by a multitude of maternal and fetal variables that appear before or during pregnancy or the birth process, and which have a low predictive value.
The failure of the induction process involves performing a caesarean section. This project arises from the clinical need to resolve a situation of uncertainty that frequently occurs in our clinical practice, since the weight of clinical variables is not adequately evaluated. We find it very useful to know a priori the probability of success of an induction in order to rule out those inductions with a high probability of failure, avoiding unnecessary procedures or postponing the end of the pregnancy if possible. We developed a predictive model of induced labour success as a support tool in clinical decision making.
Improving the predictability of a successful induction is one of the current challenges of obstetrics because of the negative impact of a failed induction. Identifying those patients with a high chance of failure will allow us to offer them better care, thus improving their health outcomes and patient-perceived quality. Therefore a Clinical Decision Support System was developed to give support to obstetricians.
Editor’s Note DOI: 10.9781/ijimai.2017.03.000
In Press Special Issue on Big Data and e-Health

In this article, we proposed a robust method to explore and model a source of clinical information with the purpose of obtaining all possible knowledge. Generally, in classification models it is difficult to find out the contribution that each attribute makes to the model. We worked in this direction to offer transparency to models that may be considered as black boxes. The positive results obtained from both the information recovery system and the predictions and explanations of the classification show the effectiveness and strength of this tool.
“Machine-Learning-Based no show prediction in outpatient visits” is the title of the paper written by C. Elvira, J.C. Gonzálvez, A. Martinez and F. Mochón. A problem in the area of health demand is the high percentage of patients who do not attend their appointments, whether for a consultation or a hospital test. The study aims to identify whether there is a pattern of behaviour that allows predicting when patients will not keep an appointment for a consultation or test. The article uses big data analysis techniques to support measures that mitigate the consequences of patients not attending appointments.
A predictive model is constructed which uses the information related to medical appointments of patients and the information referring to the patient’s history of appointments. In view of the results, it can be stated that the information collected in the data set does not seem sufficient, neither in terms of patient description nor in terms of appointment characteristics, so as to construct a solid predictive model. The improvement of the classifier capacities presented in this work seems to require expanding and debugging the available information, both for patients and appointments.
The paper by Rosa María Cantón Croda and Damián Emilio Gibaja Romero, “Development of Injuries Prevention Policies in Mexico: A Big Data Approach”, analyses the agents that can cause injuries in Mexico. Mexican injury prevention strategies have focused on injuries caused by car accidents and gender violence. This paper presents a full analysis of the injuries registered in Mexico in order to give a wider overview of the agents that can cause injuries around the country. Taking into account the amount of information from both public and private sources, obtained from dynamic cubes reported by the Minister of Health, big data strategies are used with the objective of finding an appropriate extraction, such as identifying the real correlations between the different variables registered by the Health Sector.
The results of the analysis show areas of opportunity to improve public policies on the subject, particularly in reducing injuries at home, on public roads (pedestrians) and at work. “Generating big data sets from knowledge-based decision support systems to pursue value-based healthcare” is the research carried out by Arturo González-Ferrer, Germán Seara, Joan Cháfer and Julio Mayol. When talking about big data in healthcare we usually refer to how to use data collected from current electronic medical records, either structured or unstructured, so as to answer clinically relevant questions.
This operation is typically carried out by means of analytic tools or by extracting relevant data from patient summaries through natural language processing techniques. From another perspective of research in medical computing, powerful initiatives have emerged to help physicians make decisions, in both diagnosis and therapeutics, built upon the existing medical evidence (i.e. knowledge-based decision support systems). Many of the problems these tools have shown when used in real clinical settings are related to their implementation and deployment rather than to a failure to provide support, but technology is slowly overcoming interoperability and integration issues.
Beyond the point-of-care decision support these tools can provide, the data generated when using them, even in controlled trials, could be used to further analyze facts that are traditionally ignored in current clinical practice. In this paper, the authors reflect on the technologies available to make the leap and how they could help drive healthcare organizations towards a value-based healthcare philosophy.
The paper by Diego J. Bodas-Sagi and José M. Labeaga, “Big Data and Health Economics: Opportunities, Challenges and Risks”, summarizes the possibilities of big data to offer useful information to policy makers. In a world with tight public budgets and ageing populations we find it necessary to save costs in any production process. The use of outcomes from big data could in the future be a way to improve decisions at a lower cost than today. In addition to listing the advantages of properly using data and big data technologies, we also show some challenges and risks that analysts could face.
In addition we present a hypothetical example of the use of administrative records with health information both for diagnoses and patients. The last paper of this special issue is “Big Data and public health systems: issues and opportunities”, written by David Rojas de la Escalera and Javier Carnicero Giménez de Azcárate. Over the last years, the need for changing the current model of European public health systems has been repeatedly addressed, in order to ensure their sustainability. Following this line, IT has always been referred to as one of the key instruments for enhancing the information management processes of healthcare organizations, thus contributing to the improvement and evolution of health systems.
More specifically, big data solutions are expected to play a main role, since they are designed for handling huge amounts of information in a fast and efficient way, allowing users to make important decisions quickly. This article reviews the main features of the European public health system model and the corresponding healthcare and management-related information systems, the challenges that these health systems are currently facing and the possible contributions of big data solutions to this field. To that end, the authors share their professional experience on the Spanish public health system and review the existing literature related to this topic.
F. Mochón and C. Elvira

References

Groves, P., Kayyali, B., Knott, D., & Kuiken, S. V. (2016). The 'big data' revolution in healthcare: Accelerating value and innovation.
Roesems-Kerremans, G. (2016). Big Data in Healthcare. J Healthc Commun., 1:4.
Mayer-Schönberger, V., & Cukier, K. (2013). Big data: A revolution that will transform how we live, work, and think. Houghton Mifflin Harcourt.
Street, R. L., Gold, W. R., & Manning, T. R. (2013). Health promotion and interactive technology: Theoretical applications and future directions. Routledge.
Poon, E. G., Jha, A. K., Christino, M., Honour, M. M., Fernandopulle, R., Middleton, B., & Kaushal, R. (2006). Assessing the level of healthcare information technology adoption in the United States: a snapshot. BMC Medical Informatics and Decision Making, 6(1), 1.
Kontos, E., Blake, K. D., Chou, W. Y. S., & Prestin, A. (2014). Predictors of eHealth usage: insights on the digital divide from the Health Information National Trends Survey 2012. Journal of Medical Internet Research, 16(7), e172.
Whitney, S. N. (2003). A new model of medical decisions: exploring the limits of shared decision making. Medical Decision Making, 23(4), 275-280.
Curtright, J. W., Stolp-Smith, S. C., & Edell, E. S. (2000). Strategic performance management: development of a performance measurement system at the Mayo Clinic. Journal of Healthcare Management, 45(1), 58-68.
Lacy, J. S., Fielding, D. R., Sinclair III, E. L., Schremser, C. L., & Cress, J. A. (2014). U.S. Patent Application No. 14/263,940.
Stock, G. N., & McFadden, K. L. (2017). Improving service operations: linking safety culture to hospital performance. Journal of Service Management, 28(1), 57-84.
Bajpai, N., Bhakta, R., Kumar, P., Rai, L., & Hebbar, S. (2015). Manipal cervical scoring system by transvaginal ultrasound in predicting successful labour induction. Journal of Clinical and Diagnostic Research: JCDR, 9(5), QC04.
Berner, E. S., & La Lande, T. J. (2016). Overview of clinical decision support systems. In Clinical decision support systems (pp. 1-17). Springer International Publishing.
Norris, J. B., Kumar, C., Chand, S., Moskowitz, H., Shade, S. A., & Willis, D. R. (2014). An empirical investigation into factors affecting patient cancellations and no-shows at outpatient clinics. Decision Support Systems, 57, 428-443.
Wilkinson, R. G., & Marmot, M. (2003). Social determinants of health: the solid facts. World Health Organization.
Eisenberg, J. M. (1986). Doctors’ decisions and the cost of medical care: the reasons for doctors’ practice patterns and ways to change them.
Edwards, A., & Elwyn, G. (2009). Shared decision-making in health care: Achieving evidence-based patient choice. Oxford University Press.
Roski, J., Bo-Linn, G. W., & Andrews, T. A. (2014). Creating value in health care through big data: opportunities and policy implications. Health Affairs, 33(7), 1115-1122.
Bates, D. W., Saria, S., Ohno-Machado, L., Shah, A., & Escobar, G. (2014). Big data in health care: using analytics to identify and manage high-risk and high-cost patients. Health Affairs, 33(7), 1123-1131.
Raghupathi, W., & Raghupathi, V. (2014). Big data analytics in healthcare: promise and potential. Health Information Science and Systems, 2(1), 3.
Salathé, M. (2016). Digital Pharmacovigilance and Disease Surveillance: Combining Traditional and Big-Data Systems for Better Public Health. Journal of Infectious Diseases, 214 (suppl 4), S399-S403.
OPEN ACCESS JOURNAL ISSN: 1989-1660

COPYRIGHT NOTICE © 2017 UNIR. This work is licensed under a Creative Commons Attribution 3.0 unported License. Permission is granted to make digital or hard copies of part or all of this work, and to share, link, distribute, remix, tweak, and build upon ImaI research works, as long as users or entities credit ImaI authors for the original creation. Request permission for any other issue from email@example.com. All code published by ImaI Journal, ImaI-OpenLab and ImaI-Moodle platform is licensed according to the General Public License (GPL).
http://creativecommons.org/licenses/by/3.0/

TABLE OF CONTENTS
EDITOR’S NOTE
SAVANA: RE-USING ELECTRONIC HEALTH RECORDS WITH ARTIFICIAL INTELLIGENCE
DATACARE: BIG DATA ANALYTICS SOLUTION FOR INTELLIGENT HEALTHCARE MANAGEMENT
DEVELOPMENT OF A PREDICTIVE MODEL FOR INDUCTION SUCCESS OF LABOUR
MACHINE-LEARNING-BASED NO SHOW PREDICTION IN OUTPATIENT VISITS
DEVELOPMENT OF INJURIES PREVENTION POLICIES IN MEXICO: A BIG DATA APPROACH
GENERATING BIG DATA SETS FROM KNOWLEDGE-BASED DECISION SUPPORT SYSTEMS TO PURSUE VALUE-BASED HEALTHCARE
BIG DATA AND HEALTH ECONOMICS: OPPORTUNITIES, CHALLENGES AND RISKS
BIG DATA AND PUBLIC HEALTH SYSTEMS: ISSUES AND OPPORTUNITIES
DOI: 10.9781/ijimai.2017.03.001

Abstract: Health information grows exponentially (doubling every 5 years), thus generating a sort of inflation of science, i.e. the generation of more knowledge than we can leverage. In an unprecedented data-driven shift, doctors today no longer have time to keep up to date. This explains why only one in every five medical decisions is based strictly on evidence, which inevitably leads to variability. A good solution lies in clinical decision support systems based on big data analysis.
As the processing of large amounts of information gains relevance, automatic approaches become increasingly capable of seeing and correlating information further and better than the human mind can. In this context, healthcare professionals are increasingly counting on a new set of tools in order to deal with the growing information that becomes available to them on a daily basis. By allowing the grouping of collective knowledge and prioritizing “mindlines” over “guidelines”, these support systems are among the most promising applications of big data in health. In this demo paper we introduce Savana, an AI-enabled system based on Natural Language Processing (NLP) and Neural Networks, capable of, for instance, the automatic expansion of medical terminologies, thus enabling the re-use of information expressed in natural language in clinical reports.
This automated and precise digital extraction allows the generation of a real-time information engine, which is currently being deployed in healthcare institutions, as well as in clinical research and management.
Keywords: Natural Language Processing, Artificial Intelligence, E-Health, Machine Learning, Electronic Health Records.

I. Introduction

The notes that physicians write in Electronic Health Records (EHRs) during their daily practice contain vast amounts of valuable information. Doctors’ notes illustrate the real and practical way in which they address casuistry at ground level, where factors associated with their work environment and with conditions of uncertainty come into play. However, only a minor portion of all this information is leveraged today, namely that which “sees the light” in the form of scientific literature or other venues where experts share information (articles, reviews, meta-analyses, opinion pieces, conference submissions, and specialized websites in the medical domain).
A fundamental bottleneck preventing large-scale automatic reuse of this information is that it is mostly encoded in natural language, i.e. free text written by medical practitioners in EHRs. The traditional approach to knowledge extraction was, until very recently, to prestructure EHR systems so that only certain types of information are allowed in certain fields. However, today there is an increasing line of thought discouraging this practice, as the complexity of clinical reality cannot be modeled simply by splitting information in EHRs via drop-down menus.
As such, it is widely agreed that comprehensive reuse of the information generated daily at every point of care of the Health System is of utmost importance. While individual actions do not generate added value due to lack of statistical significance, the accumulated information provided by all specialists in a medical area is an unequivocal and highly valuable reference for any practitioner, especially considering that part of their actions is supported by Evidence Based Medicine. Thus, in the daily reality of a medical professional, it is regular practice for physicians to ask colleagues, according to their subarea of expertise, confident that their decisions are generally supported by existing scientific knowledge.
Moreover, Spain is one of the world’s leading countries in terms of impact of EHRs, which results in a very high availability of information. Every 10 minutes, tens of thousands of EHRs are written in Spanish medical institutions, which results in a total of billions, if we consider how long medical practitioners have been writing down their notes in electronic form. An additional factor is the need for real-time accurate information, which is explained by the fact that knowledge (and particularly, medical knowledge) grows exponentially. IBM currently estimates that there will be 200 times more medical information than a single individual would be able to absorb in all his or her life.
Additionally, we know that, today, doctors have on average one doubt for every two patients they see. Past attempts to apply Artificial Intelligence (AI) to medical decision support systems have traditionally encountered a strong limitation in the complexity of human language. Today, the state of the art of Natural Language Processing, along with the availability of the computational power needed to perform large-scale text understanding, results in a mature field for performing cutting-edge exploitation of text data in domain-specific scenarios. A viable system, however, must simplify its routines as much as possible, and leverage the statistical exploitation of semantic concepts (and not simply words) by combining NLP and data aggregation techniques.
Savana’s starting point, in 2013, was the goal of making the most of the huge amount of information contained in EHRs, which until now had only been used to follow individual patients’ progress. Likewise, other associated issues such as defining a correct medical usage for such information, meeting legal requirements (data protection, for example), or technical considerations, had to be accounted for. In this context, Savana was born as a platform for clinical decision support, based on real-time dynamic exploitation of all the information contained in EHR corpora. Savana performs immediate statistical

Savana: Re-using Electronic Health Records with Artificial Intelligence
Ignacio Hernández Medrano (1), Jorge Tello Guijarro (1), Cristóbal Belda (2), Alberto Ureña (1), Ignacio Salcedo (1), Luis Espinosa-Anke (1, 3), Horacio Saggion (3)
(1) Savana
(2) HM Hospitales
(3) TALN DTIC, Universitat Pompeu Fabra, Barcelona (Spain)
- There is currently very high sensitivity towards how EHRs are used. While the Organic Law of Protection of Personal Data (see footnote 1) states that an anonymized clinical record loses its condition of personal data, several stakeholders are of the opinion that, even without personal data in the record, it could still be possible to maliciously locate specific individuals by performing an inverse association from records to patients.
- A system of such characteristics must by definition exist in the cloud, as it requires constant and on-line training.
- Different EHR systems are incompatible, and hence interoperability is seriously hindered, and data sparsity becomes an additional issue to deal with.

For the above reasons, in Savana we decided to address the technical design with the following priorities:
- The source should not matter, as long as there is access to written text. Savana had to detach itself from formatting issues, and be capable of encoding any input in text format as its own ‘language’.
- It was essential to ensure that individual (single patient) information was irrelevant. In fact, we purposely tamper with each record at random, so that if a malicious third party were to gain access to this information, it would never know which parts were accurate and which were not (not even the team in Savana should know).
- However, information should be correct at aggregation time. Statistical approaches would be expected to automatically and reliably clean any false information the very moment in which a doctor, a manager or a researcher asked a question or performed a query.
- Records would not leave the hospital or the institution’s data center. They would be processed there in situ, and the cloud would only contain clinical concepts codified according to a predefined custom terminology.

In addition to the above concerns, we faced the challenge posed by current medical terminologies, which are not designed for the reuse of EHRs, and thus constitute a starting point, but not a long-term solution. Thus, in Savana we created our own terminology, a process which, for obvious reasons, had to be done automatically. The techniques followed for automatic terminological expansion were designed in-house, and are described in a recently published paper by the authors of this article.
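The two priorities above on record tampering and correctness at aggregation time can be illustrated with a small sketch. The article does not say which perturbation scheme Savana uses, so the code below swaps in classic randomized response as a stand-in: each record's flag for a clinical concept is kept with probability `p_truth` and otherwise replaced by a coin flip, and the aggregate prevalence is recovered by inverting the known noise. All names (`tamper`, `estimate_prevalence`, `p_truth`) and the simulated data are hypothetical.

```python
import random


def tamper(has_concept, p_truth, rng):
    """Keep the true flag with probability p_truth, else report a fair coin flip."""
    if rng.random() < p_truth:
        return has_concept
    return rng.random() < 0.5


def estimate_prevalence(reported, p_truth):
    """Invert the noise model: E[reported] = p_truth * pi + (1 - p_truth) * 0.5."""
    observed = sum(reported) / len(reported)
    return (observed - (1 - p_truth) * 0.5) / p_truth


# Simulated records (assumption: 30% of patients carry the concept).
rng = random.Random(0)
true_records = [rng.random() < 0.30 for _ in range(100_000)]
true_rate = sum(true_records) / len(true_records)

# No single tampered record can be trusted on its own...
reported = [tamper(flag, 0.7, rng) for flag in true_records]

# ...yet the aggregate query still recovers the underlying rate.
estimate = estimate_prevalence(reported, 0.7)
print(f"true rate: {true_rate:.3f}, estimated from tampered records: {estimate:.3f}")
```

The design trade-off in this stand-in is governed by `p_truth`: lower values make individual records less informative to an attacker but require more records for the aggregate estimate to be stable.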
Footnote 1: https://www.boe.es/buscar/act.php?id=BOE-A-1999-23750

In sum, by combining Big Data with AI approaches, we designed a robot that “didn’t read well, but excelled at summarization”, which overcame existing shortcomings and allowed us to advance with real use cases, where the goal was to reuse information linked to clinical experience, which had traditionally been limited. The usual approach had always been to implement systems that encoded information on the physician’s side (structured systems for inputting information, by means of e.g. dropdown menus). These approaches did not have much success due to, among other reasons, the fact that clinical experience is very complex, and the time available to practitioners to document it very limited. In order to tackle these and other technological challenges, we take advantage of current technologies such as, but not limited to:
- Supervised Machine Learning. We have designed and registered algorithms for the different stages of processing, so that, for instance, our system is able to determine that a given paragraph belongs to the ‘Background’ section, and not ‘Diagnosis’, due to certain morphological cues (the appearance of adverbs, for instance). Note that, while a traditional approach to such a problem could be the development of an expert or rule-based system, in this case the output of the system is based on a statistical model which optimizes a function defined at training time.
- Unsupervised Machine Learning. These techniques are aimed at designing statistical models sensitive to data distribution without a priori knowledge about the class or label associated with each data point. We took advantage of neural models for NLP (loosely inspired by the way the human brain works) to build a computational model (known in the NLP community as a word embeddings model) for determining the semantic content of words. For instance, the algorithm learns autonomously, i.e. without predefined semantic relations to be looked up, that Alzheimer’s and Parkinson’s have similar meanings, very different from e.g. Naproxeno and Ibuprofeno, which in addition are themselves semantically similar (see Figure 1 for the output of the algorithm for a given query). Savana’s model, which is being used in several modules of our infrastructure, has been trained with over 500M Spanish words coming from EHRs, and enables the robot to decide, for instance, when ‘no’ refers to the negation adverb, and when it is an abbreviation of the medical concept ‘neuritis óptica’, depending on the context. To the best of our knowledge, this is the largest embeddings model trained exclusively with EHRs.
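To make the notion of semantic similarity in an embeddings model concrete, the following minimal Python sketch ranks words by cosine similarity between their vectors. The three-dimensional vectors are invented for illustration only and are not taken from Savana's model, which is learned from EHR text; the names `EMBEDDINGS`, `cosine` and `most_similar` are likewise hypothetical.

```python
import math

# Toy vectors (assumption): real embeddings have hundreds of dimensions
# and are learned from text rather than written by hand.
EMBEDDINGS = {
    "alzheimer": [0.9, 0.1, 0.0],
    "parkinson": [0.8, 0.2, 0.1],
    "naproxeno": [0.1, 0.9, 0.2],
    "ibuprofeno": [0.0, 0.8, 0.3],
}


def cosine(u, v):
    """Cosine of the angle between two vectors: close to 1 means similar direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, k=1):
    """Return the k words whose vectors are closest to the query word's vector."""
    query = EMBEDDINGS[word]
    scored = [
        (other, cosine(query, vector))
        for other, vector in EMBEDDINGS.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


print(most_similar("alzheimer"))
print(most_similar("naproxeno"))
```

With these toy vectors the two diseases are each other's nearest neighbours, as are the two drugs, mirroring the behaviour described above.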
Fig. 1. Example of Savana’s unsupervised learning model. It shows the result when asked for words semantically related to dieta sana.
III. Results
In this section, we cover the main functionalities and products Savana offers for healthcare professionals.
A. Functionalities
Savana’s technology can be leveraged in different use cases. Today, there are three available applications already implemented and with real-world users, as well as three additional systems in development.
Once the service is deployed in an institution, usage tracking is incorporated. This allows Savana to develop improvements and new related services based on the actual use of the tool, and makes it possible to adapt the product to the users’ requirements (for instance, if it proves more useful in certain areas or clinical situations).
In what follows, we describe the currently available applications and their usage.
1) Savana Manager
This application is designed to learn about clinical practice and resource consumption, by computing data in a single institution, and comparing its data and trends with the average of Savana users (Figure 2). The user can also intuitively design custom tables depending on the type of information desired. In addition, a control panel is available where classic management indicators can be found, which, again, can be adapted depending on the needs of each individual institution (Figure 3).
Fig. 3. Home screen of Savana Manager. All the information and configuration options appear in a simple way on a single screen.
This application can be used to measure quantitatively, among others: how much variability there is in an institution’s practice; what the average costs per intervention are; which patients are more likely to take part in a clinical trial; the quality of clinical records; when clinical tests are likely to have been duplicated; what an institution’s position is with respect to others of its kind; and, in sum, any managerial question solvable with standard metrics.
2) Savana Consulta
This is the world’s first application for real-time clinical decision support in Spanish, and is designed to be used at the time of the patient’s visit, in the patient’s presence (Figure 4). From its inception, this application was developed considering first general practitioners and emergency physicians (who have a high patient load and very limited time), and then specialists.
Fig. 4. Home page of Savana Consulta.
It improves the potential for corroboration: in practice, using Savana Consulta amounts to querying all the specialists in real time, and hence incorrect data (statistical anomalies) are factored out of the aggregated response.
These common features constitute the content of the answer (which may not have been considered a priori by the practitioner), and can be relevant for decision-making. The vision behind Savana Consulta is that of a helper or second opinion when a medical question is asked (an example can be found in Figure 5).
From a social standpoint, it means that patients are provided with a new type of clinical resource, accessible from any medical institution, and at a very low cost compared with regular clinical technology. It improves the accuracy of the diagnoses and treatments given to patients by any practitioner, thus having a direct impact on their overall health.
Fig. 2. Example of the control panel of Savana Manager.
In Press, Special Issue on Big Data and e-Health
Fig. 5. Example of a question to Savana Consulta about the most frequent evolution of a patient with migraine, and its most probable timespan. This information can be obtained with just one click.
Savana Consulta can be implemented either in a national (interoperable) EHR system, or in a more delimited system (e.g. an autonomous community, a set of hospitals or one single medical institution). However, let us highlight that the larger the amount of data, the more significant the results become. Information is shared among all users of the network, without it being possible to trace back which hospital provided which piece of information. Moreover, each user can decide whether or not to share their own information. If they choose not to, their information only becomes available to users in the same institution.
The main contributions of this tool are: suggestions for each specific clinical case, with a precision not available in the current scientific literature; evidence coming from the system itself, with its own resources and population; and suggestions for better practices in areas where no Evidence-Based Medicine data is available.
3) Savana Research
Our third working product is useful for clinical research, as it performs time-sensitive analyses of the behavior of certain patient typologies. It analyzes the evolution of each individual case, and is capable of making predictions based on existing data.
For a given patient typology, the system can determine how many cases there are (prevalence in an institution), estimate upcoming cases of a certain set of events in the institution (for instance, a patient with a certain illness coming back for further assistance), and define evolutions according to a set of input tests and treatments, by detecting typical lines of treatment for prototypical patients. The system analyzes a patient’s timeline (illustrated in Figure 6), and hence it is possible to compute the most likely timespan of an occurring event or, if evolutions span a short period, to detect incorrect actions.
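The two simplest quantities mentioned above, prevalence within an institution and the typical timespan between two events on a patient's timeline, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Savana's implementation: the timeline layout, the event labels, and the helper names `prevalence` and `days_between` are all hypothetical, and the data is invented.

```python
from datetime import date
from statistics import median

# Hypothetical timelines (assumption): patient id -> chronologically ordered (date, event) pairs.
TIMELINES = {
    "p1": [(date(2015, 1, 10), "diagnosis:migraine"), (date(2015, 3, 11), "readmission")],
    "p2": [(date(2015, 2, 1), "diagnosis:migraine"), (date(2015, 2, 21), "readmission")],
    "p3": [(date(2015, 4, 5), "diagnosis:diabetes")],
    "p4": [(date(2015, 5, 1), "diagnosis:migraine")],
}


def prevalence(timelines, event):
    """Fraction of patients whose timeline contains the given event at least once."""
    matching = sum(
        1 for events in timelines.values() if any(name == event for _, name in events)
    )
    return matching / len(timelines)


def days_between(timelines, first, then):
    """For each patient, days from the first `first` event to the next `then` event."""
    gaps = []
    for events in timelines.values():
        start = next((day for day, name in events if name == first), None)
        if start is None:
            continue
        follow = next((day for day, name in events if name == then and day > start), None)
        if follow is not None:
            gaps.append((follow - start).days)
    return gaps


gaps = days_between(TIMELINES, "diagnosis:migraine", "readmission")
print(prevalence(TIMELINES, "diagnosis:migraine"))
print(median(gaps))
```

The median is used here as a robust summary of the typical gap; a gap far below it for a given patient would be one simple way to flag an evolution that spans an unusually short period.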
The main goal of this application is to quickly guide research hypotheses. In addition, Savana Research greatly speeds up a physician’s capacity to answer research questions, or guide working hypotheses, without requiring data extraction from EHRs via the traditional, slow methods based on (semi)manual processing.
As an overall conclusion, in Table I we provide a listing of interventions carried out in real-world cases thanks to specifically taking advantage of the information encoded by Savana.
B. Current implementation state
Savana is so far the result of 20,000 hours of computational development. Savana is currently providing service in 24 Spanish hospitals, distributed across three autonomous communities and two private groups. To date, more than 3,000 queries have been submitted to the different applications, by a total of 216 users.
Fig. 6. Example output of Savana Research: It shows the most likely admittance of patients with diabetes mellitus (again, this information can easily be obtained with just one click).
TABLE I. Examples of Interventions Taken Thanks to the Information Generated by Savana
- Avoiding usage of unnecessary elastic packs, after analyzing parts of the operating room.
- Discovering that the most frequent point of care after the diagnosis of Alzheimer’s disease is Traumatology.
- Ascertaining that new oral anticoagulants are safer than acenocoumarol in atrial fibrillation.
- Detecting candidates for Parkinson surgery who had been wrongly discarded.
- Correcting a 2x error in the forecast of beds and salbutamol for bronchiolitis.
- Identifying patients with refractory essential tremor who were treated with ultrasound.
- Calling in patients with familial aortic myocardiopathy (ICD code unavailable) for a clinical trial.
- Knowing how many women who give birth come back to the same hospital in the future.
- Listing how many debulking procedures a specific surgeon performed.
- Counting how many cases of bronchiolitis were incorrectly referred to the pediatric ICU.
- Anticipating how many spinal surgeries can actually be prevented thanks to the back school.
- Quantifying the number of cases of suspected appendicitis in which computerized tomography + abdominal ultrasound were carried out.
- Detecting nosocomial infections.
- Finding out how many breast cancers were treated with lapatinib.
IV. Conclusions
A large-scale query, submitted to a vast number of practitioners, and supported by a computational tool, facilitates and speeds up the clinician’s task. This is a disruptively new concept, which we call Evidence Generating Medicine, and which constitutes a novel layer of knowledge. On the other hand, in addition to the assistance activity, having all the information contained in EHRs readily available is highly useful for obtaining epidemiological information. This technique is framed within the data mining paradigm, aimed at efficiently exploiting big data, an area destined to revolutionize many fields, including healthcare.
The main avenues where our platform could undergo improvements are: (1) number of referrals to specialists; (2) fitness of diagnostic tests and treatments to recommendations issued in clinical practice guides; (3) number of subsequent visits; (4) reduction of hospitalizations; and (5) improvement of diagnosis.
In the case of Savana Consulta, this application allows patients without access to the best specialists to benefit from their collective knowledge. With the data we have today, the outlook ten years from now is that we would be leveraging input from hundreds of millions of specialists, always depending on the number of patients under consideration.
With Savana Research, we speed up the research process by up to 15 times, enabling doctors to focus on interpreting information, rather than extracting it.
The Savana project has an almost universal potential impact, as it can be used at any point of healthcare delivery. It is known that technologies related to Internet access and EHRs are growing exponentially, and therefore they will become globally available in a few years to the majority of the population.
References
[1] Dawes M and Sampson U. Knowledge management in clinical practice: a systematic review of information seeking behavior in physicians. International Journal of Medical Informatics. 2003; 71(1): 9-15.
[2] Bravo R. La gestión del conocimiento en medicina: a la búsqueda de la información perdida. Anales del Sistema Sanitario de Navarra (Vol. 25, No. 3, pp. 255-272).
[3] Gonzalez-Gonzalez AI, Escortell Mayor E, Hernandez Fernandez T, Sanchez Mateos JF, Sanz Cuesta T and Riesgo Fuertes R. Necesidades de información de los médicos de atención primaria: análisis de preguntas y su resolución. Atención Primaria. 2005; 35(8): 419-22.
[4] Lopez-Torres Hidalgo J. Hábitos de lectura de revistas científicas en los médicos de Atención Primaria. Atención Primaria. 2011; 43(12): 636-37.
[5] Brassey J, Elwyn G, Price C and Kinnersley P. Just in time information for clinicians: a questionnaire evaluation of the ATTRACT project. BMJ. 2001; 322: 529–30.
[6] Ferrucci D, Levas A, Bagchi S, Gondek D and Mueller ET. Watson: Beyond Jeopardy! Artificial Intelligence. 2013; 199–200: 93–105.
[7] Louro Gonzalez A, Fernandez Obanza E, Fernandez López E, Vazquez Millan P, Villegas González L and Casariego Vales E. Análisis de las dudas de los médicos de atención primaria. Atención Primaria. 41(11): 592-597.
[8] Weiskopf NG, Hripcsak G, Swaminathan S and Weng C. Defining and measuring completeness of electronic health records for secondary use. Journal of Biomedical Informatics. 2013; 46(5): 830–6.
[9] Geissbuhler A, Haux R and Kulikowski C. Electronic patient records: some answers to the data representation and reuse challenges. Findings from the section on patient records editors. IMIA Yearbook of Medical Informatics 2007. Methods Inf Med. 2007; 46(1): 47-9.
[10] Espinosa-Anke L, Tello J, Pardo A, Medrano I, Ureña A, Salcedo I and Saggion H. Savana: un entorno integral de extracción de información y expansión de terminologías en el dominio de la Medicina. Procesamiento del Lenguaje Natural. 2016; 57: 23-30.
[11] Mikolov T, Sutskever I, Chen K, Corrado G and Dean J. Distributed Representations of Words and Phrases and their Compositionality. In Proceedings of NIPS, 2013.
Jorge Tello received his Bachelor of Science and Master of Science in Industrial Engineering from the Universidad Pontificia de Comillas (ICAI) in 2006, where he also completed postgraduate studies in Project Management in 2011. Since 2014 he has been Founder and CTO of Savana. His research and work topics include biomedical data mining, Natural Language Processing and Machine Learning.
Ignacio Hernández Medrano is a neurologist at the Ramon y Cajal hospital. He has a long career in healthcare management, where he has coordinated teaching and the research strategy. He holds a Master’s degree in Healthcare Management, and a Master’s degree in R&D management in health sciences (Spanish National School for Healthcare-ISCIII). He teaches in areas related to innovation and digital health at postgraduate level, e.g. clinical research master’s courses, health management or MBAs. Ignacio received a degree from the Singularity University (NASA-Silicon Valley) in 2014 in entrepreneurship with exponential technologies, is a TED speaker and the CEO-founder of Savana, a startup focused on the application of AI to Electronic Health Records.
Horacio Saggion holds a PhD in Computer Science from Université de Montréal, Canada. He obtained his BSc in Computer Science from Universidad de Buenos Aires in Argentina, and his MSc in Computer Science from UNICAMP in Brazil. Horacio is an Associate Professor at the Department of Information and Communication Technologies, Universitat Pompeu Fabra (UPF), Barcelona. He is head of the Large Scale Text Understanding Systems Lab and a member of the Natural Language Processing group, where he works on automatic text summarization, text simplification, information extraction, sentiment analysis and related topics. His research is empirical, combining symbolic, pattern-based approaches with statistical and machine learning techniques. He is currently principal investigator for UPF in several EU and national projects. Horacio has published over 100 works in leading scientific journals, conferences, and books in the field of human language technology.
Luis Espinosa-Anke (Elche, Spain, 1983) received his BA in English Philology from the University of Alicante in 2006. He obtained an MA in English for Specific Purposes at the same institution, and a second MA in Natural Language Processing and Human Language Technologies in a joint Erasmus Mundus program provided by Universitat Autònoma de Barcelona (Spain) and the University of Wolverhampton (UK). His research interests lie in knowledge-based approaches for semantics and knowledge acquisition and modeling.
Ignacio Salcedo Ramos (1989, Cuenca, Spain) received his MSc in Computer Science from the Complutense University of Madrid in 2012. He is currently working as an R&D engineer at Savana. His research interests include NLP and Machine Learning.
Alberto Ureña was born in Madrid, Spain in 1989. He obtained his MSc (2012) in Computer Science from the Complutense University of Madrid. He is currently working at Savana, developing algorithms to extract information from medical records with the goal of improving health system efficiency and enabling future medical breakthroughs. His current interests include NLP and Machine Learning methods, as well as logic programming.
Cristóbal Belda is a medical oncologist and current CEO of HM Hospitales Foundation for Research, an organization involved in the assistance of more than 2 million patients every year all over Spain. He holds a PhD in Medicine from UAM and is a former CEO of the Spanish National School of Public Health at NIH “Carlos III”. He has developed his career in biomarkers of cancer and, more recently, in how health economics may help new biomedical advances to be implemented in real life, publishing more than 80 peer-reviewed, JCR-indexed international papers and international patents for new approaches to biomarker analysis, and leading more than 100 clinical trials, mainly in lung and brain cancer.