A CASE STUDY OF PROCESS ENGINEERING OF OPERATIONS IN WORKING SITES THROUGH DATA MINING AND AUGMENTED REALITY
Alessandro Massaro1, Angelo Galiano1, Antonio Mustich1, Daniele Convertini1, Vincenzo Maritati1, Antonia Colonna1, Nicola Savino1, Angela Pace2, Leo Iaquinta2
1Dyrecta Lab, Via Vescovo Simplicio, Italy
2SO.CO.IN. SYSTEM srl, Italy
This paper analyzes the design of a software platform for a case study of process engineering that combines data digitization, Data Mining (DM) processing, and Augmented Reality (AR). Specifically, it discusses a platform design able to upgrade the Knowledge Base (KB), enabling production process optimization in working sites. The KB is built by following the 'Frascati' research guidelines, which address the possible ways to achieve Knowledge Gain (KG). Technologies such as AR and a data-entry mobile app are tailored so that innovative data mining algorithms can be applied. The first part of the paper comments on the preliminary project specifications; the second part presents the use cases, the Unified Modeling Language (UML) models, and the mobile app mockups enabling KG. The proposed work discusses preliminary results of an industry project.
Frascati Guideline, Knowledge Base Gain, Data Mining, Augmented Reality
For More Details :
http://aircconline.com/ijdkp/V9N5/9519ijdkp01.pdf
APPLICATION OF DATA MINING TECHNIQUE TO PREDICT LANDSLIDES IN SRI LANKA
Karunanayake K.B.A.A.M and Wijayanayake W.M.J.I, University of Kelaniya, Sri Lanka
Landslides are the major natural disaster in the hill country of Sri Lanka and cause severe economic and ecological damage, so fast detection is important. Currently, landslides in Sri Lanka are predicted using a map-reading approach. However, a map is limited to a specific point in time and does not take current conditions into account. It is therefore important to develop a model or tool that can deal efficiently with the current situation. In this study, prediction models were developed using the Decision Tree and Neural Network data mining techniques, based on data from the Badulla and Nuwara Eliya districts. The selected Decision Tree model has an accuracy of 96.2963% for the Badulla district and 100% for the Nuwara Eliya district. Although the Decision Tree models performed better, the Neural Network models also achieved accuracy above 90%. It can therefore be concluded that both data mining techniques are suitable for developing landslide prediction models for Sri Lanka.
Landslide, Data mining, Predictive analysis, Plan-Do-Check-Act, Decision tree
For More Details :
http://aircconline.com/ijdkp/V9N4/9419ijdkp04.pdf
ACCESS AND CONNECTION VIA TECH DATA AS AN ENABLER OF A THIN OR NONEXISTENT MARKET
Bathabile S C Amirchand, Founder, Gropeedy App, South Africa
This study provides a prototype of a real estate listing mobile application that can organize, store, maintain and search data from a mobile device running Android or iOS. The application helps household owners list their properties at no cost. The system comprises a mobile app, a central database, a satellite database and an offline database system. It consists of software that connects one or more servers to the computer as well as to the mobile user, making it available both as a mobile app and through a web browser. The Apple Mac OS X operating system can also be used to build this system. This suite of tools includes graphical user interface (GUI) based applications, command-line tools, and documentation to help in the software development process. This will pave the way to formalize the much-neglected Former Homelands (Village Sector) and facilitate the development of inclusive real estate evaluation data, thus enabling access to areas where there has been no base price until now.
Data, Application, Real-estate, Technology, Connection and System
For More Details :
http://aircconline.com/ijdkp/V9N5/9519ijdkp02.pdf
COST-SENSITIVE TOPICAL DATA ACQUISITION FROM THE WEB
Mahdi Naghibi1, Reza Anvari1, Ali Forghani1 and Behrouz 2, 1Malek-Ashtar University of Technology, Iran, 2Iran University of Science and Technology, Iran
The cost of acquiring training data instances for the induction of data mining models is one of the main concerns in real-world problems. The web is a comprehensive source for many types of data that can be used for data mining tasks, but the distributed and dynamic nature of the web dictates the use of solutions that can handle these characteristics. In this paper, we introduce an automatic method for topical data acquisition from the web. We propose a new type of topical crawler that uses a hybrid link context extraction method to acquire on-topic web pages with minimum bandwidth usage and at the lowest cost. The new link context extraction method, called Block Text Window (BTW), combines a text window method with a block-based method and overcomes the challenges of each method using the advantages of the other. Experimental results show that BTW outperforms state-of-the-art automatic topical web data acquisition methods on standard metrics.
Cost-Sensitive Learning, Data acquisition, Topical Crawler, Link Context, Web Data Mining
For More Details :
http://aircconline.com/ijdkp/V9N3/9319ijdkp04.pdf
CATEGORIZATION OF FACTORS AFFECTING CLASSIFICATION ALGORITHMS SELECTION
Mariam Moustafa Reda, Mohammad Nassef and Akram Salah, Cairo University, Egypt
Many classification algorithms are available in the area of data mining for solving the same kind of problem, with little guidance for recommending the most appropriate algorithm, that is, the one that gives the best results for the dataset at hand. As a way of optimizing the chances of recommending the most appropriate classification algorithm for a dataset, this paper focuses on the different factors considered by data miners and researchers in different studies when selecting the classification algorithms that will yield the desired knowledge for the dataset at hand. The paper divides the factors affecting classification algorithm recommendation into business and technical factors. The proposed technical factors are measurable and can be exploited by recommendation software tools.
Classification, Algorithm selection, Factors, Meta-learning, Landmarking
For More Details :
http://aircconline.com/ijdkp/V9N4/9419ijdkp01.pdf
IMPLEMENTATION OF RISK ANALYZER MODEL FOR UNDERTAKING THE RISK ANALYSIS OF PROPOSED BUILDING PROJECTS FOR A SELECTED CLIENT
Ibrahim Yakubu, Balewa University, Nigeria
The RISK ANALYZER model was implemented as a Knowledge-Based System for the purpose of undertaking risk analysis for proposed construction projects in a selected domain. The Fuzzy Decision Variables (FDVs) that cause differences between the initial and final contract sums of building projects were identified, the likelihood of occurrence of the risks was determined, and a Knowledge-Based System that ranks the risks was constructed using the JAVA programming language and a Graphic User Interface. The Knowledge-Based System is composed of a Knowledge Base for storing data, an Inference Engine for controlling and directing the use of knowledge for problem solving, and a User Interface that helps the user retrieve, use and alter data in the Knowledge Base. The developed Knowledge-Based System was compiled, implemented and validated with data from previously completed projects. The client could use the Knowledge-Based System to undertake risk analysis of proposed building projects.
RISK ANALYZER, Risk analysis, Knowledge-Based Systems, JAVA, Graphic User Interface
For More Details :
http://aircconline.com/ijdkp/V9N4/9419ijdkp03.pdf
SENTIMENT ANALYSIS FOR MOVIES REVIEWS DATASET USING DEEP LEARNING MODELS
Nehal Mohamed Ali, Marwa Mostafa Abd El Hamid and Aliaa Youssif, Arab Academy for Science Technology and Maritime, Egypt
Due to the enormous amount of data and opinions produced, shared and transferred every day across the internet and other media, sentiment analysis has become vital for developing opinion mining systems. This paper presents a sentiment classification approach based on deep learning networks and compares the results of different networks. A Multilayer Perceptron (MLP) was developed as a baseline for the other networks. A Long Short-Term Memory (LSTM) recurrent neural network, a Convolutional Neural Network (CNN) and a hybrid model of LSTM and CNN were developed and applied to the IMDB dataset, which consists of 50K movie review files. The dataset was divided into 50% positive reviews and 50% negative reviews. The data was initially pre-processed using Word2Vec, and word embedding was applied accordingly. The results show that the hybrid CNN_LSTM model outperformed the MLP and the standalone CNN and LSTM networks: CNN_LSTM reported an accuracy of 89.2%, CNN 87.7%, and MLP and LSTM 86.74% and 86.64% respectively. Moreover, the results show that the proposed deep learning models also outperformed the SVM, Naïve Bayes and RNTN results published in other works using English datasets.
Deep learning, LSTM, CNN, Sentiment Analysis, Movies Reviews, Binary Classification
For More Details :
http://aircconline.com/ijdkp/V9N3/9319ijdkp02.pdf
DEEP LEARNING BASED MULTIPLE REGRESSION TO PREDICT TOTAL COLUMN WATER VAPOR (TCWV) FROM PHYSICAL PARAMETERS IN WEST AFRICA BY USING KERAS LIBRARY
Daouda DIOUF1, Awa Niang1 and Sylvie Thiria2, 1Université Cheikh Anta Diop, Sénégal, 2Université Pierre et Marie Curie, France
Total column water vapor (TCWV) is an important factor for weather and climate. This study applies deep learning based multiple regression to map the TCWV with elements that can improve spatiotemporal prediction. We predict the TCWV using ERA5, the fifth-generation ECMWF atmospheric reanalysis of the global climate. We use a deep learning based multiple regression algorithm built with the Keras library to improve nonlinear prediction between TCWV and the following predictors: mean sea level pressure, surface pressure, sea surface temperature, 100 metre U wind component, 100 metre V wind component, 10 metre U wind component, 10 metre V wind component, 2 metre dew point temperature, and 2 metre temperature. The results make it possible to build a predictor that models TCWV with a mean absolute error (MAE) of 3.60 kg/m² and a coefficient of determination R² of 0.90.
For More Details :
http://aircconline.com/ijdkp/V9N6/9619ijdkp02.pdf
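The two scores quoted in the TCWV abstract above, MAE and R², can be illustrated with a minimal sketch. The sample values below are hypothetical and are not taken from the paper or from ERA5; the function names are chosen for illustration, and the definitions used are the standard ones (MAE as the mean of absolute errors, R² as 1 minus the ratio of residual to total sum of squares).

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the observed values are constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical TCWV values in kg/m^2, for illustration only.
observed = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 27.0, 41.0, 52.0]

mae = mean_absolute_error(observed, predicted)
r2 = r_squared(observed, predicted)
print(f"MAE = {mae:.2f} kg/m^2, R^2 = {r2:.3f}")
```

MAE is expressed in the same unit as the target (kg/m² here), whereas R² is unitless, which is why the abstract reports both.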
INSOLVENCY PREDICTION ANALYSIS OF ITALIAN SMALL FIRMS BY DEEP LEARNING
Agostino Di Ciaccio1 and Giovanni Cialone2, 1University of Rome, Italy, 2Senior partner of Kairos Advisory srl., Italy
To improve credit risk management, there is a lot of interest in bankruptcy predictive models. Academic research has mainly used traditional statistical techniques, but interest in the capability of machine learning methods is growing. This Italian case study pursues the goal of developing a commercial firms insolvency prediction model. In compliance with the Basel II Accords, the major objective of the model is an estimation of the probability of default over a given time horizon, typically one year. The collected dataset consists of absolute values as well as financial ratios collected from the balance sheets of 14,966 Italian micro-small firms, 13,846 ongoing and 1,120 bankrupted, with 82 observed variables. The volume of data processed places the research on a scale like that used by Moody's in the development of its rating model for public and private companies, RiskCalc™. The study was conducted using Gradient Boosting, Random Forests, Logistic Regression and some deep learning techniques: Convolutional Neural Networks and Recurrent Neural Networks. The results were compared with respect to predictive performance on a test set, considering accuracy, sensitivity and AUC. The results show that the choice of variables was very effective, since all the models perform well, and better than those obtained in previous works. Gradient Boosting was the preferred model, although an increase in observation times would probably favour Recurrent Neural Networks.
Credit risk, Bankruptcy prediction, Deep learning
For More Details :
http://aircconline.com/ijdkp/V9N6/9619ijdkp01.pdf
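The class counts given in the insolvency abstract above suggest why sensitivity is reported alongside accuracy. The sketch below is an illustration only: it applies a trivial, hypothetical baseline (label every firm as ongoing) to the full dataset counts from the abstract, whereas the paper itself evaluates its models on a test set whose size is not stated here.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all firms that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(tp: int, fn: int) -> float:
    """Share of bankrupt firms that were correctly flagged (true positive rate)."""
    return tp / (tp + fn)


# Class counts quoted in the abstract.
ONGOING = 13_846
BANKRUPT = 1_120

# Hypothetical baseline: predict "ongoing" for every firm, so no bankruptcy
# is ever flagged. "Positive" means bankrupt.
tp, fn = 0, BANKRUPT
tn, fp = ONGOING, 0

baseline_accuracy = accuracy(tp, tn, fp, fn)
baseline_sensitivity = sensitivity(tp, fn)
print(f"accuracy = {baseline_accuracy:.4f}, sensitivity = {baseline_sensitivity:.4f}")
```

Because bankrupt firms are a small minority, this baseline reaches high accuracy while detecting no bankruptcies at all, so accuracy on its own says little about a model for this kind of data.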
A BUSINESS INTELLIGENCE PLATFORM IMPLEMENTED IN A BIG DATA SYSTEM EMBEDDING DATA MINING: A CASE OF STUDY
Alessandro Massaro, Valeria Vitti, Palo Lisco, Angelo Galiano and Nicola Savino, Dyrecta Lab, IT Research Laboratory, Italy
This work discusses a case study of a business intelligence (BI) platform developed within the framework of an industry project by following the 'Frascati' research and development (R&D) guidelines. The proposed results are part of the output of several joint projects enabling BI for ACI Global, a company working mainly in roadside assistance services. The main project goal is to upgrade the information system, the knowledge base (KB) and the industry processes by activating data mining algorithms and big data systems able to provide a gain of knowledge. The proposed work concerns the development of a high-performance Cassandra big data system collecting data from two industry locations. Data are processed by data mining algorithms in order to formulate a decision-making system oriented toward call center human resources optimization and customer service improvement. Correlation Matrix, Decision Tree and Random Forest Decision Tree algorithms were applied to test the prototype system, and the output solutions showed good accuracy. The RapidMiner tool was adopted for the data processing. The work describes all the system architectures adopted for the design and testing phases, providing information about Cassandra performance and showing some results of data mining processes that match the industry BI strategies.
Big Data Systems, Cassandra Big Data, Data Mining, Correlation Matrix, Decision Tree, Frascati Guideline
For More Details :
http://aircconline.com/ijdkp/V9N1/9119ijdkp01.pdf
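The correlation matrix step named in the business intelligence abstract above can be sketched in plain Python. This is not the RapidMiner operator used in the paper; it is a minimal Pearson correlation, and the column names and values are hypothetical call-center figures invented for illustration.

```python
import math
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient between two equally long columns."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("columns must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / (spread_x * spread_y)


def correlation_matrix(columns: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Pairwise Pearson correlations for every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical call-center measurements, for illustration only.
data = {
    "calls_per_hour": [10.0, 20.0, 30.0, 40.0],
    "operators_on_shift": [2.0, 4.0, 6.0, 8.0],
    "avg_wait_minutes": [8.0, 6.0, 4.0, 2.0],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

Values near +1 or -1 indicate attributes that move together or in opposition, which is the kind of signal a decision-making system for staffing could start from.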