A Study into Math Document Classification using Deep Learning
Fatimah Alshamari and Abdou Youssef, Department of Computer Science, The George Washington University, Washington, D.C., USA
Document classification is a fundamental task for many applications, including document annotation, document understanding, and knowledge discovery. This is especially true in STEM fields, where the number of scientific publications is growing exponentially and where document processing and understanding are essential to technological advancement. Classifying a new publication into a specific domain based on its content is expensive in both cost and time. Therefore, there is high demand for a reliable document classification system. In this paper, we focus on the classification of mathematics documents, which consist of English text and mathematical formulas and symbols. The paper addresses two key questions. The first is whether math-document classification performance is affected by math expressions and symbols, either alone or in conjunction with the text contents of documents. Our investigations show that Text-Only embedding produces better classification results. The second is how to optimize a deep learning (DL) model, an LSTM combined with a one-dimensional CNN, for math document classification. We examine the model with several input representations, key design parameters, and decision choices, and identify the best input representation for math document classification.
Math, document, classification, deep learning, LSTM
Proposed Model for Enhancing Retrieving Process in Big Data Management
Ayman E. Khedr1, Mohamed Attia Mohamed2, Abdulwahab Ali Almazroi3, 1University of Jeddah, College of Computing and Information Technology at Khulais, Department of Information Systems, Jeddah, Saudi Arabia, 2Future University in Egypt, Egypt, 3University of Jeddah, College of Computing and Information Technology at Khulais, Department of Information Technology, Jeddah, Saudi Arabia
Internet activity is growing significantly, and the volume of data increases every second. Most organizations and individuals were unaware of this data explosion because the quantity of data grows continuously. Consequently, tools and methodologies for managing and controlling big data have become critical. One of the major issues to be tackled when working with big data is how to manage data effectively. Two main research directions address this issue. The first uses big data frameworks such as Hive and Pig Latin, while the second employs NoSQL data models such as key-value, graph, column, and document stores. In addition, unprecedented data volume and the complexity of managing data across complex multi-infrastructure environments further exacerbate the problem. This paper reviews representative techniques that deal with big data management challenges and then proposes a model for handling such issues.
Big data, NoSQL, Machine learning, JackHare, Hive
Experiments on NL2SQL Using SQLova, TaBERT and Lookahead Optimizer
Shubham V Chaudhari and Kameshwar Rao JV, HCL Technologies LTD, India
With the advancement of deep learning in NLP, there has been keen interest across academia and industry in converting natural language to SQL. Various models have been developed to address this problem, employing techniques such as reinforcement learning, seq-to-seq, and seq-to-set. We present an approach in which TaBERT and SQLova are combined. TaBERT, trained on structured text, improves over traditional BERT and thus better enhances the features of the input query and headers. The NL2SQL layer of SQLova is connected on top of TaBERT and further encodes the query and headers, enhancing the features further. The choice of optimizer plays a key role in improving the model’s results. The proposed architecture with the Lookahead optimizer improves the accuracy of where-num, where-col, and where-cond by 0.2%, 0.5%, and 0.4%, respectively.
nl2sql, deep neural networks, NLP
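The Lookahead optimizer named in the abstract above can be illustrated with a minimal sketch. This is not the authors' training code: it wraps plain gradient descent (an assumed inner optimizer) and minimizes a hypothetical two-parameter quadratic. The function names, the step size, and the values k = 5 and alpha = 0.5 are illustrative assumptions.

```python
def lookahead_minimize(grad, w0, inner_lr=0.1, k=5, alpha=0.5, outer_steps=50):
    """Lookahead wrapped around plain gradient descent (illustrative sketch)."""
    slow = list(w0)  # slow weights, updated once per outer step
    for _ in range(outer_steps):
        fast = list(slow)  # fast weights start from the slow weights
        for _ in range(k):
            g = grad(fast)
            fast = [w - inner_lr * gi for w, gi in zip(fast, g)]
        # Move the slow weights a fraction alpha toward the final fast weights.
        slow = [s + alpha * (f - s) for s, f in zip(slow, fast)]
    return slow


def quadratic_grad(w):
    """Gradient of f(w) = sum((w_i - t_i)^2) for a hypothetical target t."""
    target = [3.0, -2.0]
    return [2.0 * (wi - ti) for wi, ti in zip(w, target)]


weights = lookahead_minimize(quadratic_grad, [0.0, 0.0])
```

On this toy problem the slow weights converge to the target (3.0, -2.0). In a real NL2SQL model the inner optimizer would typically be Adam or a similar method rather than plain gradient descent.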
Blockchain-based Ticketing Solution for Collegiate Athletics
Zaki Zahed1, Matt Fitzgerald2, Ronald Sayles3, 1IT Engineering Department, Saudi Aramco, Dhahran, Saudi Arabia, 2TCP program, University of Colorado at Boulder, Boulder, Colorado, USA, 3TCP program, University of Colorado at Boulder, Boulder, Colorado, USA
This paper proposes an ecosystem for a Blockchain-Based Ticketing Solution for Collegiate Athletics. Utilizing technologies such as digital ledgers paired with cryptography, this paper constructs a theoretical implementation of secure digital ticketing. Four components essential to operation are identified: issuer, user, verifier, and DID (Decentralized Identifiers). The proposed solution begins with an authenticated University user. This user must grant the ticketing website access to the user's assigned University identifier through a QR code or login. This initial handshake is signed with the private keys of both the University and the user, and is confirmed by the ticketing website. A digital ticket to the event, signed with the website's private key, is then released to the user via a smart contract. The smart contract is then stored on the blockchain by the ticketing website. Upon arrival at the event, the user presents the digital ticket (QR code) signed with the private keys of the website and the user. Doing so confirms proof of identity through the authenticated University identifier and simultaneously executes the aforementioned smart contract. Once the ticket and the respective signatures are verified by the University's QR code scanner, the user is granted access to the event and the ticket can no longer be reused or resold.
Blockchain, Digital Identity, Digital Ticketing, Collegiate Athletics
Genetic Algorithm for Exam Timetabling Problem: A Specific Case for Japanese University Final Presentation Timetabling
Jiawei LI and Tad Gonsalves, Department of Information & Communication Sciences, Faculty of Science and Technology, Sophia University, Tokyo, Japan
This paper presents a Genetic Algorithm approach to a specific examination timetabling problem that is common in Japanese universities. The model is programmed in Excel VBA and can be run directly on Microsoft Office Excel worksheets. The model uses direct chromosome representation. To satisfy hard and soft constraints, a constraint-based initialization operation, a constraint-based crossover operation, and a penalty points system are implemented. To further improve the quality of the results, this paper introduces an improvement called initial population pre-training. The proposed model was tested on real data from Sophia University, Tokyo, Japan. The model produces acceptable results, and comparisons show that the initial population pre-training approach improves result quality.
Examination timetabling problem, Excel VBA, Direct chromosome representation, Genetic Algorithm Improvement
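The direct chromosome representation and penalty points system described in the abstract above can be sketched roughly as follows. This is an illustrative Python sketch, not the authors' Excel VBA model. The toy data, the penalty weights, and the specific constraints (no examiner in two presentations in the same slot, and a mild preference against the last slot) are assumptions, and random initialization with one-point crossover stands in for the paper's constraint-based operators.

```python
import random

# Hypothetical toy data: the examiners required at each presentation.
EXAMINERS = [
    {"A", "B"}, {"A", "C"}, {"B", "C"}, {"C", "D"}, {"A", "D"}, {"B", "D"},
]
NUM_SLOTS = 4
HARD_PENALTY = 100  # assumed points per examiner clash (hard constraint)
SOFT_PENALTY = 1    # assumed points per presentation in the last slot (soft constraint)


def penalty(chromosome):
    """Penalty points; chromosome[i] is the time slot of presentation i."""
    points = 0
    n = len(chromosome)
    for i in range(n):
        for j in range(i + 1, n):
            if chromosome[i] == chromosome[j]:
                # One clash per examiner who would be needed twice in this slot.
                points += HARD_PENALTY * len(EXAMINERS[i] & EXAMINERS[j])
        if chromosome[i] == NUM_SLOTS - 1:
            points += SOFT_PENALTY
    return points


def evolve(seed=0, pop_size=30, generations=200, mutation_rate=0.1):
    """Elitist GA: keep the better half, refill with mutated crossover children."""
    rng = random.Random(seed)
    n = len(EXAMINERS)
    population = [[rng.randrange(NUM_SLOTS) for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=penalty)
        if penalty(population[0]) == 0:
            break
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            for i in range(n):
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(NUM_SLOTS)
            children.append(child)
        population = parents + children
    return min(population, key=penalty)


best = evolve()
```

Because the better half of each generation is carried over unchanged, the best penalty never increases from one generation to the next. A pre-training step in the spirit of the paper could be approximated by seeding `population` with chromosomes from an earlier run, though that is not shown here.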
Real-time Emotion based Virtual Assistant
M.M.A Safnaj, E.A.S.Ahamed, B.K.S Geethmi, M.G.A.U Jayasooriya and Samantha Rajapaksha, Department of Information Technology, Sri Lanka Institute of Information Technology, New Kandy Road, Malabe, Sri Lanka
The objective of this project is to automate the interaction between users and a virtual assistant application on smartphones by reading the user’s emotion in real time. Human emotions play a major role in people’s day-to-day lives. Therefore, understanding the user’s emotional state enables efficient human-computer interaction and leads to emotion-aware applications. Existing virtual assistant applications such as Apple’s Siri, Google Assistant, and Amazon’s Alexa are capable of performing some tasks based on users’ verbal inputs. Advances in technologies such as Machine Learning and Artificial Intelligence have made ordinary applications smarter. The application tests live captured images and detects emotions using a Machine Learning model built with a Convolutional Neural Network, which helps achieve a high accuracy rate, and provides suggestions to the user based on the user’s current emotional state. The solution also focuses on content-based recommendation and user behavior analysis to provide more appropriate suggestions and tasks and to enhance the user experience. While this approach increases efficiency and user experience in the field of virtual assistants, our solution differs from other platforms in having a privacy-based emotion recognition API that does not need to store the user’s emotional pictures for prediction.
Real-Time Emotion Detection, Content-Based Recommendations, User Behaviour Analysis, Virtual Assistant.
Blockchain-based Distributed Data Integrity Auditing Scheme
Hui Li, Baofu Han and Chuansi Wei, School of Information Science and Engineering, Shenyang University of Technology, Liao Ning, China
Cloud storage technology enables users to outsource local data to a cloud service provider (CSP). In spite of its many advantages, ensuring the integrity of data has always been a significant issue. A variety of provable data possession (PDP) schemes have been proposed for cloud storage scenarios. However, the participation of a centralized trusted third-party auditor (TPA) in most previous work introduces new security risks, because the TPA is a single point of failure. Furthermore, existing schemes do not consider fair arbitration and lack an effective method to punish malicious behaviour. To address these challenges, we propose a novel blockchain-based decentralized data integrity auditing scheme that does not need a centralized TPA. By using smart contracts, our scheme supports an automatic compensation mechanism. The data owner (DO) and the CSP must first pay a certain amount of ether to the smart contract as a deposit. The CSP receives the corresponding storage fee if the integrity audit passes. Otherwise, the CSP not only receives no fee but must also compensate the DO whose data integrity was destroyed. Security analysis shows that the proposed scheme can resist a variety of attacks. We also implement our scheme on the Ethereum platform to demonstrate its efficiency and effectiveness.
Data integrity, Fair arbitration, Blockchain, Decentralization, Public auditing.
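The deposit and compensation rule described in the abstract above can be modelled with a small sketch. This is a plain Python simulation, not the authors' Ethereum contract. It assumes the storage fee is taken from the DO's deposit and the compensation from the CSP's deposit; the function name, parameters, and amounts are hypothetical.

```python
def settle(audit_passed, do_deposit, csp_deposit, storage_fee, compensation):
    """Return the payout to each party after one audit round (illustrative)."""
    # Assumption: each deposit must cover what its owner may owe.
    if storage_fee > do_deposit or compensation > csp_deposit:
        raise ValueError("deposits must cover the fee and the compensation")
    if audit_passed:
        # Audit passed: the CSP earns the storage fee out of the DO's deposit.
        return {"DO": do_deposit - storage_fee, "CSP": csp_deposit + storage_fee}
    # Audit failed: the CSP earns no fee and compensates the DO from its deposit.
    return {"DO": do_deposit + compensation, "CSP": csp_deposit - compensation}


passed = settle(True, do_deposit=10, csp_deposit=20, storage_fee=4, compensation=15)
failed = settle(False, do_deposit=10, csp_deposit=20, storage_fee=4, compensation=15)
```

In both branches the total paid out equals the total deposited, which is the property a contract holding both deposits would need to preserve.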
Geothermal Energy for Refrigeration and Air Conditioning, Sustainable Development, and the Environment
A.M. Omer, Energy Research Institute (ERI), Nottingham NG7 4EU, United Kingdom
Geothermal heat pumps (GSHPs), or direct expansion (DX) ground source heat pumps, are a highly efficient renewable energy technology that uses the earth, groundwater, or surface water as a heat source when operating in heating mode or as a heat sink when operating in cooling mode. The technology is receiving increasing interest because of its potential to decrease primary energy consumption and thus reduce emissions of greenhouse gases (GHGs). Its main concept is that it uses the lower temperature of the ground (below approximately 32°C), which remains relatively stable throughout the year, to provide space heating, cooling, and domestic hot water inside the building area. The main goal of this study was to stimulate the uptake of GSHPs. Recent attempts to stimulate alternative energy sources for heating and cooling of buildings have emphasised the utilisation of ambient energy from the ground and other renewable energy sources. The purpose of this study, however, was to examine the means of reducing energy consumption in buildings, identifying GSHPs as an environmentally friendly technology able to provide efficient utilisation of energy in the buildings sector, promoting the use of GSHP applications as an optimum means of heating and cooling, and presenting typical applications and recent advances of DX GSHPs. The study highlighted the potential energy saving that could be achieved through the use of ground energy sources. It also focused on the optimisation and improvement of the operating conditions of the heat cycle and the performance of the DX GSHP. It is concluded that the DX GSHP, combined with a ground heat exchanger in foundation piles and seasonal thermal energy storage from solar thermal collectors, is extendable to more comprehensive applications.
Geothermal heat pumps, direct expansion, ground heat exchanger, heating and cooling
Digital Wireless Mini Plant Temperature Sensor
Grishin Alexander, Grishin Andrey, Semenova Natalia, Grishin Vladimir and Dorokhov Alexey, Federal Scientific Agroengineering Centre VIM, Ryazan Avenue, Moscow, Russia
Among the basic components of digital agricultural technologies, sensors occupy a special place as the source of primary information about the physiological status of plants. The authors developed and practically tested a wireless mini leaf sensor for digital assessment of the thermoregulatory processes that characterize plant productivity factors. The scientific and practical value of the developed sensor is confirmed by experimental verification. The study proved the existence of a cooperative (synergetic) relationship, which usually characterizes a self-organization process, between the order parameter (the heating of the leaves under the influence of thermal exergy) and the control parameter (the cooling thermoregulation). It was established that the thermoregulation process, by reducing the influence of limiting factors, favourably affects plant productivity, and that the fractal dimension of the thermoregulation time series can be used as an indicator of plant production processes. The effectiveness of using the thermoregulation sensor to analyse plant moisture losses has been proved theoretically and practically, as has the ability to regulate the nutrient solution supply based on the data obtained. This allows us to recommend the sensor for use in closed artificial agroecosystems. The data obtained from the sensor can be used both for regulating irrigation and drainage regimes and for scientific research.
Digital Technologies, Leaf Sensor, Plant Productivity, Thermoregulation, Closed Artificial Agroecosystems.
How to Engage Followers: Classifying Fashion Brands According to their Instagram Profiles, Posts and Comments
Stefanie Scholz1 and Christian Winkler2, 1Department of Social Economy, Wilhelm Loehe University of Applied Sciences, Fuerth, Germany, 2datanizing GmbH, Schwarzenbruck, Germany
In this article we show how fashion brands communicate with their followers on Instagram. We use a continuously updated dataset of 68 brands, more than 300,000 posts, and more than 40,000,000 comments. Starting with descriptive statistics, we uncover differences in the behavior and success of the various brands. It turns out that there are patterns specific to luxury, mass-market, and sportswear brands. Posting volume is extremely brand dependent, as are the number of comments and the engagement of the community. Having understood the statistics, we turn to machine learning techniques to measure the response of the community via comments. Topic models help us understand the structure of each brand's community and uncover insights regarding the response to campaigns. Having up-to-date content is essential for this kind of analysis, as the market is highly volatile. Furthermore, automatic data analysis is crucial to measure the success of campaigns and adjust them accordingly for maximum effect.
Instagram, Fashion Brands, Data Extraction, Marketing, Analysis, Artificial Intelligence, Netnography, Descriptive Statistics, Visualization, Community Engagement, Unsupervised Learning, Topic Modelling.
A Research on Client-side-based Web Attack Response using Ensemble Model
Hyeongmin Kim1, Suhyeon Oh1, Yerin Im1, Hyeonseong Jeong1, Jiwon Hong1, Jaehyeon Cho1, Hyeonmin Kim2, Kyounggon Kim3, 1Best of the Best, Korea Information Technology Research Institute, 2Financial Security Institute, 3Naif Arab University for Security Sciences
AI, Machine Learning, Deep Learning, Client-Side, Ensemble technique.
Regularization Method for Rule Reduction in Belief Rule-based System
Yu Guan and Yanggeng Fu, College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China
The belief rule-based inference system introduces a belief distribution structure into the conventional rule-based system, which allows it to synthesize incomplete and fuzzy information effectively. To optimize reasoning efficiency and reduce redundant rules, this paper proposes a rule reduction method based on regularization. The method controls the distribution of rules by setting corresponding regularization penalties in different learning steps. The paper first proposes using the Gaussian membership function to optimize the structure and activation process of the belief rule base, together with the corresponding regularization penalty construction method. Then, a step-by-step training method sets a different objective function for each step to control the distribution of belief rules, and a reduction threshold is set according to the distribution information of the belief rule base to perform rule reduction. Two experiments, based on a synthetic classification data set and a benchmark classification data set, are conducted to verify the performance of the reduced belief rule base.
Knowledge-based system, Belief rule base, Regularization method, Rule reduction.
Machine Learning Algorithm for NLOS Millimeter Wave in 5G V2X Communication
Deepika Mohan1, Peter Han Joo Chong1 and G.G. Md. Nawaz2, 1Department of Electrical and Electronics Engineering, Auckland University of Technology, Auckland, New Zealand, 2Department of Applied Computer Science, University of Charleston, WV 25304, USA
The 5G vehicle-to-everything (V2X) communication for autonomous and semi-autonomous driving relies on wireless technology, and millimeter wave (mmWave) bands are widely used in this kind of vehicular network application. The main purpose of this paper is to broadcast messages from the mmWave base station (mmBS) to vehicles in line-of-sight (LOS) and non-line-of-sight (NLOS) conditions. A Relay using Machine Learning (RML) algorithm is formulated to train the mmBS to identify blockages within its coverage area and to broadcast messages to vehicles at NLOS using LOS nodes as relays. Transmission is faster with higher throughput, and it covers a wider bandwidth that is reused; therefore, when machine learning is performed within the coverage area of the mmBS, most vehicles at NLOS can benefit. A unique relay mechanism combined with machine learning is proposed to communicate with mobile nodes at NLOS.
5G, Millimeter Wave, Machine Learning, Relay, V2X communication.
Secure Handover Protocol for 5G and Beyond Networks
Vincent Omollo Nyangaresi1 and Anthony Joachim Rodriguez2, 1Faculty of Biological & Physical Sciences, Tom Mboya University College, Homabay, Kenya, 2School of Informatics & Innovative Systems, JOOUST, Kisumu, Kenya
Technical network challenges in 5G relate to handover authentication, user privacy protection, and resource management. Due to interoperability requirements among heterogeneous networks (Hetnets), the security requirements for 5G are high compared to 2G, 3G, and 4G. Current 5G handover protocols are based on fuzzy logic (FL), artificial neural networks (ANN), blockchain, software defined networks (SDN), or Multi-layer Feed Forward Networks (MFNN). These protocols either have long latencies or focus on only security or only quality of service (QoS) parameters such as user satisfaction. The use of these inefficient authentication schemes during 5G handovers leads to performance degradation in heterogeneous cells and increases delay. In addition, 5G networks experience frequent handover failures and increased handover delays. Consequently, strong security, privacy, and low-latency handovers are required for the successful deployment of 5G networks such as 5G wireless local area network (5G-WLAN) heterogeneous networks. These new requirements, coupled with demands for higher scalability, reliability, security, data rates, QoS, and support for the internet of everything (IoE), have driven the shift from 5G to beyond 5G (B5G). However, 5G and B5G are incapable of meeting the complete requirements of IoE, such as enhanced security and QoS. This paper sought to develop an ANN-FL protocol that addresses both security and QoS in 5G and B5G networks. The simulation results showed that the developed protocol was robust against attacks such as de-synchronization and tracing attacks and yielded a 27.1% increase in handover success rate, a 27.3% reduction in handover failure rate, and a 24.1% reduction in ping pong handovers.
5G, Hetnets, authentication, ping pong rate, handover success rate, handover failure rates.
Medical Image Zero-Watermarking Based on Quasi-Sparse Matrix to Localize and Recover Tampered Zones
Lamri LAOUAMER1 and Adel ALTI2, 1Department of Management Information Systems & Production Management College of Business & Economics, Qassim University P.O. Box 6633, Buraidah, 51452, KSA, 2LRSD Lab, Computer Science Department, University of SETIF-1, Sétif 19000, Algeria
Medical images exchanged through various communication networks require particular attention, especially to secure their contents against any kind of illegitimate manipulation. These images are becoming increasingly vulnerable to attacks, which presents a serious issue in telemedicine and real-time e-healthcare applications. Data security encompasses several aspects, such as tamper detection and recovery, integrity, authenticity, and confidentiality. In this paper, we present a zero-watermarking approach based on a quasi-sparse matrix to extract the region of interest, allowing the tampered zones to be detected and the attacked zones to be recovered to their original states. The approach is based on extracting the quasi-sparse matrix from the host image to select only the Regions of Interest (ROI) and to define the zero watermark. Only least significant bit (LSB) and most significant bit (MSB) attacks are applied to evaluate the efficiency of the proposed approach. This approach has low complexity since it considers only non-zero pixels, which minimizes the computation time. We detail the results obtained using the most widespread metrics in the literature to evaluate the performance of the proposed approach.
medical image, Quasi-Sparse matrix, zero-watermarking, tamper detection and recovery.
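The LSB and MSB attacks mentioned in the abstract above, and the idea of working only on non-zero pixels, can be illustrated with a small sketch on 8-bit grayscale values. This is not the authors' scheme: as a simplifying assumption, the stored non-zero pixel values themselves stand in for the zero watermark, and all function names and the toy image are hypothetical.

```python
def lsb_attack(pixel):
    """Flip the least significant bit of an 8-bit pixel value."""
    return pixel ^ 0x01


def msb_attack(pixel):
    """Flip the most significant bit of an 8-bit pixel value."""
    return pixel ^ 0x80


def nonzero_reference(image):
    """Map (row, col) -> value for non-zero pixels only (sparse reference)."""
    return {(r, c): v
            for r, row in enumerate(image)
            for c, v in enumerate(row) if v != 0}


def localize(reference, received):
    """Positions whose received value differs from the stored reference."""
    return sorted(pos for pos, v in reference.items()
                  if received[pos[0]][pos[1]] != v)


def recover(reference, received):
    """Copy of the received image with reference positions restored."""
    restored = [list(row) for row in received]
    for (r, c), v in reference.items():
        restored[r][c] = v
    return restored


host = [[0, 12, 0], [200, 0, 7]]  # hypothetical toy image
reference = nonzero_reference(host)
attacked = [list(row) for row in host]
attacked[0][1] = lsb_attack(attacked[0][1])  # 12 -> 13
attacked[1][0] = msb_attack(attacked[1][0])  # 200 -> 72
```

Here `localize(reference, attacked)` reports the two modified positions and `recover(reference, attacked)` restores the host image. Only the positions stored in the reference are checked, so a change to a pixel that was zero in the host would go unnoticed in this simplified version.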