About the Journal
Journal on Big Data was launched at a time when the engineering features of big data are driving a surge of exploration in algorithms, raising new challenges for big data, and spurring integration with industrial development. Novel paradigms in this cross-disciplinary field need to be constructed by translating complex, innovative ideas from various fields.
Indexing and Abstracting
Starting from July 2023, Journal on Big Data will transition to a continuous publication model: accepted articles will be promptly published online upon completion of the peer review and production processes.
-
Open Access
REVIEW
An Overview of ETL Techniques, Tools, Processes and Evaluations in Data Warehousing
Journal on Big Data, Vol.6, pp. 1-20, 2024, DOI:10.32604/jbd.2023.046223 - 26 January 2024
Abstract The extraction, transformation, and loading (ETL) process is a crucial and intricate area of study that lies deep within the broad field of data warehousing. This specific, yet crucial, aspect of data management fills the knowledge gap between unprocessed data and useful insights. Starting with basic information unique to this complex field, this study thoroughly examines the many issues that practitioners encounter. These issues include the complexities of ETL procedures, the rigorous pursuit of data quality, and the increasing amounts and variety of data sources present in the modern data environment. The study examines ETL…
-
Open Access
ARTICLE
A Survey on Methods and Applications of Intelligent Market Basket Analysis Based on Association Rule
Journal on Big Data, Vol.4, No.1, pp. 1-25, 2022, DOI:10.32604/jbd.2022.021744
Abstract Market trends have changed rapidly over the last two decades. The primary reason is the newly created opportunities and the increased number of competitors competing to grasp market share using business analysis techniques. Market Basket Analysis (MBA) has a tangible effect in facilitating the current change in the market. It is one of the well-known fields that deal with Big Data and Data Mining applications. MBA initially uses Association Rule Learning (ARL) as a means of realization. ARL is beneficial in analyzing the market data and understanding customers’…
-
Open Access
ARTICLE
Chinese News Text Classification Based on Convolutional Neural Network
Journal on Big Data, Vol.4, No.1, pp. 41-60, 2022, DOI:10.32604/jbd.2022.027717
Abstract With the explosive growth of Internet text information, the task of text classification has become more important. As a part of text classification, Chinese news text classification also plays an important role. In public security work, public opinion news classification is an important topic. Effective and accurate classification of public opinion news is a necessary prerequisite for relevant departments to grasp the situation of public opinion and control the trend of public opinion in time. This paper introduces a combined-convolutional neural network text classification model based on word2vec and improved TF-IDF: firstly, the word vector is…
-
Open Access
ARTICLE
A Noise Extraction Method for Cryo-EM Single-Particle Denoising
Journal on Big Data, Vol.4, No.1, pp. 61-76, 2022, DOI:10.32604/jbd.2022.028078
Abstract Cryo-Electron Microscopy (cryo-EM) has become a powerful method to study the structure and function of biological macromolecules. However, in clustering tasks based on the projection angle of particles in cryo-EM, the noise considerably affects the clustering results. Existing denoising algorithms are ineffective due to the extremely low signal-to-noise ratio (SNR) of cryo-EM images and the complexity of noise types. The noise of a single particle greatly influences the orientation estimation of the subsequent clustering task, and the result of the clustering task directly affects the accuracy of the 3D reconstruction. In this paper, we propose…
-
Open Access
ARTICLE
Research and Practice of Telecommunication User Rating Method Based on Machine Learning
Journal on Big Data, Vol.4, No.1, pp. 27-39, 2022, DOI:10.32604/jbd.2022.026850
Abstract The machine learning model has advantages in multi-category credit rating classification. It can replace discriminant analysis based on statistical methods, greatly helping credit rating reduce human interference and improve rating efficiency. Therefore, we use a variety of machine learning algorithms to study the credit rating of telecom users. This paper conducts data understanding and preprocessing on Operator Telecom user data, and matches the user’s characteristics and tags based on the time sliding window method. In order to deal with the deviation caused by the imbalance of multi-category data, the SMOTE oversampling method is used to…
-
Open Access
ARTICLE
Restoration of Wind Speed in Qinzhou, Guangxi during Typhoon Rammasun
Journal on Big Data, Vol.4, No.1, pp. 77-86, 2022, DOI:10.32604/jbd.2022.027477
Abstract In 2014, Typhoon Rammasun struck Qinzhou, Guangxi, damaging the wind tower sensor at 80 m in Qinzhou. To restore the wind speed at 80 m at that time, this paper used the hourly average wind speed data of the wind tower and meteorological station from 2017–2019 to construct a model relating the wind speeds of the meteorological station and the wind measuring tower in Qinzhou. Moreover, based on the hourly average wind speed data of Qinzhou Meteorological Station in 2014, this paper restored the hourly average wind speed of the anemometer tower during…
-
Open Access
ARTICLE
A Survey of Machine Learning for Big Data Processing
Journal on Big Data, Vol.4, No.2, pp. 97-111, 2022, DOI:10.32604/jbd.2022.028363
Abstract Today’s world is a data-driven one, with data being produced in vast amounts as a result of the rapid growth of technology that permeates every aspect of our lives. New data processing techniques must be developed and refined over time to gain meaningful insights from this vast continuous volume of produced data in various forms. Machine learning technologies provide promising solutions and potential methods for processing large quantities of data and gaining value from it. This study conducts a literature review on the application of machine learning techniques in big data processing. It provides a…
-
Open Access
ARTICLE
Design of a Web Crawler for Water Quality Monitoring Data and Data Visualization
Journal on Big Data, Vol.4, No.2, pp. 135-143, 2022, DOI:10.32604/jbd.2022.031024
Abstract Many countries are paying more and more attention to the protection of water resources at present, and how to protect water resources has received extensive attention from society. Water quality monitoring is key to water resources protection. How to efficiently collect and analyze water quality monitoring data is an important aspect of water resources protection. In this paper, Python programming tools and regular expressions were used to design a web crawler for the acquisition of water quality monitoring data from Global Freshwater Quality Database (GEMStat) sites, and multi-thread parallelism was added to…
-
Open Access
ARTICLE
Application of Big Data Information Platform in Medical Equipment
Journal on Big Data, Vol.4, No.2, pp. 113-123, 2022, DOI:10.32604/jbd.2022.028791
Abstract The application of big data in the medical device industry mainly refers to the analysis and processing of various medical devices, so as to provide patients with better treatment and rehabilitation services. At present, our country already has a relatively mature and reliable large database system. This article studies the application of medical equipment in the big data information platform. The main methods used in this article are survey method, case analysis method, and interview method. The big data information platform and medical devices are studied from different aspects. The survey results show that 41%…
-
Open Access
ARTICLE
Social Opinion Network Analytics in Community Based Customer Churn Prediction
Journal on Big Data, Vol.4, No.2, pp. 87-95, 2022, DOI:10.32604/jbd.2022.024533
Abstract Community based churn prediction, or the task of recognising the influence of a customer’s community in churn prediction, has become an important concern for firms in many different industries. While churn prediction has until recently focused only on transactional datasets (the targeted approach), the untargeted approach through product advisement, digital marketing and expressions of customers’ opinions on social media such as Twitter has not been fully harnessed, although this data source has become an important influencing factor with lasting impact on churn management. Since Social Network Analysis (SNA) has become a blended approach for churn…
-
Open Access
ARTICLE
Study on the Present Situation and Optimization Path of Gamification Design in Chinese University Libraries
Journal on Big Data, Vol.4, No.2, pp. 125-133, 2022, DOI:10.32604/jbd.2022.030660
Abstract In this paper, 137 “First-class universities” and “First-class discipline” construction universities in China are selected as the objects of investigation to analyze the present situation and characteristics of the game design of university libraries in China. Taking university libraries in other countries as the reference, this paper compares the differences in the game design of university libraries in China and other countries, and sums up the deficiencies of gamification service practice in Chinese university libraries. Finally, this paper proposes an optimization path for the gamification design of Chinese university libraries from six…
-
Open Access
ARTICLE
A New Population Initialization of Particle Swarm Optimization Method Based on PCA for Feature Selection
Journal on Big Data, Vol.3, No.1, pp. 1-9, 2021, DOI:10.32604/jbd.2021.010364
Abstract In many fields such as signal processing, machine learning, pattern recognition and data mining, it is common practice to process datasets containing huge numbers of features. In such cases, Feature Selection (FS) is often involved. Meanwhile, owing to their excellent global search ability, evolutionary computation (EC) techniques have been widely applied to FS. As a powerful global search method that computes faster than other EC algorithms, PSO can solve feature selection problems well. However, when selecting among a large number of features, the efficiency of PSO drops significantly. Therefore, plenty of works have been…
-
Open Access
ARTICLE
OPPR: An Outsourcing Privacy-Preserving JPEG Image Retrieval Scheme with Local Histograms in Cloud Environment
Journal on Big Data, Vol.3, No.1, pp. 21-33, 2021, DOI:10.32604/jbd.2021.015892
Abstract With the wide application of imaging technology, the volume of big image data that may contain private information is growing fast. Due to insufficient computing power and storage space on local server devices, many people hand over these images to cloud servers for management. In practice, however, it is unsafe to store the images in the cloud, so encryption becomes a necessary step before uploading to reduce the risk of privacy leakage. However, encryption is not conducive to the efficient use of images, especially in Content-Based Image Retrieval (CBIR) schemes. This paper proposes an outsourcing…
-
Open Access
ARTICLE
Grain Yield Predict Based on GRA-AdaBoost-SVR Model
Journal on Big Data, Vol.3, No.2, pp. 65-76, 2021, DOI:10.32604/jbd.2021.016317
Abstract Grain yield security is a basic national policy of China, and changes in grain yield are influenced by a variety of factors, which often have a complex, non-linear relationship with each other. Therefore, this paper proposes a Grey Relational Analysis-Adaptive Boosting-Support Vector Regression (GRA-AdaBoost-SVR) model, which can ensure the prediction accuracy of the model with small samples, improve the generalization ability, and enhance the prediction accuracy. SVR allows mapping to high-dimensional spaces using kernel functions, which suits nonlinear problems. Grain yield datasets generally have small sample sizes and many features, making SVR a promising…
Copyright © 2024 The Author(s). Published by Tech Science Press.