TY - EJOU
AU - Khan, Mubariz
AU - Siddiqui, Hafeez Ur Rehman
AU - Saleem, Adil Ali
AU - Raza, Muhammad Amjad
AU - Rodríguez, Lázaro Javier Hernández
AU - García, Pablo Herrero
AU - Díez, Isabel de la Torre
TI - Cryptocurrency Market Trends: A Machine Learning-Driven Time Series Forecasting with Twitter Sentiment Integration
T2 - Computers, Materials & Continua
PY -
VL -
IS -
SN - 1546-2226
AB - Accurate forecasting of cryptocurrency prices remains an open challenge because classical statistical models cannot capture the non-linear, sentiment-driven dynamics of these markets. This study compares three hybrid deep learning architectures (VAR-LSTM, XGBoost-LSTM, and CNN-LSTM) to determine which best forecasts Bitcoin (BTC), Ethereum (ETH), and Dogecoin (DOGE) closing prices, and to quantify the marginal predictive value of Twitter sentiment integration. Six years of hourly OHLCV data (2017–2023) are augmented with VADER-scored Twitter sentiment polarity. Each model is formulated mathematically, implemented with documented hyperparameters (epochs, dropout, units), and trained for one-step-ahead next-hour price prediction. Performance is measured by RMSE, MAE, MAPE, R², and Directional Accuracy (DA) across five random seeds, with paired Wilcoxon significance tests. XGBoost-LSTM achieves the best performance (RMSE = 81.547, R² = 0.9254, DA = 80.0%), outperforming all nine literature baselines. Removing Twitter sentiment degrades DA by 14.3 percentage points (p < 0.01), confirming that social media signals carry independent predictive information. Hybrid architectures consistently outperform single-model baselines; XGBoost-LSTM offers the best accuracy-to-compute ratio. VADER-enriched Twitter sentiment is a significant predictor beyond price history. Limitations include reliance on a single sentiment platform and a training window that predates several structural market events.
KW - Cryptocurrency forecasting; long short-term memory; XGBoost; convolutional neural network; vector autoregression; sentiment analysis; Bitcoin; time series; hybrid models; deep learning
DO - 10.32604/cmc.2026.084269