3 years ago

A Long Short-Term Memory (LSTM) Model for Business Sentiment Analysis Based on Recurrent Neural Network

Md. Jahidul Islam Razin, M. F. Mridha, Md. Abdul Karim, S M Rafiuddin, Tahira Alam

Long Short-Term Memory (LSTM)


Abstract

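Business sentiment analysis (BSA) is an important and popular topic in natural language processing: it is sentiment analysis applied for business purposes. Various categories of sentiment analysis techniques, such as lexicon-based methods, and different types of machine learning algorithms have been applied to sentiment analysis in languages such as English, Hindi, and Spanish. This paper applies long short-term memory (LSTM) to business sentiment analysis, using a recurrent neural network (RNN). To address the vanishing gradient problem of conventional RNNs, the LSTM model is built with a modified approach. A product review dataset is used to apply the modified RNN model. In the experiments, 70% of the data is used to train the LSTM model and the remaining 30% is used for testing. The results of the modified RNN model are compared with those of other conventional RNN models and analyzed. The results show that the proposed model outperforms the other conventional RNN models. The proposed model, i.e., the modified RNN approach, achieves an accuracy of approximately 91.33%.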

One-Sentence Summary

The authors propose a modified long short-term memory (LSTM) recurrent neural network that mitigates the vanishing gradient problem of conventional architectures to improve business sentiment analysis, achieving approximately 91.33% accuracy on a product review dataset split 70% for training and 30% for testing while outperforming standard RNN baselines.

Key Contributions

  • This work proposes a modified recurrent neural network architecture that integrates long short-term memory (LSTM) units to classify product reviews into positive, neutral, and negative categories.
  • The modified architecture mitigates the vanishing gradient problem inherent in conventional recurrent networks, enabling the model to capture both short- and long-term dependencies in sequential text.
  • Evaluations on a product review dataset demonstrate that the proposed model achieves 91.33% accuracy, outperforming standard recurrent neural networks and feed-forward network baselines.

Introduction

Business sentiment analysis automates the extraction of customer opinions from massive volumes of unstructured text, enabling companies to forecast market trends, refine marketing strategies, and track consumer behavior. Traditional machine learning approaches rely on manual feature engineering and struggle with scale, while conventional recurrent neural networks frequently fail to capture long-range textual dependencies due to the vanishing gradient problem. The authors leverage a modified recurrent neural network architecture built on Long Short-Term Memory units to effectively model sequential data while resolving gradient degradation. Applied to a product review dataset, their approach achieves 91.33 percent accuracy, outperforming standard RNN and feed-forward baselines and providing a reliable tool for automated three-way sentiment classification.

Dataset

  • Dataset Composition & Sources: The authors source their data from the Amazon Review Information Dataset (ARD), originally compiled via web scraping and APIs. While the full ARD contains 142.8 million ratings with extensive metadata, the authors extract a focused subset of 25,000 product reviews.
  • Subset Breakdown & Split: The selected reviews are categorized into positive, neutral, and negative sentiment classes. The authors partition this subset into a 70% training set and a 30% testing set.
  • Text Cleaning & Preprocessing: Raw review text undergoes strict cleaning to strip HTML tags and punctuation, which are replaced with spaces. Single-character tokens and multiple consecutive spaces are then removed. The authors note that the cleaned text is naturally divided into emotion-based business categories. No cropping strategy is applied, and the original ARD metadata is acknowledged but not integrated into the processing pipeline.
  • Vectorization, Model Integration & Training: Each token is converted into a fixed-dimensional vector $v \in \mathbb{R}^{1 \times d}$ using Word2Vec embeddings. The authors feed the resulting vector sequences into an LSTM network, which processes them left to right. The LSTM output passes through a Dense layer with sigmoid activation to produce a final probability score between 0.0 and 1.0. Training is capped at 50 epochs to limit overfitting, and the model ultimately reaches approximately 96.23% training accuracy and 91.33% testing accuracy.
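The cleaning steps and the 70/30 split described in the list above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names `clean_review` and `split_70_30`, the exact regular expressions, the shuffling, and the `seed` parameter are all assumptions introduced here.

```python
import random
import re


def clean_review(text: str) -> str:
    """Clean one raw review (hypothetical helper, not from the paper)."""
    text = re.sub(r"<[^>]+>", " ", text)   # replace HTML tags with a space
    text = re.sub(r"[^\w\s]", " ", text)   # replace punctuation with a space
    text = re.sub(r"\b\w\b", " ", text)    # drop single-character tokens
    text = re.sub(r"\s+", " ", text)       # collapse runs of whitespace
    return text.strip()


def split_70_30(items, seed=0):
    """Shuffle, then split into 70% training and 30% testing portions.

    Shuffling and the seed are assumptions; the summary only gives the ratio.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10          # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


cleaned = clean_review("<p>Great phone, a bargain!</p>")
train, test = split_70_30(range(10))
```

Note that replacing punctuation with spaces splits contractions, so "don't" becomes "don" once the single-character token "t" is dropped.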

Method

The authors leverage long short-term memory (LSTM) networks to address the limitations of standard recurrent neural networks (RNNs) in capturing long-term dependencies within sequential text data for business sentiment analysis. The core of the approach lies in replacing the standard RNN cell with an LSTM cell, which incorporates gated mechanisms to control information flow and mitigate the vanishing gradient problem. Refer to the framework diagram for an overview of the RNN architecture, where the recurrent weight matrix A propagates information through time. The LSTM cell, as shown in the diagram, replaces the simple hidden layer with a memory block containing specialized gates. This block processes the current input $X(t)$, the previous hidden state $h(t-1)$, and the previous cell state $c(t-1)$ to produce the updated cell state $c(t)$ and the current hidden state $h(t)$.
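To make the vanishing gradient problem concrete, the sketch below tracks how the gradient of the final hidden state with respect to the initial one shrinks in a scalar tanh RNN. The scalar recurrence $h_t = \tanh(a \, h_{t-1})$, the weight `a`, and the function name are illustrative assumptions, not the paper's setup.

```python
import math


def grad_through_time(a: float, h0: float, steps: int) -> float:
    """Return d h_steps / d h_0 for the toy recurrence h_t = tanh(a * h_{t-1})."""
    h = h0
    grad = 1.0
    for _ in range(steps):
        h = math.tanh(a * h)
        # Chain rule: d h_t / d h_{t-1} = a * (1 - tanh(a * h_{t-1})^2) = a * (1 - h_t^2)
        grad *= a * (1.0 - h * h)
    return grad


short_range = grad_through_time(a=0.9, h0=0.5, steps=5)
long_range = grad_through_time(a=0.9, h0=0.5, steps=50)
```

With a recurrent weight below 1 in magnitude, every factor in the product is smaller than 1, so the gradient over 50 steps is far smaller than over 5 steps. This is the effect the LSTM gates are meant to counteract.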

The LSTM computation proceeds in four main steps. First, the forget gate $f_t$ and input gate $i_t$ are computed using sigmoid activation functions, which determine what information to discard from the previous cell state and what new information to store, respectively. The equations for these gates are $f_t = \sigma(x_t U^f + h_{t-1} W^f)$ and $i_t = \sigma(x_t U^i + h_{t-1} W^i)$. Second, the cell state is updated by combining the previous cell state $C_{t-1}$ with the candidate cell state $\tilde{C}_t$, which is generated using a hyperbolic tangent activation function on the combined input and hidden state: $\tilde{C}_t = \tanh(x_t U^g + h_{t-1} W^g)$. The updated cell state is then $C_t = f_t \times C_{t-1} + i_t \times \tilde{C}_t$. Third, the output gate $o_t$ is computed using a sigmoid activation function: $o_t = \sigma(x_t U^o + h_{t-1} W^o)$. Finally, the new hidden state $h_t$ is produced by applying a hyperbolic tangent to the updated cell state and multiplying it by the output gate activation: $h_t = \tanh(C_t) \times o_t$. This process is illustrated in the detailed LSTM cell diagram.
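The four steps above can be sketched as a single time step in plain Python. The sketch follows the equations as written (row vectors, no bias terms) and uses lists instead of a tensor library to stay dependency-free. The `params` layout and the helper names `sigmoid`, `affine`, and `lstm_step` are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def affine(x, U, h, W):
    """Row-vector form x @ U + h @ W, with U: (input, hidden) and W: (hidden, hidden)."""
    return [
        sum(xi * U[i][j] for i, xi in enumerate(x))
        + sum(hk * W[k][j] for k, hk in enumerate(h))
        for j in range(len(U[0]))
    ]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params[name] = (U, W) for name in 'f', 'i', 'g', 'o'."""
    def gate(name, activation):
        U, W = params[name]
        return [activation(z) for z in affine(x_t, U, h_prev, W)]

    f_t = gate("f", sigmoid)        # step 1: forget gate
    i_t = gate("i", sigmoid)        # step 1: input gate
    g_t = gate("g", math.tanh)      # step 2: candidate cell state
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    o_t = gate("o", sigmoid)        # step 3: output gate
    h_t = [math.tanh(c) * o for c, o in zip(c_t, o_t)]  # step 4: hidden state
    return h_t, c_t
```

Processing a review then amounts to calling `lstm_step` once per token vector, carrying `h_t` and `c_t` forward from one call to the next.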

The overall model architecture for sentiment analysis, as depicted in the diagram, consists of multiple LSTM layers. The input sequence is fed into the first LSTM layer, which processes it to generate a sequence of hidden states. These hidden states are then passed to subsequent LSTM layers, allowing the model to capture increasingly complex features. After the final LSTM layer, a dense layer with a sigmoid activation function is applied to produce the final output. The model is trained using a multi-model approach, where separate LSTM models are trained on data categorized as positive, negative, and neutral. For a new input review, each trained model evaluates the review, and the model with the smallest error value is selected to assign the sentiment label. This architecture is designed to overcome the vanishing gradient problem and effectively handle the sequential nature of text data, enabling robust performance in business sentiment analysis.
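The selection rule in the multi-model approach can be sketched as follows. The summary does not define how each model's error value is computed, so the per-class models are represented here by hypothetical callables that map a review to an error, and the function name `classify` is likewise an assumption.

```python
from typing import Callable, Dict


def classify(review: str, models: Dict[str, Callable[[str], float]]) -> str:
    """Return the label whose model reports the smallest error for the review.

    `models` maps a sentiment label to a scoring function; both are
    placeholders for the separately trained LSTM models.
    """
    if not models:
        raise ValueError("at least one model is required")
    errors = {label: score(review) for label, score in models.items()}
    return min(errors, key=errors.get)


# Stub scorers standing in for trained models (illustrative values only).
stub_models = {
    "positive": lambda review: 0.12,
    "neutral": lambda review: 0.40,
    "negative": lambda review: 0.55,
}
label = classify("great value for the price", stub_models)
```

On a tie, `min` returns the first label in the dictionary's insertion order, so a real system would need an explicit tie-breaking policy.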

Experiments

The evaluation setup involves testing the trained RNN-LSTM architecture on previously unseen product reviews to validate its core capability for automated business sentiment classification. The initial experiments confirm that the model reliably maps novel text to distinct sentiment categories by applying probability thresholds, demonstrating strong generalization. Subsequent comparative analysis further validates its practical superiority, as the architecture consistently outperforms established baseline methods across standard accuracy benchmarks.

The authors evaluate the LSTM model through training and testing phases. Both training and testing accuracy improve as training progresses over epochs, with training accuracy exceeding testing accuracy at every epoch. The model reaches a higher testing accuracy than the other sentiment classification models in the comparison.

The model classifies product reviews into sentiment categories using an LSTM architecture with a dense output layer. It is evaluated on unseen data, with classification decisions made by applying thresholds to the prediction probability for each sentiment level, and it achieves higher accuracy than other models reported in the literature.

For business sentiment analysis, the model assigns product reviews to categories such as excellent, good, bad, and very bad based on prediction probability thresholds, with higher values indicating more positive sentiment. The sentiment categories (positive, neutral, and negative) are represented with a multi-valued encoding scheme. The model achieves higher accuracy than other existing models, including KNN- and SVM-based approaches.
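Threshold-based labeling of the sigmoid output might look like the sketch below. The category names come from the text, but the cut points 0.75, 0.50, and 0.25 are hypothetical because the summary does not state the actual thresholds. The function name `rate_review` is also an assumption.

```python
# Hypothetical cut points: the actual thresholds are not given in the summary.
THRESHOLDS = [(0.75, "excellent"), (0.50, "good"), (0.25, "bad")]


def rate_review(probability: float) -> str:
    """Map a prediction probability in [0, 1] to a sentiment category.

    Higher probabilities indicate more positive sentiment.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    for cutoff, label in THRESHOLDS:
        if probability >= cutoff:
            return label
    return "very bad"


ratings = [rate_review(p) for p in (0.9, 0.6, 0.3, 0.1)]
```

Keeping the cut points in one ordered table makes it easy to substitute the values actually used once they are known.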

The authors evaluate an LSTM-based sentiment classification model through iterative training and testing phases on unseen product review data. The experiments demonstrate that the architecture consistently improves predictive accuracy over successive epochs while maintaining stable generalization between training and testing performance. By leveraging probability thresholds for multi-category sentiment encoding, the approach effectively captures nuanced review classifications. Overall, the model validates its superiority by consistently outperforming traditional baselines such as KNN and SVM across all tested scenarios.

