Woxsen University · MBA Business Analytics · 2024–2026

AI-Augmented
Predictive Risk Analytics

Using Financial News Events: An NLP Approach with BERT

A BERT-driven framework that classifies financial news headlines into Low, Medium, and High risk levels — achieving 65.95% accuracy over a 33.3% random baseline, with a novel Risk Exposure Index covering 465 industry sectors.

3,024 headlines analysed
Feb–Aug 2025 observation window
+32.65 pp over random baseline
465 sectors indexed

P Chaitanya · Vanaparthy Nitisha · Tejasree K · Twinkle Shah
Supervised by Dr. K V Rajesh Kumar, Associate Professor, School of Business

1 / 16

Why do we need this?

The Problem
  • 80% of enterprise data is unstructured and left unexploited by traditional ERM frameworks (Taha & Yoo, 2024)
  • Lagging signals — balance sheets & credit ratings describe what already happened, not what's coming
  • Manual review of news & social media is slow, inconsistent, and prone to human error
  • Reactive posture — organizations respond to crises after they materialize, not before
Research Objectives
  1. Develop & evaluate a BERT-based NLP pipeline classifying headlines as Low / Medium / High risk
  2. Construct a quantitative Risk Exposure Index (REI) for companies and sectors
  3. Generate proactive decision-support insights for risk practitioners
  4. Transform raw unstructured news into structured strategic intelligence
2 / 16

Dataset Description

3,024 financial headlines
Feb–Aug 2025 observation window
10 major companies
465 industry sectors
~5% null value rate
12 Dataset Features
  • Date — publication date
  • Headline — primary NLP input
  • Source — Reuters / Bloomberg / FT
  • Market Event Type
  • Market Index
  • Index Change (%)
  • Trading Volume
  • Sentiment (Pre-labeled)
  • Sector
  • Impact Level — target variable
  • Related Company
  • News URL
Risk Level Distribution
High Risk: 965 (31.9%)
Medium Risk: 1,039 (34.4%)
Low Risk: 1,020 (33.7%)

Note: Near-balanced distribution across classes is unusual for real-world risk datasets — suggesting the dataset may have been curated or stratified in its construction.

3 / 16

Key Research Foundations

📚

NLP & Transformers

Evolution from bag-of-words & TF-IDF → deep contextualised embeddings. BERT consistently outperforms classical methods across classification benchmarks (Yang et al., 2023; Tucudean et al., 2024). Text classification accuracy has seen dramatic improvement.

📈

Financial Sentiment Analysis

Transformer models substantially outperform earlier approaches on financial corpora. Real-time sentiment from news & social media is critical for risk detection (Ghosh et al., 2024; Leechewyuwasorn et al., 2024). Domain-adapted models like FinBERT outperform general BERT.

🔍

Information Extraction & NER

Structured extraction from financial text using LLMs provides a methodology template for entity-level risk signal association (Dagdelen et al., 2024; Han et al., 2023). NER links risk signals to specific companies and sectors.

Research Gaps Addressed

Sentiment outputs are rarely converted into entity-level quantitative risk scores — this is what the REI does. Most financial NLP uses binary classification; balanced 3-class headline risk is understudied. Proactive scenario-based forecasting from NLP is emerging but immature.

4 / 16

Six-Phase Pipeline

PHASE 01
Data Collection & Preprocessing
Remove nulls, median imputation, lowercase/tokenise (WordPiece, max 128 tokens), label encode, 80/20 stratified split.
PHASE 02
BERT Sentiment Analysis
bert-base-uncased → [CLS] pooled 768-dim embeddings. Sentiment polarity inferred via lightweight classification head trained on BERT representations.
PHASE 03
REI Computation
Weighted aggregation per entity: 0.5×F_high + 0.3×S_neg + 0.2×|ΔI| then min-max normalise to 0–100 scale for 10 companies & 465 sectors.
PHASE 04
Risk Level Classification
3-class BERT classifier: dropout 0.3 → Dense → Softmax. AdamW lr=2e-5, 3 epochs, class weights, cross-entropy loss. Evaluated on held-out test set.
PHASE 05
Scenario Forecasting
Trained classifier applied to hypothetical headlines. Probabilistic risk estimates with confidence intervals — Low, Medium, High probabilities for novel inputs.
PHASE 06
Visualisation & Reporting
20+ professional charts, 4 structured CSV/TXT output files, automated strategic insight reports designed for practitioner consumption.
5 / 16

Technical Formulation

INPUT TOKENS
[CLS]
Federal
Reserve
hikes
rates
[SEP]
BERT Encoder — 12 Transformer Layers
Bidirectional self-attention · Feed-forward · Layer normalisation · 768-dim · 12 attention heads
[CLS] pooled → 768-dim vector z
Classification Head
Dropout 0.3 → Dense W∈ℝ³ˣ⁷⁶⁸ → Softmax
Low Risk: 29.6%
Medium Risk: 36.2%
High Risk: 34.2%
Formal Formulation
Input Sequence
[CLS], t₁, t₂, …, tₙ, [SEP] where n ≤ 128
CLS Embedding
z = BERT_pooled([CLS], …) where z ∈ ℝ⁷⁶⁸
Classification Head
ŷ = softmax(W · dropout(z, p=0.3) + b)
Prediction
class = argmax(ŷ)
Optimiser
AdamW lr=2e-5, decay=0.01, 10% warmup, 3 epochs
Loss
Cross-entropy with class weights to handle residual imbalance
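
The formulation above can be sketched in plain Python to make the arithmetic concrete. This is an illustration under stated assumptions, not the trained model: the 4-dimensional `z`, the matrix `W`, and the bias `b` are toy values (the real `z` is 768-dimensional and `W` is 3×768), and the inverse-frequency weighting scheme is an assumption, since the slides do not say how the class weights were derived.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def head_forward(z, W, b):
    """Compute softmax(W·z + b); dropout acts as the identity at inference."""
    logits = [sum(w_i * z_i for w_i, z_i in zip(row, z)) + b_k
              for row, b_k in zip(W, b)]
    return softmax(logits)


def balanced_class_weights(counts):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    total = sum(counts.values())
    k = len(counts)
    return {c: total / (k * n) for c, n in counts.items()}


def weighted_cross_entropy(probs, true_idx, weights):
    """Class-weighted negative log-likelihood for a single example."""
    return -weights[true_idx] * math.log(probs[true_idx])


CLASSES = ["Low", "Medium", "High"]

# Toy stand-ins for the pooled embedding and head parameters (hypothetical).
z = [0.5, -1.0, 0.25, 2.0]
W = [[0.1, 0.0, 0.2, -0.1],
     [0.0, 0.1, 0.0, 0.1],
     [-0.2, 0.3, 0.1, 0.2]]
b = [0.0, 0.0, 0.0]

probs = head_forward(z, W, b)
predicted = CLASSES[probs.index(max(probs))]  # class = argmax(y_hat)

# Full-dataset counts from the deck, used here only for illustration.
counts = {"Low": 1020, "Medium": 1039, "High": 965}
weights = balanced_class_weights(counts)
weight_list = [weights[c] for c in CLASSES]
loss = weighted_cross_entropy(probs, CLASSES.index("High"), weight_list)
```

With near-balanced classes the weights all sit close to 1.0, so the weighting only nudges the loss toward the slightly rarer High Risk class.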
6 / 16

Risk Exposure Index — Construction

REI(e) = normalize[ 0.5 × F_high(e) + 0.3 × S_neg(e) + 0.2 × |ΔI(e)| ] × 100
F_high(e) · Weight: 0.50
High-Risk Event Frequency

Frequency of High-Risk classified headlines mentioning entity e. Primary driver: event classification is the most direct risk signal.
S_neg(e) · Weight: 0.30
Negative Sentiment Score

Mean negative sentiment score (0–1) from BERT. Captures qualitative linguistic signals that may precede formal event classification.
|ΔI(e)| · Weight: 0.20
Index Change Magnitude

Mean absolute % change in the associated market index on days entity e appeared in headlines. Anchors the score to market reality.
Early Warning Threshold: Entities with REI(e) ≥ 75 trigger automated alerts to risk practitioners. The cutoff marks the top quarter of the 0–100 REI scale rather than the top quartile of entities: approximately 18% of the 465 sectors crossed it during the Feb–Aug 2025 observation window.
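
A minimal sketch of the REI computation and the ≥ 75 alert rule. The entity names and component values are hypothetical, and the sketch assumes the three components are already on comparable scales before weighting, which the slides do not specify.

```python
def min_max(values):
    """Scale values to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def compute_rei(entities, weights=(0.5, 0.3, 0.2)):
    """Return {entity: REI on a 0-100 scale}.

    `entities` maps a name to (F_high, S_neg, delta_I).
    """
    w_f, w_s, w_i = weights
    names = list(entities)
    raw = [
        w_f * entities[n][0] + w_s * entities[n][1] + w_i * abs(entities[n][2])
        for n in names
    ]
    return {n: 100.0 * s for n, s in zip(names, min_max(raw))}


# Hypothetical component values for three illustrative sectors.
entities = {
    "Sector A": (0.60, 0.70, 0.9),
    "Sector B": (0.10, 0.20, -0.1),
    "Sector C": (0.35, 0.40, 0.5),
}

rei = compute_rei(entities)
alerts = sorted(name for name, score in rei.items() if score >= 75)
```

Because min-max normalisation is relative to the entities being scored, the highest raw score always maps to 100 and the lowest to 0, so REI values are comparable only within the same scoring run.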
7 / 16

Technology Stack & Output Files

Transformers (HuggingFace)
BERT model loading, tokenization, feature extraction via bert-base-uncased
scikit-learn
Train-test split, label encoding, classification metrics & evaluation
Pandas & NumPy
Data ingestion, preprocessing, numerical computation pipeline
Matplotlib & Seaborn
20+ professional charts and visualisations of model outputs
Google Colab (CPU)
Python 3.10 execution environment — CPU-only computation
Gemini API (Planned)
Narrative insight generation — API version compatibility issues flagged
Pipeline Output Files
company_risk_exposure_index.csv: REI scores for the 10 monitored companies
sector_risk_exposure_index.csv: REI scores across 465 industry sectors
processed_financial_news_data.csv: Cleaned & feature-engineered dataset with BERT sentiment scores
strategic_insights_report.txt: Narrative summary of key risk findings for practitioner consumption
8 / 16

Classification Performance

65.95%
Overall accuracy
33.3%
Random baseline
+32.65pp
Improvement over chance
0.66
Macro avg F1-score
Classification Report (n=605)
Class         Precision  Recall  F1-Score  Support
Low Risk      0.79       0.69    0.73      212
Medium Risk   0.49       0.56    0.53      190
High Risk     0.73       0.72    0.73      203
Macro Avg     0.67       0.66    0.66      605
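
As a consistency check on the report, the sketch below recomputes the macro averages and the accuracy implied by per-class recall and support. The per-class figures are copied from the report; because they are rounded to two decimals, the implied accuracy is approximate. The function names are illustrative.

```python
report = {
    "Low Risk":    {"precision": 0.79, "recall": 0.69, "f1": 0.73, "support": 212},
    "Medium Risk": {"precision": 0.49, "recall": 0.56, "f1": 0.53, "support": 190},
    "High Risk":   {"precision": 0.73, "recall": 0.72, "f1": 0.73, "support": 203},
}


def macro_average(rows, metric):
    """Unweighted mean of a metric across classes, rounded to 2 decimals."""
    values = [row[metric] for row in rows.values()]
    return round(sum(values) / len(values), 2)


def implied_correct(rows):
    """Correct predictions implied by recall x support, summed over classes."""
    return sum(row["recall"] * row["support"] for row in rows.values())


macro = {m: macro_average(report, m) for m in ("precision", "recall", "f1")}
total = sum(row["support"] for row in report.values())
correct = implied_correct(report)
accuracy = correct / total  # support-weighted recall equals accuracy
```

Accuracy is the support-weighted mean of recall, so the per-class rows and the headline accuracy figure can be cross-checked against each other this way.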
Confusion Matrix Insight
  • Low & High Risk classes achieve F1=0.73 each — strong discriminative ability with polarised linguistic signals
  • Medium Risk is hardest (F1=0.53): boundary ambiguity is highest for the middle class, since the same headline can be Low or Medium depending on framing
  • Conservative bias — model tends to escalate rather than downgrade uncertain classifications. Most consequential error (High→Low) is the least frequent
9 / 16

Risk Exposure Index Findings

Top Companies by REI Score
Rank  Company               REI    Level
1     Apple Inc.            100.0  ▲ HIGH
2     Boeing                100.0  ▲ HIGH
3     ExxonMobil            100.0  ▲ HIGH
4     Goldman Sachs         100.0  ▲ HIGH
5     JP Morgan Chase       100.0  ▲ HIGH
6     Microsoft             100.0  ▲ HIGH
7     Reliance Industries   100.0  ▲ HIGH
8     Samsung Electronics   100.0  ▲ HIGH
9     Tata Motors           100.0  ▲ HIGH
10    Tesla                 100.0  ▲ HIGH
Sector-Level Findings (n=465)
  • High REI sectors (75+): Technology, Financial Services, Energy; highest news volume & volatility correlation
  • Low REI sectors (30−): Utilities, Basic Materials, Regional Consumer Goods; low-frequency-news industries
  • ~18% of 465 sectors crossed the ≥75 early warning threshold, a manageable alert volume
Source-Level Risk Distribution
  • Reuters: highest High-Risk proportion; breaking news focus, crisis-signalling vocabulary
  • Bloomberg: balanced distribution; analytical commentary + data-driven reporting
  • Financial Times (FT): higher Medium-Risk proportion; contextual analysis, nuanced language
10 / 16

Forecasting & Risk Patterns

Scenario-Based Forecasting

Sample synthetic headline passed to classifier:

"Federal Reserve signals rate pause amid persistent inflation"
Predicted: Medium Risk
Low Risk: 29.6%
Medium Risk: 36.2%
High Risk: 34.2%

Close distribution reflects inherent uncertainty in financial risk prediction from text alone.
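
One way to surface this uncertainty in a scenario report is to show the margin between the top two classes alongside the predicted label. The sketch below uses the probabilities from the sample headline; the `min_margin` cutoff of 0.10 and the function name are hypothetical choices, not part of the study.

```python
def summarise_forecast(probs, min_margin=0.10):
    """Return the top label, its lead over the runner-up, and a caution flag."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    (top_label, top_p), (_, second_p) = ranked[0], ranked[1]
    margin = top_p - second_p
    return {
        "label": top_label,
        "margin": margin,
        "low_confidence": margin < min_margin,  # cutoff is an assumption
    }


# Class probabilities from the sample scenario headline.
forecast = summarise_forecast({"Low": 0.296, "Medium": 0.362, "High": 0.342})
```

Here Medium leads High by only about 2 percentage points, so the forecast would be flagged as low confidence under this cutoff.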

3 Decision-Support Output Types
  • 1
    Early Warning Alerts — triggered when REI ≥ 75
  • 2
    Sector Risk Dashboards — aggregated REI scores
  • 3
    Scenario Risk Forecasts — probabilistic estimates for novel headlines
Temporal Analysis
  • Q2 2025 High-Risk spike: markedly elevated Apr–Jun, aligned with Fed decisions, earnings season & escalating geopolitical tensions
  • Medium Risk as baseline noise: spread evenly across the window, forming the background from which High-Risk spikes emerge
  • Post-Q2 stabilisation: Low-Risk events increased Jul–Aug 2025 as markets absorbed earlier high-impact events
  • Day-of-week pattern: High-Risk headlines disproportionately published Monday–Tuesday (weekend events, early-week earnings)
These temporal findings support deployment as a dynamic monitoring tool, not a static classifier — capable of distinguishing episodic risk spikes from chronic risk accumulation.
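
The monthly and day-of-week views above amount to grouping classified headlines by a date-derived key and taking the High-Risk share per group. A minimal sketch, with hypothetical records standing in for the processed dataset:

```python
from collections import Counter
from datetime import date


def high_risk_share_by(records, key):
    """Share of High-Risk headlines per bucket, where bucket = key(date)."""
    totals, highs = Counter(), Counter()
    for day, level in records:
        bucket = key(day)
        totals[bucket] += 1
        if level == "High":
            highs[bucket] += 1
    return {bucket: highs[bucket] / totals[bucket] for bucket in totals}


# Hypothetical (publication date, predicted risk level) pairs.
records = [
    (date(2025, 4, 7), "High"),    # Monday
    (date(2025, 4, 8), "High"),    # Tuesday
    (date(2025, 4, 9), "Medium"),  # Wednesday
    (date(2025, 7, 14), "Low"),    # Monday
    (date(2025, 7, 17), "Low"),    # Thursday
]

by_month = high_risk_share_by(records, key=lambda d: (d.year, d.month))
by_weekday = high_risk_share_by(records, key=lambda d: d.weekday())  # 0 = Monday
```

Using shares rather than raw counts keeps busy news days from being mistaken for risk spikes.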
11 / 16

Business Implications

1
First-Pass Headline Triage — 500 headlines/day → system correctly classifies ~330 vs 166 by random selection. Near-doubling of triage efficiency, freeing analyst capacity for deeper investigation.
2
Sector Rotation & Portfolio Risk — Sector REI ≥ 75 triggers portfolio exposure review. Converts qualitative news sentiment into standardised risk scores embeddable in existing governance frameworks.
3
Regulatory Compliance Escalation — High-Risk headlines on regulatory/legal actions auto-routed to compliance teams. Reduces risk of missed events during high-volume news periods.
4
Competitive Intelligence — REI trends across competitors provide early sector-wide distress signals before they appear in structured market data. Two-tier architecture surfaces cumulative signals.
5
Proactive Risk Reporting — Scenario module produces probabilistic risk statements for anticipated upcoming events, enabling forward-looking board-level decisions rather than backward-looking reports.
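
The triage figures in item 1 follow from multiplying daily volume by accuracy. A short sketch of that arithmetic, using the deck's 500 headlines/day, 65.95% accuracy, and 33.3% random baseline (the function name is illustrative):

```python
def expected_correct(n_headlines, accuracy):
    """Expected number of correctly classified headlines."""
    return n_headlines * accuracy


daily_volume = 500
model_hits = expected_correct(daily_volume, 0.6595)   # about 330
random_hits = expected_correct(daily_volume, 0.333)   # about 166
uplift = model_hits / random_hits                     # just under 2x
```

This counts correct labels across all three classes; it does not by itself say how many High-Risk headlines are caught, which depends on High-Risk recall (0.72 in the classification report).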
12 / 16

Accuracy vs Benchmarks

Random Baseline: 33.3%
This Study (BERT): 65.95%
Group 7 (BERT opt.): 59%
FinBERT (Literature): 72%
Multimodal (Literature): 81%
+32.65pp
vs random baseline — strong learning demonstrated
~72%
FinBERT ceiling — achievable with domain fine-tuning
~81%
Multimodal frontier — NLP + structured market data
13 / 16

Current Limitations

×
Dataset Scope
Only 10 companies over ~6 months. Broader temporal and entity coverage needed for REI's discriminative power to generalise.
×
Model Architecture
General-purpose bert-base-uncased constrains achievable accuracy. FinBERT (pre-trained on financial text) offers the highest-impact single improvement.
×
Compute Resources
CPU-only training limited hyperparameter search and training epochs. GPU acceleration would enable larger batches and more epochs.
×
Generative AI Integration
Planned Gemini API integration for narrative synthesis encountered version compatibility errors — high-value direction for future work.
×
Label Quality
Pre-labeled sentiment & impact fields introduce dependency on external annotation quality that was not independently validated.

Model Performance Context

Financial news headlines are inherently concise — often 10–15 words — limiting contextual information available to the BERT encoder relative to longer financial documents.

Risk classification is inherently ambiguous: the same headline describing a corporate earnings miss may be High Risk for a concentrated investor and Low Risk for a diversified fund. This subjectivity is embedded in the labeling and constitutes irreducible noise.

Domain adaptation (FinBERT) is identified as the single most impactful path forward — consistent with Garrido-Merchan & Hernandez-Lobato (2023) and Yang et al. (2023).

14 / 16

Future Directions

FinBERT Fine-Tuning
Fine-tune FinBERT, a BERT variant pre-trained on financial corpora (e.g. SEC filings, earnings transcripts), on the headline dataset. Expected to yield the most significant single accuracy improvement, targeting the ~72% literature benchmark.
Multimodal Risk Signals
Integrate price time series, trading volumes & options implied volatility with NLP-derived signals — consistent with Ghosh et al. (2024). Potential ceiling ~81%.
Real-Time Streaming Pipeline
Deploy via Apache Kafka + FastAPI for true real-time risk monitoring from live news feeds — transforming the system from batch to streaming architecture.
Multilingual Extension
Extend to non-English financial news using Costa-jussà et al.'s (2024) multilingual translation capabilities — broadening applicability to emerging market risk analytics.
Explainability & Interpretability
SHAP values & attention visualisation for interpretable individual risk classifications — critical for regulatory compliance in financial AI systems.

As NLP capabilities continue to advance — with increasingly powerful foundation models, multimodal architectures, and real-time processing — AI-augmented risk analytics will become not a specialised capability but fundamental financial infrastructure.

15 / 16

Conclusion

BERT-based NLP framework achieves 65.95% accuracy on balanced 3-class financial risk classification — a +32.65 pp improvement over the 33.3% random baseline, surpassing the 59% companion benchmark.

The Risk Exposure Index (REI) provides a complementary aggregate-level risk signal for 10 companies & 465 industry sectors, smoothing classification noise and enabling entity-level monitoring.

Conservative bias toward correctly flagging high-risk events makes the system operationally safe — the most consequential error type (High→Low misclassification) is the least frequent.

Framework demonstrates practical value: triage of 500 daily headlines correctly classifies ~330 vs 166 random — near-doubling of efficiency with measurable analyst hour savings.

Clear roadmap: FinBERT fine-tuning → multimodal fusion → real-time streaming pipeline. AI-augmented risk analytics will become fundamental financial infrastructure.

Woxsen University · MBA Business Analytics · 2024–2026 · Supervised by Dr. K V Rajesh Kumar
16 / 16