STBMP-Net - Block-Matching Tensor Network

Published:

A Spatial–Temporal Block-Matching Patch-Tensor Model for Infrared Small Moving Target Detection in Complex Scenes
A Aliha, Y Liu, Y Ma, Y Hu, Z Pan, G Zhou
Published in Remote Sensing, August 2023

Project Overview

Challenge: Best of Both Worlds

Traditional methods (tensor-based):

  • Good interpretability
  • Mathematical foundation
  • Limited feature learning

Deep learning methods:

  • Powerful feature learning
  • End-to-end optimization
  • Require large training data
  • Black box nature

Can the strengths of both approaches be combined?

Solution: Hybrid Architecture

STBMP-Net integrates:

  1. Block-matching for motion-aware tensor construction
  2. Tensor decomposition for background modeling
  3. Deep network for feature enhancement

Key Innovations

1. Spatial-Temporal Block-Matching

Motion-aware patch grouping:

  • Match blocks across frames using correlation
  • Group similar patches for tensor construction
  • Preserve temporal consistency

Advantage: Matched patches stay aligned along the temporal axis, giving a better-structured tensor than naive stacking of patches at fixed positions
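
The matching step can be sketched in a few lines of NumPy. This is a minimal illustration, not the project's implementation: it assumes square patches, an exhaustive search window, and normalized cross-correlation as the similarity score. The names `ncc`, `match_block`, `build_patch_tensor` and the `search_radius` parameter are hypothetical.

```python
import numpy as np


def ncc(a, b, eps=1e-8):
    """Normalized cross-correlation between two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def match_block(ref_patch, frame, top_left, search_radius):
    """Return the top-left corner in `frame` whose patch best matches `ref_patch`.

    Exhaustive search over a window of +/- search_radius pixels around
    `top_left`, clipped to the frame borders. Assumes a square patch.
    """
    p = ref_patch.shape[0]
    h, w = frame.shape
    r0, c0 = top_left
    best_score, best_pos = -np.inf, top_left
    for r in range(max(0, r0 - search_radius), min(h - p, r0 + search_radius) + 1):
        for c in range(max(0, c0 - search_radius), min(w - p, c0 + search_radius) + 1):
            score = ncc(ref_patch, frame[r:r + p, c:c + p])
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos


def build_patch_tensor(frames, top_left, patch_size, search_radius=4):
    """Track one reference block through `frames` and stack the matched patches.

    Returns a (patch_size, patch_size, T) tensor and the matched positions.
    Each search starts from the previous match, so slow drift is followed.
    """
    r, c = top_left
    ref = frames[0][r:r + patch_size, c:c + patch_size]
    patches, positions = [ref], [top_left]
    for frame in frames[1:]:
        pos = match_block(ref, frame, positions[-1], search_radius)
        patches.append(frame[pos[0]:pos[0] + patch_size, pos[1]:pos[1] + patch_size])
        positions.append(pos)
    return np.stack(patches, axis=-1), positions
```

Calling `build_patch_tensor(frames, (10, 10), 8)` on a list of 2-D arrays yields one motion-compensated patch tensor. A full pipeline would repeat this for a grid of reference blocks.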

2. Tensor Decomposition Backbone

Low-rank + sparse decomposition:

  • Background → Low-rank tensor
  • Target → Sparse component
  • Noise → Residual

Optimization: ADMM solver with learned priors
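
To make the low-rank + sparse split concrete, the sketch below runs a standard ADMM (inexact augmented Lagrangian) iteration for robust PCA on a matrix, for example a patch tensor unfolded so that each column is one vectorized patch. It is a simplified stand-in, not the project's solver: it works on a matrix rather than a tensor, uses hand-set parameters instead of learned priors, and enforces `d = b + t` exactly, so the residual only holds what is left at termination. The function names and default values for `lam`, `mu` and `rho` are assumptions.

```python
import numpy as np


def soft_threshold(x, tau):
    """Element-wise shrinkage: proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def svd_threshold(x, tau):
    """Singular value shrinkage: proximal operator of the nuclear norm."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def rpca_admm(d, lam=None, mu=None, rho=1.5, tol=1e-7, max_iter=500):
    """Split d into low-rank b (background) and sparse t (target) by solving
        min ||b||_* + lam * ||t||_1   subject to   d = b + t.
    """
    m, n = d.shape
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam   # common default weight
    mu = 1.25 / np.linalg.norm(d, 2) if mu is None else mu    # initial penalty
    b = np.zeros_like(d)
    t = np.zeros_like(d)
    y = np.zeros_like(d)                                      # Lagrange multiplier
    d_norm = np.linalg.norm(d)
    for _ in range(max_iter):
        b = svd_threshold(d - t + y / mu, 1.0 / mu)           # background update
        t = soft_threshold(d - b + y / mu, lam / mu)          # target update
        residual = d - b - t
        y = y + mu * residual                                 # dual ascent step
        mu = min(mu * rho, 1e7)                               # grow the penalty
        if np.linalg.norm(residual) <= tol * d_norm:
            break
    return b, t, residual
```

A noise-tolerant variant would relax the equality constraint so that the residual absorbs noise, as in the three-way split listed above.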

3. Deep Enhancement Network

CNN-based refinement:

  • Process residual tensor
  • Learn discriminative features
  • Train end-to-end together with the decomposition stage

Hybrid loss:

  • Reconstruction loss (tensor)
  • Detection loss (network)
  • Regularization (smoothness)
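
One way to read the three terms is as a weighted sum. The sketch below is only an illustration of that composition under stated assumptions: mean squared error for reconstruction, binary cross-entropy for detection, and a total-variation-style smoothness penalty on the background estimate. The exact form of each term, the choice of what is smoothed, and the weights `w_rec`, `w_det`, `w_reg` are not given in the source and are hypothetical here. It uses plain NumPy to show the arithmetic; actual training would need an autodiff framework.

```python
import numpy as np


def hybrid_loss(d, b, t, pred, label, w_rec=1.0, w_det=1.0, w_reg=0.1, eps=1e-7):
    """Weighted sum of reconstruction, detection and smoothness terms.

    d: input patch matrix, b: background estimate, t: target estimate,
    pred: predicted detection probabilities, label: binary ground truth.
    """
    # Reconstruction (tensor side): how well b + t explains the input.
    rec = np.mean((d - b - t) ** 2)

    # Detection (network side): binary cross-entropy on the detection map.
    p = np.clip(pred, eps, 1.0 - eps)
    det = -np.mean(label * np.log(p) + (1.0 - label) * np.log(1.0 - p))

    # Smoothness (assumed to act on the background): mean absolute
    # difference between vertically and horizontally adjacent entries.
    reg = np.mean(np.abs(np.diff(b, axis=0))) + np.mean(np.abs(np.diff(b, axis=1)))

    total = w_rec * rec + w_det * det + w_reg * reg
    return total, {"reconstruction": rec, "detection": det, "smoothness": reg}
```

Returning the individual terms alongside the total makes it easier to see which component dominates when the weights are tuned.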

Architecture

Input: T frames
    ↓
[Block Matching]
├── Motion estimation
├── Patch grouping
└── Tensor construction
    ↓
[Tensor Decomposition]
├── Low-rank background
├── Sparse target (initial)
└── Residual
    ↓
[Deep Enhancement]
├── CNN feature extraction
├── Attention refinement
└── Detection map
    ↓
Output: Final detection

Training Strategy

Two-stage training:

  1. Pre-train tensor decomposition (unsupervised)
  2. End-to-end fine-tuning (supervised)

Results

Advantages demonstrated:

  • Compared with pure tensor methods: +15% detection rate
  • Compared with pure CNN methods: more data-efficient (50% less training data needed)
  • Interpretable: Clear background/target separation

Robustness:

  • Complex backgrounds
  • Various target sizes
  • Different motion patterns

Applications

  • Research platforms: Interpretable detection
  • Educational tools: Understanding tensor methods
  • Practical systems: Data-efficient deployment

Citation

@article{aliha2023spatial,
  title={A Spatial--Temporal Block-Matching Patch-Tensor Model for Infrared Small Moving Target Detection in Complex Scenes},
  author={Aliha, A and Liu, Y and Ma, Y and Hu, Y and Pan, Z and Zhou, G},
  journal={Remote Sensing},
  volume={15},
  number={17},
  pages={4316},
  year={2023},
  publisher={MDPI}
}