
Open Access

ARTICLE

IG-3D: Integrated-Gradients 3D Optimization for Private Transformer Inference

Lei Sun1,2, Jingwen Wang2,*, Peng Hu2, Xiuqing Mao1,2, Cuiyun Hu1,2, Zhihong Wang2
1 Henan Key Laboratory of Information Security, Zhengzhou, 450004, China
2 Cryptographic Engineering School, Information Engineering University, Zhengzhou, 450004, China
* Corresponding Author: Jingwen Wang.

Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.073657

Received 23 September 2025; Accepted 22 December 2025; Published online 21 January 2026

Abstract

Transformer models face significant computational challenges in private inference (PI). Existing optimization methods often rely on isolated techniques and neglect joint structural and operator-level improvements. We propose IG-3D, a unified framework that integrates structured compression and operator approximation, guided by accurate importance assessment. Our approach first evaluates attention-head importance using Integrated Gradients (IG), which offers greater stability and theoretical soundness than conventional gradient-based methods. We then apply a three-dimensional optimization: (1) structurally pruning redundant attention heads; (2) replacing Softmax with an adaptive polynomial approximation to avoid exponential computations; and (3) substituting GELU layer by layer to accommodate the characteristics of different layers. A joint threshold mechanism coordinates compression across the three dimensions under accuracy constraints. Experimental results on the GLUE benchmark show that our method achieves an average 2.9× speedup in inference latency and a 50% reduction in communication cost while keeping the accuracy loss within 2.3%, demonstrating significant synergistic effects and a superior accuracy-efficiency trade-off compared to single-technique optimization strategies.
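The abstract does not give the details of the importance-scoring step, so the following is only a minimal sketch of how Integrated Gradients could score attention heads through a per-head mask. It is not the authors' implementation. The toy scoring function `toy_score`, the weights in `HEAD_WEIGHTS`, the all-zero baseline mask, and the value of `prune_threshold` are all hypothetical, and the gradient is taken by finite differences so that the example needs no deep-learning library.

```python
import math


def integrated_gradients(f, x, baseline, steps=64, eps=1e-5):
    """Approximate Integrated Gradients of f at x relative to baseline.

    Uses a midpoint Riemann sum along the straight path from baseline to x,
    with central finite differences standing in for the true gradient.
    """
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i in range(n):
            upper = point.copy()
            lower = point.copy()
            upper[i] += eps
            lower[i] -= eps
            avg_grad[i] += (f(upper) - f(lower)) / (2 * eps) / steps
    # Scale the path-averaged gradient by the input-baseline difference.
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]


# Hypothetical per-head contributions; a real model would be used instead.
HEAD_WEIGHTS = [1.5, 0.05, 0.8, 0.01]


def toy_score(mask):
    """Toy stand-in for a model output as a function of the head mask."""
    return math.tanh(sum(w * m for w, m in zip(HEAD_WEIGHTS, mask)))


mask_on = [1.0] * len(HEAD_WEIGHTS)    # all heads active (the input)
mask_off = [0.0] * len(HEAD_WEIGHTS)   # all heads masked (assumed baseline)

scores = integrated_gradients(toy_score, mask_on, mask_off)

prune_threshold = 0.05  # hypothetical cut-off, not taken from the paper
pruned = [i for i, s in enumerate(scores) if s < prune_threshold]
print("importance:", [round(s, 4) for s in scores])
print("heads to prune:", pruned)
```

In this toy setting the attributions sum (up to discretization error) to the difference between the score with all heads active and the score with all heads masked, which is the completeness property that makes IG attractive as an importance measure.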

Keywords

Private inference; transformer; attention-head pruning; integrated gradients; transformer model optimization