Open Access
ARTICLE
IG-3D: Integrated-Gradients 3D Optimization for Private Transformer Inference
1 Henan Key Laboratory of Information Security, Zhengzhou, 450004, China
2 Cryptographic Engineering School, Information Engineering University, Zhengzhou, 450004, China
* Corresponding Author: Jingwen Wang. Email:
Computers, Materials & Continua 2026, 87(2), 49 https://doi.org/10.32604/cmc.2026.073657
Received 23 September 2025; Accepted 22 December 2025; Issue published 12 March 2026
Abstract
Transformer models face significant computational challenges in private inference (PI). Existing optimization methods often rely on isolated techniques, neglecting joint structural and operational improvements. We propose IG-3D, a unified framework that integrates structured compression and operator approximation through accurate importance assessment. Our approach first evaluates attention head importance using Integrated Gradients (IG), offering greater stability and theoretical soundness than plain gradient-based methods. We then apply a three-dimensional optimization: (1) structurally pruning redundant attention heads; (2) replacing Softmax with adaptive polynomial approximation to avoid exponential computations; (3) implementing layer-wise GELU substitution to accommodate different layer characteristics. A joint threshold mechanism coordinates compression across dimensions under accuracy constraints. Experimental results on the GLUE benchmark show that our method achieves an average 2.9

Keywords
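To illustrate the second optimization dimension, the sketch below shows one common way to replace Softmax's exponential with a low-degree polynomial, since the `exp` primitive is expensive under cryptographic private-inference protocols. This is a minimal illustration assuming a truncated Taylor polynomial with a fixed degree; the paper's adaptive coefficient selection is not specified here, and the function name `poly_softmax` is hypothetical.

```python
import math
import numpy as np

def poly_softmax(scores: np.ndarray, degree: int = 2) -> np.ndarray:
    """Softmax with exp replaced by a truncated Taylor polynomial.

    Avoids the exponential primitive, which is costly in MPC/HE-based
    private inference. Degree and coefficients are illustrative; IG-3D
    selects them adaptively (details not reproduced here).
    """
    # Shift by the row maximum, as in standard softmax, so inputs are <= 0.
    x = scores - scores.max(axis=-1, keepdims=True)
    # exp(x) ~= 1 + x + x^2/2! + ... + x^degree/degree!
    approx = sum(x**k / math.factorial(k) for k in range(degree + 1))
    # Clamp to keep attention weights strictly positive before normalizing.
    approx = np.maximum(approx, 1e-6)
    return approx / approx.sum(axis=-1, keepdims=True)
```

For small shifted logits the degree-2 polynomial tracks `exp` closely, so the resulting attention distribution stays close to the exact Softmax while removing every exponential evaluation from the private-inference circuit.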
Copyright © 2026 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.