Fazaeli-Javan M, Monsefi R, Ghiasi-Shirazi K. Analysis of Positive and Negative Prototypes Based on the ±ED-WTA Method. Journal of Iranian Association of Electrical and Electronics Engineers 2025; 22(2): 93-107
URL: http://jiaeee.com/article-1-1662-en.html
Ferdowsi University of Mashhad
Abstract:
The recently introduced ±ED-WTA method obtains, for each output neuron of a class in the last layer of a neural network, a pair of positive and negative prototypes. The functionality of a neuron is then explained by the difference between the Euclidean distances of a sample from these two prototypes. A striking property of this method is the strong similarity between the positive and negative prototypes of each neuron at the end of training. The authors of [2] attributed this extreme similarity to the negative prototype of a class being formed from samples of other classes that closely resemble the positive prototype of that class. In this paper, we show that it is not only the negative prototype that moves toward the positive prototype of a class: the positive prototype itself is formed by samples that lie far from the center of that class. This new account of the similarity shows that each neuron in the softmax layer, like an SVM, makes its decisions based on near-boundary samples. The theoretical analysis is examined in detail, and experimental results on MNIST, FERET, and Fashion-MNIST confirm the claims.
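
To make the decision rule described above concrete, the minimal NumPy sketch below is our illustration, not the authors' released code: the function name and prototype values are hypothetical, and distances are squared here for simplicity. Each class is scored by the difference between the sample's distances to the negative and positive prototypes, and the winner-take-all prediction is the class with the largest score.

import numpy as np

def ed_wta_scores(x, pos_protos, neg_protos):
    """Score each class as ||x - p_neg||^2 - ||x - p_pos||^2.
    A larger score means x lies closer to the class's positive
    prototype than to its negative prototype."""
    d_pos = np.sum((pos_protos - x) ** 2, axis=1)  # squared distances to positive prototypes
    d_neg = np.sum((neg_protos - x) ** 2, axis=1)  # squared distances to negative prototypes
    return d_neg - d_pos

# Hypothetical 2-D example with three classes.
pos = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])  # positive prototypes, one per class
neg = np.array([[1.0, 1.0], [3.0, 1.0], [1.0, 3.0]])  # negative prototypes, one per class
x = np.array([0.2, 0.1])                              # sample to classify
print(ed_wta_scores(x, pos, neg).argmax())            # winner-take-all class index (here: 0)

Note that when a class's positive and negative prototypes are nearly identical, as the paper observes at the end of training, the score reduces to a comparison along the small displacement between them, which is exactly the near-boundary, SVM-like behavior argued for in the abstract.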
Full-Text [PDF 2282 kb]
Type of Article: Research | Subject: Control
Received: 2023/11/2 | Accepted: 2024/09/19 | Published: 2025/08/15

References
[1] H. Esmaeli, K. Ghiasi-Shirazi, and A. Harati, "Online learning of positive and negative prototypes with explanations based on kernel expansion", Journal of Iranian Association of Electrical and Electronics Engineers, vol. 20, no. 1, pp. 67-77, 2023. [DOI:10.52547/jiaeee.20.1.67]
[2] R. Zarei-Sabzevar, K. Ghiasi-Shirazi, and A. Harati, "Prototype-based interpretation of the functionality of neurons in winner-take-all neural networks", IEEE Transactions on Neural Networks and Learning Systems, 2022. [DOI:10.1109/TNNLS.2022.3155174]
[3] J. Snell, K. Swersky, and R. Zemel, "Prototypical networks for few-shot learning", Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017.
[4] O. Li, H. Liu, C. Chen, and C. Rudin, "Deep learning for case-based reasoning through prototypes: A neural network that explains its predictions", Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1, 2018. [DOI:10.1609/aaai.v32i1.11771]
[5] C. Chen, O. Li, D. Tao, A. Barnett, C. Rudin, and J.K. Su, "This looks like that: deep learning for interpretable image recognition", Advances in Neural Information Processing Systems 32, 2019.
[6] G. Chen, T. Zhang, J. Lu, and J. Zhou, "Deep meta metric learning", Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9547-9556, 2019. [DOI:10.1109/ICCV.2019.00964]
[7] V. Feldman, "Does learning require memorization? A short tale about a long tail", Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, 2020. [DOI:10.1145/3357713.3384290]
[8] J. Bien and R. Tibshirani, "Prototype selection for interpretable classification", The Annals of Applied Statistics, pp. 2403-2424, 2011. [DOI:10.1214/11-AOAS495]
[9] S.Ö. Arik and T. Pfister, "ProtoAttend: Attention-based prototypical learning", Journal of Machine Learning Research, vol. 21, no. 1, pp. 8691-8725, 2020.
[10] K. Chen and C.G. Lee, "Incremental few-shot learning via vector quantization in deep embedded space", International Conference on Learning Representations, 2021.
[11] Q. Qian, L. Shang, B. Sun, J. Hu, H. Li, and R. Jin, "SoftTriple loss: Deep metric learning without triplet sampling", Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6450-6458, 2019. [DOI:10.1109/ICCV.2019.00655]
[12] F. Schroff, D. Kalenichenko, and J. Philbin, "FaceNet: A unified embedding for face recognition and clustering", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015. [DOI:10.1109/CVPR.2015.7298682]
[13] C. Cortes and V. Vapnik, "Support-vector networks", Machine Learning, vol. 20, no. 3, pp. 273-297, 1995. [DOI:10.1023/A:1022627411411]
[14] G. Brown, M. Bun, V. Feldman, A. Smith, and K. Talwar, "When is memorization of irrelevant training data necessary for high-accuracy learning?", Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, 2021. [DOI:10.1145/3406325.3451131]
[15] V. Feldman and C. Zhang, "What neural networks memorize and why: Discovering the long tail via influence estimation", Advances in Neural Information Processing Systems 33, pp. 2881-2891, 2020.
[16] M.E. Mavroforakis and S. Theodoridis, "A geometric approach to support vector machine (SVM) classification", IEEE Transactions on Neural Networks, vol. 17, no. 3, pp. 671-682, 2006. [DOI:10.1109/TNN.2006.873281]
[17] C.K. Yeh, J. Kim, I.E.H. Yen, and P.K. Ravikumar, "Representer point selection for explaining deep neural networks", Advances in Neural Information Processing Systems 31, 2018.
[18] K. Allen, E. Shelhamer, H. Shin, and J. Tenenbaum, "Infinite mixture prototypes for few-shot learning", International Conference on Machine Learning, pp. 232-241, PMLR, 2019.
[19] J. Chen, L.M. Zhan, X.M. Wu, and F.L. Chung, "Variational metric scaling for metric-based meta-learning", AAAI Conference on Artificial Intelligence, vol. 34, no. 4, pp. 3478-3485, 2020. [DOI:10.1609/aaai.v34i04.5752]
[20] K. Crammer and Y. Singer, "On the algorithmic implementation of multiclass kernel-based vector machines", Journal of Machine Learning Research, vol. 2, pp. 265-292, 2001.
[21] K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, and Y. Singer, "Online passive-aggressive algorithms", Journal of Machine Learning Research, 2006.
[22] F. Aiolli, A. Sperduti, and Y. Singer, "Multiclass classification with multi-prototype support vector machines", Journal of Machine Learning Research, vol. 6, no. 5, 2005.
[23] B. Schölkopf, A.J. Smola, R.C. Williamson, and P.L. Bartlett, "New support vector algorithms", Neural Computation, vol. 12, no. 5, pp. 1207-1245, 2000. [DOI:10.1162/089976600300015565]
[24] M.E. Tipping, "Sparse Bayesian learning and the relevance vector machine", Journal of Machine Learning Research, vol. 1, pp. 211-244, 2001.
[25] http://ocw.um.ac.ir/streams/course/view/163.html
[26] S. Arora, H. Khandeparkar, M. Khodak, O. Plevrakis, and N. Saunshi, "A theoretical analysis of contrastive unsupervised representation learning", International Conference on Machine Learning, pp. 5628-5637, PMLR, 2019.
[27] P. Khosla, P. Teterwak, C. Wang, A. Sarna, Y. Tian, P. Isola, A. Maschinot, C. Liu, and D. Krishnan, "Supervised contrastive learning", Advances in Neural Information Processing Systems 33, pp. 18661-18673, 2020.
[28] K. Ghiasi-Shirazi, "Competitive cross-entropy loss: A study on training single-layer neural networks for solving nonlinearly separable classification problems", Neural Processing Letters, vol. 50, no. 2, pp. 1115-1122, 2019. [DOI:10.1007/s11063-018-9906-5]

Rights and permissions
This journal is an open-access journal licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).