Two-Pass Softmax Algorithm

Bibliographic Details
Published in: 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 386-395
Main Authors: Dukhan, Marat; Ablavatski, Artsiom
Format: Conference Proceeding
Language: English
Published: IEEE, 01.05.2020
DOI: 10.1109/IPDPSW50202.2020.00074

Abstract: The softmax (also called softargmax) function is widely used in machine learning models to normalize real-valued scores into a probability distribution. To avoid floating-point overflow, the softmax function is conventionally implemented in three passes: the first pass to compute the normalization constant, and two other passes to compute outputs from normalized inputs. We analyze two variants of the Three-Pass algorithm and demonstrate that in a well-optimized implementation on HPC-class processors the performance of all three passes is limited by memory bandwidth. We then present a novel algorithm for softmax computation in just two passes. The proposed Two-Pass algorithm avoids both numerical overflow and the extra normalization pass by employing an exotic representation for intermediate values, where each value is represented as a pair of floating-point numbers: one representing the "mantissa" and another representing the "exponent". Performance evaluation demonstrates that on out-of-cache inputs on an Intel Skylake-X processor the new Two-Pass algorithm outperforms the traditional Three-Pass algorithm by up to 28% in the AVX512 implementation, and by up to 18% in the AVX2 implementation. The proposed Two-Pass algorithm also outperforms the traditional Three-Pass algorithm on Intel Broadwell and AMD Zen 2 processors.
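The idea in the abstract can be illustrated with a minimal scalar sketch: each exp(x_i) is held as a pair m * 2**e that cannot overflow, so the usual pass that subtracts the maximum is unnecessary. This is a plain-Python illustration of the (mantissa, exponent) representation, not the authors' vectorized AVX2/AVX512 implementation; the function and helper names are invented for this example.

```python
import math

def two_pass_softmax(x):
    """Sketch of a two-pass softmax using (mantissa, exponent) pairs.

    Each exp(v) is represented as m * 2**e with integer e and m kept
    in a small range, so no intermediate value can overflow and the
    separate max-finding pass of the conventional algorithm is avoided.
    """
    LOG2E = math.log2(math.e)

    def exp_pair(v):
        # Split exp(v) = m * 2**e: e is the integer power of two,
        # m = exp(v - e*ln2) is a residual mantissa in roughly [1, 2).
        e = math.floor(v * LOG2E)
        m = math.exp(v - e / LOG2E)
        return m, e

    # Pass 1: accumulate the normalizer as a (mantissa, exponent) pair,
    # rescaling the running sum whenever a larger exponent appears.
    sum_m, sum_e = 0.0, -math.inf  # represents sum_m * 2**sum_e == 0
    for v in x:
        m, e = exp_pair(v)
        if e > sum_e:
            sum_m = sum_m * 2.0 ** (sum_e - e) + m
            sum_e = e
        else:
            sum_m += m * 2.0 ** (e - sum_e)

    # Pass 2: recompute each pair and divide by the accumulated sum.
    return [(m / sum_m) * 2.0 ** (e - sum_e) for m, e in map(exp_pair, x)]
```

Note that pass 2 reads the input again and recomputes exp_pair rather than storing intermediates, which is what keeps the algorithm at two memory passes; inputs such as 1000.0 that would overflow a naive exp() are handled because only the small integer exponent grows.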
Authors: Dukhan, Marat (Google Research, Mountain View, CA, USA); Ablavatski, Artsiom (Google Research, Mountain View, CA, USA)
EISBN: 1728174457; 9781728174457
End Page: 395
External Document ID: 9150394
Genre: original research
Page Count: 10
Publication Date: May 2020 (2020-05-01)
Publication Title: 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
Publication Title Abbreviation: IPDPSW
Publisher: IEEE
Source: IEEE (Publisher)
Start Page: 386
Subject Terms:
Approximation algorithms
Bandwidth
Computational modeling
Machine learning
Machine learning algorithms
Probability distribution
Program processors
SIMD
softargmax
softmax
Title: Two-Pass Softmax Algorithm
URI: https://ieeexplore.ieee.org/document/9150394