Handling the problems and opportunities posed by multiple on-chip memory controllers

Bibliographic Details
Published in PACT '10 : proceedings of the Nineteenth International Conference on Parallel Architectures and Compilation Techniques : September 11-15, 2010, Vienna, Austria, pp. 319-330
Main Authors Awasthi, Manu; Nellans, David; Sudan, Kshitij; Balasubramonian, Rajeev; Davis, Al
Format Conference Proceeding
Language English
Published ACM 11.09.2010
DOI 10.1145/1854273.1854314

Abstract Modern processors such as Tilera's Tile64, Intel's Nehalem, and AMD's Opteron are migrating memory controllers (MCs) on-chip while maintaining a large, flat memory address space. This trend toward multiple MCs will likely continue, and a core or socket will consequently need to route memory requests to the appropriate MC via an inter- or intra-socket interconnect fabric similar to AMD's HyperTransport™ or Intel's QuickPath Interconnect™. Such systems are therefore subject to non-uniform memory access (NUMA) latencies because of the time spent traveling to remote MCs. Each MC will act as the gateway to a particular piece of the physical memory. Data placement will therefore become increasingly critical in minimizing memory access latencies. To date, no prior work has examined the effects of data placement among multiple MCs in such systems. Future chip multiprocessors are likely to comprise multiple MCs and an even larger number of cores. This trend will increase the memory access latency variation in these systems. Proper allocation of workload data to the appropriate MC will be important in reducing the latency of memory service requests. The allocation strategy will need to be aware of queuing delays, on-chip latencies, and row-buffer hit rates for each MC. In this paper, we propose dynamic mechanisms that take these factors into account when placing data in appropriate slices of the physical memory. We introduce adaptive first-touch page placement and dynamic page-migration mechanisms to reduce DRAM access delays for multi-MC systems. These policies yield average performance improvements of 17% for adaptive first-touch page placement and 35% for a dynamic page-migration policy.
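The abstract names three factors a placement policy should weigh for each MC: queuing delay, on-chip latency, and row-buffer hit rate. The Python sketch below illustrates one way such a decision could be expressed. It is not the authors' cost function: the additive latency estimate, the `row_miss_penalty` and `migration_overhead` parameters, and all class and function names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MemoryController:
    """Hypothetical per-MC statistics; field names are illustrative."""
    mc_id: int
    queue_delay: float   # average queuing delay at this MC, in cycles
    row_hit_rate: float  # fraction of accesses that hit an open row (0..1)


def estimated_latency(mc, hop_latency, row_miss_penalty=30.0):
    """Assumed additive model: on-chip travel + queuing + expected row-miss cost."""
    return hop_latency + mc.queue_delay + (1.0 - mc.row_hit_rate) * row_miss_penalty


def choose_mc(core_id, controllers, hop_latency):
    """On a page's first touch by core_id, pick the MC with the lowest estimate.

    hop_latency[core_id][mc_id] is the assumed on-chip latency in cycles.
    """
    best = min(
        controllers,
        key=lambda mc: estimated_latency(mc, hop_latency[core_id][mc.mc_id]),
    )
    return best.mc_id


def migration_target(current_mc_id, core_id, controllers, hop_latency,
                     migration_overhead=20.0):
    """Return a new MC id if moving the page saves more than the assumed overhead."""
    by_id = {mc.mc_id: mc for mc in controllers}
    current = estimated_latency(by_id[current_mc_id],
                                hop_latency[core_id][current_mc_id])
    best_id = choose_mc(core_id, controllers, hop_latency)
    best = estimated_latency(by_id[best_id], hop_latency[core_id][best_id])
    return best_id if current - best > migration_overhead else None


# Example: core 0 sits next to MC 0, but MC 0 is heavily loaded.
controllers = [
    MemoryController(mc_id=0, queue_delay=40.0, row_hit_rate=0.5),
    MemoryController(mc_id=1, queue_delay=5.0, row_hit_rate=0.6),
]
hop_latency = {0: {0: 2.0, 1: 10.0}}

placed_on = choose_mc(0, controllers, hop_latency)
move_to = migration_target(0, 0, controllers, hop_latency)
```

In this example a purely distance-based first-touch policy would place the page on the nearest controller (MC 0), whereas the load-aware estimate prefers the more distant but lightly loaded MC 1. The numbers are invented to show the trade-off and carry no relation to the results reported in the paper.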
Author Affiliations
Awasthi, Manu (manua@cs.utah.edu), School of Computing, University of Utah, Salt Lake City, UT, USA
Nellans, David (dnellans@cs.utah.edu), School of Computing, University of Utah, Salt Lake City, UT, USA
Sudan, Kshitij (kshitij@cs.utah.edu), School of Computing, University of Utah, Salt Lake City, UT, USA
Balasubramonian, Rajeev (rajeev@cs.utah.edu), School of Computing, University of Utah, Salt Lake City, UT, USA
Davis, Al (ald@cs.utah.edu), School of Computing, University of Utah, Salt Lake City, UT, USA
Discipline Computer Science
EISBN 9781450301787 (ISBN-13), 1450301789 (ISBN-10)
Subject Terms Data Placement; Delays; DRAM chips; DRAM Management; Memory Controller Design; Memory management; Multicore processing; Pins; System-on-chip
URI https://ieeexplore.ieee.org/document/7851531