Handling the problems and opportunities posed by multiple on-chip memory controllers

Bibliographic Details
Published in	PACT '10 : proceedings of the Nineteenth International Conference on Parallel Architectures and Compilation Techniques : September 11-15, 2010, Vienna, Austria pp. 319 - 330
Main Authors	Awasthi, Manu, Nellans, David, Sudan, Kshitij, Balasubramonian, Rajeev, Davis, Al
Format	Conference Proceeding
Language	English
Published	ACM 11.09.2010
Subjects	Data Placement Delays DRAM chips DRAM Management Memory Controller Design Memory management Multicore processing Pins System-on-chip
DOI	10.1145/1854273.1854314

Abstract	Modern processors such as Tilera's Tile64, Intel's Nehalem, and AMD's Opteron are migrating memory controllers (MCs) on-chip, while maintaining a large, flat memory address space. This trend to utilize multiple MCs will likely continue and a core or socket will consequently need to route memory requests to the appropriate MC via an inter- or intra-socket interconnect fabric similar to AMD's HyperTransport™, or Intel's Quick-Path Interconnect™. Such systems are therefore subject to non-uniform memory access (NUMA) latencies because of the time spent traveling to remote MCs. Each MC will act as the gateway to a particular piece of the physical memory. Data placement will therefore become increasingly critical in minimizing memory access latencies. To date, no prior work has examined the effects of data placement among multiple MCs in such systems. Future chip-multiprocessors are likely to comprise multiple MCs and an even larger number of cores. This trend will increase the memory access latency variation in these systems. Proper allocation of workload data to the appropriate MC will be important in reducing the latency of memory service requests. The allocation strategy will need to be aware of queuing delays, on-chip latencies, and row-buffer hit-rates for each MC. In this paper, we propose dynamic mechanisms that take these factors into account when placing data in appropriate slices of the physical memory. We introduce adaptive first-touch page placement, and dynamic page-migration mechanisms to reduce DRAM access delays for multi-MC systems. These policies yield average performance improvements of 17% for adaptive first-touch page-placement, and 35% for a dynamic page-migration policy.
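Illustrative note: the abstract names three factors that a placement policy should weigh for each MC (queuing delay, on-chip latency, and row-buffer hit rate) but does not give the paper's actual cost function. The following is only a minimal sketch of how such a first-touch decision could be expressed, assuming a simple weighted sum. The class `MemoryController`, the functions `placement_cost` and `choose_controller`, the weights, and the `row_miss_penalty` value are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class MemoryController:
    """Hypothetical per-MC statistics as seen from the requesting core."""
    name: str
    queue_delay: float   # average queuing delay at this MC, in cycles (assumed metric)
    hop_latency: float   # on-chip network latency from the core to this MC, in cycles
    row_hit_rate: float  # fraction of accesses that hit an open row buffer, 0.0 to 1.0


def placement_cost(mc, w_queue=1.0, w_hops=1.0, w_row=1.0, row_miss_penalty=100.0):
    """Assumed weighted-sum cost; lower is better. Not the paper's formula."""
    # Expected extra cycles paid for row-buffer misses at this MC.
    expected_row_penalty = (1.0 - mc.row_hit_rate) * row_miss_penalty
    return (w_queue * mc.queue_delay
            + w_hops * mc.hop_latency
            + w_row * expected_row_penalty)


def choose_controller(controllers):
    """Pick the MC with the lowest estimated cost for a newly touched page."""
    return min(controllers, key=placement_cost)


if __name__ == "__main__":
    mcs = [
        MemoryController("MC0", queue_delay=80.0, hop_latency=5.0, row_hit_rate=0.5),
        MemoryController("MC1", queue_delay=10.0, hop_latency=20.0, row_hit_rate=0.6),
    ]
    # MC0 is nearer but heavily loaded, so the farther, lightly loaded MC1 is chosen.
    print(choose_controller(mcs).name)
```

In this sketch a purely distance-based first-touch policy would pick MC0, whereas the weighted cost prefers MC1 because of its shorter queue. That is the kind of trade-off the abstract describes, although the real mechanisms may weigh the factors quite differently.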
Author Affiliations	Awasthi, Manu (manua@cs.utah.edu); Nellans, David (dnellans@cs.utah.edu); Sudan, Kshitij (kshitij@cs.utah.edu); Balasubramonian, Rajeev (rajeev@cs.utah.edu); Davis, Al (ald@cs.utah.edu). All authors: School of Computing, University of Utah, Salt Lake City, UT, USA
Discipline	Computer Science
EISBN	9781450301787 1450301789
URI	https://ieeexplore.ieee.org/document/7851531