The changing relevance of the TLB

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

A little over a decade ago, Goto and van de Geijn
wrote about the importance of the treatment of the translation
lookaside buffer (TLB) on the performance of matrix multiplication
[1]. Crucially, they did not say how important, nor did
they provide results that would allow the reader to make his
own judgement. In this paper, we revisit their work and look
at the effect on the performance of their algorithm when built
with different assumed data TLB sizes. Results on three different
processors, one relatively modern, two contemporary with Goto
and van de Geijn’s writings ([1] and [2]), are examined and
compared within a real-world context. Our findings show that,
although important when aiming for a place in the TOP500 [3]
list, these features have little practical effect on the architectures
we have chosen. We conclude, then, that the relative importance
of the various factors that must be taken into account when
tuning matrix multiplication (GEMM, the heart of the High
Performance LINPACK benchmark, and hence of the TOP500
table) differs dramatically between processors.
Original language: English
Title of host publication: 2013 12th International Symposium on Distributed Computing and Applications to Business, Engineering & Science (DCABES)
Place of Publication: Piscataway, NJ
Publisher: IEEE
Pages: 110-114
Number of pages: 6
DOIs: https://doi.org/10.1109/DCABES.2013.27
Publication status: Published - Sep 2013
Event: DCABES 2013 - Kingston-on-Thames, United Kingdom
Duration: 2 Sep 2013 to 4 Sep 2013

Conference

Conference: DCABES 2013
Country: United Kingdom
City: Kingston-on-Thames
Period: 2/09/13 to 4/09/13

Keywords

  • Linpack performance
  • TLB

ASJC Scopus subject areas

  • Computer Science (all)

Cite this

Jones, J., Davenport, J., & Bradford, R. (2013). The changing relevance of the TLB. In 2013 12th International Symposium on Distributed Computing and Applications to Business, Engineering & Science (DCABES) (pp. 110-114). Piscataway, NJ: IEEE. https://doi.org/10.1109/DCABES.2013.27

The changing relevance of the TLB. / Jones, Jessica; Davenport, James; Bradford, Russell.

2013 12th International Symposium on Distributed Computing and Applications to Business, Engineering & Science (DCABES). Piscataway, NJ : IEEE, 2013. p. 110-114.

Jones, J, Davenport, J & Bradford, R 2013, The changing relevance of the TLB. in 2013 12th International Symposium on Distributed Computing and Applications to Business, Engineering & Science (DCABES). IEEE, Piscataway, NJ, pp. 110-114, DCABES 2013, Kingston-on-Thames, United Kingdom, 2/09/13. https://doi.org/10.1109/DCABES.2013.27
Jones J, Davenport J, Bradford R. The changing relevance of the TLB. In 2013 12th International Symposium on Distributed Computing and Applications to Business, Engineering & Science (DCABES). Piscataway, NJ: IEEE. 2013. p. 110-114 https://doi.org/10.1109/DCABES.2013.27
Jones, Jessica ; Davenport, James ; Bradford, Russell. / The changing relevance of the TLB. 2013 12th International Symposium on Distributed Computing and Applications to Business, Engineering & Science (DCABES). Piscataway, NJ : IEEE, 2013. pp. 110-114
@inproceedings{d143c2b8771244888e2b29fe5b43e9f9,
title = "The changing relevance of the TLB",
abstract = "A little over a decade ago, Goto and van de Geijn wrote about the importance of the treatment of the translation lookaside buffer (TLB) on the performance of matrix multiplication [1]. Crucially, they did not say how important, nor did they provide results that would allow the reader to make his own judgement. In this paper, we revisit their work and look at the effect on the performance of their algorithm when built with different assumed data TLB sizes. Results on three different processors, one relatively modern, two contemporary with Goto and van de Geijn’s writings ([1] and [2]), are examined and compared within a real-world context. Our findings show that, although important when aiming for a place in the TOP500 [3] list, these features have little practical effect on the architectures we have chosen. We conclude, then, that the relative importance of the various factors that must be taken into account when tuning matrix multiplication (GEMM, the heart of the High Performance LINPACK benchmark, and hence of the TOP500 table) differs dramatically between processors.",
keywords = "Linpack performance, TLB",
author = "Jessica Jones and James Davenport and Russell Bradford",
year = "2013",
month = "9",
doi = "10.1109/DCABES.2013.27",
language = "English",
pages = "110--114",
booktitle = "2013 12th International Symposium on Distributed Computing and Applications to Business, Engineering \& Science (DCABES)",
publisher = "IEEE",
address = "Piscataway, NJ",

}

TY - GEN

T1 - The changing relevance of the TLB

AU - Jones, Jessica

AU - Davenport, James

AU - Bradford, Russell

PY - 2013/9

Y1 - 2013/9

N2 - A little over a decade ago, Goto and van de Geijn wrote about the importance of the treatment of the translation lookaside buffer (TLB) on the performance of matrix multiplication [1]. Crucially, they did not say how important, nor did they provide results that would allow the reader to make his own judgement. In this paper, we revisit their work and look at the effect on the performance of their algorithm when built with different assumed data TLB sizes. Results on three different processors, one relatively modern, two contemporary with Goto and van de Geijn’s writings ([1] and [2]), are examined and compared within a real-world context. Our findings show that, although important when aiming for a place in the TOP500 [3] list, these features have little practical effect on the architectures we have chosen. We conclude, then, that the relative importance of the various factors that must be taken into account when tuning matrix multiplication (GEMM, the heart of the High Performance LINPACK benchmark, and hence of the TOP500 table) differs dramatically between processors.

AB - A little over a decade ago, Goto and van de Geijn wrote about the importance of the treatment of the translation lookaside buffer (TLB) on the performance of matrix multiplication [1]. Crucially, they did not say how important, nor did they provide results that would allow the reader to make his own judgement. In this paper, we revisit their work and look at the effect on the performance of their algorithm when built with different assumed data TLB sizes. Results on three different processors, one relatively modern, two contemporary with Goto and van de Geijn’s writings ([1] and [2]), are examined and compared within a real-world context. Our findings show that, although important when aiming for a place in the TOP500 [3] list, these features have little practical effect on the architectures we have chosen. We conclude, then, that the relative importance of the various factors that must be taken into account when tuning matrix multiplication (GEMM, the heart of the High Performance LINPACK benchmark, and hence of the TOP500 table) differs dramatically between processors.

KW - Linpack performance

KW - TLB

UR - http://sec.kingston.ac.uk/2013dcabes/

UR - http://dx.doi.org/10.1109/DCABES.2013.27

U2 - 10.1109/DCABES.2013.27

DO - 10.1109/DCABES.2013.27

M3 - Conference contribution

SP - 110

EP - 114

BT - 2013 12th International Symposium on Distributed Computing and Applications to Business, Engineering & Science (DCABES)

PB - IEEE

CY - Piscataway, NJ

ER -