25-10-2016, 02:09 PM
I. INTRODUCTION
Intel Xeon Phi is the latest high-throughput architecture targeted at high performance computing and, without a doubt, will be part of the very next generation of supercomputers that will challenge the TOP500¹. To achieve its high level of performance (1000 GFlops), Intel Xeon Phi [1] uses over 50 cores and 25 MB of on-chip caches. Despite the features it shares with multi-core CPUs and many-core GPUs (vectorization, SIMD/SIMT, high throughput, and high bandwidth) [2], Xeon Phi has a different architecture from all of them [3]. For example, overall cache coherency is not available on GPUs, while the ring interconnect is not used on CPUs and GPUs.
For advanced users (as most high performance computing (HPC) programmers and compiler developers are), it is essential to understand this architecture in detail, as the achieved performance depends on each of these details. For example, knowing the requirements for density and placement of threads per core, the optimal utilization of the core interconnections,