Presentation

Paper: LIBXSMM: Accelerating Small Matrix Multiplications by Runtime Code Generation

Session: Accelerating Science

Session Chair: Michael Bader

Event Type: Paper

Event Tags: Applications, Intermediate, Performance, Scientific Computing

Time: Thursday, November 17th, 4:30pm - 5pm

Location: 255-EF

Description: Many modern, highly scalable scientific simulation packages rely on small matrix multiplications as their main computational engine. Math libraries or compilers are unlikely to provide the best possible kernel performance. To address this issue, we present a library which provides high-performance small matrix multiplications targeting all recent x86 vector instruction set extensions up to Intel AVX-512. Our evaluation proves that speed-ups of more than 10X are possible, depending on the CPU and application. These speed-ups are achieved by a combination of several novel technologies. We use a code generator which has a built-in architectural model to create code which runs best without requiring an auto-tuning phase. Since such code is very specialized, we leverage just-in-time compilation to build only the required kernel variant at runtime. To keep ease-of-use, overhead, and kernel management under control, we accompany our library with a BLAS-compliant frontend which features a multi-level code-cache hierarchy.

Paper provided by the IEEE Computer Society
Paper also available from the ACM Digital Library

Authors

Alexander Heinecke (presenting), Intel Corporation
Greg Henry (presenting), Intel Corporation
Maxwell Hutchinson, University of Chicago
Hans Pabst, Intel Corporation
