Running jobs on Intel Xeon Phi coprocessors
The Intel Xeon Phi™ coprocessor works alongside Intel Xeon processors and can deliver substantial performance gains for highly parallel codes such as LAMMPS. It communicates with the Xeon processors (the host) over a PCIe interface. Peregrine has 288 nodes equipped with Xeon Phi coprocessors (n1297-n1584), and the coprocessors have an aggregate performance of about 582 TeraFlops. Each node contains two 8-core Sandy Bridge processors and two Xeon Phi cards. Each Xeon Phi coprocessor has 60 processor cores, and the architecture provides 4 threads per core, for up to 240 threads per card. Each coprocessor also has 8 GB of local, high-bandwidth memory.
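The thread counts quoted above follow from simple multiplication. A quick sketch in shell arithmetic, using only the figures from the paragraph (the variable names are illustrative, not part of any Peregrine tooling):

```shell
# Figures taken from the hardware description above.
cores_per_card=60
threads_per_core=4
cards_per_node=2

# Maximum hardware threads on one card and on one node.
threads_per_card=$((cores_per_card * threads_per_core))
threads_per_node=$((threads_per_card * cards_per_node))

echo "threads per card: ${threads_per_card}"
echo "threads per node (coprocessors only): ${threads_per_node}"
```

These values are upper bounds on coprocessor threads; the best thread count for a given application may be lower.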
How to Request Phi nodes
The phi queue has 32 nodes with Xeon Phi coprocessors, the short queue has 4, and the debug queue has 2.
To request one node from the phi queue for an interactive job:
qsub -I -l nodes=1 -l walltime=0:20:00 -q phi -A <project-account>
To submit a batch job, add the following line to your PBS script:
#PBS -q phi
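For context, a minimal batch script might look like the sketch below. It reuses the resource requests from the interactive example; the job name and the final command are placeholders, and <project-account> must be replaced with your own allocation. PBS_O_WORKDIR is the standard PBS variable holding the directory from which the job was submitted.

```shell
#!/bin/bash
#PBS -N phi_example
#PBS -q phi
#PBS -l nodes=1
#PBS -l walltime=0:20:00
#PBS -A <project-account>

# Start in the submission directory; fall back to the current
# directory when the script is run outside of PBS.
job_dir="${PBS_O_WORKDIR:-$PWD}"
cd "$job_dir"

# Placeholder workload: replace with your application's launch command.
echo "Job running on $(hostname) in ${job_dir}"
```

Submit the script with qsub followed by the script file name.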
To request Phi nodes from the short or debug queues, add "-l feature=phi". For example, the following command requests one node from the short queue for an interactive job:
qsub -I -l nodes=1 -l walltime=0:20:00 -q short -l feature=phi -A <project-account>
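The queue rules above can be captured in a small wrapper. The sketch below is a hypothetical helper (phi_interactive_cmd is not a Peregrine command) that prints the qsub command line for an interactive single-node job, adding "-l feature=phi" only for the short and debug queues. It prints the command rather than running it, so the result can be checked first.

```shell
# Hypothetical helper: print the qsub command for an interactive Phi job.
# Usage: phi_interactive_cmd <queue> <project-account> [walltime]
phi_interactive_cmd() {
    pic_queue="$1"
    pic_account="$2"
    pic_walltime="${3:-0:20:00}"
    pic_feature=""

    case "$pic_queue" in
        phi)
            # Requests to the phi queue need no extra feature flag.
            ;;
        short|debug)
            # These queues need an explicit Phi feature request.
            pic_feature=" -l feature=phi"
            ;;
        *)
            echo "unsupported queue: $pic_queue" >&2
            return 1
            ;;
    esac

    echo "qsub -I -l nodes=1 -l walltime=${pic_walltime} -q ${pic_queue}${pic_feature} -A ${pic_account}"
}
```

For example, "phi_interactive_cmd debug myproject" prints a command targeting the debug queue with the Phi feature request, where myproject stands in for a real project account.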
Applications supporting Xeon Phi on Peregrine
LAMMPS: LAMMPS can offload some of its calculations to the Phi cards. For more information about running LAMMPS with the Phi, see the LAMMPS web page.
Amber: To be added soon.
Other Applications
How to decide if using Xeon Phi is right for your application
Tutorials on how to port and optimize code for Intel Xeon Phi many-core processors
Programming and Compiling for Intel Many Integrated Core Architecture