QueueAdministration

Setup

  • Maui is the fair-share software, Torque-PBS is the queue server
  • Configuration in /usr/spool/PBS
  • PBS User Guide
  • PBS Administrators Guide
  • Nodes run pbs-mom service (/etc/init.d/...)
  • Grape runs pbs and maui services (/etc/init.d/...)

Look for ghost processes on GRAPE

on grape:

$>sudo cexec uptime

If one of the nodes has a load below 1 or above 3, it should be checked. If a process has been running for more than 48h while other jobs are running on the node, kill it.
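The load check above can be sketched as a small filter. This is only an illustration: the function name flag_load is hypothetical, and it assumes that cexec prints a header line naming each node before that node's uptime output, which may differ on your installation.

```shell
# Flag nodes whose 1-minute load average is below 1 or above 3.
# Reads uptime-style output on stdin. Any non-empty line without a
# "load average" field is remembered as the header for the next uptime
# line (assumption: that header is what names the node).
flag_load() {
    awk -F'load averages?: ' '
        NF < 2 { if ($0 != "") header = $0; next }
        {
            split($2, loads, ", ")          # loads[1] is the 1-minute average
            if (loads[1] + 0 < 1 || loads[1] + 0 > 3)
                print "CHECK " header ": load " loads[1]
        }
    '
}
```

On grape it could be used as `sudo cexec uptime | flag_load`. Nodes with a load between 1 and 3 produce no output.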

Also, if showq shows a process that has been running for more than 48h, talk with the owner and ask them to kill it.
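To find candidates for the 48h rule on a single node, something like the following may help. It is a sketch, not part of the original procedure: the function name long_running is hypothetical, and it assumes a ps that supports the etimes (elapsed seconds) column, as procps-ng does.

```shell
# Print processes whose elapsed time exceeds a limit in seconds
# (default 172800 s = 48 h). Expects "pid user etimes command" rows
# on stdin, for example from:
#   ps -eo pid,user,etimes,comm --no-headers
long_running() {
    awk -v limit="${1:-172800}" '
        $3 + 0 > limit + 0 {
            printf "%s %s %.1fh %s\n", $1, $2, $3 / 3600, $4
        }
    '
}
```

For example, `ps -eo pid,user,etimes,comm --no-headers | long_running` on a node. System daemons also run longer than 48h, so the output still has to be read by hand before anything is killed.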

Checking HD

Find the faulty HD (usually the main one)

$>df -h

Then fix it (assuming the disk is /dev/sda)

$>e2fsck /dev/sda1
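e2fsck is meant to be run on an unmounted filesystem, so it can be worth checking first whether the partition is currently mounted. A minimal sketch, assuming a Linux node with /proc/mounts; the function name is_mounted is hypothetical:

```shell
# Succeed if the given device is listed as a mounted source.
# The optional second argument is the mounts table to read; it defaults
# to /proc/mounts and is a parameter only to make the function testable.
is_mounted() {
    awk -v dev="$1" '
        $1 == dev { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}
```

For example, `is_mounted /dev/sda1 && echo "still mounted"`. If the partition is mounted, `e2fsck -n /dev/sda1` opens the filesystem read-only and answers "no" to every question, so it reports problems without changing anything.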

Reboot

$>shutdown -r now