QueueAdministration
From MDWiki
Latest revision as of 01:07, 12 June 2008
Setup
- Maui is the fair-share software, Torque-PBS is the queue server
- Configuration in /usr/spool/PBS
- PBS User Guide
- PBS Administrators Guide
- Nodes run pbs-mom service (/etc/init.d/...)
- Grape runs pbs and maui services (/etc/init.d/...)
Look for ghost processes on GRAPE
On grape:
$>sudo cexec uptime
If a node has a load below 1 or above 3, it should be checked. If a process has been running for more than 48 hours while other jobs are running on the node, kill it.
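The load check can be scripted. The following is a minimal sketch, not a tested site tool: `check_load` is a hypothetical helper name, and it assumes each node's uptime line is preceded by a line identifying the node. How cexec actually labels its output is not stated on this page, so the function simply treats the most recent non-empty, non-uptime line as the label.

```shell
# check_load (hypothetical helper): read labeled uptime output on stdin and
# report every node whose 1-minute load average is below 1 or above 3.
# Assumption: a line naming the node appears before that node's uptime line.
check_load() {
    awk '
        match($0, /load averages?: */) {
            # Text after "load average:" looks like "0.15, 0.10, 0.05".
            split(substr($0, RSTART + RLENGTH), parts, /[, ]+/)
            load = parts[1] + 0
            name = (label == "") ? "(unlabeled)" : label
            if (load < 1 || load > 3)
                print name ": load " load
            next
        }
        # Any other non-empty line is remembered as the current node label.
        NF { label = $0 }
    '
}
```

It could be used as `sudo cexec uptime | check_load`. Running it under `LC_ALL=C` avoids locales that print decimal commas, which this sketch does not parse.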
Also, if showq shows a job that has been running for more than 48 hours, talk to the owner and ask them to kill it.
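To list processes on a node that have exceeded the 48-hour limit, a filter along these lines may help. It is a sketch under stated assumptions: `long_running` is a hypothetical helper name, and the input is expected to be `PID USER ELAPSED_SECONDS COMMAND` lines, which a procps-ng `ps` can produce with the `etimes` column (older `ps` versions may not have it).

```shell
# long_running (hypothetical helper): read "PID USER ELAPSED_SECONDS COMMAND"
# lines on stdin and print those running for more than 48 hours,
# prefixed with the elapsed time in hours.
long_running() {
    awk -v limit=$((48 * 3600)) '
        $3 + 0 > limit { printf "%.1fh\t%s\n", $3 / 3600, $0 }
    '
}
```

On a node it could be run as `ps -eo pid=,user=,etimes=,comm= | long_running`. The output only identifies candidates; whether a process is really a ghost still has to be judged by hand.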
Checking HD
Find the faulty HD (usually the main one):
$>df -h
Then fix it (assuming it is /dev/sda; e2fsck should only be run on an unmounted filesystem):
$>e2fsck /dev/sda1
Reboot:
$>shutdown -r now