Formatted Output for PBS qstat
We use PBS as our HPC scheduler, and a common thing I want to do is check on the status of my jobs. Vanilla qstat
is pretty bare-bones, though:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
1117644.nodeX* kwaneu gaussia* tiny_solu* 14940 1 16 32gb -- R 00:02
1117645.nodeX* kwaneu gaussia* tiny_solu* 14948 1 16 32gb -- R 00:02
1117646.nodeX* kwaneu gaussia* tiny_solu* 14958 1 16 32gb -- R 00:02
What's hiding after each *? (Unlike SLURM, there doesn't seem to be a way to specify a format string. Why not? Who knows?) What do all these other cryptic fields mean? What if I want other information?
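For contrast, here is roughly what a custom format string looks like with SLURM's squeue (a sketch; the exact format codes and widths are worth checking against the squeue man page on your system):

```shell
# squeue lets you pick columns and widths directly with -o:
# %i job ID, %j name, %u user, %C CPUs, %M elapsed time, %t state, %P partition
squeue -u "$USER" -o "%.18i %.30j %.8u %.4C %.10M %.2t %.15P"
```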
A better way is to parse the JSON output:
> qstat -f -Fjson
...
"1117658.somecomputer.merck.com":{
"Job_Name":"tiny_solutes-02149",
"Job_Owner":"kwaneu@somecomputer",
"resources_used":{
"cpupercent":6,
"cput":"00:16:59",
"mem":"8618044kb",
"ncpus":16,
"vmem":"8618044kb",
"walltime":"00:05:08"
},
"job_state":"R",
"queue":"gaussian_medium",
...
This has all the information, but it's overwhelming for a bunch of jobs. You could parse it with Python, but the problem is that the interpreter and modules take a while to load. A better way is to use an awk script. Here it is!
# script for parsing qstat output
# Eugene Kwan, June 2024
#
# example usage:
# qstat -f -Fjson | awk -f qstat.awk

# split fields on double quotes, so quoted keys and values
# land in the even-numbered fields ($2, $4, ...)
BEGIN { FS="\"" }

# each job record starts with its fully qualified job ID, e.g.
# "1117658.somecomputer.merck.com":{
/merck.com":{$/ {
    job_count++
    ids[job_count] = $2
    in_resource = 0
}

/"Job_Name"/ {
    names[job_count] = $4
}

# Job_Owner has the form user@host; keep just the username
/"Job_Owner"/ {
    split($4, fields, "@")
    owners[job_count] = fields[1]
}

# numeric values are unquoted, so the value stays inside $3 (e.g. ":16,")
/"ncpus"/ {
    split($3, fields, "[:,]")
    cpus[job_count] = fields[2]
}

# flag that we are inside the resources_used block, so the "mem" rule
# below picks up actual usage rather than the requested limit
/resources_used/ {
    in_resource = 1
}

# strip the units (e.g. "8618044kb" -> 8618044) and convert kb to GB
/"mem"/ && in_resource == 1 {
    gsub("[^0-9]", "", $4)
    memory[job_count] = $4 / 1000000
    in_resource = 0
}

/"walltime"/ {
    walltime[job_count] = $4
}

/"job_state"/ {
    state[job_count] = $4
}

/"queue"/ {
    queue[job_count] = $4
}

END {
    print "job_id job_name user ncpus memory walltime state queue"
    print "-------------------------------------------------------------------------------------------------------------------------------------------------"
    for (i=1; i <= job_count; i++) {
        # only show my own jobs; change this for your username
        if (owners[i] != "kwaneu")
            continue
        printf "%30s %50s %12s %3d %6.1f GB %10s %2s %15s\n", \
            ids[i], names[i], owners[i], cpus[i], memory[i], walltime[i], state[i], queue[i]
        n_jobs++
        if (state[i] == "R")
            n_running++
    }
    printf "\nTotal: %d jobs (%d running)\n", n_jobs, n_running
}
Here’s an example of the output:
> qstat -f -Fjson | awk -f qstat.awk
job_id job_name user ncpus memory walltime state queue
-------------------------------------------------------------------------------------------------------------------------------------------------
1117843.somecomputer.merck.com tiny_solutes-02334 kwaneu 16 0.7 GB 00:03:18 R gaussian_medium
1117844.somecomputer.merck.com tiny_solutes-02335 kwaneu 16 0.6 GB 00:03:17 R gaussian_medium
1117845.somecomputer.merck.com tiny_solutes-02336 kwaneu 16 0.7 GB 00:03:17 R gaussian_medium
Total: 3 jobs (3 running)
The results come back pretty much instantly!
Note that I didn't actually write a full JSON parser. I'm just using the simplest possible search patterns to extract the metadata (for example, the job-record pattern matches on merck.com), so you might need to alter them for your system.
Another thing is that qstat
doesn't seem to take a username or other search filters in JSON mode. This script just filters for my jobs, so you might need to alter this too.
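Since the filtering happens in awk anyway, one tweak (not in the script above) is to pass the username in with awk's -v option instead of hardcoding it. Here's a self-contained demo of the mechanism, using the same quote-delimited field splitting as the script:

```shell
# feed two fake Job_Owner lines through the same kind of rule the
# script uses, but compare against a "user" variable set with -v
printf '%s\n' \
    '"Job_Owner":"kwaneu@somecomputer",' \
    '"Job_Owner":"alice@somecomputer",' |
awk -F'"' -v user="kwaneu" '
    /"Job_Owner"/ {
        split($4, f, "@")       # strip the @host part
        if (f[1] == user)
            print "matched:", f[1]
    }'
```

In the real script, you would invoke it as `qstat -f -Fjson | awk -v user="$USER" -f qstat.awk` and change the check in the END block to `if (user != "" && owners[i] != user) continue`.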