Kill a Banner Job

We may occasionally receive or initiate requests to kill Banner jobs or processes if:

  • A job is performing poorly (loop, inefficient query, &c)

  • A job is preventing other jobs from running or performing efficiently (UC4 queue doesn't allow enough tasks, Banner record locks, &c)

  • A job did not terminate normally (unexpected prompt, started before database bounce, &c)

  • A job is submitted unintentionally

You need to use your professional judgement to determine whether a job should really be killed. For example, if the problem is that a UC4 queue has reached its task limit, it may be more appropriate to increase that limit than to kill jobs currently running in that queue. Or, if a job is known to run a long time under certain circumstances, it may be more appropriate to let that job run to completion.

We may not always be able to respond to a kill request. The job may have already terminated by itself, or the requestor may not be authorized to request termination of a job. A person requesting termination of a job must be one of the following:

  • the person that submitted the job
  • the supervisor of the person that submitted the job
  • the DTS staff in the office of the person that submitted the job
  • the data steward responsible for the data processed by the job
  • the DBA or authorized agent of the DBA

There are several ways to kill a Banner job:

  • Using UC4 (if the process is truly a Banner job)
  • Using a database command prompt or GUI (if the process is any session connected to the Banner database)
  • Using the operating system kill command (if the process is any process on any Banner server)

NOTE: It is not desirable to use the operating system kill command to kill the job; you should always try one of the other procedures first. The operating system kill command should be used only as a last resort.

UC4

Information and Tools Needed

To perform this task:

  1. You need to know which job to kill. When a Banner user submits a job, the job submission form (GJAPCTL) reports a job sequence number to them. UC4 assigns the job to an online queue and runs it. The UC4 job ID does not correspond with the Banner job sequence number. You will have to try to determine which job to kill based on the userid of the person who submitted it, the name of the job, the time the job was submitted, whether the job is hogging system resources, &c.

  2. You need Operator or DBA access to UC4.

Identifying the process

Log in to the appropriate UC4 instance using an account with DBA or Operator privileges. If your initial view is not of the backlog, click "Backlog" in the left panel. The running/waiting jobs will appear in the upper panel of the display. Finished/aborted jobs will appear in the lower panel.

Jobs submitted from Banner

Jobs submitted from Banner job submission will have a UC4 job name of the form <database>_<userid>_<bannerjobname>. You should already know the database, userid, and Banner job name; determine the UC4 job name from that.
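
As a minimal sketch of that naming rule, the shell function below joins the three parts with underscores. The function name uc4_job_name and the sample values are hypothetical, and the sketch assumes the parts are used exactly as given, with no change of case.

```shell
# Hypothetical helper: build the UC4 job name for a job submitted from Banner.
# Assumes the name is exactly <database>_<userid>_<bannerjobname>.
uc4_job_name() {
    printf '%s_%s_%s\n' "$1" "$2" "$3"
}

# Usage with made-up values:
#   uc4_job_name PROD jsmith GLBDATA     prints PROD_jsmith_GLBDATA
```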

Look first in the upper panel of the display for all jobs matching the UC4 job name. If there is only one running job with that UC4 job name, this may or may not be the one you are looking for. Check the lower panel of the display to see if any other jobs with that name have recently been run, in case the job you want to kill has already stopped.

If you know the Banner job number, right-click on the job's entry in the backlog and select "Output Files" from the menu. If any file in the file list has the Banner job number in its name, this confirms that the entry is the correct process. If none does, you should see a file named o<uc4jobnumber>.<nn>, where "nn" is an index number. View this file. If there are multiple files with names of this pattern, view the one with the highest index number. In this file, search for the text "ONE_UP=<bannerjobnumber>"; if you find it, this confirms that the entry is the correct process.
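
If the UC4 output files are also readable from a shell, the same text search can be done with grep. This is a sketch only: the function name files_with_one_up is hypothetical, and it assumes the o<uc4jobnumber>.<nn> files sit together in one directory that you can read. The pattern requires that the job number is not followed by another digit, so that a search for 914607 does not also match 9146070.

```shell
# Hypothetical helper: list the UC4 output files in a directory that
# contain ONE_UP=<bannerjobnumber> for exactly the given job number.
# Assumes the files are named o<uc4jobnumber>.<nn> and are readable.
files_with_one_up() {
    dir=$1
    seq=$2
    # -l prints only the names of matching files; the trailing group
    # rejects matches where more digits follow the job number.
    grep -l -E "ONE_UP=${seq}([^0-9]|\$)" "$dir"/o*.* 2>/dev/null
}

# Usage (the directory is an assumption, not a known UC4 path):
#   files_with_one_up /path/to/uc4/output 914607
```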

Once you've found the correct process, you can proceed to kill it. If you have any doubt that you've found the correct process, exit UC4 and use the Database Session procedure below.

Jobs submitted from UC4

Jobs submitted from UC4 may have a name in any format, unlike jobs submitted from Banner. You will need to know the job name, the userid of the submitter, and the UC4 job number. A user who submitted a job from UC4 should have sufficient privileges to kill the job themselves; however, if they don't have this privilege, they should still be able to tell you the required information. This information is sufficient to uniquely identify the process to kill.

Killing the process

Right-click on the job's entry in the backlog and select "Kill 1". You will be asked for confirmation; select "Yes".

If the job was submitted from Banner (the name is in the format <database>_<userid>_<bannerjobname>), right-click again on the job's entry in the backlog and select "Delete 1". If you are asked for confirmation, select "Yes".

If the job was submitted from UC4, determine from the user whether the job should be reset. The user may have to fix a data condition or parameter, or ask UTS to fix it, before you can reset the job.

If they request a reset, right-click on the job's entry in the backlog and select "Reset 1". If you are asked for confirmation, select "Yes". If they do not want the job to be reset, right-click on the job's entry and select "Delete 1".

Jobs submitted from Banner should never be reset.

Database session

Information and Tools Needed

To perform this task:

  1. You need to know which process to kill. When a Banner user submits a job, the job submission form (GJAPCTL) reports a job sequence number to them. This number is sufficient to uniquely identify the job to be killed. If you do not have the job number, you will have to try to determine it based on the userid of the person who submitted it, the name of the job, the time the job was submitted, whether the job is hogging system resources, &c.

  2. You need access to the UNIX command prompt on the job submission server.
  3. If the job submission server and the database server are separate, you also need access to the UNIX command prompt on the database server.
  4. You need DBA access to the Banner database.
  5. While it is not absolutely necessary, it may be helpful to have a GUI tool (such as Oracle Enterprise Manager Console), but only if job submission and the database are on the same server.

If the process you need to kill is a Banner form rather than a job, treat it as a job in a configuration where job submission and the database are on different servers.

Job submission and database on different servers

Identifying the process

If you have the job sequence number (for example, NNN), go to the command prompt of the job submission server, and type the following command:

ps -ef | grep NNN

This will display all occurrences on the server of processes associated with that number. The display shows the name of the operating system user that started the process, the process identifier (PID) number, the parent process identifier (PPID) number, the percentage of CPU, the time the process started, the terminal associated with the process (if any), the CPU time used by the process, and finally, the command string. The job sequence number NNN will occur in the string of the command being executed (for example, "sh /banjobs/PROD/glbdata_NNN.shl"). The number NNN might also be a PID or a PPID either in whole or in part, but we are not interested in those processes; we only care about processes where NNN is part of a file name.
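
One way to see only the rows where the number is part of a file name is to search for the number together with the characters around it in the file name. The sketch below assumes the job script is always named <jobname>_NNN.shl, as in the example above; the function name jobs_for_sequence is hypothetical.

```shell
# Hypothetical helper: keep only ps -ef rows whose command string contains
# a script named <something>_NNN.shl. Rows where NNN is merely a PID or
# PPID do not match, because they lack the leading "_" and trailing ".shl".
jobs_for_sequence() {
    grep "_$1\.shl"
}

# Usage:
#   ps -ef | jobs_for_sequence 914607
```

A side effect of the escaped dot is that the grep process itself is not listed, because its own command string contains a backslash between the number and ".shl".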

If you do not have the job sequence number, you can search on other information. For example, if you know the name of the job and the person that submitted it, you can enter:

ps -ef | grep jobname | grep username

Usually, Banner jobs will be run by the user appmgr, or by the Banner job submission operating system user (banjobs, bjobtest, &c). In rare cases, the username will be the userid of the person who submitted it.

However, if the user submitted the job more than once, the above command will return multiple processes. You will need further information to uniquely identify the job to be killed. Following the trace procedure below may help you to get that information.

We want to trace the process forward through its child processes until we determine the PID of the database process. Once we have found that PID, we will locate it in the database and kill it there.
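
The forward trace can also be sketched as a small script that follows the PPID column rather than repeating grep by hand. This is an illustration under stated assumptions: the function names are hypothetical, the input is a saved copy of "ps -ef" output with the PID in column 2 and the PPID in column 3, and each process in the chain has at most one child of interest (the first child found is followed).

```shell
# Hypothetical helper: print the ps -ef rows (read from stdin) whose
# PPID column (field 3) equals the given PID. An exact comparison avoids
# the partial matches that a plain grep on the number can produce.
children_of() {
    awk -v parent="$1" '$3 == parent'
}

# Hypothetical helper: starting at PID, repeatedly step to the first
# child found in the saved ps output, printing each PID visited.
# The last PID printed is the end of the trace on this server.
trace_forward() {
    pid=$1
    snapshot=$2
    while [ -n "$pid" ]; do
        echo "$pid"
        pid=$(children_of "$pid" < "$snapshot" | awk 'NR == 1 { print $2 }')
    done
}

# Usage:
#   ps -ef > /tmp/ps.out
#   trace_forward <starting PID> /tmp/ps.out
```

Working from a saved snapshot keeps the grep and awk processes started by the script out of the data being searched.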

As a simple example, suppose we are asked to kill the GLBDATA job with the job sequence number 914607. We go to the server, and type:

ps -ef | grep 914607

This may return the following:

banner  2818112 3682714   0 14:21:14  pts/7  0:00 grep 914607
banjobs 2920892 2957636   0 12:08:35      -  0:00 sh /u01/SCT/banjobs/glbdata_914607.shl
banjobs 2957636       1   0 12:08:35      -  0:00 sh /u01/SCT/banjobs/glbdata_914607.shl

One of these processes (PID 2957636) has a PPID of 1, meaning it was started by the root init process; it is the parent of the other one. We select the other one, its child. The PID of that process is 2920892, so we then enter the command:

ps -ef | grep 2920892

This returns:

banner  2793754 3682714   0 14:22:49  pts/7  0:00 grep 2920892
banjobs 2875738 2920892   0 12:08:35      -  0:00 GLBDATA
banjobs 2920892 2957636   0 12:08:35      -  0:00 sh /u01/SCT/banjobs/glbdata_914607.shl

In this example, process 2920892 is the parent of process 2875738. As above, you use the "ps -ef" command with the child PID number to trace through the process table. On a database server, you could continue this trace; however, because the database is on a different server in this example, your trace will end just before you get to a database process. Using the above example, the trace would have ended when you found PID 2875738. Querying on this PID would have returned:

banjobs 2875738 2920892   0 12:08:35      -  0:00 GLBDATA
banner  3060016 3682714   0 14:23:11  pts/7  0:00 grep 2875738

When you see only the process you'd found earlier and the grep that searched for it, you've searched as far as you can go. At this point, you must query the v$session table in the database to find the database process. Log in to the database as a DBA, and enter the following query:

select username, machine, status, substr(sid||','||serial#,1,11) sid_serial,
       process , substr(to_char(logon_time,'dd-mon hh24:mi'),1,12) logon,
       round(LAST_CALL_ET/60,0) i_min, substr(module,1,10) Module
from   v$session
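where  process = 'NNN';

where NNN is the last PID you found in your trace. Keep the single quotes: PROCESS is a character column, and comparing it with an unquoted number can fail with an invalid-number error if any session reports a non-numeric process value. You will see a display like: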

USERNAME   MACHINE    S SID_SERIAL       PROCESS      LOGON        I_MIN MODULE
---------- ---------- - ---------------- ------------ ------------ ----- ----------
USERNAME   jobserver  I 1866,57          2875738      20-apr 08:03     0 JOBNAME

The numbers in the "SID_SERIAL" column (separated by a comma) identify the database process.

Killing the process

In the same database session you used to query the v$session table, enter the following command:

alter system kill session 'SID_SERIAL' immediate;

where SID_SERIAL is a single-quoted string containing the two numbers from the SID_SERIAL column in your query, including the comma that separates them. To kill the JOBNAME process we identified above, we would enter:

alter system kill session '1866,57' immediate;
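
Because a mistyped SID_SERIAL value could kill the wrong session, it may help to generate the statement from the value copied out of the query result. The sketch below only prints the statement; it does not connect to the database. The function name kill_session_sql is hypothetical, and the check assumes the value is always two groups of digits separated by one comma.

```shell
# Hypothetical helper: validate a SID,SERIAL value and print the matching
# kill statement. Nothing is executed against the database.
kill_session_sql() {
    case $1 in
        # Reject empty input, any character other than digits and commas,
        # more than one comma, or a leading or trailing comma.
        "" | *[!0-9,]* | *,*,* | ,* | *,)
            echo "expected SID,SERIAL such as 1866,57" >&2
            return 1 ;;
        # Exactly one comma with digits on both sides.
        *,*)
            printf "alter system kill session '%s' immediate;\n" "$1" ;;
        # Digits only, no comma.
        *)
            echo "expected SID,SERIAL such as 1866,57" >&2
            return 1 ;;
    esac
}

# Usage:
#   kill_session_sql 1866,57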

Job submission and database on same server

Our Banner environment is no longer in this configuration. The instructions are kept for historical purposes.

Identifying the process

If you have the job sequence number (for example, NNN), go to the command prompt of the job submission server, and type the following command:

ps -ef | grep NNN

This will display all occurrences on the server of processes associated with that number. The display shows the name of the operating system user that started the process, the process identifier (PID) number, the parent process identifier (PPID) number, the percentage of CPU, the time the process started, the terminal associated with the process (if any), the CPU time used by the process, and finally, the command string. The job sequence number NNN will occur in the string of the command being executed (for example, "sh /u01/SCT/banjobs/glbdata_NNN.shl"). The number NNN might also be a PID or a PPID either in whole or in part, but we are not interested in those processes.

If you do not have the job sequence number, you can search on other information. For example, if you know the name of the job and the person that submitted it, you can enter:

ps -ef | grep jobname | grep username

However, if the user submitted the job more than once, this will return multiple processes. You will need further information to uniquely identify the job to be killed. Following the trace procedure below may help you to get that information.

We want to trace the process forward through its child processes until we determine the PID of the database process. Once we have found that PID, we will locate it in the database and kill it there.

As a simple example, suppose we are asked to kill the GLBDATA job with the job sequence number 914607. We go to the server, and type:

ps -ef | grep 914607

This returns the following:

banner  2818112 3682714   0 14:21:14  pts/7  0:00 grep 914607
banjobs 2920892 2957636   0 12:08:35      -  0:00 sh /u01/SCT/banjobs/glbdata_914607.shl
banjobs 2957636       1   0 12:08:35      -  0:00 sh /u01/SCT/banjobs/glbdata_914607.shl

One of these processes (PID 2957636) has a PPID of 1, meaning it was started by the root init process; it is the parent of the other one. We select the other one, its child. The PID of that process is 2920892, so we then enter the command:

ps -ef | grep 2920892

This returns:

banner  2793754 3682714   0 14:22:49  pts/7  0:00 grep 2920892
banjobs 2875738 2920892   0 12:08:35      -  0:00 GLBDATA
banjobs 2920892 2957636   0 12:08:35      -  0:00 sh /u01/SCT/banjobs/glbdata_914607.shl

The process 2920892 is the parent of process 2875738, so we now enter:

ps -ef | grep 2875738

This returns:

banjobs 2851172 2875738  91 12:08:35      - 73:00 oraclePROD (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
banjobs 2875738 2920892   0 12:08:35      -  0:00 GLBDATA
banner  3060016 3682714   0 14:23:11  pts/7  0:00 grep 2875738

The database process is identified by the command "oraclePROD" (this name will vary based on the database being used and the properties of the network connection to the database). Its PID is 2851172.

Killing the process

NOTE: Do not use this procedure if job submission and the database run on different servers! If you do, you may kill the wrong process!

If you have Oracle Enterprise Manager Console, you can simply log in to the database with a DBA account, and in the navigation pane on the left, expand the "Instances" tree, and click on the "Sessions" folder. A list of processes will be displayed. In the column marked "OS Process ID", find the PID of the database process you wish to kill (in the above example, we would look for process 2851172), and click on it to highlight it. Then, from the "Object" menu, select "Kill Process", and in the submenu, select "Immediate".

If you do not have Oracle Enterprise Manager Console, you must query the v$session table in the database to find the database process. Log in to the database as a DBA, and enter the following query:

select username, machine, status, substr(sid||','||serial#,1,11) sid_serial,
       process , substr(to_char(logon_time,'dd-mon hh24:mi'),1,12) logon,
       round(LAST_CALL_ET/60,0) i_min, substr(module,1,10) Module
from   v$session
where  process = 'NNN';

where NNN is the last PID you found in your trace. Keep the single quotes, because PROCESS is a character column. You will see a display like:

USERNAME   MACHINE    S SID_SERIAL       PROCESS      LOGON        I_MIN MODULE
---------- ---------- - ---------------- ------------ ------------ ----- ----------
USERNAME   jobserver  I 1866,57          2851172      20-apr 08:03     0 JOBNAME

The numbers in the "SID_SERIAL" column (separated by a comma) identify the database process. Enter the following command:

alter system kill session 'SID_SERIAL' immediate;

where SID_SERIAL is a single-quoted string containing the two numbers from the SID_SERIAL column in your query, including the comma that separates them. To kill the JOBNAME process we identified above, we would enter:

alter system kill session '1866,57' immediate;
