NAME
     sched_conf - Grid  Engine  default  scheduler  configuration
     file

DESCRIPTION
     sched_conf defines the configuration file  format  for  Grid
     Engine's  default  scheduler  provided by sge_schedd(8).  In
     order to modify the configuration, use the graphical user
     interface qmon(1) or the -msconf option of the qconf(1)
     command. A default configuration is provided together with the
     Grid Engine distribution package.

FORMAT
     The following parameters are recognized by the  Grid  Engine
     scheduler if present in sched_conf:

  algorithm
     Allows for the selection  of  alternative  scheduling  algo-
     rithms.

     Currently default is the only allowed setting.

  load_formula
     A simple  algebraic  expression  used  to  derive  a  single
     weighted  load value from all or part of the load parameters
     reported by sge_execd(8) for each host and from all or  part
     of  the  consumable  resources  (see complex(5)) being main-
     tained for each host. The load formula expression syntax is
     that of a summation of weighted load values, that is:

          load_val1[*w1][{+|-}load_val2[*w2][{+|-}...]]

     Note that no blanks are allowed in the load formula.
     The load values and consumable  resources  (load_val1,  ...)
     are  specified  by the name defined in the complex (see com-
     plex(5)).
     Note: Administrator defined load values (see the load_sensor
     parameter   in   sge_conf(5)  for  details)  and  consumable
     resources available for all hosts (see  complex(5))  may  be
     used as well as Grid Engine default load parameters.
     The weighting factors (w1, ...) are positive integers. After
     the expression is evaluated for each host, the results are
     assigned to the hosts and are used to sort the hosts
     according to the weighted load. The sorted host list is
     used to sort queues subsequently.
     The default load formula is "load_avg".
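     A hypothetical example: assuming the load parameter
     np_load_avg and a consumable swap_rate are both defined in
     the complex, a load formula weighting the load average twice
     as high as the swap rate could read:

          load_formula              np_load_avg*2+swap_rate

     Remember that the expression itself must not contain blanks.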

  job_load_adjustments
     The load imposed by the Grid Engine jobs running on a
     system varies over time, and often, e.g. for the CPU
     load, requires some amount of time to be reported in the
     appropriate quantity by the operating system. Consequently,
     if a job was started very recently, the  reported  load  may
     not provide a sufficient representation of the load which is
     already imposed on that host by the job. The  reported  load
     will  adapt  to  the  real load over time, but the period of
     time, in which the reported load is  too  low,  may  already
     lead to an oversubscription of that host. Grid Engine allows
     the administrator to specify job_load_adjustments which  are
     used  in  the  Grid  Engine scheduler to compensate for this
     problem.
     The job_load_adjustments are specified as a comma  separated
     list  of  arbitrary  load parameters or consumable resources
     and (separated by an equal sign) an associated load  correc-
     tion  value.  Whenever  a  job  is  dispatched  to a host by
     sge_schedd(8), the load parameter and consumable  value  set
     of  that  host  is  increased  by the values provided in the
     job_load_adjustments  list.  These  correction  values   are
     decayed      linearly      over     time     until     after
     load_adjustment_decay_time from the  start  the  corrections
     reach the value 0. If the job_load_adjustments list is
     set to the special value NONE, no load corrections
     are performed.
     The adjusted load and consumable values are used to  compute
     the  combined  and  weighted  load  of  the  hosts  with the
     load_formula (see above) and to compare the load and consum-
     able  values against the load threshold lists defined in the
     queue   configurations   (see   queue_conf(5)).    If   your
     load_formula simply consists of the CPU load average parame-
     ter load_avg and if your jobs are  very  compute  intensive,
     you  might  want  to  set  the  job_load_adjustments list to
     load_avg=100, which means that every new job dispatched to a
     host will require 100 % CPU time and thus the machine's load
     is instantly raised by 100.
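     As a sched_conf entry, the example above would read:

          job_load_adjustments      load_avg=100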

  load_adjustment_decay_time
     The load  corrections  in  the  "job_load_adjustments"  list
     above  are  decayed linearly over time from the point of the
     job start, where the corresponding load or consumable param-
     eter  is  raised by the full correction value, until after a
     time  period  of  "load_adjustment_decay_time",  where   the
     correction      becomes     0.     Proper     values     for
     "load_adjustment_decay_time" greatly depend upon the load or
     consumable   parameters  used  and  the  specific  operating
     system(s). Therefore, they can only  be  determined  on-site
     and experimentally.  For the default load_avg load parameter
     a "load_adjustment_decay_time" of 7 minutes  has  proven  to
     yield reasonable results.
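     Using the time value syntax of queue_conf(5), the 7 minute
     setting mentioned above would be configured as:

          load_adjustment_decay_time     0:7:0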

  maxujobs
     The maximum number of jobs any user may have  running  in  a
     Grid  Engine cluster at the same time. If set to 0 (default)
     the users may run an arbitrary number of jobs.

  schedule_interval
     At   the   time   sge_schedd(8)   initially   registers   to
     sge_qmaster(8)  schedule_interval  is  used  to set the time
     interval in  which  sge_qmaster(8)  sends  scheduling  event
     updates  to  sge_schedd(8).   A scheduling event is a status
     change that has occurred  within  sge_qmaster(8)  which  may
     trigger  or  affect scheduler decisions (e.g. a job has fin-
     ished and thus the allocated resources are available again).
     In the Grid  Engine  default  scheduler  the  arrival  of  a
     scheduling  event  report  triggers  a  scheduler  run.  The
     scheduler waits for event reports otherwise.
     Schedule_interval is a time value (see queue_conf(5)  for  a
     definition of the syntax of time values).

  queue_sort_method
     This parameter determines the order in which several criteria
     are taken into account to produce a sorted queue list.
     Currently, two settings are valid: seqno and load. However,
     in both cases, Grid Engine attempts to maximize the number
     of soft requests (see the qsub(1) -soft option) being fulfilled
     by the queues for a particular job as the primary criterion.
     Then, if the queue_sort_method parameter is  set  to  seqno,
     Grid  Engine  will use the seq_no parameter as configured in
     the current queue configurations (see queue_conf(5)) as  the
     next criterion to sort the queue list. The load_formula (see
     above) is only meaningful if two queues have equal sequence
     numbers. If queue_sort_method is set to load, the load
     according to the load_formula is the criterion after maximizing
     a job's soft requests, and the sequence number is only used
     if two hosts have the same load. The sequence number sort-
     ing  is  most  useful if you want to define a fixed order in
     which queues are to be filled (e.g.   the cheapest  resource
     first).

     The default for this parameter is load.
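     For example, to fill queues in the fixed order given by
     their seq_no values:

          queue_sort_method         seqno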

  halftime
     When executing under a share  based  policy,  the  scheduler
     "ages"  (i.e. decreases) usage to implement a sliding window
     for achieving the share entitlements as defined by the share
     tree.  The halftime defines the time interval in which accu-
     mulated usage will have been decayed to  half  its  original
     value. Valid values are of type time as specified
     in queue_conf(5).

  usage_weight_list
     Grid Engine accounts for the consumption  of  the  resources
     CPU-time,  memory  and  IO  to  determine the usage which is
     imposed on a system by a job. A single usage value  is  com-
     puted  from  these three input parameters by multiplying the
     individual values by weights and adding them up. The weights
     are defined in the usage_weight_list. The format of the list
     is

          cpu=wcpu,mem=wmem,io=wio

     where wcpu, wmem and wio are the configurable weights. The
     weights are real numbers. The sum of all three weights should
     be 1.
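     For example, to let CPU time dominate the usage value while
     still accounting for memory and IO consumption (the actual
     weights are site specific), one might configure:

          usage_weight_list         cpu=0.8,mem=0.1,io=0.1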

  compensation_factor
     Determines how fast Grid Engine should compensate for past
     usage below or above the share entitlement defined in the
     share tree. Recommended values are between 2 and  10,  where
     10 means faster compensation.

  weight_user
     The relative importance of the user shares in the functional
     policy.  Values are of type real.

  weight_project
     The relative importance of the project shares in  the  func-
     tional policy.  Values are of type real.

  weight_department
     The relative importance of  the  department  shares  in  the
     functional policy. Values are of type real.

  weight_job
     The relative importance of the job shares in the  functional
     policy. Values are of type real.

  weight_tickets_functional
     The maximum number of functional tickets available for  dis-
     tribution by Grid Engine. Determines the relative importance
     of the functional policy. See sge_priority(5) for an
     overview of job priorities.

  weight_tickets_share
     The maximum number of share based tickets available for dis-
     tribution by Grid Engine. Determines the relative importance
     of the share tree policy. See sge_priority(5) for an
     overview of job priorities.

  weight_deadline
     The weight applied to the remaining time until a job's latest
     start time. Determines the relative importance of the dead-
     line. See sge_priority(5) for an overview of job
     priorities.

  weight_waiting_time
     The weight applied to the job's waiting time since submis-
     sion. Determines the relative importance of the waiting
     time. See sge_priority(5) for an overview of job
     priorities.

  weight_urgency
     The weight applied to a job's normalized urgency when deter-
     mining the final priority. Determines the relative impor-
     tance of urgency. See sge_priority(5) for an overview
     of job priorities.

  weight_ticket
     The weight applied to the normalized ticket amount when
     determining the final priority. Determines the relative
     importance of the ticket policies. See sge_priority(5)
     for an overview of job priorities.

  flush_finish_sec
     This parameter is provided for tuning the system's schedul-
     ing behavior. By default, a scheduler run is triggered at
     each schedule_interval. When this parameter is set to 1 or
     larger, the scheduler will additionally be triggered x
     seconds after a job has finished. Setting this parameter to
     0 disables the flush after a job has finished.

  flush_submit_sec
     This parameter is provided for tuning the system's schedul-
     ing behavior. By default, a scheduler run is triggered at
     each schedule_interval. When this parameter is set to 1 or
     larger, the scheduler will additionally be triggered x
     seconds after a job was submitted to the system. Setting
     this parameter to 0 disables the flush after a job was
     submitted.
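     For example, to trigger a scheduling run four seconds after
     each job submission:

          flush_submit_sec          4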

  schedd_job_info
     The default scheduler can keep track of why jobs could not
     be scheduled during the last scheduler run. This parameter
     enables or disables that observation. The value true enables
     the monitoring; false turns it off.

     It is also possible to activate  the  observation  only  for
     certain  jobs.  This will be done if the parameter is set to
     job_list followed by a comma separated list of job ids.

     The user can obtain the collected information with the  com-
     mand qstat -j.
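     For example, to collect scheduling information only for two
     particular jobs (the job ids below are hypothetical):

          schedd_job_info           job_list 3127,3128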

  params
     This parameter is intended for passing additional parameters
     to the N1 Grid Engine scheduler. The following values are
     recognized:

     JC_FILTER
          If set to true, the scheduler limits the number of jobs
          it  looks  at during a scheduling run. At the beginning
          of the scheduling run it assigns each  job  a  specific
          category,   which  is  based  on  the  job's  requests,
          priority settings, and the job  owner.  All  scheduling
          policies will assign the same importance to each job in
          one category. Therefore the jobs within each category
          are kept in FIFO order, and the number considered can
          be limited to the number of free slots in the system.

          An exception are jobs which request a resource reserva-
          tion. They are included regardless of the number of
          jobs in a category.

          This setting is turned off by default, because in very
          rare cases the scheduler can make a wrong decision. It
          is also advised to turn report_pjob_tickets off. Other-
          wise qstat -ext can report outdated ticket amounts. The
          information shown with qstat -j for a job that was
          excluded from a scheduling run is very limited.

     PROFILE
          If set equal to 1, the scheduler logs profiling  infor-
          mation summarizing each scheduling run.

     MONITOR
          If set equal to 1, the scheduler records information
          for each scheduling run, allowing job resource
          utilization to be reproduced, in the file
          <sge_root>/<cell>/common/schedule.

     Changing params will take immediate effect.  The default for
     params is none.
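     For example, to enable profiling output for each scheduling
     run:

          params                    PROFILE=1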

  reprioritize_interval
     Interval (HH:MM:SS) to reprioritize jobs on the  exec  hosts
     based  on the current ticket amount for the running jobs. If
     the interval is set  to  00:00:00  the  reprioritization  is
     turned off. The default value is 00:00:00.
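     For example, to reprioritize running jobs every two minutes:

          reprioritize_interval     00:02:00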

  report_pjob_tickets
     This parameter allows tuning of the system's scheduling run
     time. It is used to enable or disable the reporting of pend-
     ing job tickets to the qmaster. It does not influence the
     ticket calculation. When the reporting is turned off, the
     sort order of jobs in qstat and qmon is based only on the
     submit time.
     The reporting should be turned off in a system with a very
     large number of jobs by setting this parameter to "false".

  halflife_decay_list
     The halflife_decay_list allows different decay rates to be
     configured for the finished_jobs usage type, which is used
     in the pending job ticket calculation to account for jobs
     which have just ended. This allows the pending jobs algo-
     rithm to count finished jobs against a user or project for a
     configurable decay time period. The default is set so
     that finished jobs are not considered. It is also used to
     determine if the job usage has been added to the user or
     project usage list.
     The halflife_decay_list also allows to  configure  different
     decay  rates for each usage type being tracked (cpu, io, and
     mem).

  policy_hierarchy
     This parameter sets up a dependency chain  of  ticket  based
     policies.  Each  ticket based policy in the dependency chain
     is influenced by the previous policies  and  influences  the
     following policies. A typical scenario is to give the
     override policy precedence over the share-based policy.
     The  override  policy  determines  in such a case how share-
     based tickets are assigned among jobs of the  same  user  or
     project.   Note  that  all policies contribute to the ticket
     amount assigned to a particular job regardless of the policy
     hierarchy  definition. Yet the tickets calculated in each of
     the    policies    can    be    different    depending    on
     "POLICY_HIERARCHY".

     The "POLICY_HIERARCHY" parameter can be an up to 3-letter
     combination of the first letters of the 3 ticket based poli-
     cies: S(hare-based), F(unctional) and O(verride). So a value
     "OFS"  means  that the override policy takes precedence over
     the functional policy, which finally influences  the  share-
     based  policy.   Less  than  3 letters mean that some of the
     policies do not influence other policies and  also  are  not
     influenced  by other policies. So a value of "FS" means that
     the functional policy influences the share-based policy  and
     that there is no interference with the other policies.

     The special value "NONE" switches off policy hierarchies.
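     For example, the following entry lets the override policy
     take precedence over the functional policy, which in turn
     influences the share-based policy:

          policy_hierarchy          OFS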

  share_override_tickets
     If set to "true" or "1", the override tickets of any over-
     ride object instance are shared equally among all running
     jobs associated with the object. Pending jobs will get as
     many override tickets as they would have if they were
     running. If set to "false" or "0", each job gets the full
     value of the override tickets associated with the object.
     The default value is "true".

  share_functional_shares
     If set to "true" or "1", functional shares of any functional
     object  instance  are  shared  among all the jobs associated
     with the object. If set to "false" or "0", each job associ-
     ated with a functional object gets the full functional
     shares of that object. The default value is "true".


  max_functional_jobs_to_schedule
     The maximum number of pending jobs to schedule in the  func-
     tional policy. The default value is 200.

  max_pending_tasks_per_job
     The maximum number of subtasks  per  pending  array  job  to
     schedule.  This parameter exists in order to reduce schedul-
     ing overhead. The default value is 50.

  max_reservation
     The  maximum  number  of  reservations  scheduled  within  a
     schedule interval. When a runnable job cannot be started
     due to a shortage of resources, a reservation can be
     scheduled instead. A reservation can cover consumable
     resources of the global host, any execution host and any
     queue. For parallel jobs, reservations are also made for
     the slots resource as specified in sge_pe(5). As job run-
     time, the maximum of the times specified with -l h_rt=...
     or -l s_rt=... is assumed. For jobs that have neither of
     them, the default_duration is assumed. Reservations prevent
     jobs of lower priority as specified in sge_priority(5) from
     utilizing the reserved resource quota for the time of the
     reservation. Jobs of lower priority are allowed to utilize
     those reserved resources only if their prospective job end
     is before the start of the reservation (backfilling).
     Reservations are made only for non-immediate jobs (-now no)
     that request reservation (-R y). If max_reservation is set
     to "0", no job reservation is done.

     Note that reservation scheduling can be performance consum-
     ing and hence is switched off by default. Since the perfor-
     mance consumption of reservation scheduling is known to
     grow with the number of pending jobs, use of the -R y
     option is recommended only for those jobs actually queuing
     for bottleneck resources. Together with the max_reservation
     parameter, this technique can be used to narrow down per-
     formance impacts.
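     For example, to allow up to 20 reservations per schedule
     interval:

          max_reservation           20

     and, at submission time, to request a reservation for a job
     with a one hour hard runtime limit (script name hypotheti-
     cal):

          qsub -R y -l h_rt=1:0:0 bottleneck_job.sh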

  default_duration
     When job reservation is enabled through the max_reservation
     sched_conf(5) parameter, the default duration is assumed as
     runtime for jobs that have neither -l h_rt=... nor -l
     s_rt=... specified. In contrast to a h_rt/s_rt time limit,
     the default_duration is not enforced.

FILES
     <sge_root>/<cell>/common/sched_configuration
                sge_schedd configuration

SEE ALSO
     sge_intro(1), qalter(1), qconf(1), qstat(1),  qsub(1),  com-
     plex(5),    queue_conf(5),   sge_execd(8),   sge_qmaster(8),
     sge_schedd(8).  Grid Engine Installation and  Administration
     Guide

COPYRIGHT
     See sge_intro(1) for a full statement of rights and  permis-
     sions.
