Scyld Cloud Accountant Documentation

The Scyld Cloud Accountant provides a web API with one main purpose: aggregate resource usage into daily usage summaries that can be queried by a caller.

The following resource families are supported:

  • Storage (bytes allocated)
  • Jobs (core hours) - Summary and Itemized
  • Server-Instances (time)

Usage API

Consumers of this API can fetch information via HTTP GET requests. These requests must have the appropriate URL, HTTP headers for authentication, and GET parameters. This document describes each of these components, the expected output, and resource-family specific details.

HTTP Headers

Consumers of the accountant must authenticate using the OAuth protocol. They first fetch a token with their key and secret via a GET request to Scyld Cloud Auth’s /auth/request_token. Each GET request to the accountant must then carry the following authentication header fields:

  • X-Auth-Cloudauth-Id
  • X-Auth-Token
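
As an illustration of how the URL, headers, and GET parameters fit together, the following is a minimal Python sketch that builds (but does not send) an authenticated request. The host name, the helper name build_usage_request, and the assumption that X-Auth-Cloudauth-Id carries the caller's Cloud Auth ID are illustrative and not confirmed by this document.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_usage_request(base_url, path, cloudauth_id, token, params):
    """Build (but do not send) an authenticated GET request for the accountant."""
    # doseq=True expands a list value into a repeated GET parameter.
    query = urlencode(params, doseq=True)
    url = f"{base_url.rstrip('/')}{path}?{query}"
    headers = {
        "X-Auth-Cloudauth-Id": cloudauth_id,  # assumed: the caller's Cloud Auth ID
        "X-Auth-Token": token,  # token obtained from /auth/request_token
    }
    return Request(url, headers=headers, method="GET")


request = build_usage_request(
    "https://accountant.example.com",  # hypothetical host
    "/storage",
    "my-cloudauth-id",
    "my-token",
    {"start_date": "2024-01-01", "end_date": "2024-01-31"},
)
```

The resulting object can be passed to urllib.request.urlopen to perform the call.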

Storage

The Scyld Cloud Accountant tracks the maximum storage allocated for each date / user / cloud controller / volume_uuid combination.

URL

/storage

GET Parameters

The following parameters are required:

  • start_date – Select storage usage on or after this date (YYYY-MM-DD). This is inclusive.
  • end_date – Select storage usage on or before this date (YYYY-MM-DD). This is inclusive.

The following parameters are optional:

  • cloud_controller_id – If specified, filters results to show only the specified cloud_controller_id.
  • cloud_auth_userid – If specified, filters results to show only the specified cloud_auth_userid.
  • owner_userid – If specified, assume the identity of another user (NOTE: this is only available to superusers).

Returned values are sorted by date, cloud_controller_id, cloud_auth_userid, resource_type, and resource-family specific attributes. A maximum of 1200 values are returned per request. To retrieve the next page, a caller must provide values from the last row of the previous response (i.e. ‘clues’). All ‘clues’ must be provided for paging to work correctly.

  • clue_date – The date of the last row.
  • clue_cloud_controller_id – The cloud_controller_id value of the last row.
  • clue_cloud_auth_userid – The cloud_auth_userid value of the last row.
  • clue_resource_type – The resource_type value of the last row.
  • clue_volume_uuid – The volume_uuid value of the last row.
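
The paging scheme can be sketched as a loop that copies values from the last row of each page into the clue parameters of the next request. This is only a sketch under stated assumptions: fetch_page is a hypothetical caller-supplied function that performs the GET and returns the list of rows, clue_resource_type is assumed to come from the row's cloud_resource_type field, and a page shorter than the limit is assumed to be the last one.

```python
# Clue parameter -> row field it is copied from (mapping is an assumption).
STORAGE_CLUES = {
    "clue_date": "date",
    "clue_cloud_controller_id": "cloud_controller_id",
    "clue_cloud_auth_userid": "cloud_auth_userid",
    "clue_resource_type": "cloud_resource_type",
    "clue_volume_uuid": "volume_uuid",
}


def iter_all_rows(fetch_page, base_params, clue_fields, page_limit=1200):
    """Yield every row, requesting further pages with clues from the last row."""
    params = dict(base_params)
    while True:
        rows = fetch_page(params)  # hypothetical: performs the GET, returns rows
        yield from rows
        if len(rows) < page_limit:
            return  # assumed: a short (or empty) page means nothing is left
        last = rows[-1]
        params = dict(base_params)
        # All clues are supplied together, as paging requires.
        params.update({clue: last[field] for clue, field in clue_fields.items()})
```

Passing the clue mapping as an argument keeps the loop reusable for the other resource families, which differ only in their last clue.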

Output

If the query fails, the JSON response will have the form:

{'success': False,
 'version': <api version>,
 'message': <a message - not used by accountant>,
 'error': <an error message>}

If the query succeeds, the JSON response will have the form:

{'success': True,
 'version': <api version>,
 'message': <a message - not used by accountant>,
 'data': {'result': [{'date': <string>,
                      'cloud_controller_id': <int>,
                      'cloud_auth_userid': <string>,
                      'cloud_resource_type': 'storage',
                      'volume_uuid': <string>,
                      'allocated_MB': <float>
                    }],
          'page_size': <number of results>}}
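
A caller will typically want to separate the two response forms before using the data. The sketch below assumes the body arrives as standard JSON (true/false rather than the True/False shown schematically above); AccountantError and extract_rows are hypothetical names introduced for illustration.

```python
import json


class AccountantError(Exception):
    """Raised when the accountant reports an unsuccessful query."""


def extract_rows(body):
    """Return the list of result rows, or raise if the query failed."""
    payload = json.loads(body)
    if not payload.get("success"):
        # Failed responses carry the explanation in 'error'.
        raise AccountantError(payload.get("error", "unknown error"))
    return payload["data"]["result"]
```

Because the failure and success forms are the same for every resource family, one such helper can serve all endpoints.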

Jobs

The Scyld Cloud Accountant tracks the total number of jobs, the walltime, and the core hours for each date / user / cloud controller / scheduler / queue-name combination.

URL

/jobs

GET Parameters

The following parameters are required:

  • start_date – Select jobs on or after this date (YYYY-MM-DD). This is inclusive.
  • end_date – Select jobs on or before this date (YYYY-MM-DD). This is inclusive.

The following parameters are optional:

  • cloud_controller_id – If specified, filters results to show only the specified cloud_controller_id.
  • cloud_auth_userid – If specified, filters results to show only the specified cloud_auth_userid.
  • owner_userid – If specified, assume the identity of another user (NOTE: this is only available to superusers).

Returned values are sorted by date, cloud_controller_id, cloud_auth_userid, resource_type, and resource-family specific attributes. A maximum of 1200 values are returned per request. To retrieve the next page, a caller must provide values from the last row of the previous response (i.e. ‘clues’). All ‘clues’ must be provided for paging to work correctly.

  • clue_date – The date of the last row.
  • clue_cloud_controller_id – The cloud_controller_id value of the last row.
  • clue_cloud_auth_userid – The cloud_auth_userid value of the last row.
  • clue_resource_type – The resource_type value of the last row.
  • clue_queue – The queue value of the last row.

Output

If the query fails, the JSON response will have the form:

{'success': False,
 'version': <api version>,
 'message': <a message - not used by accountant>,
 'error': <an error message>}

If the query succeeds, the JSON response will have the form:

{'success': True,
 'version': <api version>,
 'message': <a message - not used by accountant>,
 'data': {'result': [{'date': <string>,
                      'cloud_controller_id': <int>,
                      'cloud_auth_userid': <string>,
                      'cloud_resource_type': <'pbs'|'slurm'|'sge'|'lstc'>,
                      'queue': <string>,
                      'total_jobs': <int>,
                      'walltime': <int.  Sum of the runtimes of every job in seconds.>,
                      'core_hours': <float.  The number of cores * hours of walltime.>,
                    }],
          'page_size': <number of results>}}

‘resource_type’ may be any one of the following:

  • ‘pbs’ - PBS Torque jobs
  • ‘slurm’ - Slurm jobs
  • ‘sge’ - SGE (a.k.a. Sun Grid Engine, Oracle Grid Engine, Grid Engine) jobs
  • ‘lstc’ - LSTC jobs
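
Since each row is a daily summary for one queue, a common follow-up is to total core hours over the whole date range. A minimal sketch, assuming rows shaped like the success response above; the function name is illustrative.

```python
from collections import defaultdict


def core_hours_by_queue(rows):
    """Sum core_hours across all dates for each (scheduler, queue) pair."""
    totals = defaultdict(float)
    for row in rows:
        key = (row["cloud_resource_type"], row["queue"])
        totals[key] += row["core_hours"]
    return dict(totals)
```

Keying on the scheduler as well as the queue name avoids merging identically named queues from different schedulers.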

Itemized Jobs

Returns per-job results. The returned results are the union of queries to each jobs database. At most 1200 rows are returned per request.

URL

/jobs/itemized

GET Parameters

The following parameters are required:

  • start_date – Select jobs on or after this date (YYYY-MM-DD). This is inclusive.
  • end_date – Select jobs on or before this date (YYYY-MM-DD). This is inclusive.

The following parameters are optional:

  • cloud_controller_id – If specified, filters results to show only the specified cloud_controller_id.
  • cloud_auth_userid – If specified, filters results to show only the specified cloud_auth_userid.
  • queue – If specified, filters results to show only jobs with the given queue value.
  • resource_type – If specified, filters results to show only jobs with the given resource type, typically 'pbs' or 'slurm'. Can be specified more than once.
  • account – If specified, filters results to show only jobs with the given project account value.
  • owner_userid – If specified, assume the identity of another user (NOTE: this is only available to superusers).

Returned values are sorted by date, cloud_controller_id, cloud_auth_userid, resource_type, and resource-family specific attributes. A maximum of 1200 values are returned per request. To retrieve the next page, a caller must provide values from the last row of the previous response (i.e. ‘clues’). All ‘clues’ must be provided for paging to work correctly.

  • clue_cloud_controller_id – The cloud_controller_id value of the last row.
  • clue_resource_type – The resource_type value of the last row.
  • clue_user – The user value of the last row.
  • clue_queue – The queue value of the last row.
  • clue_account – The account value of the last row.
  • clue_submit – The submit value of the last row.
  • clue_start – The start value of the last row.
  • clue_end – The end value of the last row.
  • clue_job_id – The job_id value of the last row.
  • clue_job_name – The job_name value of the last row.
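
For itemized jobs each clue name is the corresponding row field prefixed with clue_, so the clue parameters can be derived mechanically. The sketch below assumes the last row contains every field listed above; the names ITEMIZED_CLUE_FIELDS and itemized_clues are illustrative.

```python
# Row fields that have a matching clue_<field> parameter for /jobs/itemized.
ITEMIZED_CLUE_FIELDS = (
    "cloud_controller_id", "resource_type", "user", "queue", "account",
    "submit", "start", "end", "job_id", "job_name",
)


def itemized_clues(last_row):
    """Build the full set of clue parameters from the last row of a page."""
    # A missing field raises KeyError rather than silently omitting a clue,
    # because all clues must be provided for paging to work.
    return {f"clue_{field}": last_row[field] for field in ITEMIZED_CLUE_FIELDS}
```

The returned dictionary is merged with the original query parameters to request the next page.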

Output

If the query fails, the JSON response will have the form:

{'success': False,
 'version': <api version>,
 'message': <a message - not used by accountant>,
 'error': <an error message>}

If the query succeeds, the JSON response will have the form:

{'success': True,
 'version': <api version>,
 'data': {'result':
            [{'cloud_controller_id': <int>,
              'resource_type': <'pbs'|'slurm'|'sge'|'lstc'>,
              'user': <string.  system account name>,
              'queue': <string>,
              'account': <string.  The project account for the job>,
              'submit': <string.  Submit datetime string, UTC>,
              'start': <string.  Start datetime string, UTC>,
              'end': <string.  End datetime string, UTC>,
              'job_name': <string>,
              'job_id': <string>,
              'cloud_auth_userid': <string>,
              'num_cores': <int>,
              'walltime': <int.  Runtime of the job in seconds.>,
              'core_hours': <float.  The number of cores * hours of walltime.>,
             }],
            }
}

‘resource_type’ may be any one of the following:

  • ‘pbs’ - PBS Torque jobs
  • ‘slurm’ - Slurm jobs
  • ‘sge’ - SGE (a.k.a. Sun Grid Engine, Oracle Grid Engine, Grid Engine) jobs
  • ‘lstc’ - LSTC jobs
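
The core_hours value is described as the number of cores multiplied by the hours of walltime, while walltime itself is reported in seconds. The arithmetic can be sketched as follows; this is an illustration of the stated formula, and the accountant's own rounding may differ.

```python
def core_hours(num_cores, walltime_seconds):
    """Cores multiplied by walltime, with walltime converted from seconds to hours."""
    return num_cores * walltime_seconds / 3600.0


# Example: a job on 8 cores that ran for 90 minutes (5400 seconds).
example = core_hours(8, 5400)  # 12.0 core hours
```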

Server Instances

Server Instances are the VMs owned by a user. These VMs must be “powered on” to give a user access to cluster resources. The Scyld Cloud Accountant tracks the walltime (aka the uptime) of the Server Instances for each date / user / cloud-controller combination.

URL

/server-instances

GET Parameters

The following parameters are required:

  • start_date – Select server-instance usage on or after this date (YYYY-MM-DD). This is inclusive.
  • end_date – Select server-instance usage on or before this date (YYYY-MM-DD). This is inclusive.

The following parameters are optional:

  • cloud_controller_id – If specified, filters results to show only the specified cloud_controller_id.
  • cloud_auth_userid – If specified, filters results to show only the specified cloud_auth_userid.
  • owner_userid – If specified, assume the identity of another user (NOTE: this is only available to superusers).

Returned values are sorted by date, cloud_controller_id, cloud_auth_userid, resource_type, and resource-family specific attributes. A maximum of 1200 values are returned per request. To retrieve the next page, a caller must provide values from the last row of the previous response (i.e. ‘clues’). All ‘clues’ must be provided for paging to work correctly.

  • clue_date – The date of the last row.
  • clue_cloud_controller_id – The cloud_controller_id value of the last row.
  • clue_cloud_auth_userid – The cloud_auth_userid value of the last row.
  • clue_resource_type – The resource_type value of the last row.
  • clue_instance_uuid – The instance_uuid value of the last row.

Output

If the query fails, the JSON response will have the form:

{'success': False,
 'version': <api version>,
 'message': <a message - not used by accountant>,
 'error': <an error message>}

If the query succeeds, the JSON response will have the form:

{'success': True,
 'version': <api version>,
 'message': <a message - not used by accountant>,
 'data': {'result': [{'date': <string>,
                      'cloud_controller_id': <int>,
                      'cloud_auth_userid': <string>,
                      'cloud_resource_type': 'instances',
                      'instance_uuid': <string>,
                      'walltime': <int.  Number of seconds this instance was active.>,
                    }],
          'page_size': <number of results>}}