API & Command Line Utilities¶
User Identity & API Key¶
Registering for an account on the POD portal establishes your Scyld Cloud Auth user identity and provides you with an API key and secret. All Scyld Cloud Manager APIs require an authentication token, which you obtain using that API key and secret.
You can find your User ID, API key, and API secret on the POD Portal settings page under the API section. If you do not already have an API key or secret, you must request one from support by clicking the Request an API key and secret link. Once you have an API key, you can click the Revoke or Reset links to revoke or recreate your secret; the result is displayed on the page immediately. Revoking your API secret removes your ability to use the API. Be sure to protect your API secret just as you would a password.
POD API Endpoints (Authentication Required)¶
POD Scyld Cloud Auth Service: https://auth.pod.penguincomputing.com
POD Scyld Cloud Accountant Service: https://accountant.pod.penguincomputing.com
A Guide to using Scyld Cloud Manager APIs on POD¶
The Scyld Cloud Manager (SCM) Platform is a distributed set of software components that provides HPC cloud account and resource management. These software components work tightly together to provide authentication, authorization, and cluster resource provisioning services through their APIs.
At Penguin Computing, we have deployed SCM for our POD clusters and supporting infrastructure. Users can register at and use the POD Portal to manage their account. In addition, each component within SCM exposes an API that users can utilize for programmatic management of their account and POD cluster resources.
The purpose of this guide is to demonstrate the functionality of the APIs through code examples. The examples use Python, but the basic interaction pattern will look similar in other languages. We will walk through the steps in the sections that follow.
Authentication Token¶
Now that you have your API credentials, you can use them to get an authentication token from the Scyld Cloud Auth service. To get a token, you submit a request to the service using OAuth, an open web protocol for token-based authentication. This is best done with a Python client library, since OAuth requests involve manipulating HTTP headers and concatenating multiple strings, including timestamps, which is not easily done on the command line.
import oauth2 as oauth
import json

CLOUDAUTH_ID = '880da6d718d54acc745ab9fea7b2c090'
API_KEY = 'my_key'
API_SECRET = 'my_secret'
AUTH_API_ENDPOINT = 'https://auth.pod.penguincomputing.com'
auth_url = AUTH_API_ENDPOINT + '/v1/auth/request_token'

# set up the OAuth client
consumer = oauth.Consumer(key=API_KEY, secret=API_SECRET)
client = oauth.Client(consumer)

# make the HTTP request using the GET method
resp, content = client.request(auth_url, "GET")

# we are looking for an HTTP status code of 200 for success
if resp['status'] == '200':
    result = json.loads(content)
    auth_token = result['authentication_token']
    auth_token_expires_at = result['expires_at_utc']
else:
    # an error occurred
    pass
Once you have acquired your auth_token, you will use it in every subsequent SCM API call by including it in the HTTP headers as the X-Auth-Token header. Please Note: once the authentication token expires it is no longer valid, and any API calls using it will fail.
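If you make calls over a long session, it can help to re-request a token once the old one expires. Below is a minimal sketch, reusing the client from the previous block; the timestamp format of expires_at_utc is an assumption here, so verify what your deployment actually returns.

from datetime import datetime

def get_valid_token():
    """Return a valid auth token, re-requesting one if ours has expired."""
    global auth_token, auth_token_expires_at
    # assumed timestamp format; check what your deployment actually returns
    expires = datetime.strptime(auth_token_expires_at, '%Y-%m-%d %H:%M:%S')
    if datetime.utcnow() >= expires:
        resp, content = client.request(auth_url, "GET")
        if resp['status'] == '200':
            result = json.loads(content)
            auth_token = result['authentication_token']
            auth_token_expires_at = result['expires_at_utc']
    return auth_token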
Get a List of Available PODs¶
Now that we have an authentication token stored in auth_token, we can make a request to retrieve the list of Scyld Cloud Controller instances available within the POD domain. The response contains a list of Scyld Cloud Controllers in the JSON result's 'data' field. A Cloud Controller object contains the API endpoint information, so we now know where to make cluster resource requests.
import requests

# construct the URL for the request
url = AUTH_API_ENDPOINT + '/v1/cloud_controller/list'

# all SCM APIs require this header
headers = {'X-Auth-Token': auth_token}

# make the HTTP GET request
r = requests.get(url, headers=headers)

# check for success
if r.status_code == requests.codes.ok:
    response = r.json()
    if response['success']:
        data = response['data']
# sample 'data' field from the response
'data': {
    'cloud_controllers': [{
        'id': 16,
        'name': 'POD MT1',
        'active': True,
        'api_endpoint': 'mt1-api.pod.penguincomputing.com:443',
        'api_ssl': True,
        'api_version': '1',
        'beoweb_endpoint': 'mt1-beoweb.pod.penguincomputing.com:443',
        'beoweb_ssl': True,
        'created_at_utc': '2013-02-13 17:03:31',
        'timezone': 'US/Mountain'}],
    'total': 1
}
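Rather than hard-coding the Cloud Controller endpoint later, you can build it from the payload above. A small sketch, assuming the response structure shown:

# build the Cloud Controller base URL from the first controller returned;
# assumes the 'data' structure shown above
cc = data['cloud_controllers'][0]
scheme = 'https' if cc['api_ssl'] else 'http'
CLOUDCON_ENDPOINT = scheme + '://' + cc['api_endpoint']
CLOUDCON_VERSION = 'v' + cc['api_version']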
Create a New User Account¶
The first thing we will need to do to get set up on the POD MT1 cluster is to create a user account, which is a UNIX-style system account. We will make the API request to the Cloud Controller endpoint identified from the previous code snippet. A successful response to this call to the Cloud Controller returns a Cloud Controller User object.
import requests
import json

# result from our previous query
CLOUDCON_ENDPOINT = 'https://mt1-api.pod.penguincomputing.com:443'
CLOUDCON_VERSION = 'v1'

# construct the URL
url = CLOUDCON_ENDPOINT + '/' + CLOUDCON_VERSION + '/user'

# construct headers: the cloud controller requires an additional header
headers = {
    'X-Auth-Token': auth_token,
    'X-Auth-Cloudauth-Id': CLOUDAUTH_ID}

# construct the required parameters
# note that the cloud controller requires JSON-encoded parameters
params = json.dumps({
    'cloudauth-id': CLOUDAUTH_ID,
    'system-name': 'penguin'})

# make the HTTP POST request
r = requests.post(url, headers=headers, data=params)

# check for success
if r.status_code == requests.codes.ok:
    response = r.json()
    if response['success']:
        data = response['data']
        # the response is a list of user objects, in this case a list of 1
        cloudcon_user = data[0]
Provision a New Storage Volume¶
Next, we will provision a new storage volume on the POD MT1 cluster for our new user. A successful response from this call to the Cloud Controller returns a Storage Volume object.
import requests
import json

# construct the URL
url = CLOUDCON_ENDPOINT + '/' + CLOUDCON_VERSION + '/storage-volume'

# construct headers
headers = {
    'X-Auth-Token': auth_token,
    'X-Auth-Cloudauth-Id': CLOUDAUTH_ID}

# construct the required parameters: volume-size is in GB
params = json.dumps({
    'cloudauth-id': CLOUDAUTH_ID,
    'volume-size': 50,
    'volume-type': {'visibility': 'cluster'}})

# make the HTTP POST request
r = requests.post(url, headers=headers, data=params)

# check for success
if r.status_code == requests.codes.ok:
    response = r.json()
    if response['success']:
        data = response['data']
        # the response is a list of storage volumes
        volume = data[0]
Create a New Login Server Instance¶
We have now created a new user account and storage volume on the POD MT1 cluster. These steps are required before you can create a login server instance, which serves as your gateway to the POD MT1 cluster. You can log in to this server to submit jobs and manage input and output files.
When we create a login server instance, we must specify both a server-image, which indicates the boot image, and a server-flavor, which defines the resource specifications, including cores, memory, and disk space. To determine which server images and flavors are available from the cloud controller, we first make two separate requests:
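The payloads below show what each request returns. As a sketch, the two list requests might look like the following; the /server-flavor path appears in the payload below, while /server-image is an assumption inferred from the image payload.

import requests

headers = {
    'X-Auth-Token': auth_token,
    'X-Auth-Cloudauth-Id': CLOUDAUTH_ID}

# list available boot images and server flavors; error checking omitted
# for brevity, and the /server-image path is assumed
images = requests.get(
    CLOUDCON_ENDPOINT + '/' + CLOUDCON_VERSION + '/server-image',
    headers=headers).json()['data']
flavors = requests.get(
    CLOUDCON_ENDPOINT + '/' + CLOUDCON_VERSION + '/server-flavor',
    headers=headers).json()['data']

# pick an image and flavor for the create request that follows
server_image_id = images[0]['id']
server_flavor_id = flavors[0]['id']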
# GET /server-image payload
'data': [{
    'id': 'd36256c9-9fcb-4178-83b0-ead2243fc0e1',
    'active': True,
    'min-disk': 5,
    'min-ram': 256,
    'name': 'MT1 Login Node (CentOS 6)',
    'operating-system': {
        'major_version': '',
        'minor_version': '',
        'name': ''},
    'creation-date': '2013-04-17T18:50:19Z',
    'public': True}]
# GET /server-flavor payload
'data': [{
    'id': '1',
    'disk': 5,
    'memory': 256,
    'name': 'pod.free',
    'vcpu': 1}]
Now we can make the request to create the new login server instance. Note that we pass the users parameter, which contains your Cloud Controller user id from the user object created earlier. A successful call returns a Server Instance object. The response includes the id of the instance, which you can use to query its status and public IP address.
import requests
import json

# construct the URL
url = CLOUDCON_ENDPOINT + '/' + CLOUDCON_VERSION + '/server-instance/'

# construct headers
headers = {
    'X-Auth-Token': auth_token,
    'X-Auth-Cloudauth-Id': CLOUDAUTH_ID}

# construct the required parameters
params = json.dumps({
    'cloudauth-id': CLOUDAUTH_ID,
    'server-image': server_image_id,
    'server-flavor': server_flavor_id,
    'users': [cloudcon_user['id']]})

# make the HTTP POST request
r = requests.post(url, headers=headers, data=params)

# check for success
if r.status_code == requests.codes.ok:
    response = r.json()
    if response['success']:
        data = response['data']
        # the response is a list of server instance objects
        vm = data[0]
# GET /server-instance/<id> payload
'data': [{
    'cloudauth-id': '880da6d718d54acc745ab9fea7b2c090',
    'creation-date': '2013-04-12T00:01:21Z',
    'id': 'cbaadf63529df080885497cbaad1f635',
    'private-ip': '10.10.100.37',
    'public-ip': '192.41.74.108',
    'server-flavor': '1',
    'server-image': '66d63657-570b-4a49-9560-5427c861c5f6',
    'status': 'ACTIVE',
    'users': ['880da6d718d54acc745ab9fea7b2c090']}]
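Provisioning takes a little time, so a newly created instance may not report ACTIVE immediately. Here is a simple polling sketch, assuming the status field behaves as shown in the payload above:

import time

# poll the instance until it reports ACTIVE, then print its public IP
url = CLOUDCON_ENDPOINT + '/' + CLOUDCON_VERSION + '/server-instance/' + vm['id']
while True:
    r = requests.get(url, headers=headers)
    vm = r.json()['data'][0]
    if vm['status'] == 'ACTIVE':
        print('login node ready at ' + vm['public-ip'])
        break
    time.sleep(10)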
Add SSH Keys for Authentication¶
This guide assumes you are familiar with SSH keys and their use. In order to log in to your server instance, you will need to upload a public SSH key; SSH keys are a property of a Cloud Controller User object. Once the key is added, you are ready to access your new login server instance and prepare to run jobs on the cluster.
import requests
import json

# construct the URL
url = CLOUDCON_ENDPOINT + '/' + CLOUDCON_VERSION + '/user/' + cloudcon_user['id']

# the public key string, shortened here
mykey = 'ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA1u'

# construct headers
headers = {
    'X-Auth-Token': auth_token,
    'X-Auth-Cloudauth-Id': CLOUDAUTH_ID}

# construct the required parameters
params = json.dumps({
    'cloudauth-id': CLOUDAUTH_ID,
    'action': 'add',
    'ssh-public-keys': [{'key': mykey}]})

# make the HTTP PUT request
r = requests.put(url, headers=headers, data=params)

# check for success
if r.status_code == requests.codes.ok:
    response = r.json()
    if response['success']:
        # success
        pass
    else:
        # did not succeed; the msg field may indicate the specific error
        print(response['msg'])
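With the key added, you can SSH to the instance using its public IP and the system account created earlier. For example, using the penguin account name and the IP from the sample payload above:

$ ssh penguin@192.41.74.108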
PODTools: Command-line Utilities¶
PODTools is a collection of applications that interface with POD directly from a user's local machine, enabling job submission and queries without first logging in to a POD login node. The CLI application, PODShell, provides remote job submission and management, data staging, and file management.
Download & Install¶
To use PODTools, you must first download and install the necessary packages and development libraries. PODTools is designed for Python 2.7 and has not yet been tested with Python 3. Download the latest podtools and python-scyld-utils tarballs from here.
In addition to Python 2.7 and python2-virtualenv, ensure that the following two development libraries are also installed. Package names typically differ between operating systems:
Debian/Ubuntu:
Secure Sockets Layer toolkit:
libssl-dev
Foreign Function Interface library:
libffi-dev
RedHat/CentOS:
Secure Sockets Layer toolkit:
openssl-devel
Foreign Function Interface library:
libffi-devel
We recommend installing PODTools into a dedicated Python virtual environment. In the example below we unpack the podtools and python-scyld-utils tarballs, then create and activate a virtual environment named env. Once activated, the prompt changes to show that we are using the Python installed in the virtual environment. Finally, we run the install scripts for python-scyld-utils and podtools. Please Note: always activate your Python virtual environment before using PODTools with the command source env/bin/activate
$ tar xzf podtools-2.4.1.tar.gz
$ tar xzf python-scyld-utils-1.4.0.tgz
$ virtualenv env
...
$ source env/bin/activate
(env) $
(env) $ cd python-scyld-utils
(env) $ python setup.py install
(env) $ cd ../podtools-2.4.1
(env) $ python setup.py install
Configuration¶
Each time PODTools is invoked, it reads the configuration files /etc/podtools.conf and ~/.podtools/podtools.conf, in that order. Settings in a user's ~/.podtools/podtools.conf take precedence over those set at the system level in /etc/podtools.conf. Additionally, any options passed as command-line arguments override settings from both configuration files.
A recommended approach is to put a minimal configuration containing global settings for all users in /etc/podtools.conf. Those settings should include the POD address and port parameters. If all users in your group will be using the same scheduler, you can also set the sched parameter here. A user's configuration file should, at a minimum, contain their POD user name. Make sure /etc/podtools.conf is readable by all but editable only by root.
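For example, permissions along these lines would work (an illustrative sketch; adjust modes to your site's policy):

$ sudo chmod 644 /etc/podtools.conf
$ chmod 600 ~/.podtools/podtools.conf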
Sample /etc/podtools.conf:
# Global Options for POD Tools
[podtools]
address=mt1-beoweb.pod.penguincomputing.com
port=443
ssl=True
auth_type=cloudauth
cloudauth_url=https://auth.pod.penguincomputing.com
Sample ~/.podtools/podtools.conf:
# User Options for POD Tools
[podtools]
user=<your system username>
cloudauth_api_key=<your API key>
cloudauth_api_secret=<your API secret>
Configuration Templates¶
A template for podtools.conf
is installed in /opt/scyld/podtools/podtools.conf.template
and included below for reference. The template file leaves all options commented out, and set to their default values. If the option is commented, the value shown will be the one used.
# Global Options for PODTools
[podtools]
#user=
#address=127.0.0.1
#port=443
#logout=false
#ssl=True
#auth_type=cloudauth
#cloudauth_api_key=
#cloudauth_api_secret=
#ssh_auth_address=
#ssh_auth_port=2200
#password=
#loglevel=ERROR
# Options for PODShell
[podsh]
#overwrite=true
#sched=TRQ
#stageinmaxsize=0
#stageoutmaxsize=0
#hash_algo=md5
PODTools Configuration Options¶
PODTools Option | Value | Meaning
---|---|---
user | <name> | The user name to use when communicating with POD.
address | <ip address> | The IP address of the POD web server.
port | <port number> | The port number of the POD web server.
logout | true | Automatically log out after each request.
  | false | Do not automatically log out after each request.
ssl | True | Encrypt communication with POD using SSL/TLS.
  | False | Do not use SSL/TLS. Not supported on POD MT1 & MT2.
auth_type | cloudauth | Use API key and secret for token-based authentication.
  | password | Prompt for a password when authenticating on POD.
  | ssh | Use SSH public/private keys for authentication on POD.
cloudauth_api_key | <api key> | API key for your POD account.
cloudauth_api_secret | <api secret> | API secret for your POD account.
ssh_auth_address | <ip address> | IP address of the server to use during SSH authentication.
ssh_auth_port | <port number> | The port number to use during SSH authentication.
password | <pod password> | Store your POD password so you are no longer prompted for it.
loglevel | ERROR | Show only ERROR messages and higher in log output.
  | WARNING | Show WARNING messages and higher in log output.
  | INFO | Show INFO messages and higher in log output.
  | DEBUG | Show DEBUG messages and higher in log output.
PODShell Configuration Options¶
PODShell Option | Value | Meaning
---|---|---
overwrite | true | Overwrite the job script if it already exists.
  | false | Do not overwrite existing job scripts.
sched | TRQ | Submit jobs using the TORQUE scheduler.
  | SGE | Submit jobs using the SGE scheduler.
stageinmaxsize | <size> | Abort submission if the total size of files to stage in is greater than <size>.
stageoutmaxsize | <size> | Abort stageout if the total size of files to download is greater than <size>.
hash_algo | <algo> | Hash algorithm to use for verifying file integrity.
POD MT1 Sample¶
If you are using PODTools on MT1 to check the status of group jobs, the following is all that is needed in a file located at ~/.podtools/podtools.conf:
[podtools]
cloudauth_api_key=<your API key>
cloudauth_api_secret=<your API secret>
PODShell: Job Submission and Management Utility¶
PODShell is a CLI application distributed with PODTools that enables remote job submission and management, data staging, and file management. With PODShell you can do the following; examples are provided in the sections below.
Submit job scripts directly to POD without having to SSH to a login node.
Stagein data (upload) to POD as a prerequisite of executing your job.
Stageout data (download) from POD automatically when a job is finished.
Submit either TORQUE or SGE formatted job scripts.
Stagein or stageout data to/from POD independently of job submission.
Query the status of your jobs.
Submitting Jobs¶
PODShell can essentially act as a “remote qsub” that supports job scripts written using TORQUE or SGE job submission formats. The examples below submit jobs using both formats with the podsh submit command:
$ podsh submit test.sub
Job Submitted successfully. Job ID: 1586
$ podsh submit --sched=SGE sge-test.sub
Job Submitted successfully. Job ID: 1587
Data Staging¶
PODShell can copy your input and output data for you as part of a job, or independently. The examples below use the --stagein and --stageout options. Jobs are held until stagein completes, and stageout does not occur until the job finishes.
$ podsh submit --stagein=~/input-data test.sub
Job Submitted successfully. Job ID: 1588
$ podsh submit --stageout=~/job-results.log test.sub
Job Submitted successfully. Job ID: 1589
Combine the two options to automate an entire workflow that would otherwise require the following manual steps:
SCP input data and job script to login node.
SSH to login node and submit job.
Watch job status from SSH session or wait for job completion email.
SCP results data back to your local host.
$ podsh submit --stagein=input-data:~/data --stageout=~/job-results.log workflow-job.sub
Job Submitted successfully. Job ID: 1590
Checking Job Status¶
PODShell can display job status similar to the output of the qstat command. In addition, the podsh status command also displays status for the staging in and out of data that is part of your job submission.
$ podsh status
+-----------+---------+---------+----------+----------+---+----------+-----------+
| ID | User | Type | State | Job Name | S | Stage-in | Stage-out |
+-----------+---------+---------+----------+----------+---+----------+-----------+
| 22482.pod | travis | COMPUTE | COMPLETE | N/A | | NONE | FAILED |
| 22708.pod | travis | COMPUTE | COMPLETE | N/A | | NONE | NONE |
| 22709.pod | travis | COMPUTE | COMPLETE | N/A | | NONE | NONE |
| 22710.pod | travis | COMPUTE | COMPLETE | N/A | | NONE | NONE |
+-----------+---------+---------+----------+----------+---+----------+-----------+
$ podsh status -j 22937.pod
+---------------------+----------------------------------------------------+
| Property | Value |
+---------------------+----------------------------------------------------+
| ID | 22937.pod |
| User | |
| Type | COMPUTE |
| State | QUEUED |
| Resource_List.nodes | 1:ppn=1 |
| job_radix | 0 |
| job_id | 22937.pod |
...
Checking System Group Jobs¶
For more information about using PODShell to monitor jobs for system groups, please see the Monitoring System Group Jobs section of the Job Scheduler documentation.
Help & Support¶
For more information on how to run the PODShell commands, use the --help command-line option: podsh --help and podsh <cmd> --help. For any additional questions, please reach out to POD Support: pod@penguincomputing.com.