API & Command Line Utilities¶
User Identity & API Key¶
Registering for an account on the POD portal establishes your Scyld Cloud Auth user identity and provides you with an API key and secret. All Scyld Cloud Manager APIs require an authentication token, which you obtain using that API key and secret.
You can find your User ID, API key, and API secret on the POD Portal settings page under the API section. If you do not already have an API key or secret, you must request one from support by clicking the Request an API key and secret link. Once you have an API key, you can click the Revoke or Reset links to revoke or recreate your secret; the result is displayed on the page immediately. Revoking your API secret removes your ability to use the API. Be sure to protect your API secret just as you would a password.
POD API Endpoints (Authentication Required)¶
POD Scyld Cloud Auth Service: https://auth.pod.penguincomputing.com
POD Scyld Cloud Accountant Service: https://accountant.pod.penguincomputing.com
A Guide to using Scyld Cloud Manager APIs on POD¶
The Scyld Cloud Manager (SCM) Platform is a distributed set of software components that provides HPC cloud account and resource management. These software components work tightly together to provide authentication, authorization, and cluster resource provisioning services through their APIs.
At Penguin Computing, we have deployed SCM for our POD clusters and supporting infrastructure. Users can register at and use the POD Portal to manage their account. In addition, each component within SCM exposes an API that users can utilize for programmatic management of their account and POD cluster resources.
The purpose of this guide is to demonstrate the functionality of the APIs through code examples. The examples use Python, but the basic interaction pattern will look similar in other languages. We will walk through the steps in the sections that follow.
Authentication Token¶
Now that you have your API credentials, you can use them to get an authentication token from the Scyld Cloud Auth service. To get a token, you submit a request to the service using OAuth, an open web protocol for token-based authentication. This is best done with a Python client library, since OAuth requests involve manipulating HTTP headers and concatenating multiple strings, including timestamps, which is not easily done on the command line.
import oauth2 as oauth
import json

CLOUDAUTH_ID = '880da6d718d54acc745ab9fea7b2c090'
API_KEY = 'my_key'
API_SECRET = 'my_secret'
AUTH_API_ENDPOINT = 'https://auth.pod.penguincomputing.com'
auth_url = AUTH_API_ENDPOINT + '/v1/auth/request_token'

# set up the OAuth client
consumer = oauth.Consumer(key=API_KEY, secret=API_SECRET)
client = oauth.Client(consumer)

# make the HTTP request using the GET method
resp, content = client.request(auth_url, "GET")

# we are looking for an HTTP status code of 200 for success
if resp['status'] == '200':
    result = json.loads(content)
    auth_token = result['authentication_token']
    auth_token_expires_at = result['expires_at_utc']
else:
    # an error occurred
    pass
Once you have acquired your auth_token, you will use it in every subsequent SCM API call by including it in the HTTP headers as the X-Auth-Token header. Please Note: once the authentication token expires it is no longer valid, and any API calls using it will fail.
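If you make calls over a long session, it can help to re-request a token once the old one expires. Below is a minimal sketch, reusing the client from the previous block; the timestamp format of expires_at_utc is an assumption here, so verify what your deployment actually returns.

from datetime import datetime

def get_valid_token():
    """Return a valid auth token, re-requesting one if ours has expired."""
    global auth_token, auth_token_expires_at
    # assumed timestamp format; check what your deployment actually returns
    expires = datetime.strptime(auth_token_expires_at, '%Y-%m-%d %H:%M:%S')
    if datetime.utcnow() >= expires:
        resp, content = client.request(auth_url, "GET")
        if resp['status'] == '200':
            result = json.loads(content)
            auth_token = result['authentication_token']
            auth_token_expires_at = result['expires_at_utc']
    return auth_token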
Get a List of Available PODs¶
Now that we have an authentication token stored in auth_token, we can make a request to retrieve the list of Scyld Cloud Controller instances available within the POD domain. The response contains a list of Scyld Cloud Controllers in the JSON result's 'data' field. A Cloud Controller object contains the API endpoint information, so we now know where to make cluster resource requests.
import requests

# construct the URL for the request
url = AUTH_API_ENDPOINT + '/v1/cloud_controller/list'

# all SCM APIs require this header
headers = {'X-Auth-Token': auth_token}

# make the HTTP GET request
r = requests.get(url, headers=headers)

# check for success
if r.status_code == requests.codes.ok:
    response = r.json()
    if response['success']:
        data = response['data']
# sample 'data' field from the response
'data': {
    'cloud_controllers': [{
        'id': 16,
        'name': 'POD MT1',
        'active': True,
        'api_endpoint': 'mt1-api.pod.penguincomputing.com:443',
        'api_ssl': True,
        'api_version': '1',
        'beoweb_endpoint': 'mt1-beoweb.pod.penguincomputing.com:443',
        'beoweb_ssl': True,
        'created_at_utc': '2013-02-13 17:03:31',
        'timezone': 'US/Mountain'}],
    'total': 1
}
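Rather than hard-coding the Cloud Controller endpoint later, you can build it from the payload above. A small sketch, assuming the response structure shown:

# build the Cloud Controller base URL from the first controller returned;
# assumes the 'data' structure shown above
cc = data['cloud_controllers'][0]
scheme = 'https' if cc['api_ssl'] else 'http'
CLOUDCON_ENDPOINT = scheme + '://' + cc['api_endpoint']
CLOUDCON_VERSION = 'v' + cc['api_version']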
Create a New User Account¶
The first thing we will need to do to get set up on the POD MT1 cluster is to create a user account, which is a UNIX-style system account. We will make the API request to the Cloud Controller endpoint identified from the previous code snippet. A successful response to this call to the Cloud Controller returns a Cloud Controller User object.
import requests
import json

# result from our previous query
CLOUDCON_ENDPOINT = 'https://mt1-api.pod.penguincomputing.com:443'
CLOUDCON_VERSION = 'v1'

# construct the URL
url = CLOUDCON_ENDPOINT + '/' + CLOUDCON_VERSION + '/user'

# construct headers: the cloud controller requires an additional header
headers = {
    'X-Auth-Token': auth_token,
    'X-Auth-Cloudauth-Id': CLOUDAUTH_ID}

# construct the required parameters
# note that the cloud controller requires JSON-encoded parameters
params = json.dumps({
    'cloudauth-id': CLOUDAUTH_ID,
    'system-name': 'penguin'})

# make the HTTP POST request
r = requests.post(url, headers=headers, data=params)

# check for success
if r.status_code == requests.codes.ok:
    response = r.json()
    if response['success']:
        data = response['data']
        # the response is a list of user objects, in this case a list of 1
        cloudcon_user = data[0]
Provision a New Storage Volume¶
Next, we will provision a new storage volume on the POD MT1 cluster for our new user. A successful response from this call to the Cloud Controller returns a Storage Volume object.
import requests
import json

# construct the URL
url = CLOUDCON_ENDPOINT + '/' + CLOUDCON_VERSION + '/storage-volume'

# construct headers
headers = {
    'X-Auth-Token': auth_token,
    'X-Auth-Cloudauth-Id': CLOUDAUTH_ID}

# construct the required parameters: volume-size is in GB
params = json.dumps({
    'cloudauth-id': CLOUDAUTH_ID,
    'volume-size': 50,
    'volume-type': {'visibility': 'cluster'}})

# make the HTTP POST request
r = requests.post(url, headers=headers, data=params)

# check for success
if r.status_code == requests.codes.ok:
    response = r.json()
    if response['success']:
        data = response['data']
        # the response is a list of storage volumes
        volume = data[0]
Create a New Login Server Instance¶
We have now created a new user account and storage volume on the POD MT1 cluster. These steps are required before you can create a login server instance, which serves as your gateway to the POD MT1 cluster. You can log in to this server to submit jobs and manage input and output files.
When we create a login server instance, we must specify both a server-image, which indicates the boot image, and a server-flavor, which defines the resource specifications, including cores, memory, and disk space. To determine which server images and flavors are available from the cloud controller, we first make two separate requests:
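The payloads below show what each request returns. As a sketch, the two list requests might look like the following; the /server-flavor path appears in the payload below, while /server-image is an assumption inferred from the image payload.

import requests

headers = {
    'X-Auth-Token': auth_token,
    'X-Auth-Cloudauth-Id': CLOUDAUTH_ID}

# list available boot images and server flavors; error checking omitted
# for brevity, and the /server-image path is assumed
images = requests.get(
    CLOUDCON_ENDPOINT + '/' + CLOUDCON_VERSION + '/server-image',
    headers=headers).json()['data']
flavors = requests.get(
    CLOUDCON_ENDPOINT + '/' + CLOUDCON_VERSION + '/server-flavor',
    headers=headers).json()['data']

# pick an image and flavor for the create request that follows
server_image_id = images[0]['id']
server_flavor_id = flavors[0]['id']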
# GET /server-image payload
'data': [{
    'id': 'd36256c9-9fcb-4178-83b0-ead2243fc0e1',
    'active': True,
    'min-disk': 5,
    'min-ram': 256,
    'name': 'MT1 Login Node (CentOS 6)',
    'operating-system': {
        'major_version': '',
        'minor_version': '',
        'name': ''},
    'creation-date': '2013-04-17T18:50:19Z',
    'public': True}]
# GET /server-flavor payload
'data': [{
    'id': '1',
    'disk': 5,
    'memory': 256,
    'name': 'pod.free',
    'vcpu': 1}]
Now we can make the request to create the new login server instance. Note that we pass the users parameter, which contains your Cloud Controller user id from the user object created earlier. A successful call returns a Server Instance object. The response includes the id of the instance, which you can use to query its status and public IP address.
import requests
import json

# construct the URL
url = CLOUDCON_ENDPOINT + '/' + CLOUDCON_VERSION + '/server-instance/'

# construct headers
headers = {
    'X-Auth-Token': auth_token,
    'X-Auth-Cloudauth-Id': CLOUDAUTH_ID}

# construct the required parameters
params = json.dumps({
    'cloudauth-id': CLOUDAUTH_ID,
    'server-image': server_image_id,
    'server-flavor': server_flavor_id,
    'users': [cloudcon_user['id']]})

# make the HTTP POST request
r = requests.post(url, headers=headers, data=params)

# check for success
if r.status_code == requests.codes.ok:
    response = r.json()
    if response['success']:
        data = response['data']
        # the response is a list of server instance objects
        vm = data[0]
# GET /server-instance/<id> payload
'data': [{
    'cloudauth-id': '880da6d718d54acc745ab9fea7b2c090',
    'creation-date': '2013-04-12T00:01:21Z',
    'id': 'cbaadf63529df080885497cbaad1f635',
    'private-ip': '10.10.100.37',
    'public-ip': '192.41.74.108',
    'server-flavor': '1',
    'server-image': '66d63657-570b-4a49-9560-5427c861c5f6',
    'status': 'ACTIVE',
    'users': ['880da6d718d54acc745ab9fea7b2c090']}]
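Provisioning takes a little time, so a newly created instance may not report ACTIVE immediately. Here is a simple polling sketch, assuming the status field behaves as shown in the payload above:

import time

# poll the instance until it reports ACTIVE, then print its public IP
url = CLOUDCON_ENDPOINT + '/' + CLOUDCON_VERSION + '/server-instance/' + vm['id']
while True:
    r = requests.get(url, headers=headers)
    vm = r.json()['data'][0]
    if vm['status'] == 'ACTIVE':
        print('login node ready at ' + vm['public-ip'])
        break
    time.sleep(10)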
Add SSH Keys for Authentication¶
This guide assumes you are familiar with SSH keys and their use. In order to log in to your server instance, you will need to upload a public SSH key; SSH keys are a property of a Cloud Controller User object. Once the key is added, you are ready to access your new login server instance and prepare to run jobs on the cluster.
import requests
import json

# construct the URL
url = CLOUDCON_ENDPOINT + '/' + CLOUDCON_VERSION + '/user/' + cloudcon_user['id']

# the public key string, shortened here
mykey = 'ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA1u'

# construct headers
headers = {
    'X-Auth-Token': auth_token,
    'X-Auth-Cloudauth-Id': CLOUDAUTH_ID}

# construct the required parameters
params = json.dumps({
    'cloudauth-id': CLOUDAUTH_ID,
    'action': 'add',
    'ssh-public-keys': [{'key': mykey}]})

# make the HTTP PUT request
r = requests.put(url, headers=headers, data=params)

# check for success
if r.status_code == requests.codes.ok:
    response = r.json()
    if response['success']:
        # success
        pass
    else:
        # did not succeed; the msg field may indicate the specific error
        print(response['msg'])
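With the key added, you can SSH to the instance using its public IP and the system account created earlier. For example, using the penguin account name and the IP from the sample payload above:

$ ssh penguin@192.41.74.108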
PODTools: Command-line Utilities¶
PODTools is a collection of applications that interface with POD directly from a user's local machine, enabling job submission and queries without first logging in to a POD login node. The CLI application, PODShell, provides remote job submission and management, data staging, and file management.
Download & Install¶
To use PODTools, you must first download and install the necessary packages and development libraries. PODTools is designed for Python 2.7 and has not yet been tested with Python 3. Download the latest podtools and python-scyld-utils tarballs from here.
In addition to Python 2.7 and python2-virtualenv, ensure that the following two development libraries are also installed. Package names typically differ between operating systems:
Debian/Ubuntu:
Secure Sockets Layer toolkit:
libssl-dev
Foreign Function Interface library:
libffi-dev
RedHat/CentOS:
Secure Sockets Layer toolkit:
openssl-devel
Foreign Function Interface library:
libffi-devel
We recommend installing PODTools into a dedicated Python virtual environment. In the example below we unpack the podtools and python-scyld-utils tarballs, then create and activate a virtual environment named env. Once activated, the prompt changes to show that we are using the Python installed in the virtual environment. Finally, we run the install scripts for python-scyld-utils and podtools. Please Note: always activate your Python virtual environment before using PODTools with the command source env/bin/activate
$ tar xzf podtools-2.4.1.tar.gz
$ tar xzf python-scyld-utils-1.4.0.tgz
$ virtualenv env
...
$ source env/bin/activate
(env) $
(env) $ cd python-scyld-utils
(env) $ python setup.py install
(env) $ cd ../podtools-2.4.1
(env) $ python setup.py install
Configuration¶
Each time PODTools is invoked, it reads the configuration files /etc/podtools.conf and ~/.podtools/podtools.conf, in that order. Settings in a user's ~/.podtools/podtools.conf take precedence over those set at the system level in /etc/podtools.conf. Additionally, any options passed as command-line arguments override settings from both configuration files.
A recommended approach is to put a minimal configuration containing global settings for all users in /etc/podtools.conf. Those settings should include the POD address and port parameters. If all users in your group will be using the same scheduler, you can also set the sched parameter here. A user's configuration file should, at a minimum, contain their POD user name. Make sure /etc/podtools.conf is readable by all but editable only by root.
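For example, permissions along these lines would work (an illustrative sketch; adjust modes to your site's policy):

$ sudo chmod 644 /etc/podtools.conf
$ chmod 600 ~/.podtools/podtools.conf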
Sample /etc/podtools.conf:
# Global Options for POD Tools
[podtools]
address=mt1-beoweb.pod.penguincomputing.com
port=443
ssl=True
auth_type=cloudauth
cloudauth_url=https://auth.pod.penguincomputing.com
Sample ~/.podtools/podtools.conf:
# User Options for POD Tools
[podtools]
user=<your system username>
cloudauth_api_key=<your API key>
cloudauth_api_secret=<your API secret>
Configuration Templates¶
A template for podtools.conf
is installed in /opt/scyld/podtools/podtools.conf.template
and included below for reference. The template file leaves all options commented out, and set to their default values. If the option is commented, the value shown will be the one used.
# Global Options for PODTools
[podtools]
#user=
#address=127.0.0.1
#port=443
#logout=false
#ssl=True
#auth_type=cloudauth
#cloudauth_api_key=
#cloudauth_api_secret=
#ssh_auth_address=
#ssh_auth_port=2200
#password=
#loglevel=ERROR
# Options for PODShell
[podsh]
#overwrite=true
#sched=TRQ
#stageinmaxsize=0
#stageoutmaxsize=0
#hash_algo=md5
PODTools Configuration Options¶
PODTools Option | Value | Meaning
---|---|---
user | <name> | The user name to use when communicating with POD.
address | <ip address> | The IP address of the POD web server.
port | <port number> | The port number of the POD web server.
logout | true | Automatically log out after each request.
  | false | Do not automatically log out after each request.
ssl | True | Encrypt communication with POD using SSL/TLS.
  | False | Do not use SSL/TLS. Not supported on POD MT1 & MT2.
auth_type | cloudauth | Use API key and secret for token-based authentication.
  | password | Prompt for a password when authenticating on POD.
  | ssh | Use SSH public/private keys for authentication on POD.
cloudauth_api_key | <api key> | API key for your POD account.
cloudauth_api_secret | <api secret> | API secret for your POD account.
ssh_auth_address | <ip address> | IP address of the server to use during SSH authentication.
ssh_auth_port | <port number> | The port number to use during SSH authentication.
password | <pod password> | Store your POD password so you are no longer prompted for it.
loglevel | ERROR | Show only ERROR messages and higher in log output.
  | WARNING | Show WARNING messages and higher in log output.
  | INFO | Show INFO messages and higher in log output.
  | DEBUG | Show DEBUG messages and higher in log output.
PODShell Configuration Options¶
PODShell Option | Value | Meaning
---|---|---
overwrite | true | Overwrite the job script if it already exists.
  | false | Do not overwrite existing job scripts.
sched | TRQ | Submit jobs using the TORQUE scheduler.
  | SGE | Submit jobs using the SGE scheduler.
stageinmaxsize | <size> | Abort submission if the total size of files to stage in is greater than <size>.
stageoutmaxsize | <size> | Abort stageout if the total size of files to download is greater than <size>.
hash_algo | <algo> | Hash algorithm to use for verifying file integrity.
POD MT1 Sample¶
If you are using PODTools on MT1 to check the status of group jobs, the following is all that is needed in a file located at ~/.podtools/podtools.conf:
[podtools]
cloudauth_api_key=<your API key>
cloudauth_api_secret=<your API secret>
PODShell: Job Submission and Management Utility¶
PODShell is a CLI application distributed with PODTools that enables remote job submission and management, data staging, and file management. With PODShell you can do the following; examples are provided in the sections below.
Submit job scripts directly to POD without having to SSH to a login node.
Stagein data (upload) to POD as a prerequisite of executing your job.
Stageout data (download) from POD automatically when a job is finished.
Submit either TORQUE or SGE formatted job scripts.
Stagein or stageout data to/from POD independently of job submission.
Query the status of your jobs.
Submitting Jobs¶
PODShell can essentially act as a “remote qsub” that supports job scripts written using TORQUE or SGE job submission formats. The examples below submit jobs using both formats with the podsh submit command:
$ podsh submit test.sub
Job Submitted successfully. Job ID: 1586
$ podsh submit --sched=SGE sge-test.sub
Job Submitted successfully. Job ID: 1587
Data Staging¶
PODShell can copy your input and output data for you as part of a job, or independently. The examples below use the --stagein and --stageout options. Jobs are held until stagein completes, and stageout does not occur until the job finishes.
$ podsh submit --stagein=~/input-data test.sub
Job Submitted successfully. Job ID: 1588
$ podsh submit --stageout=~/job-results.log test.sub
Job Submitted successfully. Job ID: 1589
Combine the two options to automate an entire workflow that would otherwise require the following manual steps:
SCP input data and job script to login node.
SSH to login node and submit job.
Watch job status from SSH session or wait for job completion email.
SCP results data back to your local host.
$ podsh submit --stagein=input-data:~/data --stageout=~/job-results.log workflow-job.sub
Job Submitted successfully. Job ID: 1590
Checking Job Status¶
PODShell can display job status similar to the output of the qstat command. In addition, the podsh status command also displays status for the staging in and out of data that is part of your job submission.
$ podsh status
+-----------+---------+---------+----------+----------+---+----------+-----------+
| ID | User | Type | State | Job Name | S | Stage-in | Stage-out |
+-----------+---------+---------+----------+----------+---+----------+-----------+
| 22482.pod | travis | COMPUTE | COMPLETE | N/A | | NONE | FAILED |
| 22708.pod | travis | COMPUTE | COMPLETE | N/A | | NONE | NONE |
| 22709.pod | travis | COMPUTE | COMPLETE | N/A | | NONE | NONE |
| 22710.pod | travis | COMPUTE | COMPLETE | N/A | | NONE | NONE |
+-----------+---------+---------+----------+----------+---+----------+-----------+
$ podsh status -j 22937.pod
+---------------------+----------------------------------------------------+
| Property | Value |
+---------------------+----------------------------------------------------+
| ID | 22937.pod |
| User | |
| Type | COMPUTE |
| State | QUEUED |
| Resource_List.nodes | 1:ppn=1 |
| job_radix | 0 |
| job_id | 22937.pod |
...
Checking System Group Jobs¶
For more information about using PODShell to monitor jobs for system groups, please see the Monitoring System Group Jobs section of the Job Scheduler documentation.
Help & Support¶
For more information on how to run the PODShell commands, use the --help command-line option: podsh --help and podsh <cmd> --help. For any additional questions, please reach out to POD Support: pod@penguincomputing.com.