Running workflows

POST /workflows/{workflow_permanent_id}/run

You can see the generic endpoint definition below. We’ll go into the specifics of the invoice retrieval workflow in the next section.

Body

| Parameter | Type | Required? | Sample Value | Description |
| --- | --- | --- | --- | --- |
| data | JSON | no | { "website_url": "YOUR_URL", "invoice_retrieval_start_date": "2024-04-15" } | The data field is used to pass in required and optional parameters that a workflow accepts. For the invoice retrieval workflow, required fields are website_url and invoice_retrieval_start_date. |
| webhook_callback_url | String | no | | Our system will send the webhook once it is finished executing the workflow run. |
| proxy_location | String | no | RESIDENTIAL | Proxy location for the web browser. Please pass RESIDENTIAL. If we use residential proxies, Skyvern’s requests to the websites will be less suspicious. |

Response

| Parameter | Type | Always returned? | Sample Value | Description |
| --- | --- | --- | --- | --- |
| workflow_permanent_id | String | yes | wpid_123456 | The workflow id |
| workflow_run_id | String | yes | wr_123456 | The workflow run id that represents this specific workflow run. You can use this id to match the webhook response to the initial request. |

Sample Request & Response - Invoice retrieval

-- Sample Request
curl --location 'https://api.skyvern.com/api/v1/workflows/wpid_123456/run' \
--header 'x-api-key: <USE_YOUR_API_KEY>' \
--header 'Content-Type: application/json' \
--data '{
    "data": {
        "website_url": "your_website",
        "invoice_retrieval_start_date": "2024-04-15"
    },
    "proxy_location": "RESIDENTIAL",
    "webhook_callback_url": "<your-endpoint>"
}'

-- Sample Response
{
    "workflow_id": "wpid_123456",
    "workflow_run_id": "wr_123456"
}
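
The same request can also be built from Python. The following is a minimal sketch using only the standard library. The helper names build_run_request and run_workflow are illustrative and not part of Skyvern's API; the base URL, header names, and body fields are taken from the curl sample above.

```python
import json
import urllib.request

# Base URL taken from the curl sample above.
API_BASE = "https://api.skyvern.com/api/v1"


def build_run_request(workflow_permanent_id, api_key, data,
                      webhook_callback_url=None, proxy_location="RESIDENTIAL"):
    """Build (but do not send) the POST request that starts a workflow run.

    Hypothetical helper: it only mirrors the fields documented in the Body table.
    """
    body = {"data": data, "proxy_location": proxy_location}
    # webhook_callback_url is optional, so only include it when provided.
    if webhook_callback_url is not None:
        body["webhook_callback_url"] = webhook_callback_url
    return urllib.request.Request(
        url=f"{API_BASE}/workflows/{workflow_permanent_id}/run",
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def run_workflow(request):
    """Send the prepared request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Splitting "build" from "send" keeps the request easy to inspect or log before anything goes over the network. Keep the returned workflow_run_id so you can match the webhook to this request later.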

Retrieving workflow runs

GET /workflows/{workflow_id}/runs/{workflow_run_id}

Response

| Parameter | Type | Sample value | Description |
| --- | --- | --- | --- |
| workflow_id | String | wpid_123456 | |
| workflow_run_id | String | wr_123456 | |
| status | String | completed | Status of the workflow run. Possible values: created, running, failed, terminated, completed |
| proxy_location | JSON | RESIDENTIAL | |
| webhook_callback_url | String | 127.0.0.1:8000/api/v1/webhook | |
| created_at | Timestamp | 2024-05-16T08:35:24.920793 | Timestamp for when the workflow run is created |
| modified_at | Timestamp | 2024-05-16T08:42:32.568908 | Last modified timestamp for the workflow run |
| parameters | JSON | see sample response below | The parameters that the workflow run was triggered with. For the invoice retrieval workflow, this field will have the website_url and invoice_retrieval_start_date values you sent. |
| screenshot_urls | list[String] | see sample response below | Final screenshots for the last 3 tasks in the workflow. |
| recording_url | String | see sample response below | The full browser recording. |
| outputs | JSON | see sample response below | See the explaining outputs section |

Sample response

{
    "workflow_id": "wpid_123456",
    "workflow_run_id": "wr_123456",
    "status": "completed",
    "proxy_location": "RESIDENTIAL",
    "webhook_callback_url": "127.0.0.1:8000/api/v1/webhook",
    "created_at": "2024-05-16T08:35:24.920793",
    "modified_at": "2024-05-16T08:42:32.568908",
    "parameters": {
        "website_url": "YOUR_WEBSITE_URL",
        "invoice_retrieval_start_date": "2024-04-15"
    },
    "screenshot_urls": [
        "https://skyvern-artifacts.s3.amazonaws.com/...",
        "https://skyvern-artifacts.s3.amazonaws.com/...",
        "https://skyvern-artifacts.s3.amazonaws.com/..."
    ],
    "recording_url": "https://skyvern-artifacts.s3.amazonaws.com/...",
    "outputs": {
        "login_output": {
            "task_id": "tsk_1234",
            "status": "completed",
            "extracted_information": null,
            "failure_reason": null,
            "errors": []
        },
        "get_order_history_page_url_and_qualifying_order_ids_output": {
            "task_id": "tsk_258409009008559418",
            "status": "completed",
            "extracted_information": {
              ...
            },
            "failure_reason": null,
            "errors": []
        },
        "iterate_over_order_ids_output": [
            [
                {
                    ...
                }
            ]
        ],
        "download_invoice_for_order_output": {
            "task_id": "tsk_258409361195877732",
            "status": "completed",
            "extracted_information": null,
            "failure_reason": null,
            "errors": []
        },
        "upload_downloaded_files_to_s3_output": [
            "s3://skyvern-uploads/..."
        ],
        "send_email_output": {
            "success": true
        }
    }
}
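
If you are not using webhooks, you can poll this endpoint until the run finishes. The sketch below is one hedged way to do that. The function name wait_for_workflow_run and its parameters are illustrative, and treating completed, failed, and terminated as the final statuses is an assumption based on the status values listed in this document. The HTTP call itself is passed in as fetch_run so the loop stays independent of any particular HTTP client.

```python
import time

# Assumed final statuses; "created" and "running" mean the run is still in progress.
TERMINAL_STATUSES = {"completed", "failed", "terminated"}


def wait_for_workflow_run(fetch_run, poll_interval_seconds=10.0, max_polls=60,
                          sleep=time.sleep):
    """Poll until the run reaches a terminal status or max_polls is exhausted.

    fetch_run is any zero-argument callable returning the parsed JSON body of
    GET /workflows/{workflow_id}/runs/{workflow_run_id}.
    """
    for attempt in range(max_polls):
        run = fetch_run()
        if run["status"] in TERMINAL_STATUSES:
            return run
        # Do not sleep after the final attempt; fail right away instead.
        if attempt < max_polls - 1:
            sleep(poll_interval_seconds)
    raise TimeoutError(f"workflow run not finished after {max_polls} polls")
```

The poll interval and the maximum number of polls are arbitrary defaults here; pick values that fit how long your workflow runs usually take.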

Webhooks

Skyvern always sends a webhook once a workflow run has finished executing. The status of a finished workflow run can be completed, failed, or terminated.

The webhook body is the same as the response of the get workflow run endpoint.

Webhook Headers

| Parameter | Type | Required? | Sample Value | Description |
| --- | --- | --- | --- | --- |
| x-skyvern-signature | String | yes | v0=a2114d57b48eac39b9ad189dd8316235a7b4a8d21a10bd27519666489c69b503 | Authentication token that allows our service to communicate with your backend service via callback / webhook |
| x-skyvern-timestamp | String | yes | 1531420618 | Timestamp used to decode and validate the incoming webhook call |

For both headers, we’ll be using the same strategy Slack uses, as defined here: https://api.slack.com/authentication/verifying-requests-from-slack#making__validating-a-request
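
To make the validation concrete, here is a rough sketch of what Slack-style verification could look like on the receiving side. It follows Slack's documented scheme: HMAC-SHA256 over "v0:" + timestamp + ":" + raw request body, hex-encoded and prefixed with "v0=". Whether Skyvern's signature matches this format exactly, and which secret it signs with, is not stated here, so signing_secret is a placeholder and the function names are hypothetical. The five-minute freshness window is also borrowed from Slack's guidance.

```python
import hashlib
import hmac
import time


def compute_signature(signing_secret, timestamp, raw_body):
    """Slack-style signature: 'v0=' + HMAC-SHA256('v0:<timestamp>:<body>')."""
    base = b"v0:" + timestamp.encode("utf-8") + b":" + raw_body
    digest = hmac.new(signing_secret.encode("utf-8"), base, hashlib.sha256).hexdigest()
    return "v0=" + digest


def is_valid_webhook(signing_secret, headers, raw_body, max_age_seconds=300, now=None):
    """Check the timestamp freshness and the signature of an incoming webhook.

    headers is assumed to be a dict with lowercase header names, and raw_body
    must be the unparsed request bytes exactly as received.
    """
    timestamp = headers.get("x-skyvern-timestamp", "")
    signature = headers.get("x-skyvern-signature", "")
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    current = time.time() if now is None else now
    # Reject stale (or far-future) requests to limit replay attacks.
    if abs(current - sent_at) > max_age_seconds:
        return False
    expected = compute_signature(signing_secret, timestamp, raw_body)
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Verify against the raw bytes before parsing the JSON; re-serializing the parsed body can change whitespace or key order and break the comparison.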

Explaining outputs

If you checked out the sample response, you probably thought “What the heck is this field right here?”

As we went over previously, workflows are essentially a list of building blocks. The outputs field has the output for every single block that a workflow has. Before we start analyzing the outputs from the sample above, let’s go over the building blocks for the invoice retrieval workflow.

Building blocks of invoice retrieval workflow:

| # | Block type | Block label | Purpose |
| --- | --- | --- | --- |
| 1 | TaskBlock | login | Find login page, login to the website |
| 2 | TaskBlock | get_order_history_page_url_and_qualifying_order_ids | Find the order history page, extract order history page url, contact emails, and order details for orders after the start date |
| 3 | ForLoopBlock | iterate_over_order_ids | The contents of the ForLoop is executed for each order id that’s extracted from the previous step. |
| 4 | TaskBlock [within ForLoopBlock] | download_invoice_for_order | For a given order id, find a way to download the invoice, download it. |
| 5 | UploadToS3Block | upload_downloaded_files_to_s3 | Upload all downloaded invoices to S3 |
| 6 | SendEmailBlock | send_email | Send an email attaching all the downloaded invoices |

⚠️ Still in development, not a blocker

  1. The blocks within the ForLoop show up twice: within the ForLoop output and as a root block.
  2. UploadToS3Block output is S3 URIs at the moment. They’ll be updated with signed urls instead.
  3. Add block type to each object in outputs, define the output structure for each block for easier integration.
...
"outputs": {
    "login_output": {
        "task_id": "tsk_1234",
        "status": "completed",
        "extracted_information": null,
        "failure_reason": null,
        "errors": []
    },
    "get_order_history_page_url_and_qualifying_order_ids_output": {
        "task_id": "tsk_1234",
        "status": "completed",
        "extracted_information": {
            ...
        },
        "failure_reason": null,
        "errors": []
    },
    "iterate_over_order_ids_output": [
        ...
    ],
    "download_invoice_for_order_output": {
        "task_id": "tsk_1234",
        "status": "completed",
        "extracted_information": null,
        "failure_reason": null,
        "errors": []
    },
    "upload_downloaded_files_to_s3_output": [
        "s3://skyvern-uploads/...",
        "s3://skyvern-uploads/..."
    ],
    "send_email_output": {
        "success": true
    }
}
...
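
As an illustration of how an integration might read this structure, the sketch below walks the outputs object and produces a one-line summary per block. Because the block type is not yet included in each output (see the development notes above), the code guesses the shape of each entry from its keys; that heuristic, the summary strings, and the function name summarize_outputs are all assumptions made for this example.

```python
def summarize_outputs(outputs):
    """Return {block_label: short status string} for a workflow run's outputs."""
    suffix = "_output"
    summary = {}
    for key, value in outputs.items():
        # Output keys in the sample are the block label plus "_output".
        label = key[:-len(suffix)] if key.endswith(suffix) else key
        if isinstance(value, dict) and "task_id" in value:
            # Looks like a task block: report its status.
            summary[label] = f"task {value.get('status')}"
        elif isinstance(value, dict) and "success" in value:
            # Looks like the send-email block output.
            summary[label] = "ok" if value["success"] else "not ok"
        elif isinstance(value, list):
            # For loop results or uploaded file URIs.
            summary[label] = f"{len(value)} item(s)"
        else:
            summary[label] = "unknown shape"
    return summary
```

For example, an outputs object whose login_output has "status": "completed" would map the login label to "task completed". Once block types are added to the outputs, a lookup on that field would be more robust than inspecting keys.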