How to use the Raptor Streaming API

Raptor Services provides an API that supports streaming data into the Customer Data Platform or Site Search through the Data Manager. The API is intended for “server to server” streaming.

The endpoint is customer specific and is split into different streams by sending a header that states the streamId. By including multiple headers containing a streamId each, you can stream the same set of data to multiple dataflows. A streamId is defined by setting up a stream-dataflow in the Data Manager. This is also where you set up the schema of the input data and do transformation to match the schema of the CDP.

Once the schema is settled, you can insert new columns and update columns with partial information, but not delete columns (yet).

If you wish to change the schema, besides adding new columns to the input schema, it is done by creating a new stream-dataflow, which points to the same destination.

Note: It is advised to consider the StreamId a secret, like an API key or password.

Endpoint

The endpoint for streaming is: https://in.raptorsmartadvisor.com/stream/{accountId}

AccountId is found in the Raptor Control Panel (four or five digits)

Verb:

POST

Headers:

x-streamid: {streamId} (guid provided by Data Manager)
Content-Type: application/json
x-isdraft: Set this header to true as long as the dataflow is in draft mode. Remove the header when pushing the dataflow into production

Note: For streaming the same data to multiple dataflows, include multiple headers with one streamId each.

Body:

a JSON array of objects:

Streaming API 1

Example:

Supported datatypes

String
Boolean (true/false)
Number
DateTime (ISO 8601)
Array of objects
Array of values
Null (for deleting a value in a property)

Supported Schema types

Person data
Interaction data
Catalog data

Remarks about updating data

The API in general only supports adding and updating data. It is currently only possible to delete a data row when writing to a Search schema.

🔍Note: If you wish to delete person data, please use the GDPR endpoint

Interaction Data is always appended. Person Data is always upserted (updated if the person exists, otherwise inserted)

When Person Data is updated, it will either succeed by appending to or overwriting existing data.

By appending new data rows will update existing data rows or are added to the dataset already in the CDP, but any other existing rows are left untouched. By overwriting the original dataset is completely removed and replaced with the new. Another option, particularly well-suited for linking to Site Search, is patching - an enhanced version of appending that allows individual cells to be updated, rather than entire lines. To enable patching, the schema must simply be set to append, while checking the 'Ignore missing columns?' box during the File Mapping stage of the Dataflow. For more information on this process, see Dataflow Creation Guide.

Status codes (of the response)

200 Ok. Data is acknowledged and will eventually be added to system (if schema is correct)

400 Bad Request (data is not valid json and/or the content-type is not sent)

401 Unauthorized (streamid or accountid is invalid or missing)

408 Timeout (try again)

413 Payload too long (send fewer entries in request - the maximum size of a request is 1 MB)

FAQ

Q: Can I really send anything JSON?
A: Yes! When setting up the stream-dataflow, you settle on the schema of the data you are sending in - this schema must be obeyed going forward.

Q: What happens with data that doesn’t have a correct schema?
A: it is acknowledged by the endpoint (status code 200 OK) but it might fail when transforming the data to match the CDP schema. And then an error will be sent to the Operational Insights, which could send an email with info on the error.

It is also possible to see samples of failed entries in the Data Manager.

Q: What is considered a failing schema?
A: Wrong datatypes, Missing columns that are required. Empty entries.

Q: what if I send fewer columns than the schema dictates?
A: All columns from the schema must be present. Even though they might have empty, null or default values.

Q: What if I send more columns than expected?
A: Extra columns would be ignored until they are mapped in the Data Manager

Q: How reliable is the API? What if it's down?

A: Our host, Azure, is generally highly reliable. Setting a 2-second retry rule should help swiftly bypass any minor hiccups that may occur.

Q: Is there a maximum number of entries for a request? If so, what is it?

A: There is no direct ceiling to the number of entries you can send at once - however, the maximum size of a request is roughly 1 MB, so there is an effective cap that depends on the size of your entries.

Q: What cybersecurity measures are used to prevent foreign data from being inserted into the datastream?

A: The primary safety is the SteamID used by the client. As long as this is kept secure, no outside source can intrude on the datastream. This ID should thus be kept carefully secured, since if it leaks, it could be used to send false data...

Q: Are there measures in place to ensure that the streaming data won't be redirected to another source by malicious action, such as hacking?

A: Of course - since this dataflow includes customer data, it is treated with all due precautions. Standard HTTPS encryption-protocols are fully implemented, and the StreamID essentially acts as a secure API Key, preventing such man-in-the-middle attacks.