Using the REST API v2 dataexport Iterator Endpoints
The Netskope dataexport endpoints, also called iterator endpoints, provide a simplified way of consuming tenant log information. This article describes the best practices for consumption of this data.
Where possible, Netskope recommends using one of the following existing clients for SIEM integration:
- Netskope Cloud Exchange: Log Shipper Module: https://docs.netskope.com/en/netskope-cloud-exchange.html
- Netskope Splunk Technical Add-on: https://apps.splunk.com/app/3808/
- Sumo Logic Netskope WebTx Source: https://help.sumologic.com/docs/send-data/hosted-collectors/cloud-to-cloud-integration-framework/netskope-webtx-source/
If these clients are insufficient, Netskope also provides a Python SDK:
- Python SDK for dataexport endpoints: https://pypi.org/project/netskopesdk/
How Do Iterator Endpoints Function
Using an index, the Netskope platform tracks log consumption through a simplified operational workflow similar to paging through a web forum. When a consumer requests a page of data from the endpoint, Netskope delivers the requested data and writes an index recording which data was provided. When the consumer has finished processing the requested page, the consumer simply requests the next page of data.
Each endpoint stores its own index value, which the consumer provides in the query. This makes it easy to run API calls against multiple endpoints in parallel.
Note
Multiple consumers using the same endpoint and index concurrently is not supported and could result in the appearance of missing data on the consumer.
Iterator Query Structure
An iterator query has the following structure:
https://<tenant-URL>/api/v2/<endpoint>/?operation=<operation>&index=<index>
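As a minimal sketch, the query structure above could be assembled in Python as follows. The function name `build_iterator_url` and the host `tenant.example.com` are illustrative assumptions, not part of the Netskope API. The path follows the form used in the example query in this article, without a trailing slash before the query string.

```python
from urllib.parse import urlencode


def build_iterator_url(tenant: str, endpoint: str, operation: str, index: str) -> str:
    """Assemble an iterator query URL from its four parts (hypothetical helper)."""
    # urlencode escapes any characters in the index that are not URL-safe.
    query = urlencode({"operation": operation, "index": index})
    return f"https://{tenant}/api/v2/{endpoint.strip('/')}?{query}"


# tenant.example.com is a placeholder for <tenant-URL>.
url = build_iterator_url(
    "tenant.example.com", "events/dataexport/events/alert", "next", "demo"
)
print(url)
```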
Supported Iterator Operations
- epoch timestamp: If an epoch timestamp is provided, the Netskope endpoint begins log consumption from this timestamp in one-hour batches. Use the next operation to fetch more logs.
- next: The next operation requests the next page of data from the Netskope endpoint.
- resend: If the consumer is unable to process the page of data provided, the resend operation retries the last page of data requested.
The epoch timestamp and next operations both update the index stored by Netskope, whereas the resend operation requests the prior page without updating the index.
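One way a consumer might choose among the three operations is sketched below. The helper `choose_operation` and its parameters are hypothetical. It assumes the consumer tracks whether it has already started consuming and whether the last page was processed successfully.

```python
from typing import Optional


def choose_operation(start_epoch: Optional[int], started: bool, last_page_ok: bool) -> str:
    """Pick the operation value for the next query (hypothetical helper)."""
    if not started:
        # First call: begin at the given timestamp, or use "next" when no
        # starting timestamp was chosen.
        return str(start_epoch) if start_epoch is not None else "next"
    if not last_page_ok:
        # The last page could not be processed: request it again.
        # resend does not update the stored index.
        return "resend"
    return "next"


print(choose_operation(1700000000, started=False, last_page_ok=True))
print(choose_operation(1700000000, started=True, last_page_ok=False))
```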
Iterator Index
The index value for the iterator is a string supplied by the consumer that Netskope uses to store the page position. The index should be unique to each consumer to prevent data consumption problems. A consumer may use the same index value across multiple endpoints without concern.
The index string matters when more than one system is pulling logs. For example, suppose you use demo as the index and pull records 1-1000. The next time demo pulls logs, it receives records 1001-2000 (if pulling 1000 at a time).
If you leave the index blank and have two systems pulling logs, the first system pulls records 1-1000, and when the second system then pulls logs, it receives records 1001-2000. This is not optimal, because each system sees only part of the data.
If system1 (index demo1) and system2 (index demo2) pull logs at the same time using unique index strings, each receives records 1-1000.
If you do not define an index, there is a chance that the index is reused. If you define your own index, you can ensure that only you use that index value and that you do not lose records.
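The behavior described above can be illustrated with a toy model. This is not Netskope's implementation. `ToyIterator` is a hypothetical in-memory stand-in that keeps one read position per index string, and the page size of 1000 mirrors the example in the text.

```python
class ToyIterator:
    """Toy stand-in for one endpoint: keeps one read position per index string."""

    def __init__(self, records, page_size):
        self.records = records
        self.page_size = page_size
        self.positions = {}  # index string -> offset of the next unread record

    def next_page(self, index):
        start = self.positions.get(index, 0)
        page = self.records[start:start + self.page_size]
        self.positions[index] = start + len(page)
        return page


records = list(range(1, 3001))  # records 1..3000

# Unique indexes: each system reads from the beginning independently.
unique = ToyIterator(records, page_size=1000)
system1_page = unique.next_page("demo1")
system2_page = unique.next_page("demo2")

# Shared index: the second caller continues where the first one stopped.
shared = ToyIterator(records, page_size=1000)
first_page = shared.next_page("demo")
second_page = shared.next_page("demo")
```

With unique indexes both systems receive records 1-1000. With a shared index the second call receives records 1001-2000, so neither caller sees the full stream.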
Page Size
Netskope Iterator endpoints deliver 10,000 record pages per API call.
Wait Time
Each iterator response includes a wait_time value, in seconds, that indicates how long to wait before the next query. This value is calculated based on the amount of data returned in your API call.
{ "ok": 1, "result": [ { } ], "wait_time": 5 }
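A minimal sketch of reading the records and the wait guidance from a response body of the shape shown above. The helper name `parse_page` is hypothetical, and defaulting to zero when `wait_time` is absent is an assumption rather than documented behavior.

```python
import json


def parse_page(body: str):
    """Return (records, wait_seconds) from an iterator JSON response body."""
    payload = json.loads(body)
    records = payload.get("result", [])
    # Assumption: treat a missing wait_time as "no wait required".
    wait_seconds = float(payload.get("wait_time", 0))
    return records, wait_seconds


records, wait_seconds = parse_page('{"ok": 1, "result": [{}], "wait_time": 5}')
print(len(records), wait_seconds)
```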
Rate Limits
Netskope recommends using the response headers to manage rate limits and avoid 429 error messages.
- RateLimit-Limit: The number of queries allowed per second. Rate limits are applied per endpoint.
- RateLimit-Remaining: The number of queries still allowed before the interval resets. Exceeding this number generates a 429 error message.
- RateLimit-Reset: The time, in seconds, before the rate limits are reset.
HTTP/1.1 200 OK
....
RateLimit-Limit: 4
RateLimit-Remaining: 1
RateLimit-Reset: 1
...
If a rate limit is exceeded, the response includes an additional header, and the data payload explains why the query returned a 429 error response.
- Retry-After: The recommended wait time, in seconds, before retrying your query.
HTTP/1.1 429 Too Many Requests
...
RateLimit-Remaining: 0
RateLimit-Reset: 1
Retry-After: 1
RateLimit-Limit: 4
...
{ "message":"API rate limit exceeded" }
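The header handling described above might be sketched as follows. The function `seconds_to_wait` is a hypothetical helper, and the one-second fallback used when a header is missing is an assumption. Header names are compared case-insensitively because HTTP header names are not case-sensitive.

```python
def seconds_to_wait(status: int, headers: dict) -> float:
    """Decide how long to pause before the next query, based on rate-limit headers."""
    h = {name.lower(): value for name, value in headers.items()}
    if status == 429:
        # Prefer Retry-After; fall back to RateLimit-Reset, then to 1 second (assumption).
        return float(h.get("retry-after", h.get("ratelimit-reset", 1)))
    if int(h.get("ratelimit-remaining", 1)) <= 0:
        # No queries left in this interval: wait until the limit resets.
        return float(h.get("ratelimit-reset", 1))
    return 0.0


ok_wait = seconds_to_wait(
    200, {"RateLimit-Limit": "4", "RateLimit-Remaining": "1", "RateLimit-Reset": "1"}
)
limited_wait = seconds_to_wait(
    429, {"RateLimit-Remaining": "0", "RateLimit-Reset": "1", "Retry-After": "1"}
)
```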
Example
The following example workflow uses the iterator endpoint, starting with the oldest record Netskope has.
1. Craft your query using operation=next and an index value.
https://<tenant-URL>/api/v2/events/dataexport/events/alert?operation=next&index=demo
2. Review the wait_time attribute in the JSON response.
"wait_time": 5
3. Request the next page of data from the endpoint.
https://<tenant-URL>/api/v2/events/dataexport/events/alert?operation=next&index=demo
4. Repeat steps 2 and 3.
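The workflow above can be sketched as a simple loop. To keep the example runnable without a tenant, the HTTP call is passed in as a `fetch` callable that takes the operation and returns the decoded JSON response. `iterate_pages`, `fetch`, `max_pages`, and the `id` field in the sample records are all hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Iterator


def iterate_pages(
    fetch: Callable[[str], dict], max_pages: int, sleep: Callable = time.sleep
) -> Iterator[list]:
    """Request pages with operation=next, waiting wait_time seconds between calls."""
    for _ in range(max_pages):
        payload = fetch("next")
        yield payload.get("result", [])
        # Honor the wait guidance before issuing the next request.
        sleep(payload.get("wait_time", 0))


# Stand-in for real HTTP calls: two canned responses.
fake_pages = iter([
    {"ok": 1, "result": [{"id": 1}], "wait_time": 5},
    {"ok": 1, "result": [{"id": 2}], "wait_time": 0},
])
slept = []  # records the requested waits instead of actually sleeping
pages = list(iterate_pages(lambda op: next(fake_pages), max_pages=2, sleep=slept.append))
```

In a real consumer, `fetch` would issue the GET request for the endpoint and index, and the loop would run from a single thread per endpoint and index.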
Error Response Codes
Error Code | User Action Required | Notes |
---|---|---|
403 | Yes | Check that the API v2 token is associated with the correct endpoint and has not expired. A retry succeeds only after the token issue has been resolved. |
409 | No | A concurrency conflict occurred and the request cannot be processed at this time. The dataexport API v2 endpoints do not support downloading the same event type concurrently with the same iterator index. The client is expected to verify that its logic for pulling events is single threaded. |
429 | No | Too many requests from the same tenant to the same endpoint. The client is expected to honor the rate limit to avoid a 429 error. The response carries the reset time in the ratelimit-reset header, and the client is expected to sleep for that long before retrying. The current rate limit is 4 requests per second per endpoint. |
5xx | No | Netskope is having a temporary server issue for one of these reasons:
|
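A client might translate the table above into a small dispatch function, sketched here. The function name and the returned action labels are hypothetical, and treating 5xx responses as retryable is an inference from the table's "temporary server issue" wording.

```python
def next_action(status: int) -> str:
    """Map an HTTP status code to a client action (labels are illustrative)."""
    if 200 <= status <= 299:
        return "process"
    if status == 403:
        return "fix_token"  # user action required before a retry can succeed
    if status == 409:
        return "retry_single_threaded"  # ensure only one puller per endpoint and index
    if status == 429:
        return "wait_then_retry"  # sleep for ratelimit-reset seconds first
    if 500 <= status <= 599:
        return "retry_later"  # assumption: temporary server issue
    return "investigate"


for code in (200, 403, 409, 429, 503):
    print(code, next_action(code))
```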
Using the Client Status Iterator API
The Netskope Client periodically reports its status to the Netskope backend so that different aspects of Clients are visible in the tenant UI, for example, the status of user-initiated actions (enable/disable), installation or upgrade status, and current tunnel status (Up/Down). Users can check the Client status logs on the device page in the tenant UI.
Using a Client status iterator, the Netskope platform tracks log consumption through a simplified operational workflow. When a consumer requests a page of Client status events from the endpoint, Netskope delivers the requested events and writes an index with the watermark of the events provided. When the consumer has finished processing the requested page of events, the consumer simply requests the next page.
The Client status iterator service provides a streaming API and these management APIs:
- Create Iterator API: Allows you to create a new iterator. Call this API before sending requests for event logs.
- Check Iterator Status API: Allows you to check whether the creation of an iterator is completed.
- Delete Iterator API: Allows you to delete an existing iterator. Generally, this is used when you need to rename an iterator of a certain event type.
- Event Fetch API: Allows you to request event logs from an iterator. The response is returned in CSV format.
Workflow
The following steps show how to use these APIs to request Client status events:
1. Use the Create Iterator API to create an iterator for Client status events.
POST https://my_test_tenant/api/v2/dataexport/iterator/my_test_index?eventtype=clientstatus
2. Use the Check Iterator Status API to check the creation status of the iterator until the iterator is ready.
GET https://my_test_tenant/api/v2/dataexport/iterator/my_test_index
3. Use the Event Fetch API to request events from the iterator.
GET https://my_test_tenant/api/v2/dataexport/iterator/my_test_index/events?operation=next
4. Review the wait_time attribute in the response header and wait accordingly, for example "wait_time": 1.
5. Use the Event Fetch API to request the next page of events from the iterator.
GET https://my_test_tenant/api/v2/dataexport/iterator/my_test_index/events?operation=next
6. Repeat steps 4 and 5.
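As a sketch, the method and URL for each step of this workflow could be generated as follows. The helper `client_status_requests` and its dictionary keys are hypothetical. The tenant and index values are the placeholders used in the steps above, and only the three calls shown in this workflow are covered.

```python
def client_status_requests(tenant: str, index: str) -> dict:
    """Return (HTTP method, URL) pairs for the Client status iterator workflow."""
    base = f"https://{tenant}/api/v2/dataexport/iterator/{index}"
    return {
        "create": ("POST", f"{base}?eventtype=clientstatus"),
        "status": ("GET", base),
        "fetch": ("GET", f"{base}/events?operation=next"),
    }


requests_by_step = client_status_requests("my_test_tenant", "my_test_index")
for step, (method, url) in requests_by_step.items():
    print(step, method, url)
```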
API Limits
- We allow only one Client status iterator per tenant.
- Concurrent Create Iterator or Delete Iterator requests are not supported and could result in a request failure.
- The iterator service is designed to stream recent event logs at high speed. You can only request event logs that are not older than a certain time period; older events are dropped automatically if not requested in time. The supported retention for Client status iterators is 7 days.
- Concurrent event fetch requests on the same iterator are not supported and will result in request failure.
- Multiple consumers requesting event logs from the same iterator concurrently is not supported and could result in the appearance of missing data on the consumer.