Configuring the Cloud TAP Stitcher


You must deploy and configure the Cloud TAP stitcher to use Cloud TAP. Netskope distributes the Cloud TAP stitcher tool as a container image. The tool pulls traffic from cloud storage, stitches connections back together, performs TLS decryption, and then exports the traffic.

You can export the traffic to .pcap format or send it over the wire to external NDR tools using the VXLAN or Geneve protocols.

Deploying the Cloud TAP Stitcher

The Cloud TAP stitcher is available as a docker image that you must run on your machine. The following link points to the docker image for the stitcher: https://hub.docker.com/r/nsteam/cloudtap-stitcher

Use the following commands to download the docker image and verify it:

# download the latest version of the docker image
docker pull nsteam/cloudtap-stitcher:latest

# download a specific version of the docker image
docker pull nsteam/cloudtap-stitcher:[version number]

# list the downloaded images on the docker host
ubuntu@ip-[IP address]:~$ docker images
REPOSITORY                 TAG      IMAGE ID       CREATED       SIZE
nsteam/cloudtap-stitcher   latest   d080b2946cca   11 days ago   567MB

The following is a list of commonly used arguments for running docker containers. To learn more, see the Docker help documentation.

Argument        Description
--name          Assign a name to the container.
-d              Run the container in the background and print the container ID.
--rm            Automatically remove the container upon exiting.
-v              Bind mount a volume.
--entrypoint    Overwrite the default ENTRYPOINT of the image.
-c              Execute a command.

The following is an example command that runs the stitcher in continuous mode with /usr/bin/bash as the entrypoint. The container runs in the background and is removed automatically upon exiting. The container mounts are:

  • The host directory /home/ubuntu/ctap to /ctap in the container, which contains the credentials file aws.json.
  • The host directory /var/log to /var/log in the container, for the stitcher.log output.

docker run --rm -d --name ns-ctap-stitcher \
-v /home/ubuntu/ctap:/ctap -v /var/log:/var/log --entrypoint "/usr/bin/bash" \
nsteam/cloudtap-stitcher \
-c 'stitcher -n --log-progress /ctap -c /ctap/aws.json --aws-region="us-west-1" --provider aws -b traffic >> /var/log/stitcher.log 2>&1'

Configuring the Cloud TAP Stitcher

The Cloud TAP stitcher downloads traffic from cloud storage (e.g., AWS S3, GCP, Azure) and converts it so that the packets can be exported locally or uploaded to an NDR, in their original or decrypted form.

Note

The format of the downloaded traffic is proprietary and any details are outside the scope of this document.

The stitcher executable is installed in the container under /usr/bin/. Run docker ps and locate the CONTAINER ID, then connect to the container by running docker exec -it <CONTAINER ID> bash. From inside the container, run stitcher --help to view the full list of options. The core arguments are separated into two groups: input and output methods. Use the input options to control cloud import (e.g., cloud provider authentication). Use the output options to control the output (e.g., protocol, format, etc.). When you are done working in the container, type exit and press Enter to leave the container.
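
For reference, the sequence looks like this (the container ID shown by docker ps is a placeholder):

# find the CONTAINER ID of the running stitcher
docker ps

# open a shell inside the container
docker exec -it <CONTAINER ID> bash

# view the full list of stitcher options, then leave the container
stitcher --help
exit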

Running Modes

Selective retrieval
You can retrieve data from a specific timeframe. Use --start-ts and --end-ts to specify the start and end timestamps. If no timestamp options are specified, the stitcher fetches all BLOBs from the source.
# Fetch 1 hour of data starting from Thu Jun 01 2023 22:00:00 GMT+0000
stitcher --provider gcp -b traffic --start-ts 1685656800 --end-ts 1685660400 -e export.pcap

One-shot fetching
By default, the stitcher fetches all available traffic and terminates. You can use this mode for integration testing.
# fetch and process the BLOBs once
stitcher --provider gcp -b traffic --geneve-host x.x.x.x

Continuous fetching
This mode is supported for NDR integration. The stitcher fetches all available data and then continues polling and fetching updates.
# run continuously to fetch new data
stitcher --provider gcp -b traffic --geneve-host x.x.x.x --continuous

Crash recovery
Use --log-progress to store progress in case of a crash. The stitcher saves the progress file (.stitcher-progress) in the provided directory path, which should be a persistent directory mounted into the container. After a crash, the stitcher loads the previous progress and continues to run.
# run with crash recovery; progress file in the /tmp directory
stitcher --provider gcp -b traffic --geneve-host x.x.x.x --log-progress /tmp

Vertical scaling
Use --mt to enable vertical scaling. The stitcher separates BLOB fetching and parsing across the available CPU cores.
# use additional CPU cores to retrieve and replay the traffic
stitcher --provider gcp -b traffic --geneve-host x.x.x.x --mt

Cloud Storage

Use the --provider option to configure the cloud providers that serve as the source of traffic for the stitcher.

By default, the stitcher uses the same bucket for both the session keys and the traffic data (-b or --bucket), and the same credentials file for both (-c or --credentials). You can specify a different bucket for the keys with -k or --keylog-bucket and, optionally, a separate credentials file for the keylog bucket with --key-credentials.
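
For example, a sketch using separate buckets for traffic and session keys might look like this (the bucket names, credentials paths, and Geneve endpoint are placeholders):

# traffic data in one bucket, session keys in another, each with its own credentials file
stitcher --provider aws -c /ctap/aws.json --aws-region="us-west-1" \
  -b traffic -k traffic-keys --key-credentials /ctap/aws-keys.json \
  --geneve-host x.x.x.x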

Amazon Web Services

For data hosted in an AWS S3 bucket (--provider aws), you must provide the access credentials (--credentials), the bucket name (--bucket), and the region (--aws-region). The credentials value is a filesystem path to a JSON file inside the container. One way to provide credentials for access to AWS S3 is to pass this JSON file on the stitcher CLI.

Alternatively, you can assign an IAM role to the AWS EC2 instance instead of specifying a credentials file. Once the stitcher is running on the instance and you grant the IAM role the necessary permissions to access the bucket, use the --default-aws-credentials option. The AWS SDK will automatically manage authentication based on the instance’s IAM role.
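
As a sketch, an invocation that relies on the instance's IAM role might look like this (the region, bucket, and Geneve endpoint are placeholders):

# authenticate through the EC2 instance's IAM role instead of a credentials file
stitcher --provider aws --default-aws-credentials --aws-region="us-west-1" \
  -b traffic --geneve-host x.x.x.x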

docker run --rm -d --name ns-ctap-stitcher \
-v /home/ubuntu/ctap:/ctap -v /var/log:/var/log --entrypoint "/usr/bin/bash" \
nsteam/cloudtap-stitcher \
-c 'stitcher -n --log-progress /ctap -c /ctap/aws.json --aws-region="INSERT_REGION_HERE" --provider aws -b traffic >> /var/log/stitcher.log 2>&1'
Google Cloud Storage

For the data hosted in a Google Cloud Storage bucket (--provider gcp), you must provide the service account credentials (--credentials) and bucket name (--bucket) to the stitcher. The credentials value is a filesystem path to the JSON file in the container.

If the stitcher runs on a GCP VM whose service account has been granted permission to access the bucket, you can use --default-gcp-credentials instead of specifying a credentials file. The Google SDK handles the authentication automatically.
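
As a sketch, a docker run invocation for GCP that mirrors the AWS example above might look like this (the host paths and the gcp.json credentials filename are placeholders):

docker run --rm -d --name ns-ctap-stitcher \
-v /home/ubuntu/ctap:/ctap -v /var/log:/var/log --entrypoint "/usr/bin/bash" \
nsteam/cloudtap-stitcher \
-c 'stitcher -n --log-progress /ctap -c /ctap/gcp.json --provider gcp -b traffic >> /var/log/stitcher.log 2>&1'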

Microsoft Azure

For data hosted in a Microsoft Azure Blob (--provider azure), you must provide the access credentials (--credentials) and the name of the Azure storage account (--storage-account) where your data is hosted. The credentials value is a filesystem path to a JSON file inside the container. When the Cloud TAP stitcher runs on an Azure VM with a Managed Identity attached, it can use --default-credentials azure instead of specifying a credentials file. The Azure SDK handles the authentication automatically.

docker run --rm -d --name ns-ctap-stitcher \
-v /home/ubuntu/ctap:/ctap -v /var/log:/var/log --entrypoint "/usr/bin/bash" \
nsteam/cloudtap-stitcher \
-c 'stitcher -n --log-progress /ctap -c /ctap/azure.json --storage-account INSERT_STORAGE_ACC_NAME_HERE --provider azure >> /var/log/stitcher.log 2>&1'

Note

When the Cloud TAP feature is enabled, a default container named 'netskope' is automatically created within the specified Azure storage account.

Output Modes

The stitcher can export traffic data locally or upload the data to an NDR sensor that is located in the same cloud provider and region. For NDR, the stitcher uses VXLAN or Geneve encapsulation to tunnel the traffic to the sensor:

  • For VXLAN, use -v or --vxlan-host to specify the remote VXLAN endpoint.
  • For Geneve, use -g or --geneve-host to specify the remote Geneve endpoint.
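
For example, a sketch that uploads to a remote VXLAN endpoint instead of a Geneve endpoint (the credentials file, bucket, and endpoint IP are placeholders):

# tunnel the traffic to an NDR sensor over VXLAN
stitcher --provider gcp -c gcp.json -b traffic --vxlan-host x.x.x.x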

Netskope-Defined Geneve Options

When you configure Geneve encapsulation (-g), the stitcher uploads the following Geneve options to an NDR sensor.

Geneve Option   Description
USER_ID         If using the NSClient access method, you can use this option for Geneve to provide the Netskope-specific user identifier (email).
SITE_ID         If using the IPSec or GRE access methods, you can use this option for Geneve to provide the Netskope-specific site identifier configured for each tunnel by the tenant.

MTU Adjustment

(Optional) Because the stitcher uses VXLAN or Geneve encapsulation to tunnel traffic, additional bytes are added to the packet size. If necessary, you can adjust the maximum transmission unit (MTU) configuration to accommodate the increase in packet size and prevent IP fragmentation.

  • For GCP and Azure, you can enable jumbo frames for optimal performance. This ensures that network interfaces can handle larger packet sizes without fragmentation. For more information on enabling jumbo frames, refer to the GCP or Azure documentation.
  • If the exported packet size exceeds the default MTU, you can adjust the MTU of the docker host. For more information on modifying the MTU of a network interface, refer to your docker host's documentation.
  • You can also adjust the MTU setting of the docker container to match the MTU value of the docker host, as shown in the sketch after this list. To learn more about adjusting the MTU of a docker container, refer to the Docker documentation.
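
As a sketch, assuming the docker host's interface is eth0 and the cloud network supports jumbo frames (the MTU value below is an example only; consult your cloud provider's documentation for the supported maximum):

# raise the MTU of the docker host interface
sudo ip link set dev eth0 mtu 8900

# match the MTU of the default docker bridge by adding { "mtu": 8900 } to /etc/docker/daemon.json,
# then restart the docker daemon
sudo systemctl restart docker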

Transparent Mode

Some NDR vendors (e.g., ExtraHop) provide data decryption and can accept the original traffic together with the session keys. This is supported with the --transparent option.

If you are using Geneve extensions, use the following options to control the timestamps communicated to the sensor:

  • Use --with-timestamps to replay the original timestamps.
  • Use --with-current-timestamps to produce current timestamps at the point of export. You can use this option for testing purposes.

To separate the ExtraHop management interface and data capture interface for higher throughput, use the following options to push the session keys and data to different interfaces:

  • Use --extrahop-prod to specify the hostname for uploading ExtraHop NDR data.
  • Use --extrahop-management-host to specify the hostname for uploading ExtraHop session keys.
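
Putting these options together, a transparent-mode invocation might look like the following sketch (the credentials file, region, bucket, and host addresses are placeholders drawn from the examples elsewhere in this topic):

# replay original traffic and session keys to ExtraHop over separate data and management hosts
stitcher --provider aws -c aws.json --aws-region="us-west-1" -b traffic \
  --transparent --with-timestamps \
  --extrahop-prod 192.168.10.2 --extrahop-management-host 10.138.0.40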

Decryption Mode

By default, the traffic processed by the stitcher is decrypted. You can upload the traffic directly to an NDR that accepts decrypted traffic. To export decrypted traffic to an NDR, use the --with-decrypted option.

stitcher -c aws.json --provider aws --aws-region="us-west-1" -b traffic -g x.x.x.x -n \
--with-timestamps --with-decrypted

Filtering

The stitcher can apply traffic filters while processing traffic, allowing fine-grained slices to be exported. Each filter value is a pcap filter expression. For example, you can retrieve a slice of traffic from a given time range and source.

# Fetch 1 hour of data starting from Thu Jun 01 2023 22:00:00 GMT+0000 for a single user
stitcher --provider gcp -b traffic \
  --start-ts 1685656800 --end-ts 1685660400 \
  --export-filter "host 1.2.3.4 and tcp port 443" \
  -e export.pcap

Scaling with Multiple Stitcher Instances

You can distribute traffic copied from multiple origins (e.g., Netskope POPs) across multiple instances to achieve the desired scale.

To select the traffic sources to process, use wildcards with the --origin-filter option. The following are examples of wildcards you can use with this option.

Tip

Prior to configuring multiple stitcher instances, Netskope recommends planning your values to avoid overlap and ensure balanced load distribution. You must also ensure that each instance is responsible for a distinct subset of traffic. Distributing copied traffic from one Netskope POP across multiple stitcher instances is not supported.

Wildcard                    Description
--origin-filter US-NYC*     The stitcher processes all points of presence (POPs) that start with US-NYC.
--origin-filter *NYC*       The stitcher processes all POPs that contain NYC.
--origin-filter UK-*        The stitcher processes all POPs in the UK.

# selective retrieval by data center
stitcher --provider gcp -c gcp.json -b traffic --origin-filter US-NYC* --extrahop-management-host 10.138.0.40 --extrahop-prod 192.168.10.2

The following is an example of using multiple wildcards to distribute the load across multiple instances.

Instance 1:
stitcher --provider gcp -c gcp.json -b traffic --origin-filter US-* --extrahop-management-host 10.138.0.40 --extrahop-prod 192.168.10.2

Instance 2:
stitcher --provider gcp -c gcp.json -b traffic --origin-filter IN-* --extrahop-management-host 10.138.0.40 --extrahop-prod 192.168.10.2

Instance 3:
stitcher --provider gcp -c gcp.json -b traffic --origin-filter UK-* --extrahop-management-host 10.138.0.40 --extrahop-prod 192.168.10.2