Configuring the Cloud TAP Stitcher
You must deploy and configure the Cloud TAP stitcher to use Cloud TAP. Netskope distributes the Cloud TAP stitcher tool as a container image. The tool pulls traffic from cloud storage, reassembles (stitches) the connections, performs TLS decryption, and then exports the traffic.
You can export the traffic to .pcap files, or send it over the wire to external NDR tools using the VXLAN or Geneve protocols.
Deploying the Cloud TAP Stitcher
The Cloud TAP stitcher is available as a Docker image that you run on your machine. The image is published on Docker Hub: https://hub.docker.com/r/nsteam/cloudtap-stitcher
Use the following commands to download the Docker image:

Command | Description |
---|---|
`docker pull nsteam/cloudtap-stitcher` | Download the latest version of the Docker image. |
`docker pull nsteam/cloudtap-stitcher:[version number]` | Download a specific version of the Docker image. |
`docker images` | List the images available on the Docker host. |
The following is a list of commonly used arguments for running docker containers. To learn more, see the Docker help documentation.
Arguments | Description |
---|---|
--name | Assign a name to the container. |
-d | Run the container in the background and print the container ID. |
--rm | Automatically remove the container upon exiting. |
-v | Bind mount a volume. |
--entrypoint | Overwrite the default ENTRYPOINT of the image. |
-c | Passed to the bash entrypoint (not a `docker run` option); tells bash to execute the given command string. |
The following is an example command to run the stitcher in continuous mode with `/usr/bin/bash` as the entrypoint. The container runs in the background and is removed automatically upon exiting. The container mounts are:

- The host directory `/home/ubuntu/ctap` to `/ctap` in the container, which holds the credentials file `aws.json`.
- The host directory `/var/log` to `/var/log` in the container for the `stitcher.log` output.

```
docker run --rm -d --name ns-ctap-stitcher \
  -v /home/ubuntu/ctap:/ctap -v /var/log:/var/log --entrypoint "/usr/bin/bash" \
  nsteam/cloudtap-stitcher \
  -c 'stitcher -n --log-progress /ctap -c /ctap/aws.json --aws-region="us-west-1" --provider aws -b traffic >> /var/log/stitcher.log 2>&1'
```
Configuring the Cloud TAP Stitcher
The Cloud TAP stitcher downloads traffic from cloud storage (e.g., AWS S3, Google Cloud Storage, Azure Blob Storage), then converts it and exports the packets locally or uploads them to an NDR, in their original or decrypted form.
Note
The format of the downloaded traffic is proprietary and any details are outside the scope of this document.
The stitcher executable is installed in the container under `/usr/bin/`. Run `docker ps` and locate the `CONTAINER ID`, then connect to the container by running `docker exec -it <CONTAINER ID> bash`. From inside the container, run `stitcher --help` to view the full list of options. The core arguments are separated into two groups: input and output methods. Use the input options to control cloud import (e.g., cloud provider authentication), and the output options to control output (e.g., protocol, format, etc.). When you are done working in the container, type `exit` and press Enter to leave it.
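The steps above can be sketched as a short shell session (the container name `ns-ctap-stitcher` is taken from the `docker run` example earlier in this document):

```shell
# List running containers and note the CONTAINER ID (or use the container name).
docker ps

# Open an interactive shell inside the stitcher container.
docker exec -it ns-ctap-stitcher bash

# From inside the container, show the full list of stitcher options.
stitcher --help

# Leave the container shell when done.
exit
```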
Running Modes
Running Mode | Description | Example |
---|---|---|
Selective retrieval | Retrieve data from a specific timeframe. Use `--start-ts` and `--end-ts` to specify the start and end timestamps. If no timestamp options are specified, the stitcher fetches all BLOBs from the source. | `# Fetch 1 hour of data starting from Thu Jun 01 2023 22:00:00 GMT+0000` |
One-shot fetching | By default, the stitcher fetches all available traffic and terminates. You can use this mode for integration testing. | `# Fetch and process the BLOBs once` |
Continuous fetching | This mode is supported for NDR integration. The stitcher fetches all available data and then continues polling for updates. | `# Run continuously to fetch new data` |
Crash recovery | Use `--log-progress` to store progress in case a crash occurs. The stitcher saves the progress file (`.stitcher-progress`) in the provided directory path, which must be a persistent directory mounted into the container. After a crash, the stitcher loads the previous progress and continues to run. | `# Run with the crash-recovery progress file in the /tmp directory` |
Vertical scaling | Use `--mt` to enable vertical scaling. The stitcher spreads BLOB fetching and parsing across available CPU cores. | `# Use additional CPU cores to retrieve and replay the traffic` |
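As a sketch, the modes in the table map to invocations like the following (the GCP provider, `gcp.json` credentials file, and `traffic` bucket are placeholders borrowed from the document's other examples; `-n` is the continuous-mode flag used in the `docker run` examples):

```shell
# Selective retrieval: fetch 1 hour of data using Unix timestamps.
stitcher --provider gcp -c gcp.json -b traffic --start-ts 1685656800 --end-ts 1685660400

# One-shot fetching (default): process all available BLOBs once, then exit.
stitcher --provider gcp -c gcp.json -b traffic

# Continuous fetching: keep polling for new data (NDR integration).
stitcher --provider gcp -c gcp.json -b traffic -n

# Crash recovery: persist the .stitcher-progress file in a mounted directory.
stitcher --provider gcp -c gcp.json -b traffic -n --log-progress /ctap

# Vertical scaling: spread BLOB fetching and parsing across CPU cores.
stitcher --provider gcp -c gcp.json -b traffic --mt
```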
Cloud Storage
Use the `--provider` option to configure the cloud provider that serves as the source of traffic for the stitcher.

By default, the stitcher uses the same bucket (`-b` or `--bucket`) and the same credentials file (`-c` or `--credentials`) for both keys and data. You can specify a different bucket for the keylog (`-k` or `--keylog-bucket`) and, optionally, a separate credentials file for the keylog bucket (`--key-credentials`).
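To illustrate the bucket and credential options, a hedged sketch of both layouts (the bucket names and file paths are placeholders):

```shell
# Default: one bucket and one credentials file for both keys and data.
stitcher --provider aws -c /ctap/aws.json --aws-region="us-west-1" -b traffic

# Separate keylog bucket with its own credentials file.
stitcher --provider aws -c /ctap/aws.json --aws-region="us-west-1" \
  -b traffic-data -k traffic-keys --key-credentials /ctap/keys.json
```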
Amazon Web Services
For data hosted in an AWS S3 bucket (`--provider aws`), you must provide the access credentials (`--credentials`), the bucket name (`--bucket`), and the region (`--aws-region`). The credentials value is a filesystem path to a JSON file inside the container.

Alternatively, you can assign an IAM role to the AWS EC2 instance instead of specifying a credentials file. Once the stitcher is running on the instance and you have granted the IAM role the necessary permissions to access the bucket, use the `--default-aws-credentials` option. The AWS SDK then automatically manages authentication based on the instance's IAM role.
```
docker run --rm -d --name ns-ctap-stitcher \
  -v /home/ubuntu/ctap:/ctap -v /var/log:/var/log --entrypoint "/usr/bin/bash" \
  nsteam/cloudtap-stitcher \
  -c 'stitcher -n --log-progress /ctap -c /ctap/aws.json --aws-region="INSERT_REGION_HERE" --provider aws -b traffic >> /var/log/stitcher.log 2>&1'
```
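When the EC2 instance's IAM role already has read access to the bucket, the same invocation can drop the credentials file in favor of `--default-aws-credentials` (a hedged variant of the example above):

```shell
docker run --rm -d --name ns-ctap-stitcher \
  -v /home/ubuntu/ctap:/ctap -v /var/log:/var/log --entrypoint "/usr/bin/bash" \
  nsteam/cloudtap-stitcher \
  -c 'stitcher -n --log-progress /ctap --default-aws-credentials --aws-region="INSERT_REGION_HERE" --provider aws -b traffic >> /var/log/stitcher.log 2>&1'
```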
Google Cloud Storage
For data hosted in a Google Cloud Storage bucket (`--provider gcp`), you must provide the service account credentials (`--credentials`) and the bucket name (`--bucket`). The credentials value is a filesystem path to a JSON file inside the container.

When the stitcher runs on a GCP VM and you have granted the service account permission to access the bucket, you can use `--default-gcp-credentials` instead of specifying a credentials file. The Google SDK then handles authentication automatically.
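A hedged sketch of both GCP authentication styles (the `gcp.json` path and `traffic` bucket are placeholders):

```shell
# Explicit service-account key file (path inside the container).
stitcher -n --log-progress /ctap -c /ctap/gcp.json --provider gcp -b traffic

# On a GCP VM whose service account can read the bucket,
# let the Google SDK authenticate automatically.
stitcher -n --log-progress /ctap --default-gcp-credentials --provider gcp -b traffic
```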
Microsoft Azure
For data hosted in a Microsoft Azure Blob (`--provider azure`), you must provide the access credentials (`--credentials`) and the name of the Azure storage account (`--storage-account`) where your data is hosted. The credentials value is a filesystem path to a JSON file inside the container.

When the Cloud TAP stitcher runs on an Azure VM with a Managed Identity attached, it can use `--default-credentials azure` instead of specifying a credentials file. The Azure SDK then handles authentication automatically.
```
docker run --rm -d --name ns-ctap-stitcher \
  -v /home/ubuntu/ctap:/ctap -v /var/log:/var/log --entrypoint "/usr/bin/bash" \
  nsteam/cloudtap-stitcher \
  -c 'stitcher -n --log-progress /ctap -c /ctap/azure.json --storage-account INSERT_STORAGE_ACC_NAME_HERE --provider azure >> /var/log/stitcher.log 2>&1'
```
Note
When the Cloud TAP feature is enabled, a default container named `netskope` is automatically created within the specified Azure storage account.
Output Modes
The stitcher can export traffic data locally or upload the data to an NDR sensor that is located in the same cloud provider and region. For NDR, the stitcher uses VXLAN or Geneve encapsulation to tunnel the traffic to the sensor:
- For VXLAN, use `-v` or `--vxlan-host` to specify the remote VXLAN endpoint.
- For Geneve, use `-g` or `--geneve-host` to specify the remote Geneve endpoint.
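For example (a sketch; `x.x.x.x` stands for the NDR sensor's address, and the AWS source options are borrowed from the decryption-mode example later in this document):

```shell
# Tunnel stitched traffic to an NDR sensor over VXLAN.
stitcher -c aws.json --provider aws --aws-region="us-west-1" -b traffic -n -v x.x.x.x

# The same export using Geneve encapsulation instead.
stitcher -c aws.json --provider aws --aws-region="us-west-1" -b traffic -n -g x.x.x.x
```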
Netskope-Defined Geneve Options
When you configure Geneve encapsulation (`-g`), the stitcher uploads the following Geneve options to an NDR sensor.
Geneve Option | Description |
---|---|
USER_ID | If using the NSClient access method, you can use this option for Geneve to provide the Netskope-specific user identifier (email). |
SITE_ID | If using the IPSec or GRE access methods, you can use this option for Geneve to provide the Netskope-specific site identifier configured for each tunnel by the tenant. |
MTU Adjustment
(Optional) Because the stitcher uses VXLAN or Geneve encapsulation to tunnel traffic, additional bytes are added to each packet. If necessary, you can adjust the maximum transmission unit (MTU) configuration to accommodate the increased packet size and prevent IP fragmentation.
- For GCP and Azure, you can enable jumbo frames for optimal performance. This ensures that network interfaces can handle larger packet sizes without fragmentation. For more information on enabling jumbo frames, refer to the GCP or Azure documentation.
- If the exported packet size exceeds the default MTU, you can adjust the MTU of the Docker host. For more information on modifying the MTU of a network interface, refer to your host OS documentation.
- You can also adjust the MTU setting of the Docker container to match the MTU value of the Docker host. To learn more, refer to the Docker documentation.
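On a typical Linux Docker host, these adjustments might look like the following (a sketch only; the interface name `eth0` and the value 8900 are illustrative, and your cloud provider's jumbo-frame limit may differ):

```shell
# Raise the MTU of the host network interface.
sudo ip link set dev eth0 mtu 8900

# Match the MTU inside Docker by setting it in the daemon configuration,
# then restart the Docker daemon.
echo '{ "mtu": 8900 }' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker
```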
Transparent Mode
Some NDR vendors (e.g., ExtraHop) provide data decryption and accept the original traffic plus session keys. This is supported with the `--transparent` option.
If you are using Geneve extensions, use the following options to control communicated time stamps:
- Use `--with-timestamps` to replay the original timestamps.
- Use `--with-current-timestamps` to produce current timestamps at the point of export. You can use this option for testing purposes.
To separate the ExtraHop management interface and data capture interface for higher throughput, use the following options to push the session keys and data to different interfaces:
- Use `--extrahop-prod` to specify the hostname for uploading ExtraHop NDR data.
- Use `--extrahop-management-host` to specify the hostname for uploading ExtraHop session keys.
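Combining these options, a transparent-mode upload to ExtraHop might look like this (a hedged sketch; the hostnames are placeholders modeled on the scaling examples later in this document):

```shell
# Forward original (still-encrypted) traffic plus session keys to ExtraHop,
# replaying original timestamps, with data and keys on separate interfaces.
stitcher --provider gcp -c gcp.json -b traffic -n --transparent --with-timestamps \
  --extrahop-prod 192.168.10.2 --extrahop-management-host 10.138.0.40
```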
Decryption Mode
By default, the traffic processed by the stitcher is decrypted. You can upload the traffic directly to an NDR that accepts decrypted traffic. To export decrypted traffic to an NDR, use the `--with-decrypted` option.
```
stitcher -c aws.json --provider aws --aws-region="us-west-1" -b traffic -g x.x.x.x -n \
  --with-timestamps --with-decrypted
```
Filtering
The stitcher can apply traffic filters when processing traffic, allowing for fine-grained slice exports. Each filter value is a pcap expression. For example, you can retrieve a slice of traffic from a given time and source.
```
# Fetch 1 hour of data starting from Thu Jun 01 2023 22:00:00 GMT+0000 for a single user
stitcher --provider gcp -b traffic \
  --start-ts 1685656800 --end-ts 1685660400 \
  --export-filter "host 1.2.3.4 and tcp port 443" \
  -e export.pcap
```
Scaling with Multiple Stitcher Instances
You can distribute traffic copied from multiple origins (e.g., Netskope POPs) across multiple instances to achieve the desired scale.
To select the traffic sources to process, use wildcards with the `--origin-filter` option. The following are examples of wildcard patterns you can use with this option.
Tip
Prior to configuring multiple stitcher instances, Netskope recommends planning your values to avoid overlap and ensure balanced load distribution. You must also ensure that each instance is responsible for a distinct subset of traffic. Distributing copied traffic from one Netskope POP across multiple stitcher instances is not supported.
Wildcard | Description |
---|---|
`--origin-filter US-NYC*` | Processes all points of presence (POPs) whose names start with US-NYC. |
`--origin-filter *NYC*` | Processes all POPs whose names contain NYC. |
`--origin-filter UK-*` | Processes all POPs in the UK. |
`stitcher --provider gcp -c gcp.json -b traffic --origin-filter US-NYC* --extrahop-management-host 10.138.0.40 --extrahop-prod 192.168.10.2` | A full command using a wildcard to perform selective retrieval by data center. |
The following is an example of using multiple wildcards to distribute the load across multiple instances.
Instance | Wildcard |
---|---|
Instance 1 | `stitcher --provider gcp -c gcp.json -b traffic --origin-filter US-* --extrahop-management-host 10.138.0.40 --extrahop-prod 192.168.10.2` |
Instance 2 | `stitcher --provider gcp -c gcp.json -b traffic --origin-filter IN-* --extrahop-management-host 10.138.0.40 --extrahop-prod 192.168.10.2` |
Instance 3 | `stitcher --provider gcp -c gcp.json -b traffic --origin-filter UK-* --extrahop-management-host 10.138.0.40 --extrahop-prod 192.168.10.2` |