Getting investigations data with Presto
Capsule8’s recommended deployment for non-cloud environments is HDFS (Hadoop Distributed File System) for storage and Presto as the query engine.
Investigations is configured in the /etc/capsule8/capsule8-sensor.yaml file. By default, the Process Events, Sensor, and Container Events tables are enabled. This means that if no table key is provided, those tables are turned on automatically with a default row size, unique to each MetaEvent type. To configure additional tables, specify them directly.
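For example, assuming the same layout as the complete configuration shown below, enabling one additional table beyond the defaults is a matter of listing it under the flight recorder's Tables key (a sketch of just the relevant keys):

```yaml
flight_recorder:
  enabled: true
  Tables:
    - name: "shell_commands"
      enabled: true
```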
This is a complete example with every MetaEvent type configured to write to HDFS:

```yaml
Investigations:
  reporting_interval: 5m
  sinks:
    - name: "[The address of the name node here]:9000/[directory on hdfs to store data, absolute path]"
      backend: hdfs
      automated: true
      credentials:
        blob_storage_hdfs_user: "[hadoop username that has write access]"
        blob_storage_create_buckets_enabled: true
  flight_recorder:
    enabled: true
    Tables:
      - name: "shell_commands"
        enabled: true
      - name: "tty_data"
        enabled: true
      - name: "file_events"
        enabled: false
      - name: "connections"
        enabled: true
      - name: "sensor_metadata"
        enabled: true
      - name: "alerts"
        enabled: true
      - name: "sensors"
        enabled: true
      - name: "process_events"
        enabled: true
      - name: "container_events"
        enabled: true
```
Storage solutions

This section provides guides to aid in installing and setting up storage solutions for Capsule8’s investigations data.
Currently, only insecure HDFS is supported, so the only credential required is the username of a user with write access to the directory that stores Investigations data. In addition, every sensor writing to HDFS needs network access to the namenodes on ports 8020 and 9000, and to all of the datanodes on ports 50010 and 50020.
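As a quick way to verify that a sensor host can reach those ports before deploying, the snippet below sketches a TCP reachability probe in Python. The host names passed to `unreachable` are placeholders; substitute your own namenode and datanode addresses. This is a convenience check, not part of the Capsule8 tooling.

```python
import socket

# Ports sensors need to reach, per the requirements above.
NAMENODE_PORTS = (8020, 9000)
DATANODE_PORTS = (50010, 50020)

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def unreachable(namenodes, datanodes):
    """Return a list of (host, port) pairs that could not be reached."""
    failures = []
    for host in namenodes:
        failures += [(host, p) for p in NAMENODE_PORTS if not port_open(host, p)]
    for host in datanodes:
        failures += [(host, p) for p in DATANODE_PORTS if not port_open(host, p)]
    return failures
```

Run `unreachable(["namenode.example"], ["datanode1.example"])` from each sensor host; an empty list means every required port was reachable.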
To write MetaEvents to HDFS, configure a sink with the address of the name node (including port) and a user:

```yaml
Investigations:
  reporting_interval: 5m
  sinks:
    - name: "[The address of the name node here]:9000/[directory on hdfs to store data, absolute path]"
      backend: hdfs
      automated: true
      credentials:
        blob_storage_hdfs_user: "[hadoop username that has write access]"
        blob_storage_create_buckets_enabled: true
```
It is highly recommended to set blob_storage_create_buckets_enabled: true for HDFS. Because HDFS is hierarchical while blob storage is flat, a write will fail if the target table subdirectory or partition folder does not exist.
The following settings ensure that folders are created in HDFS if they do not exist. In /etc/capsule8/capsule8-sensor.yaml, enable the blob_storage_create_buckets_enabled field, as in the example configuration below:

```yaml
blob_storage_create_buckets_enabled: true
blob_storage_hdfs_user: <hdfs user>
```
This section provides guides to aid in the installation/setup of query solutions with Capsule8’s investigations.
Example Queries
Queries are run using SQL syntax. This section provides a few example queries that might be of use during an investigation. For a complete reference of all the available fields that can be queried, see the MetaEvents section at the end of this guide.
Who Has Run a Command Through Sudo?
```sql
SELECT
  from_unixtime(process_events.unix_nano_timestamp / 1000000000) AS timestamp,
  pid, path, username, login_username
FROM process_events
WHERE event_type = 0
  AND username != login_username;
```
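The division by 1000000000 appears in every query because unix_nano_timestamp stores nanoseconds since the epoch, while Presto's from_unixtime() expects seconds. The same conversion in Python, useful when post-processing exported results (a sketch, not part of the Capsule8 tooling):

```python
from datetime import datetime, timezone

def nano_to_datetime(unix_nano):
    """Convert a MetaEvent unix_nano_timestamp (nanoseconds since the epoch)
    to a UTC datetime, mirroring from_unixtime(ts / 1000000000) in the queries."""
    return datetime.fromtimestamp(unix_nano / 1_000_000_000, tz=timezone.utc)
```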
Which Programs and their Users Connected to a Given IP?
```sql
SELECT DISTINCT
  from_unixtime(connections.unix_nano_timestamp / 1000000000) AS timestamp,
  sensors.hostname,
  process_events.path,
  container_events.container_name,
  container_events.image_name,
  connections.dst_addr,
  connections.dst_port
FROM connections, sensors, container_events, process_events
WHERE connections.process_c8id = process_events.process_c8id
  AND container_events.process_c8id = process_events.process_c8id
  AND connections.dst_addr = '$DESTINATION_IP';
```
What Containers or Images Ran on My Cluster, and Where?
```sql
SELECT
  sensors.hostname,
  container_events.image_name,
  from_unixtime(container_events.unix_nano_timestamp / 1000000000) AS timestamp
FROM sensors, container_events;
```
Get All Alerts that Are Part of an Incident
```sql
SELECT * FROM alerts WHERE incident_id = '$INCIDENT_ID';
```
Get All Shell Commands That Are Part of an Incident
```sql
SELECT
  from_unixtime(shell_commands.unix_nano_timestamp / 1000000000) AS timestamp,
  sensors.hostname,
  array_join(shell_commands.program_arguments, ' ') AS args,
  shell_commands.username
FROM shell_commands
JOIN sensors ON sensors.sensor_id = shell_commands.sensor_id
WHERE shell_commands.incident_id = '$INCIDENT_ID';
```