# Quick Start

This guide walks you through capturing network flows and querying them.
## 1. Capture Flows

### From a PCAP File

```bash
# Basic capture to Parquet
rockfish_probe -i capture.pcap --parquet-dir ./flows

# With nDPI application labeling
rockfish_probe -i capture.pcap --ndpi --parquet-dir ./flows
```
### Live Capture

```bash
# Standard libpcap capture (requires root)
sudo rockfish_probe -i eth0 --live pcap --parquet-dir ./flows

# High-performance AF_PACKET capture (Linux)
sudo rockfish_probe -i eth0 --live afpacket --parquet-dir ./flows
```
### With a Configuration File

```bash
# Create config.yaml (see the Configuration docs), then run:
rockfish_probe -c config.yaml
```
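As a starting point, here is a minimal sketch of `config.yaml`. Only the `output` block is shown elsewhere in this guide (step 4); capture settings such as the interface and live mode are covered in the Configuration docs, so treat this as a skeleton rather than a complete config:

```yaml
# Minimal config.yaml sketch -- see the Configuration docs for
# capture settings (interface, live mode, nDPI, etc.)
output:
  parquet_dir: ./flows
```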
## 2. Verify Output

```bash
# Check generated files
ls -la flows/

# Inspect the schema with DuckDB
duckdb -c "DESCRIBE SELECT * FROM 'flows/*.parquet'"
```
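As a further sanity check, count the rows and peek at a few flows. The column names follow the query examples in step 3 below; `daddr` is assumed by analogy with `saddr`, so confirm against the `DESCRIBE` output above:

```bash
# Row count across all Parquet files
duckdb -c "SELECT COUNT(*) AS flows FROM 'flows/*.parquet'"

# First few flows (daddr is assumed; check DESCRIBE for the actual schema)
duckdb -c "SELECT saddr, daddr, proto, sbytes, dbytes FROM 'flows/*.parquet' LIMIT 5"
```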
## 3. Query with MCP

Set up the MCP server to query your flows:

```yaml
# mcp-config.yaml
sources:
  flow:
    path: ./flows/
    description: Network flow data
output:
  default_format: table
  max_rows: 100
```

```bash
# Start the MCP server
ROCKFISH_CONFIG=mcp-config.yaml rockfish_mcp
```
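Most MCP clients register servers through a JSON config. A sketch in the Claude Desktop-style `mcpServers` format; the server name `rockfish` and the config path are illustrative:

```json
{
  "mcpServers": {
    "rockfish": {
      "command": "rockfish_mcp",
      "env": {
        "ROCKFISH_CONFIG": "/absolute/path/to/mcp-config.yaml"
      }
    }
  }
}
```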
### Example Queries

Using the MCP tools:

```yaml
# Count total flows
count:
  source: flow

# Top talkers by bytes
query:
  source: flow
  sql: |
    SELECT saddr, SUM(sbytes + dbytes) AS total_bytes
    FROM {source}
    GROUP BY saddr
    ORDER BY total_bytes DESC
    LIMIT 10

# Filter by protocol
query:
  source: flow
  filter: "proto = 'TCP'"
  limit: 50
```
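Aggregations follow the same pattern. For example, a per-protocol breakdown, reusing the `proto`, `sbytes`, and `dbytes` columns from the examples above:

```yaml
# Flow count and byte volume per protocol
query:
  source: flow
  sql: |
    SELECT proto, COUNT(*) AS flows, SUM(sbytes + dbytes) AS bytes
    FROM {source}
    GROUP BY proto
    ORDER BY flows DESC
```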
## 4. Upload to S3 (Optional)

Configure S3 upload in your probe config:

```yaml
output:
  parquet_dir: /var/lib/rockfish/flows
  s3:
    bucket: my-flow-data
    region: us-east-1
    hive_partitioning: true
    delete_after_upload: true
```

Files are automatically uploaded and organized by date:

```
s3://my-flow-data/year=2025/month=01/day=28/rockfish-*.parquet
```
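You can spot-check the uploaded partitions by pointing DuckDB at the bucket directly. A sketch assuming DuckDB's httpfs extension and S3 credentials already configured in your environment (e.g., via DuckDB's S3 secrets; see the DuckDB docs):

```bash
duckdb -c "
INSTALL httpfs; LOAD httpfs;
-- hive_partitioning exposes year/month/day as columns
SELECT year, month, day, COUNT(*) AS flows
FROM read_parquet('s3://my-flow-data/**/*.parquet', hive_partitioning = true)
GROUP BY ALL
ORDER BY year, month, day;
"
```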
## Next Steps

- Configuration - Full configuration reference
- Capture Modes - High-performance capture options
- MCP Setup - Query server configuration