
ELI5: How to investigate a SOC incident? (5 Steps)

  • Writer: cypienta cypienta
  • Jun 16
  • 4 min read

SOC analyst, as imagined by ChatGPT

At Cypienta, I'm fortunate to work with many of the best SOC analysts, hunters, and detection engineers out there, as we help them automatically find the relevant data points right in their SIEM, SOAR, XDR, and case management tools, without searches, queries, or rules.


While preparing for our Black Hat & DEF CON presentations, I went through some personal notes, so I thought I'd share my ELI5 (explain like I'm 5) of what I learned from Fortune 100 SOC internal frameworks, guidelines, and documented workflows on how to deal with an incident when it hits your queue:


1. Understand the Incident

Answer the following:

  • What happened & what threat behavior is this? (Abstract it to something like MITRE ATT&CK TTPs.)

  • Which systems, networks, business units, orgs, identities, or assets are involved? (Use something like a CMDB to understand the involved entities' classifications, roles, owners & attributes.)

  • When did it happen?

Example: WHAT="Suspicious outbound traffic" from a WHICH="service account on an external user's endpoint" at WHEN="2:00 AM yesterday".
Splunk Enterprise Security SIEM
Splunk Enterprise Security Analytic Stories (more SPL content: https://github.com/splunk/security_content)
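
To make the three questions concrete, here is a minimal sketch of how the answers could be captured in one record before the investigation starts. The class name, field names, entity names, and the date are all hypothetical, chosen only to mirror the example above. The ATT&CK field is left empty on purpose, since the mapping is the analyst's call.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class IncidentSummary:
    """The three answers from step 1, captured in one record."""
    what: str                  # observed behavior, in plain words
    which: list[str]           # involved entities (accounts, hosts, ...)
    when: datetime             # when the activity was observed
    attack_techniques: list[str] = field(default_factory=list)  # filled in by the analyst

    def one_liner(self) -> str:
        """Render the WHAT / WHICH / WHEN answers as a single sentence."""
        entities = ", ".join(self.which)
        return f"{self.what} from {entities} at {self.when:%Y-%m-%d %H:%M}"


# Hypothetical values standing in for the example incident.
summary = IncidentSummary(
    what="Suspicious outbound traffic",
    which=["svc-backup (service account)", "laptop-042 (endpoint)"],
    when=datetime(2024, 6, 15, 2, 0),  # stands in for "2:00 AM yesterday"
)

print(summary.one_liner())
```

Writing the answers down in a fixed shape like this makes it obvious when one of the three is still missing.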

2. Gather Related Evidence

Inspect & skim the relevant situational context around that time period. These are common starting points, not an exhaustive list:

  • Logs from affected systems & apps

  • Events from network traffic captures

  • Alerts from endpoint, network, or cloud security tools

  • Service tickets & CMDB data for the affected entities

  • Threat intelligence feeds

  • Verify with the user whether the activity was legitimate


Determining the search time period:

- Start and end times of the suspicious activity

- Events leading up to the incident

- Subsequent actions after detection

- There is no silver bullet, but start with 5 minutes around the incident and widen to a week or more as needed

Example: Query & review the relevant firewall logs showing the outbound traffic from this system, and the EDR events from this endpoint, in the 5 minutes around the incident.
Palo Alto Networks Cortex XSIAM Response Process
Palo Alto Networks Malware Playbook
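
The "start at 5 minutes, widen as needed" guidance can be sketched in a few lines. This is only an illustration: the intermediate window sizes (1 hour, 1 day), the event layout (`host`, `time`, `source` keys), and the host names are assumptions, and in practice the filtering would be a query in your SIEM rather than a Python list comprehension.

```python
from datetime import datetime, timedelta

# Window sizes to try, smallest first: 5 minutes up to a week.
# The intermediate steps are an assumption, not a standard.
WINDOW_SIZES = [
    timedelta(minutes=5),
    timedelta(hours=1),
    timedelta(days=1),
    timedelta(weeks=1),
]


def search_windows(incident_time, sizes=WINDOW_SIZES):
    """Yield (start, end) pairs centered on the incident, widening each time."""
    for size in sizes:
        yield incident_time - size, incident_time + size


def events_in_window(events, host, start, end):
    """Keep events for one host whose timestamp falls inside [start, end]."""
    return [e for e in events if e["host"] == host and start <= e["time"] <= end]


# Hypothetical incident time and events, for illustration only.
incident_time = datetime(2024, 6, 15, 2, 0)
events = [
    {"host": "laptop-042", "time": datetime(2024, 6, 15, 1, 58), "source": "firewall"},
    {"host": "laptop-042", "time": datetime(2024, 6, 14, 23, 30), "source": "edr"},
    {"host": "server-007", "time": datetime(2024, 6, 15, 2, 1), "source": "firewall"},
]

for start, end in search_windows(incident_time):
    hits = events_in_window(events, "laptop-042", start, end)
    print(f"{start:%m-%d %H:%M} to {end:%m-%d %H:%M}: {len(hits)} event(s)")
```

Starting narrow keeps the first pass readable, and each wider window only needs reviewing if the narrower one did not explain the activity.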

3. Formulate Investigative Hypotheses & Possible Threat Scenarios

Are there any outliers or anomalies? Any indicators or patterns of attack or compromise? Any initial signs?

Based on your acumen, experience & training, what do you think is happening here?

Does it look like malware, phishing, an insider threat, a vulnerability exploit, etc.?

Example Hypothesis 1: The endpoint is infected with malware communicating with a C2 server.
Example Hypothesis 2: User credentials were compromised and used to exfiltrate data.
Example Hypothesis 3: The outbound traffic is from a legitimate but misconfigured application.
Oak Ridge National Lab ORNL Paper
From our good friends and supporters at ORNL. ref: Bobby Bridges et al. 2023 
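
One lightweight way to keep competing hypotheses honest is to track each one with the evidence for and against it. The sketch below is an assumption about how that might look. The `Hypothesis` class, its fields, and the sample finding are hypothetical, and the three statements are the examples from this step.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    """One candidate explanation and the evidence collected so far."""
    statement: str
    status: str = "open"  # open, confirmed, or refuted
    evidence_for: list[str] = field(default_factory=list)
    evidence_against: list[str] = field(default_factory=list)


hypotheses = [
    Hypothesis("Endpoint is infected with malware communicating with a C2 server"),
    Hypothesis("User credentials were compromised and used to exfiltrate data"),
    Hypothesis("Outbound traffic is from a legitimate but misconfigured application"),
]

# Hypothetical finding, for illustration only.
hypotheses[2].evidence_for.append("Traffic matches a scheduled job in the application logs")

open_items = [h.statement for h in hypotheses if h.status == "open"]
print(f"{len(open_items)} hypotheses still open")
```

Keeping refuted hypotheses in the list, rather than deleting them, preserves the reasoning for the final report.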

4. Confirm or Refute Hypotheses & Identify False Positives

For each hypothesis:

  • Collect and analyze evidence, and revise or form new hypotheses if needed.

  • Cross-reference alerts with other data sources & check for known benign patterns.

  • Determine what happened and why, and document the findings and steps taken.

Example: For Hypothesis 1, check network traffic for C2 patterns. For Hypothesis 2, review user activity for anomalies that point to compromised credentials. For Hypothesis 3, examine application logs and settings for misconfigurations.
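
As one illustration of "check network traffic for C2 patterns", the sketch below flags connections that are suspiciously evenly spaced, a common trait of beaconing. It is a simple heuristic under stated assumptions, not a detection rule. The function name and both thresholds are made up for this example, and real traffic needs tuning and a benign-pattern allowlist.

```python
from statistics import mean, pstdev


def looks_like_beaconing(timestamps, max_jitter=0.1, min_events=5):
    """Return True if connection times are suspiciously evenly spaced.

    timestamps: connection times in seconds for one source/destination pair.
    max_jitter: allowed coefficient of variation (stdev / mean) of the gaps.
    Both thresholds are illustrative assumptions, not tuned values.
    """
    if len(timestamps) < min_events:
        return False  # too little data to say anything
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average = mean(gaps)
    if average == 0:
        return False  # all connections at the same instant
    return pstdev(gaps) / average <= max_jitter


# Roughly every 60 seconds: consistent with a beacon.
regular = [0, 60, 121, 180, 241, 300]
# Irregular bursts: more like a person or a chatty application.
bursty = [0, 2, 3, 200, 210, 900]

print(looks_like_beaconing(regular))  # True
print(looks_like_beaconing(bursty))   # False
```

A `True` result supports the malware hypothesis but does not confirm it, since legitimate software such as update checkers and monitoring agents also polls on a fixed schedule. That is exactly the known-benign cross-check called for above.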

5. R^3: Respond, Report, and Review


  • Contain the incident: isolate systems, block traffic

  • Eradicate: remove malicious artifacts, fix issues

  • Recover: restore operations, verify remediation

  • Document the incident, investigation, findings, and actions

  • Review and learn from the incident to improve detection content, observability, playbooks, and the response process
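
Documentation is easier when actions are logged as they happen rather than reconstructed afterwards. Here is a minimal sketch of a timestamped action log that can be exported as JSON. The record layout and the `log_action` helper are hypothetical, and most case management tools provide an equivalent.

```python
import json
from datetime import datetime, timezone


def log_action(record, phase, action):
    """Append one timestamped response action to the incident record."""
    record["actions"].append({
        "time": datetime.now(timezone.utc).isoformat(),
        "phase": phase,  # contain, eradicate, recover, or review
        "action": action,
    })


record = {
    "summary": "Suspicious outbound traffic from an endpoint",
    "conclusion": "undetermined",
    "actions": [],
}

log_action(record, "contain", "Isolated endpoint from the network")
log_action(record, "eradicate", "Removed malicious artifacts")
log_action(record, "recover", "Restored operations and verified remediation")

print(json.dumps(record, indent=2))
```

The resulting timeline doubles as input for the review step, where gaps between detection and containment become visible.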



As you may have noticed, this is similar to the NIST & SANS incident handling processes, but it is more tactical & operational, based on the processes of some of the best SOCs out there.


BONUS: AI can help! 

You are a real one for reading this far! So, if you want to make this easier, consider using our FOREVER FREE Cyber LLM chatbots like https://hf.co/chat/assistant/6692ea1980d075bf4961ecdf and https://hf.co/chat/assistant/6692eb85ce7a1a25328ab049 to help you with the following use cases:


  1. Automated incident report generation: LLMs can be used to create detailed and structured incident reports based on raw data and logs

  2. Pattern recognition: The models can identify patterns and anomalies in security data, potentially uncovering hidden threats or attack vectors

  3. Natural language querying: LLMs enable analysts to interact with security data using natural language queries, making it easier to extract relevant information

  4. Context-aware analysis: These models can provide contextual information about threats, helping analysts understand the broader implications of an incident

  5. Recommendation generation: LLMs can suggest potential remediation steps or mitigation strategies based on the analysis of the incident

  6. Knowledge augmentation: The models can supplement an analyst's knowledge by providing relevant information from vast databases of security information

  7. Automated triage: LLMs can help prioritize incidents based on their severity and potential impact, allowing analysts to focus on the most critical issues



And if you want a private, full data correlation & contextualization pipeline of fine-tuned models that scales to your big data, holds context and knowledge, and is more configurable, auditable, transparent, and reliable, then schedule a free trial of our solution at cypienta.com/trial or get started right away for free by following the docs at docs.cypienta.com


Cypienta Correlation models
Because we don't have a SOC talent shortage; we just have an elite unicorn talent shortage!

That's all folks! Until the next ELI5! 

 
 