Blog Manager
We are the Computer Security and Incident Response Team (CSIRT) for the Janet network. Part of Jisc's Security Operations Centre, our mission is to safeguard the current and future network security of Janet (steering the security policies for all Janet connections) and of our customers, creating a secure environment to conduct your online activities. Our primary function is to monitor and resolve any security incidents that occur on the Janet network, with specialists tracking a range of platforms, including Unix, Linux and Windows.

Incident Response Triage – Final steps

Thursday, December 20, 2018 - 13:46

Incident Response Triage - Eradication, Recovery and Lessons Learned

This is part two of a two-part blog series giving an overview of the incident response life-cycle. The life-cycle steps that follow the containment stage are the remediation steps: eradication of the threat, recovery of systems and lessons learned. This second article covers each of these stages in turn, highlighting the important areas to consider within the remaining life-cycle steps.

NIST incident life-cycle 800-61 phases.

Step 5 Eradication


Eradication steps should aim for longer-term mitigation that prevents an attacker from re-entering and gaining persistence within the environment. These differ from the temporary measures deployed during the containment phase [5]. Effective steps must be taken to ensure thorough removal of malicious software and processes, and other illicit content from the affected systems (Incident Response Process, 2008). Data recorded earlier, during the ‘incident detection and analysis’ phase, is needed to guide full eradication: the incident indicators determined from the five W’s and one H, the log of all infected host systems, and the details of identified malware. A good record of infected hosts, in particular, allows full clean-up to be confirmed.
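Confirming clean-up against the record of infected hosts can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed tool: the class name `EradicationTracker`, its methods and the host names are hypothetical, and it assumes the infected-host log from the analysis phase is available as a simple set of host names.

```python
from dataclasses import dataclass, field


@dataclass
class EradicationTracker:
    """Tracks which hosts recorded as infected have been verified clean."""

    infected_hosts: set[str]
    verified_clean: set[str] = field(default_factory=set)

    def mark_clean(self, host: str) -> None:
        # Only hosts from the analysis-phase log can be signed off.
        if host not in self.infected_hosts:
            raise ValueError(f"{host} was never recorded as infected")
        self.verified_clean.add(host)

    def outstanding(self) -> list[str]:
        # Hosts still awaiting verification, in a stable order.
        return sorted(self.infected_hosts - self.verified_clean)

    def eradication_complete(self) -> bool:
        return not self.outstanding()


# Hypothetical host names, for illustration only.
tracker = EradicationTracker({"ws-01", "ws-02"})
tracker.mark_clean("ws-01")
print(tracker.outstanding())           # ['ws-02']
print(tracker.eradication_complete())  # False
```

Eradication is only reported as complete once every host in the original log has been explicitly signed off, which is the point of keeping that log accurate.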

This step is where you would seek to deploy defences that actively deny and disrupt live threats detected in the previous phases. The improvements and actions sought should be well thought out as they will consume time and resources.

Here is a suggested list of eradication actions that could be considered:

  • Reimaging disk to new ‘gold image’, using evidence-based data to determine removal success.

  • A reset of any potentially affected password/account.

  • Change of account administration privilege access/controls.

  • Host security scanning, using up-to-date signatures that cover any detected threat.

  • Patching and AV updates in line with the threat.

  • Denial of identified attack traffic with a firewall rule.

  • Stopping the use of involved insecure protocols such as SMBv1 in the organization.

  • Improved ACL rules on network devices and segmentation zones.

  • Isolation of network/department network segments/file services prior to recovery.

  • Shut/lock down compromised email account.

  • Creation of entries in DNS RPZ files or URL blacklisting.

  • Amending insecure code on public-facing web portals or resources.
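As an illustration of one item from this list, the creation of DNS RPZ entries, the sketch below turns a list of malicious domains into response policy zone records. It assumes the common RPZ convention that a `CNAME .` record makes the resolver answer NXDOMAIN, and that owner names are written relative to the policy zone. The function name `rpz_records` and the example domains are hypothetical, and the output would still need to be placed in a zone file that your resolver is configured to use.

```python
def rpz_records(domains):
    """Return RPZ lines that answer NXDOMAIN for each domain and its subdomains."""
    cleaned = set()
    for raw in domains:
        # Normalise case and drop any trailing root dot; skip empty entries.
        name = raw.strip().lower().rstrip(".")
        if name:
            cleaned.add(name)

    records = []
    for name in sorted(cleaned):
        records.append(f"{name} CNAME .")    # the domain itself
        records.append(f"*.{name} CNAME .")  # all of its subdomains
    return records


# Hypothetical indicators; .example is a reserved test TLD.
for line in rpz_records(["Malware.Example.", "phish.example"]):
    print(line)
```

Deduplicating and sorting the input keeps the generated file stable between runs, which makes changes easier to review.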

Remember that documenting the actions taken is necessary for determining impact, staff resource costs and any required infrastructure changes.


Step 6 Recovery

Recovery is the process of returning to a non-incident state, in which business operations run normally [5]. Proactive monitoring of attacker behaviour should be employed to detect further incident activity. Consult with system owners to decide on an effective backup restore point for affected systems. If there has been a widespread ransomware, malware, or compromise event, timeline data from the analysis phase should highlight the first point of activity associated with the incident, which may suggest an appropriate backup restore point for system recovery. A co-ordinated, phased approach to system recovery would be sensible, bringing one area at a time into operation while continuously monitoring system behaviour and network traffic. This methodical recovery should quickly capture any recurring security event. The ability to return a system to a usable state demonstrates the requirement for a well-planned and documented backup strategy; this should include periodic testing to ensure a system restore is possible.
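The link between the incident timeline and the restore point can be shown with a short Python sketch. It assumes the analysis phase produced a timestamp for the first known activity and that backup times are available as `datetime` values; the function name `choose_restore_point` and the dates are hypothetical. The result is only a candidate to discuss with system owners, since activity may pre-date the earliest evidence found.

```python
from datetime import datetime


def choose_restore_point(backups, first_activity):
    """Return the latest backup taken strictly before the first known activity."""
    candidates = [b for b in backups if b < first_activity]
    # None signals that no backup pre-dates the incident.
    return max(candidates) if candidates else None


# Hypothetical weekly backups and incident start time.
backups = [
    datetime(2018, 12, 1, 2, 0),
    datetime(2018, 12, 8, 2, 0),
    datetime(2018, 12, 15, 2, 0),
]
first_activity = datetime(2018, 12, 10, 9, 30)
print(choose_restore_point(backups, first_activity))  # 2018-12-08 02:00:00
```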

Account management may have a part to play, if multiple or individual accounts are in lock-down. User consultation or communications for password resets may be part of the recovery process, with user computer security awareness education also occurring as part of the phase.

During system recovery, the system managers decide which tools to use to test, monitor, and validate system behaviour as each area is brought back into operation.
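The phased approach, bringing one area at a time into operation and validating it before moving on, could be co-ordinated along the following lines. This is a sketch under stated assumptions: `phased_recovery`, `restore` and `is_healthy` are hypothetical names, and the two callbacks stand in for whatever restore procedure and monitoring tools the system managers have chosen.

```python
from typing import Callable, Iterable, Optional


def phased_recovery(
    areas: Iterable[str],
    restore: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> tuple[list[str], Optional[str]]:
    """Restore areas one at a time, stopping at the first failed health check.

    Returns the areas restored successfully and the area that failed (or None).
    """
    restored: list[str] = []
    for area in areas:
        restore(area)
        if not is_healthy(area):
            # Halt so the recurrence can be investigated before going further.
            return restored, area
        restored.append(area)
    return restored, None


# Hypothetical areas and checks, for illustration only.
done, failed = phased_recovery(
    ["finance", "hr", "labs"],
    restore=lambda area: print(f"restoring {area}"),
    is_healthy=lambda area: area != "hr",
)
print(done, failed)  # ['finance'] hr
```

Stopping at the first failed check keeps any recurrence confined to the area just restored, rather than discovering it after everything is back online.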


Step 7 Lessons Learned

At this stage, the documentation of the incident should be complete, with details recorded of the systems impacted and of the security enhancements and controls that should be implemented to prevent further occurrences of the incident. This information can act as reference material should a similar incident occur. The goal of lessons learned is to improve the organisation’s preparedness, and its eventual response and performance, for future incidents. Documentation of historical events can also be used as training material for new team members or as a benchmark for comparison in future crises (Bejtlich, 2005)[3].

An example of performing lessons learned covers the following points:

  • When the problem was first detected and by whom, and whether logging captured enough detail.

  • The scope of the incident.

  • How the incident was contained and eradicated.

  • Work performed during recovery.

  • Areas where the CIRT/security teams were and weren’t effective, and potential improvement actions.

An incident completes the full life-cycle loop when the results of the investigation, containment, eradication and recovery are fed back into the preparation phase.

The incident life-cycle as defined by NIST and SANS feeds the lessons learned at each stage of a security incident response engagement into the organisation’s preparedness and defences. If you are recording and learning from a security incident, you are preparing yourself for a repeat offence. If the attacker returns, the improved defences, logging and understanding allow the threat source to be related to the previous incident, providing intelligence that can aid both you and others in the security community.


For completeness, applying the information learned at each life-cycle stage helps to create a more secure environment and a faster response time, provided the staff involved use the lessons to improve security. Below is a suitable list of questions to cover at each stage [5]:


How could the incident have been avoided altogether? This includes network architecture, system configuration, user training, or even computer usage policy.

Where can the defensive security architecture be improved, and can the patch frequency on the devices involved be increased?


What telemetry sources (IDS, netflow, authentication logging, etc.) could have made it easier or faster to identify this attack?

What anti-virus signatures or threat intelligence could have helped?


What containment measures were effective, preventing even further spread?

Could other containment measures be useful if they’d been more easily deployable?


What eradication steps went well in successfully removing the threat?


What factors (communication, staff knowledge, system ownership, etc.) slowed down the recovery?

What did the response to recovery tell us about the adversary and techniques they exploited?

The incident life-cycle explained here, standardised by SANS and NIST, provides a formal method of continuous improvement in computer defences. It has helped many organisations to handle security incidents through a formalised process and to emerge with an improved security posture afterwards.


[1] Incident Response & Computer Forensics, McGraw Hill - J. Luttgens, M. Pepe, K. Mandia.

[2] Cisco CCNA Cyber Ops SECOPS 210-255, Cisco Press - O. Santos, J. Muniz.

[3] SANS Incident Handlers Handbook.

[4] NIST Computer Security Incident Handling Guide.

[5] Intelligence-Driven Incident Response, O’Reilly - Scott J. Roberts, Rebekah Brown.

[6] 12 Proven Cyber Incident Response Strategies (FireEye whitepaper).