WEBVTT

00:07.250 --> 00:12.080
Once an incident has occurred within our organization and we've cleaned up the mess, we've repaired

00:12.080 --> 00:12.590
everything.

00:12.590 --> 00:16.730
We've brought it back to the original working condition, or we've applied compensating controls to

00:16.730 --> 00:21.020
ensure that that incident has been fully remediated and protected against.

00:21.020 --> 00:26.900
We need to go through post-incident activity, which involves the idea of

00:26.900 --> 00:30.050
what can we learn from the incident that just occurred.

00:30.080 --> 00:32.060
However, it's more than just that.

00:32.060 --> 00:37.460
We need to identify different aspects of how the incident came into being, what vulnerabilities were

00:37.460 --> 00:42.560
involved, and how it unfolded overall across our enterprise environment, whether that was from a flaw,

00:42.590 --> 00:45.110
a vulnerability, or a misconfiguration.

00:45.290 --> 00:50.990
We need to complete a forensic analysis and go back through after the repairs have been completed and

00:50.990 --> 00:54.770
identify everything that was affected within our systems themselves.

00:54.800 --> 00:59.600
If a folder was changed and something was left behind, we need to figure that out and go through the

00:59.600 --> 01:05.810
process of identifying not just what they changed, but deleting anything that was left in our systems,

01:05.810 --> 01:12.240
anything that may have constituted or emerged from that incident after the fact.

01:12.270 --> 01:17.520
Usually this involves identifying different network changes, application folders or programs that may

01:17.520 --> 01:23.640
have been installed or modified, and checking digital storage to make sure that the hash values match

01:23.640 --> 01:24.750
what they were before.

01:24.780 --> 01:29.640
This has all been gone over before, but realistically, what we're really trying to do here is bring

01:29.640 --> 01:35.880
our systems back to the original operating condition they were in beforehand.

01:35.880 --> 01:39.930
But we also need to look at it from the perspective of what was left behind.

01:39.930 --> 01:43.770
We can fix something, and an attacker may have left something behind.

01:43.860 --> 01:45.300
This could be a logic bomb.

01:45.300 --> 01:46.920
It could be malware that's waiting.

01:46.920 --> 01:49.560
It could be any number of things, including backdoors.

01:49.560 --> 01:54.090
And the simple fact is that we have to go back through after the major incident has occurred, and figure

01:54.120 --> 01:55.920
out what happened.

01:56.730 --> 02:00.270
Once that's happened, we can identify root cause analysis.

02:00.300 --> 02:05.730
Root cause analysis is understanding the initial vulnerability or the flaw in the misconfiguration that

02:05.730 --> 02:09.750
went through the process of what started this entire endeavor in the first place.

02:09.750 --> 02:13.830
This could be something as simple as identifying, hey, they came in through telnet and were able to

02:13.830 --> 02:15.570
get access to our usernames and passwords

02:15.570 --> 02:20.370
because they were transmitted in the clear, to something as complex as somebody clicked on a phishing

02:20.370 --> 02:25.860
email, which we thought we caught, and then it inadvertently downloaded malware into this system,

02:25.860 --> 02:28.950
which then expanded, and our antivirus system didn't work properly.

02:28.950 --> 02:35.100
There could be many layers of flaws within the enterprise system, but finding the

02:35.100 --> 02:41.010
root cause and every step that happened along the way is vital to understanding what happened

02:41.010 --> 02:42.240
in our network as a whole.

02:42.240 --> 02:49.140
This all constitutes a root cause analysis: understanding what happened and identifying the initial

02:49.140 --> 02:52.110
attack vector. We are going to examine log files.

02:52.110 --> 02:53.520
We're going to go through network diagrams.

02:53.520 --> 02:54.900
We're going to go through workflows.

02:54.900 --> 03:01.350
We're going to go through the entire process and really understand not only the root cause, but what

03:01.350 --> 03:02.400
happened along the way.

03:02.400 --> 03:04.740
And that all constitutes that root cause analysis.

03:04.740 --> 03:13.290
You could have a problem that started with, say, telnet, and then evolved over time from there

03:13.290 --> 03:16.440
and spread out like a virus across our different systems.

03:16.440 --> 03:19.720
Each of those has its own quote-unquote root cause.

03:19.720 --> 03:23.020
But the true root cause would be that initial telnet.

03:23.050 --> 03:26.440
That doesn't mean we're just going to identify telnet as our initial problem and fix that.

03:26.440 --> 03:29.320
We need to fix all the flaws and vulnerabilities along the way.

03:29.950 --> 03:33.040
Finally, at the end of the fix.

03:33.040 --> 03:33.370
Right.

03:33.400 --> 03:37.840
So you've gone through this incident, and you've got employees that are working 20- or 25-hour

03:37.840 --> 03:38.320
shifts.

03:38.320 --> 03:39.400
And I'm not joking.

03:39.430 --> 03:44.710
It's really been that bad sometimes where you've got employees that are virtually exhausted.

03:44.710 --> 03:46.960
We've fixed a major incident.

03:46.960 --> 03:52.210
We've fortified our systems, we've closed down the vulnerabilities, we've stopped the attack that's

03:52.210 --> 03:55.090
in progress, and there's still a lot left to do.

03:55.120 --> 03:57.850
This is not the time to do the lessons learned process, right?

03:57.880 --> 04:01.840
I've seen brand new managers come in right after we fix all the problems.

04:01.840 --> 04:05.770
Everybody's running on fumes and it's like, oh, well, what did we learn from this?

04:05.800 --> 04:06.940
Nobody cares.

04:06.940 --> 04:09.760
I hate to be that guy, but nobody cares at that point, right?

04:09.790 --> 04:11.020
We're all exhausted.

04:11.020 --> 04:12.700
Let your employees go to sleep.

04:13.060 --> 04:14.410
Send them home.

04:14.440 --> 04:16.210
Tackle it in 8 to 10 hours.

04:16.210 --> 04:18.730
There are still cleanup processes that have to take place.

04:18.730 --> 04:24.040
There are other technical things that need to be done, but having your employees work on limited sleep

04:24.040 --> 04:27.860
where they're literally falling asleep at the table is not good for you, and it's not good for your

04:27.860 --> 04:28.850
employees either.

04:28.850 --> 04:33.590
But once the mess has been cleaned up, once you're fairly confident you've got all the vulnerabilities

04:33.590 --> 04:34.070
taken care of,

04:34.100 --> 04:36.110
the flaws have been remediated,

04:36.140 --> 04:37.700
and the malware is out of your system,

04:37.700 --> 04:40.190
now is the time to go through and identify:

04:40.190 --> 04:42.080
what did we learn from this process?

04:42.590 --> 04:47.240
Could we have done something easier that would have shaved off some hours of time?

04:47.240 --> 04:53.750
It's not unnatural to find somebody who made a mistake at the very get-go.

04:53.750 --> 04:58.130
And they were going down one path, only to find out 2 or 3 hours were wasted because they could have

04:58.130 --> 05:03.290
gone this direction had they identified the hint that was provided within the malware scheme.

05:03.290 --> 05:04.700
It happens to everybody.

05:04.700 --> 05:06.620
This is where experience really comes into play.

05:06.620 --> 05:10.670
This is why somebody that's got ten years of experience is paid more than somebody with, you know,

05:10.700 --> 05:13.190
two months of experience, because experience matters.

05:13.190 --> 05:18.200
But in order to gain that experience, we need to do a lessons learned. That lessons learned process

05:18.200 --> 05:22.580
allows people to sit at the table and identify: what could I have done better?

05:22.580 --> 05:26.150
What tools could we have utilized that would have stopped this in the first place?

05:26.150 --> 05:30.800
What could we have done that would have mitigated the problems that we foresaw?

05:30.830 --> 05:36.120
I remember several times when I was a newer employee or even an experienced employee where we had a

05:36.120 --> 05:42.030
disaster take place and we wasted 2 or 3 hours, even 5 or 6 hours, going down a rabbit hole

05:42.030 --> 05:43.590
that had nothing to do with the problem.

05:43.590 --> 05:46.380
But all the resources said, go this way.

05:46.380 --> 05:49.830
The problem was the technology was so new that no one really understood the logs.

05:49.830 --> 05:54.330
So we used what the manufacturer told us in the first place should be the root problem.

05:54.720 --> 05:56.730
Then we found out that wasn't the root problem at all.

05:56.760 --> 06:02.040
Had we identified the complex nature of the logs, we could have digressed and gone,

06:02.070 --> 06:03.900
hey, it's actually over here.

06:03.960 --> 06:09.420
And knowing that this one little alarm in combination with this other alarm meant the problem was in this direction,

06:09.420 --> 06:10.950
not in that direction.

06:10.950 --> 06:11.970
And that's normal.

06:12.000 --> 06:14.100
You shouldn't get mad as a manager that that happens.

06:14.100 --> 06:18.330
It's part of that learning experience, especially when new equipment or new technology is coming to

06:18.360 --> 06:18.840
bear.

06:18.840 --> 06:23.550
That's why your experienced employees make so much more money than your brand new employees, because

06:23.550 --> 06:25.050
that experience really does matter.

06:25.050 --> 06:29.520
So we need to go through the lessons learned process so that not only can your experienced employees underscore

06:29.520 --> 06:34.020
what they learned, but the newer employees get the feeling of, oh, this is how it's done, and

06:34.020 --> 06:35.520
see what was learned along the way.

06:35.550 --> 06:39.450
A lot of times during this lessons learned process, you'll have brand new employees that don't want

06:39.450 --> 06:45.540
to open their mouth because they feel like what they say will make them sound stupid, or that they

06:45.540 --> 06:49.230
don't deserve to work here, and they really face that impostor syndrome that says, I don't belong.

06:49.230 --> 06:50.430
I'm not really qualified.

06:50.460 --> 06:55.860
You need to pull them out of that shell and reassure them that it's a learning process to get them to

06:55.890 --> 07:00.180
talk about it, because I've had brand new employees come in and they're like,

07:00.180 --> 07:02.130
oh, I learned this, this, and this

07:02.160 --> 07:06.480
when I was going through the logs. We had no idea, but they had literally solved the problem.

07:06.510 --> 07:11.430
Had an experienced employee seen the same logs they had, we could have shaved off ten hours.

07:11.430 --> 07:17.400
But that new employee was so worried about getting fired or being looked down upon that they didn't

07:17.400 --> 07:18.810
want to share that information.

07:18.810 --> 07:21.900
Now you can blame the employee, but I don't think that's the right way.

07:21.900 --> 07:24.840
That's a cultural issue, not an employee issue.

07:24.840 --> 07:29.070
And so I think with this lessons learned process, it's the culture of the company that really comes

07:29.220 --> 07:32.280
into play here that says it's okay to make mistakes.

07:32.280 --> 07:34.080
Let's learn from those mistakes.

07:34.080 --> 07:39.240
And if you can cultivate that good culture within your company, yes, you may have some rocky starts

07:39.240 --> 07:43.350
to begin with, but long-term, your company, your organization is going to be far better for

07:43.350 --> 07:43.770
it.
