1
00:00:01,840 --> 00:00:09,250
This architecture can be considered a scaled-up version of the large deployment, which we saw in

2
00:00:09,250 --> 00:00:11,470
the previous tutorial.

3
00:00:12,100 --> 00:00:19,600
This will be one of the craziest, bringing high availability and clustering of Splunk into your

4
00:00:19,630 --> 00:00:20,340
design.

5
00:00:20,350 --> 00:00:27,640
Since we have already gone through these scenarios of having high availability and clustering options,

6
00:00:27,640 --> 00:00:29,380
I'm assuming that by now

7
00:00:29,500 --> 00:00:37,000
you should be aware of the benefits of having high availability and clustering options in your organization.

8
00:00:38,120 --> 00:00:40,160
Let's see the architecture now.

9
00:00:42,450 --> 00:00:43,410
Looking initially,

10
00:00:43,410 --> 00:00:49,800
it looks like total chaos in the architecture, with a lot of components, but

11
00:00:50,600 --> 00:00:58,130
as a Splunk architect, you'll be able to see the beauty of Splunk's flexibility and scaling up in its

12
00:00:58,130 --> 00:00:58,790
design.

13
00:01:00,420 --> 00:01:06,420
If you look carefully, there are two sites, that is, site one and site two.

14
00:01:08,960 --> 00:01:12,360
These are two sites. In real scenarios,

15
00:01:12,380 --> 00:01:19,670
this will be like your main data center, and this could be your DR, or disaster recovery, center.

16
00:01:19,700 --> 00:01:27,140
For our understanding, let us call them site one and site two. The site one component

17
00:01:28,560 --> 00:01:34,790
looks identical to the large enterprise architecture which we have seen in our previous example.

18
00:01:34,800 --> 00:01:37,080
This is our site one architecture.

19
00:01:37,500 --> 00:01:40,350
If you recall, in our previous

20
00:01:41,140 --> 00:01:42,100
discussion

21
00:01:42,100 --> 00:01:48,370
we went through the large enterprise architecture, which is identical to our site one.

22
00:01:50,090 --> 00:01:56,950
It is clear that for high availability and clustering we are considering only the large-scale enterprise.

23
00:01:56,960 --> 00:02:05,270
So site one is our main data center, where all the logs are collected using universal forwarders

24
00:02:05,270 --> 00:02:08,810
and syslog, then parsed by our heavy forwarders

25
00:02:08,900 --> 00:02:13,310
and pushed to the indexers for storage and retrieval.

26
00:02:13,310 --> 00:02:21,400
The search heads do their fancy stuff of fetching the data from the indexers and visualizing, reporting, or alerting.

27
00:02:21,410 --> 00:02:28,040
The same applies to the DR, or our site two, which is identical to our main site.

28
00:02:29,640 --> 00:02:37,560
But from this diagram, we can see that some of the components like the deployment server and the license

29
00:02:37,560 --> 00:02:40,500
manager are communicating with both sites.

30
00:02:41,510 --> 00:02:48,410
Having a deployment server talk to all the components has the huge advantage of managing

31
00:02:49,440 --> 00:02:51,780
the configuration in one place.

32
00:02:51,780 --> 00:02:59,940
It talks to all the components, like the search heads, indexers, heavy forwarders, and the data sources.

33
00:03:02,310 --> 00:03:10,350
Similarly, we know from our previous modules that the license manager talks to all the indexers

34
00:03:11,510 --> 00:03:18,890
that are present in site one, site two, or any other site of your architecture to keep track of the

35
00:03:18,890 --> 00:03:28,830
license utilization. Since it has very limited functionality, we can make it a cluster master.

36
00:03:28,850 --> 00:03:36,020
Also, we can use the license server itself to function alongside as the cluster master, which takes care

37
00:03:36,050 --> 00:03:43,730
of making sure that the data has been copied or replicated to the other sites and vice versa.

38
00:03:46,330 --> 00:03:53,200
The function of the cluster master can be clubbed with the deployment server or the license manager. Although

39
00:03:53,200 --> 00:03:55,300
it is not recommended by Splunk,

40
00:03:55,330 --> 00:03:59,170
it doesn't have much of an impact on performance.

41
00:03:59,170 --> 00:04:06,370
Since the license manager has very limited functionality, it can be made the cluster master too.

42
00:04:09,760 --> 00:04:13,870
And it is also the duty of the cluster master

43
00:04:16,290 --> 00:04:24,480
to make sure the replication and search factors are met across the cluster members

44
00:04:24,480 --> 00:04:27,210
and make sure the cluster is stable.

45
00:04:27,450 --> 00:04:32,250
The health of the cluster can also be monitored from the cluster master.

46
00:04:32,760 --> 00:04:40,800
To conclude, let us go through some scenarios where multisite clustering adds value.

47
00:04:42,120 --> 00:04:46,630
Let's say one of the indexers in my main site goes down.

48
00:04:46,650 --> 00:04:47,820
So what happens?

49
00:04:48,060 --> 00:04:51,500
There is still data on the other two indexers.

50
00:04:51,510 --> 00:04:57,210
That should be more than enough if you have configured a replication factor of two.

51
00:04:57,510 --> 00:05:03,390
We'll come to the replication factor and search factor, and how they influence the cluster and the

52
00:05:03,390 --> 00:05:05,340
storage and the high availability part.

53
00:05:05,730 --> 00:05:12,780
Let's say we have two copies of data here, so if one indexer goes down, there is a very good chance

54
00:05:12,780 --> 00:05:17,700
that these two indexers can give you the results without any impact.

55
00:05:18,670 --> 00:05:26,410
Let's say one of the search heads goes down, as a second scenario. When the search head goes down, if it

56
00:05:26,410 --> 00:05:31,180
is a highly critical one and it is clustered into the DR search heads,

57
00:05:32,630 --> 00:05:38,270
we can access our DR search heads and continue with the dashboards, reports, or alerting. Whatever it was,

58
00:05:38,270 --> 00:05:41,150
it should operate without any issues.

59
00:05:42,380 --> 00:05:49,160
Similarly, if it is a dedicated search head, say one that handles a premium app which is configured only on

60
00:05:49,160 --> 00:05:51,590
one search head, and it has not been clustered,

61
00:05:53,700 --> 00:05:59,990
the impact will be that the alerts or the scheduled searches which are configured on this search head

62
00:06:00,000 --> 00:06:09,240
will not run anymore. If it has been clustered into our DR, or site two, the scheduled searches

63
00:06:09,690 --> 00:06:14,400
and the alerts will be run by our search heads in site two.

64
00:06:15,850 --> 00:06:20,270
For the third scenario, let us consider two indexers going down.

65
00:06:20,290 --> 00:06:23,090
In that case, our search will be impacted.

66
00:06:23,110 --> 00:06:27,730
We will not be getting 100% of the results from the main indexers.

67
00:06:27,730 --> 00:06:35,170
But if we make the same search heads point to these indexers, they will be able to retrieve 100% of the data,

68
00:06:35,200 --> 00:06:38,830
even though these two indexers are down.

69
00:06:38,830 --> 00:06:45,820
So at any given point of time, either these three indexers or these three indexers should be able to

70
00:06:45,820 --> 00:06:48,580
serve you with 100% of the results.

71
00:06:49,400 --> 00:06:55,610
And for the fourth scenario, consider that the deployment server goes down, which

72
00:06:55,610 --> 00:07:02,000
doesn't have a slave in this architecture; that is, it doesn't have a failover. But for the deployment server,

73
00:07:02,030 --> 00:07:06,200
there is a reason why it stands out from the regular architecture.

74
00:07:06,200 --> 00:07:11,060
If you see, it stands somewhere in the middle, just communicating with all the servers.

75
00:07:11,060 --> 00:07:16,710
But if the deployment server goes down, there is no functional impact on the

76
00:07:18,690 --> 00:07:26,400
Splunk architecture, because it just makes sure that all the instances are up and lets you modify

77
00:07:26,400 --> 00:07:31,290
configurations, restart them, and make sure the new configurations are deployed,

78
00:07:31,410 --> 00:07:32,910
these kinds of scenarios.

79
00:07:32,910 --> 00:07:38,730
Even if it goes down, the search heads, indexers, and heavy forwarders will have a local copy

80
00:07:38,730 --> 00:07:40,080
of their configuration,

81
00:07:41,010 --> 00:07:46,200
and they will be able to operate without any issues. In case the deployment

82
00:07:46,200 --> 00:07:52,020
server goes down and, let's say, you're not able to bring up the server, make sure to restore the

83
00:07:52,020 --> 00:07:59,220
backup into a new VM and assign it the same IP, and the deployment server should be

84
00:07:59,220 --> 00:08:01,230
up in no time.

85
00:08:03,070 --> 00:08:09,370
By understanding all these architectures and their benefits, you should now be able to design the

86
00:08:09,370 --> 00:08:12,550
best-fit architecture for your organisation.