1 00:00:00,060 --> 00:00:01,080 In this lesson, 2 00:00:01,080 --> 00:00:03,390 we start designing redundant networks. 3 00:00:03,390 --> 00:00:04,680 First, you need to ask yourself, 4 00:00:04,680 --> 00:00:06,689 are you going to use redundancy in the network, 5 00:00:06,689 --> 00:00:09,390 and if so, where and how? 6 00:00:09,390 --> 00:00:12,360 So are you going to do it from a module or a parts perspective? 7 00:00:12,360 --> 00:00:14,460 For instance, are you going to have multiple power supplies, 8 00:00:14,460 --> 00:00:17,940 multiple network interface devices, multiple hard drives, 9 00:00:17,940 --> 00:00:20,310 or are you going to look at it more from a chassis redundancy 10 00:00:20,310 --> 00:00:23,520 perspective and have two sets of routers or two sets of switches? 11 00:00:23,520 --> 00:00:25,350 These are things you have to think about. 12 00:00:25,350 --> 00:00:27,120 Which one of these are you going to use, 13 00:00:27,120 --> 00:00:29,760 because each one is going to affect the cost of your network 14 00:00:29,760 --> 00:00:31,440 based on the decisions you make. 15 00:00:31,440 --> 00:00:33,180 You have to be able to make a good business case 16 00:00:33,180 --> 00:00:35,370 for which one you're going to use and why. 17 00:00:35,370 --> 00:00:37,110 For instance, if you could just have a second network 18 00:00:37,110 --> 00:00:39,360 interface card or a second power supply, 19 00:00:39,360 --> 00:00:41,340 that's going to be a lot cheaper than having to have 20 00:00:41,340 --> 00:00:44,160 an entire switch or an entire extra router there. 21 00:00:44,160 --> 00:00:45,900 Now, each of those switches and routers, 22 00:00:45,900 --> 00:00:48,720 some of these can cost $3,000, $4,000, or $5,000, 23 00:00:48,720 --> 00:00:50,130 and so it might be a lot cheaper 24 00:00:50,130 --> 00:00:51,597 to have a redundant power supply, right? 
25 00:00:51,597 --> 00:00:53,280 And so these are the things you have to think about 26 00:00:53,280 --> 00:00:55,500 and weigh as you're building your networks. 27 00:00:55,500 --> 00:00:58,200 Another thing you have to think about is software redundancy 28 00:00:58,200 --> 00:01:00,450 and which of those features are going to be appropriate. 29 00:01:00,450 --> 00:01:03,060 Sometimes you can solve a lot of these redundancy problems 30 00:01:03,060 --> 00:01:05,550 by using software as opposed to hardware. 31 00:01:05,550 --> 00:01:08,130 For example, if you have a virtual network setup, 32 00:01:08,130 --> 00:01:09,570 you could just put in a virtual switch 33 00:01:09,570 --> 00:01:11,010 or a virtual router in there, 34 00:01:11,010 --> 00:01:12,240 and that way you don't have to bring another 35 00:01:12,240 --> 00:01:14,070 real router or real switch in. 36 00:01:14,070 --> 00:01:16,260 That can save you a lot of money. 37 00:01:16,260 --> 00:01:18,630 There are also a lot of other software solutions out there, 38 00:01:18,630 --> 00:01:20,340 like a software RAID, that will give you 39 00:01:20,340 --> 00:01:22,440 additional redundancy for your storage devices 40 00:01:22,440 --> 00:01:25,170 as opposed to putting in an extra hard drive chassis 41 00:01:25,170 --> 00:01:27,900 or another RAID array or storage area network. 42 00:01:27,900 --> 00:01:29,130 Again, these are the types of things 43 00:01:29,130 --> 00:01:30,150 you have to be thinking about 44 00:01:30,150 --> 00:01:32,010 as you're building out your network, right? 45 00:01:32,010 --> 00:01:33,450 When you think about your protocols, 46 00:01:33,450 --> 00:01:35,580 what protocol characteristics are going to affect 47 00:01:35,580 --> 00:01:36,990 your design requirements? 
48 00:01:36,990 --> 00:01:39,150 This is really important if you're designing things 49 00:01:39,150 --> 00:01:39,990 and you're using something like 50 00:01:39,990 --> 00:01:42,240 TCP versus UDP in your designs, 51 00:01:42,240 --> 00:01:45,180 because TCP has that additional redundancy 52 00:01:45,180 --> 00:01:47,820 by resending lost packets, where UDP doesn't. 53 00:01:47,820 --> 00:01:50,190 This is something you have to consider as well. 54 00:01:50,190 --> 00:01:52,050 As you design all these different things, 55 00:01:52,050 --> 00:01:54,600 all these different factors are going to work together 56 00:01:54,600 --> 00:01:55,890 just like gears. 57 00:01:55,890 --> 00:01:59,310 Each one turns another, each one is going to feed another one, 58 00:01:59,310 --> 00:02:01,237 and you get more reliability and availability 59 00:02:01,237 --> 00:02:02,070 in your networks 60 00:02:02,070 --> 00:02:04,380 by adding all these components together. 61 00:02:04,380 --> 00:02:05,340 In addition to all this, 62 00:02:05,340 --> 00:02:06,810 there are other design considerations 63 00:02:06,810 --> 00:02:08,550 that we have to think about as well. 64 00:02:08,550 --> 00:02:10,560 Like what redundancy features should we use 65 00:02:10,560 --> 00:02:13,470 in terms of powering the infrastructure devices? 66 00:02:13,470 --> 00:02:15,030 Are we going to have internal power supplies 67 00:02:15,030 --> 00:02:17,370 and have two of those and have them redundant? 68 00:02:17,370 --> 00:02:19,860 Or, are we going to have battery backups or UPSes? 69 00:02:19,860 --> 00:02:21,330 Are we going to have generators? 70 00:02:21,330 --> 00:02:23,430 All of these things are things you have to think about. 71 00:02:23,430 --> 00:02:25,530 And I don't necessarily have the right answers for you, 72 00:02:25,530 --> 00:02:28,500 because it all comes down to a case-by-case basis. 
73 00:02:28,500 --> 00:02:30,240 Every network is going to be different 74 00:02:30,240 --> 00:02:31,740 and every one has its own needs 75 00:02:31,740 --> 00:02:34,200 and its own business case associated with it. 76 00:02:34,200 --> 00:02:36,060 The networks that I had at former employers 77 00:02:36,060 --> 00:02:38,100 were serving hundreds of thousands of clients, 78 00:02:38,100 --> 00:02:39,660 and those are vastly different 79 00:02:39,660 --> 00:02:40,770 from the ones that are servicing 80 00:02:40,770 --> 00:02:42,300 my training company right now 81 00:02:42,300 --> 00:02:44,130 with just a handful of employees. 82 00:02:44,130 --> 00:02:45,960 Because when you're dealing with your network design 83 00:02:45,960 --> 00:02:47,070 and your redundancies, 84 00:02:47,070 --> 00:02:49,800 you have to think about the business case first. 85 00:02:49,800 --> 00:02:51,000 Each one is going to be different 86 00:02:51,000 --> 00:02:53,610 based on your needs and your considerations. 87 00:02:53,610 --> 00:02:55,290 What redundancy features should be used 88 00:02:55,290 --> 00:02:57,780 to maintain the environmental conditions of your space? 89 00:02:57,780 --> 00:03:00,030 If you have good power and space and cooling, 90 00:03:00,030 --> 00:03:01,380 you need to make sure that you're thinking about 91 00:03:01,380 --> 00:03:04,110 air conditioning and do you have one unit or two? 92 00:03:04,110 --> 00:03:05,700 Do you have generators on site? 93 00:03:05,700 --> 00:03:08,550 Do you have additional thermal heating or thermal cooling? 94 00:03:08,550 --> 00:03:10,680 All of these things are things you have to think about. 95 00:03:10,680 --> 00:03:12,510 What do you do when power goes down? 
96 00:03:12,510 --> 00:03:13,500 What are some of those things 97 00:03:13,500 --> 00:03:14,333 that you're going to have to deal with 98 00:03:14,333 --> 00:03:15,750 if you're running a server farm 99 00:03:15,750 --> 00:03:18,270 that has to have units running all the time, 100 00:03:18,270 --> 00:03:19,620 because it can't afford to go down 101 00:03:19,620 --> 00:03:22,380 because it's going to affect thousands and thousands of people 102 00:03:22,380 --> 00:03:25,020 instead of just your one office with 20 people? 103 00:03:25,020 --> 00:03:26,610 All of these are things you have to consider 104 00:03:26,610 --> 00:03:27,840 as you think about it. 105 00:03:27,840 --> 00:03:29,760 In my office, we made the decision 106 00:03:29,760 --> 00:03:31,860 that one air conditioning unit was enough, 107 00:03:31,860 --> 00:03:34,380 because if it goes down, we might just not work today 108 00:03:34,380 --> 00:03:35,760 and we'll come to work tomorrow. 109 00:03:35,760 --> 00:03:36,990 We can get over that. 110 00:03:36,990 --> 00:03:38,400 But in a server farm, 111 00:03:38,400 --> 00:03:40,770 we need to make sure we have multiple air conditioners, 112 00:03:40,770 --> 00:03:41,670 because if that goes down, 113 00:03:41,670 --> 00:03:43,710 it can actually burn up all the components, right? 114 00:03:43,710 --> 00:03:46,020 So we have to have additional power and space and cooling 115 00:03:46,020 --> 00:03:47,310 that are fully redundant 116 00:03:47,310 --> 00:03:49,050 because of that server infrastructure 117 00:03:49,050 --> 00:03:50,520 that we're supporting there. 118 00:03:50,520 --> 00:03:53,400 These are the things you have to balance in your practices. 119 00:03:53,400 --> 00:03:55,590 And so when you start looking at the best practices, 120 00:03:55,590 --> 00:03:57,600 I want you to examine your technical goals 121 00:03:57,600 --> 00:03:59,250 and your operational goals. 
122 00:03:59,250 --> 00:04:00,600 Now, what I mean by that is 123 00:04:00,600 --> 00:04:02,700 what is the function of this network? 124 00:04:02,700 --> 00:04:04,680 What are you actually trying to accomplish? 125 00:04:04,680 --> 00:04:06,390 Are you trying to get to 90% uptime? 126 00:04:06,390 --> 00:04:08,880 Or 95% or 99%? 127 00:04:08,880 --> 00:04:10,440 Or are you going for that gold standard 128 00:04:10,440 --> 00:04:12,570 of five nines of availability? 129 00:04:12,570 --> 00:04:14,790 Every company has a different technical goal, 130 00:04:14,790 --> 00:04:16,440 and that technical goal is going to determine 131 00:04:16,440 --> 00:04:17,970 the design of your network. 132 00:04:17,970 --> 00:04:19,410 And you need to identify that 133 00:04:19,410 --> 00:04:20,850 inside of your budgeting as well, 134 00:04:20,850 --> 00:04:23,340 because funding these high-availability features 135 00:04:23,340 --> 00:04:24,900 is really expensive. 136 00:04:24,900 --> 00:04:27,120 As I said, if I want to put a second router in there, 137 00:04:27,120 --> 00:04:30,300 that might cost me another $3,000 or $5,000. 138 00:04:30,300 --> 00:04:32,640 In my own personal network, we have a file server, 139 00:04:32,640 --> 00:04:34,530 and it's a small NAS device. 140 00:04:34,530 --> 00:04:36,540 We're not comfortable just having all of those 141 00:04:36,540 --> 00:04:37,620 on our devices there. 142 00:04:37,620 --> 00:04:38,940 So we decided we weren't comfortable 143 00:04:38,940 --> 00:04:41,580 having all of our file storage on a single hard drive, 144 00:04:41,580 --> 00:04:43,530 and we built this NAS array instead. 145 00:04:43,530 --> 00:04:45,240 So if one of those drives goes out, 146 00:04:45,240 --> 00:04:47,700 we have three others that are carrying the load. 147 00:04:47,700 --> 00:04:49,050 This is the idea here. 
148 00:04:49,050 --> 00:04:51,750 Now, eventually we decided we didn't need that NAS anymore, 149 00:04:51,750 --> 00:04:54,990 and so we replaced that NAS enclosure with a full RAID 5. 150 00:04:54,990 --> 00:04:56,880 Later on, we took that full RAID 5 151 00:04:56,880 --> 00:04:58,650 and we switched it over to a cloud server 152 00:04:58,650 --> 00:04:59,760 that has redundant backups 153 00:04:59,760 --> 00:05:01,500 in two different cloud environments. 154 00:05:01,500 --> 00:05:03,210 And so all of these things work together 155 00:05:03,210 --> 00:05:04,590 based on our decisions. 156 00:05:04,590 --> 00:05:06,060 But as we moved up that scale 157 00:05:06,060 --> 00:05:07,650 and got more and more redundancy, 158 00:05:07,650 --> 00:05:09,630 we have more and more cost associated. 159 00:05:09,630 --> 00:05:10,830 It was a lot cheaper just to have 160 00:05:10,830 --> 00:05:13,230 an eight terabyte hard drive with all of our files on it. 161 00:05:13,230 --> 00:05:14,400 Then we went to a NAS array, 162 00:05:14,400 --> 00:05:16,140 and that cost two or three times that money. 163 00:05:16,140 --> 00:05:17,370 Then we went to a full RAID 5, 164 00:05:17,370 --> 00:05:18,723 and that cost a couple more times than that. 165 00:05:18,723 --> 00:05:21,480 Then we went to a cloud and we have to pay more for that. 166 00:05:21,480 --> 00:05:23,190 Remember, all your decisions here 167 00:05:23,190 --> 00:05:24,840 are going to cost you more money, 168 00:05:24,840 --> 00:05:27,960 but if it's worth it to you, that would be important, right? 169 00:05:27,960 --> 00:05:29,880 And so these are the things you have to balance 170 00:05:29,880 --> 00:05:32,070 as you're designing these fully redundant networks 171 00:05:32,070 --> 00:05:34,260 based on those technical goals. 
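To put some numbers on the uptime targets mentioned in this lesson (90%, 95%, 99%, and five nines), here is a minimal Python sketch. It is only an illustration, not something shown in the lesson: the function names are made up for this example, the year is taken as 365 days, and the redundant-pair calculation assumes the units fail independently, which real hardware sharing a rack or a power feed often does not.

```python
# Rough availability arithmetic; a sketch under simplifying assumptions.
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours_per_year(availability_pct: float) -> float:
    """Hours per year a service may be down and still meet the target."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


def parallel_availability(availability_pct: float, units: int) -> float:
    """Availability (in percent) of `units` identical redundant devices.

    Assumes independent failures and that one working unit is enough.
    """
    failure_probability = 1 - availability_pct / 100
    return 100 * (1 - failure_probability ** units)


if __name__ == "__main__":
    for target in (90.0, 95.0, 99.0, 99.999):
        hours = downtime_hours_per_year(target)
        print(f"{target}% uptime allows about {hours:.2f} hours of downtime per year")

    # Two routers that are each up 99% of the time, under the
    # independence assumption stated above.
    pair = parallel_availability(99.0, 2)
    print(f"Redundant pair of 99% devices: about {pair:.2f}% available")
```

Under these assumptions, 90% uptime still permits roughly 876 hours of downtime a year, while five nines leaves only about five minutes, which is why each extra nine tends to cost so much more to fund.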
172 00:05:34,260 --> 00:05:35,430 You also need to categorize 173 00:05:35,430 --> 00:05:38,070 all of your business applications into profiles 174 00:05:38,070 --> 00:05:39,660 to help with this redundancy mission 175 00:05:39,660 --> 00:05:41,550 that you're trying to go and accomplish here. 176 00:05:41,550 --> 00:05:42,383 This will really help you 177 00:05:42,383 --> 00:05:45,090 as you start going into the quality of service as well. 178 00:05:45,090 --> 00:05:46,350 Now, if I said, for instance, 179 00:05:46,350 --> 00:05:50,010 that web is considered category one, email is category two, 180 00:05:50,010 --> 00:05:52,050 and streaming video is going to be category three, 181 00:05:52,050 --> 00:05:53,520 then we can apply profiles 182 00:05:53,520 --> 00:05:55,140 and give certain levels of service 183 00:05:55,140 --> 00:05:56,760 to each of those categories. 184 00:05:56,760 --> 00:05:58,890 Now we'll talk specifically about how that works 185 00:05:58,890 --> 00:06:01,680 when we talk about quality of service in a future lesson. 186 00:06:01,680 --> 00:06:02,700 Another thing we want to do 187 00:06:02,700 --> 00:06:04,320 is establish performance standards 188 00:06:04,320 --> 00:06:06,480 for our high-availability networks. 189 00:06:06,480 --> 00:06:08,850 What are the standards that we're going to have to have? 190 00:06:08,850 --> 00:06:10,140 These standards are going to drive 191 00:06:10,140 --> 00:06:12,030 how success is measured for us. 192 00:06:12,030 --> 00:06:14,430 And in the case of my file server for instance, 193 00:06:14,430 --> 00:06:17,010 we measure success as it being up and available 194 00:06:17,010 --> 00:06:19,140 when my video editors need to access it 195 00:06:19,140 --> 00:06:20,490 and that they don't lose data, 196 00:06:20,490 --> 00:06:21,960 because if we lost all our files, 197 00:06:21,960 --> 00:06:23,580 that would be bad for us, right? 
198 00:06:23,580 --> 00:06:25,050 Those are two metrics that we have, 199 00:06:25,050 --> 00:06:27,840 and we have numbers associated with each of those things. 200 00:06:27,840 --> 00:06:29,640 In other organizations, we measure it 201 00:06:29,640 --> 00:06:32,490 based on the uptime of the entire end-to-end service. 202 00:06:32,490 --> 00:06:35,130 So if a client can't get out to the internet for an ISP, 203 00:06:35,130 --> 00:06:36,390 that would be a bad thing. 204 00:06:36,390 --> 00:06:37,980 That's one of their measurements. 205 00:06:37,980 --> 00:06:40,350 Now the other one might be what is their uptime? 206 00:06:40,350 --> 00:06:41,910 All these performance standards are developed 207 00:06:41,910 --> 00:06:44,220 through metrics and key performance indicators. 208 00:06:44,220 --> 00:06:45,390 If you're using something like ITIL 209 00:06:45,390 --> 00:06:47,310 as your IT service management standards, 210 00:06:47,310 --> 00:06:48,600 this is what you're going to be doing 211 00:06:48,600 --> 00:06:49,620 as you're trying to run those 212 00:06:49,620 --> 00:06:51,780 inside your organization as well. 213 00:06:51,780 --> 00:06:54,930 Finally, here we want to define how we manage and measure 214 00:06:54,930 --> 00:06:57,420 the high-availability solutions for ourselves. 215 00:06:57,420 --> 00:07:00,180 Metrics are going to be really useful to quantify success 216 00:07:00,180 --> 00:07:02,310 if you develop those metrics correctly. 217 00:07:02,310 --> 00:07:05,370 Decision makers and leaders love seeing metrics. 218 00:07:05,370 --> 00:07:07,680 They love seeing charts and seeing the performance 219 00:07:07,680 --> 00:07:09,150 and how it's going up over time, 220 00:07:09,150 --> 00:07:10,680 and how our availability is going up, 221 00:07:10,680 --> 00:07:12,450 and how our costs are going down. 222 00:07:12,450 --> 00:07:13,830 Those are all good things. 
223 00:07:13,830 --> 00:07:15,270 But if you don't know what you're measuring 224 00:07:15,270 --> 00:07:16,590 or why you're measuring it, 225 00:07:16,590 --> 00:07:18,660 which really goes back to your performance standards, 226 00:07:18,660 --> 00:07:19,920 then these are the kinds of things 227 00:07:19,920 --> 00:07:21,690 that waste your time with metrics. 228 00:07:21,690 --> 00:07:23,580 A lot of people measure a lot of things 229 00:07:23,580 --> 00:07:24,600 and they don't really tell you 230 00:07:24,600 --> 00:07:26,700 if you're getting the outcome you're wanting. 231 00:07:26,700 --> 00:07:28,440 I want to make sure that you think about 232 00:07:28,440 --> 00:07:31,020 how you decide on what metrics you're going to use. 233 00:07:31,020 --> 00:07:33,270 Now, we've covered a lot of different design criteria 234 00:07:33,270 --> 00:07:36,240 in this lesson, but the real big takeaway here 235 00:07:36,240 --> 00:07:38,010 that I want you to think about is this. 236 00:07:38,010 --> 00:07:39,690 If you have an existing network, 237 00:07:39,690 --> 00:07:41,280 you can add availability to it 238 00:07:41,280 --> 00:07:43,080 and you can add redundancy to it. 239 00:07:43,080 --> 00:07:44,790 You can retrofit stuff in, 240 00:07:44,790 --> 00:07:46,620 but it's going to cost you a lot more time 241 00:07:46,620 --> 00:07:48,180 and a lot more money. 242 00:07:48,180 --> 00:07:50,070 It is much, much cheaper 243 00:07:50,070 --> 00:07:52,140 to design this stuff early in the process 244 00:07:52,140 --> 00:07:54,210 when you start building a network from scratch. 245 00:07:54,210 --> 00:07:55,980 So, if you're designing a network 246 00:07:55,980 --> 00:07:58,410 and you're asked early on what kind of things you need, 247 00:07:58,410 --> 00:08:00,180 I want you to think about all these things 248 00:08:00,180 --> 00:08:02,490 and build redundancy into your initial design. 
249 00:08:02,490 --> 00:08:05,700 Adding them in early is going to save you a lot of money. 250 00:08:05,700 --> 00:08:07,950 Every project has three main factors, 251 00:08:07,950 --> 00:08:10,140 time, cost, and quality. 252 00:08:10,140 --> 00:08:11,820 And usually one of these things 253 00:08:11,820 --> 00:08:14,790 is going to suffer at the expense of the other two. 254 00:08:14,790 --> 00:08:17,040 For example, if I asked you to build me a network 255 00:08:17,040 --> 00:08:18,390 and I want it to be fully redundant 256 00:08:18,390 --> 00:08:21,060 and available by tomorrow, could you do it? 257 00:08:21,060 --> 00:08:24,450 Well, maybe, but it's probably going to cost me a lot of money, 258 00:08:24,450 --> 00:08:26,250 and because I give you very little time, 259 00:08:26,250 --> 00:08:27,780 it's going to cost me even more, 260 00:08:27,780 --> 00:08:29,910 or your quality is going to suffer. 261 00:08:29,910 --> 00:08:32,309 So you could do it good, you could do it quick, 262 00:08:32,309 --> 00:08:35,070 or you could do it cheap, but you can't do all three. 263 00:08:35,070 --> 00:08:37,770 It's always going to be a trade-off between these three things, 264 00:08:37,770 --> 00:08:39,510 and I want you to remember that as you're out there 265 00:08:39,510 --> 00:08:41,070 and you're designing networks, 266 00:08:41,070 --> 00:08:43,350 you need to make sure you're thinking about your redundancy 267 00:08:43,350 --> 00:08:45,660 and your availability and your reliability, 268 00:08:45,660 --> 00:08:48,240 because often that quality is going to suffer 269 00:08:48,240 --> 00:08:50,070 in favor of getting things out quicker 270 00:08:50,070 --> 00:08:52,570 or getting things out cheaper.