1 00:00:00,330 --> 00:00:01,170 Welcome back. 2 00:00:01,200 --> 00:00:04,290 In this video, we are going to talk about regular expressions. 3 00:00:04,290 --> 00:00:11,880 A regular expression is a pattern, written in its own small language, that the regular expression engine attempts to match in the input text. 4 00:00:11,880 --> 00:00:19,980 So we can create input text and then we can use regular expressions to check for specific patterns. 5 00:00:19,980 --> 00:00:21,570 And that's super powerful. 6 00:00:21,570 --> 00:00:25,860 And it's not only used in C#, it's used in pretty much all programming languages. 7 00:00:25,860 --> 00:00:32,910 It's useful, for example, to find out if a user entered a correct email address or if 8 00:00:32,910 --> 00:00:37,470 a user entered a correct link, and things like that in general. 9 00:00:37,470 --> 00:00:44,040 So you can use regular expressions to figure those things out, search for specific patterns, 10 00:00:44,040 --> 00:00:44,760 and do something 11 00:00:44,760 --> 00:00:48,960 if they're there, or, if they're not there, not do anything, for example. 12 00:00:48,960 --> 00:00:55,230 And, well, I am on this website where you can check out the regular expression language quick 13 00:00:55,230 --> 00:01:04,440 reference, and there are so many different character escapes in order to find stuff or in order to 14 00:01:04,440 --> 00:01:05,400 create patterns. 15 00:01:05,400 --> 00:01:10,490 And there are a lot here, and I'm going to use the most important ones of those. 16 00:01:10,500 --> 00:01:15,780 I'm not going to go into every single one of them, but you will see many different examples which will 17 00:01:15,990 --> 00:01:20,790 make sure that you understand regular expressions, and then it will be super easy to go back to the 18 00:01:20,790 --> 00:01:23,490 site and use the other ones if you ever need them. 19 00:01:23,580 --> 00:01:24,210 All right.
20 00:01:24,210 --> 00:01:31,560 So let's go back into Visual Studio, and I have created a little sample text. 21 00:01:31,560 --> 00:01:36,930 So let's create a new regular expression project. 22 00:01:36,960 --> 00:01:37,890 I have one here. 23 00:01:37,920 --> 00:01:42,390 Just call it anything you want, and let's create a new file in here as well. 24 00:01:42,420 --> 00:01:43,680 New item. 25 00:01:43,920 --> 00:01:50,220 And this time we're not going to create a class, it's just going to be a very basic text file. 26 00:01:50,220 --> 00:01:55,620 So here, text file, and I'm going to call it sample text. 27 00:01:57,000 --> 00:01:59,370 Now let's open up that sample text file. 28 00:01:59,370 --> 00:02:04,110 And I just copied and pasted some text in there. 29 00:02:04,110 --> 00:02:12,060 And as you can see, numbers, telephone numbers, German telephone numbers and so forth, the alphabet, 30 00:02:12,510 --> 00:02:15,630 some lalala text and so forth. 31 00:02:15,630 --> 00:02:23,970 So we are going to use regular expressions in order to mark those specific patterns that we're searching 32 00:02:23,970 --> 00:02:27,360 for, and that will make a lot more sense once we get into practice. 33 00:02:27,360 --> 00:02:33,870 So let's press Ctrl+F or Command+F in order to get to the find tool here. 34 00:02:33,870 --> 00:02:42,000 And once you're here, press this "Use regular expressions" button, or Alt+E, so that it will use regular expressions. 35 00:02:42,000 --> 00:02:47,760 Now our search field here will apply regular expressions. 36 00:02:47,760 --> 00:02:50,700 So what are regular expressions used for? Let's look at some of them. 37 00:02:50,700 --> 00:02:52,890 Let's start with a very simple one. 38 00:02:52,980 --> 00:02:58,470 So I'm going to go ahead and start with tabs. 39 00:02:58,470 --> 00:03:01,020 So let's go for backslash t. 40 00:03:01,200 --> 00:03:04,230 And as you can see now, tabs are marked.
41 00:03:04,230 --> 00:03:08,340 So in my case, I'm going to make it a little bigger so you can see it better. 42 00:03:08,340 --> 00:03:09,720 Here are some tabs. 43 00:03:09,720 --> 00:03:11,370 Let me create some more tabs. 44 00:03:11,370 --> 00:03:14,730 As you can see now, the tabs are all marked here. 45 00:03:14,730 --> 00:03:16,020 They're all highlighted. 46 00:03:17,100 --> 00:03:25,260 Now let's find every new line with backslash n. You can see this is where the new line starts, and here 47 00:03:25,260 --> 00:03:25,650 as well. 48 00:03:25,650 --> 00:03:28,140 So this is pretty much where I pressed Enter. 49 00:03:28,140 --> 00:03:29,640 Let me press Enter a few more times. 50 00:03:29,640 --> 00:03:36,000 As you can see here, each time I press Enter and break the line, I have this backslash n. So that's 51 00:03:36,000 --> 00:03:40,230 how I can figure out whether there was a new line or not. 52 00:03:40,560 --> 00:03:43,890 So where do I get all of those, backslash n and so forth, from? 53 00:03:43,890 --> 00:03:51,360 Well, I can find them here in the regular expression language quick reference, but I've also 54 00:03:51,360 --> 00:03:53,670 created a little snippets text file. 55 00:03:53,670 --> 00:03:57,660 So let me create a little file and I'm going to paste it in there. 56 00:03:57,660 --> 00:04:07,470 You can download it from the assets of this lecture, and I'm going to call it snippets.txt. 57 00:04:08,820 --> 00:04:15,600 So here we have character escapes, which are, for example, tabs, so we can find out whether there 58 00:04:15,600 --> 00:04:20,850 are tabs in my string, which in my example is the whole text file. 59 00:04:20,850 --> 00:04:27,150 So here I'm checking the whole file, and I'm checking for every single new line or every single tab 60 00:04:27,180 --> 00:04:29,100 in my whole file. 61 00:04:29,130 --> 00:04:29,730 Pretty much. 62 00:04:29,730 --> 00:04:31,230 Okay then.
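The two character escapes shown so far can also be tried outside the find tool. The following is a minimal sketch in Python's `re` module, used here only for illustration; it is not the engine behind the Visual Studio find tool, and the sample string is made up.

```python
import re

# Hypothetical sample text: two tab characters and two line breaks.
sample = "name\tphone\nAnna\t555-1234\n"

tabs = re.findall(r"\t", sample)      # every tab character
newlines = re.findall(r"\n", sample)  # every line feed

print(len(tabs), "tabs,", len(newlines), "new lines")
```

The `r"..."` prefix keeps the backslash intact so that the regular expression engine, not the Python string parser, interprets `\t` and `\n`.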
63 00:04:32,050 --> 00:04:38,440 We have the character classes. For example, a simple dot is a wildcard: it matches any single 64 00:04:38,440 --> 00:04:43,080 character except for backslash n, so except for a new line. 65 00:04:43,090 --> 00:04:44,590 Let's check out if that's true. 66 00:04:44,710 --> 00:04:46,560 So we have new lines. 67 00:04:46,570 --> 00:04:50,470 Let's check them again by using backslash n here. 68 00:04:51,010 --> 00:04:52,600 As you can see, we get all the new lines. 69 00:04:52,600 --> 00:04:56,350 Now let's use the dot, and the dot, as you can see, 70 00:04:57,270 --> 00:04:59,730 matches any single character. 71 00:04:59,850 --> 00:05:04,350 So that's what this dot does: it's a wildcard. 72 00:05:04,350 --> 00:05:07,290 So if you want to mark everything, well, that's the way to go. 73 00:05:07,290 --> 00:05:12,330 So you can see it matches everything. The reference says except for the new line. 74 00:05:12,330 --> 00:05:15,450 But it seems the new lines are also marked here. 75 00:05:15,450 --> 00:05:19,830 So each time I create a new line, it's still marked with this dot. 76 00:05:20,220 --> 00:05:20,550 All right. 77 00:05:20,550 --> 00:05:21,810 So let's check out the next one. 78 00:05:21,810 --> 00:05:26,160 I'm going to go through some of them and then we'll combine them to do some cool things. 79 00:05:26,160 --> 00:05:26,700 All right. 80 00:05:26,700 --> 00:05:30,420 So the next one matches any decimal digit. 81 00:05:30,420 --> 00:05:32,140 Let's go ahead and check that out. 82 00:05:32,160 --> 00:05:34,650 Backslash d, standing for digit. 83 00:05:34,650 --> 00:05:42,090 And as you can see, my numbers here, 1234567890, are all marked, and here 84 00:05:42,090 --> 00:05:47,880 all my numbers are marked, and the values in between, for example the plus sign, the hash sign and 85 00:05:47,880 --> 00:05:49,800 so forth, are not marked.
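To see the dot and backslash d side by side, here is a small sketch, again in Python's `re` module as a stand-in. The sample string is an assumption: it uses a Windows-style line break (carriage return plus line feed), which is one possible reason why line breaks can look highlighted in an editor, because the dot skips only the line feed and still matches the carriage return.

```python
import re

# Hypothetical text with a Windows-style line break (\r\n) in the middle.
sample = "a1\r\nb+2"

dot_matches = re.findall(r".", sample)     # any character except \n
digit_matches = re.findall(r"\d", sample)  # decimal digits only

print(dot_matches)    # the \r is matched, the \n is not
print(digit_matches)
```

Whether this is what happens in the video's file is a guess; the sketch only shows that the rule "except for backslash n" is narrower than "except for the whole line break".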
86 00:05:50,970 --> 00:05:58,770 So any digit will be found, or any digit will match the pattern of this regular expression, 87 00:05:58,770 --> 00:06:01,290 backslash d, which stands for decimal digit. 88 00:06:01,830 --> 00:06:08,190 Backslash capital D will do the opposite, so it will take everything but digits, as you can 89 00:06:08,190 --> 00:06:09,090 see here again. 90 00:06:12,010 --> 00:06:15,520 If you want to have the white space, you can use backslash s. 91 00:06:15,760 --> 00:06:18,490 Then you get any white space that you have in your file. 92 00:06:18,490 --> 00:06:23,170 So if you, let's say, want to get rid of any white space in your file, you could use this pattern 93 00:06:23,170 --> 00:06:26,020 and get rid of everything that matches that pattern. 94 00:06:27,400 --> 00:06:30,940 Capital S, on the other hand, is everything that is not white space. 95 00:06:30,940 --> 00:06:33,430 So as you can see, the capital letter is pretty much the opposite. 96 00:06:33,430 --> 00:06:34,060 Always. 97 00:06:34,060 --> 00:06:36,760 So backslash d is all digits. 98 00:06:36,760 --> 00:06:42,610 Backslash capital D is everything except digits. 99 00:06:42,970 --> 00:06:48,190 All right, then let's check out backslash w. Backslash w will give me any word 100 00:06:48,190 --> 00:06:51,610 character, and digits are included. 101 00:06:51,610 --> 00:06:57,850 So here it gives me all the digits and all letters such as ABCD and so on. 102 00:06:57,850 --> 00:07:03,490 So here the whole alphabet is marked as well as all of my digits. 103 00:07:03,490 --> 00:07:11,620 You can also see that dots are not marked, and slashes and those metacharacters, those special 104 00:07:11,620 --> 00:07:16,150 characters, are not marked either, and neither is the at sign. 105 00:07:16,540 --> 00:07:17,260 All right.
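The lowercase and uppercase class pairs can be compared directly. This is a minimal sketch in Python's `re` module with a made-up sample string; note that backslash w also counts the underscore as a word character.

```python
import re

# Hypothetical sample text, not the one from the video.
sample = "Tel: 030 12 #4_a"

digits_only = re.sub(r"\D", "", sample)           # remove everything that is not a digit
no_space = re.sub(r"\s", "", sample)              # remove all white space
word_chars = "".join(re.findall(r"\w", sample))   # letters, digits and underscore
non_word = re.findall(r"\W", sample)              # everything else

print(digits_only)
print(no_space)
print(word_chars)
print(non_word)
```

Removing white space with `re.sub` is the "get rid of everything that matches that pattern" idea from above, expressed as a replacement with an empty string.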
106 00:07:17,260 --> 00:07:26,350 If you want to have everything else, use backslash capital W. Now, if you want to have, for example, the la, 107 00:07:26,380 --> 00:07:28,930 let's search for la. 108 00:07:28,960 --> 00:07:35,590 As you can see, I can search for it in lowercase or uppercase. In both cases, 109 00:07:35,590 --> 00:07:42,760 my la la la text here, if you want to sing la la la and so forth, is the 110 00:07:42,760 --> 00:07:44,080 one that is marked here. 111 00:07:44,080 --> 00:07:51,730 But as you can see, all of them match that pattern, whether I'm using the literal capital 112 00:07:51,730 --> 00:07:52,750 letter or not. 113 00:07:53,140 --> 00:07:53,710 All right. 114 00:07:53,710 --> 00:08:01,120 And if I would like to only have the ones that are bound to one specific side, I can go ahead and use 115 00:08:02,200 --> 00:08:14,080 backslash b, then la, and there we are: only the ones that have a word boundary on the left will now be highlighted. 116 00:08:14,080 --> 00:08:20,680 Which means that there is nothing before the la, or there is white space before the la, so only 117 00:08:20,680 --> 00:08:21,850 those ones will be marked. 118 00:08:21,850 --> 00:08:26,530 So as you can see, this la is not marked, and neither is this one. 119 00:08:26,530 --> 00:08:29,800 That's because the word boundary is not on the left hand side. 120 00:08:29,800 --> 00:08:33,100 So let's go back to the snippets real quick and check out what we have here. 121 00:08:33,100 --> 00:08:35,990 So we have character groups, which we need to look at. 122 00:08:35,990 --> 00:08:37,900 Then we have the negation sign. 123 00:08:37,900 --> 00:08:43,180 Then the anchors: with the circumflex, the match must start at the beginning of a string or line. 124 00:08:43,270 --> 00:08:50,230 Then there is the dollar sign: the match must occur at the end of the string or before 125 00:08:50,500 --> 00:08:51,190 a
126 00:08:51,880 --> 00:08:55,000 new line at the end of the line or the string. 127 00:08:55,000 --> 00:08:55,720 Pretty much. 128 00:08:56,440 --> 00:08:57,550 All right, then. 129 00:08:58,030 --> 00:08:59,500 Yeah, let's check those ones out. 130 00:08:59,500 --> 00:09:03,490 Let's go back to our sample file, for example. 131 00:09:05,910 --> 00:09:07,500 Circumflex. 132 00:09:09,090 --> 00:09:11,160 And as you can see here, this one is marked. 133 00:09:11,160 --> 00:09:14,550 So it's at the start of a string. 134 00:09:14,880 --> 00:09:16,320 Let's add an a. 135 00:09:16,350 --> 00:09:22,680 As you can see here, there is the a at the beginning of the string, or if you want to have an m, that's 136 00:09:22,680 --> 00:09:24,810 at the beginning of the string as well. 137 00:09:25,050 --> 00:09:26,190 So there you are. 138 00:09:27,690 --> 00:09:30,520 What if I want to have the one at the end of the string? 139 00:09:30,540 --> 00:09:33,450 Well, the thing is that we have a whole text. 140 00:09:33,450 --> 00:09:37,220 So this whole sample text file is one string, pretty much. 141 00:09:37,230 --> 00:09:39,480 So if I want to have it at the end, 142 00:09:39,510 --> 00:09:43,700 as you can see, there is one character at the end of the string and that's this plus here. 143 00:09:43,710 --> 00:09:50,040 So if I want to have the plus, by the way, or the dot or any of those metacharacters, I need to 144 00:09:50,040 --> 00:09:51,110 escape those. 145 00:09:51,120 --> 00:09:53,490 So if I search for them, I need to escape them. 146 00:09:53,730 --> 00:09:55,250 And how can I escape them? 147 00:09:55,260 --> 00:09:57,420 Well, I can escape them by using the backslash. 148 00:09:57,420 --> 00:10:01,080 So backslash plus will give me the plus here. 149 00:10:01,120 --> 00:10:05,070 As you can see, the literal plus. On the other hand, 150 00:10:05,850 --> 00:10:09,660 if I just enter plus, as you can see, this plus will not be found.
151 00:10:09,660 --> 00:10:15,630 And that is because it's a metacharacter, and we need to escape metacharacters if we use regular expressions in 152 00:10:15,630 --> 00:10:16,940 order to find them. 153 00:10:16,950 --> 00:10:19,800 So backslash plus will give me the plus. 154 00:10:19,800 --> 00:10:24,630 And as you can see, this plus here is marked as well as the plus there. 155 00:10:24,790 --> 00:10:29,760 But now if I only want to have the one at the end of a string, and as I said, this whole text is one 156 00:10:29,760 --> 00:10:35,280 string, I simply add the dollar sign, and now you can see this plus is marked. 157 00:10:35,310 --> 00:10:40,140 However, the plus here is not marked or not highlighted. 158 00:10:41,640 --> 00:10:44,790 Now let's look at ranges real quick. 159 00:10:44,790 --> 00:10:50,490 So as you can see, we have this character group: it matches any single character in the character 160 00:10:50,490 --> 00:10:53,820 group. By default, the match is case sensitive. 161 00:10:53,820 --> 00:10:55,510 So character groups. 162 00:10:55,530 --> 00:10:56,370 Let's check them out. 163 00:10:56,370 --> 00:10:58,170 Let's go into our sample text. 164 00:10:59,580 --> 00:11:06,200 And in square brackets, I can go ahead and do something like a-z. 165 00:11:06,210 --> 00:11:11,580 So now every single letter between a and z will be highlighted. 166 00:11:11,580 --> 00:11:14,720 So as you can see, all letters are highlighted. 167 00:11:14,730 --> 00:11:21,180 If I only want to have a to f, as you can see, only a to f are highlighted, and as you can see, it doesn't 168 00:11:21,180 --> 00:11:21,540 care. 169 00:11:21,540 --> 00:11:25,050 It takes capital letters as well as lowercase letters. 170 00:11:25,170 --> 00:11:30,840 So you can see Visual Studio doesn't seem to care about this. The reference says that by default 171 00:11:30,840 --> 00:11:32,270 the match is case sensitive.
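The word boundary, the two anchors and the escaped plus from the last few minutes can be sketched together. This uses Python's `re` module and a made-up three-line sample; the `IGNORECASE` flag imitates a find tool that ignores case, and `MULTILINE` makes the circumflex match at every line start instead of only at the start of the whole string.

```python
import re

# Hypothetical sample: three lines, the whole text ends with a plus.
sample = "la lala\nLa blah 1+1\nend+"

# \b: only "la" with a word boundary on its left.
left_bound = re.findall(r"\bla", sample, flags=re.IGNORECASE)

# ^ anchors at the start of the string, or of each line with MULTILINE.
string_start = re.findall(r"^\w", sample)
line_starts = re.findall(r"^\w", sample, flags=re.MULTILINE)

# A literal plus must be escaped; $ pins it to the end of the string.
all_plus = re.findall(r"\+", sample)
last_plus = re.findall(r"\+$", sample)

# In Python an unescaped plus on its own is not even a valid pattern.
try:
    re.compile(r"+")
    bare_plus_ok = True
except re.error:
    bare_plus_ok = False

print(left_bound, string_start, line_starts, len(all_plus), last_plus, bare_plus_ok)
```

The find tool in the video simply finds nothing for a bare plus, while Python rejects the pattern; both are consequences of the plus being a quantifier rather than a literal character.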
172 00:11:32,280 --> 00:11:35,970 So in our case, the character group was not case sensitive, as you can see, most likely because the Match case option of the find tool is turned off. 173 00:11:35,970 --> 00:11:38,100 So it takes either. 174 00:11:38,640 --> 00:11:39,060 All right. 175 00:11:39,060 --> 00:11:47,250 But if you want to make sure, and in other tools you need to, use lowercase a-z, for example, like that, and 176 00:11:47,250 --> 00:11:50,460 uppercase A-Z like that. 177 00:11:50,460 --> 00:11:54,840 So you would have lowercase and uppercase letters, 178 00:11:54,840 --> 00:11:55,590 pretty much. 179 00:11:56,400 --> 00:11:56,790 All right. 180 00:11:56,790 --> 00:12:00,870 So that's one group, that's an example of a group. 181 00:12:00,870 --> 00:12:04,620 We can also put any individual letters in here. 182 00:12:04,860 --> 00:12:11,640 So as you can see, every single letter that we put in there is marked as well 183 00:12:11,640 --> 00:12:13,140 now, or highlighted as well now. 184 00:12:13,140 --> 00:12:17,670 So everything that is in the group will be highlighted. 185 00:12:20,640 --> 00:12:24,060 Now we can create other ranges as well with digits. 186 00:12:24,060 --> 00:12:29,470 So let's say I want to have every digit between one and five. There we are. Now 187 00:12:29,470 --> 00:12:32,400 I get all the digits between one and five, zero not included. 188 00:12:32,400 --> 00:12:34,860 So one, two, three, four, five. 189 00:12:36,280 --> 00:12:37,860 What if I want to have the opposite? 190 00:12:37,870 --> 00:12:41,200 Well, then I can use the circumflex again. 191 00:12:41,380 --> 00:12:47,040 So circumflex 1-5 inside the brackets will give me every character except for the digits 1 to 5. 192 00:12:47,050 --> 00:12:53,050 So if I want to have every single character except for the seven, let's say, what would I enter? 193 00:12:54,160 --> 00:12:55,140 All right, 194 00:12:55,140 --> 00:12:56,320 I hope you figured it out. 195 00:12:56,320 --> 00:12:58,810 It's just circumflex seven inside the brackets.
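Character groups, ranges and negation can be checked with a short sketch. It uses Python's `re` module, where groups are case sensitive unless a flag says otherwise, and a made-up sample string.

```python
import re

# Hypothetical sample text.
sample = "Abc xyz 0123456789"

lower_only = "".join(re.findall(r"[a-z]", sample))      # case sensitive by default
both_cases = "".join(re.findall(r"[a-zA-Z]", sample))   # two ranges in one group
a_to_f = re.findall(r"[a-f]", sample, flags=re.IGNORECASE)

one_to_five = "".join(re.findall(r"[1-5]", sample))
not_seven = "".join(re.findall(r"[^7]", "0123456789"))

# A negated group matches any other character, not just other digits.
not_one_to_five = re.findall(r"[^1-5]", "a16")

print(lower_only, both_cases, a_to_f, one_to_five, not_seven, not_one_to_five)
```

The last line is the point that is easy to miss: a circumflex inside the brackets excludes the listed characters and accepts everything else, including letters and punctuation.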
196 00:13:00,040 --> 00:13:03,760 All right, how do I now search for a specific pattern? 197 00:13:03,760 --> 00:13:08,290 Because so far, we have used very generic regular expressions. 198 00:13:08,290 --> 00:13:13,000 We just used the very simple characters and so on. 199 00:13:13,000 --> 00:13:19,540 So what if I want to have this specific structure, let's say three digits, another three digits and 200 00:13:19,540 --> 00:13:22,780 then four digits, and they are separated by a dash. 201 00:13:23,320 --> 00:13:24,510 So how do I do it? 202 00:13:24,520 --> 00:13:30,100 Well, I say it's one digit, then it's another digit and it's another digit. 203 00:13:30,220 --> 00:13:35,740 Then, as you can see, that fits all those patterns that we have, or all those numbers that we have 204 00:13:35,740 --> 00:13:36,400 here. 205 00:13:36,400 --> 00:13:39,460 That's because all of them have at least three digits. 206 00:13:39,460 --> 00:13:45,490 So if I add another one here, as you can see, the first line is highlighted. 207 00:13:45,520 --> 00:13:54,280 Then here in our third line, only the parts with at least four consecutive digits are highlighted. 208 00:13:54,280 --> 00:14:01,000 So what if I want to have this line three highlighted and line four as well, because they have the 209 00:14:01,000 --> 00:14:12,100 same pattern? Then I can do the following, like that: minus, backslash d, backslash 210 00:14:12,100 --> 00:14:18,700 d, backslash d and backslash d. So now I have those two lines, or those two numbers, highlighted. 211 00:14:18,700 --> 00:14:27,130 This is one way I can find all telephone numbers with this structure, and there are easier 212 00:14:27,130 --> 00:14:27,940 ways to do that. 213 00:14:27,940 --> 00:14:31,540 I don't have to enter so many d's if I don't want to.
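Written out in full, the pattern built step by step above is three digits, a dash, three digits, a dash, four digits. Here is a minimal sketch in Python's `re` module; the sample lines are invented and only resemble the ones in the video.

```python
import re

# Hypothetical sample lines.
lines = ["1234567890", "123-456-7890", "555-123-4567", "12-34"]

# One \d per digit, literal dashes in between.
explicit = r"\d\d\d-\d\d\d-\d\d\d\d"
hits = [s for s in lines if re.search(explicit, s)]

# Four \d in a row only fit where four digits are directly connected.
four_in_a_row = re.findall(r"\d\d\d\d", "123-456-7890")

print(hits)
print(four_in_a_row)
```

The dash needs no escaping here because outside square brackets it has no special meaning.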
214 00:14:31,570 --> 00:14:41,350 There is a simpler way. I can use the following approach: within curly brackets 215 00:14:41,350 --> 00:14:48,370 I say how many of the preceding character escapes or patterns I want. 216 00:14:48,370 --> 00:14:52,930 So in this case, I say, okay, I want to have three d's, which means three digits. 217 00:14:53,170 --> 00:15:00,040 The same thing goes for our second entry, or the second part of the number. 218 00:15:00,040 --> 00:15:03,490 And for the last one, as you can see, three is not enough. 219 00:15:03,490 --> 00:15:04,240 We need four. 220 00:15:04,270 --> 00:15:06,850 So now it fits those two numbers. 221 00:15:06,850 --> 00:15:12,460 So what if I want to have not only the minuses, but also the dots or the hashes? 222 00:15:12,640 --> 00:15:21,670 Well, I can use the dot in both places, but then, as you can see, everything is included. 223 00:15:21,670 --> 00:15:28,300 So as you can see here, we have three digits, then we have something else and then we have a lot 224 00:15:28,300 --> 00:15:29,110 more digits. 225 00:15:29,110 --> 00:15:31,780 So we have actually this piece here. 226 00:15:31,780 --> 00:15:35,200 That's why line seven is marked as well. 227 00:15:35,200 --> 00:15:36,010 Same for eight. 228 00:15:36,010 --> 00:15:42,430 As you can see, it starts after the 49, which means we have this pattern here as well. 229 00:15:42,430 --> 00:15:44,110 But I don't want to have those patterns. 230 00:15:44,110 --> 00:15:46,180 I want to be more precise. 231 00:15:46,180 --> 00:15:49,900 I want to have, for example, the hash in there. 232 00:15:49,900 --> 00:15:54,160 I want to have the minus and the dot, but I only want to have this specific structure. 233 00:15:54,280 --> 00:15:55,750 So how do I get that?
234 00:15:58,400 --> 00:16:05,660 Well, I can use a literal group search, which means instead of the dot, I'm going to use square brackets 235 00:16:05,660 --> 00:16:09,860 again, and they will be my literal group search. 236 00:16:09,860 --> 00:16:12,800 So in here I say either it's a dot, 237 00:16:13,780 --> 00:16:17,110 or it's a hash, or it's a minus. 238 00:16:17,860 --> 00:16:22,630 So as you can see now, all of those three characters are fine. 239 00:16:22,900 --> 00:16:24,130 But nothing else. 240 00:16:24,160 --> 00:16:28,390 Now, if I want to use it for the first separator here as well. 241 00:16:28,390 --> 00:16:30,250 So I had a dot here. 242 00:16:30,250 --> 00:16:32,410 So it took the minus, the dot and the hash. 243 00:16:32,410 --> 00:16:36,010 But I want to allow just those three and not anything else. 244 00:16:36,010 --> 00:16:39,130 So what if I enter an a in the text here, for example? 245 00:16:39,400 --> 00:16:41,200 Well, now it no longer matches. 246 00:16:41,200 --> 00:16:45,850 But if the pattern has just a dot there, then the a is fine. 247 00:16:45,940 --> 00:16:51,880 But if I enter my group that I had there, my specific literal group search, 248 00:16:51,880 --> 00:16:54,250 so this literal group search here, 249 00:16:54,430 --> 00:16:59,230 now it's not included anymore, so it has to be one of those three characters. 250 00:17:00,220 --> 00:17:00,880 All right. 251 00:17:00,880 --> 00:17:07,780 If you feel like a challenge and you're up for a real challenge, then create the structure for all 252 00:17:07,780 --> 00:17:08,650 those three numbers. 253 00:17:08,650 --> 00:17:14,320 So create a pattern for all those three different numbers, because these are three different ways of 254 00:17:14,320 --> 00:17:17,560 writing a mobile phone number in Germany. 255 00:17:18,100 --> 00:17:19,900 And that's the general structure.
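The difference between the counted pattern, the dot as separator and the bracket group as separator shows up clearly when all three run against the same inputs. A minimal sketch in Python's `re` module follows; the four sample numbers are invented.

```python
import re

# Hypothetical sample numbers with different separators.
samples = ["123-456-7890", "123.456.7890", "123#456#7890", "123a456a7890"]

counted = r"\d{3}-\d{3}-\d{4}"              # {n}: exactly n of the preceding element
with_dot = r"\d{3}.\d{3}.\d{4}"             # dot: any separator at all
with_group = r"\d{3}[-.#]\d{3}[-.#]\d{4}"   # only minus, dot or hash

def matching(pattern):
    """Return the samples that the pattern matches completely."""
    return [s for s in samples if re.fullmatch(pattern, s)]

print(matching(counted))
print(matching(with_dot))
print(matching(with_group))
```

Inside square brackets the dot is a literal dot, and the minus is placed first so that it is read as a character rather than as a range.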
256 00:17:19,900 --> 00:17:24,520 So this is if you call from Germany, this is if you call from outside of Germany, and that one as well. 257 00:17:24,520 --> 00:17:30,340 So either a plus or double zero. Generally, if you call someone outside of your country, you 258 00:17:30,340 --> 00:17:33,670 need to have the plus followed by the country code. 259 00:17:33,670 --> 00:17:38,350 And it's the same with zero zero, which represents this plus. 260 00:17:38,740 --> 00:17:39,220 All right. 261 00:17:39,220 --> 00:17:43,720 So if you want to, you can create the structure for this one. 262 00:17:43,720 --> 00:17:45,280 It's pretty difficult. 263 00:17:45,280 --> 00:17:48,550 And I still need to show you one thing before you can do that. 264 00:17:50,230 --> 00:17:52,600 And maybe it's a little more than just one thing. 265 00:17:53,140 --> 00:17:57,280 So, first of all, let's go ahead with the pipe. 266 00:17:57,280 --> 00:18:08,740 So let's say I want to have plus 49 or I want to have something like zero zero 49. 267 00:18:09,430 --> 00:18:11,130 Now, that by itself won't work. 268 00:18:11,140 --> 00:18:11,860 Why is that? 269 00:18:11,860 --> 00:18:17,980 Well, if we look at the snippets, we can find quantifiers, and we have not used quantifiers yet. 270 00:18:17,980 --> 00:18:19,840 And that's something that we will need to do now. 271 00:18:19,870 --> 00:18:23,620 The same thing goes for grouping constructs and alternation constructs. 272 00:18:23,620 --> 00:18:26,920 So these are the ones that we are going to use now. 273 00:18:27,310 --> 00:18:33,370 We need to check those out before we can go ahead and create the whole structure for those numbers. 274 00:18:33,490 --> 00:18:39,160 So let's start off with using alternation constructs and quantifiers. 275 00:18:39,160 --> 00:18:44,430 So for the plus it says: matches the previous element one or more times. 276 00:18:44,440 --> 00:18:46,420 Well, I don't want to do that.
277 00:18:46,420 --> 00:18:48,700 I just want to have a literal plus. 278 00:18:48,700 --> 00:18:53,770 And as you know, a plus is a metacharacter and it needs to be escaped. 279 00:18:53,770 --> 00:18:56,920 So what I need before the plus is a backslash. 280 00:18:57,280 --> 00:18:59,580 And now you can see I'm using the pipe here. 281 00:18:59,590 --> 00:19:05,440 So I say either it's plus 49 or it's zero zero 49. 282 00:19:05,440 --> 00:19:12,850 And as you can see, both those structures or both those patterns are marked here, or both my numbers 283 00:19:12,850 --> 00:19:14,710 here are highlighted. 284 00:19:15,430 --> 00:19:16,990 So that's what this pipe does. 285 00:19:16,990 --> 00:19:18,380 It's either or. 286 00:19:18,400 --> 00:19:25,570 So one of those has to match in the number that we're looking at. 287 00:19:25,690 --> 00:19:26,680 Then it's fine. 288 00:19:26,680 --> 00:19:29,560 And that's not just for digits, that's for letters as well. 289 00:19:30,040 --> 00:19:39,880 So, for example, if I want to have Mr and Ms, so if I want to match those, I use Mr pipe Ms, which 290 00:19:39,880 --> 00:19:43,690 means either Mr or Ms has to match. 291 00:19:43,690 --> 00:19:46,510 And that's the case here in our text. 292 00:19:48,690 --> 00:19:54,170 So that's the either or, as you can see here, our alternation construct. 293 00:19:54,180 --> 00:19:58,230 Then the grouping construct is something that we need to look at. 294 00:19:58,650 --> 00:20:02,700 And in order to look at those, I think it's best if we look at the links here. 295 00:20:02,700 --> 00:20:11,160 So we have certain links here, HTTPS and HTTP links, and either with www or without. 296 00:20:11,160 --> 00:20:12,960 So we have four different links here. 297 00:20:13,230 --> 00:20:20,190 And if I want to use a group, let's say I want to use this group www, but as you can see, it's not 298 00:20:20,190 --> 00:20:22,080 the case for all links.
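The pipe can be tried with both examples from above, the two country prefixes and the two titles. This is a minimal sketch in Python's `re` module with invented sample text.

```python
import re

# Hypothetical sample text.
numbers = "+49176, 0049176, 0176"
people = "Mr Smith and Ms Jones"

# The plus is escaped so it is a literal plus; the pipe means "either or".
prefixes = re.findall(r"\+49|0049", numbers)
titles = re.findall(r"Mr|Ms", people)

print(prefixes)
print(titles)
```

Note that the number starting with a plain zero produces no match here, because neither alternative fits it.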
299 00:20:22,470 --> 00:20:32,190 So in order to find the links, we need to, first of all, start with https, then we have this structure. 300 00:20:32,190 --> 00:20:33,660 So far everything is good. 301 00:20:33,660 --> 00:20:42,390 And now if I want to have those two still active, I need to check out this group www, because here 302 00:20:42,390 --> 00:20:44,310 it's the case and here it's not. 303 00:20:44,310 --> 00:20:45,960 So how do I do that? 304 00:20:45,960 --> 00:20:53,490 Well, I need to use a group, so I'm going to group www, and as you can see, so far only this link 305 00:20:53,490 --> 00:20:53,850 here, 306 00:20:53,850 --> 00:20:56,340 the first one, is highlighted. 307 00:20:56,340 --> 00:21:01,500 But then if I enter this question mark, then it's fine. 308 00:21:01,500 --> 00:21:06,480 So I'm using the group www, but I'm combining it with 309 00:21:07,280 --> 00:21:09,830 the question mark. And what does this question mark do? 310 00:21:09,860 --> 00:21:15,140 Well, it's a quantifier and it matches the previous element zero or one time. 311 00:21:15,470 --> 00:21:23,780 So the quantifiers are the star, which matches the previous element zero or more times, then the plus 312 00:21:23,780 --> 00:21:27,770 sign, which matches the previous element one or more times. 313 00:21:27,770 --> 00:21:33,200 And then we have the question mark, which matches the previous element zero or one time. 314 00:21:33,230 --> 00:21:36,530 You can also say how many times you want it to be, 315 00:21:36,530 --> 00:21:37,310 exactly. 316 00:21:37,310 --> 00:21:38,990 So it matches the previous element 317 00:21:38,990 --> 00:21:39,800 exactly 318 00:21:39,800 --> 00:21:42,650 n times, and there is one for a range as well. 319 00:21:42,650 --> 00:21:44,300 So at least n times, 320 00:21:44,300 --> 00:21:47,210 but no more than m times. 321 00:21:47,210 --> 00:21:50,080 So those are the different quantifiers that you can use.
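The five quantifiers just listed can be compared on tiny inputs. The sketch below uses Python's `re` module; `full` is a hypothetical helper introduced only for this example, and the test strings are made up.

```python
import re

def full(pattern, text):
    """True if the whole text matches the pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

letters = ["a", "ab", "abbb"]
star = [full(r"ab*", s) for s in letters]      # b zero or more times
plus = [full(r"ab+", s) for s in letters]      # b one or more times
optional = [full(r"ab?", s) for s in letters]  # b zero or one time

digits = ["12", "123", "1234"]
exact = [full(r"\d{3}", s) for s in digits]     # exactly 3 digits
ranged = [full(r"\d{2,3}", s) for s in digits]  # at least 2, no more than 3

print(star, plus, optional)
print(exact, ranged)
```

Each quantifier applies only to the element directly before it, which is why grouping with parentheses is needed to make a whole sequence such as www optional.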
322 00:21:50,090 --> 00:21:56,180 So now you've seen one example of grouping, so grouping the www, and then one example of quantifiers, by 323 00:21:56,180 --> 00:21:57,230 using the question mark. 324 00:21:57,230 --> 00:22:00,350 So as you can see, both links are still active. 325 00:22:00,350 --> 00:22:07,790 So the one with www and the one without are still active, and here it's actually www dot. 326 00:22:07,790 --> 00:22:09,830 So this dot is included as well. 327 00:22:09,830 --> 00:22:14,810 And then after the question mark, I can go ahead and say tutorials.eu. 328 00:22:15,620 --> 00:22:18,710 So both links are active. 329 00:22:20,100 --> 00:22:20,760 Okay. 330 00:22:21,060 --> 00:22:24,270 So let's go back, though. 331 00:22:24,300 --> 00:22:29,670 We still have this little challenge to get all of those numbers to be active. 332 00:22:31,330 --> 00:22:35,810 And by now you should be able to do that based on what you have seen. 333 00:22:35,830 --> 00:22:36,890 It's pretty tough. 334 00:22:36,910 --> 00:22:41,290 I'm telling you, it can take quite a while, maybe even half an hour or so. 335 00:22:41,290 --> 00:22:47,410 Even though it's a very basic example, it can take a while to figure it out based on what you've learned 336 00:22:47,410 --> 00:22:49,840 or based on the snippets that you've seen here. 337 00:22:49,840 --> 00:22:57,430 So it's maybe best if you download the snippets file and look at the different character classes, character 338 00:22:57,430 --> 00:23:01,360 escapes, anchors and so forth to try it. 339 00:23:01,420 --> 00:23:03,160 That's, as I said, a difficult one. 340 00:23:03,160 --> 00:23:08,050 If you don't feel like a difficult challenge, then wait for the easier challenge, which will be to 341 00:23:08,050 --> 00:23:13,450 find all website links, or, well, mark or highlight all of them.
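Here is one way the optional www group could look as a complete pattern, sketched in Python's `re` module. The exact pattern typed in the video is not visible in the transcript, so this is an assumption: the `s?` after http, the escaped dots and the list of sample links (including the ftp one) are all illustrative.

```python
import re

# Hypothetical links; only the tutorials.eu domain is taken from the video.
links = [
    "https://www.tutorials.eu",
    "https://tutorials.eu",
    "http://www.tutorials.eu",
    "http://tutorials.eu",
    "ftp://tutorials.eu",
]

# s? makes the s optional, (www\.)? makes the whole "www." group optional.
pattern = r"https?://(www\.)?tutorials\.eu"

matched = [link for link in links if re.fullmatch(pattern, link)]
print(matched)
```

Escaping the dots matters: an unescaped dot would also accept any other character in that position.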
342 00:23:14,410 --> 00:23:21,520 All right, so now you should know everything that you need in order to create patterns to find German 343 00:23:21,520 --> 00:23:23,050 mobile phone numbers. 344 00:23:23,050 --> 00:23:24,730 And there is one thing to consider. 345 00:23:24,730 --> 00:23:31,990 So they always start with a zero, or a plus 49, or a 0049; then follows a one. 346 00:23:31,990 --> 00:23:33,100 So that's always the case. 347 00:23:33,100 --> 00:23:36,790 So as you can see, a one here, another one there and one here. 348 00:23:37,210 --> 00:23:42,400 The following digit is either six or seven, so no other digits are allowed here. 349 00:23:42,430 --> 00:23:47,320 Then the next digit, which is in this case the fourth digit, can be any digit. 350 00:23:47,440 --> 00:23:53,260 Then you have a slash, and following are just eight digits. 351 00:23:55,420 --> 00:24:01,870 And those eight digits can be anything, so each can be any of the ten different digits. 352 00:24:02,710 --> 00:24:04,390 So that's generally the structure. 353 00:24:04,390 --> 00:24:06,070 Please try to build that. 354 00:24:06,070 --> 00:24:09,460 If you feel like a real challenge, it is not easy. 355 00:24:09,460 --> 00:24:11,470 It can take quite a while to set it up. 356 00:24:11,470 --> 00:24:19,690 But you have seen what you need to do that and, well, please just give it a try and go ahead and check 357 00:24:19,690 --> 00:24:21,700 out the snippets to do so. 358 00:24:21,700 --> 00:24:26,410 So always go back to the snippets in order to figure it out, and it can be difficult. 359 00:24:26,410 --> 00:24:32,290 And if you feel like you're not making any progress and you can't get it to work, so can't get all those 360 00:24:32,290 --> 00:24:37,330 three numbers to be highlighted, then that's fine because I'll explain it right after. 361 00:24:39,700 --> 00:24:40,210 All right.
362 00:24:40,210 --> 00:24:45,190 So I'm going to explain now. If you want to try it yourself, just pause the video. 363 00:24:45,370 --> 00:24:45,790 Okay. 364 00:24:45,790 --> 00:24:52,090 So first of all, we have either a zero, a plus 49 or a 0049. 365 00:24:52,240 --> 00:24:59,860 So let's start off with the plus 49, and I'm going to use a group for that. 366 00:24:59,860 --> 00:25:07,000 I need the backslash plus 49 in order to use the literal plus, as you can see here. 367 00:25:07,600 --> 00:25:11,830 Then the alternative is a 0049. 368 00:25:11,830 --> 00:25:15,070 So it's either plus 49 or it's 0049. 369 00:25:15,490 --> 00:25:18,880 And then the other alternative is it's just a zero. 370 00:25:18,880 --> 00:25:21,970 So as you can see, it's just this zero here. 371 00:25:22,690 --> 00:25:25,900 But as you can see, all the others are now active as well. 372 00:25:25,900 --> 00:25:27,940 All other zeros are active as well. 373 00:25:28,030 --> 00:25:34,720 And I'm going to add a question mark here, because I'm saying there can be a zero or there can be none. 374 00:25:34,720 --> 00:25:39,370 So in those two cases, there is none. 375 00:25:39,490 --> 00:25:41,340 Only in that case, we have a zero. 376 00:25:41,350 --> 00:25:44,530 So now let's check out what this question mark means. 377 00:25:44,830 --> 00:25:48,400 It means: matches the previous element zero or one time. 378 00:25:49,540 --> 00:25:54,040 So zero in those two cases and one in line eight. 379 00:25:54,670 --> 00:25:54,830 Okay. 380 00:25:54,910 --> 00:25:55,510 That's not it. 381 00:25:55,510 --> 00:25:57,160 That's just the beginning of the number. 382 00:25:57,160 --> 00:25:57,670 Right. 383 00:25:57,700 --> 00:25:59,050 Then follows a one. 384 00:25:59,050 --> 00:26:04,270 So now we have zero one and then a six or a seven. 385 00:26:04,540 --> 00:26:11,710 So I create a group again, just going to say six, pipe, seven within my brackets.
386 00:26:12,460 --> 00:26:16,230 So far everything is good. Then it can be any number. 387 00:26:16,240 --> 00:26:25,870 So now I can just use backslash d, then we have a slash and then we have another number. 388 00:26:25,870 --> 00:26:28,840 And it's not just one, it's eight digits. 389 00:26:28,840 --> 00:26:33,850 So we can just say, okay, it will be eight digits, and there we are. 390 00:26:33,880 --> 00:26:35,920 That's the structure. 391 00:26:35,920 --> 00:26:38,170 I'm going to add it just below here. 392 00:26:38,290 --> 00:26:39,880 So that's our pattern. 393 00:26:39,880 --> 00:26:46,600 In order to get German telephone numbers, well, at least the ones that we see here, and in general, 394 00:26:46,600 --> 00:26:48,570 you should find all German mobile phone numbers. 395 00:26:48,580 --> 00:26:53,800 Now, you could, of course, go ahead and try the same thing for your country's numbers. 396 00:26:53,800 --> 00:27:00,700 So wherever you're from, wherever you live, check out the telephone numbers or the mobile phone numbers 397 00:27:00,700 --> 00:27:06,000 and see what the structure of those mobile phone numbers is and try to build a pattern for that. 398 00:27:06,010 --> 00:27:07,660 You can even share that if you want. 399 00:27:07,660 --> 00:27:10,270 That would be interesting to see for 400 00:27:10,270 --> 00:27:14,290 everyone else in the course and for me as well. 401 00:27:14,350 --> 00:27:19,060 It would be very interesting to see where you're from and what the structure of your country's telephone 402 00:27:19,060 --> 00:27:20,770 numbers or mobile phone numbers is. 403 00:27:21,220 --> 00:27:22,030 All right, great. 404 00:27:22,030 --> 00:27:23,800 So I hope you managed to do that. 405 00:27:23,800 --> 00:27:28,060 And if you didn't, I hope you could follow along with the structure that I've just explained. 406 00:27:28,240 --> 00:27:33,580 Now, let's go ahead and, well, actually, let's check out
407 00:27:35,290 --> 00:27:43,300 the Mr. and Miss example again, because here we have either Mr. Panetta or we have Mr Mueller, without 408 00:27:43,300 --> 00:27:43,780 a dot. 409 00:27:43,780 --> 00:27:47,330 So how can I check if the dot is there or not? 410 00:27:47,350 --> 00:27:48,940 Well, by now you should know. 411 00:27:49,120 --> 00:27:51,100 You can use a question mark for that. 412 00:27:51,100 --> 00:27:53,220 So let's just start with Mr. 413 00:27:53,230 --> 00:27:58,930 As you can see, all three or all four lines are marked, at least the 414 00:27:58,930 --> 00:27:59,610 Mr. 415 00:27:59,620 --> 00:28:04,480 And then if I use a dot, you can see that all of them are active as well. 416 00:28:04,480 --> 00:28:08,740 But that's because the dot means wildcard, or means anything. 417 00:28:08,740 --> 00:28:11,740 So it could be an empty space here as well. 418 00:28:12,370 --> 00:28:16,630 So if you want to make sure it's a literal dot, you need to use the backslash. 419 00:28:17,560 --> 00:28:24,430 Now, the next one is an empty space and afterwards we have the name. 420 00:28:24,430 --> 00:28:27,370 So it could be A to Z. 421 00:28:27,760 --> 00:28:32,560 So anything A to Z like that. 422 00:28:33,430 --> 00:28:38,410 So now how do I get Mr Mueller and Mr Robertson in there as well? 423 00:28:38,890 --> 00:28:46,720 Well, I need to use the dot again, but this time with a question mark, because what I'm telling it now 424 00:28:46,720 --> 00:28:52,450 is there can be a dot or there can be none, and afterwards we have an empty space. 425 00:28:52,450 --> 00:28:59,950 So here I use the literal empty space to get that, but you can as well use backslash s, because backslash 426 00:28:59,950 --> 00:29:06,430 s, if you check it out again, says matches any whitespace character, and that's what we wanted. 427 00:29:06,430 --> 00:29:10,690 So now all of those names are correct.
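The two patterns built in the find tool so far can also be tried from code. What follows is a minimal C# sketch, not the exact patterns typed in the video: the grouping of the phone prefix is an assumed reconstruction from the description, and the sample numbers and names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Assumed reconstruction of the phone pattern described above:
// prefix (+49, 0049 or 0), then 1, then 6 or 7, one digit, a slash, eight digits.
string phonePattern = @"(\+49|0049|0)1(6|7)\d/\d{8}";

// "Mr", an optional literal dot, one whitespace character, then a capital letter.
string titlePattern = @"Mr\.?\s[A-Z]";

// Hypothetical sample lines, not the video's sample text file.
string[] samples =
{
    "+49176/12345678",
    "0049162/87654321",
    "0171/11223344",
    "0181/11223344",   // third digit is 8, so the phone pattern should not match
    "Mr. Smith",
    "Mr Jones"
};

foreach (string line in samples)
{
    bool isPhone = Regex.IsMatch(line, phonePattern);
    bool isTitle = Regex.IsMatch(line, titlePattern);
    Console.WriteLine($"{line,-18} phone: {isPhone}  title: {isTitle}");
}
```

Regex.IsMatch only reports whether the pattern occurs somewhere in the line, which is roughly what the highlighting in the find tool shows.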
428 00:29:11,840 --> 00:29:18,500 And if you want to have a specific amount of characters following, so the name should be at least four 429 00:29:18,500 --> 00:29:21,470 characters, then you can use this quantifier here. 430 00:29:21,470 --> 00:29:24,740 So we are using the quantifier with n being four. 431 00:29:24,740 --> 00:29:29,780 So we say it's at least four characters or four letters. 432 00:29:32,750 --> 00:29:41,030 If you wanted to not only have characters like A to Z in there, but also numbers, for example, then 433 00:29:41,030 --> 00:29:49,370 you could use backslash w. So backslash w just means, as you can see here, it's a word character. 434 00:29:49,370 --> 00:29:53,630 So a to z, A to Z, 0 to 9 and underscore. 435 00:29:53,960 --> 00:29:59,000 So it's useful for, let's say, player names or usernames, so stuff like that. 436 00:29:59,960 --> 00:30:04,490 And if I want to have multiple characters, I can use the plus sign. 437 00:30:04,490 --> 00:30:08,960 So the quantifier plus matches the previous element one or more times. 438 00:30:08,960 --> 00:30:10,310 So let's check it out. 439 00:30:10,400 --> 00:30:11,720 Let's add the plus. 440 00:30:11,720 --> 00:30:16,790 And as you can see now, Mr. Duda, Mr Muller, Mr Robertson and Mr. G. 441 00:30:16,820 --> 00:30:25,820 are all fully highlighted because, well, we have said that we want to have one or more characters, 442 00:30:26,480 --> 00:30:32,360 and if you wanted to have zero or more times, then you could use a star. 443 00:30:33,680 --> 00:30:39,170 As you can see, the star works here as well because we have more times or more characters. 444 00:30:44,160 --> 00:30:46,560 All right, now it's time for a little challenge. 445 00:30:47,280 --> 00:30:48,660 Find a website link. 446 00:30:48,660 --> 00:30:55,920 So we started off already, but now please go ahead and find every website link that has this structure.
447 00:30:55,920 --> 00:30:57,450 And it doesn't have to be tutorials 448 00:30:57,450 --> 00:31:03,120 dot eu. It should not be a literal name, but it should be generally the structure, and the structure is, 449 00:31:03,120 --> 00:31:08,880 as you can see, https or http, colon, slash, slash. 450 00:31:10,050 --> 00:31:13,500 Then www dot or not. 451 00:31:13,830 --> 00:31:22,740 Then the name of the domain and the top-level domain, which is eu in this case. 452 00:31:23,220 --> 00:31:25,860 So it's separated by a dot, of course, as well. 453 00:31:26,190 --> 00:31:26,640 All right. 454 00:31:26,640 --> 00:31:28,260 So please go ahead and try that. 455 00:31:30,550 --> 00:31:30,910 All right. 456 00:31:30,910 --> 00:31:31,690 I hope you tried it. 457 00:31:31,690 --> 00:31:38,470 So we start off with https, and the s is optional. 458 00:31:38,470 --> 00:31:40,270 So we just enter the question mark. 459 00:31:40,270 --> 00:31:44,170 So there is either an s or no s; that's what this question mark does. 460 00:31:44,200 --> 00:31:49,630 Then we have colon, forward slash, forward slash, and then we have a group. 461 00:31:49,630 --> 00:31:55,600 So we have a www dot group, which is optional. 462 00:31:55,600 --> 00:32:00,880 So we use the question mark again, then we have the name. 463 00:32:00,880 --> 00:32:10,390 And in this case I'm going to use a word group with w plus, and it has to be backslash w here, of course. 464 00:32:10,390 --> 00:32:18,850 So now it's using the domain name as well, followed by a dot, which could be a literal dot, pretty 465 00:32:18,850 --> 00:32:28,540 much, so let's use a literal dot within the next group, where we also have another word character. 466 00:32:28,960 --> 00:32:35,140 So it can be one or more word characters, and that's it. 467 00:32:35,140 --> 00:32:37,000 So that's our structure. 468 00:32:38,050 --> 00:32:42,610 So this one here, plus, one or more times, which we have two times.
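The link structure just described can be sketched in C# as follows. The pattern is an assumed reconstruction from the spoken description rather than the exact text typed in the video, and the sample sentence is hypothetical (only tutorials.eu is mentioned in the video).

```csharp
using System;
using System.Text.RegularExpressions;

// Assumed reconstruction: http or https, "://", an optional "www." group,
// a name made of word characters, then a literal dot and a top-level domain.
string linkPattern = @"https?://(www\.)?(\w+)(\.\w+)";

// Hypothetical sample text for illustration.
string sampleText = "Visit https://www.tutorials.eu or http://example.com, but not ftp://files.example.org.";

// Print every substring that fits the link structure.
foreach (Match link in Regex.Matches(sampleText, linkPattern))
{
    Console.WriteLine(link.Value);
}
```

This deliberately stays as simple as the pattern in the video: it does not handle subdomains, paths or query strings, so longer links would only be matched up to the first top-level-domain-like segment.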
469 00:32:42,610 --> 00:32:46,450 So the top-level domain is this piece here with the dot before that. 470 00:32:46,450 --> 00:32:50,800 So backslash dot, which means it's a literal dot that we have there. 471 00:32:50,950 --> 00:32:57,850 Then we have the word itself, or, well, in this case it's eu, or it could be dot com or it could be dot 472 00:32:57,850 --> 00:32:59,500 academy or stuff like that. 473 00:33:00,160 --> 00:33:04,000 Then we have the name itself, which is this piece here. 474 00:33:04,000 --> 00:33:11,680 So the name of the domain, www or not, and http or https. 475 00:33:12,730 --> 00:33:13,120 All right. 476 00:33:13,120 --> 00:33:19,450 So you can create your own regular expressions, of course, but sometimes it just makes sense to Google 477 00:33:19,450 --> 00:33:19,840 them. 478 00:33:19,840 --> 00:33:26,080 So let's say you want to have the expression for email addresses. 479 00:33:26,080 --> 00:33:31,360 So you could, of course, go ahead and say, okay, I want to have characters before that. 480 00:33:31,360 --> 00:33:47,710 So you say w plus, then followed by an at sign, again w plus, followed by a literal dot like that, and you 481 00:33:47,710 --> 00:33:50,530 have again something like com, whatever. 482 00:33:50,530 --> 00:33:54,700 So again a backslash w plus. 483 00:33:55,000 --> 00:34:00,070 So that would be a very, very generic thing, but it could be pretty fake as well. 484 00:34:00,070 --> 00:34:08,230 So I think people have thought about very, very complex regular expressions which really are pretty secure. 485 00:34:08,230 --> 00:34:10,780 So let's check out the Internet for that. 486 00:34:10,780 --> 00:34:22,030 So let's go back to the Internet and now let's Google for email address regex or regex regular expression, 487 00:34:22,030 --> 00:34:23,230 and then you'll find this website. 488 00:34:23,230 --> 00:34:32,140 It's called email regex and it says: email address regular expression that 99.99% works.
489 00:34:33,670 --> 00:34:41,590 And here you can find the regular expression, the official standard, then the Python version, the 490 00:34:41,590 --> 00:34:44,920 JavaScript version, HTML and so forth. 491 00:34:44,920 --> 00:34:49,270 So as you can see, there are the different programming languages here. 492 00:34:49,270 --> 00:34:50,950 So that's the one for C#. 493 00:34:50,950 --> 00:34:56,380 But as you can see here, another way is to use the System.Net.Mail.MailAddress class. 494 00:34:56,380 --> 00:35:00,250 So you can either use this expression or you can simply use the class. 495 00:35:00,250 --> 00:35:04,450 And I would usually use the class, I guess. To determine whether an email address is valid, 496 00:35:04,450 --> 00:35:10,360 pass the email address to the MailAddress.MailAddress(String) class constructor. 497 00:35:10,540 --> 00:35:13,720 So that sounds like a valid way as well. 498 00:35:13,720 --> 00:35:16,030 But generally that's how you could do it. 499 00:35:16,030 --> 00:35:20,320 So if you want to have a regular expression, you can very often search for it. 500 00:35:23,640 --> 00:35:31,080 Using that one, however, and copying it into my Visual Studio file finder actually doesn't work. 501 00:35:31,080 --> 00:35:34,650 So let's use the standard one. 502 00:35:34,650 --> 00:35:37,020 So the general email regex. 503 00:35:37,020 --> 00:35:44,130 So I'm just going to copy that code here and go back to Visual Studio and now search for that specific 504 00:35:44,430 --> 00:35:45,630 regular expression. 505 00:35:45,630 --> 00:35:53,820 And as you can see, test at test dot com does work. If I use info at google dot com, 506 00:35:54,960 --> 00:35:56,970 as you can see, it works as well. 507 00:35:57,210 --> 00:36:00,060 That code seems to work as well and so forth. 508 00:36:00,060 --> 00:36:02,190 So there you are. 509 00:36:02,220 --> 00:36:03,630 That's a regular expression.
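To make the two options concrete, here is a small C# sketch comparing the very generic email pattern from the walkthrough with the MailAddress constructor approach mentioned on the website. The helper name ParsesAsMailAddress and the candidate strings are assumptions for illustration, and the longer regular expression copied from the website is not reproduced here.

```csharp
using System;
using System.Net.Mail;
using System.Text.RegularExpressions;

// Very generic pattern from the walkthrough:
// word characters, an at sign, word characters, a literal dot, word characters.
string simpleEmailPattern = @"\w+@\w+\.\w+";

// Hypothetical helper: lets the MailAddress constructor parse the address.
// The constructor throws FormatException when the string is not in a recognized format.
bool ParsesAsMailAddress(string candidate)
{
    try
    {
        _ = new MailAddress(candidate);
        return true;
    }
    catch (FormatException)
    {
        return false;
    }
}

// Hypothetical candidates; the first two are the addresses tried in the video.
string[] candidates = { "test@test.com", "info@google.com", "not an email" };

foreach (string candidate in candidates)
{
    bool regexSaysYes = Regex.IsMatch(candidate, simpleEmailPattern);
    bool classSaysYes = ParsesAsMailAddress(candidate);
    Console.WriteLine($"{candidate}: regex {regexSaysYes}, MailAddress {classSaysYes}");
}
```

Neither check proves that a mailbox exists; both only look at the shape of the string, and the two can disagree on unusual addresses.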
510 00:36:03,630 --> 00:36:09,690 And if you had to, well, figure this one out by yourself, I think you would 511 00:36:10,200 --> 00:36:14,490 need quite a while, but that's the whole point, right? 512 00:36:14,490 --> 00:36:19,080 You don't need to develop everything yourself and think of everything yourself. 513 00:36:19,080 --> 00:36:26,160 You can save hours and hours of your life just using the Internet to figure things out, just using 514 00:36:26,160 --> 00:36:28,380 Google to find what you're searching for. 515 00:36:28,410 --> 00:36:33,690 Very often you'll find a solution which works pretty well, and you don't need to create all of it by 516 00:36:33,690 --> 00:36:34,380 yourself. 517 00:36:34,590 --> 00:36:35,010 All right. 518 00:36:35,010 --> 00:36:39,420 So I hope you liked this short course on regular expressions. 519 00:36:39,420 --> 00:36:45,000 And there's one last thing to do, and that's using regular expressions in C#. 520 00:36:45,000 --> 00:36:51,630 And I know this video is pretty long, but hang on, because now we're going to use C# to use regular 521 00:36:51,630 --> 00:36:52,440 expressions. 522 00:36:52,440 --> 00:36:58,170 So in order to use regular expressions, you can go ahead and create a Regex. 523 00:36:58,530 --> 00:37:02,280 And in this case, as you can see, it doesn't know Regex. 524 00:37:02,280 --> 00:37:09,030 So we need to add the namespace, which is System.Text.RegularExpressions. 525 00:37:09,480 --> 00:37:10,770 That's the one that we want. 526 00:37:10,770 --> 00:37:17,790 Now we can use Regex, and I'm going to call it regex, and this will be a new Regex. 527 00:37:17,790 --> 00:37:20,400 And here I can enter the pattern. 528 00:37:20,400 --> 00:37:26,040 So as you can see, it's a string pattern, and you can either put it in here or you can create 529 00:37:26,040 --> 00:37:30,150 a string pattern, which will be your pattern.
530 00:37:30,150 --> 00:37:39,990 So let's say in this case, we need to use the at sign at the beginning, and I'm using backslash d to find 531 00:37:39,990 --> 00:37:41,670 all numbers in my pattern. 532 00:37:41,670 --> 00:37:47,160 So I'm going to give it to the constructor of the Regex class here. 533 00:37:47,580 --> 00:37:52,330 And as you can see now, it's going to use this regular expression pattern, which is in my case just 534 00:37:52,330 --> 00:37:52,770 the digits. 535 00:37:52,770 --> 00:37:55,020 So it's just going to say, okay, give me all digits. 536 00:37:55,020 --> 00:38:02,280 Now, let's create a test string, and I'm going to call it text, and that will be: Hi there, 537 00:38:02,280 --> 00:38:07,350 my number is 12314. 538 00:38:08,070 --> 00:38:10,620 So that's going to be my test. 539 00:38:10,950 --> 00:38:18,840 And now in order to find every single character, or every single time that this pattern fits, I'm going 540 00:38:18,840 --> 00:38:25,680 to use a foreach loop because I want to show all the different matches that there are. 541 00:38:26,670 --> 00:38:29,970 So let's use a MatchCollection for that. 542 00:38:30,180 --> 00:38:35,910 MatchCollection, if you check it out, is a type, which is not valid in the given context. 543 00:38:35,910 --> 00:38:36,930 Well, that's fine. 544 00:38:36,930 --> 00:38:40,980 Let's create a MatchCollection. 545 00:38:40,980 --> 00:38:52,410 I'm just going to say matchCollection is equal to regex.Matches, and the Matches method, 546 00:38:52,410 --> 00:38:53,220 what will it do? 547 00:38:53,220 --> 00:39:01,110 Well, it will take an input, which is my text, and it will try to fit it with my pattern. 548 00:39:01,110 --> 00:39:03,840 So we're using this regex here. 549 00:39:03,870 --> 00:39:04,890 This is the pattern. 550 00:39:04,890 --> 00:39:06,330 We gave it to the regex.
551 00:39:06,330 --> 00:39:10,590 So now it's trying to match the pattern to the text that we have. 552 00:39:10,590 --> 00:39:12,870 And the text is this line of code here. 553 00:39:13,170 --> 00:39:16,800 So what we can do now is show all the different hits. 554 00:39:16,800 --> 00:39:28,230 So I'm going to say something like: so many hits found, and I'm going to use a new line for that. 555 00:39:28,770 --> 00:39:30,450 I'm just going to say it like this. 556 00:39:30,780 --> 00:39:36,000 So this is the amount of hits that I have, and that will be my matchCollection.Count. 557 00:39:36,000 --> 00:39:42,660 So as you can see, this MatchCollection object has a property which is called Count, and it will just 558 00:39:42,660 --> 00:39:49,650 say how many times it found whatever it tried to find, 559 00:39:49,650 --> 00:39:56,730 so whatever the pattern was and whatever the text was, and I'm just going to say text here as well. 560 00:39:56,730 --> 00:40:05,340 So I say within the text, I have this many hits, and now let's show the actual hits. 561 00:40:05,340 --> 00:40:18,090 So I'm going to use foreach, and it's going to be a Match, which is called hit, in matchCollection. 562 00:40:19,380 --> 00:40:22,860 So with this foreach loop, I'm just going to loop through 563 00:40:23,510 --> 00:40:28,340 the whole match collection, which contains all the different hits. 564 00:40:28,340 --> 00:40:34,250 And I'm going to print every single hit on the screen in order to also see at which point in the string 565 00:40:34,250 --> 00:40:34,920 it was. 566 00:40:34,940 --> 00:40:39,240 So which position within this string it is. 567 00:40:39,280 --> 00:40:44,420 So let's say it has, I don't know, it's like eight or seven characters here. 568 00:40:44,420 --> 00:40:45,730 Eight, nine, ten. 569 00:40:45,740 --> 00:40:48,440 So it's around, let's say, 20. 570 00:40:48,440 --> 00:40:51,290 20 is this one, 21 is this one, 22 is that one.
571 00:40:51,290 --> 00:40:52,110 And so forth. 572 00:40:52,130 --> 00:41:00,470 So if you want to have those as well, we can use a GroupCollection, which will be my group, and it's 573 00:41:00,470 --> 00:41:03,810 going to be hit.Groups. 574 00:41:05,180 --> 00:41:07,040 Next I want to display that. 575 00:41:07,040 --> 00:41:08,570 So cw, 576 00:41:10,710 --> 00:41:12,570 double Tab. 577 00:41:12,570 --> 00:41:18,240 And then I say, okay, I want to have the hit itself. 578 00:41:18,900 --> 00:41:24,810 So I'm going to use curly brackets, found at, and the position. 579 00:41:24,810 --> 00:41:26,900 So how do I get those? 580 00:41:26,910 --> 00:41:29,730 Well, I use the group that I have created here. 581 00:41:29,730 --> 00:41:38,580 So this GroupCollection, and I'm just going to say group dot, or actually in square brackets at the position 582 00:41:38,580 --> 00:41:39,480 zero. 583 00:41:39,480 --> 00:41:42,510 Give me that and give me its Value. 584 00:41:42,510 --> 00:41:45,120 And then I also want to have the index. 585 00:41:45,120 --> 00:41:51,180 So group zero and Index. 586 00:41:51,540 --> 00:41:52,800 All right, now let's run it. 587 00:41:52,800 --> 00:41:57,810 And what it will do is it will give me the value and the index where it was found. 588 00:41:57,810 --> 00:42:04,170 And I'm going to start with Ctrl+F5 so it stays open, and it says 1 found at 23, 2 found at 589 00:42:04,170 --> 00:42:09,360 24, 3 found at 25, 1 at 26 and 4 at 27. 590 00:42:09,360 --> 00:42:16,110 So as you can see, it has found digits and it has given me the position of those digits. 591 00:42:16,110 --> 00:42:19,950 And the position is the position within the string that we have here. 592 00:42:19,950 --> 00:42:20,640 All right. 593 00:42:20,790 --> 00:42:28,110 So if I have a different pattern, let's say I have this pattern of multiple backslash d's, so I want to have 594 00:42:28,110 --> 00:42:29,490 five digits.
595 00:42:30,780 --> 00:42:32,280 So let's check it out again. 596 00:42:32,670 --> 00:42:38,580 And now it should only find one entry, and it says: Hi there, 597 00:42:38,580 --> 00:42:42,030 my number is 12314. 598 00:42:42,030 --> 00:42:45,780 And here, 12314 found at 23. 599 00:42:47,400 --> 00:42:58,680 So whatever you have set up here, that's going to be found, and if I search for there, then it will give 600 00:42:58,680 --> 00:43:02,040 me this there, which starts at index three. 601 00:43:03,030 --> 00:43:05,850 So I'm going to start it and there we are. 602 00:43:06,000 --> 00:43:06,600 Hi there, 603 00:43:06,600 --> 00:43:09,690 my number is 12314, and there 604 00:43:09,690 --> 00:43:17,670 found at 3, because this position here is index three of our string. 605 00:43:18,240 --> 00:43:18,810 Okay. 606 00:43:18,810 --> 00:43:23,190 So that's an example of how you can use regex in C#. 607 00:43:23,190 --> 00:43:23,850 There are more. 608 00:43:23,850 --> 00:43:25,050 You can check them out. 609 00:43:25,110 --> 00:43:28,170 You can just check out the documentation for that. 610 00:43:28,170 --> 00:43:34,110 But this video is more about the general idea of regular expressions and how to use them.
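Pulling the spoken steps together, the program built in this part of the video might look roughly like the sketch below. It is a reconstruction from the narration, so the variable names, the exact output wording and the comma in the test string are assumptions (the comma is chosen so that the first digit lands at index 23, as reported in the video).

```csharp
using System;
using System.Text.RegularExpressions;

// The @ makes this a verbatim string, so the backslash needs no escaping.
string pattern = @"\d";
Regex regex = new Regex(pattern);

// Assumed test string, reconstructed from the narration.
string text = "Hi there, my number is 12314";

// Matches returns every non-overlapping occurrence of the pattern in the text.
MatchCollection matchCollection = regex.Matches(text);
Console.WriteLine("{0} hits found:\n {1}", matchCollection.Count, text);

foreach (Match hit in matchCollection)
{
    GroupCollection group = hit.Groups;
    // Group 0 is the whole match; Index is its zero-based position in the input string.
    Console.WriteLine("{0} found at {1}", group[0].Value, group[0].Index);
}
```

With this test string, swapping the pattern for @"\d{5}" should leave a single hit, 12314 at index 23, and the literal pattern there should be found at index 3. Since group 0 is the match itself, hit.Value and hit.Index would give the same information without the GroupCollection.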