1 00:00:07,230 --> 00:00:14,940 So in this video, we're going to talk about Question 26, what is the purpose of the Gatkuoth method? 2 00:00:15,450 --> 00:00:19,360 We are not yet quite done with collections in the next lecture. 3 00:00:19,380 --> 00:00:26,310 We'll discuss dictionaries, but to understand dictionaries, we must first understand the good called 4 00:00:26,310 --> 00:00:26,820 method. 5 00:00:27,000 --> 00:00:28,890 So let's do it in this lecture. 6 00:00:29,340 --> 00:00:34,650 Good hash gold is one of the few methods that belong to the system object type. 7 00:00:34,920 --> 00:00:41,790 In other words, we can click on an object in C-sharp before we understand what this method does. 8 00:00:42,000 --> 00:00:43,470 Let's see it in action. 9 00:00:56,450 --> 00:01:03,290 For now, those values look enigmatic, but hopefully we'll understand them better later in director. 10 00:01:03,860 --> 00:01:08,600 The guitar scored method is a hash function implementation for an object. 11 00:01:08,990 --> 00:01:15,380 Let's see the definition of a hash function and hash function is a one way cryptographic algorithm that 12 00:01:15,380 --> 00:01:20,390 maps an input of an SRS to a unique output of a fixed length of bits. 13 00:01:20,990 --> 00:01:24,170 At least for me, this sounds completely cryptic. 14 00:01:24,770 --> 00:01:30,740 First, let's understand what the result of this hash function is in c sharp. 15 00:01:30,740 --> 00:01:31,730 It's an integer. 16 00:01:32,030 --> 00:01:37,880 In simple terms, hash is a number calculated for some object from its components. 17 00:01:38,240 --> 00:01:42,650 He was an object of person class and some hash calculated for it. 18 00:01:43,100 --> 00:01:47,910 We'll take a closer look at how hash is actually calculated a bit later. 19 00:01:48,110 --> 00:01:54,590 But for now, let's just say that it's a function of the values of fields and properties belonging to 20 00:01:54,590 --> 00:01:55,370 the object. 21 00:01:55,760 --> 00:02:01,310 In this case, we could, for example, associate each letter with some number, which would allow us 22 00:02:01,310 --> 00:02:04,910 to translate words John and Smith to integers. 23 00:02:05,480 --> 00:02:11,720 Then we would somehow combine those integers with the integer representing data of birth, and as a 24 00:02:11,720 --> 00:02:14,690 result, we would have the hash code of the person. 25 00:02:15,080 --> 00:02:21,770 According to this, if we created another object of the person class with named Don last name Smith 26 00:02:21,770 --> 00:02:26,150 and year of birth 1987, the hash should be the same. 27 00:02:26,420 --> 00:02:32,870 On the other hand, if this ever object to VAR, for example, different year of birth, its hash would 28 00:02:32,870 --> 00:02:36,900 be different if we calculated the hash for the second time. 29 00:02:36,920 --> 00:02:42,560 The result should be the same as it was for the first time, assuming the object was not modified. 30 00:02:43,250 --> 00:02:49,100 If you have two objects that are different instances, but we consider them equal, the hash code for 31 00:02:49,100 --> 00:02:50,900 both of them should be the same. 32 00:02:51,140 --> 00:02:58,640 For example, if we have two points both having X equal to turn and Y equal to 20, even if they are 33 00:02:58,640 --> 00:03:03,590 two different instances, we consider them equal, so their hash codes should be the same. 34 00:03:04,210 --> 00:03:09,230 At this point, you probably wonder, OK, but what is the use for hash codes? 35 00:03:09,860 --> 00:03:14,510 Well, there are many rules is that the work excuse enhanced collections. 36 00:03:14,720 --> 00:03:20,780 This may again sound cryptic by now, but don't worry, we'll soon learn about one of the most useful 37 00:03:20,780 --> 00:03:22,400 C sharps collections. 38 00:03:22,610 --> 00:03:29,780 The dictionary if you use dictionaries before you know that its value is stored under a key, the key 39 00:03:29,780 --> 00:03:32,510 can be any object, even a complex one. 40 00:03:32,690 --> 00:03:38,600 But the dictionary needs to be able to translate it to an integer, and that's exactly what a good hash 41 00:03:38,600 --> 00:03:40,220 called method comes in handy. 42 00:03:40,700 --> 00:03:44,390 We'll learn more about it in the lecture about the dictionaries. 43 00:03:44,930 --> 00:03:49,820 Let's go back to the hash functions that map complex objects into integers. 44 00:03:50,330 --> 00:03:56,390 The very important trait of the hash function is that it should uniformly distribute its values. 45 00:03:56,720 --> 00:04:04,190 This means if I call guitar called method on 100000 different objects of the point type, I should get 46 00:04:04,190 --> 00:04:10,700 very little or none duplicated hash codes, but it is possible to have duplicated hash codes. 47 00:04:11,060 --> 00:04:15,290 This situation is called hash called conflict, and it's perfectly normal. 48 00:04:15,590 --> 00:04:22,640 Many people consider hash codes to be the identifiers of objects and think that two different objects 49 00:04:22,640 --> 00:04:25,820 of the same type can't have the same hash code. 50 00:04:26,000 --> 00:04:28,610 But this is not true, and it cannot be. 51 00:04:28,970 --> 00:04:30,200 Let me prove it to you. 52 00:04:30,530 --> 00:04:32,840 Let's again consider a point type. 53 00:04:33,140 --> 00:04:36,260 It contains two fields X and Y. 54 00:04:36,650 --> 00:04:44,330 Both X and Y are integers, so each of them can have a value between minimal value of int and maximal 55 00:04:44,330 --> 00:04:46,760 value of Int for simplicity. 56 00:04:46,880 --> 00:04:53,420 Let's say that the minimum value of the integer is minus two billion and the maximal is plus two billion. 57 00:04:53,810 --> 00:05:01,190 This means we can have four billion different X coordinates and four billion different Y coordinates, 58 00:05:01,190 --> 00:05:05,990 which in total gifts four billion times four billion different points. 59 00:05:06,140 --> 00:05:07,760 What is 16 quintillion? 60 00:05:08,090 --> 00:05:14,620 On the other hand, the hash itself is an integer, so it can only have four billion different values 61 00:05:14,630 --> 00:05:17,930 so much, much less then different points. 62 00:05:18,590 --> 00:05:24,650 So when creating different bonds will sooner or later simply run out of different households. 63 00:05:24,950 --> 00:05:28,880 It is sometimes referred to as the balls into bins problem. 64 00:05:29,180 --> 00:05:36,170 If we have more balls than beans and it what is stored in some being it must mean that in some bins 65 00:05:36,290 --> 00:05:37,940 there is more than one bowl. 66 00:05:38,600 --> 00:05:40,550 Let's summarize the hash function. 67 00:05:40,700 --> 00:05:43,640 If I have two different objects of the same type. 68 00:05:43,870 --> 00:05:49,790 Ideally, their hash codes should be different if I have plenty of different objects of the same type. 69 00:05:50,000 --> 00:05:53,390 There should be as few duplicated hash codes as possible. 70 00:05:53,810 --> 00:05:55,970 Finally, if I have two objects. 71 00:05:56,070 --> 00:05:59,550 That I consider equal their hash codes should be the same. 72 00:06:00,150 --> 00:06:05,160 Let's see some implementations of the good code method for some built in types. 73 00:06:05,580 --> 00:06:07,980 Here is the implementation for the integer. 74 00:06:08,340 --> 00:06:12,640 For instance, the implementation is as simple as it can ever be. 75 00:06:12,660 --> 00:06:15,630 The integer value itself is a perfect hash code. 76 00:06:15,810 --> 00:06:21,450 It will be the same fork to equal integers, and it will be different for two different integers. 77 00:06:21,840 --> 00:06:27,780 There will be no duplicates at all because for each integer possible, the hash gold will be different. 78 00:06:28,170 --> 00:06:36,090 Now let's consider the point type before we think of our own implementation of the good hash coordinated. 79 00:06:36,210 --> 00:06:38,070 Let's see what is the default? 80 00:06:38,370 --> 00:06:45,170 As we said, the Gatkuoth method is defined in the system object class, so I can call it on any object, 81 00:06:45,180 --> 00:06:47,220 even if I did not override it. 82 00:06:47,760 --> 00:06:49,260 Let's see some points. 83 00:06:50,670 --> 00:06:53,910 As you can see, point one and point two are the same. 84 00:06:54,060 --> 00:06:56,730 So I would like them to have the equal Hershkowitz. 85 00:06:56,940 --> 00:07:00,630 Point three is different, so it should have a different housecoat. 86 00:07:01,080 --> 00:07:02,700 Let's see the result. 87 00:07:04,600 --> 00:07:11,080 Well, that's not what we wanted, the two first half goals should be the same, but they are not to 88 00:07:11,080 --> 00:07:12,250 understand why is that? 89 00:07:12,250 --> 00:07:19,390 So we must understand what's the default implementation of the good hash code method for reference types 90 00:07:19,410 --> 00:07:21,640 it basis on the reference itself. 91 00:07:21,790 --> 00:07:28,840 So the address of the object in memory for value types, it is calculated based on the values stored 92 00:07:28,840 --> 00:07:29,770 in the object. 93 00:07:30,070 --> 00:07:36,520 That explains why two points, even with the same X and Y, have different hash goals, which we declared 94 00:07:36,530 --> 00:07:37,330 the point as a. 95 00:07:37,960 --> 00:07:44,710 So our reference type point one and point two are two different objects with two different references. 96 00:07:45,310 --> 00:07:50,590 The hash code is built based on the reference, so it is different for both of them. 97 00:07:51,160 --> 00:07:56,920 So if we want to have the same hash codes for the points with the same X and Y, we can simply change 98 00:07:56,920 --> 00:08:00,610 the point from a class to obstruct, which is a value type. 99 00:08:04,410 --> 00:08:12,150 Now we have what we wanted, the first two points have the same harsh goals, but let's don't go back 100 00:08:12,150 --> 00:08:17,070 to our gloss and let's try to implement the catcalled method ourselves. 101 00:08:21,520 --> 00:08:27,660 Most of the base types in C-sharp already provide a good implementation of the good catcalled method. 102 00:08:28,000 --> 00:08:33,640 Those methods are usually strongly based on pure math and are also pretty low level. 103 00:08:33,970 --> 00:08:38,290 And because of that, I don't want to get into details on how they work. 104 00:08:38,530 --> 00:08:40,810 Just so you have an idea, let's see. 105 00:08:40,810 --> 00:08:43,810 They get such good method implementation for string. 106 00:08:46,220 --> 00:08:49,460 As you can see, this is a pretty low level stuff. 107 00:08:49,850 --> 00:08:54,440 Luckily for us, the hard work has already been done by others. 108 00:08:54,830 --> 00:09:00,890 When defining custom types, we can simply combine the hash codes of the values stored in the object 109 00:09:00,980 --> 00:09:02,210 into a single hash code. 110 00:09:02,510 --> 00:09:05,210 For the Punkt class, it would look like this. 111 00:09:11,050 --> 00:09:14,830 Hash cold combine takes on the object as parameters. 112 00:09:15,250 --> 00:09:19,180 Let's make sure the implementation for the point works as expected. 113 00:09:20,950 --> 00:09:21,290 Right. 114 00:09:21,730 --> 00:09:25,510 The crash codes for points with equal and worse are the same. 115 00:09:26,740 --> 00:09:33,700 Remember that we don't always need to combine all properties and fields of a type to get a Vikash gold. 116 00:09:38,920 --> 00:09:45,310 For example, if we had a Social Security number in the person class, which by definition identifies 117 00:09:45,310 --> 00:09:50,710 a person, it would be perfectly fine to use it as the only component of the hash code. 118 00:09:52,900 --> 00:09:59,260 We always consider two person objects equal if they go with the same Social Security number and we can 119 00:09:59,260 --> 00:10:02,120 ignore other fields if they were different. 120 00:10:02,140 --> 00:10:07,600 It would most likely mean there is some error in the data itself, as two different people should never 121 00:10:07,600 --> 00:10:09,880 have the same Social Security number. 122 00:10:10,480 --> 00:10:11,170 All right. 123 00:10:11,470 --> 00:10:14,800 We now know how to implement the guitar code method. 124 00:10:14,920 --> 00:10:17,560 But the question is, what should we do it? 125 00:10:18,130 --> 00:10:19,660 The answer is simple. 126 00:10:19,840 --> 00:10:26,830 If the type is going to be used as a key of A. collection, like a dictionary or the hash table and 127 00:10:26,830 --> 00:10:29,710 the default implementation is not working for us. 128 00:10:30,190 --> 00:10:37,330 For reference types, we usually don't want the default code implementation as it compares objects by 129 00:10:37,330 --> 00:10:38,290 references. 130 00:10:38,410 --> 00:10:44,470 As we've seen with the point grass, we had two points with the same and why he had their hash codes 131 00:10:44,470 --> 00:10:45,220 were different. 132 00:10:45,490 --> 00:10:50,680 So when used as keys in the dictionary, they would be considered two different keys. 133 00:10:51,070 --> 00:10:58,230 In this case, we usually want to override the Gatkuoth method and remember that hash called combine 134 00:10:58,240 --> 00:10:59,590 can be a great help. 135 00:10:59,890 --> 00:11:04,780 Of course, in some situations, caution is based on the reference itself are fine. 136 00:11:05,140 --> 00:11:07,120 It all depends on the context. 137 00:11:07,450 --> 00:11:10,540 Later, in the course, we'll learn about records. 138 00:11:10,810 --> 00:11:18,230 Records are reference types that provide their own value base to get hash code method for value types. 139 00:11:18,250 --> 00:11:19,630 It is a bit tricky. 140 00:11:20,110 --> 00:11:26,710 There is a default implementation that works fine and uses the values stored in the fields and properties 141 00:11:26,710 --> 00:11:29,260 of the type to calculate the hash code. 142 00:11:29,470 --> 00:11:32,740 We've seen it when we turn to the point class to abstract. 143 00:11:33,190 --> 00:11:39,850 The problem is that this default implementation uses reflection, and as we learned in the what this 144 00:11:39,850 --> 00:11:42,370 reflection lecture, it's painfully slow. 145 00:11:43,030 --> 00:11:48,370 Because of that, it's a good idea to provide a custom implementation of the good hash called method 146 00:11:48,490 --> 00:11:55,960 invalid types and create, especially if they are going to be used as just collections keys out when 147 00:11:55,960 --> 00:11:58,030 overriding the task code method. 148 00:11:58,210 --> 00:12:04,870 It is important to also overwrite the equals method will explain the reason for it in the lecture about 149 00:12:04,870 --> 00:12:05,890 dictionaries. 150 00:12:06,490 --> 00:12:08,680 Let's summarize the good hash code. 151 00:12:08,680 --> 00:12:14,590 Methode generates an integer foreign object based on this objects, fields and properties. 152 00:12:15,010 --> 00:12:21,550 This integer, called hash, is most often used in collections like hafsat or dictionary. 153 00:12:22,210 --> 00:12:28,450 During the interview, you can be asked a question which from my interviewing experience, almost no 154 00:12:28,450 --> 00:12:35,770 one knows the answer to come to objects of the same type different by value of the same hash codes. 155 00:12:36,310 --> 00:12:43,690 Yes, hash called duplications, also called hash called conflicts, can happen simply because the count 156 00:12:43,690 --> 00:12:49,600 of distinct hash codes is equal to the amount of integer, and there are many types that can have much 157 00:12:49,600 --> 00:12:52,120 more distinct objects than this count. 158 00:12:52,390 --> 00:12:57,640 Because of that, there are simply not enough hash codes for each distinct object. 159 00:12:57,910 --> 00:13:01,360 So at some point, they just start to repeat themselves. 160 00:13:02,200 --> 00:13:02,590 Why? 161 00:13:02,590 --> 00:13:09,910 It may be a good idea to provide a custom implementation of the hash called method for struct because 162 00:13:09,910 --> 00:13:15,190 the default implementation basis on reflection and because of that, it is slow. 163 00:13:15,640 --> 00:13:22,120 A custom implementation may be significantly faster, and if we use the struct as a key in collections 164 00:13:22,120 --> 00:13:25,690 extensively, it may improve the performance very much. 165 00:13:26,290 --> 00:13:27,950 All right, that's it. 166 00:13:27,970 --> 00:13:34,120 About Hash will continue this topic in the next lecture where we learn about dictionaries. 167 00:13:34,480 --> 00:13:37,200 Thanks for following along, and I'll see you there.