<p>The Digital Cat (https://www.thedigitalcatonline.com/): Adventures of a curious cat in the land of programming</p><h1>Data Partitioning and Consistent Hashing</h1><p>Leonardo Giordani, 2022-08-23</p><p>This post is an introduction to partitioning, a technique for distributed storage systems, and to consistent hashing, a specific partitioning algorithm that promotes even distribution of data, while allowing the number of servers to change dynamically and minimising the cost of the transition.</p><p>My interest in partitioning dates back to 2015 when I was following courses at the MongoDB university and learned about <em>sharding</em>, the name MongoDB uses for partitioning. I was fascinated by the topic and discovered the technique known as <em>consistent hashing</em>; I enjoyed it so much that I wrote a little demo in Python to understand it better. Later, I focused on other things and forgot the project completely, until recently, when <a href="https://github.com/drocpdp">David Eynon</a> sent me a PR on GitHub to replace a deprecated testing library. So, I decided to brush up on my knowledge of consistent hashing and, as I often do on this blog, dump my thoughts in a post.</p><p>The topic of distributed storage and data processing is arguably rich and complicated, so while I will try to give a broader context to the concepts that I will introduce, I by no means intend to write a comprehensive guide to the subject matter. The audience of this post is developers who do not know what partitioning and consistent hashing are and want to take their first step into those topics.</p><div class="infobox"><i class="fa fa-"></i><div class="title">Code syntax</div><div><p>You will find some code examples in the post, which are written in Python. 
If you are not familiar with the language, these are the main rules:</p>
<ul><li><code>x**y</code> means x<sup>y</sup>, e.g. <code>2**3 => 8</code>.</li><li><code>x//y</code> means the integer division of <code>x</code> by <code>y</code>, e.g. <code>11//4 => 2</code>.</li><li><code>x%y</code> means the modulo operation (remainder of integer division), e.g. <code>11%4 => 3</code>.</li></ul></div></div><h2 id="rationale-2d0e">Rationale<a class="headerlink" href="#rationale-2d0e" title="Permanent link">¶</a></h2><p>When we design a system, we might want to scatter data among multiple sources to allow real concurrency of access and a more targeted optimisation.</p><p>For example, we might observe that in a given social media application there are two types of queries: some are very infrequent and involve tables related to personal data and the user profile, others are extremely frequent and pretty intensive, and are related to the content shared by the user. In this case, we might decide to store the tables related to the profile and the tables that are related to content in two different systems, A and B (here, the word <em>system</em> might be freely replaced by <em>computer</em>, <em>database</em>, <em>storage system</em>, or other similar components).</p><p>This means that the infrequent queries that fetch personal data will be served by system A, while the more frequent and intensive queries related to content will be served by system B.</p><p>Suddenly, we have the chance to deploy system B using more powerful and expensive hardware, or an architecture with better performance, without increasing the cost for tables that won't benefit from such an improvement, such as the ones stored in system A.</p><div class="imageblock"><img src="/images/data-partitioning-and-consistent-hashing/partitioning_rationale.jpg"></div><p>This is a standard approach in system design, and it requires the introduction of an additional layer of control that will route requests to the right source. 
This layer might be implemented in several places, for example:</p><ul><li>in the code of our application, with conditional structures that query different data sources</li><li>in the framework that we are using for the application, for example in a middleware that automatically routes requests according to the nature of the query</li><li>in a wrapper around the storage that hides the fact that data exists in two different systems</li></ul><p>In the last case, this technique is usually called <em>partitioning</em>.</p><p>In this post, I will try to show the challenges we face when we partition data and focus on some of the algorithms that can be used to implement it, in particular on consistent hashing. Please note that, while some of these techniques are used by databases to provide internal partitioning, they have a wider range of applications and might come in handy in different contexts.</p><h2 id="design-choices-d096">Design choices<a class="headerlink" href="#design-choices-d096" title="Permanent link">¶</a></h2><p>Every design choice in a system depends on the requirements, and when it comes to data storage the most important factors are the <em>nature of the data</em>, its <em>distribution</em>, and the <em>access patterns</em>. Consider for example databases and caches such as Content Delivery Networks (CDNs): both are meant to store data, and the storage size of both can vary substantially. However, there are important differences between the two that greatly affect the design choices. Let's see some simple examples:</p><ul><li>databases are meant to store data in a long-term fashion, while caches are by definition short-lived. This means that an important requirement for databases is data preservation, and we should do everything in our power to avoid losing parts of the database. A cache, conversely, holds data for a short time, either predetermined by the system or forced by a change in the data source. 
As you can see, in this case we not only accept data loss as part of the equation, but we also trigger it on purpose.</li></ul><ul><li>applications often make use of range queries, which means that they retrieve sets of results spanning a range of values of one of the keys; for example, you might want to see all employees within a certain range of salaries, or all users that have more than a certain number of followers. In such cases, it makes little sense to scatter data among different physical sources, thus making the retrieval more complicated and ultimately affecting performance. Databases very often see an access pattern of this type, while caches, being usually implemented as key/value stores, do not need to take this into account.</li></ul><h2 id="a-practical-example-of-partitioning-1b06">A practical example of partitioning<a class="headerlink" href="#a-practical-example-of-partitioning-1b06" title="Permanent link">¶</a></h2><p>Let us consider a simple key/value store, for example a common address book where the key is the name of the contact and the value a rich document with their personal details. If multiple users access the store, chances are that the system will at a certain point struggle to serve all the requests, so we might want to partition the data to allow concurrent access. We can for example sort them alphabetically and split them in two, storing all values with a key that begins with the letters A-M in one server and the rest (keys N-Z) in the second one.</p><div class="imageblock"><img src="/images/data-partitioning-and-consistent-hashing/simple_partitioning_1.jpg"></div><p>This might seem a good idea, but we will soon discover that performance is not great. 
Unfortunately, our address book doesn't contain the same number of people for each letter, as (for example) we know more people whose name starts with A or C than with X or Z.</p><p>That poses a problem, as our partitioning doesn't achieve the desired outcome, that of splitting requests evenly between the two servers. If we increase the number of partitions, serving smaller groups of letters, we will just worsen the problem, to the point where a partition might be completely empty and thus receive no traffic: since the problem comes from the data distribution, we need to find a way to change that property.</p><div class="imageblock"><img src="/images/data-partitioning-and-consistent-hashing/simple_partitioning_2.jpg"></div><p>One way to deal with the problem is to change the boundaries of the partitions so that we get an almost even distribution of values among them. For example, we might store keys starting with A-B in the first partition, keys starting with C-D in the second, and all the rest in the third one.</p><p>The problem with such a strategy is that it is highly dependent on the actual data that we are storing. Not only does this mean the solution has to be customised for each use case (the partitions in the example might be good for one address book and completely wrong for another), but adding data to the storage might change the distribution and invalidate the solution.</p><div class="imageblock"><img src="/images/data-partitioning-and-consistent-hashing/simple_partitioning_3.jpg"></div><h2 id="hash-functions-to-the-rescue-2cc2">Hash functions to the rescue<a class="headerlink" href="#hash-functions-to-the-rescue-2cc2" title="Permanent link">¶</a></h2><p>An interesting solution to the problem of distributing data evenly is represented by hash functions. 
As I explained in my post <a href="https://www.thedigitalcatonline.com/blog/2018/04/06/introduction-to-hashing/">Introduction to hashing</a>, good hash functions produce a highly uniform distribution, which makes them ideal in this case. Please note that hash functions can help with <em>routing queries</em> and not with <em>storing data</em>. Hashed values cannot replace the content, as hash functions are not injective, i.e. given two different inputs the output might be the same (collision), so they can only be used to decide <em>where</em> to store a piece of information.</p><p>We can at this point devise a storage strategy based on hash functions. We can divide the output space of the hash function (codomain) into a certain number of partitions and be sure that each one of them will contain a similar number of elements. For example, the hash function might output a 32-bit number, so we know that each hashed value will be between 0 and 2<sup>32</sup>-1 (4,294,967,295), and from here it's pretty straightforward to find partition boundaries. For example, we can create 16 partitions numbered 0 to 15, each one containing 2<sup>28</sup> hash values (268,435,456).</p><div class="imageblock"><img src="/images/data-partitioning-and-consistent-hashing/hash_functions.jpg"></div><p>Routing is at this point very simple, as we can mathematically find the partition number given the hash. There are many ways to do this but two simple approaches are:</p><ul><li>using the integer division <code>hash(k) // partition_size</code>, e.g. <code>hash(k) // 2**28</code>. All keys from <code>0</code> to <code>268435455</code> end up in partition 0 (<code>268435455 // 2**28</code>), keys from <code>268435456</code> to <code>536870911</code> end up in partition 1, and so on.</li></ul><ul><li>using the modulo operator <code>hash(k) % number_of_partitions</code>, e.g. <code>hash(k) % 16</code>. 
This assigns values to partitions in a round robin fashion, where key <code>0</code> goes to partition 0 (<code>0%16</code>), key <code>1</code> to partition 1 (<code>1%16</code>), key <code>15</code> to partition 15 (<code>15%16</code>), and then starts again with key <code>16</code> which goes to partition 0 (<code>16%16</code>), and so on.</li></ul><p>This architecture has the clear advantage that thanks to the properties of hash functions, data is scattered evenly among the partitions. This means that when we query the system, requests will also be divided evenly, thus giving us a good distribution of the load.</p><p>As we will see later, however, this is not a good approach for dynamic systems.</p><h2 id="partitioning-use-cases-d7c3">Partitioning use cases<a class="headerlink" href="#partitioning-use-cases-d7c3" title="Permanent link">¶</a></h2><p>Hash functions are definitely interesting but they are not the perfect solution in every case. Let's have a brief look at three different types of systems that might benefit from partitioning and discuss their specific requirements.</p><h3 id="load-balancers-b5f9">Load balancers</h3><p>Pure load balancers solve a simple problem: to spread requests evenly across multiple <em>identical</em> servers. The key word here is "identical", as you cannot pick the wrong server, thus no routing can result in an error. However, spreading the load unevenly can result in performance loss, and possibly also service failure. For example, if a server gets overloaded queries might hit a timeout while waiting to be served.</p><p>For this reason, when load balancing is not content-aware, for example in a simple HTTP server scenario, round-robin partitioning is a good choice. The system just assigns new requests to servers on a rotation basis, which ensures perfectly even distribution. 
For example, this algorithm is the default choice for AWS Application Load Balancers.</p><p>Clearly, load balancers can be more complicated and feature-rich even without becoming content aware. The aforementioned AWS ALBs, for example, also support the "least outstanding requests" algorithm, which in simple words means choosing the server with the smallest workload.</p><h3 id="caches-27ec">Caches</h3><p>Caches are systems that temporarily store data whose retrieval is expensive, either for the user or for the provider. For example, if a system runs a long query on a database, caching the result will be beneficial both for the system and the database. The former benefits because a repeated run gets the result much faster, the latter because the repeated query adds no load.</p><p>Caches can be found everywhere and vary dramatically in size, but they are one of the best examples of systems that benefit from partitioning. As I mentioned before, their standard usage patterns don't include range queries and data loss (flushing) is part of their normal workflow.</p><p>A Content Delivery Network (CDN) is a specific type of cache that is distributed geographically. The purpose of the CDN nodes is to store content in a location that is physically near the users, thus increasing the performance of the system. This means that two geographically distinct nodes of a CDN contain the same values (replication), and the routing policy is solely based on the physical position of the user with respect to the node. Internally, each CDN node can be implemented using partitioning, though, which might improve the performance of that specific node.</p><h3 id="databases-14fd">Databases</h3><p>As for databases, I already mentioned that the most important problem is range queries or, if you prefer, content-aware partitioning. In general, you can't partition a database without taking into account the content, or you will incur severe performance losses. 
So, when it comes to databases, partitioning has to be the result of a specific design and can't be applied regardless of the database schema. </p><p>To better understand the challenge, let's consider a simple database whose elements are employees with a name and a salary. Now, if we want to partition this database we have to choose a key for the partitioning itself. It might be the primary key, the name, or the salary, as these are the only values available in each record.</p><p>Say we use hash functions to partition the database and use the employee salary as a key. Because of the properties of hash functions, employees with the exact same salary will end up being stored in the same partition, but employees with similar salaries might end up in different ones. This depends on the number of partitions, clearly, but the main point is that records that are "near" (according to the selected key) now are potentially very far.</p><div class="imageblock"><img src="/images/data-partitioning-and-consistent-hashing/hash_functions_and_range_queries.jpg"></div><p>In the example above I used MD5 as the routing hash function, and you can reproduce the calculations using the following Python code</p><div class="code"><div class="content"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">hashlib</span>
<span class="k">def</span> <span class="nf">hash_value</span><span class="p">(</span><span class="n">value</span><span class="p">):</span>
    <span class="k">return</span> <span class="nb">int</span><span class="p">(</span><span class="n">hashlib</span><span class="o">.</span><span class="n">md5</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">value</span><span class="p">)</span><span class="o">.</span><span class="n">encode</span><span class="p">(</span><span class="s2">"utf-8"</span><span class="p">))</span><span class="o">.</span><span class="n">hexdigest</span><span class="p">(),</span> <span class="mi">16</span><span class="p">)</span>
<span class="c1"># 57500283691658467528082923406379043196</span>
<span class="n">hash_value</span><span class="p">(</span><span class="mi">60000</span><span class="p">)</span>
<span class="c1"># 209589555716047624083879134729984902154</span>
<span class="n">hash_value</span><span class="p">(</span><span class="mi">60100</span><span class="p">)</span>
<span class="c1"># 12</span>
<span class="n">hash_value</span><span class="p">(</span><span class="mi">60000</span><span class="p">)</span> <span class="o">%</span> <span class="mi">16</span>
<span class="c1"># 10</span>
<span class="n">hash_value</span><span class="p">(</span><span class="mi">60100</span><span class="p">)</span> <span class="o">%</span> <span class="mi">16</span>
</pre></div> </div> </div><p>Things do not go much better if we use the integer division. If we have 16 partitions, each one of them contains 2<sup>124</sup> values</p><div class="code"><div class="content"><div class="highlight"><pre><span class="c1"># 2</span>
<span class="n">hash_value</span><span class="p">(</span><span class="mi">60000</span><span class="p">)</span> <span class="o">//</span> <span class="mi">2</span><span class="o">**</span><span class="mi">124</span>
<span class="c1"># 9</span>
<span class="n">hash_value</span><span class="p">(</span><span class="mi">60100</span><span class="p">)</span> <span class="o">//</span> <span class="mi">2</span><span class="o">**</span><span class="mi">124</span>
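
# A minimal sketch, not part of the original calculations: it counts how many
# of the 16 partitions a narrow salary range touches under integer-division
# routing. The helper name partition_of and the sampled range are assumptions
# made for illustration only.
import hashlib


def partition_of(salary):
    # Same routing as above: the 128-bit MD5 hash divided by the partition size
    digest = hashlib.md5(str(salary).encode("utf-8")).hexdigest()
    return int(digest, 16) // 2**124


# Salaries 60000, 60010, ..., 60100 are neighbours in the key space,
# yet their hashes may land in several different partitions
touched = {partition_of(salary) for salary in range(60000, 60101, 10)}
print(sorted(touched))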
</pre></div> </div> </div><p>Now, let's consider a query that selects all employees within a certain range of salaries. If the database is not partitioned, all records are kept on the same server, and if we optimised the system for such a query, the records will also be physically adjacent (e.g. stored in nearby memory addresses). This makes the query blazing fast, but if the database is partitioned the query has to collect values from multiple partitions which greatly penalises performance.</p><p>We can see a real example of this design challenge in the documentation of MongoDB, a non-relational database that supports partitioning (called <em>sharding</em>). MongoDB supports <a href="https://www.mongodb.com/docs/manual/core/hashed-sharding/">hashed sharding</a> and <a href="https://www.mongodb.com/docs/manual/core/ranged-sharding/">ranged sharding</a>. In their words</p><p><em>Hashed sharding uses either a single field hashed index or a compound hashed index as the shard key to partition data across your sharded cluster.</em></p><p><em>Range-based sharding involves dividing data into contiguous ranges determined by the shard key values. In this model, documents with "close" shard key values are likely to be in the same chunk or shard. This allows for efficient queries where reads target documents within a contiguous range. 
However, both read and write performance may decrease with poor shard key selection.</em></p><p>I highly recommend reading the two pages I linked above as they will give you a good idea of how a real system uses the concepts I introduced and what design challenges are involved when using partitioning.</p><h2 id="caching-and-scaling-strategies-90f4">Caching and scaling strategies<a class="headerlink" href="#caching-and-scaling-strategies-90f4" title="Permanent link">¶</a></h2><p>When we design distributed caches, an interesting problem we might face is that of scaling the system in and out to match the current load without wasting resources.</p><p>When the cache is under a light load we might want to run a small number of servers, but as soon as the number of requests increases we need to proportionally increase the number of cache nodes if we want to avoid a performance drop. This is usually not a big problem for partitioned databases, since in that case we change the number of partitions only occasionally to adjust performances or to increase the storage size, but caches like CDNs might need continuous adjustments during a single day.</p><p>Increasing or decreasing the number of nodes in a distributed cache might however be a pretty destructive action. Depending on the routing algorithm, if we add nodes (scale out) we might need to move data from existing ones to the newly added ones, and if we remove nodes (scale in) we will certainly lose the data contained in them. Both scenarios result in a (potentially massive) cache invalidation which can't be taken lightly.</p><p>The hash-based routing method presented in the previous section has terrible performances when it comes to scaling because any change in the number of servers impacts the key boundaries of the existing ones. 
Let's see a practical example of that and calculate the actual figures.</p><h3 id="scaling-out-with-hash-partitioning-d6de">Scaling out with hash partitioning</h3><p>Every time you consider a process or an algorithm you should have a look at how it behaves in the worst possible condition, to have a glimpse of what you might run into when you use it. For this reason, the following example considers a scale-out scenario in which all cache nodes are full. The best case is obviously when all nodes are empty, but in that case we don't need to scale out at all.</p><p>Let's consider a 32-bit hash function and 16 partitions numbered 0 to 15. Since the hash function space is 2<sup>32</sup> (4,294,967,296), each partition will contain 2<sup>28</sup> hash values (268,435,456). Each node is full, which means that all the possible 2<sup>28</sup> slots are assigned to a cached item, that is some data stored in the server that corresponds to that partition. The system is using the integer division routing strategy.</p><p>If we scale out to 17 partitions, increasing the pool by just 1 node, each node will now contain a smaller part of the global data space, as now we split it among more nodes. In particular, each node used to contain 1/16 of the global data (268,435,456), and will now contain 1/17 of it (approx. 252,645,135). Our biggest problem is now managing the transition between the initial setup and the new one.</p><p>The first node hosted 1/16 of the data space, the keys from <code>0</code> to <code>268435455</code>. It will now contain 1/17 of the data space, the keys from <code>0</code> to <code>252645134</code>. To simplify the example it is useful to convert everything into a common unit of measure: the node used to contain 17/272 of the space (1/16) and contains now 16/272 (1/17) of it.</p><p>This means that 1/272 of the whole data space has to be moved to the second node, corresponding to the keys from <code>252645135</code> to <code>268435455</code>. 
It is important to note that these keys cannot be moved to the newly added node, but have to be moved to the second node because the algorithm we use maps keys to nodes in order.</p><p>This means that the second node will receive 1/272 of the whole data space. Since it originally already contained 17/272 of the whole space it should now theoretically contain 18/272 of it. However, as it happened for the first node, we want to balance the content and reduce it to 16/272, so now we have 2/272 of the whole space that we want to move to the third node.</p><div class="imageblock"><img src="/images/data-partitioning-and-consistent-hashing/ripple_effect.jpg"></div><p>So, we move 1/272 from node 1 to node 2, 2/272 from node 2 to node 3, 3/272 from node 3 to node 4, and going on with the example we end up moving 16/272 (1/17) from the 16th node to the 17th, which fills it with the correct amount of keys. However, in doing so we moved 136/272 (1/272 + 2/272 + 3/272 + ... + 16/272) of the data space between nodes, which is exactly 50% of it.</p><p>So, for any initial size and a scale out of 1 single node, we have to move 50% of the data stored in our cache, and it might only get worse by increasing the number of final nodes until we end up having to move almost 100% of it (in an extreme case). 
A similar effect plagues the scale-in action, where one or more nodes are removed from the pool, and the keys they contain have to be migrated to the remaining nodes, creating a ripple effect to redistribute the keys according to the algorithm.</p><p>Using a modulo routing strategy doesn't change things: as I mentioned before, the core issue is that the addition of new nodes changes the routing of the whole data space, requiring a massive migration of keys in the entire system.</p><h2 id="a-different-approach-be6e">A different approach<a class="headerlink" href="#a-different-approach-be6e" title="Permanent link">¶</a></h2><p>While the idea of using hash functions looked very promising, we quickly found that the trivial implementation has very poor performances in a dynamic setting. As we clearly saw in the previous section, the problem is that upon scaling more than half of the keys have to be moved across nodes, so if we could find a way to avoid this we could still use hash functions to scatter data uniformly across the nodes.</p><p>As you might have already figured out, the issue comes from the attempt to keep all nodes perfectly balanced. The modulo and integer division algorithms distribute keys evenly (as long as the hash function has a good diffusion), but this is a double-edged sword. The balance is extremely beneficial in a static environment, but it is also the Achilles heel of this architecture when we change the number of nodes.</p><p>When we design a system, requirements are paramount. Everything we add to the final product should be there to satisfy one or more requirements. However, often requirements clash with each other, and trying to implement all of them at once might lead to situations where there is no apparent solution. 
In such cases, it is useful to temporarily drop one or more requirements and investigate the options we have, and this is exactly what we can do in this case: maintaining balance is an important feature, but let's see what would happen if we didn't have that requirement.</p><p>If we don't care about balancing nodes we can solve the problem with a different approach. Instead of using the integer division to find the slot, we can keep a table of the minimum hash served by each slot and route requests according to that. Each row of the table will have a minimum hash and the node that serves them.</p><div class="imageblock"><img src="/images/data-partitioning-and-consistent-hashing/hash_table.jpg"></div><p>This means that when we increase the number of slots we can just drop a new slot anywhere and assign to it all the keys that fall under its domain. This means that the new node will become the owner of keys that belonged to another node as it happened before, but with an important difference. Now all keys come from another single node, and the amount of keys moved is a fraction of those contained in it (which is much less than half of the keys). In the worst case, we need to move all keys contained in a node, which once again is much less than half of the keys.</p><div class="imageblock"><img src="/images/data-partitioning-and-consistent-hashing/hash_table_add_node.jpg"></div><p>As you can see, this relieves the load of one single node. According to what we said before, we are not trying to balance the load of the whole cluster. If we could use this technique to cover multiple spaces with a single added node, though, we could relieve the load of more than one other node. 
In principle this is simple: we just need to add multiple rows with the same node to the table.</p><div class="imageblock"><img src="/images/data-partitioning-and-consistent-hashing/hash_table_add_node_multiple.jpg"></div><p>Pay attention to the fact that we added multiple rows, that is multiple partitions, but they are all served by the same physical node. This has several advantages:</p><ul><li>It fills the new node with keys coming from several different nodes without rippling effects.</li><li>The key transfer load is spread among different nodes, noticeably hitting only the new node.</li></ul><p>There is also an interesting turn of events: since keys for the new node are fetched from several different existing nodes, the process will keep the cluster balanced! This is a remarkable outcome: we temporarily dropped a requirement and found a solution that provides that exact requirement in a different way.</p><p>The key part of this new process is the idea that multiple partitions can be served by the same node. The only missing part at this point is a way to identify the new partitions (the sets of hashes) served by the new node in a deterministic way.</p><h2 id="consistent-hashing-1397">Consistent hashing<a class="headerlink" href="#consistent-hashing-1397" title="Permanent link">¶</a></h2><p>Finally, let me introduce consistent hashing as a technique to implement the process described above.</p><p>As we discussed in the previous section, the only missing part is an algorithm that produces a deterministic set of hash ranges for a single new node. These hash ranges represent the partitions served by that node and should be scattered across the whole hash space. It is important for them to be spread because this way they will each receive some keys from existing nodes, instead of migrating a bulk of keys from a single one. 
The more evenly spread, the better the distribution of the load and the more balanced the resulting cluster.</p><p>As we saw previously, any time we need to scatter data across a given space in a deterministic way, hash functions are a good choice, and they can be used in this case as well. The idea is simple: <em>each partition of a node is assigned a name and this name is hashed with the same function used to hash the keys stored in the system</em>. This will produce a deterministic value in the hash space, and <em>that value will be the minimum value served by that partition</em>. Thanks to diffusion the names of all partitions will generate different hash values that won't easily clash, and this is the way we generate the routing table.</p><p>Let's see an example, bearing in mind that the specific function can change among implementations.</p><p>For simplicity's sake, I used a custom hash function that outputs 28-bit hashes (7 hexadecimal digits). This makes it possible to compare hashes visually and simplifies the example. To do this I took the first 7 digits of the SHA1 hash with the following Python code</p><div class="code"><div class="content"><div class="highlight"><pre><span class="k">def</span> <span class="nf">hash_name</span><span class="p">(</span><span class="n">name</span><span class="p">):</span>
    <span class="k">return</span> <span class="nb">int</span><span class="p">(</span><span class="n">hashlib</span><span class="o">.</span><span class="n">sha1</span><span class="p">(</span><span class="n">name</span><span class="o">.</span><span class="n">encode</span><span class="p">(</span><span class="s2">"utf-8"</span><span class="p">))</span><span class="o">.</span><span class="n">hexdigest</span><span class="p">()[:</span><span class="mi">7</span><span class="p">],</span> <span class="mi">16</span><span class="p">)</span>
</pre></div> </div> </div><p>thus creating a hash function whose values go from <code>0x0000000</code> to <code>0xfffffff</code>. At the end of the post you will find the Python code that I used to generate the following routing tables, and you are free to experiment using different settings.</p><p>WARNING: this is not a good hash function! SHA1 produces 160-bit hashes, so taking the first 28 bits reduces the hash space to a microscopic fraction of the total, as we go from 2<sup>160</sup> total hashes to 2<sup>28</sup>. Please keep in mind that this is done only to simplify the visualisation of the example.</p><p>All our nodes are called <code>server-X</code> with <code>X</code> being a letter of the English alphabet, thus giving us <code>server-a</code>, <code>server-b</code>, and so on. I decided to create 5 partitions per server, numbered from 0 to 4, whose names are generated by appending <code>-Y</code> to the name of the server, where <code>Y</code> is the number of the partition. For example (hashes are shown in decimal):</p><div class="code"><div class="content"><div class="highlight"><pre>server-a-0 -- hash --> 148456820
server-a-1 -- hash --> 57674441
server-a-2 -- hash --> 216250418
server-a-3 -- hash --> 30595746
server-a-4 -- hash --> 23746828
</pre></div> </div> </div><p>If we do this for two nodes (<code>server-a</code> and <code>server-b</code>) and then sort the results we will get a full routing table</p><div class="code"><div class="content"><div class="highlight"><pre> 23746828 --> server-a-4 ( 6848918 hashes)
30595746 --> server-a-3 (27078695 hashes)
57674441 --> server-a-1 ( 3228787 hashes)
60903228 --> server-b-2 (17957108 hashes)
78860336 --> server-b-0 ( 7773725 hashes)
86634061 --> server-b-4 (61822759 hashes)
148456820 --> server-a-0 (67793598 hashes)
216250418 --> server-a-2 (17304439 hashes)
233554857 --> server-b-3 (29289666 hashes)
262844523 --> server-b-1 ( 5590932 hashes)
</pre></div> </div> </div><p>Remember that each hash in the routing table is the minimum hash served by that partition. For example, the first line tells us that hashes starting from <code>23746828</code> are served by the partition <code>server-a-4</code>, until the next line takes over: hashes starting from <code>30595746</code> are served by the partition <code>server-a-3</code>. This means that the partition <code>server-a-4</code> serves 6848918 hashes (30595746 - 23746828, as you can read in the table). A key whose hash is <code>79249022</code> falls between <code>78860336</code> and <code>86634061</code>, so it will be served by <code>server-b-0</code></p><div class="code"><div class="content"><div class="highlight"><pre> 60903228 --> server-b-2 (17957108 hashes)
78860336 --> server-b-0 ( 7773725 hashes)
^
|
79249022 -----------+
86634061 --> server-b-4 (61822759 hashes)
148456820 --> server-a-0 (67793598 hashes)
</pre></div> </div> </div><p>Since partitions are not physically separated, but are just virtual entities belonging to a node, the routing table can be simplified by merging consecutive partitions served by the same node</p><div class="code"><div class="content"><div class="highlight"><pre> 23746828 -- > server-a (37156400 hashes)
60903228 -- > server-b (87553592 hashes)
148456820 -- > server-a (85098037 hashes)
233554857 -- > server-b (34880598 hashes)
</pre></div> </div> </div><hr><p>What we achieved is remarkable, but there are still two problems. Let's have a look at a simple routing table for three nodes with 5 partitions each</p><div class="code"><div class="title">3 nodes with 5 partitions each</div><div class="content"><div class="highlight"><pre> 23746828 --> server-a (23267855 hashes)
47014683 --> server-c (10659758 hashes)
57674441 --> server-a ( 3228787 hashes)
60903228 --> server-b (63557309 hashes)
124460537 --> server-c (23996283 hashes)
148456820 --> server-a (31382512 hashes)
179839332 --> server-c (36411086 hashes)
216250418 --> server-a (17304439 hashes)
233554857 --> server-b (15386579 hashes)
248941436 --> server-c (13903087 hashes)
262844523 --> server-b ( 5590932 hashes)
</pre></div> </div> </div><p>First, the lowest value is not 0, which means that there are some hashes (23,746,828 in this case) which are not served by any slot. Second, in general the distribution doesn't cover the space evenly, as some nodes receive too many keys compared to others. This second problem isn't actually visible in the setups I showed so far, but it becomes noticeable when we increase the number of servers. For example, with two nodes we have this situation</p><div class="code"><div class="title">2 nodes with 5 partitions each</div><div class="content"><div class="highlight"><pre>server-a: 122254437 hashes
server-b: 146181018 hashes
</pre></div> </div> </div><p>while with 5 nodes it becomes</p><div class="code"><div class="title">5 nodes with 5 partitions each</div><div class="content"><div class="highlight"><pre>server-a: 64211359 hashes
server-b: 66179053 hashes
server-c: 57545779 hashes
server-d: 43217324 hashes
server-e: 37281940 hashes
</pre></div> </div> </div><p>As you can see, in the second case the load of <code>server-e</code> is 56% of that of <code>server-b</code>.</p><hr><p>The first problem is easily solved by assigning the initial hashes to the last partition in the table, that is, by considering the hash space as mapped onto a circle. This means that for 2 nodes with 5 partitions each we have</p><div class="code"><div class="title">Routing table of 2 nodes with 5 partitions each</div><div class="content"><div class="highlight"><pre>Full routing table
<span class="hll"> 0 --> server-b-1 (23746828 hashes)
</span> 23746828 --> server-a-4 (6848918 hashes)
30595746 --> server-a-3 (27078695 hashes)
57674441 --> server-a-1 (3228787 hashes)
60903228 --> server-b-2 (17957108 hashes)
78860336 --> server-b-0 (7773725 hashes)
86634061 --> server-b-4 (61822759 hashes)
148456820 --> server-a-0 (67793598 hashes)
216250418 --> server-a-2 (17304439 hashes)
233554857 --> server-b-3 (29289666 hashes)
262844523 --> server-b-1 (5590932 hashes)
Simplified routing table
<span class="hll"> 0 -- > server-b (23746828 hashes)
</span> 23746828 -- > server-a (37156400 hashes)
60903228 -- > server-b (87553592 hashes)
148456820 -- > server-a (85098037 hashes)
233554857 -- > server-b (34880598 hashes)
</pre></div> </div> </div><p>where the partition <code>server-b-1</code> contains the orphaned initial hashes.</p><p>The second problem is a matter of statistical approach. The hash function that we use to map the partition name to the key space cannot be controlled, as its diffusion property has been designed to avoid a regular spacing of values. However, if we increase the number of partitions we expect the hash function to spread values across the whole space. At that point, each partition will be assigned just a tiny key space, and the differences between partitions will be less noticeable. In other words, by increasing the number of partitions dramatically we should achieve a better distribution. Let's compare the results of 5 nodes with 2 partitions each</p><div class="code"><div class="title">5 nodes with 2 partitions each</div><div class="content"><div class="highlight"><pre>server-a 36500586
server-b 76678431
server-c 31738329
server-d 56183426
server-e 67334683
</pre></div> </div> </div><p>with the results of 5 nodes with 3000 partitions each</p><div class="code"><div class="title">5 nodes with 3000 partitions each</div><div class="content"><div class="highlight"><pre>server-a 53385222
server-b 53855877
server-c 53755762
server-d 53597662
server-e 53840932
</pre></div> </div> </div><p>There is clearly an upper limit to the number of partitions that we can create. If we create more partitions than the possible number of hashes we will end up having empty ones and incurring routing errors as some of them will clash, but this is a purely theoretical case: using real-world hash functions we generate hashes of at least 160 bits, which means a codomain of 2<sup>160</sup> possible values (more than 10<sup>48</sup>). With 10,000 nodes (which is a considerable number of servers in 2022) the threshold would be greater than 10<sup>44</sup> partitions per server.</p><p>So far, we achieved great results, but we already managed to properly partition the space with simple techniques. The real power of consistent hashing is in the way it behaves in a dynamic setting.</p><h2 id="consistent-hashing-and-scaling-649a">Consistent hashing and scaling<a class="headerlink" href="#consistent-hashing-and-scaling-649a" title="Permanent link">¶</a></h2><p>The interesting thing about consistent hashing is its amazing behaviour in a dynamic environment. As you might remember, the problem with hash partitioning was that a change in the number of nodes had ripple effects that resulted in a massive migration of at least half the keys.</p><p>With consistent hashing, when we add a new node we need to generate the hash values for its partitions and put them in the routing table, and at that point we need to migrate the keys that fall under the domain of the newly created slots. Let's see an example before we discuss the performance.</p><p>The initial setup is 2 nodes with 5 partitions each</p><div class="code"><div class="title">2 nodes with 5 partitions</div><div class="content"><div class="highlight"><pre>Full routing table
0 --> server-b-1 (23746828 hashes)
23746828 --> server-a-4 (6848918 hashes)
30595746 --> server-a-3 (27078695 hashes)
57674441 --> server-a-1 (3228787 hashes)
60903228 --> server-b-2 (17957108 hashes)
78860336 --> server-b-0 (7773725 hashes)
86634061 --> server-b-4 (61822759 hashes)
148456820 --> server-a-0 (67793598 hashes)
216250418 --> server-a-2 (17304439 hashes)
233554857 --> server-b-3 (29289666 hashes)
262844523 --> server-b-1 (5590932 hashes)
Simplified routing table
0 -- > server-b (23746828 hashes)
23746828 -- > server-a (37156400 hashes)
60903228 -- > server-b (87553592 hashes)
148456820 -- > server-a (85098037 hashes)
233554857 -- > server-b (34880598 hashes)
Stats
server-a 122254437
server-b 146181018
TOTAL HASHES: 268435455/268435455
</pre></div> </div> </div><p>If we add one node we migrate to this new setup</p><div class="code"><div class="title">3 nodes with 5 partitions</div><div class="content"><div class="highlight"><pre>Full routing table
0 --> server-b-1 (23746828 hashes)
23746828 --> server-a-4 (6848918 hashes)
30595746 --> server-a-3 (16418937 hashes)
47014683 --> server-c-3 (10659758 hashes)
57674441 --> server-a-1 (3228787 hashes)
60903228 --> server-b-2 (17957108 hashes)
78860336 --> server-b-0 (7773725 hashes)
86634061 --> server-b-4 (37826476 hashes)
124460537 --> server-c-2 (23996283 hashes)
148456820 --> server-a-0 (31382512 hashes)
179839332 --> server-c-1 (25303093 hashes)
205142425 --> server-c-4 (11107993 hashes)
216250418 --> server-a-2 (17304439 hashes)
233554857 --> server-b-3 (15386579 hashes)
248941436 --> server-c-0 (13903087 hashes)
262844523 --> server-b-1 (5590932 hashes)
Simplified routing table
0 -- > server-b (23746828 hashes)
23746828 -- > server-a (23267855 hashes)
47014683 -- > server-c (10659758 hashes)
57674441 -- > server-a ( 3228787 hashes)
60903228 -- > server-b (63557309 hashes)
124460537 -- > server-c (23996283 hashes)
148456820 -- > server-a (31382512 hashes)
179839332 -- > server-c (36411086 hashes)
216250418 -- > server-a (17304439 hashes)
233554857 -- > server-b (15386579 hashes)
248941436 -- > server-c (13903087 hashes)
262844523 -- > server-b ( 5590932 hashes)
Stats
server-a 75183593
server-b 108281648
server-c 84970214
TOTAL HASHES: 268435455/268435455
</pre></div> </div> </div><p>Let's have a closer look at what happens with <code>server-c</code></p><div class="code"><div class="content"><div class="highlight"><pre>Simplified routing table
0 -- > server-b (23746828 hashes)
23746828 -- > server-a (23267855 hashes) ----+ 10659758 hashes
| from server-a
47014683 -- > server-c (10659758 hashes) <---+
57674441 -- > server-a ( 3228787 hashes)
60903228 -- > server-b (63557309 hashes) ----+ 23996283 hashes
| from server-b
124460537 -- > server-c (23996283 hashes) <---+
148456820 -- > server-a (31382512 hashes) ----+ 36411086 hashes
| from server-a
179839332 -- > server-c (36411086 hashes) <---+
216250418 -- > server-a (17304439 hashes)
233554857 -- > server-b (15386579 hashes) ----+ 13903087 hashes
| from server-b
248941436 -- > server-c (13903087 hashes) <---+
262844523 -- > server-b ( 5590932 hashes)
</pre></div> </div> </div><p>Globally, <code>server-c</code> receives 47,070,844 hashes from <code>server-a</code> and 37,899,370 hashes from <code>server-b</code>, which results in a migration of approximately 32% of the total hashes (84,970,214 out of 268,435,455). As you can see there is no ripple effect here, as the boundaries of the existing partitions do not change.</p><p>Let's consider the performance in the worst case when we add one single node. If we are terribly unlucky (and we use a hash function with clear issues) each partition of the new node will completely cover a partition of an existing node. Assuming that the initial setup with N nodes created a balanced cluster, each node contains 1/Nth of the total keys, and in the worst case we need to move all of them from an existing node to the newly added one.</p><p>So, adding one node to a cluster of N nodes using consistent hashing results, in the worst case, in the migration of 1/Nth of the keys. In the previous example, then, we expected to migrate <em>at most</em> 50% of the keys (1/2), and we ended up migrating about 32% of them.</p><p>This is a terrific result. Not only is it much better than the previous one (<em>at least</em> 50% of the keys), but it gets better as the size of the cluster increases. In a cluster with 100 nodes, adding a node will result (in the worst case!) in the migration of 1/100 of the keys.</p><h2 id="source-code-b277">Source code<a class="headerlink" href="#source-code-b277" title="Permanent link">¶</a></h2><p>All routing tables shown in the post have been created with the following Python script. Please bear in mind that this is just demo code, so things haven't been optimised or designed particularly well. 
Feel free to change the hash function and the parameters of the script to experiment and see what consistent hashing can do.</p><div class="code"><div class="title"><code>consistent_hashing_demo.py</code></div><div class="content"><div class="highlight"><pre><span class="kn">import</span> <span class="nn">hashlib</span>
<span class="kn">import</span> <span class="nn">itertools</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">string</span>
<span class="kn">from</span> <span class="nn">operator</span> <span class="kn">import</span> <span class="n">itemgetter</span>
<span class="n">NUM_NODES</span> <span class="o">=</span> <span class="mi">3</span>
<span class="n">NUM_PARTITIONS</span> <span class="o">=</span> <span class="mi">5</span>
<span class="k">def</span> <span class="nf">hash_name</span><span class="p">(</span><span class="n">name</span><span class="p">):</span>
    <span class="n">encoded_name</span> <span class="o">=</span> <span class="n">name</span><span class="o">.</span><span class="n">encode</span><span class="p">(</span><span class="s2">"utf-8"</span><span class="p">)</span>
    <span class="n">hash_encoded_name</span> <span class="o">=</span> <span class="n">hashlib</span><span class="o">.</span><span class="n">sha1</span><span class="p">(</span><span class="n">encoded_name</span><span class="p">)</span><span class="o">.</span><span class="n">hexdigest</span><span class="p">()</span>
    <span class="k">return</span> <span class="nb">int</span><span class="p">(</span><span class="n">hash_encoded_name</span><span class="p">[:</span><span class="mi">7</span><span class="p">],</span> <span class="mi">16</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">create_partitions</span><span class="p">(</span><span class="n">node_name</span><span class="p">,</span> <span class="n">partitions</span><span class="p">):</span>
    <span class="n">partition_hashes</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">partition_number</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">partitions</span><span class="p">):</span>
        <span class="n">partition_name</span> <span class="o">=</span> <span class="sa">f</span><span class="s2">"</span><span class="si">{</span><span class="n">node_name</span><span class="si">}</span><span class="s2">-</span><span class="si">{</span><span class="n">partition_number</span><span class="si">}</span><span class="s2">"</span>
        <span class="n">partition_hash</span> <span class="o">=</span> <span class="n">hash_name</span><span class="p">(</span><span class="n">partition_name</span><span class="p">)</span>
        <span class="n">partition_hashes</span><span class="o">.</span><span class="n">append</span><span class="p">(</span>
            <span class="p">{</span>
                <span class="s2">"min_hash"</span><span class="p">:</span> <span class="n">partition_hash</span><span class="p">,</span>
                <span class="s2">"partition_name"</span><span class="p">:</span> <span class="n">partition_name</span><span class="p">,</span>
                <span class="s2">"node_name"</span><span class="p">:</span> <span class="n">node_name</span><span class="p">,</span>
            <span class="p">}</span>
        <span class="p">)</span>
    <span class="k">return</span> <span class="n">partition_hashes</span>
<span class="k">def</span> <span class="nf">create_routing_table</span><span class="p">(</span><span class="n">node_names</span><span class="p">,</span> <span class="n">partitions</span><span class="p">):</span>
    <span class="n">table</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">node_name</span> <span class="ow">in</span> <span class="n">node_names</span><span class="p">:</span>
        <span class="n">table</span><span class="o">.</span><span class="n">extend</span><span class="p">(</span><span class="n">create_partitions</span><span class="p">(</span><span class="n">node_name</span><span class="p">,</span> <span class="n">partitions</span><span class="p">))</span>
    <span class="n">table</span> <span class="o">=</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">table</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">itemgetter</span><span class="p">(</span><span class="s2">"min_hash"</span><span class="p">))</span>
    <span class="k">return</span> <span class="n">table</span>
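# ---------------------------------------------------------------------
# Illustrative sketch, NOT part of the original script.
# The post explains that each row of the routing table stores the
# minimum hash served by a partition, and that hashes below the first
# row wrap around to the last one (the hash space is a circle).
# Assuming the table is kept as two parallel lists sorted by minimum
# hash, a binary search can find the owner of a key hash. The name
# sketch_find_owner and its parameters are hypothetical, chosen only
# for this example.
# ---------------------------------------------------------------------
import bisect


def sketch_find_owner(min_hashes, owners, key_hash):
    # Index of the last row whose minimum hash does not exceed key_hash.
    position = bisect.bisect_right(min_hashes, key_hash) - 1
    # When key_hash is below every minimum, position is -1 and negative
    # indexing selects the last row: this is the wrap-around of the circle.
    return owners[position]


# Usage with the simplified two-node table shown in the post (without
# the explicit row for hash 0, to show the wrap-around).
sketch_min_hashes = [23746828, 60903228, 148456820, 233554857]
sketch_owners = ["server-a", "server-b", "server-a", "server-b"]
# The key hash 79249022 used in the post is routed to server-b.
sketch_owner = sketch_find_owner(sketch_min_hashes, sketch_owners, 79249022)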
<span class="k">if</span> <span class="n">NUM_NODES</span> <span class="o">></span> <span class="nb">len</span><span class="p">(</span><span class="n">string</span><span class="o">.</span><span class="n">ascii_lowercase</span><span class="p">):</span>
    <span class="nb">print</span><span class="p">(</span><span class="s2">"Too many servers"</span><span class="p">)</span>
    <span class="n">sys</span><span class="o">.</span><span class="n">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
<span class="n">nodes</span> <span class="o">=</span> <span class="p">[</span><span class="sa">f</span><span class="s2">"server-</span><span class="si">{</span><span class="n">i</span><span class="si">}</span><span class="s2">"</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">string</span><span class="o">.</span><span class="n">ascii_lowercase</span><span class="p">[:</span><span class="n">NUM_NODES</span><span class="p">]]</span>
<span class="n">routing_table</span> <span class="o">=</span> <span class="n">create_routing_table</span><span class="p">(</span><span class="n">nodes</span><span class="p">,</span> <span class="n">NUM_PARTITIONS</span><span class="p">)</span>
<span class="n">routing_table</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">{</span>
        <span class="s2">"min_hash"</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
        <span class="s2">"partition_name"</span><span class="p">:</span> <span class="n">routing_table</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">][</span><span class="s2">"partition_name"</span><span class="p">],</span>
        <span class="s2">"node_name"</span><span class="p">:</span> <span class="n">routing_table</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">][</span><span class="s2">"node_name"</span><span class="p">],</span>
    <span class="p">}</span>
<span class="p">]</span> <span class="o">+</span> <span class="n">routing_table</span>
<span class="n">routing_table_shift</span> <span class="o">=</span> <span class="n">routing_table</span><span class="p">[</span><span class="mi">1</span><span class="p">:]</span> <span class="o">+</span> <span class="p">[</span>
    <span class="p">{</span><span class="s2">"min_hash"</span><span class="p">:</span> <span class="mh">0xFFFFFFF</span><span class="p">,</span> <span class="s2">"partition_name"</span><span class="p">:</span> <span class="s2">"END"</span><span class="p">}</span>
<span class="p">]</span>
<span class="n">full_routing_table</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">routing_table</span><span class="p">,</span> <span class="n">routing_table_shift</span><span class="p">):</span>
    <span class="n">full_routing_table</span><span class="o">.</span><span class="n">append</span><span class="p">(</span>
        <span class="p">{</span>
            <span class="s2">"min_hash"</span><span class="p">:</span> <span class="n">i</span><span class="p">[</span><span class="s2">"min_hash"</span><span class="p">],</span>
            <span class="s2">"partition_name"</span><span class="p">:</span> <span class="n">i</span><span class="p">[</span><span class="s2">"partition_name"</span><span class="p">],</span>
            <span class="s2">"node_name"</span><span class="p">:</span> <span class="n">i</span><span class="p">[</span><span class="s2">"node_name"</span><span class="p">],</span>
            <span class="s2">"served_hashes"</span><span class="p">:</span> <span class="n">j</span><span class="p">[</span><span class="s2">"min_hash"</span><span class="p">]</span> <span class="o">-</span> <span class="n">i</span><span class="p">[</span><span class="s2">"min_hash"</span><span class="p">],</span>
        <span class="p">}</span>
    <span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"Full routing table"</span><span class="p">)</span>
<span class="k">for</span> <span class="n">r</span> <span class="ow">in</span> <span class="n">full_routing_table</span><span class="p">:</span>
    <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">'</span><span class="si">{</span><span class="n">r</span><span class="p">[</span><span class="s2">"min_hash"</span><span class="p">]</span><span class="si">:</span><span class="s1">9</span><span class="si">}</span><span class="s1"> --> </span><span class="si">{</span><span class="n">r</span><span class="p">[</span><span class="s2">"partition_name"</span><span class="p">]</span><span class="si">}</span><span class="s1"> (</span><span class="si">{</span><span class="n">r</span><span class="p">[</span><span class="s2">"served_hashes"</span><span class="p">]</span><span class="si">}</span><span class="s1"> hashes)'</span><span class="p">)</span>
<span class="n">grouped_routing_table</span> <span class="o">=</span> <span class="n">itertools</span><span class="o">.</span><span class="n">groupby</span><span class="p">(</span>
    <span class="n">full_routing_table</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">itemgetter</span><span class="p">(</span><span class="s2">"node_name"</span><span class="p">)</span>
<span class="p">)</span>
<span class="n">simplified_routing_table</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">r</span> <span class="ow">in</span> <span class="n">grouped_routing_table</span><span class="p">:</span>
    <span class="n">consecutive_partitions</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">r</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
    <span class="n">simplified_routing_table</span><span class="o">.</span><span class="n">append</span><span class="p">(</span>
        <span class="p">{</span>
            <span class="s2">"node_name"</span><span class="p">:</span> <span class="n">r</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span>
            <span class="s2">"min_hash"</span><span class="p">:</span> <span class="n">consecutive_partitions</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="s2">"min_hash"</span><span class="p">],</span>
            <span class="s2">"served_hashes"</span><span class="p">:</span> <span class="nb">sum</span><span class="p">([</span><span class="n">i</span><span class="p">[</span><span class="s2">"served_hashes"</span><span class="p">]</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">consecutive_partitions</span><span class="p">]),</span>
        <span class="p">}</span>
    <span class="p">)</span>
<span class="nb">print</span><span class="p">()</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"Simplified routing table"</span><span class="p">)</span>
<span class="k">for</span> <span class="n">r</span> <span class="ow">in</span> <span class="n">simplified_routing_table</span><span class="p">:</span>
    <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">'</span><span class="si">{</span><span class="n">r</span><span class="p">[</span><span class="s2">"min_hash"</span><span class="p">]</span><span class="si">:</span><span class="s1">9</span><span class="si">}</span><span class="s1"> -- > </span><span class="si">{</span><span class="n">r</span><span class="p">[</span><span class="s2">"node_name"</span><span class="p">]</span><span class="si">}</span><span class="s1"> (</span><span class="si">{</span><span class="n">r</span><span class="p">[</span><span class="s2">"served_hashes"</span><span class="p">]</span><span class="si">:</span><span class="s1">8</span><span class="si">}</span><span class="s1"> hashes)'</span><span class="p">)</span>
<span class="nb">print</span><span class="p">()</span>
<span class="nb">print</span><span class="p">(</span><span class="s2">"Stats"</span><span class="p">)</span>
<span class="n">stats</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">node</span> <span class="ow">in</span> <span class="n">nodes</span><span class="p">:</span>
    <span class="n">slots</span> <span class="o">=</span> <span class="nb">filter</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">x</span><span class="p">[</span><span class="s2">"node_name"</span><span class="p">]</span> <span class="o">==</span> <span class="n">node</span><span class="p">,</span> <span class="n">simplified_routing_table</span><span class="p">)</span>
    <span class="n">total_hashes</span> <span class="o">=</span> <span class="nb">sum</span><span class="p">([</span><span class="n">i</span><span class="p">[</span><span class="s2">"served_hashes"</span><span class="p">]</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">slots</span><span class="p">])</span>
    <span class="n">stats</span><span class="o">.</span><span class="n">append</span><span class="p">({</span><span class="s2">"node_name"</span><span class="p">:</span> <span class="n">node</span><span class="p">,</span> <span class="s2">"served_hashes"</span><span class="p">:</span> <span class="n">total_hashes</span><span class="p">})</span>
<span class="k">for</span> <span class="n">r</span> <span class="ow">in</span> <span class="n">stats</span><span class="p">:</span>
    <span class="nb">print</span><span class="p">(</span><span class="n">r</span><span class="p">[</span><span class="s2">"node_name"</span><span class="p">],</span> <span class="n">r</span><span class="p">[</span><span class="s2">"served_hashes"</span><span class="p">])</span>
<span class="n">total_hashes</span> <span class="o">=</span> <span class="nb">sum</span><span class="p">([</span><span class="n">i</span><span class="p">[</span><span class="s2">"served_hashes"</span><span class="p">]</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">stats</span><span class="p">])</span>
<span class="nb">print</span><span class="p">()</span>
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"TOTAL HASHES: </span><span class="si">{</span><span class="n">total_hashes</span><span class="si">}</span><span class="s2">/</span><span class="si">{</span><span class="mi">2</span><span class="o">**</span><span class="mi">28</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="mi">1</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span>
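# ---------------------------------------------------------------------
# Illustrative sketch, NOT part of the original script.
# The post states that adding a node moves only the hashes that fall in
# the partitions of the new node. Assuming a routing table is a sorted
# list of (min_hash, node_name) pairs whose first row starts at hash 0,
# the hypothetical helper sketch_moved_hashes counts the hashes whose
# owner differs between two tables. All sketch_* names are assumptions
# made for this example only.
# ---------------------------------------------------------------------
import bisect


def sketch_moved_hashes(old_table, new_table, space_size=2**28):
    def owner(table, value):
        # Node of the last row whose minimum hash does not exceed value.
        minimums = [row[0] for row in table]
        return table[bisect.bisect_right(minimums, value) - 1][1]

    # Collect every boundary of either table plus the end of the space.
    starts = {row[0] for row in old_table} | {row[0] for row in new_table}
    boundaries = sorted(starts | {space_size})

    moved = 0
    for start, end in zip(boundaries, boundaries[1:]):
        # The owner is constant inside each interval, so comparing the
        # owners of its first hash is enough.
        if owner(old_table, start) != owner(new_table, start):
            moved += end - start
    return moved


# Simplified tables copied from the post (2 nodes, then 3 nodes).
sketch_old = [
    (0, "server-b"),
    (23746828, "server-a"),
    (60903228, "server-b"),
    (148456820, "server-a"),
    (233554857, "server-b"),
]
sketch_new = [
    (0, "server-b"),
    (23746828, "server-a"),
    (47014683, "server-c"),
    (57674441, "server-a"),
    (60903228, "server-b"),
    (124460537, "server-c"),
    (148456820, "server-a"),
    (179839332, "server-c"),
    (216250418, "server-a"),
    (233554857, "server-b"),
    (248941436, "server-c"),
    (262844523, "server-b"),
]
# Every moved hash ends up on server-c: 84970214 hashes in total,
# which matches the stats printed for server-c in the post.
sketch_moved = sketch_moved_hashes(sketch_old, sketch_new)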
</pre></div> </div> </div><h2 id="final-words-9803">Final words<a class="headerlink" href="#final-words-9803" title="Permanent link">¶</a></h2><p>I hope this long post was useful to introduce you to the topic of partitioning and in general to system design. As I mentioned, such concepts are currently in use by well-known systems, and still discussed as none of them is perfect, so it is worth understanding the fundamental issues before adopting a specific solution.</p><h2 id="resources-edc5">Resources<a class="headerlink" href="#resources-edc5" title="Permanent link">¶</a></h2><ul><li>Martin Kleppmann, <em>Designing Data-Intensive Applications</em>, Chapter 6 "Partitioning", O’Reilly 2017 <a href="https://www.oreilly.com/library/view/designing-data-intensive-applications/9781491903063/">official site</a>.</li><li>The <a href="https://en.wikipedia.org/wiki/Consistent_hashing">Wikipedia article</a> about consistent hashing.</li><li><a href="https://www.toptal.com/big-data/consistent-hashing">A Guide to Consistent Hashing</a> by Juan Pablo Carzolio.</li><li>The <a href="https://www.cs.princeton.edu/courses/archive/fall09/cos518/papers/chash.pdf">original article</a> by David Karger et al.: "Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web".</li><li>An <a href="https://arxiv.org/pdf/1406.2294.pdf">alternative algorithm</a> by John Lamping and Eric Veach: "A Fast, Minimal Memory, Consistent Hash Algorithm".</li></ul><h2 id="feedback-d845">Feedback<a class="headerlink" href="#feedback-d845" title="Permanent link">¶</a></h2><p>Feel free to reach me on <a href="https://twitter.com/thedigicat">Twitter</a> if you have questions. 
The <a href="https://github.com/TheDigitalCatOnline/blog_source/issues">GitHub issues</a> page is the best place to submit corrections.</p>Public key cryptography: OpenSSH private keys2021-06-03T14:00:00+01:002021-06-03T14:00:00+01:00Leonardo Giordanitag:www.thedigitalcatonline.com,2021-06-03:/blog/2021/06/03/public-key-cryptography-openssh-private-keys/<p>An in-depth discussion of the format of OpenSSH keys</p><p>When you create standard RSA keys with <code>ssh-keygen</code> you end up with a private key in PEM format, and a public key in OpenSSH format. Both have been described in detail in my post <a href="https://www.thedigitalcatonline.com/blog/2018/04/25/rsa-keys/">Public key cryptography: RSA keys</a>. In 2014, OpenSSH introduced a custom format for private keys that is apparently similar to PEM but is internally completely different. This format is used by default when you create ed25519 keys and it is expected to be the default format for all keys in the future, so it is worth having a look.</p><p>While investigating this topic I found a lot of misconceptions and wrong or partially wrong statements on Stack Overflow, so I hope this might be a comprehensive view of what this format is, its relationship with PEM, and the tools that you can use to manipulate it.</p><p>I'm not the first programmer to look into this, clearly, and I have to mention two posts that I read before writing this one: <a href="https://peterlyons.com/problog/2017/12/openssh-ed25519-private-key-file-format/">OpenSSH ed25519 private key file format</a> written in December 2017 by Peter Lyons and <a href="http://dnaeon.github.io/openssh-private-key-binary-format/">The OpenSSH private key binary format</a>, written in August 2020 by Marin Atanasov Nikolov. I'm sure many others have done this research but these are the resources that I found and I want to say a big thanks to both authors for sharing their findings. 
I will shamelessly use their results in the following explanation, as I hope others will do with what I'm writing here. Sharing knowledge is one of the best ways to help others.</p><p>Please note that all the private keys shown in this post were trashed after I published it.</p><p>Note: as the word "key" can identify several different components of the systems I will describe, I will use, as much as possible, the words "private key" and "encryption key". The first is the key that we generate to be used in SSH, while the second is a parameter of a (symmetric) encryption algorithm.</p><h2 id="kdfs-and-protection-at-rest-523a">KDFs and protection at rest<a class="headerlink" href="#kdfs-and-protection-at-rest-523a" title="Permanent link">¶</a></h2><p>Describing the introduction of the new format, the <a href="https://www.openssh.com/txt/release-6.5">OpenSSH changelog</a> says</p><div class="code"><div class="content"><div class="highlight"><pre>Add a new private key format that uses a bcrypt KDF to better
protect keys at rest. This format is used unconditionally for
Ed25519 keys, but may be requested when generating or saving
existing keys of other types via the -o ssh-keygen(1) option.
We intend to make the new format the default in the near future.
Details of the new format are in the PROTOCOL.key file.
</pre></div> </div> </div><p>Before we start dissecting the format, then, it is worth briefly discussing what a KDF is, what bcrypt is, and what it means to protect keys at rest.</p><h3 id="key-derivation-functions-decf">Key Derivation Functions</h3><p>Whenever a system is protected by a password you need to store the latter somewhere. This is clearly necessary to check the validity of the passwords that the user inputs and decide if you should grant access, but you shouldn't store the password in clear text, as a breach in the storage might compromise the whole system. The idea behind storing passwords securely is to run them through a hash function and store the hash: whenever someone inputs a password we can run the hash function again and compare the two hashes. However, we also want to prevent the attacker from being able to reconstruct the password from the hash, so we need a <em><a href="https://en.wikipedia.org/wiki/Cryptographic_hash_function">cryptographic hash function</a></em>, which is a hash function with added requirements to prevent an easy inversion of the process.</p><p>The same strategy can be applied when it comes to encryption. An encryption system needs a key (a sequence of bits used to encrypt the message) and we need to derive it from the password given by the user. Encryption keys are required to have a specific length dictated by the encryption algorithm that we use, so hashing looks like a good solution, as all hashes generated by a given algorithm are by definition of the same size. <a href="https://en.wikipedia.org/wiki/Advanced_Encryption_Standard">AES</a>, for example, one of the most widespread symmetric block ciphers, uses a key of exactly 128, 192, or 256 bits. Converting the password into a key of predetermined size is called <em>stretching</em>.</p><p>Any cryptographic system can be broken using a brute-force attack, as you can always test all possible inputs. 
In the case of login, we can just input all possible passwords until we get access to the system, while in the case of encryption we can try to decrypt using all possible keys until we obtain a meaningful result. This means that the most important thing we can do to protect such systems is to make brute-force attacks infeasible. This can be done by increasing the key size (using more bits) but also by using a slow stretching algorithm.</p><p>While hash functions created for things like digital signatures should be fast, then, hash functions that we use to obfuscate the password (for storage) or to create the key (for encryption/decryption) have to be very slow. The slowness of the processing can frustrate brute-force attacks and make them less effective if not infeasible. An example: at the current state of technology, you can easily hash 1 trillion passwords a second with a trivial expense, but if each one of those hashes takes 1 second, testing the same trillion passwords takes more than 31,000 years.</p><p>The process that converts a password into a key is called <em><a href="https://en.wikipedia.org/wiki/Key_derivation_function">Key Derivation Function</a></em> (KDF) and despite the name it is usually a complex algorithm and not a single mathematical function. <a href="https://en.wikipedia.org/wiki/PBKDF2">PBKDF2</a> is an important KDF, defined as part of the specification <a href="https://datatracker.ietf.org/doc/html/rfc2898">PKCS #5</a>, and it can use any pseudorandom function as part of the key stretching. An important feature of PBKDF2 is that it accepts an iteration count as input, which makes it possible to slow down the process. 
As we just saw, this is the key to making the algorithm slower in order to adapt to the increasing computing power available to attackers.</p><h3 id="bcrypt-46df">bcrypt</h3><p>The password-hashing function known as <a href="https://en.wikipedia.org/wiki/Bcrypt">bcrypt</a> was created in 1999 and is based on the <a href="https://en.wikipedia.org/wiki/Blowfish_(cipher)">Blowfish</a> cipher created in 1993. Bcrypt is well known to be an extremely good choice thanks to the simple fact that its slowness can be increased by tuning one of the parameters of the algorithm called "cost factor". This determines the number of iterations done in the setup of the underlying cipher, and its logarithmic nature makes it easy to adapt the whole process to the increasing computational power available to attackers. <a href="https://auth0.com/blog/hashing-in-action-understanding-bcrypt/">This post</a> attempts to estimate the time to hash a password of 15 characters with a cost of 30 (the maximum is actually 31) with a decent 2017 laptop (2.8 GHz Intel Core i7 16 GB RAM). The result turns out to be around 500 days, which makes you understand that bcrypt won't die easily. It is important to note here that bcrypt is not a KDF, but a hash function. As such, it might be part of a KDF, but not replace the whole process.</p><h3 id="protection-at-rest-9c52">Protection at rest</h3><p>Protection <a href="https://en.wikipedia.org/wiki/Data_at_rest">at rest</a> refers to the scheme that ensures data is secure when it is stored. Practically speaking, when it comes to SSH keys, we refer to the fact that an attacker that can physically access a key, for example by stealing a laptop, actually owns an encrypted version of the key, which can't be used without first decrypting it. As the attacker is assumed not to know the password used to encrypt the key, the only strategy they can use is to brute-force the key, and here is where the concept of protection at rest comes into play. 
Actually, the <a href="https://xkcd.com/538/">other strategy</a> they can employ is to kidnap you and to force you to reveal the password, but this somehow falls outside the sphere of cryptographic security. </p><h2 id="pem-format-and-protection-at-rest-aafc">PEM format and protection at rest<a class="headerlink" href="#pem-format-and-protection-at-rest-aafc" title="Permanent link">¶</a></h2><p>Now that I have clarified some terminology, let's have a look at what the standard PEM format does to store encrypted private keys. As I explained in my post <a href="https://www.thedigitalcatonline.com/blog/2018/04/25/rsa-keys/">Public key cryptography: RSA keys</a> a PEM file contains a text header, a text footer, and some content. The content is always an ASN.1 structure created using DER and encoded using base64.</p><p>For encrypted private keys, the ASN.1 structure is created following a standard called <a href="https://datatracker.ietf.org/doc/html/rfc5208">PKCS #8</a>. This standard uses an encryption scheme called PBES2 described in the specification PKCS #5, which uses a symmetric cipher and a password, previously converted into an encryption key using the KDF called PBKDF2. I hope at this point some if not all of these names ring a bell.</p><p>We can roughly sketch the process with the following steps:</p><ul><li>Create the private key using the requested asymmetric algorithm (e.g. RSA or ED25519)</li><li>Encrypt the private key following PBES2<ul><li>Stretch the password into an encryption key using PBKDF2 with one of the possible hash functions and a random salt value</li><li>Encrypt the private key using the newly created encryption key</li></ul></li><li>Represent the encrypted key and the parameters used for PBKDF2 using ASN.1/DER</li><li>Encode the result with base64</li><li>Add a header and a footer that specify the nature of the content</li></ul><p>Let's create an encrypted key with OpenSSL and analyse it. 
The command I used is</p><div class="code"><div class="content"><div class="highlight"><pre>openssl genpkey -aes-256-cbc -algorithm RSA\
-pkeyopt rsa_keygen_bits:4096 -pass pass:foobar\
-out key_rsa_4096_openssl_pw
</pre></div> </div> </div><p>which creates a 4096-bit RSA key and encrypts it with AES using <code>foobar</code> as the password. What I get is a file in the aforementioned PEM format</p><div class="code"><div class="content"><div class="highlight"><pre>-----BEGIN ENCRYPTED PRIVATE KEY-----
MIIJrTBXBgkqhkiG9w0BBQ0wSjApBgkqhkiG9w0BBQwwHAQIW+BK6UQtCPACAggA
MAwGCCqGSIb3DQIJBQAwHQYJYIZIAWUDBAEqBBCIvU4FD31mkYR76ugTEhuwBIIJ
UJPHGeObOC1lHMrTTKhdyiekEcJhCO3rzP/gqVpqXkjhUASTWEsE9LEcuGKdrzAN
Dsy/WL9revg9UAQtGAk8WTSqWhv5JaCC4FqLGirqLMzhU51Jf4GbmCOWAWGP7TZu
[...]
QEfBUexTcFVf13cVX7LFGOAZ3yIvFc3sfl5nyYY9Nerk8MxUOW+9Ck5loTEzMj9j
xJf5RsNvcoGVg33Rf7vl2xFIAD+PFdehd8n2CveQ48LJ9Zfn0gsRPQrPL+02Nlhu
7f44uW/Vq2YqG3PN1n8GUTexvF/qCKkd2T2QmHYnK9cryRn0xHvzSjSsQls170sA
Svu0sdTwh1tIs/sxRGuSta+iXPfHJnW4sZzh/2lAMvkgML6h9JAeIYV6e/qUqYSq
GxSfj7s0Qs0K5e3Xv1lCQUhSz82fBysznjeAhWa45YEV
-----END ENCRYPTED PRIVATE KEY-----
</pre></div> </div> </div><p>We can dump the ASN.1 content directly from the PEM format using <code>openssl asn1parse</code></p><div class="code"><div class="content"><div class="highlight"><pre>$ openssl asn1parse -inform pem -in key_rsa_4096_openssl_pw
0:d=0 hl=4 l=2477 cons: SEQUENCE
4:d=1 hl=2 l= 87 cons: SEQUENCE
6:d=2 hl=2 l= 9 prim: OBJECT :PBES2 <span class="callout">1</span>
17:d=2 hl=2 l= 74 cons: SEQUENCE
19:d=3 hl=2 l= 41 cons: SEQUENCE
21:d=4 hl=2 l= 9 prim: OBJECT :PBKDF2 <span class="callout">2</span>
32:d=4 hl=2 l= 28 cons: SEQUENCE
34:d=5 hl=2 l= 8 prim: OCTET STRING [HEX DUMP]:5BE04AE9442D08F0 <span class="callout">4</span>
44:d=5 hl=2 l= 2 prim: INTEGER :0800 <span class="callout">5</span>
48:d=5 hl=2 l= 12 cons: SEQUENCE
50:d=6 hl=2 l= 8 prim: OBJECT :hmacWithSHA256 <span class="callout">6</span>
60:d=6 hl=2 l= 0 prim: NULL
62:d=3 hl=2 l= 29 cons: SEQUENCE
64:d=4 hl=2 l= 9 prim: OBJECT :aes-256-cbc <span class="callout">3</span>
75:d=4 hl=2 l= 16 prim: OCTET STRING [HEX DUMP]:88BD4E050F7D6691847BEAE813121BB0
93:d=1 hl=4 l=2384 prim: OCTET STRING [HEX DUMP]:93C719E39B382D[...]
</pre></div> </div> </div><p>Please note that I truncated the final <code>OCTET STRING</code> that contains the encrypted key as it is pretty long.</p><p>You can clearly see that this key is encrypted using PBES2 <span class="callout">1</span> and PBKDF2 <span class="callout">2</span>. The algorithm used to encrypt the key is <code>aes-256-cbc</code> <span class="callout">3</span>, as I asked. Specifically, this is AES with a key of 256 bits in <a href="https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Cipher_block_chaining_(CBC)">CBC mode</a>.</p><p>According to the <a href="https://datatracker.ietf.org/doc/html/rfc2898#appendix-A.4">PKCS #5 specification</a>, the <code>PBES2</code> block contains</p><div class="code"><div class="content"><div class="highlight"><pre>PBES2-params ::= SEQUENCE {
keyDerivationFunc AlgorithmIdentifier {{PBES2-KDFs}},
encryptionScheme AlgorithmIdentifier {{PBES2-Encs}} }
</pre></div> </div> </div><p>and indeed we have <code>PBKDF2</code> <span class="callout">2</span> for <code>keyDerivationFunc</code>, and <code>aes-256-cbc</code> <span class="callout">3</span> for <code>encryptionScheme</code>. The sequence <code>PBKDF2</code> is specified in the <a href="https://datatracker.ietf.org/doc/html/rfc2898#appendix-A.2">same document</a> as</p><div class="code"><div class="content"><div class="highlight"><pre>PBKDF2-params ::= SEQUENCE {
salt CHOICE {
specified OCTET STRING,
otherSource AlgorithmIdentifier {{PBKDF2-SaltSources}}
},
iterationCount INTEGER (1..MAX),
keyLength INTEGER (1..MAX) OPTIONAL,
prf AlgorithmIdentifier {{PBKDF2-PRFs}} DEFAULT
algid-hmacWithSHA1 }
</pre></div> </div> </div><p>As you can see in the ASN.1 dump the salt is <code>5BE04AE9442D08F0</code> <span class="callout">4</span>, the iteration count is 2048 (<code>0x800</code>) <span class="callout">5</span>, and the hash function (<code>prf</code>, pseudorandom function) is <code>hmacWithSHA256</code> <span class="callout">6</span> without any additional parameters. The value 2048 for the iterations is a default value in OpenSSL (see the definition of <a href="https://github.com/openssl/openssl/blob/5bcbdee621fbf05df7431b8fbb0ea7de7054e1f0/include/openssl/evp.h#L41">PKCS5_DEFAULT_ITER</a>).</p><h2 id="opensshs-private-key-format-413c">OpenSSH's private key format<a class="headerlink" href="#opensshs-private-key-format-413c" title="Permanent link">¶</a></h2><p>As we saw at the beginning of the post, the OpenSSH team came up with a custom format to store the private keys, so now that we are familiar with the nomenclature and with the way PEM stores encrypted keys, let's see what this new format can do.</p><p>The best starting point for our investigation is the tool <code>ssh-keygen</code> which we can use to create private keys. The source can be found in the OpenSSH repository in the file <a href="https://github.com/openssh/openssh-portable/blob/2dc328023f60212cd29504fc05d849133ae47355/ssh-keygen.c">ssh-keygen.c</a>. This file uses two different functions, <code>sshkey_private_to_blob2</code> (<a href="https://github.com/openssh/openssh-portable/blob/2dc328023f60212cd29504fc05d849133ae47355/sshkey.c#L3883">source code</a>) for the new format and <code>sshkey_private_to_blob_pem_pkcs8</code> (<a href="https://github.com/openssh/openssh-portable/blob/2dc328023f60212cd29504fc05d849133ae47355/sshkey.c#L4371">source code</a>) for keys in PKCS #8 format. 
The former calls <code>bcrypt_pbkdf</code> which comes from OpenBSD (<a href="https://github.com/openbsd/src/blob/2207c4325726fdc5c4bcd0011af0fdf7d3dab137/sys/lib/libsa/bcrypt_pbkdf.c#L96">source code</a>).</p><p>This function contains a modified implementation of PBKDF2 that uses bcrypt as the core hash function. The comment that you can find at the top of the file <a href="https://github.com/openbsd/src/blob/master/sys/lib/libsa/bcrypt_pbkdf.c#L28">bcrypt_pbkdf.c</a> says</p><div class="code"><div class="content"><div class="highlight"><pre>/*
* pkcs #5 pbkdf2 implementation using the "bcrypt" hash
*
* The bcrypt hash function is derived from the bcrypt password hashing
* function with the following modifications:
* 1. The input password and salt are preprocessed with SHA512.
* 2. The output length is expanded to 256 bits.
* 3. Subsequently the magic string to be encrypted is lengthened and modified
* to "OxychromaticBlowfishSwatDynamite"
* 4. The hash function is defined to perform 64 rounds of initial state
* expansion. (More rounds are performed by iterating the hash.)
*
* Note that this implementation pulls the SHA512 operations into the caller
* as a performance optimization.
*
* One modification from official pbkdf2. Instead of outputting key material
* linearly, we mix it. pbkdf2 has a known weakness where if one uses it to
* generate (e.g.) 512 bits of key material for use as two 256 bit keys, an
* attacker can merely run once through the outer loop, but the user
* always runs it twice. Shuffling output bytes requires computing the
* entirety of the key material to assemble any subkey. This is something a
* wise caller could do; we just do it for you.
*/
</pre></div> </div> </div><p>As you can see, this is intended to be a <code>pkcs #5 pbkdf2 implementation</code> that uses <code>bcrypt</code> as its underlying hash function. It also mentions some modifications, and it's worth noting that when you modify a standard you are not following the standard any more. I won't run through all the details of the implementation, though, as it's beyond the scope of the post.</p><p>So, the OpenSSH private key format ultimately contains a private key encrypted with a key derived through a non-standard version of PBKDF2 that uses bcrypt as its core hash function. The structure that contains the key is not ASN.1, even though it's base64 encoded and wrapped between header and footer that are similar to the PEM ones. A description of the structure can be found in <a href="https://github.com/openssh/openssh-portable/blob/2dc328023f60212cd29504fc05d849133ae47355/PROTOCOL.key">https://github.com/openssh/openssh-portable/blob/2dc328023f60212cd29504fc05d849133ae47355/PROTOCOL.key</a>.</p><h3 id="cost-factor-and-rounds-1a31">Cost factor and rounds</h3><p>PBKDF2 uses the concept of <em>rounds</em> to make the key stretching slower. This is the number of times the hash function is called internally (using as salt the output of the previous iteration), so in PBKDF2 the number of rounds or iterations is directly proportional to the slowness of the stretching operation.</p><p>Bcrypt implements a similar mechanism with its <em>cost factor</em>. The cost factor in the standard bcrypt implementation is defined as the binary logarithm of the number of iterations of a specific part of the process (the repeated expansion of the password and the salt). 
Using the binary logarithm means that a cost factor of 4 (the minimum) corresponds to 16 iterations, while 31 (the maximum) corresponds to 2,147,483,648 (more than 2 billion) iterations.</p><p>In the OpenSSH/OpenBSD implementation things are a bit different.</p><p>OpenBSD's version of bcrypt runs with a fixed cost of 6, which results in 64 iterations of the key expansion (<a href="https://github.com/openbsd/src/blob/2207c4325726fdc5c4bcd0011af0fdf7d3dab137/sys/lib/libsa/bcrypt_pbkdf.c#L68">source code</a>), but being an implementation of PBKDF2 it can still be hardened by increasing the number of rounds (<a href="https://github.com/openbsd/src/blob/2207c4325726fdc5c4bcd0011af0fdf7d3dab137/sys/lib/libsa/bcrypt_pbkdf.c#L139">source code</a>). Those rounds correspond to the value given to the parameter <code>-a</code> of the <code>ssh-keygen</code> command line.</p><h3 id="how-many-rounds-12df">How many rounds?</h3><p>When it comes to KDFs, the advice is always to run as many iterations as possible while keeping the specific application usable, so you need to tune your SSH keys by testing different values in your system. To give you some rough estimations, Wikipedia mentions that for PBKDF2 the number of iterations used by Apple and LastPass is between 2k and 100k. It is worth reiterating though that you shouldn't aim to use other people's figures, in this case. Instead, run tests of your software and hardware.</p><p>On my laptop, an i7-8565U with 32GiB of RAM running Kubuntu 20.04 I get the following results, which are pretty linear:</p><div class="code"><div class="content"><div class="highlight"><pre>ssh-keygen -a 100 -t ed25519 0.667s
ssh-keygen -a 500 -t ed25519 3.148s
ssh-keygen -a 1000 -t ed25519 6.331s
ssh-keygen -a 5000 -t ed25519 31.624s
</pre></div> </div> </div><p>A sensible value for me might be between 100 and 500, then, so that I don't have to wait too long every time I push and pull my branches from GitHub.</p><h2 id="can-we-convert-private-openssh-keys-into-pem-2c12">Can we convert private OpenSSH keys into PEM?<a class="headerlink" href="#can-we-convert-private-openssh-keys-into-pem-2c12" title="Permanent link">¶</a></h2><p>As OpenSSL doesn't understand the OpenSSH private key format, a common question among programmers and devops is whether it is possible to convert it into a PEM format. As you might have guessed reading the previous sections, the answer is no. The PEM format for private keys uses PKCS#5, so it supports only the standard implementation of PBKDF2.</p><p>It's interesting to note that the OpenSSL team also specifically decided not to support this new format as it is not standard (see <a href="https://github.com/openssl/openssl/issues/5323">https://github.com/openssl/openssl/issues/5323</a>).</p><h2 id="a-poorly-documented-format-2ea8">A poorly documented format<a class="headerlink" href="#a-poorly-documented-format-2ea8" title="Permanent link">¶</a></h2><p>PEM, PKCS #8, ASN.1, and all other formats that we use every day, including the OpenSSH public key format, are well documented and standardised in RFCs or similar documents. The OpenSSH private key format is documented in a tiny file that you can find in the source code, but that file doesn't offer more than a quick overview. To have a good understanding of what is going on I had to read the source code, not only of OpenSSH, but also of OpenBSD.</p><p>I think poor documentation like this might be acceptable in personal projects or in new tools, but SSH is used by the whole world, and when the team decides to come up with a completely new format for one of its most important elements I would expect them to detail every single bit of it, or at least try to be more open about the reasons and the implementation. 
I also personally believe that standards can't but benefit intercommunication between systems and, in cryptography, improve security, since they are reviewed and discussed by a wider audience.</p><p>The claim is that the new SSH private key format offers a better protection of keys at rest. I'd be very interested to see a cryptanalysis made by some expert (which I'm not). Cryptography is a tricky field, and often things that are apparently smart end up being tragically wrong.</p><h2 id="resources-edc5">Resources<a class="headerlink" href="#resources-edc5" title="Permanent link">¶</a></h2><ul><li>OpenSSL documentation: <a href="https://www.openssl.org/docs/man1.1.0/apps/asn1parse.html">asn1parse</a>, <a href="https://www.openssl.org/docs/man1.1.0/apps/genpkey.html">genpkey</a></li><li>The <a href="https://en.wikipedia.org/wiki/Base64">Base64</a> encoding</li><li>The Abstract Syntax Notation One <a href="https://en.wikipedia.org/wiki/Abstract_Syntax_Notation_One">ASN.1</a> interface description language</li><li><a href="https://tools.ietf.org/html/rfc4251">RFC 4251 - The Secure Shell (SSH) Protocol Architecture</a></li><li><a href="https://tools.ietf.org/html/rfc4253">RFC 4253 - The Secure Shell (SSH) Transport Layer Protocol</a></li><li><a href="https://tools.ietf.org/html/rfc4716">RFC 4716 - The Secure Shell (SSH) Public Key File Format</a></li><li><a href="https://datatracker.ietf.org/doc/html/rfc2898">RFC 2898 - PKCS #5: Password-Based Cryptography Specification Version 2.0</a></li><li><a href="https://tools.ietf.org/html/rfc5208">RFC 5208 - Public-Key Cryptography Standards (PKCS) #8: Private-Key Information Syntax Specification Version 1.2</a></li><li><a href="https://tools.ietf.org/html/rfc5958">RFC 5958 - Asymmetric Key Packages</a></li><li><a href="https://tools.ietf.org/html/rfc7468">RFC 7468 - Textual Encodings of PKIX, PKCS, and CMS Structures</a></li></ul><h2 id="feedback-d845">Feedback<a class="headerlink" href="#feedback-d845" title="Permanent 
link">¶</a></h2><p>Feel free to reach me on <a href="https://twitter.com/thedigicat">Twitter</a> if you have questions. The <a href="https://github.com/TheDigitalCatOnline/blog_source/issues">GitHub issues</a> page is the best place to submit corrections.</p>Public key cryptography: SSL certificates2020-11-04T23:00:00+01:002020-11-04T23:00:00+01:00Leonardo Giordanitag:www.thedigitalcatonline.com,2020-11-04:/blog/2020/11/04/public-key-cryptography-ssl-certificates/<p>An in-depth discussion of the format of X.509 certificates and the signing mechanism</p><p>In the context of public key cryptography, certificates are a way to prove the identity of the owner of a public key.</p>
<p>While public key cryptography allows us to communicate securely through an insecure network, it leaves the problem of identity untouched. Once we have established an encrypted communication we can be sure that the data we send and receive cannot be read or tampered with by third parties. But how can we be sure that the entity on the other side of the communication channel, with which we initiated the communication, is what it claims to be?</p>
<p>In other words, the messages cannot be read or modified by malicious third-parties, but what if we established communication with a malicious actor in the first place? Such a situation can arise during a man-in-the-middle attack, where the low-level network communication is hijacked by a malicious actor who pretends to be the desired recipient of the communication.</p>
<p>In the context of the Internet, and in particular of the World Wide Web, the main concern is that the server that provides services we log into (think of every service that has your personal or financial data like your bank, Google, Facebook, Netflix, etc.) is run by the company that we trust and not by an attacker who wants to steal our data.</p>
<p>In this post I will try to clarify the main components of the certificates system and to explain the meaning of the major acronyms and names that you might hear when you deal with this part of web development.</p>
<h2 id="clarification-ssl-vs-tls">Clarification: SSL vs TLS<a class="headerlink" href="#clarification-ssl-vs-tls" title="Permanent link">¶</a></h2>
<p>In the world of web development and infrastructure management, we normally speak of SSL protocol and of SSL certificates, but it has to be noted that SSL (Secure Sockets Layer) is the name of a deprecated protocol. The current implementation of the protocol used to secure web applications is <strong>TLS</strong> (Transport Layer Security).</p>
<p>The story of SSL and TLS is rich in events and spans 25 years since its inception by Taher Elgamal at Netscape. In short, SSL had 3 major versions (the first of which was never publicly used), and was replaced by TLS in 1999. TLS itself has gone through 3 revisions at the time of writing, TLS 1.3 being the latest version available.</p>
<p>The TLS/SSL nomenclature is one of many sources of confusion in the complicated world of security and applied cryptography. In this article I will use only the acronym TLS, but I went for SSL in the title because I wanted the subject matter to be recognisable also by developers that are not much into security and cryptography.</p>
<h2 id="x509-certificates">X.509 certificates<a class="headerlink" href="#x509-certificates" title="Permanent link">¶</a></h2>
<p>While the problem of the identity in an insecure network can be solved in several ways, the solution embraced to secure the World Wide Web is based on a standard called <strong>X.509</strong>. When we mention TLS certificates, we usually mean X.509 certificates used in a TLS connection, such as that created by HTTPS.</p>
<p>X.509 is the ITU-T standard used to represent certificates, and has been chosen to be the standard used in the TLS protocol. The standard doesn't only define the binary structure of the certificate itself, but it also defines procedures to revoke the certificates, and establishes a hierarchical system of certification known as <strong>certificate path</strong>, or <strong>certificate chain</strong>.</p>
<p>The structure of an X.509 certificate is expressed using <a href="https://en.wikipedia.org/wiki/Abstract_Syntax_Notation_One">ASN.1</a>, a notation used natively by the PEM format (discussed <a href="https://www.thedigitalcatonline.com/blog/2018/04/25/rsa-keys/">here</a>). You can read the full specification in <a href="https://tools.ietf.org/html/rfc2459">RFC 2459</a>, in particular <a href="https://tools.ietf.org/html/rfc2459#section-4">Section 4</a> "Certificate and Certificate Extensions Profile". I will refer to this later when I will have a look at a real certificate.</p>
<h2 id="how-are-certificates-related-to-https">How are certificates related to HTTPS?<a class="headerlink" href="#how-are-certificates-related-to-https" title="Permanent link">¶</a></h2>
<p>Before I discuss how certificates solve the problem of identity (or ownership of a public key), let's clarify the relationship between them and HTTPS.</p>
<p>HTTPS stands for HTTP Secure, and the core of the protocol consists of running HTTP over TLS. When we access a web site with HTTPS the browser first establishes a TLS connection with the server and then communicates with it using pure HTTP. This means that the whole HTTP protocol is encrypted, as the secure channel is established outside it, and also means that, aside from the different URI scheme <code>https://</code> instead of <code>http://</code>, there are no differences between the two protocols.</p>
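<p>The layering described above can be sketched in a few lines of Python using only the standard library modules <code>socket</code> and <code>ssl</code>. This is a minimal illustration, not a real HTTP client: the function names <code>build_tls_context</code> and <code>fetch_head</code> are made up for this example, and the code ignores redirects and error handling. The point is only to show that the TLS handshake happens first and that what travels inside the secure channel is ordinary HTTP.</p>

```python
import socket
import ssl


def build_tls_context():
    # create_default_context() loads the trusted CA certificates of the
    # system and enables both certificate and host name verification.
    return ssl.create_default_context()


def fetch_head(hostname, port=443, timeout=10):
    """Send a HEAD request over TLS and return (TLS version, status line)."""
    context = build_tls_context()
    # Step 1: open a plain TCP connection.
    with socket.create_connection((hostname, port), timeout=timeout) as tcp:
        # Step 2: run the TLS handshake on top of it. The server certificate
        # is verified here, before any HTTP data is exchanged.
        with context.wrap_socket(tcp, server_hostname=hostname) as tls:
            # Step 3: speak ordinary HTTP through the encrypted channel.
            request = (
                "HEAD / HTTP/1.1\r\n"
                f"Host: {hostname}\r\n"
                "Connection: close\r\n"
                "\r\n"
            )
            tls.sendall(request.encode("ascii"))
            response = b""
            while True:
                chunk = tls.recv(4096)
                if not chunk:
                    break
                response += chunk
            status_line = response.split(b"\r\n", 1)[0]
            return tls.version(), status_line.decode("ascii", "replace")
```

<p>Calling <code>fetch_head</code> with the host name of a site served over HTTPS should return the negotiated TLS version together with the first line of the HTTP response. If the certificate cannot be verified, <code>wrap_socket</code> raises an exception and no HTTP request is sent at all.</p>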
<p>Certificates come into play when the browser establishes the TLS connection, which is why you need to set up HTTPS as part of your infrastructure and not in your web application. By the time the HTTP requests reach your application they are already decrypted and accessible in plain text, as the HTTP protocol mandates. We usually say that we "terminate TLS" when a component of our infrastructure manages certificates and decrypts HTTPS into HTTP.</p>
<h2 id="how-do-certificates-work">How do certificates work?<a class="headerlink" href="#how-do-certificates-work" title="Permanent link">¶</a></h2>
<p>The X.509 standard establishes entities called <strong>Certificate Authorities</strong> (CAs), and creates a hierarchy of trust called <strong>chain</strong> between them. The idea is that there is a set of entities that are trusted worldwide by operating systems, browsers, and other network-related software, and that these entities can trust other entities, thus creating a trust network.</p>
<p>While the market of Certificate Authorities is dominated by three major commercial players (see the <a href="https://w3techs.com/technologies/overview/ssl_certificate">usage statistics</a>) there are approximately 100 organisations operating worldwide, including some non-profit ones. Not all of these are trusted by all operating systems or browsers, though.</p>
<p>The set of CAs trusted by an organisation is called <strong>root program</strong>. The Mozilla community runs a program that is independent from the hardware/software platform, aptly called <a href="https://wiki.mozilla.org/CA">Mozilla's CA Certificate Program</a> and uses data contained in the <a href="https://www.ccadb.org/">Common CA Database</a> (CCADB). Private companies such as Microsoft, Apple, and Oracle run their own root programs and software running on the respective platforms (Windows, macOS/iOS, Java) can decide to trust the CAs provided by those programs.</p>
<p>In the open-source world, the Mozilla root program is by far the most influential and important source of information, being used by other software packages and Linux distributions.</p>
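<p>If you are curious about which set of trusted CAs your own Python installation relies on, the standard <code>ssl</code> module can tell you. The following is a small sketch; the function name <code>describe_trust_store</code> is mine, and the result depends entirely on the operating system and on how Python and OpenSSL were built. On many Linux distributions the bundle found this way is derived from the Mozilla root program, but that is an assumption you should verify on your system.</p>

```python
import ssl


def describe_trust_store():
    """Report where the default CA certificates come from on this system."""
    # Locations that OpenSSL searches for trusted CA certificates.
    paths = ssl.get_default_verify_paths()
    # A default client context loads the system's trusted CAs.
    context = ssl.create_default_context()
    stats = context.cert_store_stats()
    return {
        "cafile": paths.cafile,
        "capath": paths.capath,
        # Certificates stored in a capath directory are loaded lazily,
        # so this number can be lower than the number of trusted CAs.
        "loaded_ca_certificates": stats["x509_ca"],
    }


if __name__ == "__main__":
    for name, value in describe_trust_store().items():
        print(f"{name}: {value}")
```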
<p>It is possible to create certificates that are not signed by any CA, and these are called <strong>self-signed certificates</strong>. Such certificates can be used with any software that relies on certificates, but this requires the software to disable certificate checking with the Certificate Authorities. Self-signed certificates are obviously useful for testing purposes, but there are scenarios in which it might be desirable not to rely on the CAs and establish a private network of trust.</p>
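<p>As an illustration of the two scenarios, here is a sketch of how a Python client might be configured with the standard <code>ssl</code> module. Both function names are invented for this example, and <code>cafile</code> is a hypothetical path to a self-signed certificate in PEM format. The first function switches verification off completely, which is acceptable only for tests; the second trusts a single certificate and nothing else, which is one possible way to build a private network of trust.</p>

```python
import ssl


def context_without_verification():
    """Client context that accepts any certificate (testing only)."""
    context = ssl.create_default_context()
    # check_hostname has to be disabled first, otherwise setting
    # verify_mode to CERT_NONE raises a ValueError.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def context_trusting_only(cafile):
    """Client context that trusts only the certificate stored in cafile.

    cafile is a hypothetical path to a PEM file that contains the
    self-signed certificate (or a private CA certificate).
    """
    # PROTOCOL_TLS_CLIENT enables certificate and host name checks
    # and does not load the CA certificates of the system.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.load_verify_locations(cafile=cafile)
    return context
```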
<h2 id="example-ca-root-certificate">Example: CA root certificate<a class="headerlink" href="#example-ca-root-certificate" title="Permanent link">¶</a></h2>
<p>The certificates for root CAs that are part of the Mozilla root program can be retrieved from the <a href="https://www.ccadb.org/resources">Common CA Database</a> web page, or can be seen in the Firefox <a href="https://hg.mozilla.org/mozilla-central/file/tip/security/nss/lib/ckfw/builtins/certdata.txt">source code</a> directly. On a running Firefox browser you can open the <a href="about:preferences#privacy">Privacy & Security</a> menu and click on "View Certificates" at the bottom of the page. The CAs are listed under the tab "Authorities".</p>
<p>The interesting thing you can do here is to export a CA certificate. If you do it, Firefox will save it in a file with the extension <code>.crt</code> that contains data in PEM format. I exported the certificate for <code>Amazon Root CA 1</code> and I ended up with the file <code>AmazonRootCA1.crt</code>. If, instead of exporting, you view the certificate, you will end up in a page that allows you to download the certificate and the chain, both in PEM format, in files with the extension <code>.pem</code>. As you see, you are not the only one who is confused.</p>
<p>I described the PEM format <a href="https://www.thedigitalcatonline.com/blog/2018/04/25/rsa-keys/">in a post on RSA keys</a> so I won't repeat the whole discussion here. <a href="https://tools.ietf.org/html/rfc7468">RFC 7468</a> ("Textual Encodings of PKIX, PKCS, and CMS Structures") describes certificates in section 5. Section 4 mentions the module <code>id-pkix1-e</code> for <code>Certificate</code>, <code>CertificateList</code>, and <code>SubjectPublicKeyInfo</code>, defined in <a href="https://tools.ietf.org/html/rfc5280">RFC 5280</a> ("Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile").</p>
<p>The identifier <code>id-pkix1-e</code> is part of a registry of objects to be used in ASN.1 data created in the framework of the Public-Key Infrastructure using X.509 (PKIX) Working Group, that defined the infrastructure around the X.509 certificates system. Basically it's a standard way to identify binary objects and their structure. You can see a full list of all the objects in <a href="https://tools.ietf.org/html/rfc7299">RFC 7299</a> ("Object Identifier Registry for the PKIX Working Group"). Not a very exciting one to read, if you ask me.</p>
<p>I can dump the content of the Amazon Root CA 1 certificate with OpenSSL</p>
<div class="highlight"><pre><span></span><code>$ openssl asn1parse -inform pem -in amazon-root-ca-1.pem
0:d=0 hl=4 l= 833 cons: SEQUENCE
4:d=1 hl=4 l= 553 cons: SEQUENCE
8:d=2 hl=2 l= 3 cons: cont [ 0 ]
10:d=3 hl=2 l= 1 prim: INTEGER :02
13:d=2 hl=2 l= 19 prim: INTEGER :066C9FCF99BF8C0A39E2F0788A43E696365BCA
34:d=2 hl=2 l= 13 cons: SEQUENCE
36:d=3 hl=2 l= 9 prim: OBJECT :sha256WithRSAEncryption
47:d=3 hl=2 l= 0 prim: NULL
49:d=2 hl=2 l= 57 cons: SEQUENCE
51:d=3 hl=2 l= 11 cons: SET
53:d=4 hl=2 l= 9 cons: SEQUENCE
55:d=5 hl=2 l= 3 prim: OBJECT :countryName
60:d=5 hl=2 l= 2 prim: PRINTABLESTRING :US
64:d=3 hl=2 l= 15 cons: SET
66:d=4 hl=2 l= 13 cons: SEQUENCE
68:d=5 hl=2 l= 3 prim: OBJECT :organizationName
73:d=5 hl=2 l= 6 prim: PRINTABLESTRING :Amazon
81:d=3 hl=2 l= 25 cons: SET
83:d=4 hl=2 l= 23 cons: SEQUENCE
85:d=5 hl=2 l= 3 prim: OBJECT :commonName
90:d=5 hl=2 l= 16 prim: PRINTABLESTRING :Amazon Root CA 1
108:d=2 hl=2 l= 30 cons: SEQUENCE
110:d=3 hl=2 l= 13 prim: UTCTIME :150526000000Z
125:d=3 hl=2 l= 13 prim: UTCTIME :380117000000Z
140:d=2 hl=2 l= 57 cons: SEQUENCE
142:d=3 hl=2 l= 11 cons: SET
144:d=4 hl=2 l= 9 cons: SEQUENCE
146:d=5 hl=2 l= 3 prim: OBJECT :countryName
151:d=5 hl=2 l= 2 prim: PRINTABLESTRING :US
155:d=3 hl=2 l= 15 cons: SET
157:d=4 hl=2 l= 13 cons: SEQUENCE
159:d=5 hl=2 l= 3 prim: OBJECT :organizationName
164:d=5 hl=2 l= 6 prim: PRINTABLESTRING :Amazon
172:d=3 hl=2 l= 25 cons: SET
174:d=4 hl=2 l= 23 cons: SEQUENCE
176:d=5 hl=2 l= 3 prim: OBJECT :commonName
181:d=5 hl=2 l= 16 prim: PRINTABLESTRING :Amazon Root CA 1
199:d=2 hl=4 l= 290 cons: SEQUENCE
203:d=3 hl=2 l= 13 cons: SEQUENCE
205:d=4 hl=2 l= 9 prim: OBJECT :rsaEncryption
216:d=4 hl=2 l= 0 prim: NULL
218:d=3 hl=4 l= 271 prim: BIT STRING
493:d=2 hl=2 l= 66 cons: cont [ 3 ]
495:d=3 hl=2 l= 64 cons: SEQUENCE
497:d=4 hl=2 l= 15 cons: SEQUENCE
499:d=5 hl=2 l= 3 prim: OBJECT :X509v3 Basic Constraints
504:d=5 hl=2 l= 1 prim: BOOLEAN :255
507:d=5 hl=2 l= 5 prim: OCTET STRING [HEX DUMP]:30030101FF
514:d=4 hl=2 l= 14 cons: SEQUENCE
516:d=5 hl=2 l= 3 prim: OBJECT :X509v3 Key Usage
521:d=5 hl=2 l= 1 prim: BOOLEAN :255
524:d=5 hl=2 l= 4 prim: OCTET STRING [HEX DUMP]:03020186
530:d=4 hl=2 l= 29 cons: SEQUENCE
532:d=5 hl=2 l= 3 prim: OBJECT :X509v3 Subject Key Identifier
537:d=5 hl=2 l= 22 prim: OCTET STRING [HEX DUMP]:04148418CC8534ECBC0C94942E08599CC7B2104E0A08
561:d=1 hl=2 l= 13 cons: SEQUENCE
563:d=2 hl=2 l= 9 prim: OBJECT :sha256WithRSAEncryption
574:d=2 hl=2 l= 0 prim: NULL
576:d=1 hl=4 l= 257 prim: BIT STRING
</code></pre></div>
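<p>Each line of this output starts with the byte offset, followed by the depth <code>d</code>, the header length <code>hl</code>, and the content length <code>l</code>. As a rough sketch of where <code>hl</code> and <code>l</code> come from, the following Python function (the name <code>read_der_header</code> is mine, not part of any library) decodes the tag and length bytes of a DER element. The sample bytes are the first eight bytes of the Amazon certificate in binary form.</p>

```python
def read_der_header(data, offset=0):
    """Return (tag, header_length, content_length) for the element at offset."""
    tag = data[offset]
    first = data[offset + 1]
    if first >= 0x80:
        # Long form: the low 7 bits give the number of length bytes that follow
        count = first - 0x80
        start = offset + 2
        length = int.from_bytes(data[start:start + count], "big")
        return tag, 2 + count, length
    # Short form: the byte itself is the length
    return tag, 2, first


# First eight bytes of the certificate: two nested SEQUENCE headers
der_start = bytes.fromhex("3082034130820229")

outer = read_der_header(der_start, 0)
inner = read_der_header(der_start, outer[1])
print(outer)  # (48, 4, 833)
print(inner)  # (48, 4, 553)
```

<p>The tag 48 is <code>0x30</code>, a constructed <code>SEQUENCE</code>, and the two pairs of values match the first two lines of the dump.</p>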
<p>Let's read part of it using the aforementioned <a href="https://tools.ietf.org/html/rfc5280#section-4">section 4 of RFC 5280</a>.</p>
<p>The signed certificate is a sequence of three main components</p>
<div class="highlight"><pre><span></span><code> Certificate ::= SEQUENCE {
tbsCertificate TBSCertificate,
signatureAlgorithm AlgorithmIdentifier,
signatureValue BIT STRING }
</code></pre></div>
<p>and the <code>TBSCertificate</code> structure represents the unsigned certificate (TBS = To Be Signed)</p>
<div class="highlight"><pre><span></span><code>TBSCertificate ::= SEQUENCE {
version [0] EXPLICIT Version DEFAULT v1,
serialNumber CertificateSerialNumber,
signature AlgorithmIdentifier,
issuer Name,
validity Validity,
subject Name,
subjectPublicKeyInfo SubjectPublicKeyInfo,
issuerUniqueID [1] IMPLICIT UniqueIdentifier OPTIONAL,
-- If present, version MUST be v2 or v3
subjectUniqueID [2] IMPLICIT UniqueIdentifier OPTIONAL,
-- If present, version MUST be v2 or v3
extensions [3] EXPLICIT Extensions OPTIONAL
-- If present, version MUST be v3
}
</code></pre></div>
<p>Comparing this with the output of OpenSSL we can find fields such as <code>version</code></p>
<div class="highlight"><pre><span></span><code> 10:d=3 hl=2 l= 1 prim: INTEGER :02
</code></pre></div>
<p>which according to the documentation means version 3 (the integer value is <code>02</code>, as versions are numbered starting from 0). Many values are of type <code>PRINTABLESTRING</code>, so they are readable already in the ASN.1 dump.</p>
<p>The validity of the certificate is</p>
<div class="highlight"><pre><span></span><code> 110:d=3 hl=2 l= 13 prim: UTCTIME :150526000000Z
125:d=3 hl=2 l= 13 prim: UTCTIME :380117000000Z
</code></pre></div>
<p>and following section 4.1.2.5.1 of the RFC we find out that the certificate is valid between 26 May 2015 and 17 Jan 2038. You can easily read these values in the certificate page in the browser without getting a headache trying to decode ASN.1.</p>
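<p>If you want to decode those values yourself, this is a minimal Python sketch of the rule given in that section of the RFC (the function name <code>parse_utctime</code> is my own choice). It assumes the <code>YYMMDDHHMMSSZ</code> form required by RFC 5280 and doesn't try to handle other time formats.</p>

```python
from datetime import datetime, timezone


def parse_utctime(value):
    """Parse a YYMMDDHHMMSSZ string as described in RFC 5280, section 4.1.2.5.1."""
    if len(value) != 13 or not value.endswith("Z"):
        raise ValueError("expected the form YYMMDDHHMMSSZ")
    yy = int(value[0:2])
    # RFC 5280: YY greater than or equal to 50 means 19YY, otherwise 20YY
    year = 1900 + yy if yy >= 50 else 2000 + yy
    month, day, hour, minute, second = (
        int(value[i:i + 2]) for i in range(2, 12, 2)
    )
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)


print(parse_utctime("150526000000Z").date())  # 2015-05-26
print(parse_utctime("380117000000Z").date())  # 2038-01-17
```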
<p>The CA signed the certificate using a certain algorithm. The algorithm identifier is repeated twice, first in the structure <code>Certificate</code> (<code>signatureAlgorithm AlgorithmIdentifier</code>) and then in the structure <code>TBSCertificate</code> (<code>signature AlgorithmIdentifier</code>). The two fields must have the same value.</p>
<div class="highlight"><pre><span></span><code> 34:d=2 hl=2 l= 13 cons: SEQUENCE
36:d=3 hl=2 l= 9 prim: OBJECT :sha256WithRSAEncryption
47:d=3 hl=2 l= 0 prim: NULL
[...]
561:d=1 hl=2 l= 13 cons: SEQUENCE
563:d=2 hl=2 l= 9 prim: OBJECT :sha256WithRSAEncryption
574:d=2 hl=2 l= 0 prim: NULL
</code></pre></div>
<p>For this certificate, the algorithm used by Amazon is <code>sha256WithRSAEncryption</code>. This label is described in <a href="https://tools.ietf.org/html/rfc4055">RFC 4055</a> ("Additional Algorithms and Identifiers for RSA Cryptography for use in the Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile") as "PKCS #1 version 1.5 signature algorithm with SHA-256". The specific algorithm can be found in <a href="https://tools.ietf.org/html/rfc2313">RFC 2313</a> ("PKCS #1: RSA Encryption Version 1.5"). As the name of the algorithm suggests, the certificate is first digested with SHA-256 and then encrypted using RSA and the private key of the signer.</p>
<p>Speaking of keys, the public key the CA used for the certificate can be found in the field <code>subjectPublicKeyInfo</code>, which is again made of a field of type <code>AlgorithmIdentifier</code> and a bit string with the value of the key. In this case the fields are</p>
<div class="highlight"><pre><span></span><code> 205:d=4 hl=2 l= 9 prim: OBJECT :rsaEncryption
216:d=4 hl=2 l= 0 prim: NULL
218:d=3 hl=4 l= 271 prim: BIT STRING
</code></pre></div>
<p>The algorithm <code>rsaEncryption</code> is described in <a href="https://tools.ietf.org/html/rfc3279">RFC 3279</a> ("Algorithms and Identifiers for the Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile"), section 2.3.1 as</p>
<div class="highlight"><pre><span></span><code> RSAPublicKey ::= SEQUENCE {
modulus INTEGER, -- n
publicExponent INTEGER } -- e
</code></pre></div>
<p>(<em>sic</em>) or in <a href="https://tools.ietf.org/html/rfc8017">RFC 8017</a> ("PKCS #1: RSA Cryptography Specifications Version 2.2")</p>
<div class="highlight"><pre><span></span><code>RSAPublicKey ::= SEQUENCE {
modulus INTEGER, -- n
publicExponent INTEGER -- e
}
</code></pre></div>
<p>We can then use the option <code>-strparse</code> of the module <code>asn1parse</code> to find the actual values</p>
<div class="highlight"><pre><span></span><code>$ openssl asn1parse -inform pem -in amazon-root-ca-1.pem -strparse 218
0:d=0 hl=4 l= 266 cons: SEQUENCE
4:d=1 hl=4 l= 257 prim: INTEGER :B2788071CA78D5E371AF478050747D6ED8D78876F4
9968F7582160F97484012FAC022D86D3A0437A4EB2A4D036BA01BE8DDB48C80717364CF4EE8823C73EEB37F5B5
19F84968B0DED7B976381D619EA4FE8236A5E54A56E445E1F9FDB416FA74DA9C9B35392FFAB02050066C7AD080
B2A6F9AFEC47198F503807DCA2873958F8BAD5A9F948673096EE94785E6F89A351C0308666A14566BA54EBA3C3
91F948DCFFD1E8302D7D2D747035D78824F79EC4596EBB738717F2324628B843FAB71DAACAB4F29F240E2D4BF7
715C5E69FFEA9502CB388AAE50386FDBFB2D621BC5C71E54E177E067C80F9C8723D63F40207F2080C4804C3E3B
24268E04AE6C9AC8AA0D
265:d=1 hl=2 l= 3 prim: INTEGER :010001
</code></pre></div>
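<p>The two integers are the modulus and the public exponent of the RSA key. As a quick sanity check, this Python sketch converts the hexadecimal digits of the output above into numbers; the variable names are mine.</p>

```python
# Hex digits copied from the asn1parse output above
modulus_hex = (
    "B2788071CA78D5E371AF478050747D6ED8D78876F4"
    "9968F7582160F97484012FAC022D86D3A0437A4EB2A4D036BA01BE8DDB48C80717364CF4EE8823C73EEB37F5B5"
    "19F84968B0DED7B976381D619EA4FE8236A5E54A56E445E1F9FDB416FA74DA9C9B35392FFAB02050066C7AD080"
    "B2A6F9AFEC47198F503807DCA2873958F8BAD5A9F948673096EE94785E6F89A351C0308666A14566BA54EBA3C3"
    "91F948DCFFD1E8302D7D2D747035D78824F79EC4596EBB738717F2324628B843FAB71DAACAB4F29F240E2D4BF7"
    "715C5E69FFEA9502CB388AAE50386FDBFB2D621BC5C71E54E177E067C80F9C8723D63F40207F2080C4804C3E3B"
    "24268E04AE6C9AC8AA0D"
)
exponent_hex = "010001"

modulus = int(modulus_hex, 16)
exponent = int(exponent_hex, 16)

print(modulus.bit_length())  # 2048
print(exponent)              # 65537

# The most significant bit of the modulus is 1, so DER prepends a 00 byte
# to keep the INTEGER positive: 256 bytes of value plus 1 gives l=257
der_content_length = len(bytes.fromhex(modulus_hex)) + 1
print(der_content_length)    # 257
```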
<p>As we already saw for <a href="https://www.thedigitalcatonline.com/blog/2018/04/25/rsa-keys/">RSA keys</a>, OpenSSL has a specific module for important structures, and the X.509 certificates are definitely worth a module, aptly called <code>x509</code>. Using that we can easily decode any certificate</p>
<div class="highlight"><pre><span></span><code>$ openssl x509 -inform pem -in amazon-root-ca-1.pem -noout -text
Certificate:
Data:
Version: 3 (0x2)
Serial Number:
06:6c:9f:cf:99:bf:8c:0a:39:e2:f0:78:8a:43:e6:96:36:5b:ca
Signature Algorithm: sha256WithRSAEncryption
Issuer: C = US, O = Amazon, CN = Amazon Root CA 1
Validity
Not Before: May 26 00:00:00 2015 GMT
Not After : Jan 17 00:00:00 2038 GMT
Subject: C = US, O = Amazon, CN = Amazon Root CA 1
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
RSA Public-Key: (2048 bit)
Modulus:
00:b2:78:80:71:ca:78:d5:e3:71:af:47:80:50:74:
7d:6e:d8:d7:88:76:f4:99:68:f7:58:21:60:f9:74:
84:01:2f:ac:02:2d:86:d3:a0:43:7a:4e:b2:a4:d0:
36:ba:01:be:8d:db:48:c8:07:17:36:4c:f4:ee:88:
23:c7:3e:eb:37:f5:b5:19:f8:49:68:b0:de:d7:b9:
76:38:1d:61:9e:a4:fe:82:36:a5:e5:4a:56:e4:45:
e1:f9:fd:b4:16:fa:74:da:9c:9b:35:39:2f:fa:b0:
20:50:06:6c:7a:d0:80:b2:a6:f9:af:ec:47:19:8f:
50:38:07:dc:a2:87:39:58:f8:ba:d5:a9:f9:48:67:
30:96:ee:94:78:5e:6f:89:a3:51:c0:30:86:66:a1:
45:66:ba:54:eb:a3:c3:91:f9:48:dc:ff:d1:e8:30:
2d:7d:2d:74:70:35:d7:88:24:f7:9e:c4:59:6e:bb:
73:87:17:f2:32:46:28:b8:43:fa:b7:1d:aa:ca:b4:
f2:9f:24:0e:2d:4b:f7:71:5c:5e:69:ff:ea:95:02:
cb:38:8a:ae:50:38:6f:db:fb:2d:62:1b:c5:c7:1e:
54:e1:77:e0:67:c8:0f:9c:87:23:d6:3f:40:20:7f:
20:80:c4:80:4c:3e:3b:24:26:8e:04:ae:6c:9a:c8:
aa:0d
Exponent: 65537 (0x10001)
X509v3 extensions:
X509v3 Basic Constraints: critical
CA:TRUE
X509v3 Key Usage: critical
Digital Signature, Certificate Sign, CRL Sign
X509v3 Subject Key Identifier:
84:18:CC:85:34:EC:BC:0C:94:94:2E:08:59:9C:C7:B2:10:4E:0A:08
Signature Algorithm: sha256WithRSAEncryption
98:f2:37:5a:41:90:a1:1a:c5:76:51:28:20:36:23:0e:ae:e6:
28:bb:aa:f8:94:ae:48:a4:30:7f:1b:fc:24:8d:4b:b4:c8:a1:
97:f6:b6:f1:7a:70:c8:53:93:cc:08:28:e3:98:25:cf:23:a4:
f9:de:21:d3:7c:85:09:ad:4e:9a:75:3a:c2:0b:6a:89:78:76:
44:47:18:65:6c:8d:41:8e:3b:7f:9a:cb:f4:b5:a7:50:d7:05:
2c:37:e8:03:4b:ad:e9:61:a0:02:6e:f5:f2:f0:c5:b2:ed:5b:
b7:dc:fa:94:5c:77:9e:13:a5:7f:52:ad:95:f2:f8:93:3b:de:
8b:5c:5b:ca:5a:52:5b:60:af:14:f7:4b:ef:a3:fb:9f:40:95:
6d:31:54:fc:42:d3:c7:46:1f:23:ad:d9:0f:48:70:9a:d9:75:
78:71:d1:72:43:34:75:6e:57:59:c2:02:5c:26:60:29:cf:23:
19:16:8e:88:43:a5:d4:e4:cb:08:fb:23:11:43:e8:43:29:72:
62:a1:a9:5d:5e:08:d4:90:ae:b8:d8:ce:14:c2:d0:55:f2:86:
f6:c4:93:43:77:66:61:c0:b9:e8:41:d7:97:78:60:03:6e:4a:
72:ae:a5:d1:7d:ba:10:9e:86:6c:1b:8a:b9:59:33:f8:eb:c4:
90:be:f1:b9
</code></pre></div>
<p>Now I'm pretty sure you want to kill me because I could have shown you this from the start. But I like to understand things, and the easy path doesn't always make everything clear. At any rate, here you have a way to read an X.509 certificate in PEM format.</p>
<p>Please note that in this certificate the <code>Issuer</code> and the <code>Subject</code> are the same entity, as this is a root certificate, which is signed by the same entity that creates it.</p>
<div class="highlight"><pre><span></span><code> Issuer: C = US, O = Amazon, CN = Amazon Root CA 1
[...]
Subject: C = US, O = Amazon, CN = Amazon Root CA 1
</code></pre></div>
<p>Moreover, one of the version 3 extensions of the self-signed certificate is a basic constraint with the boolean <code>CA</code> set to true. It also has the extension <code>Key Usage</code> set to <code>Digital Signature, Certificate Sign, CRL Sign</code>, which means that the certificate can be used to sign other certificates.</p>
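<p>In the ASN.1 dump the value of <code>Key Usage</code> appears as the octet string <code>03020186</code>, which is a DER <code>BIT STRING</code>: tag <code>03</code>, length <code>02</code>, one unused bit, and the content byte <code>86</code>. The following Python sketch shows how those bits map to the names listed in section 4.2.1.3 of RFC 5280. The function <code>decode_key_usage</code> is a helper I made up for this example and handles only this simple encoding.</p>

```python
# Bit names in the order defined by RFC 5280, section 4.2.1.3
KEY_USAGE_BITS = [
    "digitalSignature", "nonRepudiation", "keyEncipherment",
    "dataEncipherment", "keyAgreement", "keyCertSign",
    "cRLSign", "encipherOnly", "decipherOnly",
]


def decode_key_usage(der_bit_string):
    """Decode a DER BIT STRING (tag 03) holding a KeyUsage value."""
    tag, length, unused = der_bit_string[0], der_bit_string[1], der_bit_string[2]
    if tag != 0x03:
        raise ValueError("not a BIT STRING")
    content = der_bit_string[3:2 + length]
    # Bit 0 is the most significant bit of the first content byte
    bits = "".join(format(byte, "08b") for byte in content)
    if unused:
        bits = bits[:-unused]
    return [name for name, bit in zip(KEY_USAGE_BITS, bits) if bit == "1"]


print(decode_key_usage(bytes.fromhex("03020186")))
# ['digitalSignature', 'keyCertSign', 'cRLSign']
```

<p>The three names correspond to what OpenSSL prints as <code>Digital Signature, Certificate Sign, CRL Sign</code>.</p>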
<h2 id="example-self-signed-certificate">Example: self-signed certificate<a class="headerlink" href="#example-self-signed-certificate" title="Permanent link">¶</a></h2>
<p>You can use OpenSSL to create a self-signed certificate using the module <code>req</code> that you would normally use to create certificate requests. As a self-signed certificate doesn't need approval, the module can directly output the certificate.</p>
<div class="highlight"><pre><span></span><code>$<span class="w"> </span>openssl<span class="w"> </span>req<span class="w"> </span>-x509<span class="w"> </span>-newkey<span class="w"> </span>rsa:2048<span class="w"> </span>-keyout<span class="w"> </span>self-signed-key.pem<span class="w"> </span>-out<span class="w"> </span>self-signed.pem<span class="w"> </span>-days<span class="w"> </span><span class="m">365</span><span class="w"> </span>-nodes<span class="w"> </span>-subj<span class="w"> </span><span class="s1">'/CN=localhost'</span>
Generating<span class="w"> </span>a<span class="w"> </span>RSA<span class="w"> </span>private<span class="w"> </span>key
....+++++
................+++++
writing<span class="w"> </span>new<span class="w"> </span>private<span class="w"> </span>key<span class="w"> </span>to<span class="w"> </span><span class="s1">'self-signed-key.pem'</span>
-----
</code></pre></div>
<p>(note that for simplicity's sake I specified the option <code>-nodes</code> that prevents the key from being protected with a password, but this is a bad practice). This command creates the two files I mentioned, <code>self-signed-key.pem</code> (the private key) and <code>self-signed.pem</code> (the certificate).</p>
<p>We can read the certificate using the module <code>x509</code></p>
<div class="highlight"><pre><span></span><code>$ openssl x509 -inform pem -in self-signed.pem -noout -text
Certificate:
Data:
Version: 3 (0x2)
Serial Number:
46:e5:2f:8e:42:82:43:b8:ac:88:cb:6d:0c:2f:71:28:a9:fe:00:ec
Signature Algorithm: sha256WithRSAEncryption
Issuer: CN = localhost
Validity
Not Before: Nov 3 00:23:34 2020 GMT
Not After : Nov 3 00:23:34 2021 GMT
Subject: CN = localhost
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
RSA Public-Key: (2048 bit)
Modulus:
00:b7:14:ef:3b:eb:8b:a9:40:18:c5:d2:eb:1d:4f:
5d:e4:a3:17:f3:df:ce:b7:d3:3f:52:58:eb:61:02:
a2:68:0a:cd:0f:97:ae:e0:a5:ac:a7:88:cf:a1:15:
0a:97:ca:e7:03:8a:a5:c0:66:38:ef:bb:59:4d:48:
17:db:a7:bd:fa:4b:50:2a:be:e9:5b:bb:59:65:71:
dc:99:73:9c:bc:4d:3b:42:97:91:e9:3b:1a:8a:9d:
cc:41:38:ba:8b:8f:df:65:ff:5b:1f:ef:8a:b7:c5:
93:07:ce:15:4c:13:72:78:59:64:9a:5b:95:20:b6:
b3:8e:aa:c3:29:c3:7f:28:39:43:81:59:e4:0f:26:
7c:3f:49:d2:06:05:d9:54:ab:09:65:96:01:cc:c2:
72:be:85:1f:40:ea:94:35:04:09:9d:87:eb:a1:90:
36:ce:d2:55:f9:ee:08:db:52:78:e8:70:d0:25:89:
13:8e:0f:9d:98:98:d1:4d:67:06:8f:8a:61:9e:3a:
73:89:aa:0a:0a:1b:05:a7:52:32:ef:1b:78:5a:5f:
4b:b6:c9:a7:4e:15:10:04:50:99:00:09:2f:60:8e:
aa:20:af:6b:ee:f5:60:0b:29:da:38:1c:b2:73:14:
99:a4:ee:5e:89:e6:77:0b:ba:cf:d3:5d:d7:a3:ea:
c4:bf
Exponent: 65537 (0x10001)
X509v3 extensions:
X509v3 Subject Key Identifier:
64:7B:C1:FC:99:74:56:B7:82:D1:4F:E7:2D:94:77:1A:09:52:26:5C
X509v3 Authority Key Identifier:
keyid:64:7B:C1:FC:99:74:56:B7:82:D1:4F:E7:2D:94:77:1A:09:52:26:5C
X509v3 Basic Constraints: critical
CA:TRUE
Signature Algorithm: sha256WithRSAEncryption
43:7b:0b:c8:98:b8:6f:72:af:39:4a:d9:76:ce:e3:9d:3a:c7:
9f:14:b0:4f:20:0a:45:b3:b4:8c:e5:37:4c:bf:15:ad:8e:5c:
45:4f:3e:b7:ef:8d:60:57:bb:6f:d9:5e:6a:d3:04:05:4a:ff:
f2:66:b1:76:66:59:7e:24:89:0a:50:28:c9:d5:f5:7a:00:07:
8a:79:9c:6e:53:43:66:e5:9a:10:d8:f8:e1:f2:c1:f1:17:d0:
d2:9e:50:80:fe:2a:ca:08:b6:98:e9:b5:a4:82:23:31:45:35:
33:da:2c:e3:fe:54:f2:bd:f2:61:91:f4:32:e3:7d:4c:3a:e5:
3a:0f:cd:36:b0:8b:af:9f:8e:3d:0e:0b:a5:df:4a:3a:91:83:
b3:b2:5f:3c:47:81:73:4f:a2:c1:49:06:75:17:25:fa:5a:8d:
30:e5:55:7f:9c:3e:15:a8:b5:ab:f7:45:38:e3:76:8e:d4:0d:
60:fc:42:17:3d:85:72:41:1d:53:9d:58:b0:e9:29:0c:e4:6b:
14:c2:22:c4:d5:7b:de:36:da:df:d8:a0:4f:a4:0a:f2:3e:ca:
7e:66:a6:10:38:97:24:73:5b:db:eb:0b:6c:a8:f8:37:15:2c:
0e:b1:82:44:cc:fe:85:b0:cb:6c:26:4b:4a:70:33:dc:7e:f5:
84:ba:07:db
</code></pre></div>
<p>As you can see, this certificate has the same value in <code>Issuer</code> and <code>Subject</code>, as happened before for the Amazon Root one. It also has the flag <code>CA</code> set to true, but it doesn't have the extension <code>Key Usage</code>, which RFC 5280 (section 4.2.1.3) requires in certificates whose keys are used to sign other certificates.</p>
<h2 id="example-this-sites-certificate">Example: this site's certificate<a class="headerlink" href="#example-this-sites-certificate" title="Permanent link">¶</a></h2>
<p>You can see TLS certificates and the chain of trust in action in this very website. Following the documentation of your browser (instructions for Firefox are <a href="https://support.mozilla.org/en-US/kb/secure-website-certificate">here</a>), you can see the certificate used by The Digital Cat. At the time of writing the blog is hosted on GitHub Pages, even though I'm using a custom domain, and GitHub partnered with <a href="https://letsencrypt.org/">Let's Encrypt</a> to provide certificates for such a configuration (details <a href="https://github.blog/2018-05-01-github-pages-custom-domains-https/">here</a>).</p>
<p>Indeed, the certificate for <a href="https://www.thedigitalcatonline.com">thedigitalcatonline.com</a> is provided by "Let's Encrypt Authority X3", which in turn is trusted by Digital Signature Trust Co. with its root CA "DST Root CA X3".</p>
<p>Let's have a look at the three certificates. The one for The Digital Cat is</p>
<div class="highlight"><pre><span></span><code>Certificate:
Data:
Version: 3 (0x2)
Serial Number:
03:93:02:bb:9a:c9:ed:a5:c3:d1:16:00:8b:15:76:af:e5:d9
Signature Algorithm: sha256WithRSAEncryption
Issuer: C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
Validity
Not Before: Oct 22 04:53:28 2020 GMT
Not After : Jan 20 04:53:28 2021 GMT
Subject: CN = www.thedigitalcatonline.com
[...]
X509v3 extensions:
X509v3 Key Usage: critical
Digital Signature, Key Encipherment
X509v3 Extended Key Usage:
TLS Web Server Authentication, TLS Web Client Authentication
X509v3 Basic Constraints: critical
CA:FALSE
X509v3 Subject Key Identifier:
63:4E:15:85:56:5A:A4:94:02:C2:16:42:A4:A5:97:9A:38:02:57:97
X509v3 Authority Key Identifier:
keyid:A8:4A:6A:63:04:7D:DD:BA:E6:D1:39:B7:A6:45:65:EF:F3:A8:EC:A1
Authority Information Access:
OCSP - URI:http://ocsp.int-x3.letsencrypt.org
CA Issuers - URI:http://cert.int-x3.letsencrypt.org/
X509v3 Subject Alternative Name:
DNS:www.thedigitalcatonline.com
[...]
</code></pre></div>
<p>And you can see that this time the <code>Subject</code> is <code>www.thedigitalcatonline.com</code>, but the <code>Issuer</code> is <code>Let's Encrypt Authority X3</code>. The certificate provided by the organisation <code>Let's Encrypt</code> is</p>
<div class="highlight"><pre><span></span><code>Certificate:
Data:
Version: 3 (0x2)
Serial Number:
0a:01:41:42:00:00:01:53:85:73:6a:0b:85:ec:a7:08
Signature Algorithm: sha256WithRSAEncryption
Issuer: O = Digital Signature Trust Co., CN = DST Root CA X3
Validity
Not Before: Mar 17 16:40:46 2016 GMT
Not After : Mar 17 16:40:46 2021 GMT
Subject: C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
[...]
X509v3 extensions:
X509v3 Basic Constraints: critical
CA:TRUE, pathlen:0
X509v3 Key Usage: critical
Digital Signature, Certificate Sign, CRL Sign
Authority Information Access:
OCSP - URI:http://isrg.trustid.ocsp.identrust.com
CA Issuers - URI:http://apps.identrust.com/roots/dstrootcax3.p7c
X509v3 Authority Key Identifier:
keyid:C4:A7:B1:A4:7B:2C:71:FA:DB:E1:4B:90:75:FF:C4:15:60:85:89:10
X509v3 Certificate Policies:
Policy: 2.23.140.1.2.1
Policy: 1.3.6.1.4.1.44947.1.1.1
CPS: http://cps.root-x1.letsencrypt.org
X509v3 CRL Distribution Points:
Full Name:
URI:http://crl.identrust.com/DSTROOTCAX3CRL.crl
X509v3 Subject Key Identifier:
A8:4A:6A:63:04:7D:DD:BA:E6:D1:39:B7:A6:45:65:EF:F3:A8:EC:A1
[...]
</code></pre></div>
<p>Here, the <code>Subject</code> is <code>Let's Encrypt Authority X3</code> (the <code>Issuer</code> of the previous certificate), and the <code>Issuer</code> is <code>DST Root CA X3</code>. Last, the certificate provided by the organisation <code>Digital Signature Trust Co.</code> is</p>
<div class="highlight"><pre><span></span><code>Certificate:
Data:
Version: 3 (0x2)
Serial Number:
44:af:b0:80:d6:a3:27:ba:89:30:39:86:2e:f8:40:6b
Signature Algorithm: sha1WithRSAEncryption
Issuer: O = Digital Signature Trust Co., CN = DST Root CA X3
Validity
Not Before: Sep 30 21:12:19 2000 GMT
Not After : Sep 30 14:01:15 2021 GMT
Subject: O = Digital Signature Trust Co., CN = DST Root CA X3
[...]
X509v3 extensions:
X509v3 Basic Constraints: critical
CA:TRUE
X509v3 Key Usage: critical
Certificate Sign, CRL Sign
X509v3 Subject Key Identifier:
C4:A7:B1:A4:7B:2C:71:FA:DB:E1:4B:90:75:FF:C4:15:60:85:89:10
[...]
</code></pre></div>
<p>As happened for the certificate <code>Amazon Root CA 1</code> that we discussed before, this one is self-signed, having the same value for <code>Subject</code> and <code>Issuer</code>.</p>
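<p>The way the three certificates link to each other can be sketched in a few lines of Python. This is only an illustration of how <code>Issuer</code> and <code>Subject</code> chain together, using the names shown above: it doesn't check any signature, validity date, or key identifier, which a real validation has to do. The function <code>build_chain</code> and the dictionary layout are my own assumptions.</p>

```python
# Subject and Issuer copied from the three certificates shown above
certificates = [
    {
        "subject": "CN = www.thedigitalcatonline.com",
        "issuer": "C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3",
    },
    {
        "subject": "C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3",
        "issuer": "O = Digital Signature Trust Co., CN = DST Root CA X3",
    },
    {
        "subject": "O = Digital Signature Trust Co., CN = DST Root CA X3",
        "issuer": "O = Digital Signature Trust Co., CN = DST Root CA X3",
    },
]


def build_chain(leaf_subject, certs):
    """Follow the Issuer of each certificate up to a self-issued root."""
    by_subject = {cert["subject"]: cert for cert in certs}
    chain = []
    current = by_subject[leaf_subject]
    # A chain can't be longer than the number of certificates we know
    for _ in range(len(certs)):
        chain.append(current["subject"])
        if current["issuer"] == current["subject"]:
            return chain  # Issuer equals Subject: we reached the root
        current = by_subject.get(current["issuer"])
        if current is None:
            raise LookupError("issuer certificate not available")
    raise LookupError("no self-issued root found")


for subject in build_chain("CN = www.thedigitalcatonline.com", certificates):
    print(subject)
```

<p>The same link is visible in the key identifiers: the <code>Authority Key Identifier</code> of each certificate is the <code>Subject Key Identifier</code> of the one that signed it.</p>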
<h2 id="how-to-verify-certificates-with-openssl">How to verify certificates with OpenSSL<a class="headerlink" href="#how-to-verify-certificates-with-openssl" title="Permanent link">¶</a></h2>
<p>To verify if a certificate is valid we can use the module <code>verify</code> of OpenSSL. By default, OpenSSL doesn't trust anything, and <code>verify</code> relies on a default path in the system to find root certificates. You can see the path running</p>
<div class="highlight"><pre><span></span><code>$<span class="w"> </span>openssl<span class="w"> </span>version<span class="w"> </span>-d
OPENSSLDIR:<span class="w"> </span><span class="s2">"/usr/lib/ssl"</span>
</code></pre></div>
<p>On Ubuntu 20.04, the directory <code>/usr/lib/ssl/certs</code> is a symbolic link to <code>/etc/ssl/certs</code>, which is installed by the package <a href="https://packages.ubuntu.com/focal/ca-certificates"><code>ca-certificates</code></a>, which in turn is linked to Mozilla's CA Certificate Program (details on that package can be found in the <a href="https://salsa.debian.org/debian/ca-certificates/-/blob/master/debian/README.Debian">source code</a>).</p>
<p>So, if a root certificate is included in the Mozilla program, it is trusted by OpenSSL</p>
<div class="highlight"><pre><span></span><code>$<span class="w"> </span>openssl<span class="w"> </span>verify<span class="w"> </span>amazon-root-ca-1.pem<span class="w"> </span>
amazon-root-ca-1.pem:<span class="w"> </span>OK
</code></pre></div>
<p>while a self-signed certificate is not</p>
<div class="highlight"><pre><span></span><code>$<span class="w"> </span>openssl<span class="w"> </span>verify<span class="w"> </span>self-signed.pem<span class="w"> </span>
<span class="nv">CN</span><span class="w"> </span><span class="o">=</span><span class="w"> </span>localhost
error<span class="w"> </span><span class="m">18</span><span class="w"> </span>at<span class="w"> </span><span class="m">0</span><span class="w"> </span>depth<span class="w"> </span>lookup:<span class="w"> </span>self<span class="w"> </span>signed<span class="w"> </span>certificate
error<span class="w"> </span>self-signed.pem:<span class="w"> </span>verification<span class="w"> </span>failed
</code></pre></div>
<p>A non-root certificate can be verified specifying which root certificate signed it. So, the certificate for this website is not trusted automatically</p>
<div class="highlight"><pre><span></span><code>$<span class="w"> </span>openssl<span class="w"> </span>verify<span class="w"> </span>www-thedigitalcatonline-com.pem<span class="w"> </span>
<span class="nv">CN</span><span class="w"> </span><span class="o">=</span><span class="w"> </span>www.thedigitalcatonline.com
error<span class="w"> </span><span class="m">20</span><span class="w"> </span>at<span class="w"> </span><span class="m">0</span><span class="w"> </span>depth<span class="w"> </span>lookup:<span class="w"> </span>unable<span class="w"> </span>to<span class="w"> </span>get<span class="w"> </span><span class="nb">local</span><span class="w"> </span>issuer<span class="w"> </span>certificate
error<span class="w"> </span>www-thedigitalcatonline-com.pem:<span class="w"> </span>verification<span class="w"> </span>failed
</code></pre></div>
<p>But it is verified when we specify the certificate for Let's Encrypt that signed it</p>
<div class="highlight"><pre><span></span><code>$<span class="w"> </span>openssl<span class="w"> </span>verify<span class="w"> </span>-CAfile<span class="w"> </span>lets-encrypt-x3.pem<span class="w"> </span>www-thedigitalcatonline-com.pem<span class="w"> </span>
www-thedigitalcatonline-com.pem:<span class="w"> </span>OK
</code></pre></div>
<p>because the certificate <code>lets-encrypt-x3.pem</code> is signed by <code>DST_Root_CA_X3.pem</code> which is included in the Mozilla program, and thus included in my Linux distribution.</p>
<p>If I remove the default certificates path OpenSSL doesn't accept the certificate for Let's Encrypt any more</p>
<div class="highlight"><pre><span></span><code>$<span class="w"> </span>openssl<span class="w"> </span>verify<span class="w"> </span>-no-CApath<span class="w"> </span>-CAfile<span class="w"> </span>lets-encrypt-x3.pem<span class="w"> </span>www-thedigitalcatonline-com.pem
<span class="nv">C</span><span class="w"> </span><span class="o">=</span><span class="w"> </span>US,<span class="w"> </span><span class="nv">O</span><span class="w"> </span><span class="o">=</span><span class="w"> </span>Let<span class="s1">'s Encrypt, CN = Let'</span>s<span class="w"> </span>Encrypt<span class="w"> </span>Authority<span class="w"> </span>X3
error<span class="w"> </span><span class="m">2</span><span class="w"> </span>at<span class="w"> </span><span class="m">1</span><span class="w"> </span>depth<span class="w"> </span>lookup:<span class="w"> </span>unable<span class="w"> </span>to<span class="w"> </span>get<span class="w"> </span>issuer<span class="w"> </span>certificate
error<span class="w"> </span>www-thedigitalcatonline-com.pem:<span class="w"> </span>verification<span class="w"> </span>failed
</code></pre></div>
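<p>The default locations used to find root certificates can also be inspected from Python, whose standard module <code>ssl</code> is built on top of OpenSSL. Keep in mind that the Python interpreter might be linked against a different OpenSSL build than the command <code>openssl</code> installed in the system, so the values printed here are not guaranteed to match the output of <code>openssl version -d</code>.</p>

```python
import ssl

# Default locations compiled into the OpenSSL library used by this Python
paths = ssl.get_default_verify_paths()

print("CA file:", paths.openssl_cafile)
print("CA directory:", paths.openssl_capath)

# Names of the environment variables that can override those defaults
print(paths.openssl_cafile_env, paths.openssl_capath_env)
```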
<h2 id="low-level-certificate-validation-process">Low-level certificate validation process<a class="headerlink" href="#low-level-certificate-validation-process" title="Permanent link">¶</a></h2>
<p>Let's have a look at the signature process for X.509 certificates. The process depends on the specific algorithm used to sign the certificate, so I will use the certificate <code>Amazon Root CA 1</code> as an example, leaving the investigation of other algorithms to the reader.</p>
<p>A signed certificate is made of two parts, the certificate itself and the signature. The signature contains an encrypted hash of the certificate, so the verification is done in three steps:</p>
<ol>
<li>Decrypt the encrypted hash using the public key</li>
<li>Compute the hash of the certificate using the same algorithm</li>
<li>Compare the hashes</li>
</ol>
<p>For the Amazon root certificate, we know the signature algorithm and value from the output of <code>openssl x509</code></p>
<div class="highlight"><pre><span></span><code>$ openssl x509 -inform pem -in amazon-root-ca-1.pem -noout -text
[...]
Signature Algorithm: sha256WithRSAEncryption
98:f2:37:5a:41:90:a1:1a:c5:76:51:28:20:36:23:0e:ae:e6:
28:bb:aa:f8:94:ae:48:a4:30:7f:1b:fc:24:8d:4b:b4:c8:a1:
97:f6:b6:f1:7a:70:c8:53:93:cc:08:28:e3:98:25:cf:23:a4:
f9:de:21:d3:7c:85:09:ad:4e:9a:75:3a:c2:0b:6a:89:78:76:
44:47:18:65:6c:8d:41:8e:3b:7f:9a:cb:f4:b5:a7:50:d7:05:
2c:37:e8:03:4b:ad:e9:61:a0:02:6e:f5:f2:f0:c5:b2:ed:5b:
b7:dc:fa:94:5c:77:9e:13:a5:7f:52:ad:95:f2:f8:93:3b:de:
8b:5c:5b:ca:5a:52:5b:60:af:14:f7:4b:ef:a3:fb:9f:40:95:
6d:31:54:fc:42:d3:c7:46:1f:23:ad:d9:0f:48:70:9a:d9:75:
78:71:d1:72:43:34:75:6e:57:59:c2:02:5c:26:60:29:cf:23:
19:16:8e:88:43:a5:d4:e4:cb:08:fb:23:11:43:e8:43:29:72:
62:a1:a9:5d:5e:08:d4:90:ae:b8:d8:ce:14:c2:d0:55:f2:86:
f6:c4:93:43:77:66:61:c0:b9:e8:41:d7:97:78:60:03:6e:4a:
72:ae:a5:d1:7d:ba:10:9e:86:6c:1b:8a:b9:59:33:f8:eb:c4:
90:be:f1:b9
</code></pre></div>
<p>You can see the binary values of the signed certificate with <code>cat amazon-root-ca-1.pem | tail -n+2 | head -n-1 | base64 -di | hexdump -ve '/1 "%02x "' -e '2/8 "\n"'</code>. While we can recognise the signature in the last 256 bytes, we can't easily isolate the bytes that contain the signature algorithm. If we open the signed certificate with an ASN.1 parser, instead, we can easily find the binary value of the certificate part</p>
<div class="highlight"><pre><span></span><code>30 82 03 41 30 82 02 29 a0 03 02 01 02 02 13 06
6c 9f cf 99 bf 8c 0a 39 e2 f0 78 8a 43 e6 96 36
5b ca 30 0d 06 09 2a 86 48 86 f7 0d 01 01 0b 05
00 30 39 31 0b 30 09 06 03 55 04 06 13 02 55 53
31 0f 30 0d 06 03 55 04 0a 13 06 41 6d 61 7a 6f
6e 31 19 30 17 06 03 55 04 03 13 10 41 6d 61 7a
6f 6e 20 52 6f 6f 74 20 43 41 20 31 30 1e 17 0d
31 35 30 35 32 36 30 30 30 30 30 30 5a 17 0d 33
38 30 31 31 37 30 30 30 30 30 30 5a 30 39 31 0b
30 09 06 03 55 04 06 13 02 55 53 31 0f 30 0d 06
03 55 04 0a 13 06 41 6d 61 7a 6f 6e 31 19 30 17
06 03 55 04 03 13 10 41 6d 61 7a 6f 6e 20 52 6f
6f 74 20 43 41 20 31 30 82 01 22 30 0d 06 09 2a
86 48 86 f7 0d 01 01 01 05 00 03 82 01 0f 00 30
82 01 0a 02 82 01 01 00 b2 78 80 71 ca 78 d5 e3
71 af 47 80 50 74 7d 6e d8 d7 88 76 f4 99 68 f7
58 21 60 f9 74 84 01 2f ac 02 2d 86 d3 a0 43 7a
4e b2 a4 d0 36 ba 01 be 8d db 48 c8 07 17 36 4c
f4 ee 88 23 c7 3e eb 37 f5 b5 19 f8 49 68 b0 de
d7 b9 76 38 1d 61 9e a4 fe 82 36 a5 e5 4a 56 e4
45 e1 f9 fd b4 16 fa 74 da 9c 9b 35 39 2f fa b0
20 50 06 6c 7a d0 80 b2 a6 f9 af ec 47 19 8f 50
38 07 dc a2 87 39 58 f8 ba d5 a9 f9 48 67 30 96
ee 94 78 5e 6f 89 a3 51 c0 30 86 66 a1 45 66 ba
54 eb a3 c3 91 f9 48 dc ff d1 e8 30 2d 7d 2d 74
70 35 d7 88 24 f7 9e c4 59 6e bb 73 87 17 f2 32
46 28 b8 43 fa b7 1d aa ca b4 f2 9f 24 0e 2d 4b
f7 71 5c 5e 69 ff ea 95 02 cb 38 8a ae 50 38 6f
db fb 2d 62 1b c5 c7 1e 54 e1 77 e0 67 c8 0f 9c
87 23 d6 3f 40 20 7f 20 80 c4 80 4c 3e 3b 24 26
8e 04 ae 6c 9a c8 aa 0d 02 03 01 00 01 a3 42 30
40 30 0f 06 03 55 1d 13 01 01 ff 04 05 30 03 01
01 ff 30 0e 06 03 55 1d 0f 01 01 ff 04 04 03 02
01 86 30 1d 06 03 55 1d 0e 04 16 04 14 84 18 cc
85 34 ec bc 0c 94 94 2e 08 59 9c c7 b2 10 4e 0a
08
</code></pre></div>
<p>The signature algorithm part is</p>
<div class="highlight"><pre><span></span><code>30 0d 06 09 2a 86 48 86 f7 0d 01 01 0b 05 00
03 82 01 01 00
</code></pre></div>
<p>and the ASN.1 parser tells us that those bytes contain an <code>OBJECT IDENTIFIER</code> whose value is <code>1.2.840.113549.1.1.11</code>. Now, object identifiers are not complicated per se, they are just a way to identify algorithms and other well known components in ASN.1 structures. The <a href="https://tools.ietf.org/html/rfc5280#section-4.1.1.2">description of the field</a> <code>signatureAlgorithm</code> of an X.509 certificate mentions three other RFCs that contain descriptions of the available algorithms. In particular, <a href="https://tools.ietf.org/html/rfc4055#section-2.1">RFC 4055</a> assigns this value to the combination of RSA and SHA-256, and contains the description of the one-way hash functions that can be used with PKCS #1, one of which is</p>
<div class="highlight"><pre><span></span><code>id-sha256 OBJECT IDENTIFIER ::= { joint-iso-itu-t(2)
country(16) us(840) organization(1) gov(101)
csor(3) nistalgorithm(4) hashalgs(2) 1 }
</code></pre></div>
<p>You can see the values in the object identifier between parentheses. Since these are PKCS #1 (a.k.a. RSA) hash functions, OpenSSL identifies it as <code>sha256WithRSAEncryption</code> (see again <a href="https://tools.ietf.org/html/rfc4055#section-5">RFC 4055</a>).</p>
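<p>To see how the dotted notation relates to raw bytes, here is a minimal Python sketch that decodes the content octets of a DER-encoded <code>OBJECT IDENTIFIER</code>. The function name <code>decode_oid</code> is just an illustrative choice, and the input is assumed to be the content octets only (<code>60 86 48 01 65 03 04 02 01</code> for <code>id-sha256</code>), that is without the leading tag <code>06</code> and length <code>09</code>.</p>

```python
def decode_oid(content: bytes) -> str:
    """Decode the content octets of a DER OBJECT IDENTIFIER into dotted notation."""
    # Split the octets into base-128 numbers: the high bit of each octet
    # signals that more octets follow for the same number.
    numbers = []
    value = 0
    for octet in content:
        value = (value << 7) | (octet & 0x7F)
        if not octet & 0x80:
            numbers.append(value)
            value = 0

    # The first number packs the first two arcs together (X * 40 + Y).
    first = numbers[0]
    if first < 40:
        arcs = [0, first]
    elif first < 80:
        arcs = [1, first - 40]
    else:
        arcs = [2, first - 80]

    return ".".join(str(arc) for arc in arcs + numbers[1:])


# Content octets of id-sha256 (hypothetical variable name)
id_sha256 = bytes.fromhex("608648016503040201")
print(decode_oid(id_sha256))
```

<p>The two bytes <code>86 48</code> are a single number spread over two octets (6 * 128 + 72 = 840), which is why the dotted form has fewer components than the encoding has bytes.</p>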
<p>RSA encryption is described in <a href="https://tools.ietf.org/html/rfc2313">RFC 2313</a> ("PKCS #1: RSA Encryption Version 1.5") and the signature algorithm based on RSA is described there in <a href="https://tools.ietf.org/html/rfc2313#section-10">section 10</a>. In particular, section 10.2 details the verification process, which is the one we are interested in. The steps are</p>
<ul>
<li>Bit-string-to-octet-string conversion of the signature</li>
<li>RSA decryption</li>
<li>Digest decoding (ASN.1)</li>
<li>Message digesting and comparison</li>
</ul>
<p>As for the signature conversion, the sentence</p>
<div class="highlight"><pre><span></span><code>Specifically, assuming that the length in bits of the
signature S is a multiple of eight, the first bit of the signature
shall become the most significant bit of the first octet of the
encrypted data, and so on through the last bit of the signature,
which shall become the least significant bit of the last octet of the
encrypted data.
</code></pre></div>
<p>is a very verbose way to say that the signature is big-endian.</p>
<p>So, the hexadecimal value of the signature is</p>
<div class="highlight"><pre><span></span><code>98f2375a4190a11ac57651282036230eaee628bbaaf894ae48a4307f1bfc248d
4bb4c8a197f6b6f17a70c85393cc0828e39825cf23a4f9de21d37c8509ad4e9a
753ac20b6a897876444718656c8d418e3b7f9acbf4b5a750d7052c37e8034bad
e961a0026ef5f2f0c5b2ed5bb7dcfa945c779e13a57f52ad95f2f8933bde8b5c
5bca5a525b60af14f74befa3fb9f40956d3154fc42d3c7461f23add90f48709a
d9757871d1724334756e5759c2025c266029cf2319168e8843a5d4e4cb08fb23
1143e843297262a1a95d5e08d490aeb8d8ce14c2d055f286f6c49343776661c0
b9e841d7977860036e4a72aea5d17dba109e866c1b8ab95933f8ebc490bef1b9
</code></pre></div>
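<p>As a small illustration of the big-endian conversion, this Python sketch turns the first four bytes of the signature above into an integer and back. The variable names are illustrative only; the full signature is converted in exactly the same way.</p>

```python
# First four bytes of the signature shown above
signature_start = bytes.fromhex("98f2375a")

# Big-endian: the first byte is the most significant one
as_integer = int.from_bytes(signature_start, "big")
print(hex(as_integer))

# The conversion is reversible as long as we remember the length in bytes
round_trip = as_integer.to_bytes(len(signature_start), "big")
print(round_trip == signature_start)
```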
<p>And reading the field <code>Subject Public Key Info</code> of the certificate we find the public key. Remember that this is a root certificate, so it is signed using the same key that it contains, which is not true in general.</p>
<p>The public key's modulus is</p>
<div class="highlight"><pre><span></span><code>b2788071ca78d5e371af478050747d6ed8d78876f49968f7582160f97484012f
ac022d86d3a0437a4eb2a4d036ba01be8ddb48c80717364cf4ee8823c73eeb37
f5b519f84968b0ded7b976381d619ea4fe8236a5e54a56e445e1f9fdb416fa74
da9c9b35392ffab02050066c7ad080b2a6f9afec47198f503807dca2873958f8
bad5a9f948673096ee94785e6f89a351c0308666a14566ba54eba3c391f948dc
ffd1e8302d7d2d747035d78824f79ec4596ebb738717f2324628b843fab71daa
cab4f29f240e2d4bf7715c5e69ffea9502cb388aae50386fdbfb2d621bc5c71e
54e177e067c80f9c8723d63f40207f2080c4804c3e3b24268e04ae6c9ac8aa0d
</code></pre></div>
<p>and the exponent is <code>0x10001</code> (default choice).</p>
<p>RSA public-key signature decryption is performed with <code>signature ^ exponent mod modulus</code>, and this operation returns</p>
<div class="highlight"><pre><span></span><code>1fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
fffffffffffffffffffff003031300d0609608648016503040201050004206fc
4b8ac3d2b52c08baf56255e43d22c762962e4facab01ace16d48ec008be0a
</code></pre></div>
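<p>The same modular exponentiation can be tried on numbers small enough to check by hand. The following is a toy sketch, assuming textbook RSA with two tiny primes and no padding; the values are purely illustrative and have nothing to do with the certificate above (and are obviously not secure).</p>

```python
# Toy key pair built from two small primes (illustrative values only)
p, q = 61, 53
modulus = p * q
phi = (p - 1) * (q - 1)
public_exponent = 17
# Modular inverse of the public exponent (requires Python 3.8+)
private_exponent = pow(public_exponent, -1, phi)

# The signer raises the value to the private exponent...
message_digest = 65
signature = pow(message_digest, private_exponent, modulus)

# ...and anyone can undo that with the public exponent
recovered = pow(signature, public_exponent, modulus)
print(recovered == message_digest)
```

<p>The last call to <code>pow</code> is the same operation <code>signature ^ exponent mod modulus</code> performed on the certificate, just with a 12-bit modulus instead of a 2048-bit one.</p>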
<p>Once the padding is removed, we are left with an ASN.1 binary structure that represents the digest</p>
<div class="highlight"><pre><span></span><code>DigestInfo ::= SEQUENCE {
digestAlgorithm DigestAlgorithmIdentifier,
digest Digest }
</code></pre></div>
<p>(see <a href="https://tools.ietf.org/html/rfc2313#section-10.1.2">RFC 2313 - Section 10.1.2</a>)</p>
<p>The value of <code>Digest</code> can be extracted with an ASN.1 parser or by taking the last 256 bits and is <code>6fc4b8ac3d2b52c08baf56255e43d22c762962e4facab01ace16d48ec008be0a</code>.</p>
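<p>Removing the padding can be sketched in a few lines of Python. This is a minimal example under some assumptions: it handles only SHA-256 with a fixed <code>DigestInfo</code> prefix (the bytes <code>30 31 30 0d ... 04 20</code> visible in the output above) instead of using a real ASN.1 parser, and the names <code>extract_sha256_digest</code> and <code>SHA256_DIGEST_INFO_PREFIX</code> are hypothetical. The input block is rebuilt here from the digest so that the example is self-contained.</p>

```python
# DER prefix of a DigestInfo structure for SHA-256: everything up to the
# 32-byte digest itself (assumed fixed, as in the output shown above).
SHA256_DIGEST_INFO_PREFIX = bytes.fromhex("3031300d060960864801650304020105000420")


def extract_sha256_digest(block: bytes) -> bytes:
    """Strip the PKCS #1 v1.5 signature padding and return the SHA-256 digest."""
    # Expected layout: 00 01 FF ... FF 00 DigestInfo
    if block[:2] != b"\x00\x01":
        raise ValueError("unexpected block type")
    separator = block.index(b"\x00", 2)
    padding = block[2:separator]
    if len(padding) < 8 or set(padding) != {0xFF}:
        raise ValueError("invalid padding")
    digest_info = block[separator + 1:]
    expected_length = len(SHA256_DIGEST_INFO_PREFIX) + 32
    if len(digest_info) != expected_length:
        raise ValueError("unexpected DigestInfo length")
    if not digest_info.startswith(SHA256_DIGEST_INFO_PREFIX):
        raise ValueError("not a SHA-256 DigestInfo")
    return digest_info[len(SHA256_DIGEST_INFO_PREFIX):]


# Rebuild the 256-byte block of a 2048-bit key around the digest shown in the post
digest = bytes.fromhex("6fc4b8ac3d2b52c08baf56255e43d22c762962e4facab01ace16d48ec008be0a")
padding_length = 256 - 3 - len(SHA256_DIGEST_INFO_PREFIX) - len(digest)
block = b"\x00\x01" + b"\xff" * padding_length + b"\x00" + SHA256_DIGEST_INFO_PREFIX + digest

print(extract_sha256_digest(block).hex())
```

<p>When starting from the integer returned by the modular exponentiation, the leading zero byte is lost, so the block has to be restored to the full key length first, for example with <code>int.to_bytes(256, "big")</code> for a 2048-bit key.</p>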
<p>At this point we need to process the certificate bytes (without signature) with the SHA-256 hash function and we will find a matching value of <code>6fc4b8ac3d2b52c08baf56255e43d22c762962e4facab01ace16d48ec008be0a</code>.</p>
<p>This process (for the specific case of this certificate) can be easily done in Python</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">cryptography</span> <span class="kn">import</span> <span class="n">x509</span>
<span class="kn">from</span> <span class="nn">hashlib</span> <span class="kn">import</span> <span class="n">sha256</span>
<span class="n">certificate_pem_file</span> <span class="o">=</span> <span class="s2">"amazon-root-ca-1.pem"</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">certificate_pem_file</span><span class="p">,</span> <span class="s2">"rb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="n">certificate_pem</span> <span class="o">=</span> <span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
<span class="n">certificate</span> <span class="o">=</span> <span class="n">x509</span><span class="o">.</span><span class="n">load_pem_x509_certificate</span><span class="p">(</span><span class="n">certificate_pem</span><span class="p">)</span>
<span class="n">modulus</span> <span class="o">=</span> <span class="n">certificate</span><span class="o">.</span><span class="n">public_key</span><span class="p">()</span><span class="o">.</span><span class="n">public_numbers</span><span class="p">()</span><span class="o">.</span><span class="n">n</span>
<span class="n">exponent</span> <span class="o">=</span> <span class="n">certificate</span><span class="o">.</span><span class="n">public_key</span><span class="p">()</span><span class="o">.</span><span class="n">public_numbers</span><span class="p">()</span><span class="o">.</span><span class="n">e</span>
<span class="n">signature</span> <span class="o">=</span> <span class="nb">int</span><span class="o">.</span><span class="n">from_bytes</span><span class="p">(</span><span class="n">certificate</span><span class="o">.</span><span class="n">signature</span><span class="p">,</span> <span class="s2">"big"</span><span class="p">)</span>
<span class="n">verification</span> <span class="o">=</span> <span class="nb">pow</span><span class="p">(</span><span class="n">signature</span><span class="p">,</span> <span class="n">exponent</span><span class="p">,</span> <span class="n">modulus</span><span class="p">)</span>
<span class="n">digest</span> <span class="o">=</span> <span class="nb">bytes</span><span class="p">()</span><span class="o">.</span><span class="n">fromhex</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="nb">hex</span><span class="p">(</span><span class="n">verification</span><span class="p">))[</span><span class="o">-</span><span class="mi">64</span><span class="p">:])</span>
<span class="n">calculated_digest</span> <span class="o">=</span> <span class="n">sha256</span><span class="p">(</span><span class="n">certificate</span><span class="o">.</span><span class="n">tbs_certificate_bytes</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">digest</span><span class="o">.</span><span class="n">hex</span><span class="p">()</span> <span class="o">==</span> <span class="n">calculated_digest</span><span class="o">.</span><span class="n">hexdigest</span><span class="p">())</span>
</code></pre></div>
<p>This is arguably not the best Python code ever, but it's a simple way to demonstrate the process. As I said, this is far from being general, as it assumes the signature is <code>sha256WithRSAEncryption</code>, which might not be the case.</p>
<p>What I showed you here is what happens when we validate a root certificate. When we validate a non-root certificate the process is exactly the same (taking into account that the algorithms involved might be different), only the public key used to verify the signature doesn't come from the certificate itself, but from the signer's certificate. So, in the case of this blog, the certificate for www.thedigitalcat.com has a signature created with the private key of Let's Encrypt, which can be verified with the public key contained in the Let's Encrypt certificate. In turn, the certificate for Let's Encrypt is signed using the private key of Digital Signature Trust Co. This is what creates the chain of trust.</p>
<h2 id="algorithms-used-by-root-certificates">Algorithms used by root certificates<a class="headerlink" href="#algorithms-used-by-root-certificates" title="Permanent link">¶</a></h2>
<p>A quick scan of the certificates that are part of the Mozilla program reveals that the vast majority of them use RSA to self-sign</p>
<div class="highlight"><pre><span></span><code>$<span class="w"> </span><span class="k">for</span><span class="w"> </span>i<span class="w"> </span><span class="k">in</span><span class="w"> </span>/etc/ssl/certs/*.pem<span class="p">;</span><span class="w"> </span><span class="k">do</span><span class="w"> </span>openssl<span class="w"> </span>x509<span class="w"> </span>-inform<span class="w"> </span>pem<span class="w"> </span>-in<span class="w"> </span><span class="si">${</span><span class="nv">i</span><span class="si">}</span><span class="w"> </span>-noout<span class="w"> </span>-text<span class="w"> </span><span class="p">|</span><span class="w"> </span>grep<span class="w"> </span>-E<span class="w"> </span><span class="s2">"Public Key Algorithm"</span><span class="p">;</span><span class="w"> </span><span class="k">done</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>sort<span class="w"> </span><span class="p">|</span><span class="w"> </span>uniq<span class="w"> </span>-c
<span class="w"> </span><span class="m">25</span><span class="w"> </span>Public<span class="w"> </span>Key<span class="w"> </span>Algorithm:<span class="w"> </span>id-ecPublicKey
<span class="w"> </span><span class="m">114</span><span class="w"> </span>Public<span class="w"> </span>Key<span class="w"> </span>Algorithm:<span class="w"> </span>rsaEncryption
</code></pre></div>
<p>while some of them use <code>id-ecPublicKey</code>, which is the identifier of elliptic curve algorithms.</p>
<p>When it comes to signature algorithms, instead, there is more variety</p>
<div class="highlight"><pre><span></span><code>$<span class="w"> </span><span class="k">for</span><span class="w"> </span>i<span class="w"> </span><span class="k">in</span><span class="w"> </span>/etc/ssl/certs/*.pem<span class="p">;</span><span class="w"> </span><span class="k">do</span><span class="w"> </span>openssl<span class="w"> </span>x509<span class="w"> </span>-inform<span class="w"> </span>pem<span class="w"> </span>-in<span class="w"> </span><span class="si">${</span><span class="nv">i</span><span class="si">}</span><span class="w"> </span>-noout<span class="w"> </span>-text<span class="w"> </span><span class="p">|</span><span class="w"> </span>grep<span class="w"> </span>-E<span class="w"> </span><span class="s2">"^ Signature Algorithm"</span><span class="p">;</span><span class="w"> </span><span class="k">done</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>sort<span class="w"> </span><span class="p">|</span><span class="w"> </span>uniq<span class="w"> </span>-c
<span class="w"> </span><span class="m">7</span><span class="w"> </span>Signature<span class="w"> </span>Algorithm:<span class="w"> </span>ecdsa-with-SHA256
<span class="w"> </span><span class="m">18</span><span class="w"> </span>Signature<span class="w"> </span>Algorithm:<span class="w"> </span>ecdsa-with-SHA384
<span class="w"> </span><span class="m">47</span><span class="w"> </span>Signature<span class="w"> </span>Algorithm:<span class="w"> </span>sha1WithRSAEncryption
<span class="w"> </span><span class="m">57</span><span class="w"> </span>Signature<span class="w"> </span>Algorithm:<span class="w"> </span>sha256WithRSAEncryption
<span class="w"> </span><span class="m">9</span><span class="w"> </span>Signature<span class="w"> </span>Algorithm:<span class="w"> </span>sha384WithRSAEncryption
<span class="w"> </span><span class="m">1</span><span class="w"> </span>Signature<span class="w"> </span>Algorithm:<span class="w"> </span>sha512WithRSAEncryption
</code></pre></div>
<p>Even here, elliptic curves are slowly being adopted.</p>
<h2 id="aws-components-related-to-certificates">AWS components related to certificates<a class="headerlink" href="#aws-components-related-to-certificates" title="Permanent link">¶</a></h2>
<p>If you are using AWS, you can create certificates with ACM, the <a href="https://aws.amazon.com/certificate-manager/">AWS Certificate Manager</a>. Such certificates cannot be downloaded, they can only be attached to other AWS components. For this reason, the generation process doesn't require you to create any request, as you might have to do with other authorities. Certificates created in the ACM are free.</p>
<p>Certificates created in the ACM can be attached to several AWS components, most notably <a href="https://aws.amazon.com/documentation/elastic-load-balancing/">Load Balancers</a>, <a href="https://aws.amazon.com/documentation/cloudfront/">CloudFront</a>, and <a href="https://aws.amazon.com/documentation/apigateway/">API Gateway</a>.</p>
<p>Traditionally, load balancers are the place where TLS is terminated for HTTPS, requiring a connection to port 443. While <a href="https://docs.aws.amazon.com/elasticloadbalancing/latest/application/index.html">Application Load Balancers</a> can do that, in 2019 AWS <a href="https://aws.amazon.com/blogs/aws/new-tls-termination-for-network-load-balancers/">announced</a> support for certificates in <a href="https://docs.aws.amazon.com/elasticloadbalancing/latest/network/index.html">Network Load Balancers</a> as well.</p>
<h3 id="lets-encrypt">Let's Encrypt<a class="headerlink" href="#lets-encrypt" title="Permanent link">¶</a></h3>
<p>In an effort to push for HTTP encryption of any public server, the Internet Security Research Group founded in 2016 a non-profit CA named <a href="https://letsencrypt.org/">Let's Encrypt</a>, which provides TLS certificates valid for 90 days at no charge. Such certificates can be renewed automatically as part of the setup (<a href="https://certbot.org/">certbot</a>) and represent a viable alternative to certificates issued by other CAs, in particular for open source projects. This blog uses a certificate issued by Let's Encrypt (provided by GitHub Pages), which will thus expire in less than 3 months (but will also be renewed automatically).</p>
<h2 id="final-words">Final words<a class="headerlink" href="#final-words" title="Permanent link">¶</a></h2>
<p>I hope this post helped to clarify some of the most obscure points of certificates, which definitely bugged me when I first approached them. As always when standards are involved, the risk is to get lost in the myriad of documents where information is scattered, and not to realise that some (if not many) parts of the systems we run every day have a long history and thus a big burden of legacy code or nomenclature.</p>
<h2 id="resources">Resources<a class="headerlink" href="#resources" title="Permanent link">¶</a></h2>
<ul>
<li>The Wikipedia article on <a href="https://en.wikipedia.org/wiki/Transport_Layer_Security#SSL_1.0,_2.0,_and_3.0">TLS</a></li>
<li>The Wikipedia article on <a href="https://en.wikipedia.org/wiki/Certificate_authority">Certificate authority</a></li>
<li>The Wikipedia article on <a href="https://en.wikipedia.org/wiki/X.509">X.509</a></li>
<li>The Wikipedia article on <a href="https://en.wikipedia.org/wiki/Let%27s_Encrypt">Let's Encrypt</a></li>
<li>OpenSSL documentation: <a href="https://www.openssl.org/docs/man1.1.0/apps/asn1parse.html">asn1parse</a>, <a href="https://www.openssl.org/docs/man1.1.1/man1/x509.html">x509</a>, <a href="https://www.openssl.org/docs/man1.1.1/man1/verify.html">verify</a></li>
<li>The Abstract Syntax Notation One <a href="https://en.wikipedia.org/wiki/Abstract_Syntax_Notation_One">ASN.1</a> interface description language</li>
<li><a href="https://tools.ietf.org/html/rfc2313">RFC 2313</a> - "PKCS #1: RSA Encryption Version 1.5"</li>
<li><a href="https://tools.ietf.org/html/rfc2459">RFC 2459</a> - "Internet X.509 Public Key Infrastructure Certificate and CRL Profile"</li>
<li><a href="https://tools.ietf.org/html/rfc3279">RFC 3279</a> - "Algorithms and Identifiers for the Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile"</li>
<li><a href="https://tools.ietf.org/html/rfc4055">RFC 4055</a> - "Additional Algorithms and Identifiers for RSA Cryptography for use in the Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile"</li>
<li><a href="https://tools.ietf.org/html/rfc5280">RFC 5280</a> - "Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile"</li>
<li><a href="https://tools.ietf.org/html/rfc7299">RFC 7299</a> - "Object Identifier Registry for the PKIX Working Group"</li>
<li><a href="https://tools.ietf.org/html/rfc7468">RFC 7468</a> - "Textual Encodings of PKIX, PKCS, and CMS Structures"</li>
<li><a href="https://tools.ietf.org/html/rfc8017">RFC 8017</a> - "PKCS #1: RSA Cryptography Specifications Version 2.2"</li>
<li><a href="https://tools.ietf.org/html/rfc8446">RFC 8446</a> - "The Transport Layer Security (TLS) Protocol Version 1.3"</li>
<li><a href="https://cryptography.io/en/latest/">pyca/cryptography</a> - The Python pyca/cryptography package</li>
</ul>
<h2 id="feedback">Feedback<a class="headerlink" href="#feedback" title="Permanent link">¶</a></h2>
<p>Feel free to reach me on <a href="https://twitter.com/thedigicat">Twitter</a> if you have questions. The <a href="https://github.com/TheDigitalCatOnline/blog_source/issues">GitHub issues</a> page is the best place to submit corrections.</p>Multiple inheritance and mixin classes in Python2020-03-27T12:00:00+01:002022-10-08T14:00:00+01:00Leonardo Giordanitag:www.thedigitalcatonline.com,2020-03-27:/blog/2020/03/27/mixin-classes-in-python/<p>This post describes what mixin classes are in theory, why we need them and how they can be implemented in Python. It also shows a working example from the class-based views code of the Django framework.</p><p>I recently revisited three old posts on Django class-based views that I wrote for this blog, updating them to Django 3.0 (you can find them <a href="https://www.thedigitalcatonline.com/blog/2013/10/28/digging-up-django-class-based-views-1/">here</a>) and noticed once again that the code base uses <em>mixin classes</em> to increase code reuse. I also realised that mixins are not very popular in Python, so I decided to explore them, brushing up my knowledge of the OOP theory in the meanwhile.</p><p>To fully appreciate the content of the post, be sure you grasp two pillars of the OOP approach: <strong>delegation</strong>, in particular how it is implemented through inheritance, and <strong>polymorphism</strong>. 
<a href="https://www.thedigitalcatonline.com/blog/2014/08/20/python-3-oop-part-3-delegation-composition-and-inheritance/">This post about delegation</a> and <a href="https://www.thedigitalcatonline.com/blog/2014/08/21/python-3-oop-part-4-polymorphism/">this post about polymorphism</a> contain all you need to understand how Python implements those concepts.</p><h2 id="multiple-inheritance-blessing-and-curse-1a08">Multiple inheritance: blessing and curse<a class="headerlink" href="#multiple-inheritance-blessing-and-curse-1a08" title="Permanent link">¶</a></h2><h3 id="general-concepts-2aca">General concepts</h3><p>To discuss mixins we need to start from one of the most controversial subjects in the whole OOP world: multiple inheritance. This is a natural extension of the concept of simple inheritance, where a class automatically delegates method and attribute resolution to another class (the parent class).</p><p>Let me state it again, as it is important for the rest of the discussion: <em>inheritance is just an automatic delegation mechanism</em>.</p><p>Delegation was introduced in OOP as a way to reduce code duplication. When an object needs a specific feature it just delegates it to another class (either explicitly or implicitly), so the code is written just once.</p><p>Let's consider the example of code management website, clearly completely fictional and not inspired by any existing product. Let's assume we created the following hierarchy</p><div class="code"><div class="content"><div class="highlight"><pre> assignable reviewable item
(assign_to_user, ask_review_to_user)
^
|
|
|
pull request
</pre></div> </div> </div><p>which allows us to put in <code>pull request</code> only the specific code required by that element. This is a great achievement, as it is what libraries do for code, but on live objects. Method calls and delegation are nothing more than messages between objects, so the delegation hierarchy is just a simple networked system.</p><p>Unfortunately, the use of inheritance over composition often leads to systems that, paradoxically, increase code duplication. The main problem lies in the fact that inheritance can directly delegate to only one other class (the parent class), as opposed to composition, where the object can delegate to any number of other ones. This limitation of inheritance means that we might have a class that inherits from another one because it needs some of its features, but doing this receives features it doesn't want, or shouldn't have.</p><p>Let's continue the example of the code management portal, and consider an <code>issue</code>, which is an item that we want to store in the system, but cannot be reviewed by a user. If we create a hierarchy like this</p><div class="code"><div class="content"><div class="highlight"><pre> assignable reviewable item
(assign_to_user, ask_review_to_user)
^
|
|
|
|
+--------+--------+
| |
| |
| |
issue pull request
(not reviewable)
</pre></div> </div> </div><p>we end up putting the features related to the review process in an object that shouldn't have them. The standard solution to this problem is that of increasing the depth of the inheritance hierarchy and to derive from the new simpler ancestor.</p><div class="code"><div class="content"><div class="highlight"><pre> assignable item
(assign_to_user)
^
|
|
|
|
+------+--------------+
| |
| |
| |
| reviewable assignable item
| (ask_review_to_user)
| ^
| |
| |
| |
issue pull request
</pre></div> </div> </div><p>However, this approach stops being viable as soon as an object needs to inherit from a given class but not from the parent of that class. For example, an element that has to be reviewable but not assignable, like a <code>best practice</code> that we want to add to the site. If we want to keep using inheritance, the only solution at this point is to duplicate the code that implements the reviewable nature of the item (or the code that implements the assignable feature) and create two different class hierarchies.</p><div class="code"><div class="content"><div class="highlight"><pre> assignable item +--------> reviewable item
(assign_to_user) | (ask_review_to_user)
^ | ^
| | |
| | |
| CODE DUPLICATION |
| | |
+------+--------------+ | |
| | | |
| | | |
| | V |
| reviewable assignable item |
| (ask_review_to_user) |
| ^ |
| | |
| | |
| | |
issue pull request best practice
</pre></div> </div> </div><p>Please note that this doesn't even take into account that the new <code>reviewable item</code> might need attributes from <code>assignable item</code>, which prompts for another level of depth in the hierarchy, where we isolate those features in a more generic class. So, unfortunately, chances are that this is only the first of many compromises we will have to accept to keep the system in a stable state if we can't change our approach.</p><p>Multiple inheritance was then introduced in OOP, as it was clear that an object might want to delegate certain actions to a given class, and other actions to a different one, mimicking what life forms do when they inherit traits from multiple ancestors (parents, grandparents, etc.).</p><p>The above situation can then be solved having <code>pull request</code> inherit from both the class that provides the assign feature and from the one that implements the reviewable nature. </p><div class="code"><div class="content"><div class="highlight"><pre> assignable item reviewable item
(assign_to_user) (ask_review_to_user)
^ ^ ^
| | |
| | |
| | |
| | |
+------+-------------+ +----------------------+ |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
issue pull request best practice
</pre></div> </div> </div><p>Generally speaking, then, multiple inheritance is introduced to give the programmer a way to keep using inheritance without introducing code duplication, keeping the class hierarchy simpler and cleaner. Eventually, everything we do in software design is to try and separate concerns, that is, to isolate features, and multiple inheritance can help to do this.</p><p>These are just examples and might be valid or not, depending on the concrete case, but they clearly show the issues that we can have even with a very simple hierarchy of 4 classes. Many of these problems clearly arise from the fact that we wanted to implement delegation only through inheritance, and I dare to say that 80% of the architectural errors in OOP projects come from using inheritance instead of composition and from using god objects, that is classes that have responsibilities over too many different parts of the system. Always remember that OOP was born with the idea of small objects interacting through messages, so the considerations we make for monolithic architectures are valid even here.</p><p>That said, as inheritance and composition implement two different types of delegation (<em>to be</em> and <em>to have</em>), they are both valuable, and multiple inheritance is the way to remove the single provider limitation that comes from having only one parent class.</p><h3 id="why-is-it-controversial-a9c1">Why is it controversial?</h3><p>Given what I just said, multiple inheritance seems to be a blessing. When an object can inherit from multiple parents, we can easily spread responsibilities among different classes and use only the ones we need, promoting code reuse and avoiding god objects.</p><p>Unfortunately, things are not that simple. 
First of all, we face the issue that every microservice-oriented architecture faces, that is the risk of going from god objects (the extreme monolithic architecture) to almost empty objects (the extreme distributed approach), burdening the programmer with too fine-grained a control that eventually results in a system where relationships between objects are so complicated that it becomes impossible to grasp the effect of a change in the code.</p><p>There is a more immediate problem in multiple inheritance, though. As happens with natural inheritance, parents can provide the same "genetic trait" in two different flavours, but the resulting individual will have only one. Leaving aside genetics (which is incredibly more complicated than programming) and going back to OOP, we face a problem when an object inherits from two other objects that provide the same attribute.</p><p>So, if your class <code>Child</code> inherits from parents <code>Parent1</code> and <code>Parent2</code>, and both provide the <code>__init__</code> method, which one should your object use?</p><div class="code"><div class="content"><div class="highlight"><pre><span class="k">class</span> <span class="nc">Parent1</span><span class="p">():</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="p">[</span><span class="o">...</span><span class="p">]</span>

<span class="k">class</span> <span class="nc">Parent2</span><span class="p">():</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="p">[</span><span class="o">...</span><span class="p">]</span>

<span class="k">class</span> <span class="nc">Child</span><span class="p">(</span><span class="n">Parent1</span><span class="p">,</span> <span class="n">Parent2</span><span class="p">):</span>
    <span class="c1"># This inherits from both Parent1 and Parent2,</span>
    <span class="c1"># which __init__ does it use?</span>
    <span class="k">pass</span>
</pre></div> </div> </div><p>Things can even get worse, as parents can have different signatures of the common method, for example</p><div class="code"><div class="content"><div class="highlight"><pre><span class="k">class</span> <span class="nc">Parent1</span><span class="p">:</span>
    <span class="c1"># This defines __init__ with a status parameter</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">status</span><span class="p">):</span>
        <span class="p">[</span><span class="o">...</span><span class="p">]</span>

<span class="k">class</span> <span class="nc">Parent2</span><span class="p">:</span>
    <span class="c1"># This defines __init__ with a name parameter</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">name</span><span class="p">):</span>
        <span class="p">[</span><span class="o">...</span><span class="p">]</span>

<span class="k">class</span> <span class="nc">Child</span><span class="p">(</span><span class="n">Parent1</span><span class="p">,</span> <span class="n">Parent2</span><span class="p">):</span>
    <span class="c1"># This inherits from both Parent1 and Parent2,</span>
    <span class="c1"># which __init__ does it use?</span>
    <span class="k">pass</span>
</pre></div> </div> </div><p>The problem can be extended even further, introducing a common ancestor above <code>Parent1</code> and <code>Parent2</code>.</p><div class="code"><div class="content"><div class="highlight"><pre><span class="k">class</span> <span class="nc">Ancestor</span><span class="p">:</span>
    <span class="c1"># The common ancestor, defines its own __init__ method</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="p">[</span><span class="o">...</span><span class="p">]</span>

<span class="k">class</span> <span class="nc">Parent1</span><span class="p">(</span><span class="n">Ancestor</span><span class="p">):</span>
    <span class="c1"># This inherits from Ancestor but redefines __init__</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">status</span><span class="p">):</span>
        <span class="p">[</span><span class="o">...</span><span class="p">]</span>

<span class="k">class</span> <span class="nc">Parent2</span><span class="p">(</span><span class="n">Ancestor</span><span class="p">):</span>
    <span class="c1"># This inherits from Ancestor but redefines __init__</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">name</span><span class="p">):</span>
        <span class="p">[</span><span class="o">...</span><span class="p">]</span>

<span class="k">class</span> <span class="nc">Child</span><span class="p">(</span><span class="n">Parent1</span><span class="p">,</span> <span class="n">Parent2</span><span class="p">):</span>
    <span class="c1"># This inherits from both Parent1 and Parent2,</span>
    <span class="c1"># which __init__ does it use?</span>
    <span class="k">pass</span>
</pre></div> </div> </div><p>As you can see, we already have a problem when we introduce multiple parents, and a common ancestor just adds a new level of complexity. The ancestor class can clearly be at any point of the inheritance tree (grandparent, great-grandparent, etc.); the important part is that it is shared between <code>Parent1</code> and <code>Parent2</code>. This is the so-called diamond problem, as the inheritance graph has the shape of a diamond</p><div class="code"><div class="content"><div class="highlight"><pre>       Ancestor
       ^      ^
      /        \
     /          \
 Parent1      Parent2
     ^          ^
      \        /
       \      /
        Child
</pre></div> </div> </div><p>So, while with single-parent inheritance the rules are straightforward, with multiple inheritance we immediately have a more complex situation that doesn't have a trivial solution. Does all this prevent multiple inheritance from being implemented?</p><p>Not at all! There are solutions to this problem, as we will see shortly, but this further level of intricacy makes multiple inheritance something that doesn't fit easily in a design and has to be implemented carefully to avoid subtle bugs. Remember that inheritance is an automatic delegation mechanism, which makes what happens in the code less evident. For these reasons, multiple inheritance is often depicted as scary and convoluted, and is usually given some space only in advanced OOP courses, at least in the Python world. I believe that every Python programmer should instead familiarise themselves with it and learn how to take advantage of it.</p>
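<p>To make the ambiguity concrete, here is a minimal sketch of the diamond shown above. The method bodies replace the elided <code>[...]</code> with hypothetical assignments: the attributes <code>origin</code>, <code>status</code>, and <code>name</code>, as well as the value <code>"active"</code>, are assumptions made purely for illustration. Running it shows which <code>__init__</code> Python picks for <code>Child</code>; the rule behind that choice is the topic of the next section.</p>

```python
class Ancestor:
    # Hypothetical body: the original example elides it
    def __init__(self):
        self.origin = "Ancestor"


class Parent1(Ancestor):
    # Hypothetical body: stores the only argument it receives
    def __init__(self, status):
        self.status = status


class Parent2(Ancestor):
    # Hypothetical body: stores the only argument it receives
    def __init__(self, name):
        self.name = name


class Child(Parent1, Parent2):
    # No __init__ here, so Python has to delegate to one of the parents
    pass


# The initialiser found for Child is the one defined by Parent1
print(Child.__init__ is Parent1.__init__)

# Only Parent1.__init__ runs: the instance has just the status attribute
c = Child("active")
print(vars(c))
```

<p>In this sketch <code>Child("active")</code> works because the initialiser comes from <code>Parent1</code>, the first parent listed in the class signature. Neither <code>Parent2.__init__</code> nor <code>Ancestor.__init__</code> runs, so <code>name</code> and <code>origin</code> are never set on the instance.</p>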
<h3 id="multiple-inheritance-the-python-way-d87f">Multiple inheritance: the Python way</h3><p>Let's see how it is possible to solve the diamond problem. Unlike genetics, we programmers can't afford any level of uncertainty or randomness in our processes, so in the presence of a possible ambiguity such as the one created by multiple inheritance, we need to write down a rule that will be strictly followed in every case. In Python, this rule goes by the name of MRO (Method Resolution Order), which was introduced in Python 2.3 and is described in <a href="https://www.python.org/download/releases/2.3/mro/">this document</a> by Michele Simionato.</p><p>There is a lot to say about MRO and the underlying C3 linearisation algorithm, but for the scope of this post, it is enough to see how it solves the diamond problem. In the case of multiple inheritance, Python follows the usual inheritance rules (automatic delegation to an ancestor if the attribute is not present locally), but the <em>order</em> followed to traverse the inheritance tree now includes all the classes that are specified in the class signature. In the example above, Python would look for attributes in the following order: <code>Child</code>, <code>Parent1</code>, <code>Parent2</code>, <code>Ancestor</code>.</p><p>So, as in the case of standard inheritance, this means that the first class in the list that implements a specific attribute will be the selected provider for that resolution. An example might clarify the matter</p><div class="code"><div class="content"><div class="highlight"><pre><span class="k">class</span> <span class="nc">Ancestor</span><span class="p">:</span>
    <span class="k">def</span> <span class="nf">rewind</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="nb">print</span><span class="p">(</span><span class="s2">"Ancestor: rewind"</span><span class="p">)</span>


<span class="k">class</span> <span class="nc">Parent1</span><span class="p">(</span><span class="n">Ancestor</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">open</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="nb">print</span><span class="p">(</span><span class="s2">"Parent1: open"</span><span class="p">)</span>


<span class="k">class</span> <span class="nc">Parent2</span><span class="p">(</span><span class="n">Ancestor</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">open</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="nb">print</span><span class="p">(</span><span class="s2">"Parent2: open"</span><span class="p">)</span>

    <span class="k">def</span> <span class="nf">close</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="nb">print</span><span class="p">(</span><span class="s2">"Parent2: close"</span><span class="p">)</span>

    <span class="k">def</span> <span class="nf">flush</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="nb">print</span><span class="p">(</span><span class="s2">"Parent2: flush"</span><span class="p">)</span>


<span class="k">class</span> <span class="nc">Child</span><span class="p">(</span><span class="n">Parent1</span><span class="p">,</span> <span class="n">Parent2</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">flush</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="nb">print</span><span class="p">(</span><span class="s2">"Child: flush"</span><span class="p">)</span>


<span class="nb">print</span><span class="p">(</span><span class="n">Child</span><span class="o">.</span><span class="vm">__mro__</span><span class="p">)</span>

<span class="n">c</span> <span class="o">=</span> <span class="n">Child</span><span class="p">()</span>
<span class="n">c</span><span class="o">.</span><span class="n">rewind</span><span class="p">()</span>
<span class="n">c</span><span class="o">.</span><span class="n">open</span><span class="p">()</span>
<span class="n">c</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
<span class="n">c</span><span class="o">.</span><span class="n">flush</span><span class="p">()</span>
</pre></div> </div> </div><p>As you can see, we can access the MRO of any class by reading its <code>__mro__</code> attribute, and as we expected its value is <code>(&lt;class '__main__.Child'&gt;, &lt;class '__main__.Parent1'&gt;, &lt;class '__main__.Parent2'&gt;, &lt;class '__main__.Ancestor'&gt;, &lt;class 'object'&gt;)</code>.</p><p>So, in this case an instance <code>c</code> of <code>Child</code> provides <code>rewind</code>, <code>open</code>, <code>close</code>, and <code>flush</code>. When <code>c.rewind</code> is called, the code in <code>Ancestor</code> is executed, as this is the first class in the MRO list that provides that method. The method <code>open</code> is provided by <code>Parent1</code>, while <code>close</code> is provided by <code>Parent2</code>. If the method <code>c.flush</code> is called, the code is provided by the <code>Child</code> class itself, which redefines it, overriding the one provided by <code>Parent2</code>.</p><p>As we see with the <code>flush</code> method, Python doesn't change its behaviour when it comes to method overriding with multiple parents. The first implementation of a method with that name found in the MRO is executed, and the parent's implementation is not automatically called. As in the case of standard inheritance, then, it's up to us to design classes with matching method signatures.</p><h3 id="under-the-bonnet-cf95">Under the bonnet</h3><p>How does multiple inheritance work internally? How does Python create the MRO list?</p><p>Python has a very simple approach to OOP (even though it ultimately ends with a mind-blowing ouroboros, see <a href="https://www.thedigitalcatonline.com/blog/2014/09/01/python-3-oop-part-5-metaclasses/">here</a>). Classes are objects themselves, so they contain data structures that are used by the language to provide features, and delegation is no exception. 
When we run a method on an object, Python silently uses the <code>__getattribute__</code> method (provided by <code>object</code>), which uses <code>__class__</code> to reach the class from the instance, and the MRO, built from <code>__bases__</code>, to find the parent classes. The attribute <code>__bases__</code>, in particular, is a tuple, so it is ordered, and it contains all the classes that the current class directly inherits from.</p><p>The MRO is created using only <code>__bases__</code>, but the underlying algorithm is not that trivial and has to do with the monotonicity of the resulting class linearisation. It is less scary than it sounds, but probably not something you want to read while suntanning. If you do, the aforementioned <a href="https://www.python.org/download/releases/2.3/mro/">document</a> by Michele Simionato contains all the gory details on class linearisation that you always wanted to explore while lying on the beach.</p><h2 id="inheritance-and-interfaces-42cb">Inheritance and interfaces<a class="headerlink" href="#inheritance-and-interfaces-42cb" title="Permanent link">¶</a></h2><p>To approach mixins, we need to discuss inheritance in detail, and specifically the role of method signatures.</p><p>In Python, when you override a method provided by an ancestor class, you have to decide if and when to call its original implementation. This gives the programmer the freedom to decide whether they need to just augment a method or to replace it completely. Remember that the only thing Python does when a class inherits from another is to automatically delegate methods that are not implemented.</p><p>When a class inherits from another we are ideally creating objects that keep backward compatibility with the interface of the parent class, to allow a polymorphic use of them. This means that when we inherit from a class and override a method changing its signature we are doing something that is dangerous and, at least from the point of view of polymorphism, wrong. 
Have a look at this example</p><div class="code"><div class="content"><div class="highlight"><pre><span class="k">class</span> <span class="nc">GraphicalEntity</span><span class="p">:</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">pos_x</span> <span class="o">=</span> <span class="n">pos_x</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">pos_y</span> <span class="o">=</span> <span class="n">pos_y</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">size_x</span> <span class="o">=</span> <span class="n">size_x</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">size_y</span> <span class="o">=</span> <span class="n">size_y</span>

    <span class="k">def</span> <span class="nf">move</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">pos_x</span> <span class="o">=</span> <span class="n">pos_x</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">pos_y</span> <span class="o">=</span> <span class="n">pos_y</span>

    <span class="k">def</span> <span class="nf">resize</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">size_x</span> <span class="o">=</span> <span class="n">size_x</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">size_y</span> <span class="o">=</span> <span class="n">size_y</span>


<span class="k">class</span> <span class="nc">Rectangle</span><span class="p">(</span><span class="n">GraphicalEntity</span><span class="p">):</span>
    <span class="k">pass</span>


<span class="k">class</span> <span class="nc">Square</span><span class="p">(</span><span class="n">GraphicalEntity</span><span class="p">):</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size</span><span class="p">):</span>
        <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size</span><span class="p">,</span> <span class="n">size</span><span class="p">)</span>

    <span class="k">def</span> <span class="nf">resize</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">size</span><span class="p">):</span>
        <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="n">resize</span><span class="p">(</span><span class="n">size</span><span class="p">,</span> <span class="n">size</span><span class="p">)</span>
</pre></div> </div> </div><p>Please note that <code>Square</code> changes the signature of both <code>__init__</code> and <code>resize</code>. Now, when we instantiate those classes we need to keep in mind the different signature of <code>__init__</code> in <code>Square</code></p><div class="code"><div class="content"><div class="highlight"><pre><span class="n">r1</span> <span class="o">=</span> <span class="n">Rectangle</span><span class="p">(</span><span class="mi">100</span><span class="p">,</span> <span class="mi">200</span><span class="p">,</span> <span class="mi">15</span><span class="p">,</span> <span class="mi">30</span><span class="p">)</span>
<span class="n">r2</span> <span class="o">=</span> <span class="n">Rectangle</span><span class="p">(</span><span class="mi">150</span><span class="p">,</span> <span class="mi">280</span><span class="p">,</span> <span class="mi">23</span><span class="p">,</span> <span class="mi">55</span><span class="p">)</span>
<span class="n">q1</span> <span class="o">=</span> <span class="n">Square</span><span class="p">(</span><span class="mi">300</span><span class="p">,</span> <span class="mi">400</span><span class="p">,</span> <span class="mi">50</span><span class="p">)</span>
</pre></div> </div> </div><p>We usually accept that an enhanced version of a class accepts different parameters when it is initialised, as we do not expect it to be polymorphic on <code>__init__</code>. Problems arise when we try to leverage polymorphism on other methods, for example resizing all <code>GraphicalEntity</code> objects in a list</p><div class="code"><div class="content"><div class="highlight"><pre><span class="k">for</span> <span class="n">shape</span> <span class="ow">in</span> <span class="p">[</span><span class="n">r1</span><span class="p">,</span> <span class="n">r2</span><span class="p">,</span> <span class="n">q1</span><span class="p">]:</span>
    <span class="n">size_x</span> <span class="o">=</span> <span class="n">shape</span><span class="o">.</span><span class="n">size_x</span>
    <span class="n">size_y</span> <span class="o">=</span> <span class="n">shape</span><span class="o">.</span><span class="n">size_y</span>
    <span class="n">shape</span><span class="o">.</span><span class="n">resize</span><span class="p">(</span><span class="n">size_x</span><span class="o">*</span><span class="mi">2</span><span class="p">,</span> <span class="n">size_y</span><span class="o">*</span><span class="mi">2</span><span class="p">)</span>
</pre></div> </div> </div><p>Since <code>r1</code>, <code>r2</code>, and <code>q1</code> are all objects that inherit from <code>GraphicalEntity</code> we expect them to provide the interface provided by that class, but this fails with a <code>TypeError</code> as soon as the loop reaches <code>q1</code>, because <code>Square</code> changed the signature of <code>resize</code>, which now accepts a single size. The same would happen if we instantiated them in a for loop from a list of classes, but as I said it is generally accepted that child classes change the signature of the <code>__init__</code> method. This is not true, for example, in a plugin-based system, where all plugins shall be initialised the same way.</p><p>This is a classic problem in OOP. While we, as humans, perceive a square just as a slightly special rectangle, from the interface point of view the two classes are different, and thus should not be in the same inheritance tree when we are dealing with dimensions. This is an important consideration: <code>Rectangle</code> and <code>Square</code> are polymorphic on the <code>move</code> method, but not on <code>__init__</code> and <code>resize</code>. So, the question is whether we could somehow separate the two natures of being movable and resizeable.</p><p>Now, discussing interfaces, polymorphism, and the reasons behind them would require an entirely separate post, so in the following sections, I'm going to ignore the matter and just consider the object interface optional. You will thus find examples of objects that break the interface of the parent, and objects that keep it. Just remember: whenever you change the signature of a method you change the (implicit) interface of the object, and thus you break polymorphism. I'll discuss another time whether I consider this right or wrong.</p><h2 id="mixin-classes-cf82">Mixin classes<a class="headerlink" href="#mixin-classes-cf82" title="Permanent link">¶</a></h2><p>MRO is a good solution that prevents ambiguity, but it leaves programmers with the responsibility of creating sensible inheritance trees. 
The algorithm helps to resolve complicated situations, but this doesn't mean we should create them in the first place. So, how can we leverage multiple inheritance without creating systems that are too complicated to grasp? Moreover, is it possible to use multiple inheritance to solve the problem of managing the double (or multiple) nature of an object, as in the previous example of a movable and resizeable shape?</p><p>The solution comes from mixin classes: those are small classes that provide attributes but are not included in the standard inheritance tree, working more as "additions" to the current class than as proper ancestors. Mixins originate in the LISP programming language, and specifically in what could be considered the first version of the Common Lisp Object System, the Flavors extension. Modern OOP languages implement mixins in many different ways: Scala, for example, has a feature called <em>traits</em>, which live in their own space with a specific hierarchy that doesn't interfere with the proper class inheritance.</p><h3 id="mixin-classes-in-python-a023">Mixin classes in Python</h3><p>Python doesn't provide support for mixins with any dedicated language feature, so we use multiple inheritance to implement them. This clearly requires great discipline from the programmer, as it violates one of the main assumptions for mixins: their orthogonality to the inheritance tree. In Python, so-called mixins are classes that live in the normal inheritance tree, but they are kept small to avoid creating hierarchies that are too complicated for the programmer to grasp. In particular, mixins shouldn't have common ancestors other than <code>object</code> with the other parent classes.</p><p>Let's have a look at a simple example</p><div class="code"><div class="content"><div class="highlight"><pre><span class="k">class</span> <span class="nc">GraphicalEntity</span><span class="p">:</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">pos_x</span> <span class="o">=</span> <span class="n">pos_x</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">pos_y</span> <span class="o">=</span> <span class="n">pos_y</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">size_x</span> <span class="o">=</span> <span class="n">size_x</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">size_y</span> <span class="o">=</span> <span class="n">size_y</span>


<span class="k">class</span> <span class="nc">ResizableMixin</span><span class="p">:</span>
    <span class="k">def</span> <span class="nf">resize</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">size_x</span> <span class="o">=</span> <span class="n">size_x</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">size_y</span> <span class="o">=</span> <span class="n">size_y</span>


<span class="k">class</span> <span class="nc">ResizableGraphicalEntity</span><span class="p">(</span><span class="n">GraphicalEntity</span><span class="p">,</span> <span class="n">ResizableMixin</span><span class="p">):</span>
    <span class="k">pass</span>


<span class="n">rge</span> <span class="o">=</span> <span class="n">ResizableGraphicalEntity</span><span class="p">(</span><span class="mi">5</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">200</span><span class="p">,</span> <span class="mi">300</span><span class="p">)</span>
<span class="n">rge</span><span class="o">.</span><span class="n">resize</span><span class="p">(</span><span class="mi">1000</span><span class="p">,</span> <span class="mi">2000</span><span class="p">)</span>
</pre></div> </div> </div><p>Here, the class <code>ResizableMixin</code> doesn't inherit from <code>GraphicalEntity</code>, but directly from <code>object</code>, so <code>ResizableGraphicalEntity</code> gets from it just the <code>resize</code> method. As we said before, this simplifies the inheritance tree of <code>ResizableGraphicalEntity</code> and helps to reduce the risk of the diamond problem. It leaves us free to use <code>GraphicalEntity</code> as a parent for other classes without having to inherit methods that we don't want. Please remember that the diamond problem is avoided here because the classes are designed that way, and not because of language features: the MRO algorithm just ensures that there will always be an unambiguous choice in case of multiple ancestors.</p><p>Mixins cannot usually be too generic. After all, they are designed to add features to classes, but these new features often interact with other pre-existing features of the augmented class. In this case, the <code>resize</code> method interacts with the attributes <code>size_x</code> and <code>size_y</code> that have to be present in the object. There are, of course, examples of <em>pure</em> mixins, but since they would require no initialization their scope is definitely limited.</p><h3 id="using-mixins-to-hijack-inheritance-030c">Using mixins to hijack inheritance</h3><p>Thanks to the MRO, Python programmers can leverage multiple inheritance to override methods that objects inherit from their parents, allowing them to customise classes without code duplication. Let's have a look at this example</p><div class="code"><div class="content"><div class="highlight"><pre><span class="k">class</span> <span class="nc">GraphicalEntity</span><span class="p">:</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">pos_x</span> <span class="o">=</span> <span class="n">pos_x</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">pos_y</span> <span class="o">=</span> <span class="n">pos_y</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">size_x</span> <span class="o">=</span> <span class="n">size_x</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">size_y</span> <span class="o">=</span> <span class="n">size_y</span>


<span class="k">class</span> <span class="nc">Button</span><span class="p">(</span><span class="n">GraphicalEntity</span><span class="p">):</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">):</span>
        <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">)</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">status</span> <span class="o">=</span> <span class="kc">False</span>

    <span class="k">def</span> <span class="nf">toggle</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">status</span> <span class="o">=</span> <span class="ow">not</span> <span class="bp">self</span><span class="o">.</span><span class="n">status</span>


<span class="n">b</span> <span class="o">=</span> <span class="n">Button</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">200</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span>
</pre></div> </div> </div><p>As you can see, the <code>Button</code> class extends the <code>GraphicalEntity</code> one in a classic way, using <code>super</code> to call the parent's <code>__init__</code> method before adding the new <code>status</code> attribute. Now, if I wanted to create a <code>SquareButton</code> class I would have two choices.</p><p>I might just override <code>__init__</code> in the new class</p><div class="code"><div class="content"><div class="highlight"><pre><span class="k">class</span> <span class="nc">GraphicalEntity</span><span class="p">:</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">pos_x</span> <span class="o">=</span> <span class="n">pos_x</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">pos_y</span> <span class="o">=</span> <span class="n">pos_y</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">size_x</span> <span class="o">=</span> <span class="n">size_x</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">size_y</span> <span class="o">=</span> <span class="n">size_y</span>


<span class="k">class</span> <span class="nc">Button</span><span class="p">(</span><span class="n">GraphicalEntity</span><span class="p">):</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">):</span>
        <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">)</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">status</span> <span class="o">=</span> <span class="kc">False</span>

    <span class="k">def</span> <span class="nf">toggle</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">status</span> <span class="o">=</span> <span class="ow">not</span> <span class="bp">self</span><span class="o">.</span><span class="n">status</span>


<span class="k">class</span> <span class="nc">SquareButton</span><span class="p">(</span><span class="n">Button</span><span class="p">):</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size</span><span class="p">):</span>
        <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size</span><span class="p">,</span> <span class="n">size</span><span class="p">)</span>


<span class="n">b</span> <span class="o">=</span> <span class="n">SquareButton</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">200</span><span class="p">)</span>
</pre></div> </div> </div><p>which performs the requested job, but strongly connects the feature of having a single dimension with the <code>Button</code> nature. If we wanted to create a circular image we could not inherit from <code>SquareButton</code>, as the image has a different nature.</p><p>The second option is to isolate the features connected with having a single dimension in a mixin class, and add it as a parent for the new class</p><div class="code"><div class="content"><div class="highlight"><pre><span class="k">class</span> <span class="nc">GraphicalEntity</span><span class="p">:</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">pos_x</span> <span class="o">=</span> <span class="n">pos_x</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">pos_y</span> <span class="o">=</span> <span class="n">pos_y</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">size_x</span> <span class="o">=</span> <span class="n">size_x</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">size_y</span> <span class="o">=</span> <span class="n">size_y</span>


<span class="k">class</span> <span class="nc">Button</span><span class="p">(</span><span class="n">GraphicalEntity</span><span class="p">):</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">):</span>
        <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">)</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">status</span> <span class="o">=</span> <span class="kc">False</span>

    <span class="k">def</span> <span class="nf">toggle</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">status</span> <span class="o">=</span> <span class="ow">not</span> <span class="bp">self</span><span class="o">.</span><span class="n">status</span>


<span class="k">class</span> <span class="nc">SingleDimensionMixin</span><span class="p">:</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size</span><span class="p">):</span>
<span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size</span><span class="p">,</span> <span class="n">size</span><span class="p">)</span>
<span class="k">class</span> <span class="nc">SquareButton</span><span class="p">(</span><span class="n">SingleDimensionMixin</span><span class="p">,</span> <span class="n">Button</span><span class="p">):</span>
<span class="k">pass</span>
<span class="n">b</span> <span class="o">=</span> <span class="n">SquareButton</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">200</span><span class="p">)</span>
</pre></div> </div> </div><p>The second solution gives the same final result, but promotes code reuse, as now the <code>SingleDimensionMixin</code> class can be applied to other classes derived from <code>GraphicalEntity</code> and make them accept only one size, while in the first solution that feature was tightly connected with the <code>Button</code> ancestor class.</p><p>Please note that the position of the mixin is important as <code>super</code> follows the MRO. As it is, the MRO of <code>SquareButton</code> is <code>(SquareButton, SingleDimensionMixin, Button, GraphicalEntity, object)</code>, so when we instantiate it, the <code>__init__</code> method is provided by <code>SingleDimensionMixin</code>, which in turn calls through <code>super</code> the method <code>__init__</code> of <code>Button</code>. The call <code>super().__init__(pos_x, pos_y, size, size)</code> in <code>SingleDimensionMixin</code> and the signature <code>def __init__(self, pos_x, pos_y, size_x, size_y):</code> in <code>Button</code> match, so everything works.</p><p>If we defined <code>SquareButton</code> as</p><div class="code"><div class="content"><div class="highlight"><pre><span class="k">class</span> <span class="nc">SquareButton</span><span class="p">(</span><span class="n">Button</span><span class="p">,</span> <span class="n">SingleDimensionMixin</span><span class="p">):</span>
    <span class="k">pass</span>
</pre></div> </div> </div><p>then the <code>__init__</code> method would first be provided by <code>Button</code>, and its <code>super</code> would call the <code>__init__</code> method of <code>GraphicalEntity</code>. This would, however, result in an error as soon as we run <code>SquareButton(10, 20, 200)</code>, because <code>Button.__init__</code> expects four parameters.</p><p>Mixins are not used only when you want to change the object's interface, though. Leveraging <code>super</code> we can achieve interesting designs like</p><div class="code"><div class="content"><div class="highlight"><pre><span class="k">class</span> <span class="nc">GraphicalEntity</span><span class="p">:</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">pos_x</span> <span class="o">=</span> <span class="n">pos_x</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">pos_y</span> <span class="o">=</span> <span class="n">pos_y</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">size_x</span> <span class="o">=</span> <span class="n">size_x</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">size_y</span> <span class="o">=</span> <span class="n">size_y</span>

<span class="k">class</span> <span class="nc">Button</span><span class="p">(</span><span class="n">GraphicalEntity</span><span class="p">):</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">):</span>
        <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">)</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">status</span> <span class="o">=</span> <span class="kc">False</span>

    <span class="k">def</span> <span class="nf">toggle</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">status</span> <span class="o">=</span> <span class="ow">not</span> <span class="bp">self</span><span class="o">.</span><span class="n">status</span>

<span class="k">class</span> <span class="nc">LimitSizeMixin</span><span class="p">:</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">):</span>
        <span class="n">size_x</span> <span class="o">=</span> <span class="nb">min</span><span class="p">(</span><span class="n">size_x</span><span class="p">,</span> <span class="mi">500</span><span class="p">)</span>
        <span class="n">size_y</span> <span class="o">=</span> <span class="nb">min</span><span class="p">(</span><span class="n">size_y</span><span class="p">,</span> <span class="mi">400</span><span class="p">)</span>
        <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">pos_x</span><span class="p">,</span> <span class="n">pos_y</span><span class="p">,</span> <span class="n">size_x</span><span class="p">,</span> <span class="n">size_y</span><span class="p">)</span>

<span class="k">class</span> <span class="nc">LimitSizeButton</span><span class="p">(</span><span class="n">LimitSizeMixin</span><span class="p">,</span> <span class="n">Button</span><span class="p">):</span>
    <span class="k">pass</span>

<span class="n">b</span> <span class="o">=</span> <span class="n">LimitSizeButton</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">2000</span><span class="p">,</span> <span class="mi">1000</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">b</span><span class="o">.</span><span class="n">size_x</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">b</span><span class="o">.</span><span class="n">size_y</span><span class="p">)</span>
</pre></div> </div> </div><p>Here, the MRO of <code>LimitSizeButton</code> is <code>(&lt;class '__main__.LimitSizeButton'&gt;, &lt;class '__main__.LimitSizeMixin'&gt;, &lt;class '__main__.Button'&gt;, &lt;class '__main__.GraphicalEntity'&gt;, &lt;class 'object'&gt;)</code>, which means that when we initialize it, the <code>__init__</code> method is first provided by <code>LimitSizeMixin</code>, which then calls through <code>super</code> the <code>__init__</code> method of <code>Button</code>, and through the latter the <code>__init__</code> method of <code>GraphicalEntity</code>.</p><p>Remember that in Python, you are never forced to call the parent's implementation of a method, so the mixin here might also stop the dispatching mechanism if that is the requirement of the business logic of the new object.</p>
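<p>As a quick check of the dispatching order described above, the following minimal sketch inspects the MRO at runtime and compares the two possible positions of the mixin. The class <code>UnlimitedButton</code> and the variable <code>mro_names</code> are hypothetical names introduced only for this illustration; the other classes repeat the definitions given earlier (without <code>toggle</code>, which is not needed here).</p>

```python
class GraphicalEntity:
    def __init__(self, pos_x, pos_y, size_x, size_y):
        self.pos_x = pos_x
        self.pos_y = pos_y
        self.size_x = size_x
        self.size_y = size_y


class Button(GraphicalEntity):
    def __init__(self, pos_x, pos_y, size_x, size_y):
        super().__init__(pos_x, pos_y, size_x, size_y)
        self.status = False


class LimitSizeMixin:
    def __init__(self, pos_x, pos_y, size_x, size_y):
        # Clamp the requested size before passing it along the MRO
        size_x = min(size_x, 500)
        size_y = min(size_y, 400)
        super().__init__(pos_x, pos_y, size_x, size_y)


class LimitSizeButton(LimitSizeMixin, Button):
    pass


class UnlimitedButton(Button, LimitSizeMixin):
    # Hypothetical class for illustration: the mixin comes after Button,
    # so Button.__init__ runs first and the chain stops in GraphicalEntity,
    # which never calls super().__init__
    pass


# The names of the classes in the order followed by super()
mro_names = [cls.__name__ for cls in LimitSizeButton.__mro__]
print(mro_names)

b = LimitSizeButton(10, 20, 2000, 1000)
print(b.size_x, b.size_y)

u = UnlimitedButton(10, 20, 2000, 1000)
print(u.size_x, u.size_y)
```

<p>With the mixin listed first, the size is clamped to 500 by 400. With the mixin listed after <code>Button</code>, the instance keeps the requested 2000 by 1000, because <code>LimitSizeMixin.__init__</code> is never reached. No error is raised in this second case, which makes the mistake easy to miss.</p>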
<h2 id="a-concrete-example-django-class-based-views-f83d">A concrete example: Django class-based views<a class="headerlink" href="#a-concrete-example-django-class-based-views-f83d" title="Permanent link">¶</a></h2><p>Finally, let's get to the original source of inspiration for this post: the Django codebase. I will show you here how the Django programmers used multiple inheritance and mixin classes to promote code reuse, and you will now hopefully grasp all the reasons behind them.</p><p>The example I chose can be found in the <a href="https://github.com/django/django/blob/3.0/django/views/generic/base.py#L117">code of generic views</a>, and in particular in two classes: <code>TemplateResponseMixin</code> and <code>TemplateView</code>.</p><p>As you might know, Django's <code>View</code> class is the ancestor of all class-based views and provides a <code>dispatch</code> method that converts HTTP request methods into Python function calls (<a href="https://github.com/django/django/blob/3.0/django/views/generic/base.py#L89">CODE</a>). Now, <code>TemplateView</code> is a view that answers a GET request by rendering a template with the data coming from a context passed when the view is called. Given the mechanism behind Django views, then, <code>TemplateView</code> should implement a <code>get</code> method and return the content of the HTTP response. The code of the class is</p><div class="code"><div class="content"><div class="highlight"><pre><span class="k">class</span> <span class="nc">TemplateView</span><span class="p">(</span><span class="n">TemplateResponseMixin</span><span class="p">,</span> <span class="n">ContextMixin</span><span class="p">,</span> <span class="n">View</span><span class="p">):</span>
<span class="w">    </span><span class="sd">"""</span>
<span class="sd">    Render a template. Pass keyword arguments from the URLconf to the context.</span>
<span class="sd">    """</span>
    <span class="k">def</span> <span class="nf">get</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">request</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
        <span class="n">context</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">get_context_data</span><span class="p">(</span><span class="o">**</span><span class="n">kwargs</span><span class="p">)</span>
        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">render_to_response</span><span class="p">(</span><span class="n">context</span><span class="p">)</span>
</pre></div> </div> </div><p>As you can see, <code>TemplateView</code> is a <code>View</code>, but it uses two mixins to inject features. Let's have a look at <code>TemplateResponseMixin</code></p><div class="code"><div class="content"><div class="highlight"><pre><span class="k">class</span> <span class="nc">TemplateResponseMixin</span><span class="p">:</span>
    <span class="p">[</span><span class="o">...</span><span class="p">]</span>
    <span class="k">def</span> <span class="nf">render_to_response</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">context</span><span class="p">,</span> <span class="o">**</span><span class="n">response_kwargs</span><span class="p">):</span>
        <span class="p">[</span><span class="o">...</span><span class="p">]</span>
    <span class="k">def</span> <span class="nf">get_template_names</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="p">[</span><span class="o">...</span><span class="p">]</span>
</pre></div> </div> </div><p>It is clear that <code>TemplateResponseMixin</code> just adds to any class the two methods <code>get_template_names</code> and <code>render_to_response</code>. The latter is called in the <code>get</code> method of <code>TemplateView</code> to create the response. Let's have a look at a simplified schema of the calls:</p><div class="code"><div class="content"><div class="highlight"><pre>GET request --> TemplateView.dispatch --> View.dispatch --> TemplateView.get --> TemplateResponseMixin.render_to_response
</pre></div> </div> </div><p>It might look complicated, but try to follow the code a couple of times and the whole picture will start to make sense. The important thing I want to stress is that the code in <code>TemplateResponseMixin</code> is available for any class that wants to have the feature of rendering a template, for example <code>DetailView</code> (<a href="https://github.com/django/django/blob/3.0/django/views/generic/detail.py#L164">CODE</a>), which receives the feature of showing the details of a single object by <code>SingleObjectTemplateResponseMixin</code>, which inherits from <code>TemplateResponseMixin</code>, overriding its method <code>get_template_names</code> (<a href="https://github.com/django/django/blob/3.0/django/views/generic/detail.py#L111">CODE</a>).</p><p>As we discussed before, mixins cannot be too generic, and here we see a good example of a mixin designed to work on specific classes. <code>TemplateResponseMixin</code> has to be applied to classes that contain <code>self.request</code> (<a href="https://github.com/django/django/blob/3.0/django/views/generic/base.py#L133">CODE</a>), and while this doesn't mean exclusively classes derived from <code>View</code>, it is clear that it has been designed to augment that specific type.</p><h2 id="takeaway-points-72a2">Takeaway points<a class="headerlink" href="#takeaway-points-72a2" title="Permanent link">¶</a></h2><ul><li>Inheritance is designed to promote code reuse but can lead to the opposite result</li><li>Multiple inheritance allows us to keep the inheritance tree simple</li><li>Multiple inheritance leads to possible problems that are solved in Python through the MRO</li><li>Interfaces (either implicit or explicit) should be part of your design</li><li>Mixin classes are used to add simple changes to classes</li><li>Mixins are implemented in Python using multiple inheritance: they have great expressive power but require careful design.</li></ul><h2 id="final-words-9803">Final words<a 
class="headerlink" href="#final-words-9803" title="Permanent link">¶</a></h2><p>I hope this post helped you to understand a bit better how multiple inheritance works, and to be less scared by it. I also hope I managed to show you that classes have to be carefully designed and that there is a lot to consider when you create a class system. Once again, please don't forget composition: it's a powerful and too often forgotten tool.</p><h2 id="updates-0083">Updates<a class="headerlink" href="#updates-0083" title="Permanent link">¶</a></h2><p>2020-03-13: GitHub user <a href="https://github.com/sureshvv">sureshvv</a> noticed that the <code>LimitSizeMixin</code> method <code>__init__</code> had the wrong parameters <code>pos_x</code> and <code>pos_y</code>, instead of <code>size_x</code> and <code>size_y</code>. Thanks!</p><p>2021-12-20: <a href="https://github.com/akocur">Alexander</a> fixed a mistake in the part related to <code>SquareButton</code> and the behaviour of <code>super()</code>. Thanks!</p><h2 id="feedback-d845">Feedback<a class="headerlink" href="#feedback-d845" title="Permanent link">¶</a></h2><p>Feel free to reach me on <a href="https://twitter.com/thedigicat">Twitter</a> if you have questions. The <a href="https://github.com/TheDigitalCatOnline/blog_source/issues">GitHub issues</a> page is the best place to submit corrections.</p>Public key cryptography: RSA keys2018-04-25T13:00:00+01:002022-01-23T11:00:00+00:00Leonardo Giordanitag:www.thedigitalcatonline.com,2018-04-25:/blog/2018/04/25/rsa-keys/<p> An in-depth discussion of the format of RSA keys, the PEM format, ASN, and PKCS</p><p>I bet you have created an RSA key pair at least once, usually because you needed to connect to GitHub and you wanted to avoid typing your password every time.
You diligently followed the documentation on how to create SSH keys and after a couple of minutes your setup was complete.</p><p>But do you know what you actually did?</p><p>Do you know what the file <code>~/.ssh/id_rsa</code> really contains? Why did ssh create two files with such different formats? Did you notice that one file begins with <code>ssh-rsa</code>, while the other begins with <code>-----BEGIN RSA PRIVATE KEY-----</code>? Have you noticed that sometimes the header of the second file lacks the <code>RSA</code> part and just says <code>BEGIN PRIVATE KEY</code>?</p><p>I believe that a minimum level of knowledge regarding the various formats of RSA keys is mandatory for every developer nowadays, not to mention the importance of understanding them deeply if you want to pursue a career in the infrastructure management world.</p><h2 id="rsa-algorithm-and-key-pairs-069b">RSA algorithm and key pairs<a class="headerlink" href="#rsa-algorithm-and-key-pairs-069b" title="Permanent link">¶</a></h2><p>Since the invention of public-key cryptography, various systems have been devised to create the key pair. One of the first is RSA, the creation of three brilliant cryptographers, which dates back to 1977. The story of RSA is quite interesting, as it was first invented by an English mathematician, Clifford Cocks, who was however forced to keep it secret by the British intelligence office he was working for.</p><p>Keeping in mind that RSA is not a synonym for public-key cryptography but only one of the possible implementations, I wanted to write a post on it because it is still, more than 40 years after its publication, one of the most widespread algorithms. In particular it is the standard algorithm used to generate SSH key pairs, and since nowadays every developer has their public key on GitHub, BitBucket, or similar systems, we may arguably say that RSA is pretty ubiquitous.</p><p>I will not cover the internals of the RSA algorithm in this article, however.
If you are interested in the gory details of the mathematical framework you may find plenty of resources both on the Internet and in textbooks. The theory behind it is not trivial, but it is definitely worth the time if you want to be serious about the mathematical part of cryptography.</p><p>In this article I will instead explore two ways to create RSA key pairs and the formats used to store them. Applied cryptography is, like many other topics in computer science, a moving target, and the tools change often. Sometimes it is pretty easy to find out <strong>how</strong> to do something (StackOverflow helps), but less easy to get a clear picture of what is going on.</p><p>All the examples shown in this post use a 2048-bit RSA key created for this purpose, so all the numbers you see come from a real example. Obviously, the key was trashed after I wrote the article.</p><h2 id="the-pem-format-7416">The PEM format<a class="headerlink" href="#the-pem-format-7416" title="Permanent link">¶</a></h2><p>Let's start the discussion about key pairs with the format used to store them. Nowadays the most widely accepted storage format is called PEM (Privacy-Enhanced Mail). As the name suggests, this format was initially created for e-mail encryption but later became a general format to store cryptographic data like keys and certificates. It is described in <a href="https://tools.ietf.org/html/rfc7468">RFC 7468</a> ("Textual Encodings of PKIX, PKCS, and CMS Structures").</p><p>An example private key in PEM format is the following</p><div class="code"><div class="content"><div class="highlight"><pre>-----BEGIN PRIVATE KEY-----
MIIEvgIBADANBgkqhkiG9w0BAQEFAASCBKgwggSkAgEAAoIBAQCy9f0/nwkXESzk
L4v4ftZ24VJYvkQ/Nt6vsLab3iSWtJXqrRsBythCcbAU6W95OGxjbTSFFtp0poqM
cPuogocMR7QhjY9JGG3fcnJ7nYDCGRHD4zfG5Af/tHwvJ2ew0WTYoemvlfZIG/jZ
7fsuOQSyUpJoxGAlb6/QpnfSmJjxCx0VEoppWDn8CO3VhOgzVhWx0dcne+ZcUy3K
kt3HBQN0hosRfqkVSRTvkpK4RD8TaW5PrVDe1r2Q5ab37TO+Ls4xxt16QlPubNxW
eH3dHVzXdmFAItuH0DuyLyMoW1oxZ6+NrKu+pAAERxM303gejFzKDqXid5m1EOTv
k4xhyqYNAgMBAAECggEBALJCVQAKagOQGCczNTlRHk9MIbpDy7cr8KUQYNThcZCs
UKhxxXUDmGaW1838uA0HJu/i1226Vd/cBCXgZMx1OBADXGoPl6o3qznnxiFbweWV
Ex0MN4LloRITtZ9CoQZ/jPQ8U4mS1r79HeP2KTzhjswRc8Tn1t1zYq1zI+eiGLX/
sPJF63ljJ8yHST7dE0I07V87FKTE2SN0WX9kptPLLBDwzS1X6Z9YyNKPIEnRQzzE
vWdwF60b3RyDz7j7foyP3PC0+3fee4KFdJzt+/1oePf3kwBz8PQq3cuoOF1+0Fzf
yqKiunV2AXI6liAf7MwuZcZeFPZfHTTW7N/j+FQBgAECgYEA4dFjib9u/3rkT2Vx
Bu2ByBpItfs1b4PdSiKehlS9wDZxa72dRt/RSYEyVFBUlYrKXP2nCdl8yMap6SA9
Bfe51F5oWhml9YJn/LF/z1ArMs/tuUyupY7l9j66XzPQmUbIZSEyNEQQ09ZYdIvK
4lbySJbCqa2TQNPIOSZS2o7XNG0CgYEAyuFVybOkVGtfw89MyA1TnVMcQGusXtgo
GOl3tJb59hTO+xF547+/qyK8p/iOu4ybEyeucBEyQt/whmNwtsdngtvVDb4f7psz
Frmqx7q7fPoKnvJsPJds9i2o9B7+BlRY3HwcvKePsctP96pQ0RbOFkCVak6J6t9S
k/qhOiNJ9CECgYEAvDuTMk5tku54g6o2ZiTyir9GHtOwviz3+AUViTn4FdIMB1g+
UsbcqN3V+ywe5ayUdKFHbNFqz92x4k7qLyBJObocWAaLLTQvxBadSE02RRvHuC8w
YXbVP8cYCaWiWzICdzINrD2UnVBN2ZBxZOw+970btN6oIWCnxOOqKt7oip0CgYAp
Fekhp9enoPcL2HdcLBa6zZHzGdsWef/ky6MKV2jXhO9FuQxOKw7NwYMjIRsGsDrX
bjnNSC49jMxQ6uJwoYE85vgGiHI/B/8YoxEK0a4WaSytc7qnqqLOWADXL0+SSJKW
VCwdqHFZOCtBpKQpM80YhIu9s5oKjp9SiHcOJwdbAQKBgDq047hBqyNFFb8KjS5A
+26VOJcC2DRHTprYSRJNxsHTQnONTnUQJl32t0TrqkqIp5lTRr7vBH2wJM6LKk45
I7BWY4mUirC7sDGHl3DaFPRBiut1rpg0kSKi2VNRF7Bb75OKEhGjvm6IKVe8Kl8d
5cpQwm9C7go4OiorY0DVLho2
-----END PRIVATE KEY-----
</pre></div> </div> </div><p>Basically, you can tell you are dealing with the PEM format from the typical header and footer that identify the content. While the hyphens and the two words <code>BEGIN</code> and <code>END</code> are always present, the <code>PRIVATE KEY</code> part describes the content and can change if the PEM file contains something different from a key, for example an X.509 certificate for SSL.</p><p>The PEM format specifies that the body of the content (the part between the header and the footer) is encoded using <a href="https://en.wikipedia.org/wiki/Base64">Base64</a>.</p><p>If the private key has been encrypted with a password, the header and the footer are different</p><div class="code"><div class="content"><div class="highlight"><pre>-----BEGIN ENCRYPTED PRIVATE KEY-----
MIIFHzBJBgkqhkiG9w0BBQ0wPDAbBgkqhkiG9w0BBQwwDgQIf75rXIakuSICAggA
MB0GCWCGSAFlAwQBKgQQf8HMdJ9FZJjwHkMQjkNA3gSCBNClWB7cJ5f8ThrQtmoA
t2WQCvEWTY9nRYwaTnL1SmXyuMDFrX5CWEuVFh/Zj77KB9jhBJaHw2XtFXxF8bV7
F10u93ih/n0S5QwN9CSPDhRp2kD5lIWB8WVG+VgtncqDrAfJRmpuPmzpjMJBxE2r
MvWJG5beMCS25qD0mAxihtbriqFoCtEygQ7vsSfeQpaBQvT5pKLOVaVgwFTFTf+7
cgqB8/UKKmPXSM4GMJ9VNAvUx0mAxI9MnUFlBWimK76OAzdlO9Si99R8OiRRS10x
AO1AwWSDHGWpbckK0g9K7wLgAgOw8LLVUJh67o9Mfg58DP9Ca0ZdPPVo0C7oavBD
NFlUsKqmSfqfgOAm4qGJ7GB3KgWGFdz+yexNLRLN63hE6qACAuQ1oLmwoorE8toh
MhT3c6IxnVWlYNXJkkb5iV9e8E2X/xzibvwv+CJJ9ulCU8uS7gp0rjlCKFwt/8d4
g3Cef/JWn9nI9YwRLNShJeQOe8hZkkLXHefUhBa2o2++C5C6mgWvuYLK6a0zfCMY
WCqjKKvDQfuxwDbeM03jJ97Je6dXy7rtJvJd10vYvpIVtHnNSdg1evpSiaAmWt4C
X5/AzbHNvwTIEvILfOtYvxLB/RdWqr1/VXuH4dJF6AYtHfQHjXetmL/fDA86Bqf6
Eb+uDr+PPuH4qw1tfJBdTSOOJzhhPqdT4ERYnOvfNxTKzsKYZT+kWvWXe9zyO13W
C0eceVi4rBjKpKpKecKDgFJGZ1u7jS0OW3FDIOfm/osu9z25g5CVIpuWU3JquWib
GatHET9wIEg7LRqC/i65q6tCnd9azevKtiur1I0tuh05iwP5kZ8drIzaGdObuvK1
/pbEPnj1ZcRlAZ34jnG841xvf4vofrOE+hGTNF5HypOCvO/8Lms3aB6NletIvHBE
99ynQyF9TAgSAFAumOws+qnRcnfVOF5lzIEE2pmeMVMqi5s7TT4hlhOuCbyfEFU8
xOXxNazT+0o7urIYOc77vA1LsWrk+9dAfm43CbBZvYav/gMoBc5fsLgAUAm1lkt5
5Hjaf+iMIN0v7aEKDrNDOtyQr13YdyuEClzXxeMtlhU+QfErpQHvH0jE4gywEgz7
tvVGwrbiLgg0y537+kg0/rS3N0eI94GhY0q/nR/QFObbN0nmoIYVVSGtufJx1r9v
YEVZA7HZE9pjnun1ylE1/SoYc/816rjBUcW5CCbkMDIz1LsFPr2SkQeHTNzK3/9J
Kny1lerfA+TA/hUyZ1KJjxuao+rJkH2fJ25qs3r6NP+PPbq3sAl1TPGhMCnNaFdo
YQWDDwz26ZR2ywfsquqLXMwnIEeUI/hQTng9ZxLkJMY22rQSA9nsdvR8S1b0U8Qu
ViYEjCTMWF8HEFFO721MlkTgchzq6fiF+9ZydCpVUJWolcfw1OgUvvTSI7Eyhelb
7fc1fTVFeEMsHrtjpu8dg+IaCNraBzv5QZx6MYW7SSoTVp8mJoPnzYbsZs9nHJGX
iQOFmO/sIryOoeJlpOCGT55yU74yRXrBsYZyLz0P9K1FDQS6l9W33BqmF9vSXujs
kSByq8v1OU0IqidnMmZtTDSRlpQL/oadqQnsA6jiWyMznuUEU8tfgUALE4DKRq8P
wBLKVfMiwcWAbl121M2DCLj9/g==
-----END ENCRYPTED PRIVATE KEY-----
</pre></div> </div> </div><p>When the PEM format is used to store cryptographic keys, the body of the content is in a format called PKCS #8. Initially a standard created by a private company (RSA Laboratories), it became a de facto standard and has since been described in various RFCs, most notably <a href="https://tools.ietf.org/html/rfc5208">RFC 5208</a> ("Public-Key Cryptography Standards (PKCS) #8: Private-Key Information Syntax Specification Version 1.2").</p><p>The PKCS #8 format describes the content using a description language called <a href="https://en.wikipedia.org/wiki/Abstract_Syntax_Notation_One">ASN.1</a> (Abstract Syntax Notation One) and the related binary encoding <a href="https://en.wikipedia.org/wiki/X.690">DER</a> (Distinguished Encoding Rules) to serialise the resulting structure. This means that Base64-decoding the content will return some binary content that can be processed only by an ASN.1 parser.</p><p>Let me visually recap the structure</p><div class="code"><div class="content"><div class="highlight"><pre>-----BEGIN label-----
+--------------------------- Base64 ---------------------------+
| |
| PKCS #8 content: |
| ASN.1 language serialised with DER |
| |
+--------------------------------------------------------------+
-----END label-----
</pre></div> </div> </div><p>Please note that, due to the underlying ASN.1 structure, RSA PEM bodies always start with the same characters: <code>MIG</code> for 1024-bit keys, <code>MII</code> for 2048- and 4096-bit ones.</p><h3 id="openssl-and-asn.1-4acc">OpenSSL and ASN.1</h3><p>OpenSSL can directly decode a key in PEM format and show the underlying ASN.1 structure with the module <code>asn1parse</code></p><div class="code"><div class="content"><div class="highlight"><pre>$ openssl asn1parse -inform pem -in private.pem
0:d=0 hl=4 l=1214 cons: SEQUENCE
4:d=1 hl=2 l= 1 prim: INTEGER :00
7:d=1 hl=2 l= 13 cons: SEQUENCE
9:d=2 hl=2 l= 9 prim: OBJECT :rsaEncryption
20:d=2 hl=2 l= 0 prim: NULL
22:d=1 hl=4 l=1192 prim: OCTET STRING [HEX DUMP]:308204A40201000282010100B2F5FD3F9F0917112
CE42F8BF87ED676E15258BE443F36DEAFB0B69BDE2496B495EAAD1B01CAD84271B014E96F79386C636D348516DA74A68
A8C70FBA882870C47B4218D8F49186DDF72727B9D80C21911C3E337C6E407FFB47C2F2767B0D164D8A1E9AF95F6481BF
8D9EDFB2E3904B2529268C460256FAFD0A677D29898F10B1D15128A695839FC08EDD584E8335615B1D1D7277BE65C532
DCA92DDC7050374868B117EA9154914EF9292B8443F13696E4FAD50DED6BD90E5A6F7ED33BE2ECE31C6DD7A4253EE6CD
C56787DDD1D5CD776614022DB87D03BB22F23285B5A3167AF8DACABBEA40004471337D3781E8C5CCA0EA5E27799B510E
4EF938C61CAA60D02030100010282010100B24255000A6A03901827333539511E4F4C21BA43CBB72BF0A51060D4E1719
0AC50A871C57503986696D7CDFCB80D0726EFE2D76DBA55DFDC0425E064CC753810035C6A0F97AA37AB39E7C6215BC1E
595131D0C3782E5A11213B59F42A1067F8CF43C538992D6BEFD1DE3F6293CE18ECC1173C4E7D6DD7362AD7323E7A218B
5FFB0F245EB796327CC87493EDD134234ED5F3B14A4C4D92374597F64A6D3CB2C10F0CD2D57E99F58C8D28F2049D1433
CC4BD677017AD1BDD1C83CFB8FB7E8C8FDCF0B4FB77DE7B8285749CEDFBFD6878F7F7930073F0F42ADDCBA8385D7ED05
CDFCAA2A2BA757601723A96201FECCC2E65C65E14F65F1D34D6ECDFE3F85401800102818100E1D16389BF6EFF7AE44F6
57106ED81C81A48B5FB356F83DD4A229E8654BDC036716BBD9D46DFD1498132545054958ACA5CFDA709D97CC8C6A9E92
03D05F7B9D45E685A19A5F58267FCB17FCF502B32CFEDB94CAEA58EE5F63EBA5F33D09946C8652132344410D3D658748
BCAE256F24896C2A9AD9340D3C8392652DA8ED7346D02818100CAE155C9B3A4546B5FC3CF4CC80D539D531C406BAC5ED
82818E977B496F9F614CEFB1179E3BFBFAB22BCA7F88EBB8C9B1327AE70113242DFF0866370B6C76782DBD50DBE1FEE9
B3316B9AAC7BABB7CFA0A9EF26C3C976CF62DA8F41EFE065458DC7C1CBCA78FB1CB4FF7AA50D116CE1640956A4E89EAD
F5293FAA13A2349F42102818100BC3B93324E6D92EE7883AA366624F28ABF461ED3B0BE2CF7F805158939F815D20C075
83E52C6DCA8DDD5FB2C1EE5AC9474A1476CD16ACFDDB1E24EEA2F204939BA1C58068B2D342FC4169D484D36451BC7B82
F306176D53FC71809A5A25B320277320DAC3D949D504DD9907164EC3EF7BD1BB4DEA82160A7C4E3AA2ADEE88A9D02818
02915E921A7D7A7A0F70BD8775C2C16BACD91F319DB1679FFE4CBA30A5768D784EF45B90C4E2B0ECDC18323211B06B03
AD76E39CD482E3D8CCC50EAE270A1813CE6F80688723F07FF18A3110AD1AE16692CAD73BAA7AAA2CE5800D72F4F92489
296542C1DA87159382B41A4A42933CD18848BBDB39A0A8E9F5288770E27075B010281803AB4E3B841AB234515BF0A8D2
E40FB6E95389702D834474E9AD849124DC6C1D342738D4E7510265DF6B744EBAA4A88A7995346BEEF047DB024CE8B2A4
E3923B0566389948AB0BBB031879770DA14F4418AEB75AE98349122A2D9535117B05BEF938A1211A3BE6E882957BC2A5
F1DE5CA50C26F42EE0A383A2A2B6340D52E1A36
</pre></div> </div> </div><p>What you see in the code snippet is the private key in ASN.1 format. Remember that DER is only used to go from the text representation of ASN.1 to binary data, so we don't see it unless we decode the Base64 content into a file and open it with a binary editor.</p><p>Note that the ASN.1 structure contains the type of the object (<code>rsaEncryption</code>, in this case). You can further decode the <code>OCTET STRING</code> field, which is the actual key, by specifying the offset</p><div class="code"><div class="content"><div class="highlight"><pre>$ openssl asn1parse -inform pem -in private.pem -strparse 22
0:d=0 hl=4 l=1188 cons: SEQUENCE
4:d=1 hl=2 l= 1 prim: INTEGER :00
7:d=1 hl=4 l= 257 prim: INTEGER :B2F5FD3F9F0917112CE42F8BF87ED676E15258BE443F36DEAFB
0B69BDE2496B495EAAD1B01CAD84271B014E96F79386C636D348516DA74A68A8C70FBA882870C47B4218D8F49186DDF
72727B9D80C21911C3E337C6E407FFB47C2F2767B0D164D8A1E9AF95F6481BF8D9EDFB2E3904B2529268C460256FAFD
0A677D29898F10B1D15128A695839FC08EDD584E8335615B1D1D7277BE65C532DCA92DDC7050374868B117EA9154914
EF9292B8443F13696E4FAD50DED6BD90E5A6F7ED33BE2ECE31C6DD7A4253EE6CDC56787DDD1D5CD776614022DB87D03
BB22F23285B5A3167AF8DACABBEA40004471337D3781E8C5CCA0EA5E27799B510E4EF938C61CAA60D
268:d=1 hl=2 l= 3 prim: INTEGER :010001
273:d=1 hl=4 l= 257 prim: INTEGER :B24255000A6A03901827333539511E4F4C21BA43CBB72BF0A51
060D4E17190AC50A871C57503986696D7CDFCB80D0726EFE2D76DBA55DFDC0425E064CC753810035C6A0F97AA37AB39
E7C6215BC1E595131D0C3782E5A11213B59F42A1067F8CF43C538992D6BEFD1DE3F6293CE18ECC1173C4E7D6DD7362A
D7323E7A218B5FFB0F245EB796327CC87493EDD134234ED5F3B14A4C4D92374597F64A6D3CB2C10F0CD2D57E99F58C8
D28F2049D1433CC4BD677017AD1BDD1C83CFB8FB7E8C8FDCF0B4FB77DE7B8285749CEDFBFD6878F7F7930073F0F42AD
DCBA8385D7ED05CDFCAA2A2BA757601723A96201FECCC2E65C65E14F65F1D34D6ECDFE3F854018001
534:d=1 hl=3 l= 129 prim: INTEGER :E1D16389BF6EFF7AE44F657106ED81C81A48B5FB356F83DD4A2
29E8654BDC036716BBD9D46DFD1498132545054958ACA5CFDA709D97CC8C6A9E9203D05F7B9D45E685A19A5F58267FC
B17FCF502B32CFEDB94CAEA58EE5F63EBA5F33D09946C8652132344410D3D658748BCAE256F24896C2A9AD9340D3C83
92652DA8ED7346D
666:d=1 hl=3 l= 129 prim: INTEGER :CAE155C9B3A4546B5FC3CF4CC80D539D531C406BAC5ED82818E
977B496F9F614CEFB1179E3BFBFAB22BCA7F88EBB8C9B1327AE70113242DFF0866370B6C76782DBD50DBE1FEE9B3316
B9AAC7BABB7CFA0A9EF26C3C976CF62DA8F41EFE065458DC7C1CBCA78FB1CB4FF7AA50D116CE1640956A4E89EADF529
3FAA13A2349F421
798:d=1 hl=3 l= 129 prim: INTEGER :BC3B93324E6D92EE7883AA366624F28ABF461ED3B0BE2CF7F80
5158939F815D20C07583E52C6DCA8DDD5FB2C1EE5AC9474A1476CD16ACFDDB1E24EEA2F204939BA1C58068B2D342FC4
169D484D36451BC7B82F306176D53FC71809A5A25B320277320DAC3D949D504DD9907164EC3EF7BD1BB4DEA82160A7C
4E3AA2ADEE88A9D
930:d=1 hl=3 l= 128 prim: INTEGER :2915E921A7D7A7A0F70BD8775C2C16BACD91F319DB1679FFE4C
BA30A5768D784EF45B90C4E2B0ECDC18323211B06B03AD76E39CD482E3D8CCC50EAE270A1813CE6F80688723F07FF18
A3110AD1AE16692CAD73BAA7AAA2CE5800D72F4F92489296542C1DA87159382B41A4A42933CD18848BBDB39A0A8E9F5
288770E27075B01
1061:d=1 hl=3 l= 128 prim: INTEGER :3AB4E3B841AB234515BF0A8D2E40FB6E95389702D834474E9AD8
49124DC6C1D342738D4E7510265DF6B744EBAA4A88A7995346BEEF047DB024CE8B2A4E3923B0566389948AB0BBB0318
79770DA14F4418AEB75AE98349122A2D9535117B05BEF938A1211A3BE6E882957BC2A5F1DE5CA50C26F42EE0A383A2A
2B6340D52E1A36
</pre></div> </div> </div><p>Since this is an RSA key, the fields represent specific components of the algorithm. After the version number (<code>00</code>) we find, in order: the modulus <code>n = pq</code>, the public exponent <code>e</code>, the private exponent <code>d</code>, the two prime numbers <code>p</code> and <code>q</code>, and the values <code>d_p</code>, <code>d_q</code>, and <code>q_inv</code> (for the <a href="https://en.wikipedia.org/wiki/Chinese_remainder_theorem">Chinese remainder theorem</a> speed-up).</p><p>If the key has been encrypted there are fields with information about the cipher, and the <code>OCTET STRING</code> field that holds the key cannot be further parsed because of the encryption.</p><div class="code"><div class="content"><div class="highlight"><pre>$ openssl asn1parse -inform pem -in private-enc.pem
0:d=0 hl=4 l=1311 cons: SEQUENCE
4:d=1 hl=2 l= 73 cons: SEQUENCE
6:d=2 hl=2 l= 9 prim: OBJECT :PBES2
17:d=2 hl=2 l= 60 cons: SEQUENCE
19:d=3 hl=2 l= 27 cons: SEQUENCE
21:d=4 hl=2 l= 9 prim: OBJECT :PBKDF2
32:d=4 hl=2 l= 14 cons: SEQUENCE
34:d=5 hl=2 l= 8 prim: OCTET STRING [HEX DUMP]:7FBE6B5C86A4B922
44:d=5 hl=2 l= 2 prim: INTEGER :0800
48:d=3 hl=2 l= 29 cons: SEQUENCE
50:d=4 hl=2 l= 9 prim: OBJECT :aes-256-cbc
61:d=4 hl=2 l= 16 prim: OCTET STRING [HEX DUMP]:7FC1CC749F456498F01E43108E4340DE
79:d=1 hl=4 l=1232 prim: OCTET STRING [HEX DUMP]:A5581EDC2797FC4E1AD0B66A00B765900AF1164D8
F67458C1A4E72F54A65F2B8C0C5AD7E42584B95161FD98FBECA07D8E1049687C365ED157C45F1B57B175D2EF778A1FE7
D12E50C0DF4248F0E1469DA40F9948581F16546F9582D9DCA83AC07C9466A6E3E6CE98CC241C44DAB32F5891B96DE302
4B6E6A0F4980C6286D6EB8AA1680AD132810EEFB127DE42968142F4F9A4A2CE55A560C054C54DFFBB720A81F3F50A2A6
3D748CE06309F55340BD4C74980C48F4C9D41650568A62BBE8E0337653BD4A2F7D47C3A24514B5D3100ED40C164831C6
5A96DC90AD20F4AEF02E00203B0F0B2D550987AEE8F4C7E0E7C0CFF426B465D3CF568D02EE86AF043345954B0AAA649F
A9F80E026E2A189EC60772A058615DCFEC9EC4D2D12CDEB7844EAA00202E435A0B9B0A28AC4F2DA213214F773A2319D5
5A560D5C99246F9895F5EF04D97FF1CE26EFC2FF82249F6E94253CB92EE0A74AE3942285C2DFFC77883709E7FF2569FD
9C8F58C112CD4A125E40E7BC8599242D71DE7D48416B6A36FBE0B90BA9A05AFB982CAE9AD337C2318582AA328ABC341F
BB1C036DE334DE327DEC97BA757CBBAED26F25DD74BD8BE9215B479CD49D8357AFA5289A0265ADE025F9FC0CDB1CDBF0
4C812F20B7CEB58BF12C1FD1756AABD7F557B87E1D245E8062D1DF4078D77AD98BFDF0C0F3A06A7FA11BFAE0EBF8F3EE
1F8AB0D6D7C905D4D238E2738613EA753E044589CEBDF3714CACEC298653FA45AF5977BDCF23B5DD60B479C7958B8AC1
8CAA4AA4A79C283805246675BBB8D2D0E5B714320E7E6FE8B2EF73DB9839095229B9653726AB9689B19AB47113F70204
83B2D1A82FE2EB9ABAB429DDF5ACDEBCAB62BABD48D2DBA1D398B03F9919F1DAC8CDA19D39BBAF2B5FE96C43E78F565C
465019DF88E71BCE35C6F7F8BE87EB384FA1193345E47CA9382BCEFFC2E6B37681E8D95EB48BC7044F7DCA743217D4C0
81200502E98EC2CFAA9D17277D5385E65CC8104DA999E31532A8B9B3B4D3E219613AE09BC9F10553CC4E5F135ACD3FB4
A3BBAB21839CEFBBC0D4BB16AE4FBD7407E6E3709B059BD86AFFE032805CE5FB0B8005009B5964B79E478DA7FE88C20D
D2FEDA10A0EB3433ADC90AF5DD8772B840A5CD7C5E32D96153E41F12BA501EF1F48C4E20CB0120CFBB6F546C2B6E22E0
834CB9DFBFA4834FEB4B7374788F781A1634ABF9D1FD014E6DB3749E6A086155521ADB9F271D6BF6F60455903B1D913D
A639EE9F5CA5135FD2A1873FF35EAB8C151C5B90826E4303233D4BB053EBD929107874CDCCADFFF492A7CB595EADF03E
4C0FE15326752898F1B9AA3EAC9907D9F276E6AB37AFA34FF8F3DBAB7B009754CF1A13029CD6857686105830F0CF6E99
476CB07ECAAEA8B5CCC2720479423F8504E783D6712E424C636DAB41203D9EC76F47C4B56F453C42E5626048C24CC585
F0710514EEF6D4C9644E0721CEAE9F885FBD672742A555095A895C7F0D4E814BEF4D223B13285E95BEDF7357D3545784
32C1EBB63A6EF1D83E21A08DADA073BF9419C7A3185BB492A13569F262683E7CD86EC66CF671C919789038598EFEC22B
C8EA1E265A4E0864F9E7253BE32457AC1B186722F3D0FF4AD450D04BA97D5B7DC1AA617DBD25EE8EC912072ABCBF5394
D08AA276732666D4C349196940BFE869DA909EC03A8E25B23339EE50453CB5F81400B1380CA46AF0FC012CA55F322C1C
5806E5D76D4CD8308B8FDFE
</pre></div> </div> </div><h3 id="openssl-and-rsa-keys-9de7">OpenSSL and RSA keys</h3><p>Another way to look into a private key with OpenSSL is to use the module <code>rsa</code>. While the module <code>asn1parse</code> is a generic ASN.1 parser, the module <code>rsa</code> knows the structure of an RSA key and can properly output the field names</p><div class="code"><div class="content"><div class="highlight"><pre>$ openssl rsa -in private.pem -noout -text
Private-Key: (2048 bit)
modulus:
00:b2:f5:fd:3f:9f:09:17:11:2c:e4:2f:8b:f8:7e:
d6:76:e1:52:58:be:44:3f:36:de:af:b0:b6:9b:de:
24:96:b4:95:ea:ad:1b:01:ca:d8:42:71:b0:14:e9:
6f:79:38:6c:63:6d:34:85:16:da:74:a6:8a:8c:70:
fb:a8:82:87:0c:47:b4:21:8d:8f:49:18:6d:df:72:
72:7b:9d:80:c2:19:11:c3:e3:37:c6:e4:07:ff:b4:
7c:2f:27:67:b0:d1:64:d8:a1:e9:af:95:f6:48:1b:
f8:d9:ed:fb:2e:39:04:b2:52:92:68:c4:60:25:6f:
af:d0:a6:77:d2:98:98:f1:0b:1d:15:12:8a:69:58:
39:fc:08:ed:d5:84:e8:33:56:15:b1:d1:d7:27:7b:
e6:5c:53:2d:ca:92:dd:c7:05:03:74:86:8b:11:7e:
a9:15:49:14:ef:92:92:b8:44:3f:13:69:6e:4f:ad:
50:de:d6:bd:90:e5:a6:f7:ed:33:be:2e:ce:31:c6:
dd:7a:42:53:ee:6c:dc:56:78:7d:dd:1d:5c:d7:76:
61:40:22:db:87:d0:3b:b2:2f:23:28:5b:5a:31:67:
af:8d:ac:ab:be:a4:00:04:47:13:37:d3:78:1e:8c:
5c:ca:0e:a5:e2:77:99:b5:10:e4:ef:93:8c:61:ca:
a6:0d
publicExponent: 65537 (0x10001)
privateExponent:
00:b2:42:55:00:0a:6a:03:90:18:27:33:35:39:51:
1e:4f:4c:21:ba:43:cb:b7:2b:f0:a5:10:60:d4:e1:
71:90:ac:50:a8:71:c5:75:03:98:66:96:d7:cd:fc:
b8:0d:07:26:ef:e2:d7:6d:ba:55:df:dc:04:25:e0:
64:cc:75:38:10:03:5c:6a:0f:97:aa:37:ab:39:e7:
c6:21:5b:c1:e5:95:13:1d:0c:37:82:e5:a1:12:13:
b5:9f:42:a1:06:7f:8c:f4:3c:53:89:92:d6:be:fd:
1d:e3:f6:29:3c:e1:8e:cc:11:73:c4:e7:d6:dd:73:
62:ad:73:23:e7:a2:18:b5:ff:b0:f2:45:eb:79:63:
27:cc:87:49:3e:dd:13:42:34:ed:5f:3b:14:a4:c4:
d9:23:74:59:7f:64:a6:d3:cb:2c:10:f0:cd:2d:57:
e9:9f:58:c8:d2:8f:20:49:d1:43:3c:c4:bd:67:70:
17:ad:1b:dd:1c:83:cf:b8:fb:7e:8c:8f:dc:f0:b4:
fb:77:de:7b:82:85:74:9c:ed:fb:fd:68:78:f7:f7:
93:00:73:f0:f4:2a:dd:cb:a8:38:5d:7e:d0:5c:df:
ca:a2:a2:ba:75:76:01:72:3a:96:20:1f:ec:cc:2e:
65:c6:5e:14:f6:5f:1d:34:d6:ec:df:e3:f8:54:01:
80:01
prime1:
00:e1:d1:63:89:bf:6e:ff:7a:e4:4f:65:71:06:ed:
81:c8:1a:48:b5:fb:35:6f:83:dd:4a:22:9e:86:54:
bd:c0:36:71:6b:bd:9d:46:df:d1:49:81:32:54:50:
54:95:8a:ca:5c:fd:a7:09:d9:7c:c8:c6:a9:e9:20:
3d:05:f7:b9:d4:5e:68:5a:19:a5:f5:82:67:fc:b1:
7f:cf:50:2b:32:cf:ed:b9:4c:ae:a5:8e:e5:f6:3e:
ba:5f:33:d0:99:46:c8:65:21:32:34:44:10:d3:d6:
58:74:8b:ca:e2:56:f2:48:96:c2:a9:ad:93:40:d3:
c8:39:26:52:da:8e:d7:34:6d
prime2:
00:ca:e1:55:c9:b3:a4:54:6b:5f:c3:cf:4c:c8:0d:
53:9d:53:1c:40:6b:ac:5e:d8:28:18:e9:77:b4:96:
f9:f6:14:ce:fb:11:79:e3:bf:bf:ab:22:bc:a7:f8:
8e:bb:8c:9b:13:27:ae:70:11:32:42:df:f0:86:63:
70:b6:c7:67:82:db:d5:0d:be:1f:ee:9b:33:16:b9:
aa:c7:ba:bb:7c:fa:0a:9e:f2:6c:3c:97:6c:f6:2d:
a8:f4:1e:fe:06:54:58:dc:7c:1c:bc:a7:8f:b1:cb:
4f:f7:aa:50:d1:16:ce:16:40:95:6a:4e:89:ea:df:
52:93:fa:a1:3a:23:49:f4:21
exponent1:
00:bc:3b:93:32:4e:6d:92:ee:78:83:aa:36:66:24:
f2:8a:bf:46:1e:d3:b0:be:2c:f7:f8:05:15:89:39:
f8:15:d2:0c:07:58:3e:52:c6:dc:a8:dd:d5:fb:2c:
1e:e5:ac:94:74:a1:47:6c:d1:6a:cf:dd:b1:e2:4e:
ea:2f:20:49:39:ba:1c:58:06:8b:2d:34:2f:c4:16:
9d:48:4d:36:45:1b:c7:b8:2f:30:61:76:d5:3f:c7:
18:09:a5:a2:5b:32:02:77:32:0d:ac:3d:94:9d:50:
4d:d9:90:71:64:ec:3e:f7:bd:1b:b4:de:a8:21:60:
a7:c4:e3:aa:2a:de:e8:8a:9d
exponent2:
29:15:e9:21:a7:d7:a7:a0:f7:0b:d8:77:5c:2c:16:
ba:cd:91:f3:19:db:16:79:ff:e4:cb:a3:0a:57:68:
d7:84:ef:45:b9:0c:4e:2b:0e:cd:c1:83:23:21:1b:
06:b0:3a:d7:6e:39:cd:48:2e:3d:8c:cc:50:ea:e2:
70:a1:81:3c:e6:f8:06:88:72:3f:07:ff:18:a3:11:
0a:d1:ae:16:69:2c:ad:73:ba:a7:aa:a2:ce:58:00:
d7:2f:4f:92:48:92:96:54:2c:1d:a8:71:59:38:2b:
41:a4:a4:29:33:cd:18:84:8b:bd:b3:9a:0a:8e:9f:
52:88:77:0e:27:07:5b:01
coefficient:
3a:b4:e3:b8:41:ab:23:45:15:bf:0a:8d:2e:40:fb:
6e:95:38:97:02:d8:34:47:4e:9a:d8:49:12:4d:c6:
c1:d3:42:73:8d:4e:75:10:26:5d:f6:b7:44:eb:aa:
4a:88:a7:99:53:46:be:ef:04:7d:b0:24:ce:8b:2a:
4e:39:23:b0:56:63:89:94:8a:b0:bb:b0:31:87:97:
70:da:14:f4:41:8a:eb:75:ae:98:34:91:22:a2:d9:
53:51:17:b0:5b:ef:93:8a:12:11:a3:be:6e:88:29:
57:bc:2a:5f:1d:e5:ca:50:c2:6f:42:ee:0a:38:3a:
2a:2b:63:40:d5:2e:1a:36
</pre></div> </div> </div><p>The fields are the same ones we found in the ASN.1 structure, but in this representation we have a better view of the specific values of the RSA key. You can compare the two and see that the values of the fields are the same.</p><p>If you want to learn something about RSA, try to investigate the historical reasons behind the choice of 65537 as a common public exponent (as you can see here in the section <code>publicExponent</code>).</p><h3 id="pkcs-8-vs-pkcs-1-8e2d">PKCS #8 vs PKCS #1</h3><p>The first of the PKCS standards (PKCS #1) was specifically tailored to contain an RSA key. Its ASN.1 definition can be found in <a href="https://tools.ietf.org/html/rfc8017">RFC 8017</a> ("PKCS #1: RSA Cryptography Specifications Version 2.2")</p><div class="code"><div class="content"><div class="highlight"><pre>RSAPublicKey ::= SEQUENCE {
modulus INTEGER, -- n
publicExponent INTEGER -- e
}
RSAPrivateKey ::= SEQUENCE {
version Version,
modulus INTEGER, -- n
publicExponent INTEGER, -- e
privateExponent INTEGER, -- d
prime1 INTEGER, -- p
prime2 INTEGER, -- q
exponent1 INTEGER, -- d mod (p-1)
exponent2 INTEGER, -- d mod (q-1)
coefficient INTEGER, -- (inverse of q) mod p
otherPrimeInfos OtherPrimeInfos OPTIONAL
}
</pre></div> </div> </div><p>Subsequently, as the need to describe new types of algorithms increased, the PKCS #8 standard was developed. This can contain different types of keys, and defines a specific field for the algorithm identifier. Its ASN.1 definition can be found in <a href="https://tools.ietf.org/html/rfc5958">RFC 5958</a> ("Asymmetric Key Packages")</p><div class="code"><div class="content"><div class="highlight"><pre>OneAsymmetricKey ::= SEQUENCE {
version Version,
privateKeyAlgorithm PrivateKeyAlgorithmIdentifier,
privateKey PrivateKey,
attributes [0] Attributes OPTIONAL,
...,
[[2: publicKey [1] PublicKey OPTIONAL ]],
...
}
PrivateKey ::= OCTET STRING
-- Content varies based on type of key. The
-- algorithm identifier dictates the format of
-- the key.
</pre></div> </div> </div><p>For the RSA algorithm, the content of the field <code>PrivateKey</code> is the same <code>RSAPrivateKey</code> structure used in PKCS #1.</p><p>If the PEM format uses PKCS #8 its header and footer are</p><div class="code"><div class="content"><div class="highlight"><pre>-----BEGIN PRIVATE KEY-----
[...]
-----END PRIVATE KEY-----
</pre></div> </div> </div><p>If it uses PKCS #1, however, there has to be an external identification of the algorithm, so the header and footer are</p><div class="code"><div class="content"><div class="highlight"><pre>-----BEGIN RSA PRIVATE KEY-----
[...]
-----END RSA PRIVATE KEY-----
</pre></div> </div> </div><p>The structure of PKCS #8 is the reason why we had to parse the field at offset 22 to access the RSA parameters when using the module <code>asn1parse</code> of OpenSSL. If you are parsing a PKCS #1 key in PEM format you don't need this second step.</p><h2 id="private-and-public-key-1848">Private and public key<a class="headerlink" href="#private-and-public-key-1848" title="Permanent link">¶</a></h2><p>In the RSA algorithm the public key is built using the modulus and the public exponent, which means that we can always derive the public key from the private key. OpenSSL can easily do this with the module <code>rsa</code>, producing the public key in PEM format</p><div class="code"><div class="content"><div class="highlight"><pre>$ openssl rsa -in private.pem -pubout
writing RSA key
-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAsvX9P58JFxEs5C+L+H7W
duFSWL5EPzber7C2m94klrSV6q0bAcrYQnGwFOlveThsY200hRbadKaKjHD7qIKH
DEe0IY2PSRht33Jye52AwhkRw+M3xuQH/7R8LydnsNFk2KHpr5X2SBv42e37LjkE
slKSaMRgJW+v0KZ30piY8QsdFRKKaVg5/Ajt1YToM1YVsdHXJ3vmXFMtypLdxwUD
dIaLEX6pFUkU75KSuEQ/E2luT61Q3ta9kOWm9+0zvi7OMcbdekJT7mzcVnh93R1c
13ZhQCLbh9A7si8jKFtaMWevjayrvqQABEcTN9N4Hoxcyg6l4neZtRDk75OMYcqm
DQIDAQAB
-----END PUBLIC KEY-----
</pre></div> </div> </div><p>You can dump the information in the public key specifying the flag <code>-pubin</code></p><div class="code"><div class="content"><div class="highlight"><pre>$ openssl rsa -in public.pem -noout -text -pubin
Public-Key: (2048 bit)
Modulus:
00:b2:f5:fd:3f:9f:09:17:11:2c:e4:2f:8b:f8:7e:
d6:76:e1:52:58:be:44:3f:36:de:af:b0:b6:9b:de:
24:96:b4:95:ea:ad:1b:01:ca:d8:42:71:b0:14:e9:
6f:79:38:6c:63:6d:34:85:16:da:74:a6:8a:8c:70:
fb:a8:82:87:0c:47:b4:21:8d:8f:49:18:6d:df:72:
72:7b:9d:80:c2:19:11:c3:e3:37:c6:e4:07:ff:b4:
7c:2f:27:67:b0:d1:64:d8:a1:e9:af:95:f6:48:1b:
f8:d9:ed:fb:2e:39:04:b2:52:92:68:c4:60:25:6f:
af:d0:a6:77:d2:98:98:f1:0b:1d:15:12:8a:69:58:
39:fc:08:ed:d5:84:e8:33:56:15:b1:d1:d7:27:7b:
e6:5c:53:2d:ca:92:dd:c7:05:03:74:86:8b:11:7e:
a9:15:49:14:ef:92:92:b8:44:3f:13:69:6e:4f:ad:
50:de:d6:bd:90:e5:a6:f7:ed:33:be:2e:ce:31:c6:
dd:7a:42:53:ee:6c:dc:56:78:7d:dd:1d:5c:d7:76:
61:40:22:db:87:d0:3b:b2:2f:23:28:5b:5a:31:67:
af:8d:ac:ab:be:a4:00:04:47:13:37:d3:78:1e:8c:
5c:ca:0e:a5:e2:77:99:b5:10:e4:ef:93:8c:61:ca:
a6:0d
Exponent: 65537 (0x10001)
</pre></div> </div> </div><h2 id="generating-key-pairs-with-openssl-5c37">Generating key pairs with OpenSSL<a class="headerlink" href="#generating-key-pairs-with-openssl-5c37" title="Permanent link">¶</a></h2><p>If you want to generate an RSA private key you can do it with OpenSSL</p><div class="code"><div class="content"><div class="highlight"><pre>$ openssl genpkey -algorithm RSA -out private.pem \
-pkeyopt rsa_keygen_bits:2048
......................................................................+++
..........+++
</pre></div> </div> </div><p>Since OpenSSL is a collection of modules, we specify <code>genpkey</code> to generate a private key. The option <code>-algorithm</code> specifies which algorithm we want to use to generate the key (RSA in this case), <code>-out</code> specifies the name of the output file, and <code>-pkeyopt</code> allows us to set the value for specific key options, in this case the length of the RSA key in bits.</p><p>If you want an encrypted key you can generate one by specifying the cipher (for example <code>-aes-256-cbc</code>)</p><div class="code"><div class="content"><div class="highlight"><pre>$ openssl genpkey -algorithm RSA -out private-enc.pem \
-aes-256-cbc -pkeyopt rsa_keygen_bits:2048
...........................+++
..........+++
Enter PEM pass phrase:
Verifying - Enter PEM pass phrase:
</pre></div> </div> </div><p>You can see the list of supported ciphers with <code>openssl list-cipher-algorithms</code> (<code>openssl list -cipher-algorithms</code> in OpenSSL 1.1.0 and later). In both cases you can then extract the public key with the method shown previously. Private keys generated by <code>genpkey</code> use PKCS #8, so unencrypted keys will be in the form</p><div class="code"><div class="content"><div class="highlight"><pre>-----BEGIN PRIVATE KEY-----
[...]
-----END PRIVATE KEY-----
</pre></div> </div> </div><p>and encrypted ones in the form</p><div class="code"><div class="content"><div class="highlight"><pre>-----BEGIN ENCRYPTED PRIVATE KEY-----
[...]
-----END ENCRYPTED PRIVATE KEY-----
</pre></div> </div> </div><h2 id="generating-key-pairs-with-openssh-d4fa">Generating key pairs with OpenSSH<a class="headerlink" href="#generating-key-pairs-with-openssh-d4fa" title="Permanent link">¶</a></h2><p>Another tool that you can use to generate key pairs is ssh-keygen, which is a tool included in the SSH suite that is specifically used to create and manage SSH keys. As SSH keys are standard asymmetrical keys we can use the tool to create keys for other purposes.</p><p>To create a key pair just run</p><div class="code"><div class="content"><div class="highlight"><pre>ssh-keygen -m PEM -t rsa -b 2048 -f key
</pre></div> </div> </div><p>The option <code>-m</code> specifies the key format. By default OpenSSH uses its own format specified in <a href="https://tools.ietf.org/html/rfc4716">RFC 4716</a> ("The Secure Shell (SSH) Public Key File Format").</p><p>The option <code>-t</code> specifies the key generation algorithm (RSA in this case), while the option <code>-b</code> specifies the length of the key in bits.</p><p>The option <code>-f</code> sets the name of the output file. If not present, ssh-keygen will ask for the name of the file, offering to save it to the default file <code>~/.ssh/id_rsa</code>. The tool always asks for a passphrase to encrypt the key, but you are allowed to enter an empty one to skip the encryption.</p><p>This tool creates two files. One is the private key file, named as requested, and the second is the public key file, which has the same name as the private key file plus the extension <code>.pub</code>.</p><p>The value <code>PEM</code> specified for the option <code>-m</code> writes the private key using the PKCS #1 format, so the key will be in the form</p><div class="code"><div class="content"><div class="highlight"><pre>-----BEGIN RSA PRIVATE KEY-----
[...]
-----END RSA PRIVATE KEY-----
</pre></div> </div> </div><p>Using <code>-m PKCS8</code> instead uses PKCS #8 and the key will be in the form</p><div class="code"><div class="content"><div class="highlight"><pre>-----BEGIN PRIVATE KEY-----
[...]
-----END PRIVATE KEY-----
</pre></div> </div> </div><h3 id="the-openssh-public-key-format-68f1">The OpenSSH public key format</h3><p>The public key saved by ssh-keygen is written in the so-called SSH-format, which is not a standard in the cryptography world. Its structure is <code>ALGORITHM KEY COMMENT</code>, where the <code>KEY</code> part of the format is encoded with Base64.</p><p>For example</p><div class="code"><div class="content"><div class="highlight"><pre>ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCy9f0/nwkXESzkL4v4ftZ24VJYvkQ/Nt6vsLab3iSWtJXqrRsBythCcbAU6W9
5OGxjbTSFFtp0poqMcPuogocMR7QhjY9JGG3fcnJ7nYDCGRHD4zfG5Af/tHwvJ2ew0WTYoemvlfZIG/jZ7fsuOQSyUpJoxGAlb6
/QpnfSmJjxCx0VEoppWDn8CO3VhOgzVhWx0dcne+ZcUy3Kkt3HBQN0hosRfqkVSRTvkpK4RD8TaW5PrVDe1r2Q5ab37TO+Ls4xx
t16QlPubNxWeH3dHVzXdmFAItuH0DuyLyMoW1oxZ6+NrKu+pAAERxM303gejFzKDqXid5m1EOTvk4xhyqYN user@host
</pre></div> </div> </div><p>To manually decode the central part of the key you can use <code>base64</code> and <code>hexdump</code></p><div class="code"><div class="content"><div class="highlight"><pre>$ cat key.pub | cut -d " " -f2 | \
base64 -d | hexdump -ve '/1 "%02x "' -e '2/8 "\n"'
00 00 00 07 73 73 68 2d 72 73 61 00 00 00 03 01
00 01 00 00 01 01 00 b2 f5 fd 3f 9f 09 17 11 2c
e4 2f 8b f8 7e d6 76 e1 52 58 be 44 3f 36 de af
b0 b6 9b de 24 96 b4 95 ea ad 1b 01 ca d8 42 71
b0 14 e9 6f 79 38 6c 63 6d 34 85 16 da 74 a6 8a
8c 70 fb a8 82 87 0c 47 b4 21 8d 8f 49 18 6d df
72 72 7b 9d 80 c2 19 11 c3 e3 37 c6 e4 07 ff b4
7c 2f 27 67 b0 d1 64 d8 a1 e9 af 95 f6 48 1b f8
d9 ed fb 2e 39 04 b2 52 92 68 c4 60 25 6f af d0
a6 77 d2 98 98 f1 0b 1d 15 12 8a 69 58 39 fc 08
ed d5 84 e8 33 56 15 b1 d1 d7 27 7b e6 5c 53 2d
ca 92 dd c7 05 03 74 86 8b 11 7e a9 15 49 14 ef
92 92 b8 44 3f 13 69 6e 4f ad 50 de d6 bd 90 e5
a6 f7 ed 33 be 2e ce 31 c6 dd 7a 42 53 ee 6c dc
56 78 7d dd 1d 5c d7 76 61 40 22 db 87 d0 3b b2
2f 23 28 5b 5a 31 67 af 8d ac ab be a4 00 04 47
13 37 d3 78 1e 8c 5c ca 0e a5 e2 77 99 b5 10 e4
ef 93 8c 61 ca a6 0d
</pre></div> </div> </div><p>The structure of this binary file is pretty simple, and is described in two different RFCs. <a href="https://tools.ietf.org/html/rfc4253">RFC 4253</a> ("SSH Transport Layer Protocol") states in section 6.6 that</p><div class="code"><div class="content"><div class="highlight"><pre>The "ssh-rsa" key format has the following specific encoding:
string "ssh-rsa"
mpint e
mpint n
</pre></div> </div> </div><p>while the definition of the types <code>string</code> and <code>mpint</code> can be found in <a href="https://tools.ietf.org/html/rfc4251">RFC 4251</a> ("SSH Protocol Architecture"), section 5</p><div class="code"><div class="content"><div class="highlight"><pre>string
[...] They are stored as a uint32 containing its length
(number of bytes that follow) and zero (= empty string) or more
bytes that are the value of the string. Terminating null
characters are not used. [...]
mpint
Represents multiple precision integers in two's complement format,
stored as a string, 8 bits per byte, MSB first. [...]
</pre></div> </div> </div><p>This means that the above sequence of bytes is interpreted as 4 bytes of length (32 bits of the type <code>uint32</code>) followed by that number of bytes of content.</p><div class="code"><div class="content"><div class="highlight"><pre>(4 bytes) 00 00 00 07 = 7
(7 bytes) 73 73 68 2d 72 73 61 = "ssh-rsa" (US-ASCII)
(4 bytes) 00 00 00 03 = 3
(3 bytes) 01 00 01 = 65537 (a common value for the RSA exponent)
(4 bytes) 00 00 01 01 = 257
(257 bytes) 00 b2 .. ca a6 0d = The key modulus
</pre></div> </div> </div><p>Please note that since we created a key of 2048 bits we should have a modulus of 256 bytes. Instead this key uses 257 bytes, prefixing the number with a byte <code>00</code> to avoid it being interpreted as negative (two's complement format).</p><p>The structure shown above is the reason why all the RSA public SSH keys start with the same 12 characters <code>AAAAB3NzaC1y</code>. This string, decoded from Base64, gives the initial 9 bytes <code>00 00 00 07 73 73 68 2d 72</code> (Base64 characters are not a one-to-one mapping of the source bytes). If the exponent is the standard 65537 the key starts with <code>AAAAB3NzaC1yc2EAAAADAQAB</code>, which decoded gives the first 18 bytes <code>00 00 00 07 73 73 68 2d 72 73 61 00 00 00 03 01 00 01</code>.</p><h2 id="converting-between-pem-and-openssh-format-0167">Converting between PEM and OpenSSH format<a class="headerlink" href="#converting-between-pem-and-openssh-format-0167" title="Permanent link">¶</a></h2><p>We often need to convert files created with one tool to a different format, so this is a list of the most common conversions you might need. I prefer to consider the key format instead of the source tool, but I give a short description of the reason why you might want to perform the conversion.</p><h3 id="pempkcs1-to-pempkcs8-e2dc">PEM/PKCS#1 to PEM/PKCS#8</h3><p>This is useful to convert OpenSSH private keys to a newer format. By default the output key is encrypted and OpenSSL asks for a pass phrase; add the option <code>-nocrypt</code> if you want an unencrypted key.</p><div class="code"><div class="content"><div class="highlight"><pre>openssl pkcs8 -topk8 -inform PEM -outform PEM -in pkcs1.pem -out pkcs8.pem
</pre></div> </div> </div><h3 id="openssh-public-to-pempkcs8-4070">OpenSSH public to PEM/PKCS#8</h3><p>To convert a public OpenSSH key into a PEM format using PKCS #8 (prints to stdout)</p><div class="code"><div class="content"><div class="highlight"><pre>ssh-keygen -e -f public.pub -m PKCS8
</pre></div> </div> </div><p>This is easy to remember because <code>-e</code> stands for export. Note that you can also use <code>-m PEM</code> to convert the key into a PEM format that uses PKCS #1.</p><h3 id="pempkcs8-to-openssh-public-4e55">PEM/PKCS#8 to OpenSSH public</h3><p>If you need to use in SSH a key pair created with another system </p><div class="code"><div class="content"><div class="highlight"><pre>ssh-keygen -i -f public.pem -m PKCS8
</pre></div> </div> </div><p>This is easy to remember because <code>-i</code> stands for import. As happened when exporting the key, you can import a PEM/PKCS #1 key using <code>-m PEM</code>.</p><h2 id="reading-rsa-keys-in-python-1f46">Reading RSA keys in Python<a class="headerlink" href="#reading-rsa-keys-in-python-1f46" title="Permanent link">¶</a></h2><p>In Python you can use the package <code>pycrypto</code> to access a PEM file containing an RSA key with the function <code>RSA.importKey</code>. Now you can hopefully understand the <a href="https://www.dlitz.net/software/pycrypto/api/current/Crypto.PublicKey.RSA-module.html">documentation</a> that says</p><div class="code"><div class="content"><div class="highlight"><pre>externKey (string) - The RSA key to import, encoded as a string.
An RSA public key can be in any of the following formats:
* X.509 subjectPublicKeyInfo DER SEQUENCE (binary or PEM encoding)
* PKCS#1 RSAPublicKey DER SEQUENCE (binary or PEM encoding)
* OpenSSH (textual public key only)
An RSA private key can be in any of the following formats:
* PKCS#1 RSAPrivateKey DER SEQUENCE (binary or PEM encoding)
* PKCS#8 PrivateKeyInfo DER SEQUENCE (binary or PEM encoding)
* OpenSSH (textual public key only)
For details about the PEM encoding, see RFC1421/RFC1423.
In case of PEM encoding, the private key can be encrypted with DES or 3TDES
according to a certain pass phrase. Only OpenSSL-compatible pass phrases are
supported.
</pre></div> </div> </div><p>In practice what you can do with a file <code>private.pem</code> is</p><div class="code"><div class="content"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">Crypto.PublicKey</span> <span class="kn">import</span> <span class="n">RSA</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'private.pem'</span><span class="p">,</span> <span class="s1">'r'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="n">key</span> <span class="o">=</span> <span class="n">RSA</span><span class="o">.</span><span class="n">importKey</span><span class="p">(</span><span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">())</span>
</pre></div> </div> </div><p>and the variable <code>key</code> will contain an instance of <code>_RSAobj</code> (not a very pythonic name, to be honest). This instance contains the RSA parameters as attributes as stated in the <a href="https://www.dlitz.net/software/pycrypto/api/current/Crypto.PublicKey.RSA._RSAobj-class.html">documentation</a></p><div class="code"><div class="content"><div class="highlight"><pre><span class="n">modulus</span> <span class="o">=</span> <span class="n">key</span><span class="o">.</span><span class="n">n</span>
<span class="n">public_exponent</span> <span class="o">=</span> <span class="n">key</span><span class="o">.</span><span class="n">e</span>
<span class="n">private_exponent</span> <span class="o">=</span> <span class="n">key</span><span class="o">.</span><span class="n">d</span>
<span class="n">first_prime_number</span> <span class="o">=</span> <span class="n">key</span><span class="o">.</span><span class="n">p</span>
<span class="n">second_prime_number</span> <span class="o">=</span> <span class="n">key</span><span class="o">.</span><span class="n">q</span>
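<span class="n">crt_coefficient</span> <span class="o">=</span> <span class="n">key</span><span class="o">.</span><span class="n">u</span>  <span class="c1"># PyCrypto stores (inverse of p) mod q here, not the PKCS #1 coefficient (inverse of q) mod p</span>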
</pre></div> </div> </div>
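<p>To make the role of <code>exponent1</code>, <code>exponent2</code>, and <code>coefficient</code> more concrete, here is a minimal sketch in plain Python (no external package) that rebuilds them from two toy primes, following the comments in the <code>RSAPrivateKey</code> definition of PKCS #1, and checks that the Chinese remainder theorem shortcut decrypts to the same value as the plain exponentiation. The numbers and the function names <code>decrypt_plain</code> and <code>decrypt_crt</code> are my own choices for illustration, and the primes are far too small for any real use. The modular inverse computed with <code>pow</code> requires Python 3.8 or later.</p>

```python
# Toy RSA parameters: far too small to be secure, chosen only for readability.
p, q = 61, 53
e = 17

n = p * q                  # modulus
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)

# The three extra values stored in a PKCS #1 RSAPrivateKey
d_p = d % (p - 1)          # exponent1
d_q = d % (q - 1)          # exponent2
q_inv = pow(q, -1, p)      # coefficient, (inverse of q) mod p


def decrypt_plain(c):
    """Decrypt with a single exponentiation modulo n."""
    return pow(c, d, n)


def decrypt_crt(c):
    """Decrypt with two smaller exponentiations, recombined through the CRT."""
    m1 = pow(c, d_p, p)
    m2 = pow(c, d_q, q)
    h = (q_inv * (m1 - m2)) % p
    return m2 + h * q


message = 65
ciphertext = pow(message, e, n)

print(decrypt_plain(ciphertext), decrypt_crt(ciphertext))
```

<p>Both functions return the original message. With a real 2048-bit key the second one works with exponents and moduli of half the size, which is the reason why these apparently redundant values are stored in the private key.</p>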
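<p>The byte layout of the OpenSSH public key described earlier (a <code>string</code> with the algorithm name followed by two <code>mpint</code> values, <code>e</code> and <code>n</code>) can also be decoded with a few lines of Python. This is only a sketch under some assumptions: the helper names <code>read_string</code> and <code>parse_ssh_rsa</code> are hypothetical, and the example blob joins the real 18-byte prefix discussed in the post with a made-up 3-byte modulus, so that the code stays short and self-contained.</p>

```python
import base64
import struct


def read_string(blob, offset):
    """Read one RFC 4251 'string': a uint32 length followed by that many bytes."""
    (length,) = struct.unpack(">I", blob[offset:offset + 4])
    start = offset + 4
    return blob[start:start + length], start + length


def parse_ssh_rsa(blob):
    """Split the decoded KEY part of an ssh-rsa public key into (algorithm, e, n)."""
    algorithm, offset = read_string(blob, 0)
    e_bytes, offset = read_string(blob, offset)
    n_bytes, offset = read_string(blob, offset)
    # An mpint is a big-endian integer in two's complement format
    e = int.from_bytes(e_bytes, "big", signed=True)
    n = int.from_bytes(n_bytes, "big", signed=True)
    return algorithm.decode("ascii"), e, n


# Real prefix ("ssh-rsa" and e = 65537) plus a toy modulus with its 00 prefix byte
blob = (
    base64.b64decode("AAAAB3NzaC1yc2EAAAADAQAB")
    + struct.pack(">I", 3)
    + bytes.fromhex("00b2f5")
)

print(parse_ssh_rsa(blob))
```

<p>To try it on a real key, pass the Base64-decoded second field of the <code>.pub</code> file to <code>parse_ssh_rsa</code>. Note that without the leading <code>00</code> byte the toy modulus <code>b2 f5</code> would be read as a negative number, which is exactly the case discussed for the 257-byte modulus.</p>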
<h2 id="final-words-9803">Final words<a class="headerlink" href="#final-words-9803" title="Permanent link">¶</a></h2><p>I keep finding on StackOverflow (and on other boards) messages from users who are confused by RSA keys, by the output of the various tools, and by the subtle but important differences between the formats, so I hope this post helped you to get a better understanding of the matter.</p><h2 id="resources-edc5">Resources<a class="headerlink" href="#resources-edc5" title="Permanent link">¶</a></h2><ul><li>The Wikipedia article on <a href="https://en.wikipedia.org/wiki/RSA_(cryptosystem)">RSA</a></li><li>OpenSSL documentation: <a href="https://www.openssl.org/docs/man1.1.1/man1/asn1parse.html">asn1parse</a>, <a href="https://www.openssl.org/docs/man1.1.1/man1/rsa.html">rsa</a>, <a href="https://www.openssl.org/docs/man1.1.1/man1/genpkey.html">genpkey</a></li><li>The <a href="https://en.wikipedia.org/wiki/Base64">Base64</a> encoding</li><li>The Abstract Syntax Notation One <a href="https://en.wikipedia.org/wiki/Abstract_Syntax_Notation_One">ASN.1</a> interface description language</li><li><a href="https://tools.ietf.org/html/rfc4251">RFC 4251 - The Secure Shell (SSH) Protocol Architecture</a></li><li><a href="https://tools.ietf.org/html/rfc4253">RFC 4253 - The Secure Shell (SSH) Transport Layer Protocol</a></li><li><a href="https://tools.ietf.org/html/rfc4716">RFC 4716 - The Secure Shell (SSH) Public Key File Format</a></li><li><a href="https://tools.ietf.org/html/rfc5208">RFC 5208 - Public-Key Cryptography Standards (PKCS) #8: Private-Key Information Syntax Specification Version 1.2</a></li><li><a href="https://tools.ietf.org/html/rfc5958">RFC 5958 - Asymmetric Key Packages</a></li><li><a href="https://tools.ietf.org/html/rfc7468">RFC 7468 - Textual Encodings of PKIX, PKCS, and CMS Structures</a></li><li><a href="https://tools.ietf.org/html/rfc8017">RFC 8017 - PKCS #1: RSA Cryptography Specifications Version 2.2</a></li><li><a 
href="https://www.dlitz.net/software/pycrypto/">PyCrypto</a> - The Python Cryptography Toolkit</li></ul><h2 id="feedback-d845">Feedback<a class="headerlink" href="#feedback-d845" title="Permanent link">¶</a></h2><p>Feel free to reach me on <a href="https://twitter.com/thedigicat">Twitter</a> if you have questions. The <a href="https://github.com/TheDigitalCatOnline/blog_source/issues">GitHub issues</a> page is the best place to submit corrections.</p>Introduction to hashing2018-04-06T11:30:00+01:002023-11-18T11:00:00+01:00Leonardo Giordanitag:www.thedigitalcatonline.com,2018-04-06:/blog/2018/04/06/introduction-to-hashing/<p>What is hashing and why is it important? What are hash functions and how can we define a good one?</p><p>Have you ever used dictionaries or maps in your language of choice, or have you ever met a mysterious MD5 code while downloading a file from a server? Maybe you are a programmer, and using Git to manage your code you ended up dealing with strange numbers called SHA1, and surely you have heard at least a couple of times the term <em>cache</em>, which probably needed to be emptied in your browser.</p><p>What do all these concepts have in common?</p><p>In this post I want to introduce you to the concept of <em>hashing</em>, which is one of the basic topics a good programmer should know. Hashes are such an important topic in computer science that lacking knowledge in this field means being confused about a wide range of other subjects, like cryptography and security. Data structures, Bitcoin and blockchains, HTTPS, all these topics have hashing as one of their building blocks. As you can see, it is worth mastering the concept.</p><p>This will obviously be only a humble introduction to the subject matter, as the whole concept is too broad for a single post. 
You can start a serious study of this important part of computer science by reading the Wikipedia articles linked at the bottom of the page and by reading a good book (or taking a course) in either cryptography or data structures.</p><h2 id="a-practical-example-e9ca">A practical example<a class="headerlink" href="#a-practical-example-e9ca" title="Permanent link">¶</a></h2><p>Let me give you a concrete example of hashing before we analyse the matter in depth.</p><p>I want to download a big file from the Internet, for example a Linux distribution. I already downloaded it in the past, but I'm not sure if the version I have is the same as the one available for download now. The file has been renamed, so the original name has been lost. I could obviously just download it again, install it and manually check the version.</p><p>I wonder if there is a simpler solution, one that can possibly be automated. Downloading an ISO file from the Internet is nowadays very cheap, but manually checking the version isn't, at least in terms of time. I might also download it and compare all the files contained in both the new and the old ISO images, but then again, this process is not very fast.</p><p>The best solution would be for the server to provide a sort of label that <em>depends only on the data</em> in a mathematical and <em>deterministic</em> way. I might then run the same algorithm on the file I already downloaded and get a label that can be easily compared with the one provided by the server. If the process of computing the label is fast enough this might be the perfect solution.</p><p>A typical algorithm used for this purpose is MD5, and the label computed by the server could be something like <code>ef67d799b71de37423202c587662c87f</code>. 
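</p><p>A minimal Python sketch of that comparison follows. It computes the MD5 label of a local file with the standard <code>hashlib</code> module, reading the file in chunks, and compares it with the label published by the server. The function names <code>file_md5</code> and <code>matches_published_label</code> and the chunk size are assumptions chosen for this illustration, not part of any specific download tool.</p>

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as stream:
        # Read the file in fixed-size chunks so that a large ISO image
        # does not have to fit in memory all at once.
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_label(path, published_label):
    """Tell whether the local file has the label published by the server."""
    # Labels are usually shown as lowercase hexadecimal text; normalise
    # the published one before comparing (an assumption of this sketch).
    return file_md5(path) == published_label.strip().lower()
```

<p>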
Computing the MD5 of a 600 MB file takes less than a couple of seconds on a modern computer, so I can check in a very short time whether the file I own is the same one the server provides.</p><p>You can test MD5 on your own using the <code>md5sum</code> program that comes with most Linux distributions and many other Unix-like systems. Open a terminal and run the following command</p><div class="code"><div class="content"><div class="highlight"><pre><span class="nb">echo</span><span class="w"> </span><span class="s2">"This is a simple input string"</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>md5sum
</pre></div> </div> </div><p>and the result will be <code>8a7cc3b47880b5ef880ac6ef30785a1a</code>, independently of your operating system.</p><p>MD5 is one of many <strong>hash functions</strong> that have been invented to deal with problems like the one I described. Recently, I needed to synchronise two AWS S3 buckets daily, each containing more than 60 gigabytes of files. Without hash functions it would be impossible to quickly identify the files that need to be copied.</p><p>The rest of the post is dedicated to the exploration of such an important and intriguing part of contemporary technology.</p><h2 id="hash-functions-65d4">Hash functions<a class="headerlink" href="#hash-functions-65d4" title="Permanent link">¶</a></h2><p>Let's start from the formal definition of a hash function:</p><blockquote><p>A hash function is any function that can be used to map data of arbitrary size to data of fixed size</p><cite></cite></blockquote><p>This description may sound intimidating at first, but it is actually pretty simple. Let's consider a dictionary where you want to look up a word that you don't know, like for example "quagmire". What you do is jump directly to a section labelled "Q" in the dictionary, then quickly identify the part of the section where words that start with "QU" are, and promptly find the word. Congratulations, you just used a hash function! </p><div class="imageblock"><img src="/images/introduction_to_hashing/hash1.jpg"></div><p>Getting the first letter of the word is, as a matter of fact, <em>a function</em> (an operation) <em>that maps</em> (connects) <em>data of arbitrary size</em> (words) <em>to data of fixed size</em> (a single letter of the alphabet). Using this method we can connect any word (also invented ones!) to a letter of the alphabet.</p><p>Before we move on, I want to stress one aspect that is clear from the previous example. 
Through a hash function we can connect a set of potentially infinite values (all the words that we can create) with a finite set (the letters of an alphabet). This is the most important concept we have to keep in mind when dealing with hashing.</p><h2 id="uniqueness-70c6">Uniqueness<a class="headerlink" href="#uniqueness-70c6" title="Permanent link">¶</a></h2><p>The result of a hash function is not unique, which means that two different inputs may give the same output. This is pretty easy to understand in the dictionary example, where multiple words give the same letter as a result, as multiple words begin with that letter.</p><p>It is also evident that <strong>hash functions cannot produce unique results by design</strong>. The goal of a hash function is to map an infinite set to a finite set, so it is obvious that multiple elements of the infinite set will map to the same element in the finite one.</p><p>Let me give you a very simple example. Let's create a hash function that returns the first 32 bits (4 bytes) of the input, padding them with zeros if the input is shorter than 32 bits. I will use the ASCII standard to convert strings of characters into hexadecimal numbers, so every letter is represented by 1 byte.</p><div class="code"><div class="content"><div class="highlight"><pre>"This is a string"
|
+-----> 54 68 69 73 20 69 73 20 61 20 73 74 72 69 6e 67
|
+---> 54 68 69 73
"One"
|
+-----> 4f 6e 65
|
+---> 4f 6e 65 00
"The quick fox"
|
+-----> 54 68 65 20 71 75 69 63 6b 20 66 6f 78
|
+---> 54 68 65 20 <== COLLISION
"The lazy dog"
|
+-----> 54 68 65 20 6c 61 7a 79 20 64 6f 67
|
+---> 54 68 65 20 <== COLLISION
</pre></div> </div> </div><p>As you can see we have multiple input strings with different lengths, and while the first three produce different output values the last one produces the same value as the third one. This is straightforward, as the two strings start with the same four characters and our hash function considers only those to compute its result.</p><p>Such an event is called a <em>collision</em> and it is a direct effect of the non-uniqueness of hash values. <strong>It will always happen, with hash functions, that different values produce the same output</strong>, and it is important to understand that this is not because our hash function is trivial, but because it is in <em>the very nature of hash functions</em>, for strict mathematical reasons.</p><p>Collisions are not intrinsically bad, but we have to be aware they can happen when we develop algorithms that use hash functions. If we are writing a dictionary for a human language where 80% of the words start with "A" it is pointless to use the first letter to partition the book because the first section would be almost as big as the whole tome. This may seem too imaginative an example, but when we manage data structures problems such as this arise more often than not.</p><div class="imageblock"><img src="/images/introduction_to_hashing/hash2.jpg"></div><p>In this last example avoiding collisions is easy. We just need to increase the number of characters that we consider until there are no clashes on the right. This is a very empirical way to solve the problem, though, and it's possible only because we are dealing with a narrow set of inputs and a very simple hash function. 
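</p><p>The 32-bit example and the idea of considering more characters until the clashes disappear can be sketched in a few lines of Python. This is only an illustration: the names <code>prefix_hash</code>, <code>find_collisions</code> and <code>smallest_collision_free_size</code> are hypothetical, and the sketch assumes ASCII-only, distinct input strings.</p>

```python
def prefix_hash(text, size=4):
    """Return the first `size` bytes of `text`, padded with zero bytes."""
    data = text.encode("ascii")  # one byte per character, ASCII input only
    # Slicing keeps the first `size` bytes; ljust pads shorter inputs
    # on the right with zero bytes.
    return data[:size].ljust(size, b"\x00")


def find_collisions(inputs, size=4):
    """Group the inputs by hash and keep the groups with several members."""
    groups = {}
    for text in inputs:
        groups.setdefault(prefix_hash(text, size), []).append(text)
    return {key: texts for key, texts in groups.items() if len(texts) > 1}


def smallest_collision_free_size(inputs):
    """Increase the number of bytes considered until no two inputs clash."""
    longest = max(len(text) for text in inputs)
    for size in range(1, longest + 1):
        if not find_collisions(inputs, size):
            return size
    raise ValueError("inputs cannot be told apart by their first bytes")


strings = ["This is a string", "One", "The quick fox", "The lazy dog"]
print(prefix_hash("One").hex())               # 4f6e6500
print(find_collisions(strings))               # the two "The ..." strings
print(smallest_collision_free_size(strings))  # 5
```

<p>With the four strings used above, the two inputs that start with "The " collide when 4 bytes are considered, and 5 bytes are enough to tell all of them apart.</p><p>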
In the next section we will discuss how more complicated hash functions deal with this problem.</p><h2 id="digital-hash-functions-e95e">Digital hash functions<a class="headerlink" href="#digital-hash-functions-e95e" title="Permanent link">¶</a></h2><p>As we saw, the definition of hash functions involves functions, which are mappings. In other words, we just need to describe a process that couples objects from the infinite source set of inputs to the finite destination set of outputs. Taking the first letter of a word is such a process, but other examples may be grouping people according to the colour of their eyes or cataloguing films by production year. Among the various processes that we can use, a big role is played by digital processes, that is functions that involve some operation on binary numbers.</p><p>When we <em>digitalise</em> something we represent it with a sequence of bits, and once this is done there is no real difference between strings, images, videos, sounds, programs. Everything in a computer is ultimately a sequence of bits, and those sequences can be sliced and changed with pure numerical functions such as additions and multiplications.</p><h2 id="cryptographic-hash-functions-d043">Cryptographic hash functions<a class="headerlink" href="#cryptographic-hash-functions-d043" title="Permanent link">¶</a></h2><p>Hash functions play a decisive role in security and in cryptography, and can be found in algorithms that provide authentication, i.e. secure ways to demonstrate the authenticity of some data. While the actual cryptographic techniques are not in the scope of this article, it is important to know that hash functions used for cryptographic purposes are not different, in principle, from hash functions used for other tasks that do not require any degree of security. <strong>Cryptographic hash functions</strong>, however, must have some specific properties that give the function a certain degree of "robustness". 
Being able to find the input of a hash function given the output, for example, would be catastrophic for some security algorithms that rely on the infeasibility of such an operation.</p><h2 id="good-hash-functions-773b">"Good" hash functions<a class="headerlink" href="#good-hash-functions-773b" title="Permanent link">¶</a></h2><p>The definition of a hash function is pretty inclusive, as the only required property is that of returning a fixed-length output. Hash functions used in practice may however have other properties. Such properties may be desirable or mandatory depending on the application, so functions that are extremely good for cryptography may be a poor choice for data structures like dictionaries. </p><p>Let me briefly describe some of the most important properties that you should be aware of.</p><h3 id="determinism-633d">Determinism</h3><p>Given the algorithm (with its parameters) and given the input data, the hash <strong>must always be the same</strong>. The result of the hashing function depends only on the data itself, and not on other external factors like, for example, the time or the computer system.</p><p>Pay attention to the fact that this definition considers the algorithm and its parameters. This means that we can include external factors in the computation, but they have to be fixed for the whole life of the result itself.</p><div class="imageblock"><img src="/images/introduction_to_hashing/hash3.jpg"></div><p>Let's consider a system that uses a hash to speed up searches in some arrays. For several reasons the hashing algorithm employs an initial random seed that is derived from the boot time. As long as the system is running (i.e. it is not rebooted), the algorithm is consistent, and we may consider the random seed as a constant parameter. We may also persist the hashes to storage, because when we load them they are still perfectly valid. 
As soon as the system is rebooted, however, the whole set of hashes created during the previous execution becomes invalid and meaningless. This is not the case, though, if the hashing function bases its computation on the actual data only.</p><h3 id="diffusion-16fd">Diffusion</h3><p>Changing one single bit of the source data should result in a <strong>complete change</strong> of the hash value. Compare for example the MD5 hash values of two similar strings</p><div class="code"><div class="content"><div class="highlight"><pre>The quick brown fox jumps over the lazy dog
|
+--> 37c4b87edffc5d198ff5a185cee7ee09
The quick brown fox jumps over the lazy cog
|
+--> 15546a0bcace46fd5e12ec29adca5e70
</pre></div> </div> </div><p>As you can see, when a single input byte is different (the letter <code>d</code> in <code>dog</code> becomes a <code>c</code>), the whole result changes.</p><p>This implies that every part of the output is computed considering all the bits of the input. A function that returns the first <code>n</code> bits of the input does not have good diffusion, as two different strings may return exactly the same hash if they have the same first <code>n</code> bits (see the example given above when I spoke about uniqueness). This property is important for cryptographic hash functions.</p><h3 id="minimal-change-continuity-a7eb">Minimal change (continuity)</h3><p>An interesting property of some hash functions is that <strong>similar input values map to similar hash values</strong>. The exact definition of "similar" may vary, but in general we might associate it with the number of changes from the first output to the second. This behaviour is handy in some searching algorithms, where it is important that similar objects are stored near each other.</p><p>Note that this property is somewhat the opposite of diffusion, thus demonstrating that not all these properties might be found in a single hash function.</p><h3 id="uniformity-346e">Uniformity</h3><p>A hash function has a given finite number of possible outputs, because the output has a finite length. When a hash function is uniform, computing the output for each possible input yields a <strong>uniform distribution of outputs</strong>, that is there is no output value that is used more often than others. When designing data structures this is often the desirable behaviour, since it leads to a uniform use of resources, for example memory, leading to a uniform behaviour of other algorithms that work on the same structure, like search. 
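</p><p>As a rough illustration of uniformity, the following Python sketch counts how many words of a made-up input set end up in each of 8 buckets, first with a hash that looks only at the first character and then with one derived from the whole MD5 digest. The function names, the number of buckets, and the generated <code>sampleN</code> words are all assumptions made for this example.</p>

```python
import hashlib
from collections import Counter


def first_letter_bucket(word, buckets):
    """Poorly distributed: the bucket depends only on the first character."""
    return ord(word[0]) % buckets


def md5_bucket(word, buckets):
    """Bucket derived from the whole MD5 digest of the word."""
    digest = hashlib.md5(word.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % buckets


def bucket_counts(words, bucket_function, buckets):
    """Count how many words land in each bucket, including empty ones."""
    counts = Counter(bucket_function(word, buckets) for word in words)
    return [counts.get(index, 0) for index in range(buckets)]


# A made-up input set: 1000 words that all begin with the same letter.
words = ["sample{}".format(number) for number in range(1000)]
print(bucket_counts(words, first_letter_bucket, 8))
print(bucket_counts(words, md5_bucket, 8))
```

<p>With this input every word starts with the same letter, so the first function puts all 1000 words in a single bucket, while the MD5-based one spreads them far more evenly.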
</p><div class="imageblock"><img src="/images/introduction_to_hashing/hash4.jpg"></div><p>Uniformity is obviously linked to the number of collisions produced by a hash function, and a perfectly uniform hash function will have the same number of collisions for each output value. Increasing the number of possible output values, thus, results in a uniform reduction of collisions.</p><h3 id="non-invertible-5b1d">Non-invertible</h3><p>Inverting a function means creating a function that returns the original input given the output. For example, multiplication by 2 is an invertible function, as given the result we may easily divide by 2 and retrieve the original input.</p><p>With non-injective functions the only caveat is that there are multiple inputs that return the same output, but this doesn't prevent the creation of an inverse function. For example, 3 squared gives 9 and since the inverse of the square function is the square root, we can apply it to the result and retrieve the possible inputs, that is +3 and -3.</p><p>With non-invertible functions <strong>there is no simple way to find the input given the output</strong>. Mathematically we speak of <em>one-way functions</em>, as computing the inverse is either impossible or infeasible. Mind that "infeasible" has a well-defined meaning in mathematics, but I will not go deeper into it in this article. It will be sufficient to consider it as "too hard to compute in a reasonable time with the current state of technology". Cryptographic hash functions must be non-invertible.</p><div class="imageblock"><img src="/images/introduction_to_hashing/hash5.jpg"></div><h3 id="collisions-resistant-5cbc">Collision-resistant</h3><p>A hash function is said to be collision-resistant when <strong>it is hard to find two different inputs that produce the same hash value</strong>. Mind that the definition of "hard" here is the same as that of "infeasible" in the previous section. 
This property is very important in cryptography, where collisions can be exploited to crack a cipher.</p><h2 id="theoretical-and-practical-inputs-960d">Theoretical and practical inputs<a class="headerlink" href="#theoretical-and-practical-inputs-960d" title="Permanent link">¶</a></h2><p>It is important to understand that the analysis of a hash function can be made considering either theoretical or practical inputs. Theoretical inputs are all the possible inputs, like "all the possible strings", while a set of practical inputs might be "the names of a group of people". The latter might be very large but it is not infinite. </p><p>Obviously, a hash function that provides interesting properties when dealing with theoretical inputs will show the same properties when applied to practical inputs, but often such functions are complex and slow. Not to mention that it is very difficult to create them.</p><p>Let me show you an example. As we saw above, a hash function that returns the first letter of a string is not a very good one. It lacks the diffusion property, for instance, and its uniformity is questionable, as all the strings that begin with the same letter will produce the same hash, leading to a large number of collisions. This is bad for data structures, so such a function is in theory not optimal.</p><p>However, if we are working on a set of strings like</p><div class="code"><div class="content"><div class="highlight"><pre>A poor workman blames his tool
Barking dogs seldom bite
Common sense ain't common
Doctors make the worst patients
...
You can't teach an old dog new tricks
</pre></div> </div> </div><p>where it is known (or evident) that each string begins with a different letter, suddenly our hash function becomes a perfect choice to build a searchable data structure, because <strong>given this input set</strong> there are no collisions. So, an analysis of the practical inputs is always paramount when we consider hash functions, as theoretically poor functions may perform very well on specific sets of inputs.</p><p>A very good example of such an analysis can be found in the source code of the Python language. The implementation of dictionaries contains an in-depth discussion of the choices made when implementing the hashing mechanism behind those structures. You can find it <a href="https://github.com/python/cpython/blob/3.8/Objects/dictobject.c#L135">here</a>. If you have never approached data structures, however, I recommend starting from a simpler explanation, as you might be intimidated by that discussion. You will find a good basic tutorial on hash tables in any data structures course or textbook.</p><h2 id="final-words-9803">Final words<a class="headerlink" href="#final-words-9803" title="Permanent link">¶</a></h2><p>As I said, this is just a very quick and humble introduction to hashing. I think you cannot call yourself a programmer nowadays without knowing something about hashing, and what I summarized in this post is enough to start understanding uses of hashing such as Bitcoin or SSL. If you want to study the topic in depth, however, I recommend taking a course or reading a book on data structures.</p><h2 id="resources-edc5">Resources<a class="headerlink" href="#resources-edc5" title="Permanent link">¶</a></h2><ul><li><a href="https://en.wikipedia.org/wiki/Hash_function">Hash function</a> on Wikipedia</li><li><a href="https://en.wikipedia.org/wiki/Cryptographic_hash_function">Cryptographic hash function</a> on Wikipedia</li><li><a href="https://www.youtube.com/watch?v=tLkHk__-M6Q">A lesson on hash functions</a> by Prof. 
Christof Paar</li><li>MIT Professor Srinivas Devadas on <a href="https://www.youtube.com/watch?v=KqqOXndnvic">Cryptographic hash functions</a></li><li>Wiley & Sons publishes a book on <a href="https://www.amazon.co.uk/Structures-Algorithms-Python-Michael-Goodrich/dp/1118290275">Data Structures and Algorithms in Python</a></li><li>O'Reilly publishes a book on <a href="https://www.amazon.com/Mastering-Algorithms-Techniques-Sorting-Encryption/dp/1565924533/ref=sr_1_1?s=books&ie=UTF8&qid=1523009383&sr=1-1&keywords=Mastering+Algorithms+with+C">Mastering Algorithms with C: Useful Techniques from Sorting to Encryption</a></li></ul><h2 id="updates-0083">Updates<a class="headerlink" href="#updates-0083" title="Permanent link">¶</a></h2><p>2018-04-28 <a href="https://www.reddit.com/user/gixslayer">gixslayer</a> and <a href="https://www.reddit.com/user/SevenGlass">SevenGlass</a> discussed on reddit the right command line for the <code>md5sum</code> example on Windows. See <a href="https://www.reddit.com/r/programming/comments/8fbepo/introduction_to_hashing/dy316go">the original comments</a>.</p><h2 id="feedback-d845">Feedback<a class="headerlink" href="#feedback-d845" title="Permanent link">¶</a></h2><p>Feel free to reach me on <a href="https://twitter.com/thedigicat">Twitter</a> if you have questions. The <a href="https://github.com/TheDigitalCatOnline/blog_source/issues">GitHub issues</a> page is the best place to submit corrections.</p>