{"id":10934,"date":"2023-11-03T11:06:29","date_gmt":"2023-11-03T10:06:29","guid":{"rendered":"https:\/\/monodes.com\/predaelli\/?p=10934"},"modified":"2023-11-03T11:06:29","modified_gmt":"2023-11-03T10:06:29","slug":"datomic-the-ten-rules-of-schema-growth","status":"publish","type":"post","link":"https:\/\/monodes.com\/predaelli\/2023\/11\/03\/datomic-the-ten-rules-of-schema-growth\/","title":{"rendered":"Datomic &#8211; The Ten Rules of Schema Growth"},"content":{"rendered":"<p><em><a href=\"https:\/\/blog.datomic.com\/2017\/01\/the-ten-rules-of-schema-growth.html\">Datomic &#8211; The Ten Rules of Schema Growth<\/a><\/em><\/p>\n<blockquote><p>Data outlives code, and a valuable database supports many applications over time. These ten rules will help grow your database schema without breaking your applications.<\/p><\/blockquote>\n<p><!--more--><!--nextpage--><\/p>\n<blockquote><p>&nbsp;<\/p>\n<p>Data outlives code, and a valuable database supports many applications over time. These ten rules will help grow your database schema without breaking your applications.<\/p>\n<div class=\"separator\"><a href=\"https:\/\/i0.wp.com\/blog.datomic.com\/images\/blog\/-53l7suMbyss\/WDw4_QYOk5I\/AAAAAAAAANQ\/MmACZ8DAbKsE8tE4BzI4I6Y81tt18aJtwCPcB\/s1600\/DatomicLogo_Square.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2023\/11\/DatomicLogo_Square.png?resize=200%2C199&#038;ssl=1\" width=\"200\" height=\"199\" border=\"0\" \/><\/a><\/div>\n<h3>1. \u00a0Prod is not like dev.<\/h3>\n<p>Production is not development. In production, one or more codebases depend on your data, and these ten rules below should be followed exactingly.<\/p>\n<p>A dev environment can be much more relaxed. \u00a0Alone on your development machine experimenting with a new feature, you have no users to break. \u00a0You can soften the rules, so long as you harden them when transitioning to production.<\/p>\n<h3>2. \u00a0Grow your schema, and never break it.<\/h3>\n<p>The lack of common vocabulary makes it all too easy to automate the wrong practices. I will use the terms <i>growth<\/i> and <i>breakage<\/i> as defined in Rich Hickey&#8217;s\u00a0<a href=\"https:\/\/www.youtube.com\/watch?v=oyLBGkS5ICk\" target=\"_blank\" rel=\"noopener\">Spec-ulation<\/a>\u00a0talk. \u00a0In schema terms:<\/p>\n<ul>\n<li><i>growth<\/i> is providing more schema<\/li>\n<li><i>breakage<\/i> is removing schema, or changing the meaning of existing schema.<\/li>\n<\/ul>\n<p>In contrast to these terms, many people use &#8220;<a href=\"http:\/\/edgeguides.rubyonrails.org\/active_record_migrations.html\" target=\"_blank\" rel=\"noopener\">migrations<\/a>&#8220;, &#8220;<a href=\"http:\/\/databaserefactoring.com\/\" target=\"_blank\" rel=\"noopener\">refactoring<\/a>&#8220;, or &#8220;<a href=\"https:\/\/martinfowler.com\/articles\/evodb.html\" target=\"_blank\" rel=\"noopener\">evolution<\/a>&#8220;. These usages tend to focus on repeatability, convenience, and the needs of new programs, ignoring the distinction between growth and breakage. The problem here is obvious: Breakage is bad, so we don&#8217;t want it to be more convenient!<\/p>\n<p>Using precise language underscores the costs of of breakage. Most migrations are easily categorized as growth or breakage by considering the rules below. \u00a0<i>Growth migrations<\/i> are suitable for production, and <i>breakage migrations<\/i> are, at best, a dev-only convenience. Keep them widely separate.<\/p>\n<h3>3. The database is the source of truth.<\/h3>\n<p>Schema growth needs to be reproducible from one environment to another. \u00a0Reproducibility supports the development and testing of new schema before putting it into production and also the reuse of schema in different databases. Schema growth also needs to be evident in the database itself, so that you can determine what the database has, what it needs, and when growth occurred.<\/p>\n<p>For both of these reasons, the database is the proper source of truth for schema growth. When the database is the source of truth, reproducability and auditability happen <i>for free<\/i>\u00a0via the ordinary<br \/>\nquery and transaction capabilities of the database. \u00a0(If your database is not up to the tasks of queries and transactions you have bigger problems beyond the scope of this article).<\/p>\n<p>Storing schema in a database is strictly more powerful than storing schema as text files in source control. The database is the actual home for schema, plus it provides validation, structure, query, transactions, and history. A source control system provides only history and is separate from the data itself.<\/p>\n<p>Note that this does <i>not<\/i> mean &#8220;never put schema information in source control&#8221;. Source control may be convenient for other reasons, e.g. it may be more readily accessible. You may redundantly store schema in source control, but remember that the database is definitive.<\/p>\n<h3>4. \u00a0Growing is adding.<\/h3>\n<p>As you acquire more information about your domain, grow your schema to match. You can grow a schema by adding new things, and <i>only<\/i> by adding new things, for example:<\/p>\n<ul>\n<li>adding new attributes to an existing &#8216;type&#8217;<\/li>\n<li>adding new types<\/li>\n<li>adding relationships between types<\/li>\n<\/ul>\n<div><\/div>\n<h3>5. \u00a0Never remove a name.<\/h3>\n<p>Removing a named schema component at any level is a breaking change for programs that depend on that name. Never remove a name.<\/p>\n<h3>6. \u00a0Never reuse a name.<\/h3>\n<p>The meaning of a name is established when the name is first introduced. Reusing that name to mean something substantially different breaks programs that depend on that meaning. This can be even<br \/>\nworse than removing the name, as the breakage may not be as immediately obvious.<\/p>\n<h3>7. \u00a0Use aliases.<\/h3>\n<p>If you are familiar with database refactoring patterns, the advice in Rules Five and Six may seem stark. After all, one purpose of refactoring is to adopt better names as we discover them. How can we<br \/>\ndo that if names can never be removed or changed in meaning?<\/p>\n<p>The simple solution is to use <a href=\"http:\/\/docs.datomic.com\/schema.html#renaming-an-identity\" target=\"_blank\" rel=\"noopener\">more than one alias<\/a> to refer to the same schema entity. Consider the following example:<\/p>\n<ul>\n<li>In iteration 1, users of your system are identified by their email with an attribute named <b>:user\/id<\/b>.<\/li>\n<li>In iteration 2, you discover that users sometimes have non-email identifiers for users and that you want to store a user&#8217;s email even when not using the email as an identifier. In short, you wish that <b>:user\/id<\/b>\u00a0was named <b>:user\/primary-email<\/b>.<\/li>\n<\/ul>\n<p>No problem! Just create <b>:user\/primary-email<\/b> as an alias for <b>:user\/id<\/b>. Older programs can continue to use <b>:user\/id<\/b>, and newer programs can use the now-preferred <b>:user\/primary-email<\/b>.<\/p>\n<h3>8. \u00a0Namespace all names.<\/h3>\n<p>Namespaces greatly reduce the cost of getting a name wrong, as the same local name can safely have different meanings in different namespaces. \u00a0Continuing the previous example, imagine that the local<br \/>\nname <b>id<\/b> is used to refer to a UUID in several namespaces, e.g. <b>:inventory\/id<\/b>, <b>:order\/id<\/b>, and so on. The fact that <b>:user\/id\u00a0<\/b>is <i>not<\/i> a UUID is inconsistent, and newer programs should not have to put up with this.<\/p>\n<p>Namespaces let you improve the situation without breaking existing programs. You can introduce <b>:user-v2\/id<\/b>, and new programs can ignore names in the\u00a0<b>user<\/b>\u00a0namespace. If you don&#8217;t like <b>v2<\/b>, you can also pick a more semantic name for the new namespace.<\/p>\n<h3>9. \u00a0Annotate your schema.<\/h3>\n<p>Databases are good at storing data about your schema. Adding annotations to your schema can help both human readers and make sense of how the schema grew over time. For example:<\/p>\n<ul>\n<li>you could annotate names that are not recommended for new programs with a <b>:schema\/deprecated<\/b> flag, or you could get fancier still with <b>:schema\/deprecated-at<\/b> or <b>:schema\/deprecated-because<\/b>. Note that such deprecated names are still <i>never<\/i> removed (Rule Five).<\/li>\n<li>you could provide <b>:schema\/see-also<\/b> or <b>:schema\/see-instead<\/b> pointers to more current conventions.<\/li>\n<\/ul>\n<p>In fact, all the database refactoring patterns that are typically implemented as breaking changes could be implemented non-destructively, with the refactoring details recorded as an annotation. For example, the breaking <a href=\"http:\/\/databaserefactoring.com\/SplitColumns.html\" target=\"_blank\" rel=\"noopener\">&#8220;split column&#8221; refactoring<\/a>\u00a0might instead be implemented as schema growth:<\/p>\n<ul>\n<li>add N new columns<\/li>\n<li>(optional) add a <b>:schema\/split-into<\/b> attribute on the original column whose value is the new columns, and possibly even the recipe for the split<\/li>\n<\/ul>\n<div><\/div>\n<h3>10. Plan for accretion.<\/h3>\n<p>If a system is going to grow at all, then programs must not bake in limiting presumptions. \u00a0For example: If a schema states that <b>:user\/id<\/b> is a string, then programs can rely on <b>:user\/id<\/b> being a string and not occasionally an integer or a boolean. \u00a0But a program cannot assume that a user entity will be limited to a the set of attributes previously seen, or that it understands the semantics of attributes that it has not seen before.<\/p>\n<h3>Are these rules specific to a particular database?<\/h3>\n<p>No. These rules apply to almost any SQL or NoSQL database. \u00a0The rules even apply to the so-called &#8220;schemaless&#8221; databases. \u00a0A better word for schemaless is &#8220;schema-implicit&#8221;, i.e. the schema is implicit in your data and the database has no reified awareness of it. \u00a0With an implicit schema, all the rules still apply, except that the database is impotent to help you (no Rule 3).<\/p>\n<h3>In Context<\/h3>\n<p>Many of the resources on migrations, refactoring, and database evolution emphasize repeatability and the needs of new programs, without making the top-level distinctions of growth vs. breakage and prod vs. dev. As a result, these resources encourage breaking the rules in this article.<\/p>\n<p>Happily, these resources can easily be recast in growth-only terms. \u00a0You <i>can<\/i> grow your schema without breaking your app. You <i>can<\/i> continuously deploy without continuously propagating breakage. \u00a0Here&#8217;s what it looks like <a href=\"http:\/\/docs.datomic.com\/best-practices.html#production-schema\" target=\"_blank\" rel=\"noopener\">in Datomic<\/a>.<\/p><\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p class=\"excerpt\">Datomic &#8211; The Ten Rules of Schema Growth Data outlives code, and a valuable database supports many applications over time. These ten rules will help grow your database schema without breaking your applications.<\/p>\n<p class=\"more-link-p\"><a class=\"more-link\" href=\"https:\/\/monodes.com\/predaelli\/2023\/11\/03\/datomic-the-ten-rules-of-schema-growth\/\">Read more &rarr;<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_feature_clip_id":0,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":4,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"","footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[72],"tags":[],"class_list":["post-10934","post","type-post","status-publish","format-standard","hentry","category-documentations"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p6daft-2Qm","jetpack-related-posts":[{"id":2209,"url":"https:\/\/monodes.com\/predaelli\/2017\/02\/28\/how-to-backup-sql-server-to-sql-file-stack-overflow\/","url_meta":{"origin":10934,"position":0},"title":"How to backup Sql Server to sql file? &#8211; Stack Overflow","author":"Paolo Redaelli","date":"2017-02-28","format":"link","excerpt":"In \"Back UP\" I only get a bak file, but I would like to create .sql file Sorgente: How to backup Sql Server to sql file? - Stack Overflow I may need it pretty soon up vote 25 down vote Use SQL Server's Generate Scripts commend right click on the\u2026","rel":"","context":"In &quot;Senza categoria&quot;","block_context":{"text":"Senza categoria","link":"https:\/\/monodes.com\/predaelli\/category\/senza-categoria\/"},"img":{"alt_text":"apple-touch-icon@2","src":"https:\/\/i0.wp.com\/monodes.com\/predaelli\/wp-content\/uploads\/sites\/4\/2017\/02\/apple-touch-icon%402-150x150-1.png?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":6316,"url":"https:\/\/monodes.com\/predaelli\/2019\/12\/15\/the-7-habits-of-highly-ineffective-programmers-young-coder-medium\/","url_meta":{"origin":10934,"position":1},"title":"The 7 Habits of Highly Ineffective Programmers &#8211; Young Coder &#8211; Medium","author":"Paolo Redaelli","date":"2019-12-15","format":"link","excerpt":"Are you committing crimes against code? From: The 7 Habits of Highly Ineffective Programmers - Young Coder - Medium The 7 Habits of Highly Ineffective Programmers Are you committing crimes against code? Matthew MacDonald Dec 5 \u00b7 4 min read Adapted from Pixabay It\u2019s always good to refactor old code,\u2026","rel":"","context":"In &quot;Senza categoria&quot;","block_context":{"text":"Senza categoria","link":"https:\/\/monodes.com\/predaelli\/category\/senza-categoria\/"},"img":{"alt_text":"Matthew MacDonald","src":"https:\/\/i0.wp.com\/miro.medium.com\/fit\/c\/58\/58\/1%2AsSR_cfRyUkBoM7L1pUYoYA.jpeg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":4908,"url":"https:\/\/monodes.com\/predaelli\/2018\/11\/13\/open-source-web-cloud-microsoft-office-access-alternatives-alternativeto-net\/","url_meta":{"origin":10934,"position":2},"title":"Open Source Web \/ Cloud Microsoft Office Access Alternatives &#8211; AlternativeTo.net","author":"Paolo Redaelli","date":"2018-11-13","format":false,"excerpt":"Open Source Web \/ Cloud Microsoft Office Access Alternatives - AlternativeTo.net VFront VFront is a dynamic front-end for MySQL or PostgreSQL databases written in PHP and Javascript. With VFront you can create forms for manage your data in AJAX style... nuBuilder Forte Low-code, open-source RAD development tool for creating web-based\u2026","rel":"","context":"In &quot;Software Libero&quot;","block_context":{"text":"Software Libero","link":"https:\/\/monodes.com\/predaelli\/category\/software\/software-libero\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":10940,"url":"https:\/\/monodes.com\/predaelli\/2023\/11\/03\/versioning-data-in-postgres-testing-a-git-like-approach-specfy\/","url_meta":{"origin":10934,"position":3},"title":"Versioning data in Postgres? Testing a git like approach &#8211; Specfy","author":"Paolo Redaelli","date":"2023-11-03","format":false,"excerpt":"Versioning data in Postgres? Testing a git like approach - Specfy is fashinating but I think that most of the time these two proposed alternatives fit most of the needs: In-Table versioning, the Wordpress way of doing thing. Add a a column version (or modify date) and SELECT the maximum\u2026","rel":"","context":"In &quot;Tricks&quot;","block_context":{"text":"Tricks","link":"https:\/\/monodes.com\/predaelli\/category\/documentations\/tricks\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":3522,"url":"https:\/\/monodes.com\/predaelli\/2017\/11\/22\/sql-server-why-should-an-application-not-use-the-sa-account-database-administrators-stack-exchange\/","url_meta":{"origin":10934,"position":4},"title":"sql server &#8211; Why should an application not use the sa account &#8211; Database Administrators Stack Exchange","author":"Paolo Redaelli","date":"2017-11-22","format":"status","excerpt":"sql server - Why should an application not use the sa account - Database Administrators Stack Exchange I understand that the sa account enables complete control over a SQL Server and all the databases, users, permissions etc. I have an absolute belief that applications should not use the sa password\u2026","rel":"","context":"In &quot;Proprietary software&quot;","block_context":{"text":"Proprietary software","link":"https:\/\/monodes.com\/predaelli\/category\/software\/proprietary-software\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":9759,"url":"https:\/\/monodes.com\/predaelli\/2022\/10\/24\/6-popular-postgresql-cli-libraries\/","url_meta":{"origin":10934,"position":5},"title":"6 Popular PostgreSQL CLI Libraries","author":"Paolo Redaelli","date":"2022-10-24","format":false,"excerpt":"6 Popular PostgreSQL CLI Libraries pgcli - Postgres CLI with autocompletion and syntax highlighting pgsh - Branch your PostgreSQL Database like Git psql - The built-in PostgreSQL CLI client psql2csv - Run a query in psql and output the result as CSV nancy - The Nancy CLI is a unified\u2026","rel":"","context":"In &quot;Documentations&quot;","block_context":{"text":"Documentations","link":"https:\/\/monodes.com\/predaelli\/category\/documentations\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/posts\/10934","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/comments?post=10934"}],"version-history":[{"count":0,"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/posts\/10934\/revisions"}],"wp:attachment":[{"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/media?parent=10934"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/categories?post=10934"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/monodes.com\/predaelli\/wp-json\/wp\/v2\/tags?post=10934"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}