Просмотр исходного кода

Proposal for an explicit GitHub workflow. (#29)

* Proposal for an explicit GitHub workflow.

This suggests a GitHub workflow that uses pull requests, produces linear
history, and both incentivizes and encourages small, incremental changes
(both at the pull request and commit granularity). It tries to follow
general best practices around software engineering at scale and GitHub
workflows. It also tries to ensure the workflow is very well supported
by tooling and automation built into GitHub.

Of note, this proposal should match precisely the current enforced flow
on our GitHub repositories. But we need to actually decide we like this,
write up the rationale behind it, and document what we're doing.

I've added an abbreviated version of the proposal as a documentation
update to the contributing file. Happy to restructure or find a better
home for this. I've tried to focus on the parts that contributors
actually would need to care about as opposed to the things that are
simply and fully enforced mechanically.

My hope after this is to suggest more detailed code review guidelines.

* Addressing review comments.

Notably, I really was giving too much weight to multi-commit PRs which
shouldn't be the common or default. I've tried to restructure everything
to make it much more clear what is going on here.

* Minor tweaks

* Fix typo in CONTRIBUTING.md

Co-authored-by: josh11b <josh11b@users.noreply.github.com>

* Extract workflow description and remove redundancies.

* remove tracking issue template field

* Update proposals/p0029.md with reviewer suggsetion.

Co-authored-by: Dmitri Gribenko <gribozavr@gmail.com>

* Continue to address review feedback.

* Apply suggestions from code review

Co-authored-by: Jon Meow <46229924+jonmeow@users.noreply.github.com>
Co-authored-by: austern <austern@google.com>

* Incorporate code review feedback and begin moving toward "trunk" based terminology

* Tweak the wording and make it a bit more consistent.

* Update docs/project/pull_request_workflow.md

Co-authored-by: Dmitri Gribenko <gribozavr@gmail.com>

* Address more review comments.

* Correct the rationale.

* Tweak wording based on discussion in review.

* Improve the rationale around the default branch to avoid overstating or
misstatig things.

* Apply suggestions from code review

Co-authored-by: Jon Meow <46229924+jonmeow@users.noreply.github.com>

* Clean up formatting and address a couple of comments on grammar from review.

* Update docs/project/pull_request_workflow.md

Co-authored-by: austern <austern@google.com>

* Add to proposal list.

Co-authored-by: josh11b <josh11b@users.noreply.github.com>
Co-authored-by: Dmitri Gribenko <gribozavr@gmail.com>
Co-authored-by: Jon Meow <46229924+jonmeow@users.noreply.github.com>
Co-authored-by: austern <austern@google.com>
Chandler Carruth 5 лет назад
Родитель
Сommit
c5ddb57dd7
4 измененных файлов с 281 добавлено и 0 удалено
  1. 7 0
      CONTRIBUTING.md
  2. 116 0
      docs/project/pull_request_workflow.md
  3. 1 0
      proposals/README.md
  4. 157 0
      proposals/p0029.md

+ 7 - 0
CONTRIBUTING.md

@@ -245,6 +245,13 @@ Every file type uses a variation on the same license text ("Apache-2.0 WITH
 LLVM-exception") with similar formatting. If you're not sure what text to use,
 please ask on Discourse Forums.
 
+## Workflow
+
+Carbon repositories all follow a common
+[pull-request workflow](docs/project/pull_request_workflow.md) for landing
+changes. It is a trunk-based development model that emphasizes small,
+incremental changes and preserves a simple linear history.
+
 ## Acknowledgements
 
 Carbon's Contributing guidelines are based on

+ 116 - 0
docs/project/pull_request_workflow.md

@@ -0,0 +1,116 @@
+# Trunk-based pull-request GitHub workflow
+
+<!--
+Part of the Carbon Language project, under the Apache License v2.0 with LLVM
+Exceptions. See /LICENSE for license information.
+SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+-->
+
+Carbon repositories follow a few basic principles:
+
+- Development directly on the `trunk` branch and
+  [revert to green](#green-tests).
+- Always use pull requests, rather than pushing directly.
+- Changes should be small, incremental, and review-optimized.
+- Preserve linear history by
+  [rebasing](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-request-merges#rebase-and-merge-your-pull-request-commits)
+  or
+  [squashing](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-request-merges#squash-and-merge-your-pull-request-commits)
+  pull requests rather than using unsquashed merge commits.
+
+These principles try to optimize for several different uses or activities with
+version control:
+
+- Continuous integration and bisection to identify failures and revert to green.
+- Code review both at the time of commit and follow-up review after commit.
+- Understanding how things evolve over time, which can manifest in different
+  ways:
+  - When were things introduced?
+  - How does the main branch and project evolve over time?
+  - How was a bug or surprising thing introduced?
+
+Note that this isn't a complete guide to doing code reviews, and just focuses on
+the mechanical workflow and branch management. TODO: Add an explicit link to
+more detailed guidance on managing pull request based code reviews when it is
+developed.
+
+## Trunk based development
+
+We work in a simple
+[trunk-based development](https://trunkbaseddevelopment.com/) model. This means
+all development activity takes place on a single common `trunk` branch in the
+repository (our default branch). We focus on
+[small, incremental changes](#small_incremental_changes) rather than feature
+branches or the "scaled" variations of this workflow.
+
+### Green tests
+
+The `trunk` branch should always stay "green". That means that if tests fail or
+if we discover bugs or errors, we revert to a "green" state by default, where
+the failure or bug is no longer present. Fixing forward is fine if that will be
+comparably fast and efficient. The goal isn't to dogmatically avoid fixing
+forward, but to prioritize getting back to green quickly. We hope to eventually
+tool this through automatic continuous-integration powered submit queues, but
+even those can fail and the principle remains.
+
+## Always use pull requests (with review) rather than pushing directly
+
+We want to ensure that changes to Carbon are always reviewed, and the simplest
+way to do this is to consistently follow a pull request workflow. Even if the
+change seems trivial, still go through a pull request -- it'll likely be trivial
+to review. Always wait for someone else to review your pull request rather than
+just merging it, even if you have permission to do so.
+
+Our GitHub repos are configured to require pull requests and review before they
+are merged, so this rule is enforced automatically.
+
+## Small, incremental changes
+
+Developing in small, incremental changes improves code review time, continuous
+integration, and bisection. This means we typically squash pull requests into a
+single commit when landing. We use two fundamental guides for deciding how to
+split up pull requests:
+
+1. Ensure that each pull request builds and passes any tests cleanly when you
+   request review and when it lands. This will ensure bisection and continuous
+   integration can effectively process them.
+
+2. Without violating the first point, try to get each pull request to be "just
+   right": not too big, not too small. You don't want to separate a pattern of
+   tightly related changes into separate requests when they're easier to review
+   as a set or batch, and you don't want to bundle unrelated changes together.
+   Typically you should try to keep the pull request as small as you can without
+   breaking apart tightly coupled changes. However, listen to your code reviewer
+   if they ask to split things up or combine them.
+
+While the default is to squash pull requests into a single commit, _during_ the
+review you typically want to leave the development history undisturbed until the
+end so that comments on any particular increment aren't lost. We typically use
+the GitHub squash-and-merge functionality to land things.
+
+### Managing pull requests with multiple commits
+
+Sometimes, it will make sense to _land_ a series of separate commits for a
+single pull request through rebasing. This can happen when there is important
+overarching context that should feed into the review, but the changes can be
+usefully decomposed when landing them. When following this model, each commit
+you intend to end up on the `trunk` branch needs to follow the same fundamental
+rules as the pull request above: they should each build and pass tests when
+landed in order, and they should have well written, cohesive commit messages.
+
+Prior to landing the pull request, you are expected to rebase it to produce this
+final commit sequence, either interactively or not. This kind of rebase rewrites
+the history in Git, which can make it hard to track the resolution of code
+review comments. Typically, only do this as a cleanup step when the review has
+finished, or when it won't otherwise disrupt code review. It is healthy and
+expected to add "addressing review comments" commits during the review and then
+squashing them away before the pull request is merged.
+
+## Linear history
+
+We want the history of the `trunk` branch of each repository to be as simple and
+easy to understand as possible. While Git has strong support for managing
+complex history and merge patterns, we find understanding and reasoning about
+the history -- especially for humans -- to be at least somewhat simplified by
+sticking to a linear progression. As a consequence, we either squash pull
+requests or rebase them when merging them.

+ 1 - 0
proposals/README.md

@@ -23,6 +23,7 @@ request:
 <!-- proposals -->
 <!-- This list is updated by src/scripts/pre-commit-proposal-list.py. -->
 
+- [0029 - Linear, rebase, and pull-request GitHub workflow](p0029.md)
 - [0044 - Proposal tracking](p0044.md)
   - [Decision](p0044-decision.md)
 - [0074 - Change comment/decision timelines in proposal process](p0074.md)

+ 157 - 0
proposals/p0029.md

@@ -0,0 +1,157 @@
+# Linear, rebase, and pull-request GitHub workflow
+
+<!--
+Part of the Carbon Language project, under the Apache License v2.0 with LLVM
+Exceptions. See /LICENSE for license information.
+SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+-->
+
+[Pull request](https://github.com/carbon-language/carbon-lang/pull/29)
+
+## Problem
+
+There are a variety of workflows that can be effectively used with both Git
+version control and GitHub. We should have a common and consistent workflow
+across as much of Carbon as possible. While some specific areas may end up
+needing specialized flows to suit their needs, these should be the exceptions
+and can be handled on a case-by-case basis when they arise.
+
+## Background
+
+- Chapter 16 "Version Control and Branch Management" in the SWE book
+  (_[Software Engineering at Google](https://www.amazon.com/Software-Engineering-Google-Lessons-Programming/dp/1492082791)_)
+- [Trunk Based Development](https://trunkbaseddevelopment.com/)
+- GitHub documentation on
+  "[pull requests](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)"
+  - [Configuration of their merges](https://help.github.com/en/github/administering-a-repository/configuring-pull-request-merges)
+  - [Protected branches](https://help.github.com/en/github/administering-a-repository/about-protected-branches)
+
+## Proposal
+
+Carbon repositories should all follow a common
+[workflow](/docs/project/pull_request_workflow.md) for managing and landing
+changes. The document outlines the specifics of the proposed workflow as well as
+the core motivation.
+
+### Rename the default git branch to `trunk`
+
+This achieves two goals:
+
+1. Replaces the term `master`. This term, while only used in isolation in Git,
+   [was used](https://mail.gnome.org/archives/desktop-devel-list/2019-May/msg00066.html)
+   in immediately preceding and related systems as part of extremely problematic
+   "master/slave" terminology. That background associates the term with
+   unacceptable historical and cultural meanings. The intent of those using or
+   adopting the term isn't relevant to this association. The less overtly
+   problematic term being isolated from the rest doesn't erase its history, and
+   doesn't completely avoid painful associations.
+
+2. It directly anchors and reinforces contributors on the trunk-based workflow.
+
+### Longer discussion of linear history
+
+While Git can effectively bisect through complex history graphs, this is
+significantly harder than bisecting across linear history. Especially for any
+part of Carbon involving code, we expect bisection to be a pervasive tool and we
+can make it simpler and more effective by forcing a linear history.
+
+A linear history also makes it much easier to ask questions about whether a
+particular change has landed yet, or when a bug was introduced. For some people,
+releases are more simply understood as branching from a specific snapshot of the
+linear history. While tools like `git log` can provide similar functionality, it
+is less trivially understood.
+
+Continuous integration is simplified for many of the same reasons as bisection:
+the set of potential deltas is reduced to a linear sequence. Reverting to green
+becomes easier to understand, and testing each incremental commit has a single
+obvious interpretation.
+
+Requiring linear history also incentivises incremental development that is
+committed early to the main branch. This, in essence, ensures a single source of
+truth model even with the distributed version control system. Because
+works-in-progress are required to be rebased, they tend to merge early and often
+rather than forming long-lived development branches. This helps reduce the total
+merge resolution and testing costs as a project scales. For more details about
+the advantages of using a single source of truth, see the full text of the
+"Version Control and Branch Management" chapter in the SWE book.
+
+One concern with linear history when rebasing a sequence of changes and merging
+them is that the pull request associated with that sequence might not be obvious
+from the main branch commit log. However, there is enough information in the
+repository to establish the relationship, and GitHub's UI surfaces the pull
+request on each commit in the series.
+
+## Alternatives considered
+
+### LLVM model
+
+LLVM allows directly pushing/submitting to their "trunk" with post-commit
+review. LLVM enforces linear history for day-to-day development. Merge commits
+are allowed for rare events like contributing a new subproject.
+
+Pros:
+
+- Still has linear history.
+- Incentivizes squashing for continuous integration and bisection.
+- Very low overhead for fixing trivial mistakes.
+
+Cons:
+
+- Creates extremely bad incentives around code review.
+  - Lots of patches don't get pre-commit review, even if they would benefit from
+    it.
+  - Very experienced contributors are much better at avoiding pre-commit review,
+    so are rarely blocked waiting on review.
+  - Leads to the most experienced members of the community not doing enough code
+    reviews, or being timely enough in code reviews.
+  - Lots of patches submitted with post-commit review are never reviewed in
+    practice unless they break something.
+- UI and basic support for code reviews entirely focused on pull requests.
+
+### Fork and merge model
+
+Classically, Git and GitHub support merging entire complex graphs of
+development.
+
+Pros:
+
+- Mostly supported by pull requests, so still able to use much of that
+  functionality.
+- Supports a model in which contributors do not communicate and can each develop
+  a local, decentralized fork while still achieving overall reconciliation.
+- Can model much more complex history of code evolution faithfully in the
+  tooling.
+  - Most of the time these aren't so complex that they create problems for
+    humans.
+
+Cons:
+
+- History is harder for humans to understand and reason about in complex cases.
+- Bisection and continuous integration are more complex.
+  - May create difficulty for continuous integration against mainline, because
+    unclear what "order" they should be applied / explored. While there are
+    technical approaches to address this, the don't seem to eliminate the
+    complexity, merely provide a clear set of mechanics for handling it.
+- Reduces incentives to land small, incremental changes by allowing
+  non-linearity to reduce the effort required for large and/or complex merges.
+- Makes review of the main branch's history harder due to non-linearity.
+
+### Fork and merge, but branches can only have linear history
+
+Imagine a fork and merge model, but PRs can only have linear history. That is,
+branching and merging within a PR, or merging `trunk` into PR is not allowed. In
+this model, the only merge commits are merges of PRs into `trunk`. PRs
+themselves don’t contain merge commits.
+
+Pros:
+
+- Mostly supported by GitHub pull requests, so still able to use much of that
+  functionality.
+- Restricts non-linearity of history. The only non-linearity that is left is
+  merge commits on the trunk branch. PRs themselves can’t contain merge commits.
+
+Cons:
+
+- Requires a custom presubmit on GitHub that checks linearity of a PR.
+- The cons of the fork and merge strategy regarding the complexity of history
+  remain, but to a smaller degree, since non-linearity is restricted.