Bläddra i källkod

And, or, not (#680)

Proposal: use `and`, `or`, and `not` keywords in place of `&&`, `||`, `!`.

Co-authored-by: Chandler Carruth <chandlerc@gmail.com>
Richard Smith 4 år sedan
förälder
incheckning
86ae4c6053
2 ändrade filer med 359 tillägg och 0 borttagningar
  1. 1 0
      proposals/README.md
  2. 358 0
      proposals/p0680.md

+ 1 - 0
proposals/README.md

@@ -63,5 +63,6 @@ request:
 -   [0618 - var ordering](p0618.md)
 -   [0623 - Require braces](p0623.md)
 -   [0646 - Low context-sensitivity principle](p0646.md)
+-   [0680 - And, or, not](p0680.md)
 
 <!-- endproposals -->

+ 358 - 0
proposals/p0680.md

@@ -0,0 +1,358 @@
+# And, or, not
+
+<!--
+Part of the Carbon Language project, under the Apache License v2.0 with LLVM
+Exceptions. See /LICENSE for license information.
+SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+-->
+
+[Pull request](https://github.com/carbon-language/carbon-lang/pull/680)
+
+<!-- toc -->
+
+## Table of contents
+
+-   [Problem](#problem)
+-   [Background](#background)
+    -   [Usage in existing languages](#usage-in-existing-languages)
+    -   [Known issues with punctuation operators](#known-issues-with-punctuation-operators)
+-   [Proposal](#proposal)
+-   [Details](#details)
+    -   [Precedence](#precedence)
+    -   [Associativity](#associativity)
+    -   [Conversions](#conversions)
+    -   [Overloading](#overloading)
+-   [Rationale based on Carbon's goals](#rationale-based-on-carbons-goals)
+-   [Alternatives considered](#alternatives-considered)
+    -   [Use punctuation spelling for all three operators](#use-punctuation-spelling-for-all-three-operators)
+    -   [Precedence of AND versus OR](#precedence-of-and-versus-or)
+    -   [Precedence of NOT](#precedence-of-not)
+    -   [Punctuation form of NOT](#punctuation-form-of-not)
+    -   [Two forms of NOT](#two-forms-of-not)
+    -   [Repeated NOT](#repeated-not)
+    -   [AND and OR produce the decisive value](#and-and-or-produce-the-decisive-value)
+
+<!-- tocstop -->
+
+## Problem
+
+Logical AND, OR, and NOT are important building blocks for working with Boolean
+values. Carbon should support them.
+
+## Background
+
+### Usage in existing languages
+
+Most mainstream programming languages use one (or both) of two syntaxes for
+these operators:
+
+-   `&&`, `||`, and `!`. The `&&` and `||` operators were
+    [first used by C](https://www.bell-labs.com/usr/dmr/www/chist.html) to
+    distinguish short-circuiting behavior from that of regular operators. They
+    are now seen in C, C++, C#, Swift, Rust, Haskell, and many more languages.
+    `&&` and `||` invariably short-circuit.
+-   `and`, `or`, and `not`. These are often used by languages with more emphasis
+    on readability and scripting, such as Python, Pascal, Nim, SQL, and various
+    variants of BASIC. In Python, the `and` and `or` operators short-circuit; in
+    Pascal and Visual Basic, they do not, but variants of them do:
+
+    -   `and then` and `or else` in Pascal short-circuit.
+    -   `AndAlso` and `OrElse` in Visual Basic short-circuit.
+
+    C++ recognizes `and`, `or`, and `not` as keywords and treats them as lexical
+    synonyms for `&&`, `||`, and `!`. C provides an `<iso646.h>` standard header
+    that exposes `and`, `or`, and `not` as macros expanding to `&&`, `||`, and
+    `!`.
+
+Perl, Ruby, and Raku permit both syntaxes, giving the punctuation forms higher
+precedence and the keyword forms lower precedence. Both forms short-circuit.
+Raku also provides `andthen` and `orelse`, but these differ from `and` and `or`
+in what value is produced on short-ciruiting, not in whether they short-ciruit.
+
+Zig provides `and`, `or`, and `!`.
+
+### Known issues with punctuation operators
+
+The use of `&&` and `||` for short-circuiting logical operators is a common
+source of error due to the potential for confusion with the bitwise `&` and `|`
+operators. See:
+
+-   [CERT rule EXP46-C: Do not use a bitwise operator with a Boolean-like operand](https://wiki.sei.cmu.edu/confluence/display/c/EXP46-C.+Do+not+use+a+bitwise+operator+with+a+Boolean-like+operand)
+-   ChromiumOS bug prevents login due to `&&` / `&` typo
+    ([news coverage](https://www.theregister.com/2021/07/23/chromebork_bug_google/)).
+
+We have anecdotal evidence that the `!` operator is hard to see for some
+audiences in some contexts, particularly when adjacent to similar characters:
+parentheses, `I`, `l`, `1`.
+
+## Proposal
+
+Carbon will provide three operators to support logical operations on Boolean
+values.
+
+-   `and` provides a short-circuiting logical AND operation.
+-   `or` provides a short-circuiting logical OR operation.
+-   `not` provides a logical NOT operation.
+
+Note that these operators are valid alternative spellings in C++; PR
+[#682](https://github.com/carbon-language/carbon-lang/pull/682) provides a
+demonstration of what this proposal would look like if applied to the C++ code
+in the Carbon project.
+
+## Details
+
+`and` and `or` are infix binary operators. `not` is a prefix unary operator.
+
+### Precedence
+
+`and`, `or`, and `not` have very low precedence. When an expression appearing as
+the condition of an `if` uses these operators unparenthesized, they are always
+the lowest precedence operators in that expression.
+
+These operators permit any reasonable operator that might be used to form a
+boolean value as a subexpression. In particular, comparison operators such as
+`<` and `==` have higher precedence than all logical operators.
+
+A `not` operator can be used within `and` and `or`, but `and` cannot be used
+directly within `or` without parentheses, nor the other way around.
+
+For example:
+
+```
+if (n + m == 3 and not n < m) {
+```
+
+can be fully parenthesized as
+
+```
+if (((n + m) == 3) and (not (n < m))) {
+```
+
+and
+
+```
+if (cond1 == not cond2) {
+// ...
+if (cond1 and cond2 or cond3) {
+```
+
+are both errors, requiring parentheses.
+
+### Associativity
+
+`and` and `or` are left-associative. A `not` expression cannot be the operand of
+another `not` expression -- `not not b` is an error without parentheses.
+
+```
+// OK
+if (not a and not b and not c) { ... }
+if (not (not a)) { ... }
+
+// Error
+if (not a or not b and not c) { ... }
+if (not not a) { ... }
+```
+
+### Conversions
+
+The operand of `and`, `or`, or `not` is converted to a Boolean value in the same
+way as the condition of an `if` expression. In particular:
+
+-   If we decide that certain values, such as pointers or integers, should not
+    be usable as the condition of an `if` without an explicit comparison against
+    null or zero, then those values will also not be usable as the operand of
+    `and`, `or`, or `not` without an explicit comparison.
+-   If an extension point is provided to determine how to branch on the truth of
+    a value in an `if` (such as by supplying a conversion to a Boolean type),
+    that extension point will also apply to `and`, `or`, and `not`.
+
+### Overloading
+
+The logical operators `and`, `or`, and `not` are not overloadable. As noted
+above, any mechanism that allows types to customize how `if` treats them will
+also customize how `and`, `or`, and `not` treats them.
+
+## Rationale based on Carbon's goals
+
+-   _Code that is easy to read, understand, and write:_
+
+    -   The choice of `and` and `or` improves readability by avoiding the
+        potential for visual confusion between `&&` and `&`.
+    -   The choice of `not` improves readability by avoiding the use of a small
+        and easy-to-miss punctuation character for logical negation.
+    -   Having no precedence rule between `and` and `or` avoids readability
+        problems with expressions involving both operators, by requiring the use
+        of parentheses.
+    -   The use of a keyword rather than punctuation for these operators helps
+        emphasize that they are not regular operators but instead have
+        control-flow semantics.
+    -   Using the same precedence for `not` as for `and` and `or` allows
+        conditions to quickly be visually scanned and for the overall structure
+        and its nested conditions to be identified.
+
+-   _Interoperability with and migration from existing C++ code:_
+    -   The use of the operators `and`, `or`, and `not`, which are keywords with
+        the same meaning in C++, avoids problems with a Carbon keyword colliding
+        with a valid C++ identifier and allows early adoption of the Carbon
+        syntax in C++ codebases.
+    -   While these operators are overloadable in C++, `&&` and `||` are nearly
+        the least-overloaded operators, and `!` is generally only overloaded as
+        a roundabout way of converting to `bool`. There is no known need for
+        Carbon code to call overloaded C++ `operator&&`, `operator||`, or
+        `operator!`, nor for Carbon code to provide such operators to C++. We
+        will need a mechanism for `if`, `and`, `or`, and `not` to invoke
+        (possibly-`explicit`) `operator bool` defined in C++, but that's outside
+        the scope of this proposal.
+
+## Alternatives considered
+
+### Use punctuation spelling for all three operators
+
+We could follow the convention established by C and spell these operators as
+`&&`, `||`, and `!`.
+
+Advantage:
+
+-   This would improve familiarity for people comfortable with this operator
+    set.
+
+Disadvantages:
+
+-   The use of keywords rather than punctuation gives a hint that `and` and `or`
+    are not merely performing a computation -- they also affect control flow.
+-   The `!` operator is known to be hard to see for some readers.
+-   If we also support `&` and `|` operators, there is a known risk of confusion
+    between `&&` and `&` and between `||` and `|`.
+-   If we want to change the precedence rules, using a different syntax can
+    serve as a reminder that these operators do not behave like the `&&`, `||`,
+    and `!` operators.
+-   `&&`, `||`, and `!` are likely to be harder to type than `and`, `or`, and
+    `not` on at least most English keyboard layouts, due to requiring the use of
+    the Shift key and reaching outside the letter keys.
+
+### Precedence of AND versus OR
+
+Most languages give their AND operator higher precedence than their OR operator:
+
+```c++
+if (a && b || c && d) { ... }
+// ... means ...
+if ((a && b) || (c && d)) { ... }
+```
+
+This makes sense when viewed the right way: `&&` is the multiplication of
+Boolean algebra and `||` is the addition, and multiplication usually binds
+tighter than addition.
+
+We could do the same. However, this precedence rule is not reliably known by a
+significant fraction of developers despite having been present across many
+languages for decades, leading to common recommendations to enable compiler
+warnings for the first form in the above example, suggesting to rewrite it as
+the second form. This therefore fails the test from proposal
+[#555: when to add precedence edges](/proposals/p0555.md#when-to-add-precedence-edges).
+
+### Precedence of NOT
+
+We could give the `not` operator high precedence, mirroring the behavior in C++.
+In this proposal, the `not` operator has the same precedence rules as `and` and
+`or`, which may be inconvenient for some uses. For example:
+
+```
+var x: Bool = cond1 == not cond2;
+```
+
+is invalid in this proposal and requires parentheses, and
+
+```
+var y: Bool = not cond1 == cond2;
+```
+
+is equivalent to
+
+```
+var y: Bool = cond1 != cond2;
+```
+
+which may not be what the developer intended.
+
+However, giving `not` higher precedence would result in a more complex rule and
+would break the symmetry between `and`, `or`, and `not`.
+
+### Punctuation form of NOT
+
+We could use the spelling `!` for the NOT operator instead of `not`.
+
+The syntax of the `not` operator may be harder to read than a `!` operator for
+some uses. For example, when its operand is parenthesized, because it contains a
+nested `and` or `or`:
+
+```
+if (not (thing1 and thing2) or
+    not (thing3 and thing4)) {
+```
+
+Also, the spelling of the `!=` operator is harder to justify if there is no `!`
+operator.
+
+However:
+
+-   Changing the syntax would break the symmetry between `and`, `or`, and `not`.
+-   Python uses this set of keywords and has a `!=` operator, and that doesn't
+    appear to cause confusion in practice.
+-   `not` may be easier to read than `!` in some uses, and easier to pick out of
+    surrounding context.
+-   `!` is being used to indicate early substitution in generics, so avoiding
+    its use here may avoid overloading the same syntax for multiple purposes.
+-   A function-like `not (...)` may better conform to reader expectations than a
+    `!(...)` expression.
+
+### Two forms of NOT
+
+We could combine the two previous alternatives, and introduce both a
+low-precedence `not` operator and a high-precedence `!` operator.
+
+This would allow us to provide both a convenient `!` operator for use in
+subexpressions and a consistent `not` keyword that looks and behaves like `and`
+and `or`.
+
+However, including both operators would either lead to one of the forms being
+unused or to the extra developer burden of style rules suggesting when to use
+each. Also, it may be surprising to provide a `!` operator but not `&&` and `||`
+operators.
+
+### Repeated NOT
+
+We could permit the syntax `not not x` as an idiom for converting `x` to a
+Boolean type. However, we hope that a clearer syntax for that will be available,
+such as perhaps `x as Bool`. In the presence of such syntax, `not not x` would
+be an anti-pattern, and may indicate a bug due to an unintentionally repeated
+`not` operator. Per
+[#555: when to add precedence edges](/proposals/p0555.md#when-to-add-precedence-edges),
+when in doubt, we omit the precedence rule and wait for real-world experience.
+
+### AND and OR produce the decisive value
+
+We could make the AND and OR operator produce the (unconverted) value that
+determined the result, rather than producing a Boolean value. For example, given
+nullable pointers `p` and `q`, and assuming that we can test a pointer value for
+nonnullness with `if (p)`, we could treat `p or q` as producing `p` if `p` is
+nonnull and `q` otherwise.
+
+This is the approach taken by several scripting languages, notably Python, Perl,
+and Raku. It is also the approach taken by `std::conjunction` and
+`std::disjunction` in C++.
+
+Advantages:
+
+-   Provides a syntactic method to compute the first "truthy" or last "falsy"
+    value in a list.
+
+Disadvantages:
+
+-   The resulting code may be harder to read and understand, especially when the
+    value is passed onto a function call.
+-   Requires all conditions in a chained conditional to have a common type, or
+    the use of a fallback rule in the case where there is no common type.
+-   When combined with type deduction, an unexpected type will often be deduced;
+    for example, `var x: auto = p and q;` may deduce `x` as having pointer type
+    instead of Boolean type.