Numeric literals

Overview

The following syntaxes are supported:

  • Integer literals
    • 12345 (decimal)
    • 0x1FE (hexadecimal)
    • 0b1010 (binary)
  • Real number literals
    • 123.456 (digits on both sides of the .)
    • 123.456e789 (optional + or - after the e)
    • 0x1.2p123 (optional + or - after the p)
  • Digit separators (_) may be used, but only in conventional locations

Note that real number literals always contain a . with digits on both sides, and integer literals never contain one.

Literals are case-sensitive. Unlike in C++, literals do not have a suffix to indicate their type.

Details

Integer literals

Decimal integers are written as a non-zero decimal digit followed by zero or more additional decimal digits, or as a single 0.

Integers in other bases are written as a 0 followed by a base specifier character, followed by a sequence of digits in the corresponding base. The available base specifiers and corresponding bases are:

| Base specifier | Base | Digits           |
| -------------- | ---- | ---------------- |
| b              | 2    | 0 and 1          |
| x              | 16   | 0 ... 9, A ... F |

The above table is case-sensitive. For example, 0b1 and 0x1A are valid, and 0B1, 0X1A, and 0x1a are invalid.

A zero at the start of a literal can never be followed by another digit: either the literal is 0, the 0 begins a base specifier, or the next character is a decimal point (see below). No support is provided for octal literals, and any C or C++ octal literal (other than 0) is invalid in Carbon.
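
The integer rules above can be sketched as a small checker. This is an illustrative Python sketch, not part of any Carbon toolchain: the names `INTEGER_LITERAL`, `is_integer_literal`, and `integer_value` are hypothetical, and digit separators are deliberately left out.

```python
import re

# Hypothetical pattern transcribed from the rules above (no digit separators):
# a single 0, a decimal integer with a non-zero first digit, a hexadecimal
# integer with uppercase digits, or a binary integer.
INTEGER_LITERAL = re.compile(r"(?:0|[1-9][0-9]*|0x[0-9A-F]+|0b[01]+)")


def is_integer_literal(text: str) -> bool:
    """Return True if text is a separator-free integer literal."""
    return INTEGER_LITERAL.fullmatch(text) is not None


def integer_value(text: str) -> int:
    """Return the value of a valid separator-free integer literal."""
    if not is_integer_literal(text):
        raise ValueError(f"invalid integer literal: {text!r}")
    # Base 0 makes int() honor the 0x and 0b prefixes.
    return int(text, 0)
```

With this sketch, 0x1FE evaluates to 510, while 0X1A, 0x1a, and the C-style octal 0123 are all rejected.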

Real number literals

Real numbers are written as a decimal or hexadecimal integer followed by a period (.) followed by a sequence of one or more decimal or hexadecimal digits, respectively. A digit is required on each side of the period, so 0. and .3 are both invalid.

A real number can be followed by an exponent character, an optional + or - (defaulting to + if absent), and a character sequence matching the grammar of a decimal integer with some value N. For a decimal real number, the exponent character is e, and the effect is to multiply the given value by 10^(±N). For a hexadecimal real number, the exponent character is p, and the effect is to multiply the given value by 2^(±N). The exponent suffix is optional for both decimal and hexadecimal real numbers.

Note that a decimal integer followed by e is not a real number literal. For example, 3e10 is not a valid literal.
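
As a rough illustration of how the mantissa and exponent combine, the following Python sketch computes the exact mathematical value of a real number literal as a fraction. The pattern names and the `real_value` helper are hypothetical, and digit separators are not handled.

```python
import re
from fractions import Fraction

# Hypothetical patterns for separator-free real number literals.
DEC_INT = r"(?:0|[1-9][0-9]*)"
DECIMAL_REAL = re.compile(rf"({DEC_INT})\.([0-9]+)(?:e([+-]?)({DEC_INT}))?")
HEX_REAL = re.compile(rf"0x([0-9A-F]+)\.([0-9A-F]+)(?:p([+-]?)({DEC_INT}))?")


def real_value(text: str) -> Fraction:
    """Return the exact value denoted by a real number literal."""
    for pattern, digit_base, exp_base in ((DECIMAL_REAL, 10, 10), (HEX_REAL, 16, 2)):
        match = pattern.fullmatch(text)
        if match is None:
            continue
        whole, frac, sign, exponent = match.groups()
        # Digits on both sides of the period form one integer, scaled down
        # by the number of fractional digits.
        mantissa = Fraction(int(whole + frac, digit_base), digit_base ** len(frac))
        n = int(exponent) if exponent is not None else 0
        if sign == "-":
            n = -n
        # e scales by a power of 10, p scales by a power of 2.
        return mantissa * Fraction(exp_base) ** n
    raise ValueError(f"invalid real number literal: {text!r}")
```

For example, `real_value("0x1.2p3")` is (1 + 2/16) x 2^3 = 9, and `real_value("3e10")` raises `ValueError` because the literal has no period.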

When a real number literal is interpreted as a value of a real number type, its value is the representable real number closest to the value of the literal. In the case of a tie, the nearest value whose mantissa is even is selected.
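
The tie-breaking rule can be observed with Python's `float`. This sketch assumes the real number type is IEEE 754 binary64, which this document does not specify. The `significand_is_even` helper is hypothetical and is only meaningful for normal (not subnormal) values.

```python
import math


def significand_is_even(x: float) -> bool:
    """Return True if the 53-bit integer significand of a normal float is even."""
    m, _ = math.frexp(x)          # x == m * 2**e with 0.5 <= m < 1
    return int(m * 2**53) % 2 == 0


# 2**53 + 1 lies exactly halfway between two adjacent binary64 values,
# so the conversion must break a tie.
tie = float("9007199254740993.0")

# 2**53 + 3 is also a tie, this time resolved upward.
tie_up = float("9007199254740995.0")

assert tie == 9007199254740992.0 and significand_is_even(tie)
assert tie_up == 9007199254740996.0 and significand_is_even(tie_up)
```

In both cases the candidate that was not chosen, 9007199254740994.0, has an odd significand.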

The decimal real number syntax allows for any decimal fraction to be expressed -- that is, any number of the form a × 10^(-b), where a is an integer and b is a non-negative integer. Because the decimal fractions are dense in the reals and the set of values of the real number type is assumed to be discrete, every value of the real number type can be expressed as a real number literal. However, for certain applications, directly expressing the intended real number representation may be more convenient than producing a decimal equivalent that is known to convert to the intended value. Hexadecimal real number literals are provided in order to permit values of binary floating or fixed point real number types to be expressed directly.
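
To illustrate what expressing a representation directly can look like, the sketch below turns a Python `float` into a hexadecimal real number literal in the syntax described here. It assumes a binary64 target type. `to_hex_real_literal` is a hypothetical helper built on Python's `float.hex`, which emits lowercase hexadecimal digits, so the digits are uppercased to match the case rules above.

```python
import math


def to_hex_real_literal(value: float) -> str:
    """Format a finite, non-negative float as a hexadecimal real literal."""
    if not math.isfinite(value) or math.copysign(1.0, value) < 0:
        raise ValueError("expected a finite, non-negative value")
    text = value.hex()                      # e.g. '0x1.999999999999ap-4'
    mantissa, exponent = text[2:].split("p")
    # Uppercase the digits only; the x and p characters stay lowercase.
    return "0x" + mantissa.upper() + "p" + exponent


literal = to_hex_real_literal(0.1)          # '0x1.999999999999Ap-4'
# float.fromhex accepts either digit case, so the literal round-trips exactly.
assert float.fromhex(literal) == 0.1
```

The hexadecimal form names the stored bits exactly, whereas the decimal literal 0.1 relies on rounding to reach the same value.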

Digit separators

If digit separators (_) are included in a literal, they must satisfy the condition for that kind of literal:

  • For decimal integers, the digit separators shall occur every three digits starting from the right. For example, 2_147_483_648.
  • For hexadecimal integers, the digit separators shall occur every four digits starting from the right. For example, 0x7FFF_FFFF.
  • For real number literals, digit separators can appear in the decimal and hexadecimal integer portions (before the period and after the exponent character e or p, if present) as described in the previous bullets. For example, 2_147.483648e12_345 or 0x1_00CA.FEF00Dp+24.
  • For binary literals, digit separators can appear between any two digits. For example, 0b1_000_101_11.
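
The placement rules for integer literals can be sketched as regular expressions. This is a minimal Python illustration with hypothetical names. It covers integer literals only and reads the rules as "no separators at all, or separators at every permitted position".

```python
import re

# Decimal: no separators, or groups of three digits counted from the right.
DECIMAL = re.compile(r"(?:0|[1-9][0-9]*|[1-9][0-9]{0,2}(?:_[0-9]{3})+)")
# Hexadecimal: no separators, or groups of four digits counted from the right.
HEXADECIMAL = re.compile(r"0x(?:[0-9A-F]+|[0-9A-F]{1,4}(?:_[0-9A-F]{4})+)")
# Binary: a separator may sit between any two digits.
BINARY = re.compile(r"0b[01]+(?:_[01]+)*")


def has_valid_separators(text: str) -> bool:
    """Return True if text is an integer literal with well-placed separators."""
    return any(
        pattern.fullmatch(text) is not None
        for pattern in (DECIMAL, HEXADECIMAL, BINARY)
    )
```

Under this reading, 2_147_483_648, 0x7FFF_FFFF, and 0b1_000_101_11 are accepted, while 1_00, 12_345_6, and 0x7_FFFF_FFF are rejected.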

Divergence from other languages

The design provides a syntax that is deliberately close to that used both by C++ and many other languages, so it should feel familiar to developers. However, it selects a reasonably minimal subset of the syntaxes. This minimal approach provides benefits directly in line with the goal that Carbon code should be easy to read, understand, and write:

  • Reduces unnecessary choices for programmers.
  • Simplifies the syntax rules of the language.
  • Improves consistency of written Carbon code.

That said, it still provides sufficient variations to address important use cases for the goal of not leaving room for a lower-level language:

  • Hexadecimal and binary integer literals.
  • Scientific notation floating point literals.
  • Hexadecimal (scientific) floating point literals.

Alternatives considered

References