When a numeric literal appears in a program, we need to understand its semantics:
In C++, numeric literals have either an integral type or a floating-point type.
C++ permits implementations to add extended integral types, but
in practice (for bad reasons relating to intmax_t) implementations do not do
so, so there is a small, finite set of types that any given numeric literal
might have:
int, long, long long, or unsigned versions of these
float, double, or long double
The choice of type is determined solely by the literal.
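As a small illustration of the point above, the following C++17 sketch checks the types that a conforming compiler assigns to a few literals. The last two assertions assume a platform where int is 32 bits wide; that width is an assumption of this sketch, not something the text states.

```cpp
#include <type_traits>

// The type depends only on the literal's spelling and value, never on context.
static_assert(std::is_same_v<decltype(1), int>);
static_assert(std::is_same_v<decltype(1u), unsigned int>);
static_assert(std::is_same_v<decltype(1L), long>);
static_assert(std::is_same_v<decltype(1LL), long long>);
static_assert(std::is_same_v<decltype(1.0f), float>);
static_assert(std::is_same_v<decltype(1.0), double>);
static_assert(std::is_same_v<decltype(1.0L), long double>);

// Assumption: int is 32 bits wide. A decimal literal that does not fit in int
// silently takes the first of long / long long that can hold it.
static_assert(sizeof(int) != 4 || !std::is_same_v<decltype(2147483648), int>);
static_assert(sizeof(int) != 4 || sizeof(decltype(2147483648)) >= 8);
```

Note that the type of the literal stays the same regardless of what it is used to initialize.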
The C++ approach is error-prone and problematic:
1 << 60 has value 0 because 1 is a 32-bit type.
int x = -2147483648; typically results in undefined behavior even when -2147483648 is a valid int value.
long int literal, which may or may not be 64 bits wide.
Numeric literals have a type derived from their value, and can be converted to any type that can represent that value.
Simple operations such as arithmetic that involve only literals also produce values of literal types.
Numeric literals have a type derived from their value. Two integer literals have the same type if and only if they represent the same integer. Two real number literals have the same type if and only if they represent the same real number.
That is:
var x: i32 = 1.0; is invalid.
Primitive operators are available between numeric literals, and produce values
with numeric literal types. For example, the type of 1 + 2 is the same as the
type of 3.
Numeric types can provide conversions to support initialization from numeric literals. Because the value of the literal is carried in the type, a type-level decision can be made as to whether the conversion is valid.
The integer types defined in the standard library permit conversion from integer literal types whose values are representable in the integer type. The floating-point types defined in the Carbon library permit conversion from integer and rational literal types whose values are between the minimum and maximum finite value representable in the floating-point type.
The following types are defined in the Carbon prelude:
An arbitrary-precision integer type.
class BigInt;
A rational type, parameterized by a type used for its numerator and denominator.
class Rational(T:! Type);
The exact constraints on T are not yet decided.
A type representing integer literals.
class IntLiteral(N:! BigInt);
A type representing floating-point literals.
class FloatLiteral(X:! Rational(BigInt));
All of these types are usable during compilation. BigInt supports the same
operations as Int(n). Rational(T) supports the same operations as
Float(n).
The types IntLiteral(n) and FloatLiteral(x) also support primitive integer
and floating-point operations such as arithmetic and comparison, but these
operations are typically heterogeneous: for example, an addition between
IntLiteral(n) and IntLiteral(m) produces a value of type
IntLiteral(n + m).
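The idea of a literal type that carries its value, with heterogeneous arithmetic, can be approximated in C++17 with a non-type template parameter. This is only a sketch: the C++ `IntLiteral` template below is a hypothetical stand-in, and it uses `long long` where the proposal uses the arbitrary-precision BigInt, so it is limited to 64-bit values.

```cpp
#include <type_traits>

// Hypothetical stand-in for IntLiteral(N); `long long` replaces BigInt.
template <long long N>
struct IntLiteral {
  static constexpr long long value = N;
};

// Heterogeneous arithmetic: the result type encodes the result value.
template <long long N, long long M>
constexpr IntLiteral<N + M> operator+(IntLiteral<N>, IntLiteral<M>) { return {}; }

template <long long N, long long M>
constexpr IntLiteral<N * M> operator*(IntLiteral<N>, IntLiteral<M>) { return {}; }

template <long long N>
constexpr IntLiteral<-N> operator-(IntLiteral<N>) { return {}; }

// Comparison is decided from the types alone.
template <long long N, long long M>
constexpr bool operator<(IntLiteral<N>, IntLiteral<M>) { return N < M; }

// The type of 1 + 2 is the same as the type of 3 ...
static_assert(std::is_same_v<decltype(IntLiteral<1>{} + IntLiteral<2>{}), IntLiteral<3>>);
// ... and literals with different values have different types.
static_assert(!std::is_same_v<IntLiteral<3>, IntLiteral<4>>);
```

Because the value lives entirely in the type, objects of these types are empty and every operation is resolved at compile time.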
IntLiteral(n) converts to any sufficiently large integer type, as if by:
impl [template N:! BigInt, template M:! BigInt]
    IntLiteral(N) as ImplicitAs(Int(M))
    if N >= Int(M).MinValue as BigInt and N <= Int(M).MaxValue as BigInt {
  ...
}
impl [template N:! BigInt, template M:! BigInt]
    IntLiteral(N) as ImplicitAs(Unsigned(M))
    if N >= Unsigned(M).MinValue as BigInt and N <= Unsigned(M).MaxValue as BigInt {
  ...
}
The above is for exposition purposes only; various parts of this syntax are not yet decided.
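For comparison, a range-checked implicit conversion of this kind can be sketched in C++17 with a constrained conversion operator. The `FitsIn` helper and the C++ `IntLiteral` template are hypothetical names introduced for this illustration, and `long long` again stands in for BigInt.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper: true if the value n is representable in type T.
template <typename T>
constexpr bool FitsIn(long long n) {
  if (n < 0) {
    // Negative values fit only in signed types whose minimum is low enough.
    return std::is_signed_v<T> &&
           n >= static_cast<long long>(std::numeric_limits<T>::min());
  }
  // Compare non-negative values as unsigned to avoid sign-conversion surprises.
  return static_cast<unsigned long long>(n) <=
         static_cast<unsigned long long>(std::numeric_limits<T>::max());
}

template <long long N>
struct IntLiteral {
  // The implicit conversion exists only for integer types that can represent N.
  template <typename T,
            std::enable_if_t<std::is_integral_v<T> && FitsIn<T>(N), int> = 0>
  constexpr operator T() const { return static_cast<T>(N); }
};

static_assert(std::is_convertible_v<IntLiteral<-2147483648LL>, std::int32_t>);
static_assert(!std::is_convertible_v<IntLiteral<300>, std::int8_t>);
static_assert(std::is_convertible_v<IntLiteral<255>, std::uint8_t>);
static_assert(!std::is_convertible_v<IntLiteral<-1>, std::uint8_t>);
```

With this model, `std::int32_t x = IntLiteral<-2147483648LL>{};` compiles, while `std::int8_t c = IntLiteral<300>{};` is rejected at compile time because no viable conversion exists.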
Similarly, IntLiteral(x) and FloatLiteral(x) convert to any sufficiently
large floating-point type, and produce the nearest representable floating-point
value. Conversions in which x lies exactly half-way between two values are
rejected, as previously decided.
Conversions in which x is outside the range of finite values of the
floating-point type are also rejected, rather than saturating to the finite
range or producing an infinity.
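The nearest-value and half-way rules can be illustrated with a runtime C++ sketch for integer values converted to `float`. Several things here are assumptions of the sketch rather than part of the proposal: the names `FloatConversion`, `FloatResult` and `ConvertToFloat` are hypothetical, `float` is assumed to be an IEEE 754 binary32 type, and the input is restricted to `std::int32_t` so that every intermediate value is exact in `double`. With that restriction the out-of-range case cannot occur, so only the half-way rejection is modeled.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

enum class FloatConversion { kExact, kRounded, kRejectedHalfway };

struct FloatResult {
  FloatConversion kind;
  float value;  // Meaningful only when kind is not kRejectedHalfway.
};

// Classifies the conversion of the integer value n to float.
inline FloatResult ConvertToFloat(std::int32_t n) {
  const double exact = static_cast<double>(n);  // Exact for any int32_t.
  const float guess = static_cast<float>(n);    // One of the two neighbors of n.
  if (static_cast<double>(guess) == exact) {
    return {FloatConversion::kExact, guess};
  }
  // Find the representable neighbor on the other side of n.
  const float inf = std::numeric_limits<float>::infinity();
  const float other = std::nextafter(guess, guess < exact ? inf : -inf);
  const double d_guess = std::fabs(static_cast<double>(guess) - exact);
  const double d_other = std::fabs(static_cast<double>(other) - exact);
  if (d_guess == d_other) {
    return {FloatConversion::kRejectedHalfway, 0.0f};  // Tie: reject.
  }
  return {FloatConversion::kRounded, d_guess < d_other ? guess : other};
}

inline void CheckFloatExamples() {
  // 2^24 is representable exactly in binary32.
  assert(ConvertToFloat(16777216).kind == FloatConversion::kExact);
  // 2^24 + 1 lies exactly between 2^24 and 2^24 + 2.
  assert(ConvertToFloat(16777217).kind == FloatConversion::kRejectedHalfway);
  // 2^25 + 1 is closer to 2^25 than to 2^25 + 4.
  assert(ConvertToFloat(33554433).kind == FloatConversion::kRounded);
}
```

In the proposed design this decision would be made at compile time from the literal's type; the runtime check here only demonstrates the arithmetic of the rule.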
// This is OK: the initializer is of the integer literal type with value
// -2147483648 despite being written as a unary `-` applied to a literal.
var x: i32 = -2147483648;
// This initializes y to 2^60.
var y: i64 = 1 << 60;
// This forms a rational literal whose value is one third, and converts it to
// the nearest representable value of type `f64`.
var z: f64 = 1.0 / 3.0;
// This is an error: 300 cannot be represented in type `i8`.
var c: i8 = 300;
fn f[template T:! Type](v: T) {
  var x: i32 = v * 2;
}
// OK: x = 2_000_000_000.
f(1_000_000_000);
// Error: 4_000_000_000 can't be represented in type `i32`.
f(2_000_000_000);
// No storage required for the bound when it's of integer literal type.
struct Span(template T:! Type, template BoundT:! Type) {
  var begin: T*;
  var bound: BoundT;
}
// Returns 1, because 1.3 can implicitly convert to f32, even though conversion
// to f64 might be a more exact match.
fn G() -> i32 {
  match (1.3) {
    case _: f32 => { return 1; }
    case _: f64 => { return 2; }
  }
}
// Can only be called with a literal 0.
fn PassMeZero(_: IntLiteral(0));
// Can only be called with integer literals in the given range.
fn ConvertToByte[template N:! BigInt](_: IntLiteral(N)) -> i8
    if N >= -128 and N <= 127 {
  return N as i8;
}
// Given any int literal, produces a literal whose value is one higher.
fn OneHigher(L: IntLiteral(template _:! BigInt)) -> auto {
  return L + 1;
}
// Error: 256 can't be represented in type `i8`.
var v: i8 = OneHigher(255);
We could decide on a fixed-width type based on the form of the literal, for example using a type suffix with some rules to determine what type to pick for unsuffixed literals.
Advantages:
Disadvantages:
That -2147483648 is invalid because it overflows is surprising.
We could give literals a single, arbitrary-precision type (say, Integer for
integer literals and Rational for real literals).
Advantages:
Writing a function that takes any integer literal can be done with more obvious syntax and less syntactic overhead. Instead of:
fn OneHigher(L: IntLiteral(template _:! BigInt));
we could write
fn OneHigher(template L:! Integer);
However, with this proposal, a function taking any integer expression that can be evaluated to a constant can be written as
fn F(template N:! BigInt);
and such a function would accept all integer literals, as well as non-literal constants.
Disadvantages:
impl selection based on values would introduce substantial complexity.
var x: auto = 123; would result in x having an infinite-precision type, possibly involving invisible dynamic allocation.
x is a type that can only represent the value 123; as such, x is effectively immutable. The arbitrary-precision integer type introduced in this proposal can only be used explicitly by programs naming it.
- in literal tokens
We could treat a leading - character as part of a numeric literal token, so
that, for example, -123 would be a single -123 token rather than a unary
negation applied to a literal 123.
Advantages:
INT_MIN cannot be written directly, without any of the other implications of this proposal.
Disadvantages:
- less uniform.
-, such as an infix exponentiation operator: -2**2 may be expected to evaluate to -4, not to +4.