Why¶

In [1]:

0.1 > 1//10

Out[1]:

true

Simon Byrne - Julia Computing¶

Each finite floating point number corresponds to a single real number

Just not maybe the one you think it is

In [2]:

@printf "%.60f" 0.1

0.100000000000000005551115123125782702118158340454101562500000

Why not just use `promote`?¶

Most languages do!

In [3]:

≃(x::Real,y::Real) = ≃(promote(x,y)...)
≃(x::T, y::T) where {T<:Real} = x == y

Out[3]:

≃ (generic function with 2 methods)

In [4]:

0.1 ≃ 1//10 ≃ 0.1f0

Out[4]:

true

In [5]:

1//10 ≃ 0.1 ≃ 0.1f0

Out[5]:

false

≃ is not a transitive relation:

$$ a \simeq b \quad \text{and} \quad b \simeq c \quad \nRightarrow \quad a \simeq c $$

In [6]:

0.1 ≃ 1//10

Out[6]:

true

In [7]:

1//10 ≃ 0.1f0

Out[7]:

true

In [8]:

0.1 ≃ 0.1f0

Out[8]:

false

We can do better.

	UInt32	UInt64	Int32	Int64	BigInt	Float32	Float64	BigFloat	Rational	Irrational
UInt32
UInt64
Int32
Int64
BigInt
Float32
Float64
BigFloat
Rational
Irrational

In [28]:

sum(1:10)

Out[28]:

Intrinsics¶

These are defined by LLVM, and typically reduce to native CPU instruction.

<(x::Float64, y::Float64) = lt_float(x, y)

In [29]:

@code_llvm 1.0 < 1.0

define i8 @"jlsys_<_60112"(double, double) #0 !dbg !5 {
top:
  %2 = fcmp olt double %0, %1
  %3 = zext i1 %2 to i8
  ret i8 %3
}

In [30]:

@code_native 1.0 < 1.0

	.section	__TEXT,__text,regular,pure_instructions
Filename: float.jl
	pushl	%ebp
	decl	%eax
	movl	%esp, %ebp
Source line: 432
	ucomisd	%xmm0, %xmm1
	seta	%al
	popl	%ebp
	retl
Source line: 432
	nop
	nop
	nop

	UInt32	UInt64	Int32	Int64	BigInt	Float32	Float64	BigFloat	Rational	Irrational
UInt32
UInt64
Int32
Int64
BigInt
Float32
Float64
BigFloat
Rational
Irrational

Conversions¶

Certain conversions can be done exactly (without changing the underlying numeric value):

UInt32 $\rightarrow$ UInt64
Int32 $\rightarrow$ Int64
UInt32 $\rightarrow$ Float64
Int32 $\rightarrow$ Float64
Float32 $\rightarrow$ Float64

We can use this to fill in some gaps.

	UInt32	UInt64	Int32	Int64	BigInt	Float32	Float64	BigFloat	Rational	Irrational
UInt32
UInt64
Int32
Int64
BigInt
Float32
Float64
BigFloat
Rational
Irrational

Library functions¶

BigInt and BigFloat are wrappers around external libraries (GMP and MPFR), which provide various comparison operators.

	UInt32	UInt64	Int32	Int64	BigInt	Float32	Float64	BigFloat	Rational	Irrational
UInt32
UInt64
Int32
Int64
BigInt
Float32
Float64
BigFloat
Rational
Irrational

Unsigned vs Signed¶

<( x::BitSigned,   y::BitUnsigned) = (x <  0) | (unsigned(x) <  y)
<( x::BitUnsigned, y::BitSigned  ) = (y >= 0) & (x <  unsigned(y))

check if the signed value is less than zero
unsigned simply reinterprets a signed as an unsigned integer

	UInt32	UInt64	Int32	Int64	BigInt	Float32	Float64	BigFloat	Rational	Irrational
UInt32
UInt64
Int32
Int64
BigInt
Float32
Float64
BigFloat
Rational
Irrational

`Int64`/`UInt64` vs `Float64`¶

function <(x::Float64, y::Union{UInt64, Int64})
    fy = Float64(y)
    (x < fy) | ((x == fy) & (fy < Float64(typemax(y))) & (unsafe_trunc(typeof(y),fy) < y))
end

Convert the integer to a Float64
Compare that to the float
- If !=, then we're done!
- If ==, there may have been rounding in step 1.
Check if the float is 2.0^64 / 2.0^63
- these are the only potential values which can't be safely converted back to an integer
Convert back to an integer, and do an integer comparison.

Though complicated, actually not that bad in terms of performance
- Can be fully pipelined
- Compiler is pretty goood at eliding unnecessary computations when one value is a constant.
Original to Julia as far as I'm aware (61fe3b9c3 was the initial commit).

	UInt32	UInt64	Int32	Int64	BigInt	Float32	Float64	BigFloat	Rational	Irrational
UInt32
UInt64
Int32
Int64
BigInt
Float32
Float64
BigFloat
Rational
Irrational

`Rational` vs `Integer`/`Rational`¶

Cross-multiply the denominator:

<(x::Rational, y::Integer ) = x.num < widemul(x.den,y)
<(x::Rational, y::Rational) = widemul(x.num,y.den) < widemul(x.den,y.num)

denominator is constrained by the constructor to be positive
- don't need to check signs
widemul returns an integer of a sufficiently wide type so that there is no overflow

	UInt32	UInt64	Int32	Int64	BigInt	Float32	Float64	BigFloat	Rational	Irrational
UInt32
UInt64
Int32
Int64
BigInt
Float32
Float64
BigFloat
Rational
Irrational

`Rational` vs floating point¶

function <(x::Rational, y::AbstractFloat)
    (isnan(x) || isnan(y)) && return false

    xn, xp, xd = decompose(x)
    yn, yp, yd = decompose(y)

    if xd < 0
        xn = -xn
        xd = -xd
    end
    if yd < 0
        yn = -yn
        yd = -yd
    end

    xc, yc = widemul(xn,yd), widemul(yn,xd)
    xs, ys = sign(xc), sign(yc)

    if xs != ys
        return xs < ys
    elseif xs == 0
        # both are zero or ±Inf
        return xn < yn
    end

    xb, yb = ndigits0z(xc,2) + xp, ndigits0z(yc,2) + yp

    if xb == yb
        xc, yc = promote(xc,yc)
        if xp > yp
            xc = (xc<<(xp-yp))
        else
            yc = (yc<<(yp-xp))
        end
        return xc < yc
    else
        return xc > 0 ? xb < yb : yb < xb
    end
end

Check for NaNs
Cross multiply rational with significand of float
Compare based on signs
- if different, we're done
Compare based on magnitude of leading digit
- if different, we're done
Compare based on values of cross multiplication

	UInt32	UInt64	Int32	Int64	BigInt	Float32	Float64	BigFloat	Rational	Irrational
UInt32
UInt64
Int32
Int64
BigInt
Float32
Float64
BigFloat
Rational
Irrational

`Irrational`s¶

Julia provides various constants as a parametric type

In [12]:

pi

Out[12]:

π = 3.1415926535897...

In [13]:

Float64(pi) < pi

Out[13]:

true

In [14]:

typeof(pi)

Out[14]:

Irrational{:π}

`Irrational` vs `Float64`¶

<(x::Irrational, y::Float64) = Float64(x,RoundUp) <= y
<(x::Float64, y::Irrational) = x <= Float64(y,RoundDown)

Exploits the fact that Irrational must occur between two Float64s (since a Float64 is always rational).
Float64(x, RoundUp) gives the smallest Float64 which is greater than or equal to x.

	UInt32	UInt64	Int32	Int64	BigInt	Float32	Float64	BigFloat	Rational	Irrational
UInt32
UInt64
Int32
Int64
BigInt
Float32
Float64
BigFloat
Rational
Irrational

`Irrational` vs `BigFloat`¶

<(x::Irrational, y::BigFloat) = setprecision(precision(y)+32) do
    big(x) < y
end

Assumes that an extra 32 bits is sufficient to distinguish an irrational.
We really should use a loop and keep adding bits until not equal.

	UInt32	UInt64	Int32	Int64	BigInt	Float32	Float64	BigFloat	Rational	Irrational
UInt32
UInt64
Int32
Int64
BigInt
Float32
Float64
BigFloat
Rational
Irrational

`Irrational` vs `Rational`¶

function <(x::AbstractIrrational, y::Rational{T}) where T
    T <: Unsigned && x < 0.0 && return true
    rx = rationalize(T, x)
    if lessrational(rx, x)
        return rx < y
    else
        return rx <= y
    end
end

rational(T, x::Irrational) gives the closest rational number to x that fits whose coefficients fit within type T
- Computed using the convergents and semi-convergents (calculated from the truncated continued fraction of x).
lessrational(rx,x) is an @pure version of rx < big(x)
- Will be evaluated to a constant at compile-time
Can't be used for Rational{BigInt}

`Irrational` vs `Rational{BigInt}`¶

<(x::AbstractIrrational, y::Rational{BigInt}) = big(x) < y

Not strictly correct:

In [15]:

r = 45471447111470790535029367847216232831674172166049053744846518889742361808273//
    14474011154664524427946373126085988481658748083205070504932198000989141204992

Out[15]:

45471447111470790535029367847216232831674172166049053744846518889742361808273//14474011154664524427946373126085988481658748083205070504932198000989141204992

In [16]:

r < pi

Out[16]:

false

In [17]:

setprecision(BigFloat, 1024) do
    r < pi
end

Out[17]:

true

Possible solution¶

Use the fact that the convergents alternate between being smaller and larger: work your way down the continued fraction until you can say for sure.

	UInt32	UInt64	Int32	Int64	BigInt	Float32	Float64	BigFloat	Rational	Irrational
UInt32
UInt64
Int32
Int64
BigInt
Float32
Float64
BigFloat
Rational
Irrational

`Irrational` vs `Irrational`¶

In [18]:

# julia 0.6
pi < pi

< not defined for Irrational{:π}

Stacktrace:
 [1] <(::Irrational{:π}, ::Irrational{:π}) at ./promotion.jl:350

Per Rutquist fixed this a month ago for 0.7/1.0 (#27797):

<(::Irrational{s}, ::Irrational{s}) where {s} = false
function <(x::AbstractIrrational, y::AbstractIrrational)
    Float64(x) != Float64(y) || throw(MethodError(<, (x, y)))
    return Float64(x) < Float64(y)
end

	UInt32	UInt64	Int32	Int64	BigInt	Float32	Float64	BigFloat	Rational	Irrational
UInt32
UInt64
Int32
Int64
BigInt
Float32
Float64
BigFloat
Rational
Irrational

Extending¶

Multiple dispatch allows easily extending to user-defined types.

Still work to do:

In [27]:

using DecFP
d64"1e100" == 1e100

Out[27]:

true

In [24]:

BigInt(1e100)

Out[24]:

10000000000000000159028911097599180468360808563945281389781327557747838772170381060813469985856815104

Epilogue¶

While preparing this talk on Tuesday, I found an issue with the categorisation of the unicode operator $\eqsim$.
The fix is the only breaking change between 0.7 and 1.0 (#28511).

Why¶

Simon Byrne - Julia Computing¶

Why not just use promote?¶

Intrinsics¶

Conversions¶

Library functions¶

Unsigned vs Signed¶

Int64/UInt64 vs Float64¶

Rational vs Integer/Rational¶

Rational vs floating point¶

Irrationals¶

Irrational vs Float64¶

Irrational vs BigFloat¶

Irrational vs Rational¶

Irrational vs Rational{BigInt}¶