AutoSpecialization in Dotty
Dmitry Petrashko

About me:
- https://github.com/darkdimius/
- doing PhD at EPFL
- previously worked on ScalaBlitz (up to 24x faster collections)
- since March 2014 building on Dotty under supervision of Martin.
- since March 2015 working on Dotty Linker and supervising students working on Dotty.
Dotty contributor stats:
alias blame-lines="git ls-tree -r -z --name-only HEAD -- | egrep -z -Z -E '\.(scala)$'\
| xargs -0 -n1 git blame --line-porcelain | grep "^author "| sort | uniq -c | sort -nr"
| author | commits | additions | deletions | blame lines |
| Martin Odersky | 2,280 | 146,133 | 85,391 | 66,698 |
| Dmitry Petrashko | 628 | 158,249 | 50,201 | 83,713 |
| Samuel Gruetter | 19 | 62,324 | 12,510 | 36,615 |
| Others | 144 | 5,315 | 2,108 | 4,161 |
| Java | Scala |
public double average(int[] data) {
int sum = 0;
for(int i = 0; i < data.length; i++) {
sum += data[i];
}
return sum * 1.0d / data.length
}
|
def average(x: Array[Int]) = {
x.reduce(_ + _) * 1.0 / x.size
}
|
| 20 msec | 650 msec |
Why?
public double average(int[] data) {
int sum = 0;
for(int i = 0; i < data.length; i++) {
sum += data[i];
}
return sum * 1.0d / data.length
}
| Java cycle body: - Range check;
- Addition;
- Index Increment.
|
Why?
def reduce(op: Function2[Obj, Obj, Obj]): Obj = {
var first = true
var acc: Obj = null
this.foreach{ e =>
if (first) {
acc = e
first = false
} else acc = op.apply(acc, e)
}
acc
}
def foreach(f: Funtion1[Obj, Obj]) {
var i = 0
val len = length
while (i < len) {
f.apply(this(i));
i += 1
}
} |
Scala cycle body:
- Range check;
- boxing of element
- dynamic dispatch(foreach arg)
- predicate check(first?)
- dynamic dispatch(reduce arg)
- Addition;
- boxing inside addition.
- Index Increment.
|
Scala is an awesome language:
Convenience for the programmer:
- higher-order types
- generic methods
- generic classes
- multiple inheritance
- pattern matching
- lazy evaluation
- garbage collection
Scala runs (almost) everywhere!
Because it runs on Java(Script) VMs
Does JVM support those features:
- higher-order types
- generic methods
- generic classes
- multiple inheritance
- pattern matching
- lazy evaluation
- garbage collection
Does JVM support those features:
Garbage collection - yes.
Others - not really.
This talk will concentrate on how we compile:
- higher-order types
- generic methods
- generic classes
- multiple inheritance
Compiler needs to lower generics
def plus[T](a: T, b: T)(implicit num: Numeric[T]) =
num.plus(a, b)
def plus[T](a: T, b: T)(implicit num: Numeric[T]): T =
num.plus(a, b)
is compiled into
def plus(a: Object, b: Object, num: Numeric): Object =
num.plus(a, b)
How does plus(1, 1) look like?
unbox(plus(box(1), box(1), intNum))
Have ~same cost:
- single boxing(allocation)
- 5 dynamic dispatches
- 15 static dispatches
- 20 additions
What if the compiler cloned generics
def plus(a: Int, b: Int, num: Numeric[Int]): Int =
num.plus(a, b)
def plus(a: Long, b: Long, num: Numeric[Long]): Long =
num.plus(a, b)
def plus(a: Double, b: Double, num: Numeric[Double]): Double =
num.plus(a, b)
...
Scala has 9 primitive types:
- int, long, short, byte, char
- float, double
- boolean, void
+ we need a reference type.
= 10 types total.
class T[A1, A2, ..., An]
How many combinations are there?
10 ^ n
class T[A1, A2, ..., An] {
def foo[B1, B2, ..., Bn] = ...
}
How many combinations are there?
10 ^ (n + m)
Ok, you don't need all specializations of all methods.
Dragos(2010): Ask your user!
@specialized (2010)
- Mark type arguments to be specialized
- can specify what types should be used
- limitation - no inheritance.
@miboxed (2012)
- Made it simpler by having 3^n instead of 10^n
- some performance degradation
- handles inheritance
- has problems with arrays
- complicated implementation
Let's say you develop a library:
For demonstation, take scala.Function3[A1, A2, A3, R]
- Should you specialize?
- what types should you specialize for?
- One size fits them all?
@specialization\miniboxing break modularity
What if the JVM had reified generics?
backwards compatibility
how would pre-reified generics code call into reified generics code?
code explosion
More code generated in runtime:
- less time spend optimizing every version.
- can not perform costly optimizations.
- branch prediction & cpu code cache.
Multi-language VM
Different languages have different typesystems:
- static vs dynamic
- strong vs weak
- variance
- higher-kindness
Multi-language VM
If they run on same VM, they need to "agree" on runtime typesystem.
Multi-language VM
If runtime typesystem is complicated - you may "disagree" with it.
Where "disagree" is some of:
- no interop
- boilerplate written by user
- a lot of "magic" code generated by compiler
- slow performance
I'm not sure if I want reified generics in the VM
trait Function3[-T1, -T2, -T3, +R]
How many specializations are needed?
- Specialization: 10000 classes
- Miniboxing: 81 classes
Wrong question!
How many specializations are needed in your code?
Imprecise question!
How many specializations are needed in your app including libraries?
Auto specialization in the Dotty linker
- takes your program & libraries
- analyzes how you use them
- sees what instantiations are needed
- specializes them
- no need to annotate types.
Implementation artifact:
Type-arguments and term-args are treated in the same way.
def foo[T](x: Seq[T], id: Int) = x(id)
- Type specialization remove boxing
- Term specialization removes virtual dispatch
Auto specialization: Limitations
- slows down compilation
- requires dependencies to have TASTY
- does not help if library does tricks:
- uses null as a special value
- has typed API but untyped internals
Current status:
- prototype, unstable
- Lines of code(cf @specialized 2000+, @miniboxing 5000+)
- Analysis: 1300
- Specialization: 900
- can compile std collections
- some collections do not benefit(vector)
- List\Array\Range\ListBuffer\etc work perfectly
- will be opensourced soon
Demonstration
(if time permits)
Standing on the shoulders of giants:
- Dragos: @specialized
- Ureche: @miniboxing
- Burmako: TASTY
- Odersky: Dotty & supervision
AutoSpecialization in Dotty
Dmitry Petrashko