Tailoring the Scala Library

by Stephane Micheloud, October 2011

In our previous article "XML-free Scala" we presented the removal of native XML support from Scala, with the two-fold objective of reducing the code size of the Scala standard library and speeding up the development of Scala software. We now want to tailor the functionality of the Scala standard library further.

In this article we focus on the parallel collection framework, which is the main addition to the Scala 2.9 software distribution (see the article "Revisited Scala Change History" for more details). More generally, we want the claim "Scala is a scalable language" to also mean "Scala comes with a scalable library".

The Scala 2.9.1 API consists of the following Java archives and library packages:

Java archive        Size      Library package    Description
scala-library.jar   8635 KB   scala.actors       Actor-based concurrency [1]
                              scala.collection   Collection framework with mutable, immutable and parallel classes (e.g. Seq, Map, Set) [2]
                              scala._            Combinator parsing, continuations, reflection, regular expressions, XML, etc.
scala-dbc.jar       298 KB    scala.dbc          Database connection facilities
scala-swing.jar     838 KB    scala.swing        Improved API to the Java Swing framework

Since version 2.9 of the Scala distribution we have the unfortunate situation (see the article "Shrinking the Scala library code") where the size of the Scala standard library exceeds 8 MB (8635 KB for scala-library.jar in version 2.9.1).

Concretely we have the following two choices for further tailoring the Scala standard library:

No Parallel Collection Framework

This software distribution was generated from the SVN project scala-noxml (more details at the end of this article) and differs from the original Scala software distribution as follows:

Version       Archive                                                                Size    Checksum
2.9.1.final   scala-nopar-2.9.1.final.tgz                                            20 MB   .md5
              scala-nopar-2.9.1.final.zip                                            20 MB   .md5
              scala-nopar-2.9.1.final-devel-docs.tgz                                 15 MB   .md5
SVN trunk     scala-nopar-2.10.0.rdev-4107-2012-01-02-g008a781.tgz                   21 MB   .md5
              scala-nopar-2.10.0.rdev-4107-2012-01-02-g008a781.zip                   21 MB   .md5
              scala-nopar-2.10.0.rdev-4107-2012-01-02-g008a781-devel-docs.tgz        15 MB   .md5
SVN trunk     scala-noxml-nopar-2.10.0.rdev-4107-2012-01-02-g008a781.tgz             19 MB   .md5
              scala-noxml-nopar-2.10.0.rdev-4107-2012-01-02-g008a781.zip             19 MB   .md5
              scala-noxml-nopar-2.10.0.rdev-4107-2012-01-02-g008a781-devel-docs.tgz  14 MB   .md5

Standalone Parallel Collection Framework

This second software distribution differs from the original Scala software distribution as follows (more details at the end of this article):

Last updated on May 24, 2012
Version    Archive                                                           Size    Checksum
Git repo   scala-parlib-2.10.0.rdev-4209-2012-01-18-g6099141.tgz             20 MB   .md5
           scala-parlib-2.10.0.rdev-4209-2012-01-18-g6099141.zip             20 MB   .md5
           scala-parlib-2.10.0.rdev-4209-2012-01-18-g6099141-devel-docs.tgz  17 MB   .md5

The generated Java archives are now:

Java archive                    Size      Library package
scala-library.jar               5324 KB   scala._
scala-actors.jar                404 KB    scala.actors
scala-collection-parallel.jar   1295 KB   scala.collection.parallel
scala-dbc.jar                   298 KB    scala.dbc
scala-swing.jar                 837 KB    scala.swing

Client code then simply needs the appropriate import clause to access the functionality of the library package (as for the actor library), that is:

import scala.collection.parallel._

to use the parallel collection framework.
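
As a minimal sketch of what such client code might look like, the example below computes a sum of squares once sequentially and once on a parallel collection obtained with par. It assumes a Scala 2.9 to 2.12 compiler, or the standalone distribution above with scala-collection-parallel.jar on the class path; the object name ParDemo and the method names are illustrative only and not part of any distribution.

```scala
import scala.collection.parallel._

object ParDemo {
  // Sum of squares computed on an ordinary (sequential) collection.
  def sumOfSquaresSeq(n: Int): Long =
    (1 to n).map(x => x.toLong * x).sum

  // Same computation on a parallel collection obtained with `par`.
  def sumOfSquaresPar(n: Int): Long =
    (1 to n).par.map(x => x.toLong * x).sum

  def main(args: Array[String]): Unit = {
    val n = 1000
    // Both variants must agree; only the evaluation strategy differs.
    assert(sumOfSquaresSeq(n) == sumOfSquaresPar(n))
    println(sumOfSquaresPar(n))
  }
}
```

With the official Scala 2.9 distribution the par method is available without the import; with the standalone framework it is the import that brings the implicit conversions described in the technical notes into scope.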

Please send your feedback or report issues directly to the author and not to the Scala project team.

References

  1. Actors in Scala
    by Philipp Haller, Frank Sommers, 2010 (1st Ed.)
  2. by Martin Odersky, September 2009
  3. The Scala Language Specification (SLS), Version 2.8
    by Martin Odersky, November 2010

About the Author

Stéphane Micheloud is a senior software engineer. He holds a PhD in computer science from EPFL and an MSc in computer science from ETHZ. At EPFL he worked on distributed programming and advanced compiler techniques and participated in the Scala project for over six years. Previously he was a professor of computer science at HES-SO // Valais in Sierre, Switzerland.

Technical Notes

  1. The project modifications/additions are similar to those described in our previous article "XML-free Scala"; we added a few more project files and the two Ant targets enable.par and disable.par.

    The above project configuration makes it possible to build either the official Scala distribution or a Scala distribution with no parallel collection framework using the same code base.

    • Full Scala distribution

      [2.9.x]$ ant dist-opt distpack-opt

    • Scala distribution with no parallel collection framework

      [2.9.x]$ ant -f build-noxml.xml disable.par
      [2.9.x]$ ant -f build-noxml.xml dist-opt distpack-opt
      [2.9.x]$ ant -f build-noxml.xml enable.par

    • Scala distribution with no parallel collection framework and no native XML support

      [2.9.x]$ ant -f build-noxml.xml disable.xml disable.par
      [2.9.x]$ ant -f build-noxml.xml dist-opt distpack-opt
      [2.9.x]$ ant -f build-noxml.xml enable.xml enable.par

  2. The framework redesign consists of the following changes:

    • The following source files are moved to package scala.collection.parallel (where they actually belong):

      parallel/CustomParallelizable.scala
      parallel/Parallelizable.scala
      parallel/Parallel.scala
      parallel/generic/CanCombineFrom.scala
      parallel/generic/GenericParCompanion.scala
      parallel/generic/GenericParTemplate.scala
      parallel/generic/HasNewCombiner.scala
      parallel/generic/ParFactory.scala
      parallel/generic/ParMapFactory.scala
      parallel/generic/ParSetFactory.scala
    • Some implicit conversions are added to file parallel/package.scala and give access to the two methods par and seq:

      package scala.collection
      //...
      package object parallel {
        //...
        implicit def iterable2ParIterable[A](col: Iterable[A]) =
          new Parallelizable[A, ParIterable[A]] {
            def seq = col
            protected[this] def parCombiner = ParIterable.newCombiner[A]
          }
        //.. (more implicits)
        implicit def cmHashset2ParHashSet[A](col: cm.HashSet[A]) =
          new Parallelizable[A, pm.ParHashSet[A]] {
            def seq = col
            override def par = new pm.ParHashSet(col.hashTableContents)
            protected[this] def parCombiner = pm.ParHashSet.newCombiner[A]
          }
        //...
      }

    The chosen project configuration is similar to that of the scala-noxml project and makes it possible to build either the official Scala distribution or a Scala distribution with the standalone parallel collection framework using the same code base.

    • Full Scala distribution

      [scala-parlib]$ ant dist-opt distpack-opt

    • Scala distribution with standalone parallel collection framework

      [scala-parlib]$ ant -f build-parlib.xml disable.par
      [scala-parlib]$ ant -f build-parlib.xml dist-opt distpack-opt
      [scala-parlib]$ ant -f build-parlib.xml enable.par
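
The implicit conversions added to parallel/package.scala (note 2) rely on the usual Scala 2 mechanism of implicit views: a collection that does not define par itself is wrapped on demand in an object that does. The following self-contained sketch illustrates that mechanism only; Box, ParBox, ToPar and box2ToPar are hypothetical names invented for the illustration and are much simpler than the Parallelizable trait and the combiners of the actual framework.

```scala
object ImplicitParSketch {
  // Hypothetical sequential container that has no `par` method of its own.
  final case class Box[A](items: List[A])

  // Hypothetical "parallel" counterpart; here it only wraps the same data.
  final case class ParBox[A](items: List[A]) {
    def seq: Box[A] = Box(items)
  }

  // Simplified stand-in for Parallelizable: the wrapper carrying `seq` and `par`.
  trait ToPar[A] {
    def seq: Box[A]
    def par: ParBox[A] = ParBox(seq.items)
  }

  // Implicit view in the spirit of iterable2ParIterable: once it is in scope,
  // `box.par` compiles although Box itself does not define `par`.
  // (Scala 2.10 and later report a feature warning for implicit conversions
  // unless scala.language.implicitConversions is imported.)
  implicit def box2ToPar[A](col: Box[A]): ToPar[A] =
    new ToPar[A] { def seq: Box[A] = col }

  def main(args: Array[String]): Unit = {
    val box = Box(List(1, 2, 3))
    val parBox = box.par          // resolved through box2ToPar
    assert(parBox.seq == box)     // round trip back to the sequential form
    println(parBox)
  }
}
```

Keeping the conversions in the package object means that client code which never imports scala.collection.parallel._ sees no par method at all, which is what allows the framework to live in a separate Java archive.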