compiler_gym.datasets

An instance of a CompilerGym environment uses a Benchmark as the program being optimized. A Dataset is a collection of benchmarks that can be installed and made available for use.

Benchmark

class compiler_gym.datasets.Benchmark(proto: compiler_gym.service.proto.compiler_gym_service_pb2.Benchmark, validation_callbacks: Optional[List[Callable[[CompilerEnv], Iterable[compiler_gym.validation_error.ValidationError]]]] = None, sources: Optional[List[compiler_gym.datasets.benchmark.BenchmarkSource]] = None)[source]

A benchmark represents a particular program that is being compiled.

A benchmark is a program that can be used by a CompilerEnv as the program to optimize. It comprises the data that is fed into the compiler, identified by a URI.

Benchmarks are not normally instantiated directly. Instead, benchmarks are instantiated using env.datasets.benchmark(uri):

>>> env.datasets.benchmark("benchmark://npb-v0/20")
benchmark://npb-v0/20

The available benchmark URIs can be queried using env.datasets.benchmark_uris().

>>> next(env.datasets.benchmark_uris())
'benchmark://cbench-v1/adpcm'

Compiler environments may provide additional helper functions for generating benchmarks, such as env.make_benchmark() for LLVM.
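
For example, a minimal sketch of creating a custom benchmark from a local C source file with the LLVM environment (the file path is illustrative):

>>> env = gym.make("llvm-v0")
>>> benchmark = env.make_benchmark("example.c")
>>> env.reset(benchmark=benchmark)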

A Benchmark instance wraps an instance of the Benchmark protocol buffer from the RPC interface with additional functionality. The data underlying benchmarks should be considered immutable. New attributes cannot be assigned to Benchmark instances.

The benchmark for an environment can be set during env.reset(). The currently active benchmark can be queried using env.benchmark:

>>> env = gym.make("llvm-v0")
>>> env.reset(benchmark="benchmark://cbench-v1/crc32")
>>> env.benchmark
benchmark://cbench-v1/crc32
add_source(source: compiler_gym.datasets.benchmark.BenchmarkSource) → None[source]

Register a new source file for this benchmark.

Parameters

source – The BenchmarkSource to register.
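
For example, a sketch that registers an illustrative source file (the filename and contents are hypothetical):

>>> from compiler_gym.datasets import BenchmarkSource
>>> benchmark.add_source(BenchmarkSource(filename="example.c", contents=b"int A() { return 0; }"))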

add_validation_callback(validation_callback: Callable[[CompilerEnv], Iterable[compiler_gym.validation_error.ValidationError]]) → None[source]

Register a new validation callback that will be executed on validate().

Parameters

validation_callback – A callback that accepts a single CompilerEnv argument and returns an iterable sequence of zero or more ValidationError tuples. Validation callbacks must be thread safe and must not modify the environment.
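
For example, a sketch of a callback that flags empty modules; the observation space name and error type are illustrative:

>>> from compiler_gym.validation_error import ValidationError
>>> def check_not_empty(env):
...     # Hypothetical check using the LLVM IrInstructionCount observation.
...     if env.observation["IrInstructionCount"] == 0:
...         yield ValidationError(type="EmptyModule", data={"uri": str(env.benchmark)})
>>> benchmark.add_validation_callback(check_not_empty)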

classmethod from_file(uri: str, path: pathlib.Path)[source]

Construct a benchmark from a file.

Parameters
  • uri – The URI of the benchmark.

  • path – A filesystem path.

Raises

FileNotFoundError – If the path does not exist.

Returns

A Benchmark instance.
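
A sketch, assuming a pre-compiled LLVM bitcode file at an illustrative path:

>>> from pathlib import Path
>>> from compiler_gym.datasets import Benchmark
>>> benchmark = Benchmark.from_file("benchmark://user-v0/example", Path("/tmp/example.bc"))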

classmethod from_file_contents(uri: str, data: bytes)[source]

Construct a benchmark from raw data.

Parameters
  • uri – The URI of the benchmark.

  • data – An array of bytes that will be passed to the compiler service.
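
For example, a sketch that reads the raw bytes from an illustrative file on disk:

>>> from pathlib import Path
>>> data = Path("/tmp/example.bc").read_bytes()
>>> benchmark = Benchmark.from_file_contents("benchmark://user-v0/example", data)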

is_validatable() → bool[source]

Whether the benchmark has any validation callbacks registered.

Returns

True if the benchmark has at least one validation callback.

ivalidate(env: CompilerEnv) → Iterable[compiler_gym.validation_error.ValidationError][source]

Run the validation callbacks and return a generator of errors.

This is an asynchronous version of validate() that returns immediately.

Parameters

env – A CompilerEnv instance to validate.

Returns

A generator of ValidationError tuples that occur during validation.
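
For example, a sketch that reports errors as they are produced:

>>> for error in benchmark.ivalidate(env):
...     print(error.type)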

property proto: compiler_gym.service.proto.compiler_gym_service_pb2.Benchmark

The protocol buffer representing the benchmark.

Returns

A Benchmark message.

Type

Benchmark

property sources: Iterable[compiler_gym.datasets.benchmark.BenchmarkSource]

The original source code used to produce this benchmark, as a list of BenchmarkSource instances.

Returns

A sequence of source files.

Type

Iterable[BenchmarkSource]

Warning

The Benchmark.sources property is new and is likely to change in the future.

property uri: str

The URI of the benchmark.

Benchmark URIs should be unique; that is, two URIs with the same value should resolve to the same benchmark. However, a URI does not uniquely describe a benchmark: multiple identical benchmarks could have different URIs.

Returns

A URI string.

Type

str

validate(env: CompilerEnv) → List[compiler_gym.validation_error.ValidationError][source]

Run the validation callbacks and return any errors.

If no errors are returned, validation has succeeded:

>>> benchmark.validate(env)
[]

If an error occurs, a ValidationError tuple will describe the type of the error, and optionally contain other data:

>>> benchmark.validate(env)
[ValidationError(type="RuntimeError")]

Multiple ValidationError tuples may be returned to indicate multiple errors.

This is a synchronous version of ivalidate() that blocks until all results are ready:

>>> benchmark.validate(env) == list(benchmark.ivalidate(env))
True
Parameters

env – The CompilerEnv instance that is being validated.

Returns

A list of zero or more ValidationError tuples that occurred during validation.

validation_callbacks() → List[Callable[[CompilerEnv], Iterable[compiler_gym.validation_error.ValidationError]]][source]

Return the list of registered validation callbacks.

Returns

A list of callables. See add_validation_callback().

write_sources_to_directory(directory: pathlib.Path) → int[source]

Write the source files for this benchmark to the given directory.

This writes each of the benchmark.sources files to disk.

If the benchmark has no sources, no files are written.

Parameters

directory – The directory to write results to. If it does not exist, it is created.

Returns

The number of files written.
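
For example, a sketch that writes the sources to an illustrative directory; the return value shown is likewise illustrative:

>>> from pathlib import Path
>>> benchmark.write_sources_to_directory(Path("/tmp/benchmark_sources"))
2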

class compiler_gym.datasets.BenchmarkSource(filename: str, contents: bytes)[source]

A source file that is used to generate a benchmark. A benchmark may comprise many source files.

Warning

The BenchmarkSource class is new and is likely to change in the future.

contents: bytes

The contents of the file as a byte array.

filename: str

The name of the file.

class compiler_gym.datasets.BenchmarkInitError[source]

Base class for errors raised if a benchmark fails to initialize.

Dataset

class compiler_gym.datasets.Dataset(name: str, description: str, license: str, site_data_base: pathlib.Path, benchmark_class=<class 'compiler_gym.datasets.benchmark.Benchmark'>, references: Optional[Dict[str, str]] = None, deprecated: Optional[str] = None, sort_order: int = 0, logger: Optional[logging.Logger] = None, validatable: str = 'No')[source]

A dataset is a collection of benchmarks.

The Dataset class has methods for installing and managing groups of benchmarks, for listing the available benchmark URIs, and for instantiating Benchmark objects.

The Dataset class is an abstract base for implementing datasets. At a minimum, subclasses must implement the benchmark() and benchmark_uris() methods, and size. Other methods such as install() may be used where helpful.
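
As an illustration, a minimal sketch of a Dataset subclass that serves a fixed set of in-memory programs; the name, program data, and URIs are hypothetical:

from pathlib import Path
from typing import Iterable

from compiler_gym.datasets import Benchmark, Dataset

class InMemoryDataset(Dataset):
    """A hypothetical dataset backed by a fixed dictionary of programs."""

    def __init__(self, site_data_base: Path):
        super().__init__(
            name="benchmark://in-memory-v0",
            description="An illustrative in-memory dataset",
            license="MIT",
            site_data_base=site_data_base,
        )
        # Hypothetical raw program data keyed by benchmark URI.
        self._programs = {
            "benchmark://in-memory-v0/a": b"...",
            "benchmark://in-memory-v0/b": b"...",
        }

    @property
    def size(self) -> int:
        return len(self._programs)

    def benchmark_uris(self) -> Iterable[str]:
        yield from sorted(self._programs)

    def benchmark(self, uri: str) -> Benchmark:
        if uri not in self._programs:
            raise LookupError(uri)
        return Benchmark.from_file_contents(uri, self._programs[uri])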

__init__(name: str, description: str, license: str, site_data_base: pathlib.Path, benchmark_class=<class 'compiler_gym.datasets.benchmark.Benchmark'>, references: Optional[Dict[str, str]] = None, deprecated: Optional[str] = None, sort_order: int = 0, logger: Optional[logging.Logger] = None, validatable: str = 'No')[source]

Constructor.

Parameters
  • name – The name of the dataset. Must conform to the pattern {{protocol}}://{{name}}-v{{version}}.

  • description – A short human-readable description of the dataset.

  • license – The name of the dataset’s license.

  • site_data_base – The base path of a directory that will be used to store installed files.

  • benchmark_class – The class to use when instantiating benchmarks. It must have the same constructor signature as Benchmark.

  • references – A dictionary of useful named URLs for this dataset containing extra information, download links, papers, etc.

  • deprecated – Mark the dataset as deprecated and issue a warning when install() is called, including the given message. Deprecated datasets are excluded from the datasets() iterator by default.

  • sort_order – An optional numeric value that should be used to order this dataset relative to others. Lowest value sorts first.

  • validatable – Whether the dataset is validatable. A validatable dataset is one where the behavior of the benchmarks can be checked by compiling the programs to binaries and executing them. If the benchmarks crash, or are found to have different behavior, then validation fails. This type of validation is used to check that the compiler has not broken the semantics of the program. This value takes a string and is used for documentation purposes only. Suggested values are “Yes”, “No”, or “Partial”.

Raises

ValueError – If name does not match the expected pattern.

__len__() → int[source]

The number of benchmarks in the dataset.

This is the same as Dataset.size:

>>> len(dataset) == dataset.size
True

If the number of benchmarks is unknown or unbounded, for example because the dataset represents a program generator that can produce an infinite number of programs, the value is 0.

Returns

An integer.

__getitem__(uri: str) → compiler_gym.datasets.benchmark.Benchmark[source]

Select a benchmark by URI.

This is the same as Dataset.benchmark(uri):

>>> dataset["benchmark://cbench-v1/crc32"] == dataset.benchmark("benchmark://cbench-v1/crc32")
True
Returns

A Benchmark instance.

Raises

LookupError – If uri does not exist.

__iter__() → Iterable[compiler_gym.datasets.benchmark.Benchmark][source]

Enumerate the (possibly infinite) benchmarks lazily.

This is the same as Dataset.benchmarks():

>>> from itertools import islice
>>> list(islice(dataset, 100)) == list(islice(dataset.benchmarks(), 100))
True
Returns

An iterable sequence of Benchmark instances.

benchmark(uri: str) → compiler_gym.datasets.benchmark.Benchmark[source]

Select a benchmark.

Parameters

uri – The URI of the benchmark to return.

Returns

A Benchmark instance.

Raises

LookupError – If uri is not found.

benchmark_uris() → Iterable[str][source]

Enumerate the (possibly infinite) benchmark URIs.

Iteration order is consistent across runs. The order of benchmarks() and benchmark_uris() is the same.

If the number of benchmarks in the dataset is infinite (len(dataset) == math.inf), the iterable returned by this method will continue indefinitely.

Returns

An iterable sequence of benchmark URI strings.
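
For example, a sketch that samples a bounded prefix of URIs from a possibly infinite dataset (assuming it contains at least 100 benchmarks):

>>> from itertools import islice
>>> uris = list(islice(dataset.benchmark_uris(), 100))
>>> len(uris)
100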

benchmarks() → Iterable[compiler_gym.datasets.benchmark.Benchmark][source]

Enumerate the (possibly infinite) benchmarks lazily.

Iteration order is consistent across runs. The order of benchmarks() and benchmark_uris() is the same.

If the number of benchmarks in the dataset is infinite (len(dataset) == math.inf), the iterable returned by this method will continue indefinitely.

Returns

An iterable sequence of Benchmark instances.

property deprecated: bool

Whether the dataset is deprecated. Deprecated datasets are excluded by default from the iterable sequence of datasets of a containing Datasets collection.

Type

bool

property description: str

A short human-readable description of the dataset.

Type

str

install() → None[source]

Install this dataset locally.

Implementing this method is optional. If implementing this method, you must call super().install() first.

This method should not perform redundant work: it should first detect whether any work needs to be done so that repeated calls to install() complete quickly.
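
A sketch of this pattern in a hypothetical subclass; fetch_benchmark_files() is an illustrative helper, not part of the API:

from compiler_gym.datasets import Dataset

class MyDataset(Dataset):
    # Constructor and benchmark lookup omitted for brevity.

    def install(self) -> None:
        super().install()
        marker = self.site_data_path / ".installed"  # completion marker for fast repeat calls
        if marker.is_file():
            return
        self.site_data_path.mkdir(parents=True, exist_ok=True)
        fetch_benchmark_files(self.site_data_path)  # hypothetical download/unpack helper
        marker.touch()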

property installed: bool

Whether the dataset is installed locally. Installation occurs automatically on first use, or by calling install().

Type

bool

property license: str

The name of the license of the dataset.

Type

str

property logger: logging.Logger

The logger for this dataset.

Type

logging.Logger

property name: str

The name of the dataset.

Type

str

property protocol: str

The URI protocol that is used to identify benchmarks in this dataset.

Type

str

random_benchmark(random_state: Optional[numpy.random._generator.Generator] = None) → compiler_gym.datasets.benchmark.Benchmark[source]

Select a benchmark randomly.

Parameters

random_state – A random number generator. If not provided, a default np.random.default_rng() is used.

Returns

A Benchmark instance.

property references: Dict[str, str]

A dictionary of useful named URLs for this dataset containing extra information, download links, papers, etc.

For example:

>>> dataset.references
{'Paper': 'https://arxiv.org/pdf/1407.3487.pdf',
'Homepage': 'https://ctuning.org/wiki/index.php/CTools:CBench'}
Type

Dict[str, str]

property site_data_path: pathlib.Path

The filesystem path used to store persistent dataset files.

This directory may not exist.

Type

Path

property site_data_size_in_bytes: int

The total size of the on-disk data used by this dataset.

Type

int

property size: int

The number of benchmarks in the dataset.

If the number of benchmarks is unknown or unbounded, for example because the dataset represents a program generator that can produce an infinite number of programs, the value is 0.

Type

int

uninstall() → None[source]

Remove any local data for this dataset.

This method undoes the work of install(). The dataset can still be used after calling this method.

property validatable: str

Whether the dataset is validatable. A validatable dataset is one where the behavior of the benchmarks can be checked by compiling the programs to binaries and executing them. If the benchmarks crash, or are found to have different behavior, then validation fails. This type of validation is used to check that the compiler has not broken the semantics of the program.

This property takes a string and is used for documentation purposes only. Suggested values are “Yes”, “No”, or “Partial”.

Type

str

property version: int

The version tag for this dataset.

Type

int

class compiler_gym.datasets.DatasetInitError[source]

Base class for errors raised if a dataset fails to initialize.

FilesDataset

class compiler_gym.datasets.FilesDataset(dataset_root: pathlib.Path, benchmark_file_suffix: str = '', memoize_uris: bool = True, **dataset_args)[source]

A dataset comprising a directory tree of files.

A FilesDataset is a root directory that contains (a possibly nested tree of) files, where each file represents a benchmark. The directory contents can be filtered by specifying a filename suffix that files must match.

The URI of each benchmark is the relative path of the corresponding file, stripped of the filename suffix if one is specified. For example, given the following file tree:

/tmp/dataset/a.txt
/tmp/dataset/LICENSE
/tmp/dataset/subdir/subdir/b.txt
/tmp/dataset/subdir/subdir/c.txt

a FilesDataset benchmark://ds-v0 rooted at /tmp/dataset with filename suffix .txt will contain the following URIs:

>>> list(dataset.benchmark_uris())
[
    "benchmark://ds-v0/a",
    "benchmark://ds-v0/subdir/subdir/b",
    "benchmark://ds-v0/subdir/subdir/c",
]
__init__(dataset_root: pathlib.Path, benchmark_file_suffix: str = '', memoize_uris: bool = True, **dataset_args)[source]

Constructor.

Parameters
  • dataset_root – The root directory to look for benchmark files.

  • benchmark_file_suffix – A file extension that must be matched for a file to be used as a benchmark.

  • memoize_uris – Whether to memoize the list of URIs contained in the dataset. Memoizing the URIs enables faster repeated iteration over dataset.benchmark_uris() at the expense of increased memory overhead as the file list must be kept in memory.

  • dataset_args – See Dataset.__init__().
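
For example, a sketch that constructs a dataset from the /tmp/dataset file tree shown above; the metadata values are illustrative:

>>> from pathlib import Path
>>> dataset = FilesDataset(
...     name="benchmark://ds-v0",
...     description="An illustrative dataset of text files",
...     license="MIT",
...     site_data_base=Path("/tmp/site_data"),
...     dataset_root=Path("/tmp/dataset"),
...     benchmark_file_suffix=".txt",
... )
>>> dataset.size
3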

benchmark(uri: str) → compiler_gym.datasets.benchmark.Benchmark[source]

Select a benchmark.

Parameters

uri – The URI of the benchmark to return.

Returns

A Benchmark instance.

Raises

LookupError – If uri is not found.

benchmark_uris() → Iterable[str][source]

Enumerate the (possibly infinite) benchmark URIs.

Iteration order is consistent across runs. The order of benchmarks() and benchmark_uris() is the same.

If the number of benchmarks in the dataset is infinite (len(dataset) == math.inf), the iterable returned by this method will continue indefinitely.

Returns

An iterable sequence of benchmark URI strings.

property size: int

The number of benchmarks in the dataset.

If the number of benchmarks is unknown or unbounded, for example because the dataset represents a program generator that can produce an infinite number of programs, the value is 0.

Type

int

TarDataset

class compiler_gym.datasets.TarDataset(tar_urls: List[str], tar_sha256: Optional[str] = None, tar_compression: str = 'bz2', strip_prefix: str = '', **dataset_args)[source]

A dataset comprising a file tree stored in a tar archive.

This extends the FilesDataset class by adding support for compressed archives of files. The archive is downloaded and unpacked on-demand.

__init__(tar_urls: List[str], tar_sha256: Optional[str] = None, tar_compression: str = 'bz2', strip_prefix: str = '', **dataset_args)[source]

Constructor.

Parameters
  • tar_urls – A list of redundant URLs to download the tar archive from.

  • tar_sha256 – The SHA256 checksum of the downloaded tar archive.

  • tar_compression – The tar archive compression type. One of {“bz2”, “gz”}.

  • strip_prefix – An optional path prefix to strip. Only files that match this path prefix will be used as benchmarks.

  • dataset_args – See FilesDataset.__init__().
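
For example, a sketch of constructing an archived dataset; the name, URL, checksum, prefix, and suffix are illustrative:

>>> from pathlib import Path
>>> dataset = TarDataset(
...     name="benchmark://my-tar-v0",
...     description="An illustrative archived dataset",
...     license="MIT",
...     site_data_base=Path("/tmp/site_data"),
...     tar_urls=["https://example.com/benchmarks.tar.bz2"],
...     tar_sha256="0" * 64,
...     tar_compression="bz2",
...     strip_prefix="benchmarks-v0",
...     benchmark_file_suffix=".c",
... )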

install() → None[source]

Install this dataset locally.

Implementing this method is optional. If implementing this method, you must call super().install() first.

This method should not perform redundant work: it should first detect whether any work needs to be done so that repeated calls to install() complete quickly.

property installed: bool

Whether the dataset is installed locally. Installation occurs automatically on first use, or by calling install().

Type

bool

TarDatasetWithManifest

class compiler_gym.datasets.TarDatasetWithManifest(manifest_urls: List[str], manifest_sha256: str, manifest_compression: str = 'bz2', **dataset_args)[source]

A tarball-based dataset that reads the benchmark URIs from a separate manifest file.

A manifest file is a plain text file containing a list of benchmark names, one per line, and is shipped separately from the tar file. The idea is to allow the list of benchmark URIs to be enumerated in a more lightweight manner than downloading and unpacking the entire dataset. It does this by downloading and unpacking only the manifest to iterate over the URIs.

The manifest file is assumed to be correct and is not validated.

__init__(manifest_urls: List[str], manifest_sha256: str, manifest_compression: str = 'bz2', **dataset_args)[source]

Constructor.

Parameters
  • manifest_urls – A list of redundant URLs to download the compressed text file containing a list of benchmark URI suffixes, one per line.

  • manifest_sha256 – The sha256 checksum of the compressed manifest file.

  • manifest_compression – The manifest compression type. One of {“bz2”, “gz”}.

  • dataset_args – See TarDataset.__init__().

benchmark_uris() → Iterable[str][source]

Enumerate the (possibly infinite) benchmark URIs.

Iteration order is consistent across runs. The order of benchmarks() and benchmark_uris() is the same.

If the number of benchmarks in the dataset is infinite (len(dataset) == math.inf), the iterable returned by this method will continue indefinitely.

Returns

An iterable sequence of benchmark URI strings.

property size: int

The number of benchmarks in the dataset.

If the number of benchmarks is unknown or unbounded, for example because the dataset represents a program generator that can produce an infinite number of programs, the value is 0.

Type

int

Datasets

class compiler_gym.datasets.Datasets(datasets: Iterable[compiler_gym.datasets.dataset.Dataset])[source]

A collection of datasets.

This class provides a dictionary-like interface for indexing and iterating over multiple Dataset objects. Select a dataset by URI using:

>>> env.datasets["benchmark://cbench-v1"]

Check whether a dataset exists using:

>>> "benchmark://cbench-v1" in env.datasets
True

Or iterate over the datasets using:

>>> for dataset in env.datasets:
...     print(dataset.name)
benchmark://cbench-v1
benchmark://github-v0
benchmark://npb-v0

To select a benchmark from the datasets, use benchmark():

>>> env.datasets.benchmark("benchmark://a-v0/a")

Use the benchmarks() method to iterate over every benchmark in the datasets in a stable round robin order:

>>> for benchmark in env.datasets.benchmarks():
...     print(benchmark)
benchmark://cbench-v1/1
benchmark://github-v0/1
benchmark://npb-v0/1
benchmark://cbench-v1/2
...

If you want to exclude a dataset, delete it:

>>> del env.datasets["benchmark://b-v0"]
__len__() → int[source]

The number of datasets in the collection.

__getitem__(dataset: str) → compiler_gym.datasets.dataset.Dataset[source]

Lookup a dataset.

Parameters

dataset – A dataset name.

Returns

A Dataset instance.

Raises

LookupError – If dataset is not found.

__setitem__(key: str, dataset: compiler_gym.datasets.dataset.Dataset)[source]

Add a dataset to the collection.

Parameters
  • key – The name of the dataset.

  • dataset – The dataset to add.

__delitem__(dataset: str)[source]

Remove a dataset from the collection.

This does not affect any underlying storage used by the dataset. See uninstall() to clean up.

Parameters

dataset – The name of a dataset.

Returns

True if the dataset was removed, False if it was already removed.

__contains__(dataset: str) → bool[source]

Returns whether the dataset is contained.

__iter__() → Iterable[compiler_gym.datasets.dataset.Dataset][source]

Iterate over the datasets.

Dataset order is consistent across runs.

Equivalent to datasets.datasets(), but without the ability to iterate over the deprecated datasets.

If the number of benchmarks in any of the datasets is infinite (len(dataset) == math.inf), the iterable returned by this method will continue indefinitely.

Returns

An iterable sequence of Dataset instances.

benchmark(uri: str) → compiler_gym.datasets.benchmark.Benchmark[source]

Select a benchmark.

Returns the corresponding Benchmark, regardless of whether the containing dataset is installed or deprecated.

Parameters

uri – The URI of the benchmark to return.

Returns

A Benchmark instance.

benchmark_uris(with_deprecated: bool = False) → Iterable[str][source]

Enumerate the (possibly infinite) benchmark URIs.

Benchmark URI order is consistent across runs. URIs from datasets are returned in round robin order. The order of benchmarks() and benchmark_uris() is the same.

If the number of benchmarks in any of the datasets is infinite (len(dataset) == math.inf), the iterable returned by this method will continue indefinitely.

Parameters

with_deprecated – If True, include benchmarks from datasets that have been marked deprecated.

Returns

An iterable sequence of benchmark URI strings.

benchmarks(with_deprecated: bool = False) → Iterable[compiler_gym.datasets.benchmark.Benchmark][source]

Enumerate the (possibly infinite) benchmarks lazily.

Benchmark order is consistent across runs. One benchmark from each dataset is returned in round robin order until all datasets have been fully enumerated. The order of benchmarks() and benchmark_uris() is the same.

If the number of benchmarks in any of the datasets is infinite (len(dataset) == math.inf), the iterable returned by this method will continue indefinitely.

Parameters

with_deprecated – If True, include benchmarks from datasets that have been marked deprecated.

Returns

An iterable sequence of Benchmark instances.

dataset(dataset: str) → compiler_gym.datasets.dataset.Dataset[source]

Get a dataset.

Return the corresponding Dataset. Name lookup will succeed whether or not the dataset is deprecated.

Parameters

dataset – A dataset name.

Returns

A Dataset instance.

Raises

LookupError – If dataset is not found.

datasets(with_deprecated: bool = False) → Iterable[compiler_gym.datasets.dataset.Dataset][source]

Enumerate the datasets.

Dataset order is consistent across runs.

Parameters

with_deprecated – If True, include datasets that have been marked as deprecated.

Returns

An iterable sequence of Dataset instances.

random_benchmark(random_state: Optional[numpy.random._generator.Generator] = None) → compiler_gym.datasets.benchmark.Benchmark[source]

Select a benchmark randomly.

First, a dataset is selected uniformly randomly using random_state.choice(list(datasets)). The random_benchmark() method of that dataset is then called to select a benchmark.

Note that the distribution of benchmarks selected by this method is not biased by the size of each dataset, since datasets are selected uniformly. This means that datasets with a small number of benchmarks will be overrepresented compared to datasets with many benchmarks. To correct for this bias, use the number of benchmarks in each dataset as a weight for the random selection:

>>> rng = np.random.default_rng()
>>> finite_datasets = [d for d in env.datasets if len(d) != math.inf]
>>> weights = np.array([len(d) for d in finite_datasets], dtype=float)
>>> dataset = rng.choice(finite_datasets, p=weights / weights.sum())
>>> dataset.random_benchmark(random_state=rng)
Parameters

random_state – A random number generator. If not provided, a default np.random.default_rng() is used.

Returns

A Benchmark instance.