Runtime Overview¤

Overview¤

A typical runtime consists of the following parts:

Compiled¤

The Compiled class is responsible for initializing and managing a device.

Compiled ¤

Compiled(
    device: str,
    allocator: Allocator,
    renderer: Optional[Renderer],
    compiler: Optional[Compiler],
    runtime,
    graph=None,
)

Methods:

  • synchronize

    Synchronize all pending operations on the device.

synchronize ¤

synchronize()

Synchronize all pending operations on the device.

This method ensures that all previously queued operations on the device have been completed before proceeding.
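
The blocking behavior described above can be illustrated with a toy model. This is a minimal sketch, not the library's implementation: ToyDevice, enqueue, and pending are hypothetical names invented for the example, and the "device" is simply a Python queue.

```python
from collections import deque
from typing import Callable, Deque, List


class ToyDevice:
    """Hypothetical device that queues work and only runs it on synchronize()."""

    def __init__(self) -> None:
        self.pending: Deque[Callable[[], None]] = deque()

    def enqueue(self, op: Callable[[], None]) -> None:
        # Queue the operation; nothing runs yet (mimics asynchronous dispatch).
        self.pending.append(op)

    def synchronize(self) -> None:
        # Return only after every previously queued operation has completed.
        while self.pending:
            self.pending.popleft()()


results: List[int] = []
dev = ToyDevice()
dev.enqueue(lambda: results.append(1))
dev.enqueue(lambda: results.append(2))
snapshot_before = list(results)  # still empty: the queued work has not run
dev.synchronize()                # after this call, results holds both values
```

A caller that reads results before synchronize() sees stale data, which is why the call belongs in front of any host-side read or timing measurement.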

Allocator¤

The Allocator class is responsible for managing memory on the device. A subclass, LRUAllocator, caches buffers so that they can be reused instead of being allocated again.

Allocator ¤

Methods:

_alloc ¤

_alloc(size: int, options: BufferOptions)

_free ¤

_free(opaque, options: BufferOptions)

alloc ¤

alloc(size: int, options: Optional[BufferOptions] = None)

copyin ¤

copyin(dest, src: memoryview)

copyout ¤

copyout(dest: memoryview, src)

free ¤

free(
    opaque,
    size: int,
    options: Optional[BufferOptions] = None,
)
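
To show how the methods listed above could fit together, here is a minimal host-memory sketch. It assumes that the public alloc and free delegate to the backend hooks _alloc and _free, which the names suggest but the listing does not state. ToyAllocator and the host field on BufferOptions are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BufferOptions:
    """Hypothetical stand-in for the real BufferOptions; the field is invented."""
    host: bool = False


class ToyAllocator:
    """Host-memory allocator that mirrors the method names listed above."""

    def alloc(self, size: int, options: Optional[BufferOptions] = None):
        assert size > 0, "alloc size must be positive"
        return self._alloc(size, options if options is not None else BufferOptions())

    def free(self, opaque, size: int, options: Optional[BufferOptions] = None):
        self._free(opaque, options if options is not None else BufferOptions())

    def _alloc(self, size: int, options: BufferOptions):
        return bytearray(size)  # the opaque handle is just a bytearray here

    def _free(self, opaque, options: BufferOptions):
        pass  # nothing to release: Python's garbage collector owns the bytearray

    def copyin(self, dest, src: memoryview):
        dest[:src.nbytes] = src.tobytes()  # host data -> "device" buffer

    def copyout(self, dest: memoryview, src):
        dest[:] = bytes(src[:dest.nbytes])  # "device" buffer -> host data


allocator = ToyAllocator()
buf = allocator.alloc(4)
allocator.copyin(buf, memoryview(b"\x01\x02\x03\x04"))
out = bytearray(4)
allocator.copyout(memoryview(out), buf)
allocator.free(buf, 4)
```

In this sketch, a backend only has to supply the four low-level methods; argument checking and default options live in the shared alloc and free wrappers.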

LRUAllocator ¤

LRUAllocator()

Bases: Allocator

The LRUAllocator is responsible for caching buffers. It ensures that buffers are not freed until absolutely necessary, so that later allocations with the same size and options can reuse them.

cache instance-attribute ¤

cache: Dict[Tuple[int, Optional[BufferOptions]], Any] = (
    defaultdict(list)
)

alloc ¤

alloc(size: int, options: Optional[BufferOptions] = None)

free ¤

free(
    opaque: Any,
    size: int,
    options: Optional[BufferOptions] = None,
)

free_cache ¤

free_cache()
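
The caching idea can be sketched as follows. This is an illustrative model under stated assumptions, not the library's code: ToyLRUAllocator, the real_allocs and real_frees counters, and the use of a string for options are all hypothetical. Only the cache layout, a dictionary of lists keyed by (size, options), follows the attribute shown above.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional, Tuple


class ToyLRUAllocator:
    """Caches freed buffers by (size, options) and reuses them on alloc."""

    def __init__(self) -> None:
        self.cache: Dict[Tuple[int, Optional[str]], List[Any]] = defaultdict(list)
        self.real_allocs = 0  # how often the underlying allocator was used
        self.real_frees = 0   # how often a buffer was really released

    def _alloc(self, size: int, options: Optional[str]) -> bytearray:
        self.real_allocs += 1
        return bytearray(size)

    def _free(self, opaque: Any, options: Optional[str]) -> None:
        self.real_frees += 1

    def alloc(self, size: int, options: Optional[str] = None) -> Any:
        cached = self.cache[(size, options)]
        if cached:
            return cached.pop()  # reuse a previously freed buffer
        return self._alloc(size, options)

    def free(self, opaque: Any, size: int, options: Optional[str] = None) -> None:
        # Keep the buffer for later reuse instead of releasing it.
        self.cache[(size, options)].append(opaque)

    def free_cache(self) -> None:
        # Really release everything that is currently cached.
        for (_, options), bufs in self.cache.items():
            for buf in bufs:
                self._free(buf, options)
        self.cache.clear()


lru = ToyLRUAllocator()
first = lru.alloc(1024)
lru.free(first, 1024)     # goes into the cache, not back to the device
second = lru.alloc(1024)  # served from the cache: the same object as first
lru.free(second, 1024)
lru.free_cache()          # only now is the buffer actually released
```

Keying the cache on both size and options means a cached buffer is only handed out for a request it matches exactly.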

Program¤

The Program class is created for each loaded program. It is responsible for loading the compiled program and executing it on the device. As an example, here is a ClangProgram implementation that loads a program and runs it.

ClangProgram ¤

ClangProgram(name: str, lib: bytes)

Source code in tinygrad/runtime/ops_clang.py
def __init__(self, name:str, lib:bytes):
  if DEBUG >= 6: cpu_objdump(lib)
  self.name, self.lib = name, lib
  # write to disk so we can load it
  with tempfile.NamedTemporaryFile(delete=True) as cached_file_path:
    pathlib.Path(cached_file_path.name).write_bytes(lib)
    self.fxn = ctypes.CDLL(str(cached_file_path.name))[name]

fxn instance-attribute ¤

fxn = ctypes.CDLL(str(cached_file_path.name))[name]

__call__ ¤

__call__(*bufs, vals=(), wait=False)
Source code in tinygrad/runtime/ops_clang.py
def __call__(self, *bufs, vals=(), wait=False): return cpu_time_execution(lambda: self.fxn(*bufs, *vals), enable=wait)
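
The calling convention of __call__ (positional buffers, then vals, with wait controlling timing) can be tried without a compiled library. The sketch below is an assumption-laden model: ToyProgram and add_scalar are hypothetical, and time_execution is a stand-in for cpu_time_execution that assumes the helper returns the elapsed time only when enabled.

```python
import time
from typing import Callable, Optional


def time_execution(cb: Callable[[], object], enable: bool) -> Optional[float]:
    """Hypothetical stand-in for cpu_time_execution.

    Runs cb and returns the elapsed seconds if enable is true, else None.
    """
    start = time.perf_counter()
    cb()
    return time.perf_counter() - start if enable else None


class ToyProgram:
    """Wraps a plain Python callable instead of a function from a loaded library."""

    def __init__(self, name: str, fxn: Callable[..., object]) -> None:
        self.name, self.fxn = name, fxn

    def __call__(self, *bufs, vals=(), wait=False):
        # Buffers come first, then scalar values, as in the signature above.
        return time_execution(lambda: self.fxn(*bufs, *vals), enable=wait)


def add_scalar(out: list, inp: list, k: int) -> None:
    # Toy "kernel": writes inp[i] + k into out[i].
    for i, x in enumerate(inp):
        out[i] = x + k


out = [0, 0, 0]
prog = ToyProgram("add_scalar", add_scalar)
elapsed = prog(out, [1, 2, 3], vals=(10,), wait=True)  # out is now [11, 12, 13]
```

With wait=False the call in this sketch still runs the kernel but returns None, so callers that do not need timings can ignore the return value.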

Compiler¤

The Compiler class compiles the source code produced by the Renderer into a device-specific format.

Compiler ¤

Compiler(cachekey: Optional[str] = None)

Source code in tinygrad/device.py
def __init__(self, cachekey:Optional[str]=None): self.cachekey = None if getenv("DISABLE_COMPILER_CACHE") else cachekey

cachekey instance-attribute ¤

cachekey = (
    None if getenv("DISABLE_COMPILER_CACHE") else cachekey
)

compile ¤

compile(src: str) -> bytes
Source code in tinygrad/device.py
def compile(self, src:str) -> bytes: return src.encode()   # NOTE: empty compiler is the default

compile_cached ¤

compile_cached(src: str) -> bytes
Source code in tinygrad/device.py
def compile_cached(self, src:str) -> bytes:
  if self.cachekey is None or (lib := diskcache_get(self.cachekey, src)) is None:
    assert not getenv("ASSERT_COMPILE"), f"tried to compile with ASSERT_COMPILE set\n{src}"
    lib = self.compile(src)
    if self.cachekey is not None: diskcache_put(self.cachekey, src, lib)
  return lib
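
The control flow of compile_cached can be exercised in isolation with an in-memory cache. This is a sketch under stated assumptions: cache_get and cache_put are hypothetical stand-ins for diskcache_get and diskcache_put (a dictionary replaces the disk cache), ToyCompiler and compile_calls are invented for the example, and the environment-variable checks are omitted.

```python
from typing import Dict, Optional, Tuple

# In-memory stand-in for the disk cache, keyed by (cachekey, source).
_cache: Dict[Tuple[str, str], bytes] = {}


def cache_get(table: str, key: str) -> Optional[bytes]:
    return _cache.get((table, key))


def cache_put(table: str, key: str, value: bytes) -> None:
    _cache[(table, key)] = value


class ToyCompiler:
    def __init__(self, cachekey: Optional[str] = None) -> None:
        self.cachekey = cachekey
        self.compile_calls = 0  # counts real compilations, for illustration

    def compile(self, src: str) -> bytes:
        self.compile_calls += 1
        return src.encode()  # same behavior as the default compile shown above

    def compile_cached(self, src: str) -> bytes:
        # Compile only when caching is disabled or the source is not cached yet.
        if self.cachekey is None or (lib := cache_get(self.cachekey, src)) is None:
            lib = self.compile(src)
            if self.cachekey is not None:
                cache_put(self.cachekey, src, lib)
        return lib


cached = ToyCompiler(cachekey="toy_compile")
lib_a = cached.compile_cached("int main() { return 0; }")
lib_b = cached.compile_cached("int main() { return 0; }")  # served from the cache

uncached = ToyCompiler(cachekey=None)
uncached.compile_cached("int main() { return 0; }")
uncached.compile_cached("int main() { return 0; }")  # compiled again
```

A compiler with a cachekey compiles each distinct source once, while one without a cachekey (as happens above when DISABLE_COMPILER_CACHE is set) recompiles on every call.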