typed-worklist.scrbl - www - Unnamed repository; edit this file 'description' to name the repository.

typed-worklist.scrbl (6257B)
      1 #lang scribble/manual
      2 @(require scriblib/footnote
      3           (for-label typed-worklist
      4                      racket/base))
      5 
      6 @title{Typed worklist}
      7 @author[@author+email["Suzanne Soy" "racket@suzanne.soy"]]
      8 
      9 @defmodule[typed-worklist]
     10 
     11 @section{Overview}
     12 
     13 When traversing a graph, one might want to keep track of the already-processed
     14 nodes, and the ones which still have to be processed. The same goes when
     15 implementing algorithms for which work on an element may queue more elements.
     16 
     17 If there is only one type of element, then the @racketid[worklist] function's
     18 type is simple:
     19 
     20 @racketblock[
     21  (∀ (In Out) (→ (Listof In)
     22                 (→ In (Values Out (Listof In)))
     23                 (Listof Out)))]
     24 
     25 This package provides a @racketid[worklist] function (with a lightweight macro
     26 wrapper) which can handle @racketid[n] worklists, where each worklist holds
     27 tasks of different types. Furthermore, the worker for each element type may
     28 enqueue elements into arbitrary worklists.
     29 
     30 This macro should be useful when traversing heterogeneous graphs, or when
     31 implementing algorithms which use several waiting queues (worklists) holding
     32 elements of different types.
     33 
     34 @section{Intended use}
     35 
     36 This package is mainly intended to be used by the
     37 @racketmodname[phc-graph #:indirect] package@note{The development of
     38  @racketmodname[phc-graph #:indirect] is currently stalled, as I'm
     39  investigating solutions based on @racketmodname[turnstile #:indirect].}.
     40 
     41 The goal of @racketmodname[phc-graph #:indirect] is to implement generalised
     42 folds (catamorphisms) over heterogeneous graphs. Heterogeneous graphs can be
     43 conceptualised as trees where nodes can have different types (e.g. nodes of
     44 type @racketid[A] have children of type @racketid[B], which have children of
     45 type @racketid[C], which have children of type @racketid[A], and each node
     46 type can hold different metadata), generalised to allow backward arcs (so that
     47 the trees become true graphs). This sort of transformation can for example
     48 occur when transforming an abstract syntax tree within a pass of a compiler.
     49 
     50 A simple technique for traversing and processing such a data structure is to
     51 have a worklist, which remembers the already-visited nodes, as well as a queue
     52 of nodes that still need to be visited. A polymorphic
     53 @racketid[example-worklist1] function which can be used to handle two sets of
     54 processed/to-process nodes from a @emph{homogeneous} graph would have the
     55 following signature:
     56 
     57 @nested[
     58  #:style 'inset
     59  @defproc[#:link-target? #f
     60  (example-worklist1 [roots (Listof In)]
     61                     [proc (→ In (Values Out (Listof In)))])
     62  (Listof Out)]{
     63   Signature for a hypothetical function which takes an initial list of tasks of
     64   type @racketid[In], and a processing function which consumes one task,
     65   returning a result of type @racket[Out], and a list of new tasks.
     66 
     67   The hypothetical function would return the results obtained after processing
     68   each task, making sure that each task is processed only once (i.e. if a task
     69   is returned multiple times by processing functions, it is only added once to
     70   the queue).}]
     71 
     72 However, when the tasks to handle have heterogeneous types @racket[In₁ … Inₙ],
     73 the type of the @racketid[worklist] function is not simple to express. Using
     74 Typed Racket's variadic polymorphic types, also sometimes mentioned as
     75 "polydots" (of the form @racket[(∀ (A B X ...) τ)]) along with intersection
     76 types, it is possible to encode the type of @racket[worklist].
     77 
     78 Typed Racket is not (yet?) able to reliably enforce the fact that two variadic
     79 polymorphic type variables must have the same length. The trick is to wrap the
     80 @racket[Inᵢ] and @racket[Outᵢ] types with structs @racket[I] and @racket[O],
     81 which serve as type-level tags. Then, a single variadic type variable ranges
     82 over the union types @racket[(U (I Inᵢ) (O Outᵢ)) ...]. Finally, intersection
     83 types are used to project the @racket[I] or @racket[O] parts of the type
     84 variable.
     85 
     86 The implementation of the worklist function is not trivial, and calling it
     87 requires some amount of boilerplate to correctly instantiate its type, wrap
     88 the inputs with @racket[I], and extract the results out of the @racket[O]. The
     89 @racket[worklist] macro takes care of this boilerplate.
     90 
     91 @section{Theoretical interest}
     92 
     93 It is worth noting that the core of the @racket[worklist] implementation is a
     94 function. The macro does not generate the whole implementation, instead it
     95 merely generates a lightweight wrapper and takes care of the instantiation of
     96 the function's polymorphic type. When worklists with a large number of task
     97 types are used, this function-based implementation can (hopefully) reduce the
     98 typechecking time, compared to a macro-based implementation which would
     99 generate a large chunk of code needing to be typechecked. Also, the guarantees
    100 on the correctness of the code are better, since the function-based
    101 implementation is typechecked once and for all (the macro-generated wrapper
    102 could still fail to typecheck, but it is a smaller and simpler piece of code).
    103 
    104 Finally, it is a remarkable feat that Typed Racket is able to express the type
    105 of such a function (and is also able to typecheck its implementation!), as
    106 this problem would require at a first look some flavor of type-level
    107 computation, dependent types or similar advanced typing features.
    108 Comparatively, the mechanisms used here (variadic polymorphic variables and
    109 intersection types) are refreshingly simple.
    110 
    111 @section{The @racket[worklist] macro}
    112 
    113 @defform[(worklist roots [procᵢ …ₙ] [Inᵢ Outᵢ] …ₙ)
    114          #:contracts
    115          ([roots (List (Listof Inᵢ) …ₙ)]
    116           [procᵢ (→ Inᵢ (List Outᵢ (Listof Inᵢ) …ₙ))]
    117           [Inᵢ Type]
    118           [Outᵢ Type])]{
    119                         
    120  Executes the corresponding @racket[procᵢ] on each element of each worklist.
    121  The @racket[procᵢ] takes a value of type @racket[Inᵢ] and returns an output of
    122  type @racket[Outᵢ], as well as @racket[n] lists of new inputs which are added
    123  to the worklists.
    124 
    125  The worklists are initialised with the given @racket[roots].
    126 
    127  The whole expression has the following result type:
    128 
    129  @racketblock[(List (HashTable Inᵢ Outᵢ) …ₙ)]
    130 
    131  Within a worklist, duplicate elements are only processed once.}
	www Unnamed repository; edit this file 'description' to name the repository.
	Log \| Files \| Refs \| README