commit b6048fa86cc0150f2e532d422f564f06a76b7f62
parent fe081f9ddf428977d43a6ba1551cfb8271e8c7a7
Author: Georges Dupéron <georges.duperon@gmail.com>
Date: Tue, 5 Dec 2017 00:22:55 +0100
Docs explaing the use case, and the (function-based) implementation
Diffstat:
2 files changed, 78 insertions(+), 1 deletion(-)
diff --git a/main.rkt b/main.rkt
@@ -108,6 +108,9 @@
; (error "NIY"))
(define-syntax-rule (inst-worklist (In Out) ...)
+ ;; The unsafe-cast seems unnecessary on recent Typed Racket versions, and can
+ ;; be replaced with "ann". We should use a version check to use it only for
+ ;; backward-compatibility on older versions.
(unsafe-cast
(inst worklist-function
(U (I In) (O Out))
diff --git a/scribblings/typed-worklist.scrbl b/scribblings/typed-worklist.scrbl
@@ -1,5 +1,6 @@
#lang scribble/manual
-@(require (for-label typed-worklist
+@(require scriblib/footnote
+ (for-label typed-worklist
racket/base))
@title{Typed worklist}
@@ -7,6 +8,79 @@
@defmodule[typed-worklist]
+This package is mainly intended to be used by the
+@racketmodname[phc-graph #:indirect] package@note{the development of
+ @racketmodname[phc-graph #:indirect] is currently stalled, as I'm
+ investigating solutions based on @racketmodname[turnstile #:indirect].}.
+
+The goal of @racketmodname[phc-graph #:indirect] is to implement generalised
+folds (catamorphisms) over heterogeneous graphs. Heterogeneous graphs can be
+conceptualised as a trees where nodes can have different types (e.g. nodes of
+type @racketid[A] have children of type @racketid[B], which have children of
+type @racketid[C], which have children of type @racketid[A], and each node
+type can hold different metadata), generalised to allow backward arcs (so that
+the trees become true graphs).
+
+A simple technique for traversing and processing such a data structure is to
+have a worklist, which remembers the already-visited nodes, as well as a queue
+of nodes that still need to be visited. A polymorphic
+@racketid[example-worklist1] function which can be used to handle two sets of
+processed/to-process nodes from a @emph{homogeneous} graph would have the
+following signature:
+
+@nested[#:style 'inset]{
+ @defproc[(example-worklist1 [roots (Listof In)]
+ [proc (→ In (Values Out (Listof In)))])
+ (Listof Out)
+ #:link-target? #f]{
+
+ Signature for a hypothetical function which takes an initial list of tasks of
+ type @racketid[In], and a processing function which consumes one task,
+ returning a result of type @racket[Out], and a list of new tasks.
+
+ The hypothetical function would return the results obtained after processing
+ each task, making sure that each task is processed only once (i.e. if a task
+ is returned multiple times by processing functions, it is only added once to
+ the queue).}
+}
+
+However, when the tasks to handle have heterogeneous types @racket[In₁ … Inₙ],
+the type of the @racketid[worklist] function is not simple to express. Using
+Typed Racket's variadic polymorphic types, also sometimes mentioned as
+"polydots" (of the form @racket[(∀ (A B X ...) τ)]) along with intersection
+types, it is possible to encode the type of @racket[worklist].
+
+Typed Racket is not (yet?) able to reliably enfoce the fact that two variadic
+polymorphic type variables must have the same length. The trick is to wrap the
+@racket[Inᵢ] and @racket[Outᵢ] types with structs @racket[I] and @racket[O],
+which serve as type-level tags. Then, a single variadic type variable ranges
+over the union types @racket[(U (I Inᵢ) (O Outᵢ)) ...]. Finally, intersection
+types are used to project the @racket[I] or @racket[O] parts of the type
+variable.
+
+The implementation of the worklist function is not trivial, and calling it
+requires some amount of boilerplate to correctly instantiate its type, wrap
+the inputs with @racket[I], and extract the results out of the @racket[O]. The
+@racket[worklist] macro takes care of this boilerplate.
+
+It is worth noting that the core of the @racket[worklist] implementation is a
+function. The macro does not generate the whole implementation, instead it
+merely generates a lightweight wrapper and takes care of the instantiation of
+the function's polymorphic type. When worklists with a large number of task
+types are used, this function-based implementation can (hopefully) reduce the
+typechecking time, compared to a macro-based implementation which would
+generate a large chunk of code needing to be typechecked. Also, the guarantees
+on the correctness of the code are better, since the function-based
+implementation is typechecked once and for all (the macro-generated wrapper
+could still fail to typecheck, but it is a smaller and simpler piece of code).
+
+Finally, it is a remarkable feat that Typed Racket is able to express the type
+of such a function (and is also able to typecheck its implementation!), as
+this problem would require at first look some flavor of type-level
+computation, dependent types or similar advanced typing features.
+Comparatively, the mechanisms used here (variadic polymorphic variables and
+intersection types) are refreshingly simple.
+
@defform[(worklist roots [procᵢ …ₙ] [Inᵢ Outᵢ] …ₙ)
#:contracts
([roots (List (Listof Inᵢ) …ₙ)]