Parallelism (flowlog.c)

Flowlog can exploit multicore systems by running parts of Prolog search in parallel using pthreads.

Parallelism is controlled through ISO-compatible Prolog flags (set_prolog_flag/2).

By default, Flowlog runs with flowlog_parallel_profile=fast, which commits OR-parallel solutions unordered for throughput, so solutions may arrive in a different order than under sequential execution.

Overview

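Flowlog supports two kinds of parallelism: OR-parallelism, which explores the alternatives of a goal concurrently, and AND-parallelism, which runs conjunction sub-goals concurrently when that is safe.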

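The parallel profile determines ordering: iso commits solutions in classic sequential order, fast commits them unordered for maximum throughput, and off disables the parallel features and runs sequentially.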

Selecting a profile

Classic ordered enumeration (parallel, ordered commits):

?- set_prolog_flag(flowlog_parallel_profile, iso).

Maximum throughput (parallel, unordered commits):

?- set_prolog_flag(flowlog_parallel_profile, fast).

Disable Flowlog parallel features (sequential):

?- set_prolog_flag(flowlog_parallel_profile, off).

How it works (engine-level)

Flowlog provides three engines:

Parallelism is implemented by running independent solver work concurrently in multiple pthread workers:

For more detail on the execution model, see IMPLEMENTATION.md.

Prolog flags

Convenience flags

Fine-grained flags

Worker threads

Flowlog defaults to using the system’s online CPU count as its worker-thread count.
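As a rough illustration of that default, the sketch below reads the online CPU count with sysconf(_SC_NPROCESSORS_ONLN), a widely available extension on Linux, macOS and the BSDs. The function name resolve_worker_count and its requested parameter are hypothetical and stand in for whatever override is configured; this is not Flowlog's actual code.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: choose a worker-thread count.
   requested == 0 means "no override, use the online CPU count". */
static int resolve_worker_count(int requested)
{
    assert(requested >= 0);
    if (requested > 0)
        return requested;                    /* explicit override wins */

    long n = sysconf(_SC_NPROCESSORS_ONLN);  /* returns -1 on error */
    return n < 1 ? 1 : (int)n;               /* fall back to one worker */
}
```

Calling resolve_worker_count(0) yields the default, and any positive argument is returned unchanged.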

Override with:

Semantics and safety

OR-parallelism (choicepoints)

OR-parallelism explores alternatives concurrently: when a goal has many alternatives (e.g. many matching clauses, or a built-in that enumerates candidates), Flowlog can split the alternatives into “chunks” and explore them in parallel.
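The chunking idea can be sketched with plain pthreads. The example below is a simplified model, not Flowlog's engine: the candidates are the integers 0..n-1, alternative_succeeds is a hypothetical stand-in for "this alternative has a solution", and all names are made up for illustration. Each worker explores one chunk into a private buffer, and the results are then committed in chunk order, which reproduces the order a sequential enumeration would give.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MAX_CHUNKS 8

/* One chunk of alternatives: candidates lo..hi-1 and a private result buffer. */
struct chunk {
    int lo, hi;
    int *found;      /* solutions from this chunk, in candidate order */
    size_t nfound;
};

/* Hypothetical stand-in for "this alternative succeeds". */
static int alternative_succeeds(int x) { return x % 7 == 0; }

/* Worker body: explore one chunk without touching any other chunk's state. */
static void *explore_chunk(void *arg)
{
    struct chunk *c = arg;
    for (int x = c->lo; x < c->hi; x++)
        if (alternative_succeeds(x))
            c->found[c->nfound++] = x;
    return NULL;
}

/* Explore candidates 0..n-1 in nchunks concurrent chunks, then commit the
   results in chunk order. out needs room for n ints. Returns the number of
   solutions written to the front of out. */
static size_t or_par_ordered(int n, int nchunks, int *out)
{
    assert(n >= 0 && nchunks >= 1 && nchunks <= MAX_CHUNKS);
    struct chunk chunks[MAX_CHUNKS];
    pthread_t tids[MAX_CHUNKS];
    int started[MAX_CHUNKS];
    int per = (n + nchunks - 1) / nchunks;   /* chunk size, rounded up */

    for (int i = 0; i < nchunks; i++) {
        int lo = i * per < n ? i * per : n;
        int hi = lo + per < n ? lo + per : n;
        /* out + lo is private to this chunk: it holds at most hi - lo results. */
        chunks[i] = (struct chunk){ lo, hi, out + lo, 0 };
        started[i] = pthread_create(&tids[i], NULL, explore_chunk, &chunks[i]) == 0;
        if (!started[i])
            explore_chunk(&chunks[i]);       /* thread creation failed: run inline */
    }

    /* Ordered commit: wait for chunk i, then append its results. The write
       never reaches past out + hi of chunk i, so later chunks are unaffected. */
    size_t total = 0;
    for (int i = 0; i < nchunks; i++) {
        if (started[i])
            pthread_join(tids[i], NULL);
        memmove(out + total, chunks[i].found, chunks[i].nfound * sizeof(int));
        total += chunks[i].nfound;
    }
    return total;
}
```

An unordered variant would instead let each worker append to a shared, mutex-protected buffer as soon as it finds a solution, trading reproducible order for lower latency.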

Implementation note:

Ordering modes:

Cancellation:

AND-parallelism (safe subsets)

AND-parallelism is conservative: Flowlog only parallelizes conjunction sub-goals when it can preserve ISO-visible semantics.

In practice, this means:

You can introspect parts of Flowlog’s AND-par safety rules via the built-ins:

Interactions and caveats

Side effects and ordering

Parallel execution is most effective for pure code (no side effects and no reliance on solution order).

If your program depends on any of the following happening in a specific order, turn parallelism off (profile off) or use ordered mode (profile iso):

Tabling

Tabling changes the shape of search. While tabling is active, OR-parallelism is currently disabled to keep the table state single-threaded and predictable. Tabled predicates are also treated as AND-par unsafe.

Tuning

OR-par worker count and stack size

Internal OR-par runs on pthreads and may need tuning for deeply recursive programs or on systems with many cores:

By default, Flowlog auto-scales the OR-par pthread stack size based on detected physical memory and worker count.
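To make the idea of auto-scaling concrete, here is a minimal sketch of how a stack size could be derived from physical memory and worker count and then applied with pthread_attr_setstacksize. The heuristic (all stacks together use at most 1/8 of RAM, each clamped to 1 to 64 MiB) and every function name are assumptions chosen for illustration; Flowlog's real formula is not documented here. sysconf(_SC_PHYS_PAGES) is a common extension rather than strict POSIX.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* Detected physical memory in bytes, or 0 if it cannot be determined. */
static unsigned long long physical_memory_bytes(void)
{
    long pages = sysconf(_SC_PHYS_PAGES);
    long page = sysconf(_SC_PAGESIZE);
    if (pages < 1 || page < 1)
        return 0;
    return (unsigned long long)pages * (unsigned long long)page;
}

/* Hypothetical heuristic: all worker stacks together may use 1/8 of RAM,
   and each stack is clamped to the range 1 MiB .. 64 MiB. */
static size_t pick_stack_size(unsigned long long phys_bytes, int workers)
{
    assert(workers >= 1);
    const size_t min_sz = (size_t)1 << 20;
    const size_t max_sz = (size_t)64 << 20;
    unsigned long long per = phys_bytes / 8 / (unsigned long long)workers;
    if (per < min_sz) return min_sz;
    if (per > max_sz) return max_sz;
    return (size_t)per & ~(size_t)0xFFFF;    /* round down to 64 KiB */
}

/* Start a thread with an explicit stack size; returns 0 or an errno value. */
static int spawn_with_stack(pthread_t *tid, size_t stack_size,
                            void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_attr_setstacksize(&attr, stack_size);
    if (rc == 0)
        rc = pthread_create(tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    return rc;
}

/* Stand-in for a deeply recursive solver step; returns its argument. */
static long descend(long depth)
{
    volatile char pad[256];                  /* make each frame use some stack */
    pad[0] = (char)depth;
    return depth == 0 ? 0 : 1 + descend(depth - 1);
}

/* Worker body: recurse to *arg levels and store the depth reached. */
static void *deep_worker(void *arg)
{
    long *depth = arg;
    *depth = descend(*depth);
    return NULL;
}
```

More workers means a smaller per-thread stack under this heuristic, which is why deeply recursive programs on many-core machines are the case most likely to need manual tuning.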

Query-temporary allocation strategy

For some heavy OR-par workloads, allocating all query-temporary objects from a single global allocator can become a bottleneck.

Flowlog supports a local allocation mode:

This causes query-temporary terms and environments to prefer a process-local bump allocator, reducing allocator contention.
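For readers unfamiliar with the term, a bump allocator hands out memory by advancing an offset inside a private buffer and frees everything at once by resetting that offset. The sketch below shows the general technique only. The struct and function names, the 16-byte alignment, and the assumption that each arena has a single owner (so no lock is needed) are illustrative and not taken from Flowlog's source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ARENA_ALIGN 16   /* assumed alignment for every allocation offset */

/* A bump arena: a buffer plus the offset of the next free byte. */
struct bump_arena {
    unsigned char *base;
    size_t cap;
    size_t used;
};

/* Allocate the backing buffer; returns 0 on success, -1 on failure. */
static int arena_create(struct bump_arena *a, size_t cap)
{
    assert(a != NULL && cap > 0);
    a->base = malloc(cap);
    a->cap = cap;
    a->used = 0;
    return a->base ? 0 : -1;
}

/* Hand out size bytes by bumping the offset; NULL when the arena is full.
   No lock is taken: the arena is assumed to have exactly one owner. */
static void *arena_alloc(struct bump_arena *a, size_t size)
{
    size_t start = (a->used + ARENA_ALIGN - 1) & ~(size_t)(ARENA_ALIGN - 1);
    if (start > a->cap || size > a->cap - start)
        return NULL;
    a->used = start + size;
    return a->base + start;
}

/* Release every allocation at once, e.g. when a query finishes. */
static void arena_reset(struct bump_arena *a) { a->used = 0; }

/* Free the backing buffer. */
static void arena_destroy(struct bump_arena *a)
{
    free(a->base);
    a->base = NULL;
    a->cap = a->used = 0;
}
```

Because allocation is a few arithmetic operations on private state, it avoids the lock or atomic traffic that a shared global allocator incurs under many concurrent workers.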

Debugging parallel execution

OR-par can be debugged via environment variables:

These logs can help answer questions like: