par.ijs - Parallel Execution for J

Download par.ijs

A simple library for parallel map operations in J using threadpools.

Quick Start

load 'par.ijs'

NB. Sequential: verb"_1 array
*:"_1 i.1000

NB. Parallel: verb par array  (just add 'par')
*: par i.1000

NB. Works with any verb
myfunc =: 3 : 'expensive computation on y'
myfunc"_1 data   NB. sequential
myfunc par data NB. parallel

Installation

Copy par.ijs to your J addons directory or load directly:

load '/path/to/par.ijs'

API

par - Parallel Map (rank _1)

Applies a verb to each item of an array in parallel.

verb par array

Examples:

*: par i.100        NB. square each number
+/ par matrix       NB. sum each row
f par data          NB. apply f to each item
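
As a quick check, the parallel form matches the sequential rank _1 application:

(*: par i.100) -: *:"_1 i.100   NB. 1 - parallel and sequential results match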

parr - Parallel Map with Rank

Applies a verb at a specified rank in parallel.

verb parr rank array

Examples:

*: parr 0 i.100     NB. apply at rank 0 (atoms)
+/ parr 1 matrix    NB. apply at rank 1 (rows)
f parr 2 cube       NB. apply at rank 2 (matrices)
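
A small worked example - row sums of a 3 by 4 matrix come out the same either way:

m =: 3 4 $ i.12
+/"1 m         NB. sequential row sums: 6 22 38
+/ parr 1 m    NB. same values, one task per row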

Utility Functions

Function        Description
par_cores       Number of CPU cores detected
par_workers''   Current number of worker threads
par_init n      Initialize pool with n workers
par_status''    Pool stats: idle, pending, total
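
After load the pool is already populated (see the auto-initialization at the end of the source); a quick inspection might look like this, with values that depend on your machine:

par_cores        NB. e.g. 12 on the benchmark machine below
par_workers''    NB. e.g. 11 - auto-init creates one worker per core, minus one
par_status''     NB. idle, pending, total
par_init 2       NB. creates two additional worker threads (each call adds threads)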

How It Works

par is defined as:

par =: 1 : '>@(u t. par_POOL "_1)'

This:
1. Uses t. (the task conjunction) to distribute work across a threadpool
2. Operates at rank _1 (the items of the array)
3. Unboxes the results with >

The library uses threadpool 1 (separate from pool 0 used by J primitives) to avoid interference with J’s internal threading.
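
Written out without the adverb, and with the pool number inlined as 1 (the library default), the same computation is:

(*: par i.6) -: > (*: t. 1)"_1 i.6   NB. 1 - the expanded form gives identical results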

Performance Characteristics

Parallelization adds overhead for thread coordination; whether it pays off depends on how much work each item does relative to that overhead.

When to Use par

Good candidates (see the sketch after this list):
- CPU-intensive computations per item
- Independent calculations (no shared state)
- Medium to large arrays with non-trivial work

Poor candidates:
- Simple arithmetic on large arrays (overhead > benefit)
- Small arrays (not enough work to distribute)
- I/O-bound operations
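
A concrete sketch of a good candidate - a deliberately naive trial-division primality check (isprime is a hypothetical example verb, not part of the library):

isprime =: 3 : '(y > 1) *. *./ 0 ~: (2 + i. 0 >. <: <. %: y) | y'
nums =: 2 + i. 20
isprime"0 nums     NB. sequential: 1 1 0 1 0 1 0 0 0 1 0 1 0 0 0 1 0 1 0 0
isprime par nums   NB. same booleans, one task per number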

Benchmarks

System: 12 cores, 11 workers (FreeBSD)

Light: Square 100,000 numbers

Sequential: 0.00029s
Parallel:   0.392s
Speedup:    0.0008x

Trivial operations run faster sequentially - parallelization overhead dominates.

Medium: Prime check on 1,000 numbers

Sequential: 0.281s
Parallel:   0.068s
Speedup:    4.1x

Good speedup - enough computation per item to overcome overhead.

Heavy: Count primes for 100 values

Sequential: 14.9s
Parallel:   2.5s
Speedup:    6.0x

Excellent speedup - significant work per item distributes well.
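
The benchmark code itself is not included in the README; a rough reproduction of this shape, timed with 6!:2 and building on the isprime sketch above (countp and vals are hypothetical names):

countp =: 3 : '+/ isprime"0 ] 2 + i. y'   NB. count primes among 2..y+1, naively
vals =: 10000 + 100 * i. 100              NB. 100 independent, CPU-heavy items
6!:2 'countp"0 vals'                      NB. sequential wall-clock seconds
6!:2 'countp par vals'                    NB. parallel wall-clock seconds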

Very Heavy: Fibonacci(20-39)

Sequential: 318.4s
Parallel:   356.4s
Speedup:    0.89x

No speedup - only 20 items and highly uneven work per item (fib(39) >> fib(20)), so a few large tasks dominate the wall-clock time.
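
The kind of verb presumably being measured is a naive doubly recursive Fibonacci; a hypothetical sketch:

fib =: 3 : 'if. y < 2 do. y else. (fib y - 1) + fib y - 2 end.'
fib par 20 + i. 20   NB. 20 tasks, but fib 39 costs thousands of times more than fib 20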

Source

NB. par.ijs - Parallel execution utilities for J

NB. Configuration
par_POOL_z_ =: 1             NB. threadpool 1, separate from pool 0 used by J primitives
par_cores_z_ =: {. 8 T. ''   NB. number of CPU cores detected

NB. Thread pool management
par_init_z_ =: 3 : 0
  for_i. i. y do. 0 T. < par_POOL end.   NB. create y worker threads in the pool
  par_workers''
)
par_workers_z_ =: 3 : '{: 2 T. par_POOL'   NB. last status entry: total worker threads
par_status_z_ =: 3 : '2 T. par_POOL'       NB. idle, pending, total

NB. Parallel adverbs
par_z_ =: 1 : '>@(u t. par_POOL "_1)'
parr_z_ =: 2 : '>@(u t. par_POOL " n)'

NB. Auto-initialize
par_init par_cores - 1   NB. start the pool with one worker per core, minus one

License

Public domain.