I posted the below set of Algol68 Linear Algebra Operators sometime back, code could do with an update to handle sparse matrices... but as it stands it is a cute example of how you can use those extra CPUs and FPUs on your modern laptop... The next step (ignoring stability) might be to augment with Strassen's O(n^log2(7)) recursive matrix multiplication algorithm, but again using Algol68's parallel processing... O(n^log2(7))/n(cpus) ... Enjoy!
int default upb := 3; mode field = long real; mode vector = [default upb]field; mode matrix = [default upb, default upb]field; ¢ crude exception handling ¢ proc void raise index error := void: goto exception index error; sema idle cpus = level ( 8 - 1 ); ¢ 8 = number of CPU cores minus parent CPU ¢ ¢ define a operators to slice array into quarters ¢ op top = (matrix m)int: ( ⌊m + ⌈m ) %2, bot = (matrix m)int: top m + 1, left = (matrix m)int: ( 2 ⌊m + 2 ⌈m ) %2, right = (matrix m)int: left m + 1, left = (vector v)int: ( ⌊v + ⌈v ) %2, right = (vector v)int: left v + 1; prio top = 8, bot = 8, left = 8, right = 8; ¢ Operator priority - same as LWB & UPB ¢ op × = (vector a, b)field: ( ¢ dot product ¢ if (⌊a, ⌈a) ≠ (⌊b, ⌈b) then raise index error fi; if ⌊a = ⌈a then a[⌈a] × b[⌈b] else field begin, end; []proc void schedule=( ¢ of threads ¢ void: begin:=a[:left a] × b[:left b], void: end :=a[right a:] × b[right b:] ); if level idle cpus = 0 then ¢ use current CPU ¢ for thread to ⌈schedule do schedule[thread] od else par ( ¢ run vector in parallel ¢ schedule[1], ¢ assume parent CPU ¢ ( ↓idle cpus; schedule[2]; ↑idle cpus) ) fi; begin+end fi ); op × = (matrix a, b)matrix: ¢ matrix multiply ¢ if (⌊a, 2 ⌊b) = (⌈a, 2 ⌈b) then a[⌊a, ] × b[, 2 ⌈b] ¢ dot product ¢ else [⌈a, 2 ⌈b] field out; if (2 ⌊a, 2 ⌈a) ≠ (⌊b, ⌈b) then raise index error fi; []struct(bool required, proc void thread) schedule = ( ¢ of threads ¢ ( true, ¢ calculate top left corner ¢ void: out[:top a, :left b] := a[:top a, ] × b[, :left b]), ( ⌊a ≠ ⌈a, ¢ calculate bottom left corner ¢ void: out[bot a:, :left b] := a[bot a:, ] × b[, :left b]), ( 2 ⌊b ≠ 2 ⌈b, ¢ calculate top right corner ¢ void: out[:top a, right b:] := a[:top a, ] × b[, right b:]), ( (⌊a, 2 ⌊b) ≠ (⌈a, 2 ⌈b) , ¢ calculate bottom right corner ¢ void: out[bot a:, right b:] := a[bot a:, ] × b[, right b:]) ); if level idle cpus = 0 then ¢ use current CPU ¢ for thread to ⌈schedule do (required →schedule[thread] | thread →schedule[thread] ) od else par ( ¢ run vector in parallel ¢ thread →schedule[1], ¢ thread is always required, and assume parent CPU ¢ ( required →schedule[4] | ↓idle cpus; thread →schedule[4]; ↑idle cpus), ¢ try to do opposite corners of matrix in parallel if CPUs are limited ¢ ( required →schedule[3] | ↓idle cpus; thread →schedule[3]; ↑idle cpus), ( required →schedule[2] | ↓idle cpus; thread →schedule[2]; ↑idle cpus) ) fi; out fi; format real fmt = $g(-6,2)$; ¢ width of 6, with no '+' sign, 2 decimals ¢ proc real matrix printf= (format real fmt, matrix m)void:( format vector fmt = $"("n(2 ⌈m-1)(f(real fmt)",")f(real fmt)")"$; format matrix fmt = $x"("n(⌈m-1)(f(vector fmt)","lxx)f(vector fmt)");"$; ¢ finally print the result ¢ printf((matrix fmt,m)) ); ¢ Some sample matrices to test ¢ matrix a=((1, 1, 1, 1), ¢ matrix A ¢ (2, 4, 8, 16), (3, 9, 27, 81), (4, 16, 64, 256)); matrix b=(( 4 , -3 , 4/3, -1/4 ), ¢ matrix B ¢ (-13/3, 19/4, -7/3, 11/24), ( 3/2, -2 , 7/6, -1/4 ), ( -1/6, 1/4, -1/6, 1/24)); matrix c = a × b; ¢ actual multiplication example of A x B ¢ print((" A x B = Product of a and b",new line)); real matrix printf(real fmt, c). exception index error: putf(stand error, $x"Exception: index error."l$)
- Output:
(( 1.00, -0.00, -0.00, -0.00),
( -0.00, 1.00, -0.00, -0.00),
( -0.00, -0.00, 1.00, -0.00),
( -0.00, -0.00, -0.00, 1.00));