php/Math   
Recreational Mathematics   
   home  |  library  |  contact
 Math Notes
 Math Programming [25]
 Regression [3]
 Data Mining [17]
 Notation [6]
 Linear Algebra [9]
 Stats & Prob [15]
 Math Cognition [5]
 Space & Physics [6]
 Formulas [5]
 Fun & Games [2]
 Haskell [1]
 Bayes Theory [1]
 Site News [0]
 Math Projects [5]
 Polynomials [1]
 Calculus [9]
 Number Theory [3]
 Optimization [2]
 Financial [1]

 Math Links
 PHP/ir
 Andrew Gelman
 Chance Wiki
 Daniel Lemire
 KD Knuggets
 Social Stats
 MySQL Performance
 Hunch.net
 Matthew Hurst
 JMLR
 JSS
 Hal Daume III
 Math Notes >> Math Programming

Flexible computation of a sum [Math Programming
Posted on February 6, 2012 @ 11:49:32 AM by Paul Meagher

I re-implemented Matlab/Octave's sum function in PHP. The Matlab/Octave sum function is more flexible than PHP's array_sum function as it can handle scalar, vector, or matrix input. In the case of matrices, the sum function can also be directed to sum by column or row. The default is to sum a matrix by column which is handy for getting a bunch of column sums from a data matrix.

The way that the sum function works is similar to the way an ave or stdev function would work in Matlab/Octave. Eventually it would be nice to have all these data summarization functions available and working in a Matlab/Octave like way.

Without futher ado, here is the code that implements the sum function. Note that it reuses the get_type() function mentioned in previous blogs:

<?php
/**
* Partial implementation of matlab sum function.  Does not 
* handle arrays with dimension greater than 2.
*
* @see http://www.mathworks.com/help/techdoc/ref/sum.html
*/

/**
* Determines whether a variable is a scalar, vector, or matrix.

* @params mixed $var Scalar, vector, or matrix data.
* @return string scalar, vector, matrix or false 
*/
function get_type($var) {  
  if (
is_numeric($var)) {
    return 
"scalar";
  } elseif (
is_array($var)) {
    if (
is_array($var[0])) 
      return 
"matrix";
    else 
      return 
"vector";
  } else 
    return 
false;
}

/**
* Computes the row or column sums.  The default is to compute
* column sums and to store each column sum into a row vector.
*
* @param array $a A one to two dimensional array of numeric values.
* @param integer $dim The dimension to sum (1=column sums, 2=row sums).
* @return array/string $sum A row ($dim=1) or column ($dim=2) vector of sums.
*/
function sum($a$dim=1) {  
  
$type get_type($a);  
  if (
$type=="scalar")
    return array(
$a);
  elseif (
$type=="vector"
    return array(
array_sum($a));
  else {
    
// zero the sums array
    
$sums = array();
    
// num rows
    
$nrows count($a);        
    
// num cols
    
$ncols count($a[0]);         
    if (
$dim == 1) {
      
// compute sum of matrix columns
      
for ($j=0$j $ncols$j++) {
        
$sums[$j] = 0.0;
        for(
$i=0$i $nrows$i++) 
          
$sums[$j] += $a[$i][$j];
      }
      return 
$sums;
    } else {
      
// compute sum of matrix rows
      
for ($i=0$i $nrows$i++) 
        
$sums[$i][0] += array_sum($a[$i]);
      return 
$sums;
    }      
  }
}


$s 5;
$v = array(1,2,3);
$m = array(array(1,2,3),array(1,2,3),array(1,2,3));

// scalar case
$ans =  sum($s);
echo 
"<pre>";
print_r($ans);
echo 
"</pre>";
echo 
"<br />";

// row vector case
$ans =  sum($v);
echo 
"<pre>";
print_r($ans);
echo 
"</pre>";
echo 
"<br />";

// matrix case - sum cols (default)
$ans =  sum($m);
echo 
"<pre>";
print_r($ans);
echo 
"</pre>";
echo 
"<br />";

// matrix case - sum rows
$ans =  sum($m2);
echo 
"<pre>";
print_r($ans);
echo 
"</pre>";
echo 
"<br />";

// compute sum of all elements
$ans =  sum(sum($m));
echo 
"<pre>";
print_r($ans);
echo 
"</pre>";
echo 
"<br />";

?>

The output of this script looks like this:

Array
(
    [0] => 5
)


Array
(
    [0] => 6
)


Array
(
    [0] => 3
    [1] => 6
    [2] => 9
)


Array
(
    [0] => Array
        (
            [0] => 6
        )

    [1] => Array
        (
            [0] => 6
        )

    [2] => Array
        (
            [0] => 6
        )

)


Array
(
    [0] => 18
)

One important point to note is that when we sum a single scalar value like sum(5) the result is an array and not a scalar value. I suspect that it is of fundamental importance to matrix-oriented languages like Matlab/Octave that results be wrapped in an array structure so that we can string together matrix operations in a conistent manner.

Permalink 

Elementwise operations [Math Programming
Posted on January 28, 2012 @ 06:33:46 AM by Paul Meagher

In my last post, I discussed how basic arithmetic operations might be vectorized using PHP. I wasn't very satisfied with the implemention. One issue was that I was duplicating alot of code and felt that the vectorization code could be reused across vectorized arithmetic operations better. Another issue was that I didn't study how matlab/octave elementwise operations worked in detail so didn't know what the allowable argument types were (scalar, vector, matrix) and what combination of argument types was allowed on both sides of the arithmetic operation. I'm happy to report that I've addressed both of these issues in the new implementation. While I could probably add more error handling, I think the code below could be used to easily vectorize many operations in PHP such as exp, sin, cos, pow, etc...

<?php
/**
* elementwise_operations.php
*
* Functions that mimic matlab/octave's elementwise operations (e.g., .+, .-, .*, ./):
*
*  .+  =  madd  =  element-by-element addition
*  .-  =  msub  =  element-by-element subtraction
*  .*  =  mmul  =  element-by-element multiplication
*  ./  =  mdiv  =  element-by-element division
*/

/**
* Determines whether a variable is a scalar, vector, or matrix
*/
function get_type($var) {  
  if (
is_numeric($var)) {
    return 
"scalar";
  } elseif (
is_array($var)) {
    if (
is_array($var[0])) 
      return 
"matrix";
    else 
      return 
"vector";
  } else 
    return 
false;
}

/**
* Function that vectorizes the elementary operations.
*/
function vectorize($a$b$op) {
  
  
$atype get_type($a);  
  
$btype get_type($b);

  if ((
$atype != false) AND ($btype != false))
    
$arguments $atype."-".$btype;
  else 
    die(
"Argument is not a scalar, vector or matrix.");
  
  switch (
$arguments) {
    
    case 
"scalar-scalar"
      return 
$op($a$b);
      break;

    case 
"vector-scalar"
      
$asize count($a);
      
$b     array_fill(0$asize$b);
      return 
array_map($op$a$b);      
      break;

    case 
"vector-vector":
      
$asize count($a);
      
$bsize count($b);      
      if (
$asize == $bsize
        return 
array_map($op$a$b);              
      else 
        die(
"Nonconformant arguments.");
      break; 
      
    case 
"matrix-scalar":        
      
$nrows count($a);
      
$ncols count($a[0]);      
      for(
$i=0$i<$nrows$i++) {
        
$bvec  array_fill(0$ncols$b);        
        
$c[$i] = array_map($op$a[$i], $bvec);           
      }  
      return 
$c;
      break;
      
    case 
"matrix-matrix":        
      
$arows count($a);
      
$acols count($a[0]); 
      
$brows count($b);
      
$bcols count($b[0]);       
      if ((
$arows == $brows) AND ($acols == $bcols)) {
        for(
$i=0$i<$arows$i++) 
          
$c[$i] = array_map($op$a[$i], $b[$i]);   
        return 
$c;                        
      } else 
        die(
"Nonconformant arguments.");      
      break;             
  }    
  
}


// Elementary arithmetic operations.

function _add($addend1$addend2) {
  return (
$addend1 $addend2);
}

function 
_sub($minuend$subtrahend) {
  return (
$minuend $subtrahend);
}

function 
_mul($factor1$factor2) {
  return (
$factor1 $factor2);
}

function 
_div($numerator$divisor) {
  return (
$numerator $divisor);
}


// Elementwise operators.

function madd($m$addend) {
  return 
vectorize($m$addend"_add");
}

function 
msub($m$subtrahend) {
  return 
vectorize($m$subtrahend"_sub");
}

function 
mmul($m$factor) {
  return 
vectorize($m$factor"_mul");
}

function 
mdiv($m$divisor) {
  return 
vectorize($m$divisor"_div");
}


// Test elementwise addition.

$s 5;
$v = array(123);
$m = array(array(123), array(456));

// scalar-scalar case
$ans madd($s$s);
print_r($ans);
echo 
"<br />";

// vector-scalar case
$ans madd($v$s);
print_r($ans);
echo 
"<br />";

// vector-vector case
$ans madd($v$v);
print_r($ans);
echo 
"<br />";

// matrix-scalar case
$ans madd($m$s);
print_r($ans);
echo 
"<br />";

// matrix-matrix case
$ans madd($m$m);
print_r($ans);

// What the output looks like:

// 10
// Array ( [0] => 6 [1] => 7 [2] => 8 )
// Array ( [0] => 2 [1] => 4 [2] => 6 )
// Array ( [0] => Array ( [0] => 6 [1] => 7 [2] => 8 ) [1] => Array ( [0] => 9 [1] => 10 [2] => 11 ) )
// Array ( [0] => Array ( [0] => 2 [1] => 4 [2] => 6 ) [1] => Array ( [0] => 8 [1] => 10 [2] => 12 ) ) 

?>
  

Permalink 

Vectorizing basic arithmetic operations [Math Programming
Posted on January 23, 2012 @ 12:45:37 AM by Paul Meagher

Continuing on with my research into adding better matrix/vector capabilities to PHP, I decided to implement the basic arithmetic functions (addition, subtraction, division, multiplication) as vectorized functions. To make these functions concise and easy to remember, I used a system of function naming that involves prefixing the function call with a "v" (for "vector") and then using the first three letters of the arithmetic operation name (add, sub, div, mul) as the suffix:

  1. vadd($v, $addend): vector addition
  2. vsub($v, $subtrahend): vector subtaction
  3. vdiv($v, $divisor): vector division
  4. vmul($v, $factor): vector multiplication

In all cases, the first argument is a vector of numbers, followed by the value to apply to each element of the vector. The code below is proof-of-concept of how to vectorize these functions and is not meant as production-level code:

<?php
/**
* Set of functions that vectorize the basic arithmetic functions:
* addition, subtraction, division, multiplication.  No error 
* checking.
*/

// function for vectorized addition

function _add($addend1$addend2) {
  return (
$addend1 $addend2);
}

function 
vadd($v$addend) {
  
$size      count($v);
  
$add_array array_fill(0$size$addend);
  return 
array_map("_add"$v$add_array);
}

// function for vectorized subtraction

function _sub($minuend$subtrahend) {
  return (
$minuend $subtrahend);
}

function 
vsub($v$subtrahend) {
  
$size      count($v);
  
$sub_array array_fill(0$size$subtrahend);
  return 
array_map("_sub"$v$sub_array);
}

// function for vectorized multiplication

function _mul($factor1$factor2) {
  return (
$factor1 $factor2);
}

function 
vmul($v$factor) {
  
$size      count($v);
  
$mul_array array_fill(0$size$factor);
  return 
array_map("_mul"$v$mul_array);
}

// function for vectorized division

function _div($numerator$divisor) {
  return (
$numerator $divisor);
}

function 
vdiv($v$divisor) {
  
$size      count($v);
  
$div_array array_fill(0$size$divisor);
  return 
array_map("_div"$v$div_array);
}

$v   = array(51015);
$arg 5;

$b vadd($v$arg);
print_r($b);
echo 
"<br />";

$b vsub($v$arg);
print_r($b);
echo 
"<br />";

$b vmul($v$arg);
print_r($b);
echo 
"<br />";

$b vdiv($v$arg);
print_r($b);
echo 
"<br />";

// Output:

// Array ( [0] => 10 [1] => 15 [2] => 20 )
// Array ( [0] => 0 [1] => 5 [2] => 10 )
// Array ( [0] => 25 [1] => 50 [2] => 75 )
// Array ( [0] => 1 [1] => 2 [2] => 3 ) 

?>

The alternative to vectorized functions is the ad-hoc creation of loops to perform similar vectorized operations in your code. Think how much programming time and effort has been wasted coding loops to do vectorized arithmetic when there could be vectorized arithmetic functions available to do it all in one simple command.

This only touches the surface of the number of mathematical functions that should be vectorized in PHP (e.g., vsin, vcos, vexp, etc...) to make PHP a much more efficient language to program numeric-based algorithms. We could easily add a "v" suffix to all these function to indicate that they are all vector-based.

Another option to consider would be to go even further and have these functions handle matrix operations, with vectors being just the special case of a one-dimensional matrix. In that case, we might want to instead prefix all the functions with an "m" instead of a "v". I had considered this, but was too lazy to implement functions to handle the matrix case. This is a good exercise for the reader.

At the end of the day we should look at languages like Matlab/Octave to give us guidance on how to proceed with respect to the range of functions to vectorize and whether we should handle vector or matrix arguments to these functions. While we may not be able to use Matlab/Octave's concise syntax to code vectorized algorithms, we can develop slightly more verbose functions like vadd, vsub, vdiv, and vmul to efficiently code numeric algorithms.

Permalink 

Matrix manipulation functions for PHP: Part 3 [Math Programming
Posted on January 14, 2012 @ 05:28:18 PM by Paul Meagher

Another useful function that matlab/octave supports is the linspace(start, end, num_points) function. The linspace function returns an array of n numbers that are equally spaced between the start and end points. A function like this is very useful when graphing an axis and when testing vectorized functions. You will see it used quite frequently in matlab/octave examples.

Below is an implementation that captures the behavior of Octave's linspace function. It also includes some code to test the output of the linspace function.

<?php

define
("PRECISION"4);

function 
linspace($start$end$num_points) {

  
$format "%01.".PRECISION."f";

  
$start      = (float)$start;
  
$end        = (float)$end;  
  
$num_points = (int)$num_points 1;
  
  
$difference abs($end-$start);
 
  if (
$num_points <= 1)
    return array(
$end);
    
  if (
$num_points == 2
    return array(
$start$end);
  
  if (
$num_points >= 3) {

    if (
$difference 0) {

      
      
$increment sprintf($format$difference/$num_points);  
      
$increment $difference/$num_points;        
      
      
$A[0] = sprintf($format$start);
     
      for (
$i=1$i $num_points$i++) {
        
$A[$i] = sprintf($format, ($A[$i-1] + $increment));
        
$next_num $A[$i];
      }
        
      
$A[] = sprintf($format$end);

      return 
$A;

    } else {
    
      return 
array_fill(0$num_points+1$start);
      
    }  

  }

}

// Testing the output of the linspace function.

echo "<pre>";
echo 
"linspace(1, 5, 4) \n";
print_r(linspace(154));
echo 
"\n";
echo 
"linspace(1, 1.8, 4) \n";
print_r(linspace(11.84));
echo 
"\n";
echo 
"linspace(1, 1, 4) \n";
print_r(linspace(114));
echo 
"\n";
echo 
"linspace(5, 7, 0)\n";
print_r(linspace(570));
echo 
"</pre>";
?>

The output of this script looks like this:

linspace(1, 5, 4) 
Array
(
    [0] => 1.0000
    [1] => 2.3333
    [2] => 3.6666
    [3] => 5.0000
)

linspace(1, 1.8, 4) 
Array
(
    [0] => 1.0000
    [1] => 1.2667
    [2] => 1.5334
    [3] => 1.8000
)

linspace(1, 1, 4) 
Array
(
    [0] => 1
    [1] => 1
    [2] => 1
    [3] => 1
)

linspace(5, 7, 0)
Array
(
    [0] => 7
)

An interesting aspect of the behavior of the linspace function is that it handles odd cases in ways that one might think should throw an error. Instead of throwing an error when you enter, for example, linspace(5, 7, 0) - because you can't return 0 points - it instead tries to do something sensible and returns the end point (i.e., 7). Whether this is sensible or not I will leave you to decide, however, the larger point is that Octave seems biased towards returning a numeric result of some sort rather than an error result. If you don't know the detailed behavior of these functions you could end up with some nasty surprises in terms of the result returned; perhaps it would be better if an error gets returned rather than a numeric result in some cases. Nevertheless, when doing mathematical programming you should consider whether it is better to throw an error result in response to odd inputs or try to generate a sensible numeric answer by, for example, anticipating the type of mistakes a user might make when passing arguments to a function and handling this mistakes appropriately (e.g., the low and hi values might be in reverse order from what is expected, so reverse them, and give the answer based upon the corrected ordering of arguments). A good way to master Octave is to port some of it's matrix manipulation functions to PHP. To do so you will need to observe how the Octave function behaves with boundary cases; a type of study you probably would not have engaged in otherwise.

On another note, it is my hope that I can begin to revive this blog again in 2012. I don't have time to blog as frequently as I did in the past, however, I would like to try for at least one blog post per week - morely like on weekends when I have more time for recreational math, than on weekdays. This is one of my resolutions in 2012 so stay tuned to see if this pans out :-)

Permalink 

Matrix manipulation functions for PHP: Part 2 [Math Programming
Posted on January 7, 2012 @ 01:41:26 PM by Paul Meagher

In the last post, I bemoaned the fact that PHP lacks some useful matrix manipulation functions that matlab/octave support. I decided to see how difficult it would be to create a script that would 1) parse a matlab/octave-like recipe for creating a matrix, and 2) generate a matrix based upon that parse. I initially thought such work would require some tricky regular expressions for parsing the recipe, but it appears that the language for specifying matrices was designed to make parsing relatively easy and that major parts of the parsing could be handled by exploding on various control characters. I'm not claiming that the code below implements all of the matlab/octave language for specifying matrices, however, it does support much of functionality you will see discussed in matlab/octave books.

<?php
/**
* @script matrix.php
*
* Script for creating matrices using matlab/octave like recipes.
*
* @author Paul Meagher
* @version 0.1
* @modified Jan 6, 2012
*/


/**
* Returns a parse of the recipe for a desired matrix.
* The parse that is returned contains a "valid" flag 
* that is set to false (i.e., 0) if the number of 
* columns in each row is unequal.
*/
function parse_matrix_recipe($recipe) {
  
  
// Initialize parse validity status to 1. Parse validity 
  // status will be set to 0 if expression is invalid.    
  
$parse['valid'] = 1;       
  
  
$num_rows 0;
  
$num_cols 0;

  
// Get rid of excess whitespace.
  
$recipe preg_replace('/\s\s+/'' '$recipe);
      
  
// Get all the rows.
  
$rows explode(';'$recipe);
  
  
$num_rows count($rows);
  
  
// Parse all the rows  
  
for($i=0$i<$num_rows$i++) {
  
    
// Check first if the row has the : range delimiter.
    
if (substr_count($rows[$i], ":")) {
  
      list(
$start$limit) = explode(":"$rows[$i]);
        
      if (
settype($start"int") AND settype($limit"int")) {
    
        
$parse[$i]['start'] = $start;
        
$parse[$i]['limit'] = $limit;        
              
        
// Counting number of elements generated by range function
        // is an inefficient way of checking that all rows have
        // same number of columns, but it work.  Optimize later :-)
        
if ($i == 0
          
$init_num_cols $row_num_cols count(range($start$limit));
        else
          
$row_num_cols count(range($start,$limit));
                        
      } else {        
        
        
// Cannot convert range values to integers.
        
$parse['valid'] = 0;
        break;
        
      }
    
    } else {          
            
      
$parse[$i]['array'] = explode(" "$rows[$i]);
            
      if (
$i == 0
        
$init_num_cols $row_num_cols count($parse[$i]['array']);
      else
        
$row_num_cols count($parse[$i]['array']);
            
    }

    
// Exit loop and set parse to false if one row has more columns 
    // than another.
    
if ($init_num_cols != $row_num_cols) {
      
$parse['valid'] = 0;       
      break;
    }    
    
  }
  
  
// If parse is valid, set matrix dimensions.  
  
if ($parse['valid'] == 1) {
    
$parse['num_rows'] = $num_rows;
    
$parse['num_cols'] = $init_num_cols;  
  }
  
  return 
$parse;

}

/**
* Uses parse elements to generate and return a matrix
*/
function generate_matrix($parse) {
 
  
$mat = array();
 
  
$num_rows $parse['num_rows'];
  
  if (
$num_rows == 1) {    

    
// Assign supplied array to vector.
    
if (isset($parse[0]['array']))
      
$vec $parse[0]['array'];
    
// Assign specified range of values to vector.  
    
else 
      
$vec range($parse[0]['start'], $parse[0]['limit']);
      
    return 
$vec;
    
  } else {    
    
    for(
$i=0$i<$num_rows$i++) {        

      
// Assign supplied array to row. 
      
if (isset($parse[$i]['array']))      
        
$mat[$i] = $parse[$i]['array'];
      
// Assign range of values to row.  
      
else
        
$mat[$i] = range($parse[$i]['start'], $parse[$i]['limit']);
    }
    
    return 
$mat;
    
  }

}

/**
* Main function for creating matrices.
*/
function m($recipe) {  
  
  
$parse parse_matrix_recipe($recipe);
  
  if (
$parse['valid'] == 1
    return 
generate_matrix($parse); 
  else
    return array();
    
}

?>

As you can see above, the main function used to specify a matrix is the m( ) function. I envision a complementary submatrix operator, sm( ), that would be used to access a submatrix of a supplied matrix (i.e., sm($recipe, $A)). Matlab/octaves language for specifying a submatrix is more complex than the language for specifying a matrix, so coding the sm( ) function will be a bit more challenging. You don't really know how challenging, however, until you try ...

Below is a test script that shows that the matrix.php script outputs the appropriate matrices given matlab/octave like recipes.

<?php

include "matrix.php";

$recipe1 '-1:1; 3:1';
$recipe2 '1;2;3;4';
$recipe3 '1 2  3;4 5   6';
$recipe4 '1';
$recipe5 '1 2 3;4:6';

$A m($recipe1);
$B m($recipe2);
$C m($recipe3);
$D m($recipe4);
$E m($recipe5);

echo 
"<pre>";
echo 
"\$A = m('".$recipe1."') \n";
echo 
"\$A = \n";
print_r($A);
echo 
"\n";

echo 
"\$B = m('".$recipe2."') \n";
echo 
"\$B = \n";
print_r($B);
echo 
"\n";

echo 
"\$C = m('".$recipe3."') \n";
echo 
"\$C = \n";
print_r($C);
echo 
"\n";

echo 
"\$D = m('".$recipe4."') \n";
echo 
"\$D = \n";
print_r($D);
echo 
"\n";

echo 
"\$E = m('".$recipe5."') \n";
echo 
"\$E = \n";
print_r($E);
echo 
"</pre>";

?>

Here is what the output of this script looks like:

$A = m('-1:1; 3:1') 
$A = 
Array
(
    [0] => Array
        (
            [0] => -1
            [1] => 0
            [2] => 1
        )
    [1] => Array
        (
            [0] => 3
            [1] => 2
            [2] => 1
        )
)

$B = m('1;2;3;4') 
$B = 
Array
(
    [0] => Array
        (
            [0] => 1
        )

    [1] => Array
        (
            [0] => 2
        )

    [2] => Array
        (
            [0] => 3
        )

    [3] => Array
        (
            [0] => 4
        )
)

$C = m('1 2  3;4 5   6') 
$C = 
Array
(
    [0] => Array
        (
            [0] => 1
            [1] => 2
            [2] => 3
        )
    [1] => Array
        (
            [0] => 4
            [1] => 5
            [2] => 6
        )
)

$D = m('1') 
$D = 
Array
(
    [0] => 1
)

$E = m('1 2 3;4:6') 
$E = 
Array
(
    [0] => Array
        (
            [0] => 1
            [1] => 2
            [2] => 3
        )
    [1] => Array
        (
            [0] => 4
            [1] => 5
            [2] => 6
        )
)

Originally I suggested that we use a mat( ) function to both create and access a matrix. I decided that if we are going to shorten "matrix" to "mat" we might as well go all the way and use an "m" function as the function name. While I think that we could still use one m( ) function for both constructing and accessing matrices, I think it would be better to develop a separate sm( ) function to make initial development of a matrix accessor easier. Finally, using m( ) as the function used to create/access a matrix reminds me of JQuery's $() operator so I don't think there is anything inherently wrong with using a one letter function name if the importance of the operator supports reserving a letter for it. This is arguably the case for an operator designed to create and access matrices.

Permalink 

Matrix manipulation functions for PHP: Part 1 [Math Programming
Posted on January 4, 2012 @ 01:12:29 PM by Paul Meagher

Currently reading some books on Matlab (so that I can program in it's opensource version, Octave, better). In Matlab/Octave you can construct a matrix like so:

mat = [1:3; 4:6]

which returns:

mat =

   1   2   3
   4   5   6

Perhaps in PHP we should be able to do something similiar like this:

$A = mat('1:3; 4:6');

Where mat(' replaces [ and '); replaces ]. Implementing a mat function like this would require implementing 1) a parser for the expression between single quotes (or double quotes), and 2) a code generator that uses the parsed expression to generate the appropriate array or nested arrays.

In the matrix constructor for the JAMA package, there are methods to "fill" the arrays with values but they are very primative compared to matlab/octaves notation for constructing filled matrices. The proposed "mat" function could work in a very complementary way with JAMA's matrix constructor.

A mat function is only one matrix manipulation function that php might implement to make matrix manipulation easier. Other functions that would be useful are linspace, reshape, and repmat. Matlab's/Octave's method for transposing a matrix involves using a single quote operator ' outside the right bracket (e.g., mat = [1:3; 4:6]') that might be functionalized as a transpose or trans operator. Finally, Matlab/Octave also supports an expressive language for accessing the contents of a matrix that would also be useful to have. One possibility is that the mat function might take a second argument, a matrix, that would be accessed via an expression between the single quotes of the first argument like so:

$b = mat('end, 1', $A);

Which equates to:

$b = 4;

I would argue that for PHP to be a usable language for matrix manipulation it should offer matrix manipulation functions like the Matlab/Octave ones proposed here. A few functions like these, inspired by the way Matlab/Octive implements these functions, would take PHP a long way towards being a useful matrix manipulation language. They would be complementary with the JAMA Linear Algebra Package and would make it easier to port Matlab/Octave-based matrix algorithms to PHP. Many vectorized algorithms require the ability to efficiently slice and dice matrices as the algorithm proceeds towards a solution which is what these methods provide.

Permalink 

A php-based number abstraction class [Math Programming
Posted on June 11, 2007 @ 11:49:57 AM by Paul Meagher

Alix's phpmath library

Alix Axel sent me a link to the source code for some work he is doing on a number abstraction (NA) library. This is not Alix's term. Alix calls it a phpmath library. This would, in my opinion, be a foundational class in such a library if you think you need to get number abstraction right before you seriously begin work developing your quantitative algorithms.

There are other goodies in Alix's code, but the part that got me interested was some interesting number abstraction coding in his class. By number abstraction code I mean code that can be dynamically configured to use a specified c-based and/or php-based numerical precision library to represent numeric quantities and basic operations on them. Alix's code has demoable abstraction support for the c-based "bcmath" and "gmp" arbitrary precision library extensions.

class Math
{
    private $bcmath_enabled = false;
    private $gmp_enabled = false; 

    // class methods here

}

Analogy to database abstraction libraries

What are the arguments for using a number abstraction library? Are they the same arguments as are used to justify using a database abstraction library in developing web applications? Mostly yes so I don't have to bother repeating them :-)

One of the main counterarguments to using a database abstraction library is that you add some bloat and indirection to your code (i.e., the abstraction code). This means there is a market for "lite" classes especially when it comes to interposing a php-based numerical abstraction library between or amidst all your math inputs and outputs.

Other work in this area

Your's truly did a bit of work in this area by porting the Weka comparison methods from Java to PHP. If you study some of the massive Weka codebase, you will notice that these comparison functions are included and used in most of the classes you will encounter.

All 6 comparison operations in this class use a single global error tolerance setting to define the level of precision to use:

/** 
* The small deviation allowed in floating point comparisons. 
*/
const SMALL = 1e-6;

One of the main advantages of this approch is that you can more easily figure out what your practical error bounds are if you know the order of precision your machine is using to represent the quantities involved.

Eliminating sources of error

Using an arbitrary precision is one way to try to address a common source of error in your numerical computing, namely, truncation-related errors. It is important to recognize, however, that you can also achieve practically useful error bounds on your outputs without using arbitrary precision if your code implements numerically stable approaches to performing the computations involved and error tolerances are specified and used in your comparisons. My definition of numerical stability is very practical: a property of numerical algorithms that allows them to be ported to a new language and/or hardware platform (without any special accomodations) and continue to perform practically useful computations on that new platform. The JAMA package, for example, appears to implement numerically stable matrix manipulation and decomposition algorithms because I and a small team of open-source programmers (see docs) were able to port them from Java to PHP without any significant modifications to the algorithms and achieved practically identical results between the PHP and Java versions.

Complex numbers

Jesus M. Castagnetto has written a nice Complex numbers package. Would we eventually need to extend a number abstraction library to support arbitrary precision operations with complex numbers as well?

Suggestions

I think Alix has made some good progress towards a number abstraction library for PHP. There are three main components missing in my opinion: 1) simple examples of class usage, 2) a unit testing class that asserts the expected outcome for each class method and checks the computed outcome against the expected outcome, 3) a non-trivial example of a math solution implemented using the library (e.g., I have a nice guassian elimination port that would be appropriate as an example). In other words, goodies in the demos, tests, and examples subfolders.

I'd like to thank Alix for sharing some of his thought-provoking work with me.

Permalink 

Variations on Simpson's rule [Math Programming
Posted on May 8, 2007 @ 10:29:09 PM by Paul Meagher

Understanding Simpson's rule can be difficult if you are laboring under the assumption that there is only one version of it. In fact, there are at least 4 basic versions of Simpson's rule: quadratic, cubic, adaptive, and composite. I implemented each of these variants as methods in a class file called Math_Integration_Simpson.php.

<?php
/**
* @package  Math_Integration
* @filepath Math/Integration/Simpson.php
*
* A numerical intergration class implementing 4 variants of 
* Simpson's rule. 
*
* Rule of thumb:

* If the function is smooth, Simpson's rule is better.
* If the function has abrupt changes, then Trapezoid is better.
*
* @author   Paul Meagher
* @version  0.1
* @license  LGPL
* @modified May 10, 2007
*
* @see http://math.fullerton.edu/mathews/n2003/NewtonCotesMod.html
* @see http://en.wikipedia.org/wiki/Adaptive_Simpson's_method
* @see http://www.cs.queensu.ca/~daver/271/Notes.pdf
*/
class Math_Integration_Simpson {
      
  
/**   
  * Approximation of f by a quadratic polynomial (i.e., uses three 
  * weighted points). x0 as a, x2 as b and x1 as 1/2 distance 
  * between a and b.  This is the most basic form of Simpson's 
  * rule and a favorite among textbook authors - also known as 
  * Simpson's one-third rule.    
  */
  
public function quadratic($f$a$b) {
    
$c = ($a+$b)/2.0;
    
$h abs($b-$a)/6.0;
    return 
$h*($f->valueAt($a) + 4.0*$f->valueAt($c) + $f->valueAt($b));
  }
  
  
/**    
  * Approximation f by a cubic polynomial (i.e., uses four 
  * weighted points). x0 as a, x3 as b and x1 and x2 as 1/3 and 
  * 2/3 distances between a and b.  Also known as Simpson's 
  * three-eights rule.
  */
  
public function cubic($f$a$b) { 
   
$h abs($b-$a)*(1.0/3.0);
   
$c $a $h;
   
$d $b $h;
   return (
3.0*$h/8.0)*($f->valueAt($a) + 3.0*$f->valueAt($c)+ 3.0*$f->valueAt($d)+ $f->valueAt($b));
  }

  
/**
  * Recursive implementation of adaptive Simpson's rule.
  */
  
private function recursive($f$a$b$eps$sum) {
    
$c     = ($a+$b)/2.0;
    
$left  $this->quadratic($f$a$c);
    
$right $this->quadratic($f$c$b);
    if (
abs($left $right $sum) <= (15 $eps))
      return 
$left $right + ($left $right $sum)/15;
    return 
$this->recursive($f$a$c$eps/2$left) + $this->recursive($f$c$b$eps/2$right);
  }

  
/**
  * Calculate integral of f from a to b using adaptive Simpson's rule 
  * with max error of eps.
  */
  
public function adaptive($f$a$b$eps=0.000001) {
    return 
$this->recursive($f$a$b$eps$this->quadratic($f$a$b));
  }

  
/**
  * Calculate integral of f from a to b using a piecewise approach. This 
  * form is commonly used to estimate the integral with $n used to control
  * the accuracy of the estimate.
  */  
  
public function composite($f$a$b$n=10) {
    
$h  = ($b-$a)/(2.0*$n);    
    
$s1 = ($f->valueAt($a)+ $f->valueAt($b));
    for(
$i=1$i<2*$n$i=$i+2
      
$s2 $s2 $f->valueAt($a $i*$h);
    for(
$i=2$i<2*$n$i=$i+2
      
$s3 $s3 $f->valueAt($a $i*$h);
    return (
$h/3.0)*($s14.0*$s2 2.0*$s3);
  }

}
?>

The following simpson_methods.php script executes each of the methods defined in this class.

<?php

include "Math/Integration/Simpson.php";

class 
{
  function 
valueAt($x) {
    return 
4.0/(1.0+$x*$x);
  }
  function 
toString() {
    return 
"4.0/(1.0+x^2)";
  }
  function 
toHTML() {
    return 
"4.0/(1.0+x<sup>2</sup>)";
  }
}

$f = new F;
$a 0;
$b 1;

$simpson = new Math_Integration_Simpson;

$quadratic $simpson->quadratic($f$a$b);
$cubic     $simpson->cubic($f$a$b);
$adaptive  $simpson->adaptive($f$a$b);
$composite $simpson->composite($f$a$b);

echo 
"quadratic method: $quadratic <br />";
echo 
"cubic method: $cubic <br />";
echo 
"adaptive method: $adaptive <br />";
echo 
"composite method: $composite <br />";

?>

Each method in the script generates a π estimate:

quadratic method: 3.13333333333 
cubic method: 3.13846153846 
adaptive method: 3.14159265371 
composite method: 3.14159265297 

The method producing the output closest to an exact π value of 3.1415926535 is the adaptive method, beating out the composite method by a tiny margin.

Other classes that might be found in a Math_Integration package include:

  • Math_Integration_Trapezoid.php
  • Math_Integration_Gaussian.php (i.e., gaussian quadrature)
  • Math_Integration_Romberg.php

All would have basic, adaptive, and composite variants.

Exercises

1. Play around with the $eps parameter in the adaptive Simpson's rule and the $n parameter in the composite Simpson's rule. Can the composite method be made to out perform the adaptive method in computing π?

2. Apply these methods to a variety of other functions and note any differences.

3. Implement a Math_Integration_Trapezoid.php class that contains a basic, adaptive, and composite implementation of the trapezoid method of integration.

4. Compare the relative performance of Simpson's rule to the Trapezoid method of integration on functions that are smooth or contain abrupt changes.

5. How do Lagrange polynomials relate to Simpson's rule?

6. How do the Newton-Cotes formulas relate to Simpson's rule and other methods of integration?

Permalink 

Simpson's Rule and π [Math Programming
Posted on May 4, 2007 @ 12:58:55 PM by Paul Meagher

There are many routes to π.

I recently discovered another one while I was doing some research on Simpson's Rule for approximating a definate integral. I ported some Simpson's Rule c code and then noticed that the output looked like the value of π. I thought I would share it with you.

<?php
/** 
* Simpson's rule 
*
* Simpson's rule is a method for approximating definite integrals. It works 
* in a similar way to the  trapezoidal rule except that the integrand is 
* approximated to be a quadratic rather than a straight line within each 
* subinterval.
*
* @see http://zen.uta.edu/mae2360/18.html
* @see http://metric.ma.ic.ac.uk/integration
*/

function f($x) {
  return 
4.0/(1.0+$x*$x);
}
 
$a  0.0;
$b  1.0;
$s1 0.0
$s2 0.0;
$s3 0.0;

$n  10;

$h  = ($b-$a)/(2.0*$n);

$s1 = (f($a)+ f($b));

for(
$i=1$i<2*$n$i=$i+2
  
$s2 $s2 f($a $i*$h);

for(
$i=2$i<2*$n$i=$i+2
  
$s3 $s3 f($a $i*$h);

printf("%20.12lf\n", ($h/3.0)*($s14.0*$s2 2.0*$s3)) ;

?>

If you save this script to a web enabled directory and point your browser at it then you should see this output: 3.141592652970.

On the Joy of Pi website they report 1 million digits of π, the first few of which are 3.1415926535.

Permalink 

Passing and evaluating functions the object-oriented way [Math Programming
Posted on February 27, 2007 @ 03:05:07 PM by Paul Meagher

Many numerical analysis problems involve evaluating a function at various points. There are many ways in which you can pass functions to your numerical analysis methods. Laurene V. Fausett's book Numerical Methods: Algorithms and Applications (2003) illustrates a nice object-oriented way to pass a function to numerical analysis methods. It involves creating a My_Function class that implements valueAt() and toString() methods. Because PHP is predominantly a web-scripting language I recommend including toHTML() method as well. Here is an example:

<?php 

class Function_Range {

  public function 
getRange(&$f$a$b) {
    
$_range[$a] = $f->valueAt($a);
    
$_range[$b] = $f->valueAt($b);
    return 
$_range;
  }

}    
    
class 
My_Function {

  public function 
valueAt($x) {
    return 
pow($x, -3);
  }
  
  public function 
toString() {
    return 
"x^-3";
  }

  public function 
toHTML() {
    return 
"x<sup>-3</sup>";
  }

}

$a 1;
$b 3;

$f = new My_Function;
$r = new Function_Range;

$o $r->getRange($f$a$b);

echo 
"Given [$a, $b] as input, the function <b>".$f->toHTML()."</b> outputs [".$o[$a].", ".$o[$b]."]";

?>

The output of this script is:

Given [1, 3] as input, the function x-3 outputs [1, 0.037037037037]

What I like about this approach is that you can just define and instantiate My_Function as an $f object and pass the $f object as an argument to your numerical analysis function. Once inside your numerical analysis function, you can evaluate your function by invoking the valueAt($x) method which makes your numerical analysis code quite readable. You often need to output the function you are evaluating so why not implement toString() and toHTML() methods at the same time you are implementing your valueAt($x) method.

Exercise

1. What happens if you simply call the class Function instead of My_Function?

2. What might the output of a toXML() method look like? See the MathML website for ideas.

Permalink 

Math_Interpolation_Lagrange [Math Programming
Posted on February 25, 2007 @ 11:10:58 PM by Paul Meagher

An interpolation function is used to estimate the values that lie between known data points. We might use an interpolation function to generate additional plotting points so that we can plot a smooth and definite curve through our data points. One popular interpolation function is called the "Lagrange Interpolating Polynomial" which uses a polynomial function to generate points on and between the supplied data points.

The Math_Interpolation_Lagrange class below has two main methods:

  1. A setCoefficients method for finding the polynomial coefficients to use for the interpolating polynomial.
  2. A setInterpolants method for evaluating the polynomial formula at intermediate points using the computed polynomial coefficients.

You can peruse the code comments for more details about how the class works.

<?php
/**
* Math_Interpolation_Lagrange.php
*
* The Lagrange interpolation object accepts three arguments: 
*
* 1. A data vector of x values
* 2. A data vector of y values
* 3. A vector of ix values, in the same range as your x values, that 
*    you want to find the cooresponding interpolated values iy for.
*
* It uses the data vector of x and y values to compute the polynomial 
* coefficients c and then uses these coefficients to find the interpolated 
* values iy for the supplied ix values.  All supplied and computed results 
* are stored as instance variables obtainable by accessor methods or 
* direct reference.
*
* @author Paul Meagher
* @license LGPL
* @created Feb 25/2007
* @version 1.0
*/
class Math_Interpolation_Lagrange {

  
/** 
  * @var array of x values
  */
  
var $x  = array();
  
  
/** 
  * @var array of y values
  */
  
var $y  = array();
  
  
/** 
  * @var array of polynomial coefficients
  */
  
var $c  = array();  
  
  
/** 
  * @var array of x values to be interpolated
  */
  
var $ix = array();  
  
  
/** 
  * @var array of interpolated y values  
  */
  
var $iy = array();    
  
  
/** 
  * @var number of elements in data arrays x and y
  */
  
var $n 0
  
  
/** 
  * @var number of elements in interpolation arrays ix and iy
  */ 
  
var $m 0;  


  
/**
  * Constructor sets up the internal storage variables, calls a 
  * method to compute the polynomial cooefficients, then finds 
  * the interpolated iy values given the supplied ix values.
  *
  * @param $x array of x values
  * @param $y array of y values  
  * @param $ix array of x values to be interpolated
  */
  
function Math_Interpolation_Lagrange($x$y$ix
  {
    
    
$this->x  $x;
    
$this->y  $y;
    
    
$this->ix $ix;    
    
    
$this->n  count($x);
    
$this->m  count($ix);    
    
    
$this->setCoefficients();
    
$this->setInterpolants();  
     
  }
      
  
/**
  * Computes the polynomial coefficients.
  */  
  
function setCoefficients()
  {
    for(
$i=0$i<$this->n$i++) {
      
$d[$i] = 1;
      for(
$j=0$j<$this->n$j++) {
        if(
$i != $j)
          
$d[$i] = $d[$i] * ($this->x[$i] - $this->x[$j]);
        
$this->c[$i] = $this->y[$i] / $d[$i];
      }
    }
  }
  
  
/**
  * Computes the interpolated iy values given the ix values
  * supplied in the constructor.
  */    
  
function setInterpolants() 
  {
    for(
$i=0$i<$this->m$i++) {
      
$this->iy[$i] = 0;
      for(
$j=0$j<$this->n$j++) {
        
$d[$j] = 1;
        for(
$k=0$k<$this->n$k++) {
          if(
$j != $k
            
$d[$j] = $d[$j] * ($this->ix[$i] - $this->x[$k]);
        }
        
$this->iy[$i] = $this->iy[$i] + $this->c[$j] * $d[$j];           
      }
    }
  }
  
  
/**
  * Recomputes the interpolated iy values given the new ix values.
  * 
  * @param $ix array of new x values to be interpolated
  */        
  
function changeInterpolants($ix
  {
    
$this->ix $ix;
    
$this->iy = array();
    
$this->m  count($ix);    
    
$this->setInterpolants();
  }

  
/**
  * @return the polynomial coefficients
  */  
  
function getCoefficients()
  {
    return 
$this->c;
  }

  
/**
  * @return array of interpolated values
  */  
  
function getInterpolants()
  {
    return 
$this->iy;
  }

  
/**
  * @return interpolated values as comma separated string
  */  
  
function toString() 
  {
    return 
implode(", "$this->iy);
  }
    
}
?>

The lagrange_test.php script demonstrates how to use the class to obtain two sets of interpolant values:

<?php

include "Math/Interpolation/Lagrange.php";

$x   = array(-20, -112);
$y   = array( 42, -118);

// create x values in data range using step size of .5
$ix1 range(-22.5); 

// create x values in data range using step size of .2
$ix2 range(-22.2); 

$MIL = new Math_Interpolation_Lagrange($x$y$ix1);
echo 
"<p>Interpolated values using step size of .5 are: ".$MIL->toString()."</p>";

$MIL->changeInterpolants($ix2);
echo 
"<p>Interpolated values using step size of .2 are: ".$MIL->toString()."</p>";

?>

Here is the output the script generates:

Interpolated values using step size of .5 are: 4, -1.1875, -1, 0.8125, 2, 1.8125, 1, 1.8125,8

Interpolated values using step size of .2 are: 4, 0.9776, -0.7264, -1.4384, -1.4464, -1, -0.3104, 0.4496, 1.1456, 1.6816, 2, 2.0816, 1.9456, 1.6496, 1.2896, 1,0.9536, 1.3616, 2.4736, 4.5776, 8

Exercise

1. Use a PHP graphing package like JpGraph or Image::Graph to plot your x and y data points and the interpolated points ix and iy. Draw a line through your original data points and your interpolated data points.

2. Use a formula to generate a few data points and then use the Math_Interpolation_Lagrange class to find interpolating values. Come up with a method to measure how accurate your Lagrange interpolating polynomial is by comparing interpolated values to formula generated values. Try using different formulas and compare the relative accuracy of the Lagrange method for those formulas.

3. Use the Runge Function to generate some data points and see how good the Lagrange method is at interpolating values. Note: The Runge function is known to give the Lagrange interpolation method some problems.

4. Create a Math_Interpolation_Chebyshev class to minimize the weaknesses of the Lagrange method for interpolating the Runge function. See this reference for further details on Chebyshev polynomials.

Permalink 

Cheatsheet for functional programming using PHP [Math Programming
Posted on January 27, 2007 @ 06:37:09 PM by Paul Meagher

Here are some PHP array functions that you would need to become familiar with if you wanted to emulate functional programming in PHP.

  • array array_map ( callback callback, array arr1 [, array ...] )
    

    array_map() returns an array containing all the elements of arr1 after applying the callback function to each one. The number of parameters that the callback function accepts should match the number of arrays passed to the array_map().

  • array array_filter ( array input [, callback callback] )
    

    array_filter() iterates over each value in the input array passing them to the callback function. If the callback function returns true, the current value from input is returned into the result array. Array keys are preserved.

  • mixed array_reduce ( array input, callback function [, int initial] )
    

    array_reduce() applies iteratively the callback function to the elements of the array input, so as to reduce the array to a single value. If the optional initial is available, it will be used at the beginning of the process, or as a final result in case the array is empty. If the array is empty and initial is not passed, array_reduce() returns NULL.

Permalink 

Euler method [Math Programming
Posted on January 9, 2007 @ 02:38:45 AM by Paul Meagher

I wanted to create a generic Euler method for solving ordinary differential equations (ODE) but the literature I have looked at so far embeds the Euler method into specific physics formulas. An example is the calculate method below used for computing the radioactive decay of uranium atoms.

<?php
/*
* Ported from:
*
* http://www.physics.purdue.edu/~hisao/book/www/examples/chap1/DECAY.TRU
*/
function calculate(&$nuclei, &$t$tc$dt$n) {
  for(
$i=0$i $n-1$i++) {
    
$nuclei[$i+1] = $nuclei[$i]-($nuclei[$i]/$tc)*$dt;
    
$t[$i+1] = $t[$i] + $dt;
  }
}

// initial number of nuclei
$nuclei[0] = 100;
// start time
$t[0] = 0;
// time step
$dt .05;  
// time constant 
$tc 1;
// number of iterations
$n 100;

calculate($nuclei$t$tc$dt$n);

?>
<table border='1'>
  <tr>
    <th>Time</th>
    <th># Nuclei</th>
  </tr>
  <?php
  
for ($i=0$i<$n$i++) {
    
?>
    <tr>
      <td><?php echo $t[$i]; ?></td>
      <td><?php echo $nuclei[$i]; ?></td>
    </tr>
    <?php
  
}
  
?>
</table>

Permalink 

The representation of fixed and floating point numbers in PHP [Math Programming
Posted on November 29, 2006 @ 11:15:55 PM by Paul Meagher

From the PHP Manual regarding integer datatypes:

The size of an integer is platform-dependent, although a maximum value of about two billion is the usual value (that’s 32 bits signed). PHP does not support unsigned integers.

Elliot Back has written a program to find the maximum representable integer which also provides some insight into how the 32 bits used to represent integers are allocated.

From the PHP Manual regarding floating point datatypes:

The size of a float is platform-dependent, although a maximum of ~1.8e308 with a precision of roughly 14 decimal digits is a common value (that's 64 bit IEEE format).

Chapell (1995) provides some insight into how the 64 bits are allocated:

A double precision variable is usually 8 bytes (16 bits) long. In a typical implementation, 53 bits of the number are devoted to the mantissa, and 11 bits are devoted to the exponent. The 53 bits devoted to the mantissa are enough to represent 15 or 16 significant decimal digits. Similarly, the 11 bits of the exponent are enough to represent numbers as large as 10308 and as small as 10-308. (p. 474)

Learn more about floating point representation from a C perspective. Consult Mike Cowlishaw's link collection for everything you ever wanted to know about decimal arithmetic.

Exercises

1. Develop a PHP program that finds the maximum floating point number.

2. Does an integer have mantissa and exponent parts?

3. Is there a way to know exactly (instead of just "roughly") how many bits are used to represent the mantissa and exponent parts of a PHP floating point number.

Permalink 

Round Off and Truncation Error [Math Programming
Posted on November 28, 2006 @ 05:19:04 PM by Paul Meagher

There are two types of error that occur in numerical analysis that we need to be concerned with: Round Off Error and Truncation Error. Ideally we would develop math applications that would give us error bound estimates for these two types of numerical uncertainty (e.g., $this->EstimatedRoundOffError, $this->EstimatedTruncationError).

Round Off Error occurs because our number crunching programs are often limited in the precision with which they represent the quantities used in numerical calculations. We might also chose to limit the precision of the storage variable because we feel it does not require or specify that much precision (i.e., currency transactions). Round off error anaysis for php-based math programs is possible because:

  1. PHP offers integer and float data types (a double precision data type) to accomodate numerical quantities.
  2. Some textbooks on numerical analysis use plain integer and double precision data types (aka float) to show-case the math used to estimate relative and absolute round off and truncation error for a group of math algorithms (i.e., different math algorithms for estimating the first and second derivative of a function when that function is linear, quadratic, cubic, nonlinear, etc...).

Deferring to Yakowitz and Szidarovsky (1989), a Truncation Error "is introduced by approximating some ideal mathematical operation by computations directed by a program. It presumes that computations are done without roundoff error." (p. 12)

Another quote from William H. Press et al. (2003) clarifies the importance of trucation error:

The discrepancy between the true answer and the answer obtained in a practical calculation is called the truncation error. Truncation error would persist even on a hypothetical , "perfect" computer that had an infinitely accurate representation and no roundouff error. As a general rule there is not much that a programmer can do about roundoff error, other than to choose algortithms that do not magnify it unnecesessarily . Truncation error, on the other hand, is entirley under the programmer's control. In fact, it is only a slight exaggeration to say that clever minimisation of truncation error is practically the entire content of the field of numerical analysis! (p. 33)

Exercises

1. Collect some other definitions of a truncation error and note any differences in the way in which numerical errors are classified or subclassified.

2. Provide an example of a trunction error.

Permalink 

Working with numbers in fractional format [Math Programming
Posted on November 22, 2006 @ 02:36:55 PM by Paul Meagher

Matlab offers the option to display decimal formatted numbers in their equivalent fractional format. PHP does not offer a builtin fraction function to convert a number from decimal format to fractional format. A google search turned up a nice compact pair of PHP-based functions by "kewbonium" that performs this conversion. I made minor modifications to these functions to conform to my own stylistic conventions. These clever functions are reproduced below:

<?php
/**
* Used to convert a number from decimal to fractional format. 
*
* Minor stylistic changes from original.
*
* @author kewbonium
* @see    http://rocketsauce.ca/projects/php_decimal_fraction_converter
*
* @param  float $dec number in decimal format
*
* @return string $f number in fractional format
*/
function fraction($dec) {
    
$s = (string)$dec;
    
$s explode('.',$s);    
    
$f = (int)$s[0];    
    if(
count($s) > 1) {    
        
$num  = (int)$s[1];
        
$len  strlen($num);        
        
$den  pow(10$len);
        
$num += $den $f;        
        
$gcd  euclid($num$den);        
        
$num /= $gcd;
        
$den /= $gcd;        
        if(
$dec 0)
            
$num *= -1;        
        return 
$num.'/'.$den;        
    } else        
        return 
$f;
}
 
/**
* Euclid algorithm for computing greatest common
* denominator.
*
* @param int $a
* @param int $b
*
* @return int greatest common demominator
*/
function euclid($a$b) {
    
$a abs($a);
    
$b abs($b);
    
$c $a $b;
    while(
$c != 0) {
        
$a $b;
        
$b $c;
        
$c $a $b;
    }
    return 
$b;
}

$dec = -0.20;
echo 
fraction($dec)

// answer: -1/5
?>

Exercises

1. Develop an algorithm that can detect a repeating decimal. Possible strategy: Duplicate strings in a sequence are often found by storing each string in the sequence as a key in an associative array and checking to see if a new string can be used to do an associative array lookup (if so then you have encountered a duplicate). Note that associative array lookups use a hashing method that potentially makes them an optimal data structure for this type of duplicate detection task.

2. Integrate the "repeating decimal detection algorithm" into the fraction function and provide an example (if possible) of how it handles a decimal-to-fraction conversion better than the function above.

3. Develop other methods that add, subtract, multiply, and divide fractions.

4. Would certain numerical algorithms be more exact (or better in some other way) if you worked with fractional rather than decimal formatted numbers? Would you avoid numerical inaccuracies that arise from truncation and/or roundoff?

Permalink 

Horner's rule for the efficient evaluation of polynomials [Math Programming
Posted on November 21, 2006 @ 03:42:28 PM by Paul Meagher

Horner's rule is used to efficiently evaluate polynomials. The basic idea is that a polynomial A(x) = a0 + a1x + a2x2 + a3x3 + ... may be computed more efficiently if it is re-written as A(x) = a0 + x(a1 + x(a2 + x(a3 + ...))). Notice that we don't have to use the pow() function to compute the result using the latter expression which is what makes Horner's rule a more efficient approach for evaluating polynomials. Horner's rule not a difficult rule to implement, however, you should understand how it works if you are doing any work with polynomials (which come up all the time in math programming).

The code below illustrates how the rule works:

<?php
/**
* Computes the value of the k-th degree polynomial p(x) 
* at a given independent value x using horner's rule.  
* The k+1 array A represents the polynomial according to:
*
* P(x)= A(k+1)*x^k + A(k)*A^k-1 + ... + A(2)*x + A(1)
*
* Adapted from:
*
* Sidney Yakowitz & Ferenc Szidarovszky (1989) "An 
* Introduction to Numerical Computations". Macmillan
* Publishing Company, NY.

* @param A a k+1 array of polynomial coefficients
* @param x the value to be evaluated
* @returns polynomial value p(x)
*/ 
function horner($A$x) {
  
  
// k is the degree of the polynomial.
  
$k count($A) - 1;  
  
  
// First term in the summation below is the last one. 
  // We don't want to start our summation below with a 
  // p=0 or the value of the summation below will be 0.
  
$p $A[$k];

  
// Evaluate the rest of the terms in the polynomial
  // in reverse order.  
  
for($i=$k-1$i>=0$i--) 
    
$p $p*$x+$A[$i];  
    
  return 
$p;

}

// Will use horner's rule to efficiently evaluate
// the polynomial p(x) = 3x^2 + x + 5.

$A = array(513);
$x 6;

echo 
horner($A$x);

// answer is 119

?>

Exercises

1. Implement your own quick hack for evaluating the polynomial above and try to evaluate which method involves fewer machine operations.

2. Is $p = $p*$x+$A[$i] equivalent to $p *= $x+$A[$i]?

Permalink 

Lagrange Interpolation using JAMA [Math Programming
Posted on November 20, 2006 @ 02:38:11 PM by Paul Meagher

I ported Jacob Dreyer's Lagrange Interpolation class from Java to PHP. Given an array of x[] and y[] data points, you can us the getPolynomialFactors method to find the factors to use in a polynomial equation that runs through the data points. Note that the method uses the Jama Matrix Package to compute the factors.

<?php

require_once "../Matrix.php";

/**
 * Given n points (x0,y0)...(xn-1,yn-1), the following methid computes
 * the polynomial factors of the n-1't degree polynomial passing through
 * the n points.
 *
 * Example: Passing in three points (2,3) (1,4) and (3,7) will produce
 * the results [2.5, -8.5, 10] which means that the points are on the
 * curve y = 2.5x² - 8.5x + 10.
 * 
 * @see http://geosoft.no/software/lagrange/LagrangeInterpolation.java.html
 * @author Jacob Dreyer
 * @author Paul Meagher (port to PHP and minor changes)
 *
 * @param x[] float
 * @param y[] float
 */   
class LagrangeInterpolation {
  
  public function 
findPolynomialFactors($x$y) {

    
$n count($x);

    
$data = array();  // double[n][n];
    
$rhs  = array();  // double[n];
    
    
for ($i 0$i $n$i++) {
      
$v 1;
      for (
$j 0$j $n$j++) {
        
$data[$i][$n-$j-1] = $v;
        
$v *= $x[$i];
      }

      
$rhs[$i] = $y[$i];
    }

    
// Solve m * s = b
    
    
$m = new Matrix($data);
    
$b = new Matrix($rhs$n);

    
$s $m->solve($b);

    return 
$s->getRowPackedCopy();
 
  }

}

$x = array(2.01.03.0);
$y = array(3.04.07.0);

$li = new LagrangeInterpolation;
$f $li->findPolynomialFactors($x$y);

for (
$i 0$i 3$i++)
  echo 
$f[$i]."<br />";

?>

The output is the same as what the inline documentation says it should be:

2.5
-8.5
10

Which are the factors in the interpolating polynomial equation: y = 2.5x2 - 8.5x + 10

Exercises

1. Add a method to this class that will find the y value given a particular x value (within the range of existing x values). In other words, it will use the computed polynomial equation to interpolate the value of y given a value of x in the domain of the function.

2. Add a check to the evaluation routine above that warns the user when they try to use to extrapolate (x value outside of the domain) rather than interpolate (x value within the domain) a y value.

Permalink 

Float casting and printf casting [Math Programming
Posted on November 8, 2006 @ 02:45:01 PM by Paul Meagher

You can convert a number in scientific notation to a floating point number by using (float) to cast the value to a floating point number. Be careful, however, as casting only works for exponents less than or equal to 12 as the following program demonstrates:

<?php
for ($i=1$i <= 12$i++) {
  
$val "1.0e$i";
  
$val = (float) $val;
  echo 
$val."<br />";
}
?>

Which outputs:

10
100
1000
10000
100000
1000000
10000000
100000000
1000000000
10000000000
100000000000
1E+012

If you want to render scientific notation number larger than 1E+012 in decimal format, then you should use the sprintf or printf functions to do so as the following program illustrates:

<?php
for ($i=1$i <= 20$i++) {
  
$val "1.0e$i";  
  
printf("%$i.0f<br />",  $val);  
}
?>

10
100
1000
10000
100000
1000000
10000000
100000000
1000000000
10000000000
100000000000
1000000000000
10000000000000
100000000000000
1000000000000000
10000000000000000
100000000000000000
1000000000000000000
10000000000000000000
100000000000000000000

Exercises

1. Develop a php script to cast numbers in scientific notation to decimal notation when the numbers are extremely small. In other words, your scientfic notation values should have increasing negative exponents. Use float and printf casting to explore the limits of these two casting methods.

2. Develop a php script to determine when the printf or sprintf functions encounter underflow and overflow conditions. In other words, what are the smallest and largest numbers that they can be used to format.

3. What, if any, is the relationship between printf or sprintf underflow and machine epsilon?

Permalink 

Computing machine epsilon [Math Programming
Posted on November 2, 2006 @ 01:00:18 PM by Paul Meagher

1 + epsilon is the smallest number greater than 1 that a computer will distinguish from 1. |x| + epsilon can be used to estimate the error bound on a computer's representation of a floating point number. It is called "machine" epsilon because the value is machine dependent. Here is some code you can use to compute machine epsilon:

<?php
/**
* Returns the machine epsilon value
*/
function getMachineEpsilon() {
  
$eps=1.0;
  while(
$eps+1.0 1.0
    
$eps=$eps/2.0;
  
$eps*=2.0;
  return 
$eps;
}
?>

On my local window XP environment using PHP5 the machine epsilon value works out to 2.22044604925E-016 which is essentially the same as the value obtained on my linux server.

The value of machine epsilon becomes relevant when testing whether two small values might be equal. For iterative algorithms that test for convergence it may not be appropropriate to do a direct test of equivalence, but rather test whether the difference is less that a specified error tolerance. One might also need to incorporate a machine epsilon test in order to make an interative algorithm portable accross platforms.

Exercise

You can use the EPS value to estimate the number of bits used to represent the mantissa. Recall that floating point number storage consists of a bit to hold the sign of the mantissa, bits to hold the digits of the rounded mantissa, a bit to hold the sign of the exponent, and bits to hold the digits of the exponent. The first two parts are called the mantissa segment and the last part is called the exponent segment. Machine epsilon can be theoretically approximated as 2-s where s is the mantissa length of s bits. So 2-s is approximately equal to 2.22044604925E-016 for some value of s. What is that value?

Permalink 

Fast array lookups [Math Programming
Posted on May 27, 2006 @ 04:50:33 AM by Paul Meagher

Daniel Lemire asked me this PHP question today:

If you have a 10000 objects in a set, and you want to check if another element is in there, what do you do? You don't actually go through the whole list, do you?

I suggested that PHP's in_array() function might do the trick however Daniel persisted in asking me with whether the lookup was O(n) or O(1) like Python (a language Daniel was more intimately familiar with).

In the PHP manual under Arrays there is a comment by S Boisvert that clarified the answer. Essentially, in_array() is (O)n but if you can turn your array lookup problem into a test of whether an array key exists (i.e., isset($array[$key])), you can get O(1) speed because it is a hashed lookup then.

To further test this assertion I developed a program that tests array lookups using the in_array() approach versus the isset() approach and was able to conclusively demonstrate that the isset() approach is the O(1) way to do array lookups.

Thanks to Daniel for helping to educate me on some of the finer points of PHP arrays and a practical use for the big O notation.

Permalink 

Simple Regression vs Régression Simple [Math Programming
Posted on April 19, 2006 @ 03:51:28 PM by Paul Meagher

English is sometimes not the best language for discerning the object-oriented structure of a domain. A structural engineer friend of mine, Malcolm Nemis, made the astute observation that the French language offers a much more logical approach to specifying objects in CAD systems because it orders expressions in a more natural >> object > attribute >> sequence rather than the >> attribute > object >> sequence that is common in English expressions.

An example of this sequential difference is how "Simple Regression" (>> attribute > object >>) is referred to in French, namely, "Régression Simple" (>> object >> attribute >>).

Blame it on the French language for influencing my decision to change the name of the SimpleRegression class to REGRESS_Simple.

Permalink 

PHP/Math download system [Math Programming
Posted on April 3, 2006 @ 03:41:24 PM by Paul Meagher

I have put in place an improved system for downloading PHP/Math Packages.

You can now download the complete PHP/Math Library by going into the build02/docs folder and clicking the download link.

You can download individual PHP/Math Packages {JAMA, PDL, REGRESS} in a similiar manner.

The download system uses the PEAR Archive_Tar package and the download system ships with the code.

To get the download system working on your local development platform you will need to 1) pear install Archive_Tar, 2) create a top level "downloads" folder, 3) create a "downloads" subfolder in each package folder (JAMA, PDL, REGRESS), 4) set the appropriate permissions on these "downloads" folders:

chmod -Rf 777 downloads 

All of this points towards the need for a PHP/Math installation system that will create the relevant "downloads" folders, change their permissions, and replace into the "Math/PDL", "Math/JAMA", and "Math/REGRESS" system folders the currently downloaded PDL, JAMA, and REGRESS packages. I will be putting together a bash shell script to implement these installation steps.

Using a standard set of subfolder names

Some JAMA scripts are currently broken because I standardized on a set of package subfolder names and changed JAMA subfolder names to use these standard names (e.g., "doc" changed to "docs", "test" changed to "tests", "util" changed to "utils").

My standard set of subfolder names currently includes:

/examples
/tests
/lib
/utils
/functions
/docs
/docs/includes
/docs/sections
/docs/images

Developers don't have to use all these subfolder names for any given project, only the subfolders which are required. Other subfolder naming options might be added in the future.

One argument for doing things in this way is that the download system can be developed with less ad-hocery (i.e., no need to individually customize which subfolders are to be included for each package; interate through a standard set of subfolder names instead).

A standardized set of subfolder names also allows developers and users to develop a consistent set of expectations about how current and future PHP/Math packages might be browsed online, code downloaded, and code installed.

Permalink 

Fluency in math [Math Programming
Posted on March 31, 2006 @ 01:31:55 PM by Paul Meagher

Introducing Bunny Regex

In Introducing BunnyRegex: easy regular expressions, and mini-languages inside of PHP, Jonnay discusses the BunnyRegex.php class which provides an excellent example of fluent interface design:
include("BunnyRegex.php");

$pattern = new BunnyRegex();

$pattern->bol()        // ^
        ->digit()      // d
        ->exactly(4)   // {4}
        ->string('/')  // /
        ->digit()      // d
        ->exactly(2)   // {2}
        ->string('/')  // /
        ->digit()      // d
        ->exactly(2)   // {2}
        ->eol();       // $

$pattern->match('2006/02/28'); // returns true
$pattern->numberOfMatches('2006/02/28'); // returns 1

Implementing fluent interfaces in PHP

In Making Fluent Interfaces Readable in PHP, Jonnay deconstructs fluent interfaces in these terms:

Fluent Interfaces are a method of programming where setters, and sometimes behavioural methods that normally return void/null actually return an object instance, usually (but not always) the object being operated on. This is so that you can get around the effect where you make a temporary variable to alter an object, but don't end up using it again.

Making JAMA port more fluent

The Java-based JAMA linear algebra library is already quite fluent as illustrated by this line of computation:

R = L.times(U).minus(M.getMatrix(p,0,n-1));

Perhaps fluency holds the secret to helping me fully port the JAMA library so that it can be similiarly fluent.

Currently, this is how I compute the value of $R:

$S = $L->times($U);
$R = $S->minus($M->getMatrix($p,0,$n-1));    

Ideally it would be:

$R = $L->times($U)->minus($M->getMatrix($p,0,$n-1));

Just like the JAVA version.

Aspect Oriented Programming AOP

Jonnay has written a followup article Aspect Oriented Programming in PHP as a contrast to other languages which embeds this line of development and thinking in the AOP tradition.

Now that you know about fluency of the AOP sort, you can use this knowledge to create some very powerful API's for your php-based mathematical classes.

Permalink 

Structure and style of a PHP/MATH codebase [Math Programming
Posted on March 24, 2006 @ 04:07:21 PM by Paul Meagher

I am presently adding a forward slash "/" to PHP/MATH in order to signify my desire to develop a code base that will be housed as a set of Sub Folders or Packages under PHP's shared include directory. These math packages would be globally available to all PHP web applications that require them.

Components of that code base would be included into a php-based web application as follows:

:
include "Math/PDL/BinomialDistribution.php";
include "Math/JAMA/QRDecomposition.php";
include "Math/CHI/CHI2D.php";
include "Math/IT/JointEntropy.php";
include "Math/BAYES/NiaveBayes.php";
:

This is similiar to the goal of PEAR/Math developers and it is my hope that the PHP-based packages developed and promoted here will complement those packages.

If anyone looks at the php-based math packages I have developed to date you will notice that I often adopt a miminimalist coding style that does not agree on important presentational aspects with PEAR coding standards.

Some of my preferences and hangups include:

  1. I prefer 2 spaces for indentation - humans have excellent vernier acuity so we arguably don't need more space to resolve differences in code-block indentation.
  2. I prefer the KR style of placing the beginning brace "{" at the end of a code line rather that on its own line.
  3. I only use braces if I have to.
  4. Sometimes it scans better when you have zero space between the variable and operator elements of an algebraic equation.
  5. I will sometimes use one letter variable names $A to denote an important class member rather than something longer and more descriptive like $Matrix.
  6. Sometimes, a math algorithm is comprehended better if it is made to physically occupy as little space as possible on a sheet of paper rather than to be too liberally loaded with line space, comments, and space-hogging variable names (i.e., flow-breaking elements).

My coding style tries to economize on the amount of physical space the code occupies on a sheet of paper if it were to be printed. I call it the "ecomony of space" coding style. The argument for using this coding style is that it is less arbitrary than other coding styles. There is a consistent economy-of-space principle behind the code aesthetics the style proposes.

Given these presentational differences in coding style, I still try to follow many of the PEAR best practices when developing PHP/MATH object-oriented packages.

I expect the site to also differ in some significant respects from the PEAR:MATH effort in that there may be an educational component to this site and other varieties of math-related code might be actively promoted and developed (i.e., php math function libraries, php math snippets, datamining tools, web interfaces, AI experiments, game programming components, financial analysis tools, 3D rendering, genomic analysis, natural language processing, survey analysis, etc...).

Permalink 

 Archive 
 


php/Math Project
© 2011. All rights reserved.