Carl Love

Wayland, Massachusetts, United States
My name was formerly Carl Devore.

MaplePrimes Activity


These are answers submitted by Carl Love

In October 2018, I Posted an article and code for Fisher's exact test. This is a good situation to use it. (The other three tests mentioned in that Post don't apply to this situation, and, unlike Fisher's, they're only approximations anyway.) The contingency table to test (i.e., the observed table) is 

SurvivorTeams:= <
   5, 0, 4; 
   0, 5, 1
>:

where each column represents one of the three new teams and each row represents one of the two original teams. (It makes no difference to the final result whether the roles of the rows and columns are switched, i.e., if we were to transpose the matrix, although my code is a bit more computationally efficient if you use the more-columns-than-rows form.)

The null hypothesis is that the 15 players were assigned to the three teams independently of which of the two original teams they were from, i.e., randomly. The alternative hypothesis is that it wasn't done randomly. Like all (two-tailed[*1]) statistical hypothesis tests, this one does not and cannot give a definitive yes-or-no answer to that. Rather, they all return a p-value: the probability that the observed data (such as our matrix above)---or any other arrangement of data at least as deviant or extreme---would be observed if the null hypothesis were true. (I cannot overemphasize the importance of the phrase "if the null hypothesis were true" in the previous sentence. Make sure that you understand it. It's fundamental to all statistical hypothesis testing. In a discrete case, such as the current one, deviant or extreme could be reworded as improbable.) Thus, a small p-value provides evidence that the alternative hypothesis is true.
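The meaning of the p-value can be illustrated by a quick Monte Carlo simulation (written in Python rather than Maple, purely as an illustration): repeatedly assign the 15 players (9 from one original team, 6 from the other, per the row sums of the matrix above) to three random teams of 5, and count how often the resulting table is at least as improbable as the observed one.

```python
# Monte Carlo illustration of the p-value: under the null hypothesis, the 15
# players (9 from one original team, 6 from the other) are split randomly
# into three new teams of 5. Estimate the probability of a split at least as
# improbable as the observed table.
import random
from math import factorial, prod

random.seed(1)
obs = [[5, 0, 4], [0, 5, 1]]
weight_obs = prod(factorial(x) for row in obs for x in row)

players = [0]*9 + [1]*6            # 0 = first original team, 1 = second
trials, hits = 100_000, 0
for _ in range(trials):
    random.shuffle(players)
    cols = [players[k:k+5].count(0) for k in (0, 5, 10)]  # type-0 count per new team
    T = [cols, [5 - c for c in cols]]
    # With the margins fixed, a table is at least as improbable as obs exactly
    # when the product of the factorials of its entries is at least as large.
    if prod(factorial(x) for row in T for x in row) >= weight_obs:
        hits += 1
print(hits / trials)   # close to 6/1001 = 0.005994...
```

The estimate converges to the exact p-value computed below.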

Running my procedure on the given matrix:

infolevel[ContingencyTableTests]:= 1:
interface(prettyprint= 1):
ContingencyTableTests(SurvivorTeams, method= Fisher): 
% = evalf(%);
25 tables evaluated, of which 6 are at least as improbable as the observed input.
                             6                   
                            ---- = 0.005994005994
                            1001                 

Note that this is numerically the same probability as I gave in my off-the-top-of-my-head prior Answer, although the computational process that my procedure uses is much more elaborate and automatic (it computed the probabilities of 25 different arrangements and summed 6 of them).

Furthermore, I object to Kitonum's Reply being referred to (by mmcdara) as a "correction" of my prior Answer.

In the code below[*2], I give 5 weblink references. Of these, I most highly recommend the MathWorld article "Fisher's Exact Test"[*3], which contains this formula (totally reworded by me): The probability of any particular matrix T's occurrence in S is (&*!(R)*&*!(C)/n!)/&*!(T), where &*! is the product of the factorials of the entries (of a vector or matrix), R is the row-sum vector, C is the column-sum vector, n is the total of all entries, and S is the sample space of all matrices of nonnegative integers that have the same R and C. For the table SurvivorTeams, that's

(6!*9!*5!^3/15!)/(5!*5!*4!)
     1 / 1001

Or, more programmatically, that's

`&*!`:= mul@factorial~:
Pr:= proc(T::Matrix(nonnegint))
uses AT=ArrayTools;
   &*!(AT:-AddAlongDimension(T,1))*&*!(AT:-AddAlongDimension(T,2))/add(T)!/&*!(T)
end proc:

Pr(SurvivorTeams);

There are 25 possible tables with the same R and C. Six of those correspond to the 6 possible permutations of the columns (equivalently, the 6 possible assignments of colors to the new teams), each with equal probability, and none of the other 19 has probability as low as theirs. Hence, again we arrive at the answer 6/1001. (The interested reader should try to construct a few of the 19 tables omitted from the sum, as Fisher did long before the computer era.)
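As an independent sanity check (in Python rather than Maple), one can enumerate all tables with these margins directly, confirm there are 25, and sum the probabilities of those at least as improbable as the observed table:

```python
# Exact cross-check of the 25-table enumeration and the 6/1001 p-value,
# using the MathWorld probability formula with exact rational arithmetic.
from fractions import Fraction
from math import factorial, prod

obs = [[5, 0, 4], [0, 5, 1]]
R = [sum(row) for row in obs]          # row sums: [9, 6]
C = [sum(col) for col in zip(*obs)]    # column sums: [5, 5, 5]
n = sum(R)                             # 15

def pr(T):
    """Probability of table T under the null hypothesis."""
    num = prod(factorial(x) for x in R) * prod(factorial(x) for x in C)
    den = factorial(n) * prod(factorial(x) for row in T for x in row)
    return Fraction(num, den)

p_obs = pr(obs)                        # 1/1001
tables = [
    [[a, b, c], [C[0]-a, C[1]-b, C[2]-c]]
    for a in range(C[0]+1) for b in range(C[1]+1) for c in range(C[2]+1)
    if a + b + c == R[0]
]
p_value = sum(pr(T) for T in tables if pr(T) <= p_obs)
print(len(tables), p_value)   # 25 tables; p-value 6/1001
```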
 

[*1]A two-tailed statistical hypothesis test is one where the alternative hypothesis is the direct negation of the null hypothesis; such an alternative is called two-sided. A one-sided alternative involves a directional inequality.

[*2]Here's my code, which has been substantially updated since my October 2018 Post:

#This code uses features new to Maple 2019.

ContingencyTableTests:= proc(
	O::Matrix(nonnegint), #contingency table of observed counts 
	{method::identical(Pearson, Yates, G, Fisher, Fisher[count]):= 'Pearson'}
)
description 
	"Returns p-value for Pearson's (w/ or w/o Yates's continuity correction)" 
	" or G chi-squared or Fisher's exact test."
	" All computations are done in exact arithmetic."
;
option
	author= "Carl Love <carl.j.love@gmail.com>, 20-May-2019",
	references= (                                                           #Ref #s:
		"https://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test",         #*1
		"https://en.wikipedia.org/wiki/Yates%27s_correction_for_continuity",  #*2
		"https://en.wikipedia.org/wiki/G-test",                               #*3
		"https://en.wikipedia.org/wiki/Fisher%27s_exact_test",                #*4
		"Eric W Weisstein \"Fisher's Exact Test\" _MathWorld_"
		"--A Wolfram web resource:"
		"   http://mathworld.wolfram.com/FishersExactTest.html"               #*5
	)
;
uses AT= ArrayTools, St= Statistics, It= Iterator;
local
	C:= AT:-AddAlongDimension(O,1), R:= AT:-AddAlongDimension(O,2), #column & row sums
	c:= numelems(C), r:= numelems(R), #counts of columns & rows
	n:= add(R), #number of observations

	#All remaining code in `local` applies strictly to Fisher's.
	X, `&*!`:= mul@factorial~, 
	Cutoff:= &*!(O), #cut-off value for less-likely matrices
	CTs:= 0, #count of contingency tables with same row and column sums as O.
	Devs:= 0, #count of those tables with probabilities <= that of O.
	
	#Generate recursively all contingency tables whose row and column sums match O.
	#A very similar recursion is documented on the Maple help page
	# ?Iterator,BoundedComposition as example "Contingency Table".
	CountCTs:= (C,i)-> 
		`if`(i=r, 1, add(thisproc(C-X,i+1), X= It:-BoundedComposition(C,R[i]))),
		
	#Using the same recursion as above, generate the tables and compute their
	#probabilities under the null hypothesis. The formula for the probability is
	#from reference *5. Sum probabilities of all those at least as improbable as O. 
	#Parameters: 
	#   C = column sums remaining to be filled; 
	#   F = product of factorials of entries of contingency table being built;
	#   i = row to be chosen this iteration
	AllCTs:= proc(C, F, i)
		if i = r then #Recursion ends.
			if irem(++CTs, 10^5) = 0 then Report() fi;
			#Last row is C, the unused portion of column sums:
			if (local P:= F*&*!(C)) >= Cutoff then ++Devs; 1/P else 0 fi
		else
			add(thisproc(C-X, F*&*!(X), i+1), X= It:-BoundedComposition(C, R[i]))
		fi
	end proc,
   
	Report:= ()-> userinfo(
		1, ContingencyTableTests, NoName, CTs, "tables evaluated, of which", Devs, 
		"are at least as improbable as the observed input."
	)     
;
	if method in {'Fisher', 'Fisher'['count']} then
		if method = 'Fisher'['count'] then return CountCTs() fi; 
		X:= AllCTs(C, 1, 1)*&*!(R)*&*!(C)/n!;  #see *5
		Report();
		X
	#Readers only interested in Fisher's exact test can ignore everything below
	#this point.
	else #Use one of the chi-squared methods: Pearson, G, Yates.
		local E::rtable; #matrix of expected values under null hypothesis 		
		try
			E:= rtable(
				(1..r, 1..c), 
				(i,j)-> R[i]*C[j],
				#The chi-squared tests aren't sufficiently
				#accurate if any cell has expected value < 5:
				datatype= satisfies(x-> x >= 5*n) 
			) / n;
		catch "unable to store":
			error
				"cell expected value too small for chosen method;"
				" use 'method'= 'Fisher'"
		end try;
		userinfo(
			1, ContingencyTableTests, 
			"Table of expected values under null hypothesis:", 
			print(
				`if`(
					E::'rtable'(posint), 
					E, 
					E=subsindets(E, Not(integer), evalf[1+ilog10(n)])
				)
      		)
		);
		#Pearson's, Yates's, and G all use a chi-square statistic,
		#each computed by a slightly different formula:
		local Chi2:= add@~table([
			'Pearson'= ((O,E)-> (O-E)^~2 /~ E),                     #see *1
			'Yates'= ((O,E)-> (abs~(O-E) -~ 1/2)^~2 /~ E),          #see *2
			'G'= ((O,E)-> 2*O*~map(x-> `if`(x=0, 0, ln(x)), O/~E))  #see *3
		]);
		1 - St:-CDF(ChiSquare((r-1)*(c-1)), Chi2[method](O,E)) 
	fi   
end proc:

 

[*3]Eric W. Weisstein, "Fisher's Exact Test", MathWorld--A Wolfram Web Resource.

You're trying to pack several matrices side-by-side into a single matrix, but the original matrices don't have the same number of rows, so it can't be done. Why not just export them individually?

Where you have k1 = 10, it should be k1:= 10.

It's done like this:

X:= Statistics:-RandomVariable(
   EmpiricalDistribution([-1, 0, 1, 2], probabilities= [0.2, 0.25, 0.35, 0.2])
):
Statistics:-Mean(X);
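The mean here is just the probability-weighted sum of the values, which is easy to confirm by hand or with a few lines of Python (a cross-check only, not part of the Maple answer):

```python
# Mean of the empirical distribution: sum of value * probability,
# done in exact rationals to avoid float round-off.
from fractions import Fraction

values = [-1, 0, 1, 2]
probs = [Fraction(p) for p in ("0.2", "0.25", "0.35", "0.2")]
mean = sum(v * p for v, p in zip(values, probs))
print(mean)   # 11/20, i.e. 0.55
```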

 

Your problem is simply a non-ASCII character in the last line of your code. Where there should be := you have :? where ? is some non-ASCII character that looks like =.

To plot it,

plots:-odeplot(dsn1, [w, x(w)])

and likewise for p(w).

The count for the probability's numerator is binomial(6,5)*binomial(9,5); for the denominator, binomial(15,5)*binomial(10,5)/3!.

I am posting this from my phone, where I have neither the ability to view your worksheet nor do my computation in Maple.
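The arithmetic is easy to verify away from Maple; here's the same count done in Python (note that the ratio reduces to the 6/1001 obtained by the exact test above):

```python
# Verify the counting argument: numerator and denominator counts, and that
# their ratio reproduces the Fisher's-exact-test p-value 6/1001.
from fractions import Fraction
from math import comb, factorial

numerator = comb(6, 5) * comb(9, 5)                      # 756
denominator = comb(15, 5) * comb(10, 5) // factorial(3)  # 126126
p = Fraction(numerator, denominator)
print(p)   # 6/1001
```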

Assuming that your subscript notation just means another derivative, it is easy to get these with implicitdiff. A single command can do all three:

EQ:= V^3-R*T*V^2/P-(B^2+P*B*R*T*sqrt(T)/(P-A))*V-A*B/(P*sqrt(T)) = 0:
indeps:= %op~([[T,V], [V,T], [V,V]]):
DP:=  table(value(indeps =~ map[3](%implicitdiff, EQ, P, indeps))); 

Now the three derivatives can be referenced by DP[T,V], DP[V,T], and DP[V,V]. You can very easily verify with simplify that the first two of these are equal.

Using a basic procedure as a Matrix initializer, it can be done by

M:= Matrix((N+1)$2, (i,j)-> `if`(j = i+1, i, 0))

It can also be done with the scan option of the Matrix command:

M:= Matrix((N+1)$2, [[0$(N+1)], [$1..N]], scan= band[0,1])
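For comparison, here's the same superdiagonal construction sketched in Python (a cross-language analogue of the initializer approach, not Maple syntax; 0-based indexing shifts the formula by one):

```python
# Build an (N+1) x (N+1) matrix whose superdiagonal is 1..N and all other
# entries 0, mirroring the Maple initializer (i,j) -> i when j = i+1.
N = 4
M = [[i + 1 if j == i + 1 else 0 for j in range(N + 1)]
     for i in range(N + 1)]
for row in M:
    print(row)
```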

You can do

U:= Array(-x..x, -x..x);

then do the initialization. After that, if you want to use it as a mathematical matrix for LinearAlgebra commands, you can do

M:= convert(U, Matrix);

Now M will have indices 1..11. Note that your convert command above doesn't change U itself into a Matrix because you did not assign the output of the command.

If you intend to do this on a large scale, there's a better way to re-index using the command ArrayTools:-Alias.

When certain structures (such as procedures, modules, or tables) are assigned to a name (such as your T[1]), and you access that name, the name itself rather than the assigned value is shown. This process is called last name evaluation. To see the assigned value, use eval(T[1]). An arrow expression such as x-> x^2 is a type of procedure, so last name evaluation applies to it.

Acer's Answer is good, and you can learn much about working around Maple's numeric integration by studying it. But I don't know why he's pandering to your fantasy that those infinitesimal Bessel terms contribute anything to the result. The integral can be easily done by

CodeTools:-Usage(
   int(fnormal(F1), [theta= Pi/4..7*Pi/4, r= 0..1])
);
memory used=58.56MiB, alloc change=0 bytes, cpu time=500.00ms, real time=500.00ms, gc time=0ns
                                          12
                    -1.1160010071917965 10  

Indeed, after the integrand is fnormal'ed as above, the integral can be done by elementary symbolic methods as taught in third-semester calculus.

One should not infer from this Answer that such use of fnormal will always produce results of the same accuracy as those produced by not using it.

In order for the second animation to be properly considered a rescaling of the first, you need to rescale the t range also. So, try this:

animate(plot, [fourier(f(x,2^t),x,w)/2^t, w= 0..100], t= log[2](0.1)..log[2](1), frames= 100)

To get further help, you'll need to show the formula for f(x,t).

With the release of Maple 18, a new format for the internal storage of help files was introduced. This is transparent to the end user. Third-party help files (such as your OrthogonalExpansions) created in the old format need to be updated. This can be done with a single command, which finds all such help files and updates them:

HelpTools:-Database:-ConvertAll():

Let me know if you have any trouble with that.

LinearAlgebra:-Generic works fine with GF (in Maple 2019). (VV, can you say more about why you think that it doesn't? Perhaps it doesn't work in the OP's Maple 2015.)

Magma: The only mistake that you made was asking for the determinant. Of course, you can't ask for the determinant of a nonsquare matrix. So, just ask for the rank. Below, I changed the order of the columns from the matrix in your worksheet because your matrix A was already in reduced row-echelon form, making it not such a good example for testing the code.

restart:
G:= GF(2,4):
A:= G:-input~(<
   5, 0, 4, 1, 0, 0;
   8, 7, 2, 0, 1, 0;
   3, 0, 1, 0, 0, 1
>):
G:-output~(LinearAlgebra:-Generic:-ReducedRowEchelonForm[G](A, 'r'));
                             [1  0  0  2  0  8 ]
                             [                 ]
                             [0  1  0  4  6  15]
                             [                 ]
                             [0  0  1  6  0  10]
r;
                                      3

 

 

Some points to consider:

Integration, even numeric integration, via Maple's int, Int, or `evalf/Int` is not threadsafe---it may not consistently give correct results when used in the shared-global-memory environment of the Threads package (see ?index,threadsafe). This is likely due to those procedures using global variables that are not protected from being overwritten by the sibling tasks. The Grid package does parallel processing in a non-shared-memory environment. Of course, this incurs substantial overhead (to copy the memory), and, of course, more memory. Even so, Grid is the more appropriate choice for most parallel processing in Maple.

Nonetheless, your code does give accurate results in every test that I've tried. So, numeric integration via _cuhre may in reality be threadsafe; it just hasn't been documented to be so.

The number of terms that you're adding, 15, is much, much too small for a meaningful speed test of Threads. Below, I do 10000 iterations of your integral, and the real time is cut in half by using Threads (or, as Joe said, the speedup is 2x).

If you're just adding terms, use Threads:-Add rather than the more-complicated Threads:-Task.

restart:
L1:= 10:  N:= 10000:
randomize(37): #Arbitrary number
xx:= Statistics:-Sample(Uniform(0,100), N): 
                            
CodeTools:-Usage(
   Threads:-Add(
      int(
         sin(beta)/(100 + ZZ*sin(beta) - xx[i]*cos(beta))^(5/2), 
         beta= 0..1, ZZ= 0..L1, 
         'numeric', 'epsilon'= 0.01, method= _cuhre
      ),
      i= 1..N
   )
);
memory used=384.35MiB, alloc change=464.00MiB, cpu time=14.33s, real time=2.81s, gc time=828.12ms
                          107.8763089
restart:
L1:= 10:  N:= 10000:
randomize(37): #Same arbitrary number
xx:= Statistics:-Sample(Uniform(0,100), N): 
                               
CodeTools:-Usage(
   add( #without Threads
      int(
         sin(beta)/(100 + ZZ*sin(beta) - xx[i]*cos(beta))^(5/2), 
         beta= 0..1, ZZ= 0..L1, 
         'numeric', 'epsilon'= 0.01, method= _cuhre
      ),
      i= 1..N
   )
);
memory used=357.54MiB, alloc change=-4.00MiB, cpu time=6.42s, real time=5.67s, gc time=1.31s
                          107.8763089
(1 - 2.81/5.67)*100; #Percent time reduction
                          50.44091711
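The Add-style pattern above---splitting a sum of independent numeric integrals across workers and reducing the partial results---can be sketched in Python as well (a sketch only: the one-dimensional integrand and midpoint rule below are simplified stand-ins for the worksheet's evalf/Int call, and Python's GIL means threads here demonstrate the reduction pattern rather than a CPU speedup):

```python
# Pattern sketch: split a sum of independent numeric integrals across worker
# threads and check that the result matches the serial sum.
from concurrent.futures import ThreadPoolExecutor
from math import sin, cos

def integral(x, steps=500):
    # Midpoint rule for f(beta) = sin(beta)/(100 + sin(beta) - x*cos(beta))
    # over beta = 0..1 (a simplified stand-in integrand).
    h = 1.0 / steps
    return h * sum(
        sin(b) / (100 + sin(b) - x * cos(b))
        for k in range(steps)
        for b in [(k + 0.5) * h]
    )

xs = [i * 0.01 for i in range(500)]       # deterministic sample points
serial = sum(integral(x) for x in xs)
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = sum(pool.map(integral, xs))  # map preserves order
print(abs(serial - parallel) < 1e-12)       # same reduction, split across workers
```

For a true CPU speedup in Python one would use a process pool (the analogue of Maple's Grid: separate memory, copying overhead) rather than threads.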

 
