## DESCRIPTION ## Statistics 305 ## ENDDESCRIPTION ## KEYWORDS('Statistical inference; Hypothesis testing, Poisson ## distribution, Neyman--Pearson test; finding a likelihood ratio for a Poisson ## mean given a simple alternative hypothesis, identifying the behaviour of the ## likelihood ratio as the sample total increases, finding the criterion for ## rejection in a one-sided hypothesis test for the mean of a Poisson ## distribution, finding the P-value of the test, finding the power of the test ## for a given alternative value of the mean.') ## DBsubject('Statistics') ## DBchapter('Hypothesis testing') ## DBsection('Neyman-Pearson tests') ## level(5) ## Date('2014/06/12') ## Author('Andy Leung') ## Institution('University of British Columbia') ## TitleText1('No Text') ## EditionText1('?') ## AuthorText1('?') ## Section1('?') ## Problem1('?') ######################################################################## DOCUMENT(); loadMacros( "PGstandard.pl", "PGchoicemacros.pl", "parserRadioButtons.pl", "MathObjects.pl", "parserMultiAnswer.pl", "unionTables.pl", "RserveClient.pl", "PGcourse.pl" ); # Print problem number and point value (weight) for the problem TEXT(beginproblem()); ############################################################## # Question and R Setup ############################################################## Context->("Numeric"); SRAND($psvn); $setseed = "set.seed($psvn); 1"; # Do this in R rserve_start(); # Randomly generate parameters: n, y, ysum, and mu rserve_eval($setseed); @n = rserve_eval('n <- round(runif(1, min=30, max=41),0); n'); $n = join ", ", @n; @ysum = rserve_eval('y <- rpois(n, lambda=4); ysum <- sum(y); ysum'); $ysum = join ", ", @ysum; @mu1 = rserve_eval('mu1 <- round(runif(1, min=3.9, max=4.9),1); mu1'); $mu1 = join ", ", @mu1; # Answers @ans_a = rserve_eval('ans_a <- round( log(exp(-n*(3 - 4))*(3/4)^ysum), 2); ans_a'); $ans_a = join ", ", @ans_a; @ans_c_num = rserve_eval('ans_c <- qpois(0.05, lambda=5*n)- 1; ans_c'); $ans_c_num = join ", ", @ans_c_num; @ans_d = rserve_eval('ans_d <- round( ppois(ysum, lambda = 5*n),3); ans_d'); $ans_d = join ", ", @ans_d; @ans_e = rserve_eval('ans_e <- round( ppois(ans_c, lambda = n*mu1), 2); ans_e'); $ans_e = join ", ", @ans_e; rserve_finish(); # Multiple choice question for part b $qu_b = "Consider in general the likelihood ratio here when \(\mu _{1}<\mu _{0}\). As the sum of the observations increases, the likelihood ratio"; $ans_b = "decreases."; $fake1_b = "increases."; $fake2_b = "can both increase and decrease."; $mc_b = new_multiple_choice(); $mc_b->qa( $qu_b, $ans_b ); $mc_b->extra( $fake1_b, $fake2_b ); # Multiple choice question for part c $qu_c = "It has been hypothesised by the instructor of the course that there are an average of five typing errors in each edition of $BITALIC The Guard $EITALIC The editor of the newspaper has objected, claiming that the mean is less than five per edition. To perform the appropriate hypothesis test here at a significance level no greater than 5%, you would (complete one of the following statements):"; $ans_c = "reject the null hypothesis when the total number of observed errors is no greater than \(k\),"; $fake_c = "reject the null hypothesis when the total number of observed errors is greater than \(k\),"; $mc_c = new_multiple_choice(); $mc_c->qa( $qu_c, $ans_c ); $mc_c->extra( $fake_c ); ############################################################## ############################################################## # Question in Text ############################################################## Context()->texStrings; BEGIN_TEXT The number of typing errors in an edition of a newspaper could reasonably be modelled by a Poisson distribution with some mean \(\mu\). Each of the \($n\) students on a journalism course are allocated at random a past edition of the $BITALIC The Guard $EITALIC newspaper and asked to find the number of typing errors it contains. The total number of errors found by the students is \($ysum\). $BR $BR $BBOLD Part a) $EBOLD In testing the null hypothesis \(H_{0}:\mu =\mu _{0}\) against the alternative \(H_{a}:\mu =\mu _{1}\), find the log-likelihood ratio. Evaluate the natural logarithm of this ratio for the case \(\mu _{0}=4\) and \(\mu _{1}=3\), giving your answer to two decimal places. $BR \{ ans_rule(12) \} $BR $BR $BBOLD Part b) $EBOLD \{ $mc_b->print_q() \} $BR \{ $mc_b->print_a() \} $BR $BR $BBOLD Part c) $EBOLD \{ $mc_c->print_q() \} $BR \{ $mc_c->print_a() \} $BR where \(k\) is $BR \{ ans_rule(12) \} $BR $BR $BBOLD Part d) $EBOLD For the same test, what is the P-value? Give your answer to three decimal places. $BR \{ ans_rule(12) \} $BR $BR $BBOLD Part e) $EBOLD Suppose the true mean number of errors per edition of $BITALIC The Guard $EITALIC is \($mu1\). Find the power of the test above, giving your answer to two decimal places. $BR \{ ans_rule(12) \} $BR $BR END_TEXT ############################################################## ############################################################## ## Show answer ############################################################## $showPartialCorrectAnswers = 1; ANS( num_cmp($ans_a, tol=> 0.01) ); ANS( radio_cmp($mc_b->correct_ans()) ); ANS( radio_cmp($mc_c->correct_ans()) ); ANS( num_cmp($ans_c_num, tol=> 0.01) ); ANS( num_cmp($ans_d, tol=> 0.001) ); ANS( num_cmp($ans_e, tol=> 0.01) ); ############################################################## ############################################################## # Solution ############################################################## Context()->texStrings; Context()->{format}{number} = "%.2f"; BEGIN_SOLUTION $BR $BBOLD Part a) $EBOLD Neyman-Pearson theory suggests we look for a test based on the likelihood ratio, \begin{align*} \frac{L\left( \mu _{1};y_{1},\dots ,y_{n}\right) }{L\left( \mu _{0};y_{1},\dots ,y_{n}\right) }, \end{align*} and reject the null if the above is too large, i.e., greater than some value \(c\) say. In this case the likelihood function is \begin{align*} L\left( \mu ;y_{1},\dots ,y_{n}\right) =\frac{e^{-n\mu }\mu ^{\sum_{i}y_{i}}% }{\prod_{i}y_{i}!}, \end{align*} and so \begin{align*} \frac{L\left( \mu _{1};y_{1},\dots ,y_{n}\right) }{L\left( \mu _{0};y_{1},\dots ,y_{n}\right) } &=e^{-n\left( \mu _{1}-\mu _{0}\right) }\left( \frac{\mu _{1}}{\mu _{0}}\right)^{\sum_{i}y_{i}} \\ \log\left( \frac{L\left( \mu _{1};y_{1},\dots ,y_{n}\right) }{L\left( \mu _{0};y_{1},\dots ,y_{n}\right) } \right) &= -n\left( \mu _{1}-\mu _{0}\right) + \sum_{i}y_{i} \log \left( \frac{\mu _{1}}{\mu _{0}}\right) \end{align*} In the case that \(\mu _{0}=4\), \(\mu _{1}=3\), and \(\sum_{i}y_{i}=$ysum\), the natural logarithm of this is \($ans_a\). $BR $BR $BBOLD Part b) $EBOLD In the case that \(\mu _{1}<\mu _{0}\), the likelihood ratio decreases as the sum of the observations increases. $BR $BR $BBOLD Part c) $EBOLD The likelihood ratio above is greater than a constant \(c\) when \begin{align*} -n\left( \mu _{1}-\mu _{0}\right) +\sum_{i}y_{i}\log \left( \frac{\mu _{1}}{ \mu _{0}}\right) >\log \left( c\right) , \end{align*} which on re-arranging (and note that \(log( \mu_{1}/\mu_{0}) <0\) gives the rejection region as being \begin{align*} \sum_{i}y_{i}<\frac{\log \left( c\right) +n\left( \mu _{1}-\mu _{0}\right) }{ \log \left( \mu _{1}\right) -\log \left( \mu _{0}\right) }. \end{align*} This is not hard to implement, since using the additive property of the Poisson distribution we know that \(\sum_{i}y_{i}\sim Po\left( n\mu _{0}\right)\) under \(H_{0}\) and there is no necessity to find \(c\) explicitly. We seek the value \(k\) such that if \(Y\sim Po\left( 5n\right)\), \(P\left( Y\leq k\right) \leq 0.05.\) Using R, the required constant is found via $BR $BITALIC k <- qpois(0.05, lambda=5*$n) - 1 $EITALIC $BR $BR $BBOLD Part d) $EBOLD We require the probability that the total number of errors is no greater than \($ysum\), given that the mean number of errors per issue is \(5\). Via R $BR $BITALIC ppois($ysum, lambda = 5*$n) $EITALIC $BR $BR $BBOLD Part e) $EBOLD The power of the test is the probability we reject the null hypothesis given the true value of the mean as being \($mu1\). From R, this is $BR $BITALIC ppois(k, lambda = $mu1*$n) $EITALIC END_SOLUTION ############################################################## ENDDOCUMENT();
Write, Run & Share Perl code online using OneCompiler's Perl online compiler for free. It's one of the robust, feature-rich online compilers for Perl language, running on the latest version 5.22.1. Getting started with the OneCompiler's Perl compiler is simple and pretty fast. The editor shows sample boilerplate code when you choose language as Perl
and start coding.
OneCompiler's Perl online editor supports stdin and users can give inputs to programs using the STDIN textbox under the I/O tab. Following is a sample Perl program which takes name as input and prints hello message with your name.
my $name = <STDIN>;
print "Hello $name.\n";
Perl(Practical Extraction and Report Language) is especially desined for text processing by Larry Wall.
There is no need to specify the type of the data in Perl as it is loosely typed language.
Type | Description | Usage |
---|---|---|
Scalar | Scalar is either a number or a string or an address of a variable(reference) | $var |
Arrays | Array is an ordered list of scalars, you can access arrays with indexes which starts from 0 | @arr = (1,2,3) |
Hash | Hash is an unordered set of key/value pairs | %ul = (1,'foo', 2, 'bar) |
In Perl, there is no need to explicitly declare variables to reserve memory space. When you assign a value to a variable, declaration happens automatically.
$var-name =value; #scalar-variable
@arr-name = (values); #Array-variables
%hashes = (key-value pairs); # Hash-variables
If, If-else, Nested-Ifs are used when you want to perform a certain set of operations based on conditional expressions.
if(conditional-expression){
//code
}
if(conditional-expression){
//code if condition is true
}else{
//code if condition is false
}
if(condition-expression1){
//code if above condition is true
}else if(condition-expression2){
//code if above condition is true
}
else if(condition-expression3){
//code if above condition is true
}
...
else{
//code if all the conditions are false
}
There is no case or switch in perl, instead we use given and when to check the code for multiple conditions.
given(expr){
when (value1)
{//code if above value is matched;}
when (value2)
{//code if above value is matched;}
when (value3)
{//code if above value is matched;}
default
{//code if all the above cases are not matched.}
}
For loop is used to iterate a set of statements based on a condition.
for(Initialization; Condition; Increment/decrement){
// code
}
While is also used to iterate a set of statements based on a condition. Usually while is preferred when number of iterations are not known in advance.
while(condition) {
// code
}
Do-while is also used to iterate a set of statements based on a condition. It is mostly used when you need to execute the statements atleast once.
do {
// code
} while (condition);
Sub-routines are similar to functions which contains set of statements. Usually sub-routines are written when multiple calls are required to same set of statements which increases re-usuability and modularity.
sub subroutine_name
{
# set of Statements
}
subroutine_name();
subroutine_name(arguments-list); // if arguments are present