Julia: HypothesisTests.PowerDivergenceTest() の問題点

2022年02月04日 | ブログラミング

Julia の HypothesisTests.PowerDivergenceTest() は問題がある。

対象とする分割表に，度数が 0 のセルがあると不適切な（役に立たない）結果を返す。

julia> x = [4 5 2 0; 0 7 6 1; 1 0 3 1]
3×4 Matrix{Int64}:
4 5 2 0
0 7 6 1
1 0 3 1

julia> using HypothesisTests

julia> result = PowerDivergenceTest(x, lambda=0.0);

julia> println(result.stat)
NaN

julia> println(pvalue(result))
NaN

そこで，以下のような関数を書く。

julia> function PDTsummary(result::PowerDivergenceTest; resid::Bool=false)
# 結果を簡潔に表示するための関数
function printmatrix(name, x)
println("\n$name")
Base.print_matrix(stdout, round.(x, digits=5), "", " ", "\n")
end
println("chisq. = $(round(result.stat, digits=5)), df = $(result.df), p.value = $(round(pvalue(result), digits=5))")
if resid
printmatrix("residuals", result.residuals)
printmatrix("standardized residuals", result.stdresiduals)
end
end
PDTsummary (generic function with 1 method)

julia> function g2stat(x, nrows, ncols, n, correct)
# セルに 0 があっても正しい答えを出す
# correct=true で，連続性の補正も行うことができる
ln(n) = n .== 0 ? 0 : n .* log.(n)
n = sum(x) # 全サンプルサイズ
n1 = sum(x, dims=2) # 行和
n2 = sum(x, dims=1) # 列和
G2 = 2*(sum(ln.(x)) - sum(ln.(n1)) - sum(ln.(n2)) + ln(n)) # G 統計量
correct && (G2 /= 1 + (n * sum(1 ./ n1) - 1) * (n * sum(1 ./ n2) - 1) / (6n * nrows * ncols)) # 連続性の補正
df = (nrows - 1) * (ncols - 1) # G の自由度
G2, df
end
g2stat (generic function with 1 method)

julia> function chisq_test(x::AbstractMatrix{T}; correct::Bool=false, resid::Bool=false, lambda::Float64=1.0) where T<:Integer
# 分割表を入力してχ二乗検定，G2 検定
nrows, ncols = size(x)
n = sum(x)
if lambda == 1.0 && correct && nrows == 2 && ncols == 2
a, c, b, d = x
stat = n*max(0, abs(a*d - b*c) - 0.5n)^2 / ((a+b)*(c+d)*(a+c)*(b+d))
result = PowerDivergenceTest(lambda, ones(4), stat, 1, x, n, ones(4), x, x, x)
elseif lambda == 1.0
result = PowerDivergenceTest(x; lambda)
elseif lambda == 0.0
G2, df = g2stat(x, nrows, ncols, n, correct)
vec = ones(nrows*ncols)
mat = reshape(vec, nrows, ncols)
result = PowerDivergenceTest(lambda, vec, G2, df, mat, n, vec, mat, mat, mat)
end
PDTsummary(result; resid)
end
chisq_test (generic function with 1 method)

julia> function G2_test(x::AbstractMatrix{T}; correct::Bool=false, resid::Bool=false) where T<:Integer
chisq_test(x; correct, resid, lambda=0.0)
end
G2_test (generic function with 1 method)

2024年11月
日	月	火	水	木	金	土
					1	2
3	4	5	6	7	8	9
10	11	12	13	14	15	16
17	18	19	20	21	22	23
24	25	26	27	28	29	30

裏 RjpWiki

Julia ときどき R, Python によるコンピュータプログラム，コンピュータ・サイエンス，統計学

Julia: HypothesisTests.PowerDivergenceTest() の問題点

このブログの人気記事

コメントを投稿

「ブログラミング」カテゴリの最新記事

プロフィール

最新記事

バックナンバー

カレンダー

カテゴリー

最新コメント

雨雲の動き

ログイン

goo blog お知らせ

goo blog おすすめ