Varying kernel density estimation on ℝ+

Robert Mnatsakanov, Khachatur Sarkisian

West Virginia University, USA; National Institute for Occupational Safety and Health, USA

Correspondence to: Department of Statistics, P.O. Box 6330, West Virginia University, Morgantown, WV 26506, USA. rmnatsak@stat.wvu.edu (R. Mnatsakanov)

Statistics & Probability Letters 82 (2012) 1337–1345

Abstract

In this article a new nonparametric density estimator based on a sequence of asymmetric kernels is proposed. This method is natural when estimating an unknown density function of a positive random variable. The rates of the Mean Squared Error, the Mean Integrated Squared Error, and the L1-consistency are investigated. Simulation studies are conducted to compare the new estimator and its modified version with the traditional kernel density construction.

Keywords: Varying kernel density estimator; Mean Squared Error; Mean Integrated Squared Error; δ-sequence; L1-consistency
1. Introduction

Let us assume that the support of the unknown cumulative distribution function (cdf) F is the positive half-line ℝ+ = (0, ∞). To avoid an edge effect when estimating the density function of F, it is common to use kernels with the same support as that of the target distribution. Recently, constructions with asymmetric kernels have been studied for estimating a probability density function (pdf) defined on ℝ+. Namely, in Chen (2000) and Scaillet (2004), sequences of gamma kernels and of inverse and reciprocal inverse Gaussian kernels have been used, respectively. See also Mnatsakanov and Ruymgaart (2012), where another varying kernel approach is suggested; their method is based on a sequence of gamma pdfs with varying shapes.

We propose to use a sequence of inverse gamma kernels that represent δ-sequences in the L2- and L1-norms; see Lemmas 4.1 and 4.2, respectively. The constructions f̃α and f̂α considered in (2.3) and (2.4) (called the varying kernel density estimators (vKDEs)) are different from the traditional kernel density estimator (KDE) (see, for example, Parzen (1962), Silverman (1986), and Scott (1992)). They are also different from the ones proposed in Chen (2000) and Scaillet (2004). In kernel density estimation the convolution is taken with respect to addition as the group operation on the entire real line ℝ and with a fixed kernel. Our constructions in (2.3) and (2.4) turn out to be of kernel type provided that the convolution is taken on the positive half-line (ℝ+, dH), equipped with multiplication as the group operation and with the Haar measure dH(t) = dt/t (see, for example, (2.7) below). It is worth mentioning that the estimators proposed by Chen (2000) and by Scaillet (2004) can be viewed neither as such convolutions nor as densities on ℝ+.

In this paper we investigate the Mean Squared Error (MSE) and Mean Integrated Squared Error (MISE) rates of convergence for the proposed estimators f̃α and f̂α. Note that the shape of an inverse gamma density varies according to the position of the point x at which the pdf f(x) is estimated. This allows the degree of "smoothing" around the point x to change automatically. Another feature of the constructions (2.3) and (2.4) is that they have no boundary effects (see Figs. 1 and 2), and they achieve the optimal rate of convergence for the MSE and for the MISE within the class of non-negative kernel density estimators. Similar results have been derived in Chen (2000) and Scaillet (2004); the differences concern only the constants appearing in the first-order terms. It is worth mentioning that, in contrast with the KDE, the asymptotic variances of f̃α(x) and f̂α(x) have the same form n^{−4/5} f(x)/(2x√π) when α = n^{2/5} (see (3.6) in Section 3), which becomes smaller as x increases; in the case of asymmetric gamma kernels (see Chen (2000)) the corresponding variance has the form n^{−4/5} f(x)/(2√(πx)). Finally, in Mnatsakanov and Ruymgaart (2012) a construction similar to (2.1) was used; as a result, another, so-called moment-density estimator was proposed and its asymptotic properties were studied as well.

The paper is organized as follows. In Section 2 the assumptions and the construction of the vKDEs are introduced. In Section 3 the MSE of f̃α and f̂α is derived, while in Section 4 the MISE and L1-consistency of f̂α are investigated. In Section 5 we conduct a simulation study and compare the performances of the estimators f̂α, f̃α, and the traditional KDE f̂h.

2. Preliminaries and assumptions

In this section we outline the main idea that yields the vKDEs f̃α and f̂α in (2.3) and (2.4), respectively. Assume we would like to recover (approximate) a moment-identifiable distribution F given only the sequence of its moments. For conditions necessary and sufficient for F to be moment-identifiable, see, for example, Stoyanov (2000) and the references therein. Suppose that all negative order moments of F are finite. Define the operator ℳ by

$$(\mathcal{M}F)(j)=\int_0^\infty t^{-j}\,dF(t)=\mu_j,\qquad j=0,1,\ldots,$$

and introduce the sequence of operators $\mathcal{M}_\alpha^{-1}$:

$$(\mathcal{M}_\alpha^{-1}\mu)(x)=1-\sum_{k=0}^{\alpha}\frac{(\alpha x)^k}{k!}\sum_{j=k}^{\infty}\frac{(-\alpha x)^{j-k}}{(j-k)!}\,\mu_j,\qquad x\in\mathbb{R}_+.\tag{2.1}$$

Here μ = {μj, j = 0, 1, …} and α → ∞ at a rate to be specified later.

In analysis, the transform (ℳF)(1 − z), where z is a complex variable, is known as the Mellin transform. There is extensive literature investigating the problem of recovering a function from its Mellin transform; see, for instance, Tagliani (2001), Klauder et al. (2001) and Sneddon (1974), among others. In Gzyl and Tagliani (2010) and Mnatsakanov (2008a,b), the problem of recovering the cdf and the corresponding density function given the moment sequence of positive orders of the underlying distribution has been studied. The study of the approximation properties of $\mathcal{M}_\alpha^{-1}$ in (2.1) is beyond the scope of this article and is deferred to a separate work.

To construct the density estimate, let us first approximate F by means of $\mathcal{M}_\alpha^{-1}$. A minor modification of an argument in Mnatsakanov and Ruymgaart (2003) yields

$$F_\alpha=\mathcal{M}_\alpha^{-1}\mathcal{M}F\ \to_w\ F,\qquad\text{as }\alpha\to\infty.\tag{2.2}$$

Here by →w we denote the weak convergence of corresponding cdfs.

Now suppose we are given a sequence X1, …, Xn of independent and identically distributed positive random variables from an absolutely continuous distribution function F (with pdf f = F′). To estimate F, let us first estimate its negative j-th order moments μj, j ≥ 1. Namely, based on (2.2), let us construct the estimate Fα of F by replacing each moment μj in (2.1) by its empirical counterpart

$$\hat\mu_j=\int_0^\infty t^{-j}\,d\hat F_n(t),\qquad j=0,1,\ldots,\qquad\text{with }\hat F_n(t)=\frac1n\sum_{i=1}^{n}I\{X_i\le t\}.$$

Here F̂n is the empirical cdf of the sample X1, …, Xn. After some simple algebra, we derive

$$F_\alpha(x)=1-\frac1n\sum_{i=1}^{n}\sum_{k=0}^{\alpha}\frac{1}{k!}\left(\frac{\alpha x}{X_i}\right)^{k}\exp\left(-\frac{\alpha x}{X_i}\right),\qquad x\in\mathbb{R}_+.$$

To compare Fα with the empirical cdf F̂n, note that Fα(x) ≈ F̂n(x) as long as α is large. This follows from the fact that, for a given Xi and large α,

$$\sum_{k=0}^{\alpha}\frac{1}{k!}\left(\frac{\alpha x}{X_i}\right)^{k}\exp\left(-\frac{\alpha x}{X_i}\right)\approx I\{X_i>x\};$$

indeed, the left-hand side is the probability that a Poisson random variable with mean αx/Xi does not exceed α, and this probability tends to 1 if x < Xi and to 0 if x > Xi.

Note also that Fα(x) is a continuous function of x; hence, to estimate the density f(x) one can take the derivative of Fα(x):

$$\tilde f_\alpha(x)=\frac{1}{n}\sum_{i=1}^{n}\frac{1}{X_i}\cdot\frac{1}{\Gamma(\alpha)}\left(\frac{\alpha x}{X_i}\right)^{\alpha}\exp\left(-\frac{\alpha x}{X_i}\right),\tag{2.3}$$

and choose α = α(n) → ∞ as n → ∞. The problem of the optimal choice of the parameter α will be addressed later. Of course f̃α(x) ≥ 0 for each x > 0, and since it is easily seen that $\int_0^\infty\tilde f_\alpha(x)\,dx=1$, the estimator is itself a probability density. Statements similar to the ones obtained in Sections 3 and 4 are valid for f̃α as well (see, for example, Theorem 3.2). To simplify the calculations below and to reduce the bias of f̃α, let us use a modified version of it. Namely, let us increase the shape parameter of the inverse gamma kernel on the right-hand side of (2.3) by one. Denote $S_{i,x}:=\frac{1}{X_i}L_\alpha\left(\frac{x}{X_i}\right)$, where $L_\alpha(u)=\frac{(\alpha u)^{\alpha+1}}{\Gamma(\alpha+1)}\,e^{-\alpha u}$, u ∈ ℝ+, and consider

$$\hat f_\alpha(x)=\frac1n\sum_{i=1}^{n}\frac{1}{X_i}L_\alpha\left(\frac{x}{X_i}\right)=\frac1n\sum_{i=1}^{n}S_{i,x}.\tag{2.4}$$
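To make the construction concrete, here is a minimal numerical sketch of the cdf estimate Fα and of the vKDEs (2.3) and (2.4), assuming numpy/scipy; the function names are ours, not from the paper.

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import poisson

def F_alpha(x, data, alpha):
    # F_alpha(x) = 1 - (1/n) sum_i P(Poisson(alpha*x/X_i) <= alpha)
    lam = alpha * np.atleast_1d(x)[:, None] / data
    return 1.0 - poisson.cdf(alpha, lam).mean(axis=1)

def vkde(x, data, alpha, modified=True):
    """Varying kernel density estimate at the points x.

    modified=False gives the estimator (2.3); modified=True gives (2.4),
    whose inverse gamma kernel has its shape parameter increased by one.
    """
    a = alpha + 1 if modified else alpha
    u = np.atleast_1d(x)[:, None] / data          # u = x / X_i
    # log of (1/X_i) (alpha*u)^a exp(-alpha*u) / Gamma(a), for stability
    logk = a * np.log(alpha * u) - alpha * u - gammaln(a) - np.log(data)
    return np.exp(logk).mean(axis=1)

# Example: n = 400 observations from Gamma(2, 1), alpha = n^{2/5}.
rng = np.random.default_rng(0)
X = rng.gamma(2.0, 1.0, size=400)
xs = np.linspace(0.05, 8.0, 200)
f_hat = vkde(xs, X, alpha=400 ** 0.4)
```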

Throughout, the proposed estimator will be considered at a fixed point x > 0 where f(x) > 0. Also, we will assume that F(0) = 0 and that the underlying density satisfies

$$f\in C^{(2)}(\mathbb{R}_+),\qquad\text{with }\sup_{t>0}|f''(t)|=M<\infty.\tag{2.5}$$

Besides, let us denote by g(·; ak, bk) the inverse gamma density with shape parameter ak = k(α + 2) − 1 and rate parameter bk = kαx. Namely,

$$g(t;a_k,b_k)=\frac{1}{t^2}\cdot\frac{b_k^{a_k}\,(1/t)^{a_k-1}\,e^{-b_k/t}}{\Gamma(a_k)},\qquad t>0.\tag{2.6}$$

The mean ξk and variance σk² of g(·; ak, bk) have the following expressions, respectively:

$$\xi_k=\frac{b_k}{a_k-1}=\frac{k\alpha x}{k(\alpha+2)-2},\qquad\sigma_k^2=\frac{b_k^2}{(a_k-1)^2\,(a_k-2)}=\frac{k^2\alpha^2x^2}{\{k(\alpha+2)-2\}^2\,\{k(\alpha+2)-3\}}.$$

Note also that the mean of f̂α(x) can be written as a convolution on (ℝ+, dH):

$$f_\alpha(x)=E\hat f_\alpha(x)=\int_0^\infty L_\alpha(x/t)\,f(t)\,dH(t),\qquad x\in\mathbb{R}_+,\tag{2.7}$$

where dH(t) = dt/t. In Lemmas 4.1 and 4.2 (see Section 4) it is proved that the sequence of functions {(1/t)Lα(·/t), t ∈ ℝ+, α ∈ ℕ}, with Lα(·) defined in (2.4), forms a δ-sequence in the L1- and L2-norms as α → ∞.
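As a quick numerical illustration of (2.7) (our own check, not part of the paper), one can verify by quadrature that t ↦ (1/t)Lα(x/t) is a probability density in t that concentrates near x as α grows, so that fα(x) → f(x):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gammaln

def inv_gamma_kernel(t, x, alpha):
    # (1/t) L_alpha(x/t) = g(t; alpha + 1, alpha * x), cf. (2.6)
    u = x / t
    return np.exp((alpha + 1) * np.log(alpha * u) - alpha * u
                  - gammaln(alpha + 1)) / t

f = lambda t: t * np.exp(-t)            # Gamma(2, 1) density as a test case
x = 1.5
for alpha in (10, 50, 250):
    mass, _ = quad(inv_gamma_kernel, 0, np.inf, args=(x, alpha))
    f_alpha_x, _ = quad(lambda t: inv_gamma_kernel(t, x, alpha) * f(t),
                        0, np.inf)
    print(alpha, round(mass, 6), round(f_alpha_x, 6), round(f(x), 6))
# mass stays ~1 while f_alpha_x approaches f(x), illustrating the
# delta-sequence property of Lemmas 4.1 and 4.2.
```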

3. Bias and MSE

Without explicit reference it will be assumed that all the conditions in Section 2 are satisfied. Let us study the bias and the second moment of the estimator f̂α. We have

$$ES_{i,x}^{k}=\int_0^\infty\frac{1}{\{\Gamma(\alpha+1)\}^{k}}\left(\frac{1}{t}\right)^{k}\left(\frac{\alpha x}{t}\right)^{k(\alpha+1)}\exp\left(-\frac{k\alpha x}{t}\right)f(t)\,dt=\int_0^\infty\frac{\{k(\alpha+2)-2\}!\;(\alpha x)^{k(\alpha+1)}}{\{\Gamma(\alpha+1)\}^{k}\,(k\alpha x)^{k(\alpha+2)-1}}\,g(t;a_k,b_k)\,f(t)\,dt$$
$$=\left(\frac{1}{\alpha x}\right)^{k-1}\frac{\{k(\alpha+2)-2\}!}{\{\Gamma(\alpha+1)\}^{k}}\cdot\frac{1}{k^{k(\alpha+2)-1}}\int_0^\infty g(t;a_k,b_k)\,f(t)\,dt.\tag{3.1}$$

In particular, for k = 1,

$$E\hat f_\alpha(x)=ES_{i,x}=\int_0^\infty g(t;a_1,b_1)\,f(t)\,dt=f_\alpha(x).\tag{3.2}$$

This yields, for the bias of f̂α(x),

$$f_\alpha(x)-f(x)=\mathrm{Bias}\{\hat f_\alpha(x)\}=\int_0^\infty g(t;a_1,b_1)\{f(t)-f(x)\}\,dt=\int_0^\infty g(t;a_1,b_1)\left\{(t-x)f'(x)+\frac{(t-x)^2}{2}\,f''(\tilde t)\right\}dt$$
$$=\frac12\int_0^\infty(t-x)^2\,g(t;a_1,b_1)\,f''(x)\,dt+\frac12\int_0^\infty(t-x)^2\,g(t;a_1,b_1)\{f''(\tilde t)-f''(x)\}\,dt=\frac12\cdot\frac{x^2}{\alpha-1}\cdot f''(x)+o\left(\frac{1}{\alpha}\right),\qquad\text{as }\alpha\to\infty.\tag{3.3}$$

Here t̃ lies between t and x. The first-order term vanishes because the mean of g(·; a1, b1) equals ξ1 = x, while $\int_0^\infty(t-x)^2 g(t;a_1,b_1)\,dt=\sigma_1^2=x^2/(\alpha-1)$.

For the variance we have

$$\mathrm{Var}\{\hat f_\alpha(x)\}=\frac1n\,\mathrm{Var}\,S_{i,x}=\frac1n\left\{ES_{i,x}^2-f_\alpha^2(x)\right\}.\tag{3.4}$$

Applying (3.1) for k = 2 and using $B_\alpha=\alpha^{-1}\,2^{-(2\alpha+3)}\,\Gamma(2\alpha+3)\,[\Gamma(\alpha+1)]^{-2}\sim\alpha^{1/2}/(2\sqrt{\pi})$, as α → ∞, yields

$$ES_{i,x}^2=\frac{1}{\alpha x}\cdot\frac{\Gamma(2\alpha+3)}{\{\Gamma(\alpha+1)\}^2}\cdot\frac{1}{2^{2(\alpha+2)-1}}\int_0^\infty g(t;a_2,b_2)f(t)\,dt=\frac{B_\alpha}{x}\int_0^\infty g(t;a_2,b_2)f(t)\,dt$$
$$\sim\frac{1}{\alpha x\sqrt{2\pi}}\cdot\frac{e^{-2(\alpha+1)}\{2(\alpha+1)\}^{2(\alpha+1)+1/2}}{e^{-2\alpha}\,\alpha^{2\alpha+1}}\cdot\frac{1}{2^{2\alpha+3}}\int_0^\infty g(t;a_2,b_2)f(t)\,dt\sim\frac{1}{\alpha x}\cdot\frac{\alpha^{3/2}}{2\sqrt{\pi}}\int_0^\infty g(t;a_2,b_2)f(t)\,dt$$
$$=\frac{\sqrt{\alpha}}{2\sqrt{\pi}\,x}\,\{f(x)+o(1)\}=\frac{\sqrt{\alpha}}{2\sqrt{\pi}}\cdot\frac{f(x)}{x}+o(\sqrt{\alpha}).\tag{3.5}$$

Inserting (3.3) and (3.5) in (3.4) we obtain

$$\mathrm{Var}\{\hat f_\alpha(x)\}=\frac1n\left[\frac{\sqrt{\alpha}}{2\sqrt{\pi}\,x}\,f(x)+o(\sqrt{\alpha})-\left\{f(x)+O\left(\frac1\alpha\right)\right\}^2\right]=\frac{\sqrt{\alpha}}{2n\sqrt{\pi}}\cdot\frac{f(x)}{x}+o\left(\frac{\sqrt{\alpha}}{n}\right).\tag{3.6}$$

Finally, combining (3.3) and (3.6) leads to the MSE of f̂α(x):

$$\mathrm{MSE}\{\hat f_\alpha(x)\}=\mathrm{Var}\{\hat f_\alpha(x)\}+\mathrm{Bias}^2\{\hat f_\alpha(x)\}=\frac{\sqrt{\alpha}}{2n\sqrt{\pi}}\cdot\frac{f(x)}{x}+\frac{x^4}{4(\alpha-1)^2}\,\{f''(x)\}^2+o\left(\frac{\sqrt{\alpha}}{n}\right)+o\left(\frac{1}{\alpha^2}\right).\tag{3.7}$$

For optimal rates we may take

$$\alpha=\alpha(n)=n^{2/5}.\tag{3.8}$$

Substituting (3.8) into (3.7), we find

$$\mathrm{MSE}\{\hat f_\alpha(x)\}=n^{-4/5}\left[\frac{f(x)}{2x\sqrt{\pi}}+\frac{x^4\{f''(x)\}^2}{4}\right]+o\left(n^{-4/5}\right).\tag{3.9}$$

Here we have assumed that the pdf f has a continuous and bounded second derivative f″ (condition (2.5)). The following statement is valid.

Theorem 3.1

Under the assumption (2.5), the bias of f̂α(x) satisfies

$$\mathrm{Bias}\{\hat f_\alpha(x)\}=\frac{x^2\,f''(x)}{2(\alpha-1)}+o\left(\frac1\alpha\right),\qquad\text{as }\alpha\to\infty\text{ and }n\to\infty.$$

For the MSE of f̂α(x) we have the expression in (3.9), provided that we choose α = α(n) ~ n^{2/5}.

One can check easily that the variance of the vKDE f̃α defined in (2.3) has the same form as the one on the right-hand side of (3.6), while the bias of f̃α has an additional term containing f′. Applying an argument similar to the one used in the derivations of (3.3), (3.5) and (3.6), we obtain the following statement.

Theorem 3.2

Under the assumption (2.5), the bias and MSE of f̃α(x) have the following expressions:

$$\mathrm{Bias}\{\tilde f_\alpha(x)\}=\frac{x\,f'(x)}{\alpha-1}+\frac{x^2\,f''(x)}{2}\cdot\frac{\alpha^2}{(\alpha-1)^2(\alpha-2)}+o\left(\frac1\alpha\right),$$
$$\mathrm{MSE}\{\tilde f_\alpha(x)\}=\frac{\sqrt{\alpha}}{2n\sqrt{\pi}}\cdot\frac{f(x)}{x}+\frac{x^2\{f'(x)\}^2}{(\alpha-1)^2}+\frac{x^4\{f''(x)\}^2\,\alpha^4}{4(\alpha-1)^4(\alpha-2)^2}+o\left(\frac{\sqrt{\alpha}}{n}\right)+o\left(\frac{1}{\alpha^2}\right),$$

as α and n → ∞. For the optimal MSE of f̃α(x) we have

$$\mathrm{MSE}\{\tilde f_\alpha(x)\}=n^{-4/5}\left[\frac{f(x)}{2x\sqrt{\pi}}+x^2\{f'(x)\}^2+\frac{x^4\{f''(x)\}^2}{4}\right]+o\left(n^{-4/5}\right),$$

provided that we choose α = α(n) ~ n^{2/5}.
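The pointwise expansions above are easy to probe numerically. The following Monte Carlo sketch (our own check assuming numpy/scipy, not code from the paper) compares the empirical MSE of f̂α(1) for Gamma(2, 1) samples with the leading term of (3.9):

```python
import numpy as np
from scipy.special import gammaln

def f_hat(x, data, alpha):             # the estimator (2.4) at a single point
    u = x / data
    logk = (alpha + 1) * np.log(alpha * u) - alpha * u \
           - gammaln(alpha + 1) - np.log(data)
    return np.exp(logk).mean()

rng = np.random.default_rng(1)
n, x, reps = 800, 1.0, 2000
alpha = n ** 0.4                       # alpha = n^{2/5}, cf. (3.8)
f_x = x * np.exp(-x)                   # Gamma(2,1): f(1)
fpp_x = (x - 2.0) * np.exp(-x)         # f''(1)
est = np.array([f_hat(x, rng.gamma(2.0, 1.0, n), alpha) for _ in range(reps)])
mse_mc = np.mean((est - f_x) ** 2)
mse_th = n ** (-0.8) * (f_x / (2 * x * np.sqrt(np.pi)) + x**4 * fpp_x**2 / 4)
print(mse_mc, mse_th)                  # should agree in order of magnitude
```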

4. MISE and L1-consistency of f̂α

4.1. MISE rate of convergence

Throughout this section, F again concentrates mass 1 on (0, ∞), but it is also supposed to have a sufficiently smooth density. Let us consider the following conditions:

$$\int_0^\infty\frac{f(x)}{x}\,dx=C_0<\infty\tag{4.1}$$

and

$$\int_0^\infty\{x^2\,f''(x)\}^2\,dx=C_1<\infty.\tag{4.2}$$

One can easily obtain the optimal rate n^{−4/5} for MISE{f̂α}, as α, n → ∞, by integrating the terms on the right-hand side of (3.7). Namely, the following statement is true.

Theorem 4.1

Under the assumptions (2.5), (4.1) and (4.2) we have

$$\mathrm{MISE}\{\hat f_\alpha\}=\int_0^\infty\mathrm{Var}\{\hat f_\alpha(x)\}\,dx+\int_0^\infty\mathrm{Bias}^2\{\hat f_\alpha(x)\}\,dx\sim\frac{C_0\sqrt{\alpha}}{2n\sqrt{\pi}}+\frac{C_1}{4\alpha^2},$$

as α, n → ∞. For the optimal MISE we have

$$\mathrm{MISE}\{\hat f_\alpha\}\sim n^{-4/5}\cdot\frac54\left(\frac{C_0}{2\sqrt{\pi}}\right)^{4/5}C_1^{1/5},\qquad\text{as }\alpha,n\to\infty,$$

provided that we choose $\alpha=\alpha(n)=n^{2/5}\left(2C_1\sqrt{\pi}/C_0\right)^{2/5}$.
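As a quick illustration of the plug-in choice of α (our own computation, not from the paper), take the Gamma(2, 1) density f(x) = xe^{−x}, so that f″(x) = (x − 2)e^{−x}. Then

$$C_0=\int_0^\infty e^{-x}\,dx=1,\qquad C_1=\int_0^\infty x^4(x-2)^2e^{-2x}\,dx=\frac{6!}{2^7}-\frac{4\cdot5!}{2^6}+\frac{4\cdot4!}{2^5}=\frac98,$$

so that $\alpha(n)=n^{2/5}\left(2C_1\sqrt{\pi}/C_0\right)^{2/5}=(9\sqrt{\pi}/4)^{2/5}\,n^{2/5}\approx1.74\,n^{2/5}$, i.e. α ≈ 25 at n = 800, close to the cross-validated value αcv = 24 reported in Table 1.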

One can weaken the conditions on f and show that the corresponding rate is n^{−2/3} under the requirement of integrability of {xf′(x)}². Indeed, let us denote again $B_\alpha=\alpha^{-1}\,2^{-(2\alpha+3)}\,\Gamma(2\alpha+3)\,[\Gamma(\alpha+1)]^{-2}$ and consider the following condition (instead of (4.2)):

$$\int_0^\infty\{x\,f'(x)\}^2\,dx=C_2<\infty.\tag{4.3}$$

Consider the L1- and L2-norms of a function ϕ : ℝ+ → ℝ by

$$\|\phi\|_{L_1}=\int_0^\infty|\phi(x)|\,dx,\qquad\|\phi\|_{L_2}=\left\{\int_0^\infty\phi(x)^2\,dx\right\}^{1/2},$$

respectively.

Lemma 4.1

If f′ is bounded and condition (4.3) is satisfied, then

$$\|f_\alpha-f\|_{L_2}\le\frac{1}{\alpha}\sqrt{C_2(\alpha+1)}.$$

Proof of Lemma 4.1.

Let us denote by ηα the r.v. with pdf Lα(t)/t, t ∈ ℝ+. Note also that the r.v. x/ηα has pdf Lα(x/t)/t, t ∈ ℝ+, and

$$\int_0^\infty L_\alpha(x/s)\,\frac{ds}{s}=1,\qquad E\left[\frac{1}{\eta_\alpha}\right]=1,\qquad\mathrm{Var}\left[\frac{1}{\eta_\alpha}\right]=\frac{1}{\alpha-1},\qquad f(x/\eta_\alpha)-f(x)=\int_x^{x/\eta_\alpha}f'(y)\,dy.\tag{4.4}$$

Then, after simple algebra combined with an application of the Cauchy–Schwarz inequality, we obtain

$$\|f_\alpha-f\|_{L_2}^2=\int_0^\infty\mathrm{Bias}^2\{\hat f_\alpha(x)\}\,dx=\int_0^\infty\left[\int_0^\infty\{f(s)-f(x)\}\,L_\alpha(x/s)\,\frac{ds}{s}\right]^2dx=\int_0^\infty\left[E\{f(x/\eta_\alpha)-f(x)\}\right]^2dx$$
$$=\int_0^\infty\left[E\int_x^{x/\eta_\alpha}f'(s)\,ds\right]^2dx\le E\int_0^\infty\left[\left|\int_x^{x/\eta_\alpha}\{f'(s)\}^2\,ds\right|\cdot x\left|\eta_\alpha^{-1}-1\right|\right]dx$$
$$=E\left\{I[\eta_\alpha\le1]\int_0^\infty\{f'(s)\}^2\left[\int_{s\eta_\alpha}^{s}x\,(\eta_\alpha^{-1}-1)\,dx\right]ds+I[\eta_\alpha\ge1]\int_0^\infty\{f'(s)\}^2\left[\int_{s}^{s\eta_\alpha}x\,(1-\eta_\alpha^{-1})\,dx\right]ds\right\}$$
$$=E\left\{I[\eta_\alpha\le1]\int_0^\infty\{f'(s)\}^2\,\frac{s^2}{2}\,(1-\eta_\alpha^2)(\eta_\alpha^{-1}-1)\,ds+I[\eta_\alpha\ge1]\int_0^\infty\{f'(s)\}^2\,\frac{s^2}{2}\,(\eta_\alpha^2-1)(1-\eta_\alpha^{-1})\,ds\right\}$$
$$=\frac12\,E\left(\frac{(\eta_\alpha-1)^2(\eta_\alpha+1)}{\eta_\alpha}\right)\int_0^\infty\{s\,f'(s)\}^2\,ds.\tag{4.5}$$

But

$$E\left(\frac{(\eta_\alpha-1)^2(\eta_\alpha+1)}{\eta_\alpha}\right)=\int_0^\infty\frac{(u-1)^2(u+1)}{u}\,L_\alpha(u)\,\frac{du}{u}=\int_0^\infty(u-1)^2(u+1)\cdot\frac{\alpha^{\alpha+1}\,u^{\alpha-1}}{\Gamma(\alpha+1)}\,e^{-\alpha u}\,du=\frac{2(\alpha+1)}{\alpha^2}.\tag{4.6}$$

Combination of (4.5) and (4.6) gives

$$\|f_\alpha-f\|_{L_2}^2=\int_0^\infty\mathrm{Bias}^2\{\hat f_\alpha(x)\}\,dx\le\frac{\alpha+1}{\alpha^2}\int_0^\infty\{s\,f'(s)\}^2\,ds.\tag{4.7}$$

Lemma 4.1 is proved.
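The closed form (4.6) is easy to confirm symbolically (our own check, assuming sympy):

```python
import sympy as sp

u, a = sp.symbols('u alpha', positive=True)
# pdf of eta_alpha: L_alpha(u)/u = alpha^(alpha+1) u^alpha e^(-alpha*u)/Gamma(alpha+1)
pdf = a**(a + 1) * u**a * sp.exp(-a * u) / sp.gamma(a + 1)
expect = sp.integrate(sp.expand((u - 1)**2 * (u + 1) / u * pdf), (u, 0, sp.oo))
print(sp.simplify(expect - 2 * (a + 1) / a**2))   # expected output: 0
```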

Theorem 4.2

If f′ is bounded and the conditions (4.1) and (4.3) are satisfied, then

$$\mathrm{MISE}\{\hat f_\alpha\}\le\frac{B_\alpha C_0}{n}+\frac{C_2}{\alpha}+\frac{C_2}{\alpha^2},\qquad\alpha>1,$$
$$\mathrm{MISE}\{\hat f_\alpha\}\le\frac{C_0\sqrt{\alpha}}{2n\sqrt{\pi}}+\frac{C_2}{\alpha}+o\left(\frac1\alpha\right),$$

as α, n → ∞. For the optimal MISE we have

$$\mathrm{MISE}\{\hat f_\alpha\}\le n^{-2/3}\cdot\frac{3}{2^{2/3}}\left(\frac{C_0}{2\sqrt{\pi}}\right)^{2/3}C_2^{1/3}+o\left(n^{-2/3}\right),\qquad\text{as }n\to\infty,$$

provided that we choose $\alpha=\alpha(n)=n^{2/3}\left(4C_2\sqrt{\pi}/C_0\right)^{2/3}$.

Proof

Let us study the variance term. According to the definitions of the inverse gamma density g(·; ak, bk) in (2.6) and of the gamma density h(·; shape, rate), we have

$$\int_0^\infty\frac1x\,g(t;a_2,b_2)\,dx=\frac1t\int_0^\infty h(x;2\alpha+3,2\alpha/t)\,dx=\frac1t.$$

Thus, integrating both sides of the first equation in (3.5), combined with $B_\alpha\sim\alpha^{1/2}/(2\sqrt{\pi})$ as α → ∞, yields

$$\frac1n\int_0^\infty ES_{i,x}^2\,dx=\frac{B_\alpha}{n}\int_0^\infty\frac1x\left[\int_0^\infty g(t;a_2,b_2)f(t)\,dt\right]dx=\frac{B_\alpha}{n}\int_0^\infty f(t)\left[\int_0^\infty\frac1x\,g(t;a_2,b_2)\,dx\right]dt=\frac{B_\alpha}{n}\int_0^\infty\frac{f(t)}{t}\,dt\sim\frac{\sqrt{\alpha}}{2n\sqrt{\pi}}\int_0^\infty\frac{f(t)}{t}\,dt,\qquad\text{as }\alpha\to\infty.\tag{4.8}$$

Hence, we have proved that

$$\int_0^\infty\mathrm{Var}\{\hat f_\alpha(x)\}\,dx\le\int_0^\infty\frac1n\,ES_{i,x}^2\,dx\sim\frac{\sqrt{\alpha}}{2n\sqrt{\pi}}\int_0^\infty\frac{f(t)}{t}\,dt,\tag{4.9}$$

as n, α → ∞. Finally, from (4.7)–(4.9) we obtain the statements of Theorem 4.2.

4.2. L1-consistency

In this subsection let us consider the condition

$$\int_0^\infty x^2\,|f''(x)|\,dx=C_3<\infty.\tag{4.10}$$

Consider the L1-distance ‖fα − f‖L1 between fα and f (with respect to the Lebesgue measure λ on ℝ+). Here fα(x) = Ef̂α(x) = Ef(x/ηα), with ηα defined in the proof of Lemma 4.1. One can show that the functions {(1/t)Lα(·/t), t > 0} form a δ-sequence in the L1-norm as well, as α → ∞. Namely, the following statement is true.

Lemma 4.2

If f″ is bounded and the condition (4.10) is satisfied, then

$$\|f_\alpha-f\|_{L_1}\le C_3\left(\frac1\alpha+\frac{1}{\alpha^2}\right).$$

Proof.

Combination of (4.4), (4.10) and the following equations

$$\int_0^\infty L_\alpha(x/s)\,\frac{ds}{s}=1,\qquad E\left[x\left(\eta_\alpha^{-1}-1\right)\right]=0,\qquad f(x/\eta_\alpha)-f(x)=f'(x)\left(\frac{x}{\eta_\alpha}-x\right)+\int_x^{x/\eta_\alpha}ds\int_x^{s}f''(y)\,dy,$$

gives

$$\|f_\alpha-f\|_{L_1}=\int_0^\infty\left|\int_0^\infty\{f(s)-f(x)\}\,L_\alpha(x/s)\,\frac{ds}{s}\right|dx=\int_0^\infty\left|E\{f(x/\eta_\alpha)-f(x)\}\right|dx=\int_0^\infty\left|E\int_x^{x/\eta_\alpha}ds\int_x^{s}f''(y)\,dy\right|dx$$
$$\le\int_0^\infty\left[E\,I[\eta_\alpha<1]\left(\frac{x}{\eta_\alpha}-x\right)\int_x^{x/\eta_\alpha}|f''(y)|\,dy+E\,I[\eta_\alpha>1]\left(x-\frac{x}{\eta_\alpha}\right)\int_{x/\eta_\alpha}^{x}|f''(y)|\,dy\right]dx.\tag{4.11}$$

Now, in a similar way as in (4.5) and (4.6), changing the order of integration in (4.11) yields

$$\|f_\alpha-f\|_{L_1}\le\frac12\int_0^\infty y^2\,|f''(y)|\,dy\;E\left(\frac{(\eta_\alpha-1)^2(\eta_\alpha+1)}{\eta_\alpha}\right)=\left(\frac1\alpha+\frac{1}{\alpha^2}\right)\int_0^\infty y^2\,|f''(y)|\,dy.$$

Lemma 4.2 is proved.

Theorem 4.3

If f″ is bounded and the conditions (4.1) and (4.10) are satisfied, then

$$E\|\hat f_\alpha-f\|_{L_1}=E\int_0^\infty\left|\hat f_\alpha(x)-f(x)\right|dx\to0,\qquad\text{as }\sqrt{\alpha}/n\to0,\ \alpha,n\to\infty.\tag{4.12}$$

Proof.

Under assumption (4.10) we have from Lemma 4.2 that ‖fα − f‖L1 → 0 as α → ∞. Hence, to prove (4.12) it is sufficient to show that

$$F\{A_n(\delta)\}=F\left\{x:\int_0^\infty\frac{1}{t^2}\,L_\alpha^2(x/t)\,f(t)\,dt\ge n\delta\right\}\to0,\tag{4.13}$$

for any δ > 0, as α, n → ∞ (see Theorem 1 in Mnatsakanov and Khmaladze (1981)). But F is an absolutely continuous distribution with respect to the Lebesgue measure λ, so it suffices to establish λ{An(δ)} → 0 for any δ > 0, as α, n → ∞. Indeed, an application of (4.8) yields

$$\lambda\{A_n(\delta)\}\le\frac{1}{n\delta}\int_{A_n(\delta)}\left[\int_0^\infty\frac{1}{t^2}\,L_\alpha^2(x/t)\,f(t)\,dt\right]dx\le\frac{1}{n\delta}\int_0^\infty ES_{i,x}^2\,dx=\frac{B_\alpha}{n\delta}\int_0^\infty\frac{f(t)}{t}\,dt\sim\frac{\sqrt{\alpha}}{2n\delta\sqrt{\pi}}\int_0^\infty\frac{f(t)}{t}\,dt,\qquad\text{as }\alpha\to\infty.$$

The proof of Theorem 4.3 now follows from (4.1), (4.13), and √α/n → 0.

Remark 4.1

Taking α = h^{−2}, one can see that the condition √α/n → 0 from Theorem 4.3 corresponds to the condition nh → ∞ in traditional kernel density estimation.

5. Simulations

In this section we study the performances of f̃α and f̂α, defined in (2.3) and (2.4), respectively. In particular, we compare them with the KDE f̂h when the kernel function K is the standard normal density. We consider the case when the optimal choice of h, h = hcv, is based on the least-squares cross-validation (CV) algorithm that minimizes the expression M1(h) defined by Eq. (3.39) in Silverman (1986).

In our simulation studies we plotted the curves of the vKDEs f̂α and f̃α, where the optimal α = αcv and, respectively, α̃ = α̃cv are chosen via the least-squares CV algorithm as well (cf. Mnatsakanov and Ruymgaart (2012)), and compared them with the corresponding curve of the KDE f̂h with h = hcv (see Figs. 1 and 2). In particular, we simulated the r.v.'s Xi, i = 1, …, n, from two different distributions, Log-normal(0, 1) and Gamma(2, 1), with sample sizes n = 200k, 1 ≤ k ≤ 4. In addition, we repeated these simulations N = 500 times and studied the performances of f̂α, f̃α, and f̂h using the MISE. Namely, we used the estimated MISE:

$$\widehat{\mathrm{MISE}}:=\hat E(\mathrm{ISE})\{\hat f\}=\frac1N\sum_{j=1}^{N}\int_0^\infty\left|\hat f^{(j)}(x)-f(x)\right|^2 dx.$$

Here the expectation Ê is calculated with respect to the empirical cdf of the N = 500 values of the ISE, while f̂(j) denotes the vKDE or KDE obtained in the j-th replication. The optimal α = αcv minimizes the expression M2(α), i.e.

$$\alpha_{cv}=\arg\min_{\alpha}M_2(\alpha)=\arg\min_{\alpha}\left[\int_0^\infty\left[\hat f_\alpha(x)\right]^2 dx-2\int_0^\infty\hat f_\alpha(x)\,d\hat F_n(x)\right],\tag{5.1}$$

where α ∈ {1, …, 40} for each n = 200k, 1 ≤ k ≤ 4. In the second term on the right-hand side of (5.1) we apply the leave-one-out construction instead of f̂α. This yields the following expression for M2(α):

$$M_2(\alpha)=\frac{\Gamma(2\alpha+3)}{n^2\,\alpha\,\Gamma^2(\alpha+1)}\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{(X_iX_j)^{\alpha+1}}{(X_i+X_j)^{2\alpha+3}}-\frac{2}{n(n-1)\,\Gamma(\alpha+1)}\sum_{i=1}^{n}\sum_{j\ne i}\frac{1}{X_j}\left(\frac{\alpha X_i}{X_j}\right)^{\alpha+1}e^{-\alpha X_i/X_j}.$$

In the case of the vKDE f̃α, we choose the optimal CV parameter α̃ = α̃cv that minimizes the function

$$M_3(\alpha)=\frac{\Gamma(2\alpha+1)}{n^2\,\alpha\,\Gamma^2(\alpha)}\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{(X_iX_j)^{\alpha}}{(X_i+X_j)^{2\alpha+1}}-\frac{2}{n(n-1)\,\Gamma(\alpha)}\sum_{i=1}^{n}\sum_{j\ne i}\frac{1}{X_j}\left(\frac{\alpha X_i}{X_j}\right)^{\alpha}e^{-\alpha X_i/X_j}.$$
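For completeness, here is a vectorized transcription of the criteria M2 and M3 above (our own sketch assuming numpy/scipy; not the authors' code):

```python
import numpy as np
from scipy.special import gammaln

def cv_score(data, alpha, modified=True):
    """M2(alpha) for f-hat (modified=True) or M3(alpha) for f-tilde."""
    X = np.asarray(data, dtype=float)
    n = len(X)
    p = alpha + 1 if modified else alpha      # shape exponent of the kernel
    Xi, Xj = X[:, None], X[None, :]
    # First term: closed form of the integral of the squared estimate.
    logA = (gammaln(2 * p + 1) - np.log(alpha) - 2 * gammaln(p)
            + p * np.log(Xi * Xj) - (2 * p + 1) * np.log(Xi + Xj))
    term1 = np.exp(logA).sum() / n**2
    # Second term: leave-one-out estimate of 2 * int f-hat dF_n.
    logB = (p * np.log(alpha * Xi / Xj) - alpha * Xi / Xj
            - gammaln(p) - np.log(Xj))
    logB = logB + np.where(np.eye(n, dtype=bool), -np.inf, 0.0)  # drop j == i
    term2 = 2.0 * np.exp(logB).sum() / (n * (n - 1))
    return term1 - term2

def choose_alpha(data, grid=range(1, 41), modified=True):
    # Grid search over alpha in {1, ..., 40}, as in the simulations.
    return min(grid, key=lambda a: cv_score(data, a, modified))
```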

During the simulation study we found that the $\widehat{\mathrm{MISE}}$s of the vKDEs are decreasing functions of n when the parameters α = αcv, α̃ = α̃cv, and α = n^{2/5} are used. In Table 1 we record the values of αcv, α̃cv, and hcv and the corresponding $\widehat{\mathrm{MISE}}$ for the Log-normal(0, 1) and Gamma(2, 1) distributions for four different sample sizes. We see that the values of $\widehat{\mathrm{MISE}}$ for f̃α are smaller than the corresponding values for f̂α and f̂h. To illustrate the performances of the vKDEs graphically, we plot the graphs of the estimators f̂αcv (the dashed curves), with αcv = 11 and 24, and f̂h (the dotted curve), with h = hcv, in Figs. 1(a) and 2(a), where the sampled distributions are Log-normal(0, 1) (with n = 200) and Gamma(2, 1) (with n = 800), respectively. For the same samples, in Figs. 1(b) and 2(b) we plot the graphs of the estimators f̃α̃cv (the dashed curve) and f̂h, with α̃cv = 7 and 18 and h = hcv, respectively. In each model the sampled pdf f (the solid curve) is plotted as well. Based on the records in Table 1, we conclude that the performances of the vKDEs are better than that of the KDE f̂hcv. After conducting many simulations we can say that the asymptotic behaviors of f̃α̃cv and its modified version f̂αcv are similar to each other, and their performances around the origin and on the right tail are much better than those of the KDE f̂hcv. For small sample sizes we suggest using f̂αcv instead of f̂hcv and f̃α̃cv.

Acknowledgments

The authors are thankful to Estate Khmaladze and Cecil Burchfiel for helpful discussions, and to the referee and the associate editor for their suggestions that led to a better presentation. The research was supported by NSF grant DMS-0906639. The findings and conclusions in this paper are those of the authors and do not necessarily represent the views of the National Institute for Occupational Safety and Health.

References

Chen, S.X., 2000. Probability density function estimation using gamma kernels. Ann. Inst. Statist. Math. 52, 471–480.
Gzyl, H., Tagliani, A., 2010. Hausdorff moment problem and fractional moments. Appl. Math. Comput. 216, 3319–3328.
Klauder, J.R., Penson, K.A., Sixdeniers, J.M., 2001. Constructing coherent states through solutions of Stieltjes and Hausdorff moment problems. Phys. Rev. A 64, 013817.
Mnatsakanov, R.M., 2008a. Hausdorff moment problem: reconstruction of distributions. Statist. Probab. Lett. 78, 1612–1618.
Mnatsakanov, R.M., 2008b. Hausdorff moment problem: reconstruction of probability density functions. Statist. Probab. Lett. 78, 1869–1877.
Mnatsakanov, R.M., Khmaladze, E.V., 1981. On L1-convergence of statistical kernel estimators of distribution densities. Soviet Math. Dokl. 23, 633–636.
Mnatsakanov, R.M., Ruymgaart, F.H., 2003. Some properties of moment-empirical cdf's with application to some inverse estimation problems. Math. Methods Statist. 12, 478–495.
Mnatsakanov, R.M., Ruymgaart, F.H., 2012. Moment-density estimation for positive random variables. Statistics 46, 215–230.
Parzen, E., 1962. On estimation of a probability density function and mode. Ann. Math. Statist. 33, 1065–1076.
Scaillet, O., 2004. Density estimation using inverse and reciprocal inverse Gaussian kernels. Nonparametric Stat. 16, 217–226.
Scott, D.W., 1992. Multivariate Density Estimation. John Wiley and Sons, New York.
Silverman, B., 1986. Density Estimation for Statistics and Data Analysis. Chapman and Hall, New York.
Sneddon, I.N., 1974. The Use of Integral Transforms. McGraw-Hill, New York.
Stoyanov, J., 2000. Krein condition in probabilistic moment problems. Bernoulli 6, 939–949.
Tagliani, A., 2001. Recovering a probability density function from its Mellin transform. Appl. Math. Comput. 118, 151–159.

Fig. 1. Estimation of the Log-normal(0, 1) density function f (solid curve) by f̂hcv with hcv = 0.14 and by (a) f̂αcv with αcv = 11; (b) f̃α̃cv with α̃cv = 7. In both plots n = 200.

Fig. 2. Estimation of the Gamma(2, 1) density function f (solid curve) by f̂hcv with hcv = 0.19 and by (a) f̂αcv with αcv = 24; (b) f̃α̃cv with α̃cv = 18. In both plots n = 800.

Table 1. The values of αcv, α̃cv, and hcv and the corresponding $\widehat{\mathrm{MISE}}$s of the vKDEs and the KDE.

n     Log-normal (0, 1)                            Gamma (2, 1)
      f̂α [αcv]     f̃α [α̃cv]     f̂h [hcv]          f̂α [αcv]     f̃α [α̃cv]     f̂h [hcv]
200   0.0092 [11]   0.0066 [7]    0.0166 [0.14]     0.0060 [14]   0.0046 [10]   0.0080 [0.30]
400   0.0057 [14]   0.0043 [9]    0.0103 [0.11]     0.0033 [18]   0.0026 [14]   0.0045 [0.25]
600   0.0039 [17]   0.0030 [11]   0.0075 [0.10]     0.0020 [22]   0.0016 [16]   0.0030 [0.22]
800   0.0029 [19]   0.0022 [12]   0.0059 [0.09]     0.0018 [24]   0.0015 [18]   0.0026 [0.19]