Understanding a phenomenon from observed data requires contextual and efficient statistical models. Such models are based on probability distributions whose statistical properties are flexible enough to adapt to a wide range of situations. Modern examples include the distributions of the truncated Fréchet generated family. In this paper, we go further by introducing a more general family, based on a truncated version of the generalized Fréchet distribution. This generalization involves a new shape parameter that strongly modulates some central and dispersion parameters, as well as the skewness and the weight of the tails. We investigate the main functions of the new family, the stress-strength parameter, diverse functional series expansions, incomplete moments, various entropy measures, the theory and practice of parameter estimation, and bivariate extensions through the use of copulas. By considering a special member of the family having the Weibull distribution as the parent, we fit two data sets of interest, one about waiting times and the other about precipitation. Solid statistical criteria attest that the proposed model is superior to other extended Weibull models, including the one derived from the former truncated Fréchet generated family.
Keywords: Truncated distribution; general family of distributions; incomplete moments; entropy; copula; data analysis
Introduction
Determining the underlying distribution of data is a crucial topic in many applied fields, such as medicine, reliability, finance, economics, engineering and environmental sciences. Among the possible approaches, one can define general families of continuous distributions from well-established parental distributions, having enough interesting properties to offer statistical models that adapt to all possible situations. The constructions of such families are based on specific mathematical techniques which may depend on one or several tunable parameters. For an overview on classic families of distributions and the associated techniques, we refer the reader to the surveys of [1–3].
In recent studies, the composition-truncation technique by [4] has been used to develop families of distributions achieving the goals of simplicity and efficiency. Among them, there are the truncated exponential-G family by [5], truncated Fréchet-G family by [6], truncated inverted Kumaraswamy-G family by [7], truncated Weibull-G family by [8], truncated Cauchy power-G family by [9], truncated Burr-G family by [10], type II truncated Fréchet-G family by [11], truncated log-logistic-G family by [12], right truncated T-X family by [13] and truncated Lomax-G family by [14]. The functions defining these families have the advantages of being simple, with a reasonable number of parameters, and having original monotonic and non-monotonic forms, which makes them attractive for statistical applications.
In particular, the truncated Fréchet-G family innovates in the following aspects: (i) its functions are quite manageable, with a corresponding cumulative distribution function (CDF) having a simple exponential expression, (ii) it has a reasonable number of parameters: two plus those of the parental distribution, and (iii) it provides distributions with original monotonic and non-monotonic shapes, as shown in [6] with the gamma distribution as the parent. The combination of these qualities distinguishes this family from the others and makes it attractive for statistical purposes. However, the price of this simplicity is that the flexibility of these distributions depends strongly on the choice of the parental distribution. Moreover, to our knowledge, only the special distribution based on the gamma distribution has been explored in detail.
In this paper, we take one more step in this direction, by proposing a generalization of the truncated Fréchet-G family. It is also based on the composition-truncation technique, but uses a truncated version of the generalized Fréchet (GFr) distribution instead of the Fréchet distribution. First, the GFr distribution is defined by the following CDF:
$$F_{GFr}(x;\alpha,\beta,\lambda)=1-\left(1-e^{-\alpha x^{-\lambda}}\right)^{\beta},\quad x>0, \tag{1}$$
where $\alpha,\beta,\lambda>0$ (and $F_{GFr}(x;\alpha,\beta,\lambda)=0$ otherwise). This distribution is also known as the exponentiated Fréchet distribution and the exponentiated Gumbel type-2 distribution, pioneered by [15,16]. As a key property, the GFr distribution is connected with the famous exponentiated exponential (EE) distribution introduced by [17] in the following sense: if $X$ denotes a random variable (RV) following the GFr distribution with parameters $\alpha$, $\beta$ and $\lambda$, then $X^{-\lambda}$ follows the EE distribution with parameters $\alpha$ and $\beta$. The GFr distribution contains the former Fréchet distribution, obtained by taking $\beta=1$. Also, it is proved in [15,16] that the parameter $\beta$ makes the GFr model markedly more pliant than the former Fréchet model. This has motivated the study of some of its extensions, such as the successful one proposed in [18]. Here, we exploit the features of the GFr distribution to define a new general family of distributions. Following the spirit of [4], we first derive the truncated generalized Fréchet distribution over the interval $(0,1)$, specified by the following CDF:
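$$F_{TGFr}(x;\alpha,\beta,\lambda)=\frac{F_{GFr}(x;\alpha,\beta,\lambda)}{F_{GFr}(1;\alpha,\beta,\lambda)},\quad x\in(0,1),$$
that is
$$F_{TGFr}(x;\alpha,\beta,\lambda)=\frac{1-\left(1-e^{-\alpha x^{-\lambda}}\right)^{\beta}}{1-\left(1-e^{-\alpha}\right)^{\beta}},\quad x\in(0,1). \tag{2}$$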
We complete this definition by assuming that FTGFr(x;α,β,λ)=0 for x≤0 and FTGFr(x;α,β,λ)=1 for x≥1. As far as we know, this truncated distribution is unlisted in the literature, and can be of independent interest. Here, we use it to define the truncated generalized Fréchet generated (TGFr-G) family of (continuous) distributions by considering the CDF obtained as
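$$F_{TGFr-G}(x;\psi)=F_{TGFr}(G(x;\eta);\alpha,\beta,1),\quad x\in\mathbb{R},$$
that is
$$F_{TGFr-G}(x;\psi)=\frac{1}{1-\left(1-e^{-\alpha}\right)^{\beta}}\left[1-\left(1-e^{-\alpha G(x;\eta)^{-1}}\right)^{\beta}\right],\quad x\in\mathbb{R}, \tag{3}$$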
where $G(x;\eta)$ denotes the CDF of a parent (continuous) distribution and $\psi=(\alpha,\beta,\eta)$. Note that we have put $\lambda=1$ in the definition of (2) to avoid over-parameterization; if necessary, one may easily re-introduce it by replacing $G(x;\eta)$ by $G(x;\eta,\lambda)=H(x;\eta)^{\lambda}$, where $H(x;\eta)$ is a continuous CDF. One can observe that the TGFr-G and truncated Fréchet-G families coincide for $\beta=1$. The main innovation of the TGFr-G family lies in the shape parameter $\beta$, which opens new modelling perspectives, in the same spirit as the GFr distribution extends those of the classic Fréchet distribution. In this study, we formalize this claim by pointing out the desirable mathematical properties and applicability of the TGFr-G family. In particular, we investigate the precise role of $\beta$ in the features of the main functions, stress-strength parameter, incomplete moments and various entropy measures. Parameter estimation and bivariate extensions are also discussed. The applicable aspect of the new family is mainly highlighted by a special three-parameter distribution, defined with the Weibull distribution as the parent. It is called the truncated generalized Fréchet Weibull (TGFrW) distribution. For the related model, the maximum likelihood estimates of the parameters are derived and a simulation study is conducted to check their accuracy. Then, two data sets are considered to evaluate how good the fit of the proposed model is. Diverse criteria are used in this regard, pointing out that the fit of the TGFrW model is better than those of comparable Weibull type models, some of which have more parameters. In particular, the proposed model surpasses the analogous truncated Fréchet model, attesting to the importance of the findings.
The following organization is adopted. The TGFr-G family is defined in Section 2. Diverse properties are discussed in Section 3, including the analytical study of the main functions, stress-strength parameter, series expansions, incomplete moments with derivations, various entropy measures, theoretical and practical parameters estimation and various bivariate extensions of the proposed family through the use of copulas. Section 4 is devoted to the TGFrW distribution, with an emphasis on its applicability in simulated and concrete statistical settings. Section 5 contains some concluding notes.
The TGFr-G Family
The basics of the TGFr-G family are proposed in this section, exhibiting its main functions of interest, as well as a short list of special distributions.
First Approach
First of all, we recall that the CDF given as (3) defines the TGFr-G family. Hereafter, an RV X having the CDF given as (3) is denoted by X~TGFr-G(ψ). By taking β=1, we recover the special case of the truncated Fréchet-G family by [6].
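Among the important functions of the TGFr-G family, there are the probability density function (PDF) given as
$$f_{TGFr-G}(x;\psi)=\frac{\alpha\beta}{1-\left(1-e^{-\alpha}\right)^{\beta}}\,\frac{g(x;\eta)}{G(x;\eta)^{2}}\,e^{-\alpha G(x;\eta)^{-1}}\left(1-e^{-\alpha G(x;\eta)^{-1}}\right)^{\beta-1},\quad x\in\mathbb{R}, \tag{4}$$
where $g(x;\eta)$ denotes the PDF of the parental distribution, and the hazard rate function (HRF) obtained as
$$h_{TGFr-G}(x;\psi)=\frac{\alpha\beta\, g(x;\eta)\,G(x;\eta)^{-2}\,e^{-\alpha G(x;\eta)^{-1}}\left(1-e^{-\alpha G(x;\eta)^{-1}}\right)^{\beta-1}}{\left(1-e^{-\alpha G(x;\eta)^{-1}}\right)^{\beta}-\left(1-e^{-\alpha}\right)^{\beta}},\quad x\in\mathbb{R}.$$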
The analytical properties of these functions are very informative on the data fitting possibilities of the associated models. This aspect will be the subject of further discussions. Also, the quantile function (QF), obtained by inverting the CDF in (3), is given as
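$$Q_{TGFr-G}(u;\psi)=Q\left(-\alpha\left\{\ln\left[1-\left(1-u\left[1-\left(1-e^{-\alpha}\right)^{\beta}\right]\right)^{1/\beta}\right]\right\}^{-1};\eta\right),\quad u\in(0,1), \tag{5}$$
where $Q(u;\eta)$ denotes the QF of the parental distribution. The fact that $Q_{TGFr-G}(u;\psi)$ has a closed-form expression is a plus for the TGFr-G family. In particular, we can simply determine the median as $M=Q_{TGFr-G}(1/2;\psi)$, derive several functions related to this QF and generate random values through the inverse transform sampling method.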
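As an illustration of the inverse transform sampling mentioned above, the following minimal sketch evaluates the CDF (3) and its inverse for a user-supplied parent. The function names (`tgfr_g_cdf`, `tgfr_g_qf`, `tgfr_g_sample`) and the exponential parent with rate `theta` are assumptions made for this example only; the QF is obtained here by solving (3) for $G(x;\eta)$ and then applying the parent QF.

```python
import numpy as np

def tgfr_g_cdf(x, alpha, beta, G):
    """CDF (3) of the TGFr-G family; G is the parent CDF (a callable)."""
    g = np.clip(G(np.asarray(x, dtype=float)), 1e-300, 1.0)  # guard against 1/0
    num = 1.0 - (1.0 - np.exp(-alpha / g)) ** beta
    den = 1.0 - (1.0 - np.exp(-alpha)) ** beta
    return num / den

def tgfr_g_qf(u, alpha, beta, Q):
    """Quantile function obtained by inverting (3); Q is the parent QF."""
    u = np.asarray(u, dtype=float)
    den = 1.0 - (1.0 - np.exp(-alpha)) ** beta
    inner = 1.0 - (1.0 - u * den) ** (1.0 / beta)   # equals exp(-alpha / G)
    p = -alpha / np.log(inner)                      # value of G at the quantile
    return Q(np.clip(p, 0.0, np.nextafter(1.0, 0.0)))

def tgfr_g_sample(n, alpha, beta, Q, rng=None):
    """Inverse transform sampling: apply the QF to uniform draws."""
    rng = np.random.default_rng() if rng is None else rng
    return tgfr_g_qf(rng.uniform(size=n), alpha, beta, Q)

# Illustrative parent (an assumption): exponential distribution with rate theta.
theta = 1.5
G_exp = lambda x: np.where(x > 0, 1.0 - np.exp(-theta * np.maximum(x, 0.0)), 0.0)
Q_exp = lambda p: -np.log1p(-p) / theta

median = tgfr_g_qf(0.5, 0.7, 3.0, Q_exp)
```

Passing the parent as a pair of callables keeps the sketch independent of any particular choice of $G(x;\eta)$; a round trip such as `tgfr_g_cdf(tgfr_g_qf(u, ...), ...)` should return `u` up to floating-point error.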
In order to illustrate the heterogeneity of the TGFr-G family, Tab. 1 lists several of its members based on standard parental distributions, with various supports and numbers of parameters.
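Table 1: Some special distributions belonging to the TGFr-G family

| TGFr-G | Parent's name | Support | $G(x;\eta)$ | $\psi$ | $F_{TGFr-G}(x;\psi)$ |
|---|---|---|---|---|---|
| TGFrU | Uniform | $(0,v)$ | $x/v$ | $(\alpha,\beta,v)$ | $\frac{1}{1-(1-e^{-\alpha})^{\beta}}\left[1-\left(1-e^{-\alpha v x^{-1}}\right)^{\beta}\right]$ |
| TGFrP | Power | $(0,1)$ | $x^{\lambda}$ | $(\alpha,\beta,\lambda)$ | $\frac{1}{1-(1-e^{-\alpha})^{\beta}}\left[1-\left(1-e^{-\alpha x^{-\lambda}}\right)^{\beta}\right]$ |
| TGFrK | Kumaraswamy | $(0,1)$ | $1-(1-x^{a})^{b}$ | $(\alpha,\beta,a,b)$ | $\frac{1}{1-(1-e^{-\alpha})^{\beta}}\left[1-\left(1-e^{-\alpha[1-(1-x^{a})^{b}]^{-1}}\right)^{\beta}\right]$ |
| TGFrE | Exponential | $(0,+\infty)$ | $1-e^{-\theta x}$ | $(\alpha,\beta,\theta)$ | $\frac{1}{1-(1-e^{-\alpha})^{\beta}}\left[1-\left(1-e^{-\alpha(1-e^{-\theta x})^{-1}}\right)^{\beta}\right]$ |
| TGFrW | Weibull | $(0,+\infty)$ | $1-e^{-\theta x^{\lambda}}$ | $(\alpha,\beta,\lambda,\theta)$ | $\frac{1}{1-(1-e^{-\alpha})^{\beta}}\left[1-\left(1-e^{-\alpha(1-e^{-\theta x^{\lambda}})^{-1}}\right)^{\beta}\right]$ |
| TGFrLom | Lomax | $(0,+\infty)$ | $1-(1+\rho x)^{-\theta}$ | $(\alpha,\beta,\rho,\theta)$ | $\frac{1}{1-(1-e^{-\alpha})^{\beta}}\left[1-\left(1-e^{-\alpha[1-(1+\rho x)^{-\theta}]^{-1}}\right)^{\beta}\right]$ |
| TGFrC | Cauchy | $\mathbb{R}$ | $\frac{1}{\pi}\arctan(bx)+\frac{1}{2}$ | $(\alpha,\beta,b)$ | $\frac{1}{1-(1-e^{-\alpha})^{\beta}}\left[1-\left(1-e^{-\alpha[\frac{1}{\pi}\arctan(bx)+\frac{1}{2}]^{-1}}\right)^{\beta}\right]$ |
| TGFrGu | Gumbel | $\mathbb{R}$ | $\exp(-e^{-bx})$ | $(\alpha,\beta,b)$ | $\frac{1}{1-(1-e^{-\alpha})^{\beta}}\left[1-\left(1-e^{-\alpha\exp(e^{-bx})}\right)^{\beta}\right]$ |
| TGFrLog | Logistic | $\mathbb{R}$ | $(1+e^{-bx})^{-1}$ | $(\alpha,\beta,b)$ | $\frac{1}{1-(1-e^{-\alpha})^{\beta}}\left[1-\left(1-e^{-\alpha(1+e^{-bx})}\right)^{\beta}\right]$ |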
In our applications, a focus will be put on the TGFrW distribution defined with θ=1. This choice is motivated by upstream numerical and graphical investigations.
General Properties
In this section, we develop some notable properties of the TGFr-G family, and discuss some new motivations.
Equivalences
Here, some analytical results on the functions of the TGFr-G family are studied. Firstly, we investigate the equivalences of FTGFr-G(x;ψ), fTGFr-G(x;ψ) and hTGFr-G(x;ψ). Mathematical facts force us to distinguish the cases: G(x;η)→0, G(x;η)→1, α→0, α→+∞, β→0 and β→+∞. It is assumed that G(x;η)∈(0,1) for these four last cases, but G(x;η)→0 and G(x;η)→1 are not excluded.
Let us mention that $G(x;\eta)\to 0$ is equivalent to saying that $x$ tends to the lower limit of the closure of the set $\{x\in\mathbb{R};\,G(x;\eta)>0\}$, and $G(x;\eta)\to 1$ is equivalent to saying that $x$ tends to the upper limit of the closure of the set $\{x\in\mathbb{R};\,G(x;\eta)<1\}$. The obtained equivalences for $F_{TGFr-G}(x;\psi)$ and $f_{TGFr-G}(x;\psi)$ are described in Tab. 2.
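Table 2: Equivalences for the CDF and PDF of the TGFr-G family

| Case | $F_{TGFr-G}(x;\psi)\sim$ | $f_{TGFr-G}(x;\psi)\sim$ |
|---|---|---|
| $G(x;\eta)\to 0$ | $\frac{\beta}{1-(1-e^{-\alpha})^{\beta}}\,e^{-\alpha G(x;\eta)^{-1}}$ | $\frac{\alpha\beta}{1-(1-e^{-\alpha})^{\beta}}\,\frac{g(x;\eta)}{G(x;\eta)^{2}}\,e^{-\alpha G(x;\eta)^{-1}}$ |
| $G(x;\eta)\to 1$ | $1-\frac{\alpha\beta(1-e^{-\alpha})^{\beta}}{(e^{\alpha}-1)[1-(1-e^{-\alpha})^{\beta}]}\,(1-G(x;\eta))$ | $\frac{\alpha\beta(1-e^{-\alpha})^{\beta}}{(e^{\alpha}-1)[1-(1-e^{-\alpha})^{\beta}]}\,g(x;\eta)$ |
| $\alpha\to 0$ | $(1+\alpha^{\beta})\left(1-\alpha^{\beta}G(x;\eta)^{-\beta}\right)$ | $\alpha^{\beta}\beta\, g(x;\eta)\,G(x;\eta)^{-\beta-1}$ |
| $\alpha\to+\infty$ | $e^{-\alpha(G(x;\eta)^{-1}-1)}$ | $\alpha\,\frac{g(x;\eta)}{G(x;\eta)^{2}}\,e^{-\alpha(G(x;\eta)^{-1}-1)}$ |
| $\beta\to 0$ | $\frac{1}{\log(1-e^{-\alpha})}\log\left(1-e^{-\alpha G(x;\eta)^{-1}}\right)$ | $-\frac{\alpha}{\log(1-e^{-\alpha})}\,\frac{g(x;\eta)}{G(x;\eta)^{2}\left[e^{\alpha G(x;\eta)^{-1}}-1\right]}$ |
| $\beta\to+\infty$ | $\left[1+(1-e^{-\alpha})^{\beta}\right]\left[1-\left(1-e^{-\alpha G(x;\eta)^{-1}}\right)^{\beta}\right]$ | $\alpha\beta\,\frac{g(x;\eta)}{G(x;\eta)^{2}}\,e^{-\alpha G(x;\eta)^{-1}}\left(1-e^{-\alpha G(x;\eta)^{-1}}\right)^{\beta-1}$ |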
From Tab. 2, the following remarks hold. When $G(x;\eta)\to 0$, we see that $\alpha$ has a significant impact on the limit of $f_{TGFr-G}(x;\psi)$. In particular, the term $e^{-\alpha G(x;\eta)^{-1}}$ can dominate $g(x;\eta)/G(x;\eta)^{2}$, and thus $f_{TGFr-G}(x;\psi)\to 0$ with an exponential decay. When $G(x;\eta)\to 1$, both $\alpha$ and $\beta$ influence the proportionality constant in the limit of $f_{TGFr-G}(x;\psi)$, but the limiting behavior of $g(x;\eta)$ remains decisive. When $\alpha\to 0$ or $\alpha\to+\infty$ with $G(x;\eta)<1$ and fixed $g(x;\eta)$, we have $f_{TGFr-G}(x;\psi)\to 0$. When $\beta\to 0$, the limiting function of $F_{TGFr-G}(x;\psi)$ is obtained as
$$F_{*}(x;\alpha,\eta)=\frac{1}{\log(1-e^{-\alpha})}\log\left(1-e^{-\alpha G(x;\eta)^{-1}}\right),\quad x\in\mathbb{R},$$
and one can remark that F*(x;α,η) is a valid CDF. As far as we know, it is unlisted in the literature, offering a new and original “logarithmic-exponential-G family”. This finding also reveals the richness of the proposed TGFr-G family.
Tab. 3 completes Tab. 2 by investigating the equivalences of hTGFr-G(x;ψ).
From Tab. 3, when $G(x;\eta)\to 0$, we see that the limit of $h_{TGFr-G}(x;\psi)$ truly depends on $\alpha$, which is not the case when $G(x;\eta)\to 1$, where the limiting function corresponds to the HRF of the parental distribution. In the case where $G(x;\eta)\to 1$ is excluded and $\alpha\to 0$, we have
$$h_{TGFr-G}(x;\psi)\sim\frac{\beta\, g(x;\eta)\,G(x;\eta)^{-\beta-1}}{G(x;\eta)^{-\beta}-1},$$
showing the importance of the parameter $\beta$ in this regard. Note that, when both $G(x;\eta)\to 1$ and $\alpha\to 0$, with a fixed $g(x;\eta)$, we have $h_{TGFr-G}(x;\psi)\sim(\beta/\alpha^{\beta})g(x;\eta)\to+\infty$. Also, when $G(x;\eta)\to 1$ is excluded, with fixed $g(x;\eta)$ and $G(x;\eta)$, and $\alpha\to+\infty$, we have $h_{TGFr-G}(x;\psi)\to 0$. The limit obtained when $\beta\to 0$ is a complex function of $x$, and, when $G(x;\eta)\to 1$ is excluded, with fixed $g(x;\eta)$ and $G(x;\eta)$, and $\beta\to+\infty$, we have
$$h_{TGFr-G}(x;\psi)\sim\alpha\beta\,\frac{g(x;\eta)}{G(x;\eta)^{2}}\,e^{-\alpha G(x;\eta)^{-1}}\left(1-e^{-\alpha G(x;\eta)^{-1}}\right)^{-1},$$
implying that $h_{TGFr-G}(x;\psi)\to+\infty$.
Mode(s) Analysis
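A mode of the TGFr-G family belongs to the set $\arg\max_{x\in\mathbb{R}}f_{TGFr-G}(x;\psi)$. Such a mode, say $x_m$, is a critical point of the PDF, i.e., $f_{TGFr-G}'(x_m;\psi)=0$, at which $f_{TGFr-G}''(x_m;\psi)<0$. Written out through (4), these two conditions involve $g(x;\eta)'$ and $g(x;\eta)''$, where $g(x;\eta)''$ denotes the second derivative of $g(x;\eta)$ with respect to $x$.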
The number and definition(s) of the mode(s) depend on the parental distribution, α and β. However, even though all of these quantities are known, the complexity of the above equations constitutes an obstacle to get an analytical expression of the mode(s). Thus, mathematical software seems necessary for any numerical appreciation.
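To make the nature of the difficulty concrete, here is a sketch of the first-order condition, derived from (4) by differentiating $\ln f_{TGFr-G}(x;\psi)$ with respect to $x$, under the assumption that $g(x;\eta)$ is differentiable and positive at the point considered. It is a derivation from (4) and not necessarily the form displayed by the authors. A mode $x_m$ then solves

$$\frac{g(x;\eta)'}{g(x;\eta)}-\frac{2g(x;\eta)}{G(x;\eta)}+\frac{\alpha g(x;\eta)}{G(x;\eta)^{2}}\left[1-\frac{\beta-1}{e^{\alpha G(x;\eta)^{-1}}-1}\right]=0.$$

The exponential term inside the bracket is what generally prevents a closed-form solution, even for simple parents.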
Stress-Strength Parameter
The stress-strength parameter provides one of the most important measurements in reliability analysis. From two independent RVs X and Y, the stress-strength parameter is defined by R = P(Y < X). As a common application, it is a measure of performance of a system; it evaluates the probability that a random strength modeled by X exceeds an independent random stress modeled by Y. For the theory and applications on this probabilistic object, we may refer the reader to [19,20].
The following result shows that, under a certain scenario on the parameters, a stress-strength parameter associated to the TGFr-G family has a tractable analytical expression.
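Proposition 3.1. Let $\psi_1=(\alpha,\beta_1,\eta)$, $\psi_2=(\alpha,\beta_2,\eta)$, $X_1\sim$ TGFr-G$(\psi_1)$, $X_2\sim$ TGFr-G$(\psi_2)$, with $X_1$ and $X_2$ independent, and $R=P(X_2<X_1)$. Then, we have
$$R=\frac{1}{1-(1-e^{-\alpha})^{\beta_2}}\left[1-\frac{\beta_1}{\beta_1+\beta_2}\,\frac{1-(1-e^{-\alpha})^{\beta_1+\beta_2}}{1-(1-e^{-\alpha})^{\beta_1}}\right].$$
Proof. The independence of $X_1$ and $X_2$, and (3), imply that
$$R=P(X_2<X_1)=\int_{-\infty}^{+\infty}F_{TGFr-G}(x;\psi_2)f_{TGFr-G}(x;\psi_1)dx=\frac{1}{1-(1-e^{-\alpha})^{\beta_2}}\left[1-\int_{-\infty}^{+\infty}\left(1-e^{-\alpha G(x;\eta)^{-1}}\right)^{\beta_2}f_{TGFr-G}(x;\psi_1)dx\right].$$
Now, by virtue of (4) and some developments, we get
$$\left(1-e^{-\alpha G(x;\eta)^{-1}}\right)^{\beta_2}f_{TGFr-G}(x;\psi_1)=\frac{\alpha\beta_1}{1-(1-e^{-\alpha})^{\beta_1}}\,\frac{g(x;\eta)}{G(x;\eta)^{2}}\,e^{-\alpha G(x;\eta)^{-1}}\left(1-e^{-\alpha G(x;\eta)^{-1}}\right)^{\beta_1+\beta_2-1}=\frac{\beta_1}{\beta_1+\beta_2}\,\frac{1-(1-e^{-\alpha})^{\beta_1+\beta_2}}{1-(1-e^{-\alpha})^{\beta_1}}\,f_{TGFr-G}(x;\psi_{*}),$$
where $\psi_{*}=(\alpha,\beta_1+\beta_2,\eta)$. By putting the above equations together and using $\int_{-\infty}^{+\infty}f_{TGFr-G}(x;\psi_{*})dx=1$, we obtain
$$R=\frac{1}{1-(1-e^{-\alpha})^{\beta_2}}\left[1-\frac{\beta_1}{\beta_1+\beta_2}\,\frac{1-(1-e^{-\alpha})^{\beta_1+\beta_2}}{1-(1-e^{-\alpha})^{\beta_1}}\right].$$
This ends the proof of Proposition 3.1.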
From Proposition 3.1, we can note that R does not depend on the chosen parental distribution. Also, when β1=β2, X1 and X2 are identically distributed and R takes the value 1/2, as expected in this simple case. The manageable expression of R is useful for estimation purposes; with the plug-in approach, α, β1 and β2 can be substituted by adequate estimates to derive an estimate of R. Further developments in this regard are, however, outside the scope of this study.
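The closed form of Proposition 3.1 can be checked numerically. The sketch below, with hypothetical function names, evaluates R and compares it with a Monte Carlo estimate; since R does not depend on the parent, the simulation draws the values of $G(X;\eta)$ directly (equivalently, it assumes a uniform parent on (0,1)).

```python
import numpy as np

def stress_strength(alpha, beta1, beta2):
    """Closed-form R = P(X2 < X1) of Proposition 3.1 (common alpha and parent)."""
    a = 1.0 - np.exp(-alpha)
    ratio = (1.0 - a ** (beta1 + beta2)) / (1.0 - a ** beta1)
    return (1.0 - beta1 / (beta1 + beta2) * ratio) / (1.0 - a ** beta2)

def sample_parent_scale(n, alpha, beta, rng):
    """Inverse-transform draws of G(X) for X ~ TGFr-G; values lie in (0, 1)."""
    den = 1.0 - (1.0 - np.exp(-alpha)) ** beta
    u = rng.uniform(size=n)
    return -alpha / np.log(1.0 - (1.0 - u * den) ** (1.0 / beta))

rng = np.random.default_rng(12345)
alpha, b1, b2 = 0.7, 3.0, 1.5
x1 = sample_parent_scale(200_000, alpha, b1, rng)
x2 = sample_parent_scale(200_000, alpha, b2, rng)
r_mc = float(np.mean(x2 < x1))            # Monte Carlo estimate of R
r_cf = float(stress_strength(alpha, b1, b2))  # closed-form value
```

With a plug-in approach, the same `stress_strength` function would simply be evaluated at estimates of α, β1 and β2.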
Representation
The following proposition proves that the “possibly complex” exponentiated PDF fTGFr-G(x;ψ)τ can be simply expressed as a series depending on parental exponentiated functions. Such expansion is useful for diverse algebraic manipulations of fTGFr-G(x;ψ)τ involving differentiation or integration, as discussed in full generality in [21].
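Proposition 3.2. Let $\tau>0$. The two following complementary expansions hold for $f_{TGFr-G}(x;\psi)^{\tau}$:
A1: In terms of $g(x;\eta)^{\tau}$ and exponentiated survival functions of the parental distribution, i.e., $\bar{G}(x;\eta)=1-G(x;\eta)$, we have
$$f_{TGFr-G}(x;\psi)^{\tau}=\sum_{k,\ell,m=0}^{+\infty}\Xi_{k,\ell,m}^{[\tau]}\,g(x;\eta)^{\tau}\,\bar{G}(x;\eta)^{m},\quad \Xi_{k,\ell,m}^{[\tau]}=\frac{\alpha^{\tau+\ell}\beta^{\tau}}{[1-(1-e^{-\alpha})^{\beta}]^{\tau}}\binom{\tau(\beta-1)}{k}\binom{-\ell-2\tau}{m}\frac{(-1)^{k+\ell+m}(k+\tau)^{\ell}}{\ell!}.$$
A2: In terms of $g(x;\eta)^{\tau}$ and exponentiated CDFs of the parental distribution, we have
$$f_{TGFr-G}(x;\psi)^{\tau}=\sum_{k,\ell,m=0}^{+\infty}\sum_{u=0}^{m}\Upsilon_{k,\ell,m,u}^{[\tau]}\,g(x;\eta)^{\tau}\,G(x;\eta)^{u},\quad \Upsilon_{k,\ell,m,u}^{[\tau]}=\binom{m}{u}(-1)^{u}\,\Xi_{k,\ell,m}^{[\tau]}.$$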
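Proof. Owing to (4), we get
$$f_{TGFr-G}(x;\psi)^{\tau}=\frac{\alpha^{\tau}\beta^{\tau}}{[1-(1-e^{-\alpha})^{\beta}]^{\tau}}\,g(x;\eta)^{\tau}\,G(x;\eta)^{-2\tau}\,e^{-\alpha\tau G(x;\eta)^{-1}}\left(1-e^{-\alpha G(x;\eta)^{-1}}\right)^{\tau(\beta-1)}.$$
Since $e^{-\alpha G(x;\eta)^{-1}}\in(0,1)$, the generalized binomial theorem gives
$$e^{-\alpha\tau G(x;\eta)^{-1}}\left(1-e^{-\alpha G(x;\eta)^{-1}}\right)^{\tau(\beta-1)}=\sum_{k=0}^{+\infty}\binom{\tau(\beta-1)}{k}(-1)^{k}e^{-\alpha(k+\tau)G(x;\eta)^{-1}}.$$
Now, the exponential expansion gives
$$G(x;\eta)^{-2\tau}e^{-\alpha(k+\tau)G(x;\eta)^{-1}}=\sum_{\ell=0}^{+\infty}(-1)^{\ell}\frac{1}{\ell!}\alpha^{\ell}(k+\tau)^{\ell}G(x;\eta)^{-\ell-2\tau}.$$
At this stage, two complementary decompositions for $G(x;\eta)^{-\ell-2\tau}$ can be studied separately.
To obtain A1: One can express $G(x;\eta)^{-\ell-2\tau}$ in terms of exponentiated $\bar{G}(x;\eta)$ via the generalized binomial theorem as
$$G(x;\eta)^{-\ell-2\tau}=\sum_{m=0}^{+\infty}\binom{-\ell-2\tau}{m}(-1)^{m}\bar{G}(x;\eta)^{m}.$$
To obtain A2: One can express $G(x;\eta)^{-\ell-2\tau}$ in terms of exponentiated $G(x;\eta)$ via the generalized and standard binomial theorems as
$$G(x;\eta)^{-\ell-2\tau}=\sum_{m=0}^{+\infty}\sum_{u=0}^{m}\binom{-\ell-2\tau}{m}\binom{m}{u}(-1)^{m+u}G(x;\eta)^{u}.$$
The proof of Proposition 3.2 ends by putting all the above expansions together.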
Several applications of Proposition 3.2 will be presented later.
Incomplete Moments with Discussion
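The incomplete moments of $X\sim$ TGFr-G are useful to derive crucial measures and functions of the TGFr-G family, with a high potential of applicability. Mathematically, the $r$th incomplete moment of $X\sim$ TGFr-G at any $t\in\mathbb{R}$ can be expressed as
$$\mu_r'(t)=E\left(X^{r}1_{\{X\le t\}}\right)=\int_{-\infty}^{t}x^{r}f_{TGFr-G}(x;\psi)dx,$$
that is, thanks to (4),
$$\mu_r'(t)=\frac{\alpha\beta}{1-(1-e^{-\alpha})^{\beta}}\int_{-\infty}^{t}x^{r}\,\frac{g(x;\eta)}{G(x;\eta)^{2}}\,e^{-\alpha G(x;\eta)^{-1}}\left(1-e^{-\alpha G(x;\eta)^{-1}}\right)^{\beta-1}dx. \tag{6}$$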
For some special parental distributions, the calculus of this integral by usual integration techniques is not excluded. However, for further analytical manipulations or evaluation, a series expression is sometimes preferable. In this regard, several possibilities are presented below, depending on the level of complexity in the definition of G(x;η).
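From (6), by applying the change of variable $v=e^{-\alpha G(x;\eta)^{-1}}$, i.e., $x=Q([-(1/\alpha)\ln v]^{-1};\eta)$, and the generalized binomial expansion, assuming that the integral and sum signs are interchangeable, we get
$$\mu_r'(t)=\frac{\beta}{1-(1-e^{-\alpha})^{\beta}}\sum_{k=0}^{+\infty}\binom{\beta-1}{k}(-1)^{k}\int_{0}^{e^{-\alpha G(t;\eta)^{-1}}}v^{k}\,Q\left([-(1/\alpha)\ln v]^{-1};\eta\right)^{r}dv.$$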
If the QF of the parental distribution is not too complex, the integral term can be made explicit.
For more universal series developments, Proposition 3.2 applied with $\tau=1$ gives series expansions of $f_{TGFr-G}(x;\psi)$ that can be injected into (6). For instance, by considering the expression A1, assuming that the integral and sum signs are interchangeable, we get
$$\mu_r'(t)=\sum_{k,\ell,m=0}^{+\infty}\Xi_{k,\ell,m}^{[1]}\int_{-\infty}^{t}x^{r}g(x;\eta)\bar{G}(x;\eta)^{m}dx. \tag{7}$$
Alternatively, under the same conditions, the application of A2 gives
$$\mu_r'(t)=\sum_{k,\ell,m=0}^{+\infty}\sum_{u=0}^{m}\Upsilon_{k,\ell,m,u}^{[1]}\int_{-\infty}^{t}x^{r}g(x;\eta)G(x;\eta)^{u}dx.$$
For a wide panel of parental distributions, the integrals $\int_{-\infty}^{t}x^{r}g(x;\eta)\bar{G}(x;\eta)^{m}dx$ and $\int_{-\infty}^{t}x^{r}g(x;\eta)G(x;\eta)^{u}dx$ are available in the literature or easily calculable. Also, for practical purposes, one can truncate the infinite sums at any large integer to obtain suitable approximation functions for $\mu_r'(t)$. Further detail on the interest of such series expansions in the treatment of various probabilistic measures can be found in [21].
As examples of applications, from the incomplete moments of $X\sim$ TGFr-G, we can derive the $r$th raw moment of $X$ defined by $\mu_r'=E(X^{r})=\lim_{t\to+\infty}\mu_r'(t)$, the $r$th central moment of $X$ specified by the relation $\mu_r=E((X-\mu_1')^{r})=\sum_{k=0}^{r}\binom{r}{k}(-1)^{r-k}\mu_k'(\mu_1')^{r-k}$, the variance of $X$ given as $\sigma^{2}=V(X)=\mu_2$, and the general coefficient of $X$ defined by $C_r=\mu_r/\sigma^{r}$, which yields the skewness coefficient $S=C_3$ and the kurtosis coefficient $K=C_4$, among others.
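The passage from raw moments to the variance, skewness and kurtosis can be sketched as follows. The function names are hypothetical, and the input values are the raw moments reported in the λ=2 column of Tab. 5, used here only as an illustration.

```python
from math import comb

def central_moment(raw, r):
    """r-th central moment from raw moments, with raw[k] = E(X^k) and raw[0] = 1."""
    mu1 = raw[1]
    return sum(comb(r, k) * (-1) ** (r - k) * raw[k] * mu1 ** (r - k)
               for k in range(r + 1))

def summary_measures(raw):
    """Variance, skewness S = C_3 and kurtosis K = C_4 from raw moments up to order 4."""
    var = central_moment(raw, 2)
    skew = central_moment(raw, 3) / var ** 1.5
    kurt = central_moment(raw, 4) / var ** 2
    return var, skew, kurt

# Raw moments of orders 0..4 taken from Tab. 5 (lambda = 2, alpha = beta = 0.5).
raw = [1.0, 2.051, 8.162, 49.068, 392.912]
var, skew, kurt = summary_measures(raw)
```

The resulting values are close to the tabulated ones (3.956, 2.046 and 9.155); small discrepancies are attributable to the rounding of the tabulated raw moments.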
Also, from the first incomplete moment $\mu_1'(t)$, that is $\mu_r'(t)$ taken with $r=1$, one can express the mean deviation of $X$ about $\mu_1'$ as $\delta_1=E(|X-\mu_1'|)=2\mu_1'F_{TGFr-G}(\mu_1';\psi)-2\mu_1'(\mu_1')$, the mean deviation about $M$ as $\delta_2=E(|X-M|)=\mu_1'-2\mu_1'(M)$, the mean residual life as $m(t)=E(X-t\mid X>t)=[\mu_1'-\mu_1'(t)]/[1-F_{TGFr-G}(t;\psi)]-t$, the mean waiting time as $M(t)=E(t-X\mid X\le t)=t-\mu_1'(t)/F_{TGFr-G}(t;\psi)$, the Bonferroni curve as $B(u)=\mu_1'(Q_{TGFr-G}(u;\psi))/(u\mu_1')$, $u\in(0,1)$, and the Lorenz curve as $L(u)=uB(u)$, $u\in(0,1)$.
Entropy
Entropy is a fundamental concept in information theory, with applications in statistical inference, neurobiology, linguistics, cryptography, quantum computer science and bioinformatics. In the literature, there exist several entropy measures to determine the randomness of a distribution. Most of them are discussed in the survey of [22]. By considering a generic (continuous) distribution with PDF denoted by $f(x)$, some of them are presented in Tab. 4. In this table, it is supposed that $\theta>0$ and $\theta\ne 1$.
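Table 4: Some entropy measures of a distribution with PDF denoted by $f(x)$

| Entropy | Definition | Reference |
|---|---|---|
| Rényi | $R_{\theta}=\frac{1}{1-\theta}\ln\left[\int_{-\infty}^{+\infty}f(x)^{\theta}dx\right]$ | [23] |
| Havrda and Charvat | $HC_{\theta}=\frac{1}{2^{1-\theta}-1}\left[\int_{-\infty}^{+\infty}f(x)^{\theta}dx-1\right]$ | [24] |
| Arimoto | $A_{\theta}=\frac{\theta}{1-\theta}\left\{\left[\int_{-\infty}^{+\infty}f(x)^{\theta}dx\right]^{1/\theta}-1\right\}$ | [25] |
| Awad and Alawneh | $AA_{\theta}=\frac{1}{\theta-1}\ln\left\{\left[\sup_{x\in\mathbb{R}}f(x)\right]^{1-\theta}\int_{-\infty}^{+\infty}f(x)^{\theta}dx\right\}$ | [26] |
| Tsallis | $T_{\theta}=\frac{1}{\theta-1}\left[1-\int_{-\infty}^{+\infty}f(x)^{\theta}dx\right]$ | [27] |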
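From Tab. 4, we see that the main term in the definitions of the entropy measures is the following integral term: $\int_{-\infty}^{+\infty}f(x)^{\theta}dx$. We now investigate it in the context of the TGFr-G family. So, we set
$$I_{\theta}(\psi)=\int_{-\infty}^{+\infty}f_{TGFr-G}(x;\psi)^{\theta}dx,$$
with $\theta>0$ and $\theta\ne 1$. Thanks to (4), it can be expressed as
$$I_{\theta}(\psi)=\frac{\alpha^{\theta}\beta^{\theta}}{[1-(1-e^{-\alpha})^{\beta}]^{\theta}}\int_{-\infty}^{+\infty}\frac{g(x;\eta)^{\theta}}{G(x;\eta)^{2\theta}}\,e^{-\alpha\theta G(x;\eta)^{-1}}\left(1-e^{-\alpha G(x;\eta)^{-1}}\right)^{\theta(\beta-1)}dx. \tag{8}$$
For some special parental distributions, this integral can be calculated by standard techniques. A more universal approach consists in expressing it as a tractable series expansion. Hence, one can apply Proposition 3.2 with the choice $\tau=\theta$ to obtain series expansions of $f_{TGFr-G}(x;\psi)^{\theta}$ and inject them into (8). Thus, assuming that the integral and sum signs are interchangeable, from A1, we get
$$I_{\theta}(\psi)=\sum_{k,\ell,m=0}^{+\infty}\Xi_{k,\ell,m}^{[\theta]}\int_{-\infty}^{+\infty}g(x;\eta)^{\theta}\bar{G}(x;\eta)^{m}dx.$$
Alternatively, under the same conditions, the application of A2 gives
$$I_{\theta}(\psi)=\sum_{k,\ell,m=0}^{+\infty}\sum_{u=0}^{m}\Upsilon_{k,\ell,m,u}^{[\theta]}\int_{-\infty}^{+\infty}g(x;\eta)^{\theta}G(x;\eta)^{u}dx.$$
For most of the standard parental distributions, the integrals $\int_{-\infty}^{+\infty}g(x;\eta)^{\theta}\bar{G}(x;\eta)^{m}dx$ and $\int_{-\infty}^{+\infty}g(x;\eta)^{\theta}G(x;\eta)^{u}dx$ can be determined with some mathematical effort. Thus, one can deduce expansions of all the entropy measures presented in Tab. 4. In particular, the Tsallis entropy of the TGFr-G family can be expanded as
$$T_{\theta}(\psi)=\frac{1}{\theta-1}\left[1-\sum_{k,\ell,m=0}^{+\infty}\Xi_{k,\ell,m}^{[\theta]}\int_{-\infty}^{+\infty}g(x;\eta)^{\theta}\bar{G}(x;\eta)^{m}dx\right]. \tag{9}$$
One can deduce a precise approximation of it by truncating the infinite sum at any large integer.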
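As a numerical alternative to the series, the integral $\int f(x)^{\theta}dx$ can be approximated by quadrature and then mapped to the entropy measures of Tab. 4. The sketch below uses hypothetical function names and a plain trapezoidal rule on a finite interval (an assumption that requires the PDF to be negligible outside it); the Awad and Alawneh entropy is omitted because it also needs the supremum of the PDF. The standard exponential PDF is used as a check, since $\int_0^{+\infty}e^{-\theta x}dx=1/\theta$ is known exactly.

```python
import math
import numpy as np

def power_integral(pdf, theta, lower, upper, n=200_001):
    """Trapezoidal approximation of the integral of pdf(x)**theta on [lower, upper]."""
    x = np.linspace(lower, upper, n)
    y = pdf(x) ** theta
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def entropies(I, theta):
    """Entropy measures of Tab. 4 that depend on f only through I = int f^theta."""
    return {
        "renyi": math.log(I) / (1.0 - theta),
        "havrda_charvat": (I - 1.0) / (2.0 ** (1.0 - theta) - 1.0),
        "arimoto": theta / (1.0 - theta) * (I ** (1.0 / theta) - 1.0),
        "tsallis": (1.0 - I) / (theta - 1.0),
    }

# Check on the standard exponential PDF, for which I_2 = 1/2 exactly.
I2 = power_integral(lambda x: np.exp(-x), 2.0, 0.0, 50.0)
measures = entropies(I2, 2.0)
```

For a member of the TGFr-G family, `pdf` would be replaced by an implementation of (4), and the integration bounds adapted to the support of the parent.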
Parameters Estimation: Theory and Practice
The main objective of the TGFr-G family is to provide pliant semi-parametric models for statistical applications. To reach this aim, the estimation of the model parameters is a crucial step, and several methods of estimation are possible. Here, we provide the essential theory on the maximum likelihood (ML) method of estimation in the context of the TGFr-G family. The generalities can be found in [28].
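First of all, let $X_1,\ldots,X_n$ be $n$ independent and identically distributed RVs from $X\sim$ TGFr-G$(\psi)$ and $\mathbf{X}=(X_1,\ldots,X_n)$. Then, assuming that they are unique, the ML estimators of the parameters $\alpha$, $\beta$ and $\eta$, say $\hat{\alpha}$, $\hat{\beta}$ and $\hat{\eta}$, respectively, are the RVs obtained as
$$\hat{\psi}=\arg\max_{\psi}L(\psi,\mathbf{X}),$$
where $\hat{\psi}=(\hat{\alpha},\hat{\beta},\hat{\eta})$, $\psi=(\alpha,\beta,\eta)$, and $L(\psi,\mathbf{X})$ is the likelihood function defined from (4) as
$$L(\psi,\mathbf{X})=\prod_{i=1}^{n}f_{TGFr-G}(X_i;\psi)=\frac{\alpha^{n}\beta^{n}}{[1-(1-e^{-\alpha})^{\beta}]^{n}}\left\{\prod_{i=1}^{n}\frac{g(X_i;\eta)}{G(X_i;\eta)^{2}}\right\}e^{-\alpha\sum_{i=1}^{n}G(X_i;\eta)^{-1}}\left\{\prod_{i=1}^{n}\left(1-e^{-\alpha G(X_i;\eta)^{-1}}\right)\right\}^{\beta-1}.$$
Assuming that $L(\psi,\mathbf{X})$ is differentiable with respect to $\psi$, the ML estimators are the solutions of the following equations: $\partial\ell(\psi,\mathbf{X})/\partial\alpha=0$, $\partial\ell(\psi,\mathbf{X})/\partial\beta=0$ and $\partial\ell(\psi,\mathbf{X})/\partial\eta=0$, where $\ell(\psi,\mathbf{X})=\ln[L(\psi,\mathbf{X})]$. In most cases, there are no analytical expressions for these estimators, but practical solutions exist and will be discussed later. Then, under some regularity conditions, the ML estimators satisfy remarkable convergence properties, including the asymptotic normality presented below. Let $m$ be the number of components in $\psi$ (which can be large since $\eta$ is itself a vector of components) and $\psi_u$ be the $u$th component of $\psi$. Then, the asymptotic distribution of $\hat{\psi}$ is the multivariate normal distribution $N_m(\psi,J(\psi)^{-1})$, where $J(\psi)$ denotes the $m\times m$ expected information matrix defined by $J(\psi)=\{E(-\partial^{2}\ell(\psi,\mathbf{X})/(\partial\psi_u\partial\psi_v))\}_{u,v}$, so that $J(\psi)^{-1}$ plays the role of the asymptotic covariance matrix.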
In a concrete statistical scenario, we deal with data corresponding to observations of $X_1,\ldots,X_n$. Let us denote them by $x_1,\ldots,x_n$. Then, the ML vector of estimates of $\psi$, say $\tilde{\psi}=(\tilde{\alpha},\tilde{\beta},\tilde{\eta})$, is defined by the corresponding observation of $\hat{\psi}$. Thanks to the argmax definition, it can be obtained numerically by optimization via the use of any Newton-Raphson type algorithm. With the R software, this numerical work can be done via the functions of the package AdequacyModel.
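To illustrate the objective function behind such an optimization, the following sketch evaluates the log-likelihood for one member of the family, the Weibull parent with $G(x;\lambda)=1-e^{-x^{\lambda}}$ (an illustrative choice made here). The function name `tgfrw_loglik` is hypothetical, and the optimizer itself is left out: any general-purpose routine that maximizes this function over positive parameters could be used.

```python
import numpy as np

def tgfrw_loglik(params, x):
    """Log-likelihood of (alpha, beta, lam) for data x, Weibull parent with unit scale."""
    alpha, beta, lam = params
    x = np.asarray(x, dtype=float)
    if min(alpha, beta, lam) <= 0 or np.any(x <= 0):
        return -np.inf                      # outside the parameter or data space
    G = -np.expm1(-x ** lam)                # parent CDF: 1 - exp(-x^lam)
    log_g = np.log(lam) + (lam - 1.0) * np.log(x) - x ** lam  # log of parent PDF
    v = np.exp(-alpha / G)
    log_norm = np.log(1.0 - (1.0 - np.exp(-alpha)) ** beta)
    const = x.size * (np.log(alpha) + np.log(beta) - log_norm)
    # Sum of log f over the sample, following the factorization of L(psi, X).
    return float(const + np.sum(log_g - 2.0 * np.log(G) - alpha / G
                                + (beta - 1.0) * np.log1p(-v)))

value = tgfrw_loglik((1.0, 1.0, 1.0), [1.0])
```

Working on the log scale and using `expm1` and `log1p` limits the loss of precision when $G(x_i;\eta)$ is close to 0 or 1.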
In practice, the information matrix $J(\psi)$ is often difficult to determine analytically and depends on the unknown parameters. A standard approach consists in using the following approximation: $J(\psi)\approx\{-\partial^{2}\ell(\psi,\mathbf{x})/(\partial\psi_u\partial\psi_v)\}_{u,v}\mid_{\psi=\tilde{\psi}}$, where $\mathbf{x}=(x_1,\ldots,x_n)$. Thus, the asymptotic distribution of $\hat{\psi}$ can be considered as the multivariate normal distribution $N_m(\psi,I^{-1})$, where $I=\{-\partial^{2}\ell(\psi,\mathbf{x})/(\partial\psi_u\partial\psi_v)\}_{u,v}\mid_{\psi=\tilde{\psi}}$. This result is useful to construct asymptotic two-sided confidence intervals (CIs) of the parameters. More precisely, for any $u=1,\ldots,m$ and $\nu\in(0,1)$, the $100(1-\nu)\%$ CI of $\psi_u$ is obtained as
$$CI=[LB,UB],$$
where LB and UB are the lower and upper bounds of the interval, defined by $LB=LB_{\psi_u}(\nu)=\tilde{\psi}_u-z_{1-\nu/2}d_u$ and $UB=UB_{\psi_u}(\nu)=\tilde{\psi}_u+z_{1-\nu/2}d_u$, respectively, where $d_u$ is the square root of the $u$th diagonal component of $I^{-1}$ and $z_{1-\nu/2}$ is the quantile of the normal distribution $N(0,1)$ taken at $1-\nu/2$. As the main interpretation, the interval CI covers $\psi_u$ with an asymptotic probability of $1-\nu$, which is of interest when $\nu$ is small enough. The typical values for $\nu$ are 0.01, 0.05 or 0.1.
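The construction of these intervals can be sketched as follows, taking $d_u$ as the square root of the $u$th diagonal entry of $I^{-1}$, i.e., the usual asymptotic standard error. The function name `wald_intervals` and the numerical inputs are hypothetical; in practice, the observed information matrix would come from the Hessian of the negative log-likelihood at the ML estimates.

```python
from statistics import NormalDist
import numpy as np

def wald_intervals(estimates, observed_info, nu=0.05):
    """Asymptotic 100(1 - nu)% CIs [LB, UB], one row per parameter."""
    est = np.asarray(estimates, dtype=float)
    cov = np.linalg.inv(np.asarray(observed_info, dtype=float))  # I^{-1}
    d = np.sqrt(np.diag(cov))                                    # standard errors d_u
    z = NormalDist().inv_cdf(1.0 - nu / 2.0)                     # N(0,1) quantile
    return np.column_stack((est - z * d, est + z * d))

# Hypothetical estimates and observed information matrix, for illustration only.
ci = wald_intervals([1.0, 2.0], [[4.0, 0.0], [0.0, 25.0]], nu=0.05)
lengths = ci[:, 1] - ci[:, 0]   # interval lengths UB - LB
```

Since the parameters of the family are positive, a lower bound falling below zero would normally be truncated at zero or handled through a log-scale reparameterization; this refinement is not shown.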
Finally, by the invariance property of the ML estimates, we can deduce ML estimates of several measures of the TGFr-G family. For instance, we can inspect the estimation of the Tsallis entropy of the TGFr-G family as defined in (9); the ML estimate of Tθ(ψ) is naturally obtained as T̃θ=Tθ(ψ̃).
The ML estimates, CIs and estimate of the Tsallis entropy will be the object of a numerical study later, by the consideration of a special distribution of the TGFr-G family.
Bivariate TGFr-G Family
Bivariate families of distributions are of interest to model the distributions behind two-dimensional phenomena or measures, observed via bivariate data. They remain in demand in regression or clustering analysis, among others. The univariate TGFr-G family can be extended to the bivariate case via several approaches. The most natural one is to use a bivariate parental distribution characterized by a bivariate CDF, say $G(x,y;\eta)$, where $\eta$ is the vector of parameters. Thus, based on (3), we can define the 2TGFr-G family by the following bivariate CDF:
$$F_{2TGFr-G}(x,y;\psi)=\frac{1}{1-(1-e^{-\alpha})^{\beta}}\left[1-\left(1-e^{-\alpha G(x,y;\eta)^{-1}}\right)^{\beta}\right],\quad(x,y)\in\mathbb{R}^{2},$$
where $\psi=(\alpha,\beta,\eta)$. Then, it is clear that, if $(X,Y)\sim$ 2TGFr-G, then $X\sim$ TGFr-G and $Y\sim$ TGFr-G. However, the structure of dependence between $X$ and $Y$ remains unmanageable. A more technical approach, but with a clear dependence structure, consists in employing special functions called copulas.
By using the Farlie-Gumbel-Morgenstern copula, a bivariate extension of the TGFr-G family, called FGMTGFr-G family, is defined by the bivariate CDF given as
$$F_{FGMTGFr-G}(x,y;\psi_1,\psi_2,\lambda)=F_{TGFr-G}^{(1)}(x;\psi_1)F_{TGFr-G}^{(2)}(y;\psi_2)\left\{1+\lambda\left[1-F_{TGFr-G}^{(1)}(x;\psi_1)\right]\left[1-F_{TGFr-G}^{(2)}(y;\psi_2)\right]\right\},\quad(x,y)\in\mathbb{R}^{2},$$
where $\lambda\in[-1,1]$, and $F_{TGFr-G}^{(1)}(x;\psi_1)$ and $F_{TGFr-G}^{(2)}(y;\psi_2)$ are defined as (3) with possibly different parental CDFs, say $G_1(x;\psi_1)$ and $G_2(y;\psi_2)$, respectively. Note that the independence copula corresponds to the case $\lambda=0$.
By using the Clayton copula, a bivariate extension of the TGFr-G family, called CTGFr-G family, is defined by the bivariate CDF specified by
$$F_{CTGFr-G}(x,y;\psi_1,\psi_2,\lambda)=\left[\max\left(F_{TGFr-G}^{(1)}(x;\psi_1)^{-\lambda}+F_{TGFr-G}^{(2)}(y;\psi_2)^{-\lambda}-1,\,0\right)\right]^{-1/\lambda},\quad(x,y)\in\mathbb{R}^{2},$$
where $\lambda\ge-1$ and $\lambda\ne 0$, by keeping the previous notations.
Other interesting bivariate extensions can be derived from other well-known copulas. A complete list of them, with more theoretical elements, can be found in [29].
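The two copula constructions above can be sketched as follows, using the standard definitions of the Farlie-Gumbel-Morgenstern and Clayton copulas. The function names are hypothetical; the arguments `u` and `v` stand for the marginal values $F_{TGFr-G}^{(1)}(x;\psi_1)$ and $F_{TGFr-G}^{(2)}(y;\psi_2)$, assumed to lie in (0,1].

```python
import numpy as np

def fgm_copula(u, v, lam):
    """Farlie-Gumbel-Morgenstern copula C(u, v), with lam in [-1, 1]."""
    if not -1.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [-1, 1]")
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return u * v * (1.0 + lam * (1.0 - u) * (1.0 - v))

def clayton_copula(u, v, lam):
    """Clayton copula C(u, v), with lam >= -1 and lam != 0; u, v in (0, 1]."""
    if lam < -1.0 or lam == 0.0:
        raise ValueError("lam must satisfy lam >= -1 and lam != 0")
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    s = np.maximum(u ** (-lam) + v ** (-lam) - 1.0, 0.0)  # max(., 0) matters for lam < 0
    return s ** (-1.0 / lam)

c_indep = fgm_copula(0.3, 0.6, 0.0)       # lam = 0 gives the independence copula
c_margin = clayton_copula(0.5, 1.0, 2.0)  # C(u, 1) = u recovers the first marginal
```

A bivariate CDF of the FGMTGFr-G or CTGFr-G type is then obtained by composing one of these functions with two univariate CDFs of the form (3).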
The TGFrW Distribution: Theory and Applications
The TGFr-G family contains a plethora of potentially interesting distributions. Here, we focus on the truncated generalized Fréchet Weibull (TGFrW) distribution as presented in Tab. 1, discussing its numerous qualities.
The TGFrW Distribution
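Let us recall that the TGFrW distribution as described in Tab. 1 with $\theta=1$ corresponds to the following configuration: $\eta=\lambda$, $G(x;\lambda)=1-e^{-x^{\lambda}}$, $x>0$, ($G(x;\lambda)=0$ otherwise), and $g(x;\lambda)=\lambda x^{\lambda-1}e^{-x^{\lambda}}$, $x>0$. Concretely, it is defined by the following CDF:
$$F_{TGFrW}(x;\alpha,\beta,\lambda)=\frac{1}{1-(1-e^{-\alpha})^{\beta}}\left[1-\left(1-e^{-\alpha(1-e^{-x^{\lambda}})^{-1}}\right)^{\beta}\right],\quad x>0$$
(and $F_{TGFrW}(x;\alpha,\beta,\lambda)=0$ otherwise). The corresponding PDF is given as
$$f_{TGFrW}(x;\alpha,\beta,\lambda)=\frac{\alpha\beta\lambda}{1-(1-e^{-\alpha})^{\beta}}\,\frac{x^{\lambda-1}e^{-x^{\lambda}}}{(1-e^{-x^{\lambda}})^{2}}\,e^{-\alpha(1-e^{-x^{\lambda}})^{-1}}\left(1-e^{-\alpha(1-e^{-x^{\lambda}})^{-1}}\right)^{\beta-1},\quad x>0.$$
The HRF is obtained as
$$h_{TGFrW}(x;\alpha,\beta,\lambda)=\frac{\alpha\beta\lambda\, x^{\lambda-1}e^{-x^{\lambda}}(1-e^{-x^{\lambda}})^{-2}\,e^{-\alpha(1-e^{-x^{\lambda}})^{-1}}\left(1-e^{-\alpha(1-e^{-x^{\lambda}})^{-1}}\right)^{\beta-1}}{\left(1-e^{-\alpha(1-e^{-x^{\lambda}})^{-1}}\right)^{\beta}-(1-e^{-\alpha})^{\beta}},\quad x>0.$$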
The pliancy of the curvatures of fTGFrW(x;α,β,λ) and hTGFrW(x;α,β,λ) is illustrated in Figs. 1 and 2, respectively.
Figure 1: Some curves of the PDF of the TGFrW distribution
Figure 2: Some curves of the HRF of the TGFrW distribution
In Fig. 1, various degrees of skewness (asymmetry) and kurtosis are observed for $f_{TGFrW}(x;\alpha,\beta,\lambda)$, showing decreasing and bell shapes, as well as various weights, mainly on the right tail. In Fig. 2, we see that $h_{TGFrW}(x;\alpha,\beta,\lambda)$ possesses reversed-J, bathtub, decreasing and increasing shapes, with possibly several critical points.
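Also, by inverting the CDF with the Weibull QF $Q(u;\lambda)=[-\ln(1-u)]^{1/\lambda}$, the QF of the TGFrW distribution is given as
$$Q_{TGFrW}(u;\alpha,\beta,\lambda)=\left\{-\ln\left[1+\frac{\alpha}{\ln\left(1-\left\{1-u\left[1-(1-e^{-\alpha})^{\beta}\right]\right\}^{1/\beta}\right)}\right]\right\}^{1/\lambda},\quad u\in(0,1). \tag{10}$$
Hence, quartiles and random number generation from the TGFrW distribution can be easily investigated.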
Some Properties and Numerical Works
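The general properties studied for the TGFr-G family in Section 3 can be applied to the TGFrW distribution. A selection of them is presented below. First of all, in order to complete the observations made on Figs. 1 and 2, let us investigate the equivalences and limits of $f_{TGFrW}(x;\alpha,\beta,\lambda)$ and $h_{TGFrW}(x;\alpha,\beta,\lambda)$. When $x\to 0$, we have
$$f_{TGFrW}(x;\alpha,\beta,\lambda)\sim h_{TGFrW}(x;\alpha,\beta,\lambda)\sim\frac{\alpha\beta\lambda}{1-(1-e^{-\alpha})^{\beta}}\,x^{-\lambda-1}e^{-\alpha(1-e^{-x^{\lambda}})^{-1}}.$$
Also, when $x\to+\infty$, we have
$$f_{TGFrW}(x;\alpha,\beta,\lambda)\sim\frac{\alpha\beta\lambda(1-e^{-\alpha})^{\beta}}{(e^{\alpha}-1)[1-(1-e^{-\alpha})^{\beta}]}\,x^{\lambda-1}e^{-x^{\lambda}},\quad h_{TGFrW}(x;\alpha,\beta,\lambda)\sim\lambda x^{\lambda-1}.$$
In particular, we note that $\lambda$ plays the major role in these convergences, $\lim_{x\to 0}f_{TGFrW}(x;\alpha,\beta,\lambda)=\lim_{x\to+\infty}f_{TGFrW}(x;\alpha,\beta,\lambda)=0$ in all cases, and, when $x\to+\infty$, $h_{TGFrW}(x;\alpha,\beta,\lambda)$ behaves like the HRF of the parental distribution, i.e., $h_{TGFrW}(x;\alpha,\beta,\lambda)\to 0$ when $\lambda<1$, $h_{TGFrW}(x;\alpha,\beta,\lambda)\to 1$ when $\lambda=1$, and $h_{TGFrW}(x;\alpha,\beta,\lambda)\to+\infty$ when $\lambda>1$.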
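These limiting behaviors can be inspected numerically. The sketch below, with hypothetical function names, implements the PDF and HRF of the TGFrW distribution (with θ=1) for x>0 and compares the HRF at a moderately large x with the Weibull HRF $\lambda x^{\lambda-1}$; the evaluation point x=3 is an arbitrary choice for illustration.

```python
import numpy as np

def _tgfrw_core(x, alpha, beta, lam):
    """Shared numerator alpha*beta*g/G^2 * v * (1-v)^(beta-1) and v = exp(-alpha/G)."""
    x = np.asarray(x, dtype=float)
    G = -np.expm1(-x ** lam)                        # Weibull CDF, unit scale
    g = lam * x ** (lam - 1.0) * np.exp(-x ** lam)  # Weibull PDF
    v = np.exp(-alpha / G)
    return alpha * beta * g / G ** 2 * v * (1.0 - v) ** (beta - 1.0), v

def tgfrw_pdf(x, alpha, beta, lam):
    """PDF of the TGFrW distribution for x > 0."""
    num, _ = _tgfrw_core(x, alpha, beta, lam)
    return num / (1.0 - (1.0 - np.exp(-alpha)) ** beta)

def tgfrw_hrf(x, alpha, beta, lam):
    """HRF of the TGFrW distribution for x > 0."""
    num, v = _tgfrw_core(x, alpha, beta, lam)
    return num / ((1.0 - v) ** beta - (1.0 - np.exp(-alpha)) ** beta)

# Ratio of the HRF to the Weibull HRF lam * x^(lam - 1) at x = 3.
alpha, beta, lam = 0.5, 0.5, 2.0
ratio = float(tgfrw_hrf(3.0, alpha, beta, lam) / (lam * 3.0 ** (lam - 1.0)))
```

For very large x the denominator of the HRF is a difference of two nearly equal quantities, so a careful implementation would switch to the asymptotic form there; the sketch does not handle this regime.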
Also, by the Riemann integral criteria, the equivalence results for $f_{TGFrW}(x;\alpha,\beta,\lambda)$ ensure that the raw moments of all orders of $X\sim$ TGFrW exist, for all the values of the parameters. In this setting, let us now discuss the $r$th incomplete moment of $X$, the $r$th raw moment of $X$ with related measures, and the Tsallis entropy.
As usual, the $r$th incomplete moment of $X$ can be expressed in its original integral form. Alternatively, owing to (7) and the equality $\int_{0}^{t}x^{r}g(x;\lambda)\bar{G}(x;\lambda)^{m}dx=(m+1)^{-r/\lambda-1}\gamma(r/\lambda+1,(m+1)t^{\lambda})$, where $\gamma(a,x)=\int_{0}^{x}t^{a-1}e^{-t}dt$ denotes the lower incomplete gamma function, we have
$$\mu_r'(t)=\sum_{k,\ell,m=0}^{+\infty}\Xi_{k,\ell,m}^{[1]}(m+1)^{-r/\lambda-1}\gamma(r/\lambda+1,(m+1)t^{\lambda}).$$
We can manipulate this expansion to derive approximations of the measures and functions presented in Subsection 3.5. Also, by letting $t\to+\infty$, we get the $r$th raw moment of $X$, i.e.,
$$\mu_r'=\sum_{k,\ell,m=0}^{+\infty}\Xi_{k,\ell,m}^{[1]}(m+1)^{-r/\lambda-1}\Gamma(r/\lambda+1),$$
where $\Gamma(a)=\int_{0}^{+\infty}t^{a-1}e^{-t}dt$. As numerical illustrations, Tabs. 5 and 6 collect the values of some measures of the TGFrW distribution derived from the raw moments.
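Table 5: Values of some measures of the TGFrW distribution for several values of $\lambda$ and at $\alpha=\beta=0.5$

| Measures | $\lambda=2$ | $\lambda=4$ | $\lambda=6$ | $\lambda=8$ | $\lambda=10$ | $\lambda=12$ |
|---|---|---|---|---|---|---|
| $\mu_1'$ | 2.051 | 3.037 | 3.704 | 4.206 | 4.608 | 4.944 |
| $\mu_2'$ | 8.162 | 14.247 | 19.206 | 23.432 | 27.134 | 30.442 |
| $\mu_3'$ | 49.068 | 91.927 | 130.437 | 165.658 | 198.27 | 228.746 |
| $\mu_4'$ | 392.912 | 760.85 | 1109 | 1440 | 1757 | 2062 |
| $\sigma^2$ | 3.956 | 5.024 | 5.49 | 5.741 | 5.896 | 5.999 |
| $S$ | 2.046 | 1.611 | 1.449 | 1.367 | 1.319 | 1.288 |
| $K$ | 9.155 | 7.026 | 6.393 | 6.107 | 5.948 | 5.848 |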
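Table 6: Values of some measures of the TGFrW distribution for several values of $\lambda$ and at $\alpha=0.7$ and $\beta=3.0$

| Measures | $\lambda=2$ | $\lambda=4$ | $\lambda=6$ | $\lambda=8$ | $\lambda=10$ | $\lambda=12$ |
|---|---|---|---|---|---|---|
| $\mu_1'$ | 0.634 | 0.881 | 1.04 | 1.159 | 1.254 | 1.333 |
| $\mu_2'$ | 0.639 | 1.062 | 1.393 | 1.669 | 1.909 | 2.122 |
| $\mu_3'$ | 0.925 | 1.689 | 2.357 | 2.959 | 3.51 | 4.021 |
| $\mu_4'$ | 1.786 | 3.417 | 4.936 | 6.368 | 7.728 | 9.028 |
| $\sigma^2$ | 0.236 | 0.286 | 0.311 | 0.326 | 0.336 | 0.344 |
| $S$ | 1.916 | 1.64 | 1.518 | 1.446 | 1.4 | 1.366 |
| $K$ | 8.863 | 7.378 | 6.811 | 6.51 | 6.323 | 6.196 |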
Among others, Tabs. 5 and 6 show how the values of some moment measures of X~TGFrW vary with the values of the parameters. In particular, the mean and the kurtosis vary substantially with λ.
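As an illustration, the variance, S and K rows of Tabs. 5 and 6 can be recovered from the four raw moments. The sketch below assumes that S and K denote the usual moment-based skewness and kurtosis coefficients (their definitions are given in Subsection 3.5, not reproduced here); the function name `moment_measures` is hypothetical.

```python
def moment_measures(m1, m2, m3, m4):
    """Variance, skewness and kurtosis from the first four raw moments."""
    var = m2 - m1**2
    sd = var**0.5
    # Third and fourth central moments expressed through raw moments
    c3 = m3 - 3 * m1 * m2 + 2 * m1**3
    c4 = m4 - 4 * m1 * m3 + 6 * m1**2 * m2 - 3 * m1**4
    return var, c3 / sd**3, c4 / var**2


# Raw moments taken from the lambda = 2 column of Tab. 5 (alpha = beta = 0.5)
var, skew, kurt = moment_measures(2.051, 8.162, 49.068, 392.912)
```

With these inputs, the three outputs agree with the tabulated values 3.956, 2.046 and 9.155 up to the rounding of the raw moments.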
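As described in Subsection 3.6, the Tsallis entropy of the TGFrW distribution is initially defined by an integral expression. A tractable series expansion can be deduced from (9). Indeed, since
$$\int_0^{+\infty} g(x;\lambda)^{\theta}\,\bar{G}(x;\lambda)^m\,dx=\lambda^{\theta-1}(m+\theta)^{-(\theta-1)(\lambda-1)/\lambda-1}\,\Gamma\!\left((\theta-1)(\lambda-1)/\lambda+1\right),$$
provided that $\lambda>\max(1-1/\theta,0)$, we have
$$T_{\theta}(\psi)=\frac{1}{\theta-1}\left[1-\lambda^{\theta-1}\sum_{k,\ell,m=0}^{+\infty}\Xi_{k,\ell,m}^{[\theta]}\,(m+\theta)^{-(\theta-1)(\lambda-1)/\lambda-1}\,\Gamma\!\left((\theta-1)(\lambda-1)/\lambda+1\right)\right].$$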
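For completeness, the integral equality behind this expansion, together with the condition on λ, can be recovered by a change of variable. The derivation below is a sketch that assumes the unit-scale Weibull parent, $g(x;\lambda)=\lambda x^{\lambda-1}e^{-x^{\lambda}}$ and $\bar{G}(x;\lambda)=e^{-x^{\lambda}}$, and uses the substitution $u=(m+\theta)x^{\lambda}$:

$$
\begin{aligned}
\int_0^{+\infty} g(x;\lambda)^{\theta}\,\bar{G}(x;\lambda)^m\,dx
&=\lambda^{\theta}\int_0^{+\infty}x^{\theta(\lambda-1)}e^{-(m+\theta)x^{\lambda}}\,dx\\
&=\lambda^{\theta-1}(m+\theta)^{-[\theta(\lambda-1)+1]/\lambda}\int_0^{+\infty}u^{[\theta(\lambda-1)+1]/\lambda-1}e^{-u}\,du\\
&=\lambda^{\theta-1}(m+\theta)^{-(\theta-1)(\lambda-1)/\lambda-1}\,\Gamma\!\left(\frac{(\theta-1)(\lambda-1)}{\lambda}+1\right),
\end{aligned}
$$

since $[\theta(\lambda-1)+1]/\lambda=(\theta-1)(\lambda-1)/\lambda+1$. The gamma integral converges if and only if $(\theta-1)(\lambda-1)/\lambda+1>0$, that is, $\theta\lambda>\theta-1$, which combined with $\lambda>0$ gives $\lambda>\max(1-1/\theta,0)$.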
Possible values for the Tsallis entropy are shown in Tab. 7.
Table 7: Values of the Tsallis entropy of the TGFrW distribution for several values of the parameters

| α | β | λ | Tsallis entropy, θ=0.5 | θ=0.8 | θ=1.5 | θ=2.0 |
|---|---|---|---|---|---|---|
| 0.5 | 0.5 | 0.5 | 4.2799 | 1.6245 | 0.3671 | 0.0730 |
| 0.5 | 0.5 | 1.0 | 1.6986 | 1.0025 | 0.4899 | 0.3510 |
| 0.5 | 0.5 | 1.5 | 0.9963 | 0.6504 | 0.3515 | 0.2592 |
| 0.5 | 0.5 | 2.0 | 0.6151 | 0.3974 | 0.1935 | 0.1247 |
| 0.5 | 0.5 | 3.0 | 0.1691 | 0.0403 | −0.1071 | −0.1761 |
| 0.5 | 0.5 | 4.0 | −0.1010 | −0.2096 | −0.3776 | −0.4904 |
| 0.7 | 3.0 | 0.5 | 3.1563 | 1.0282 | 0.0233 | −0.2706 |
| 0.7 | 3.0 | 1.0 | 1.2760 | 0.6612 | 0.2274 | 0.1063 |
| 0.7 | 3.0 | 1.5 | 0.7176 | 0.3860 | 0.1026 | 0.0082 |
| 0.7 | 3.0 | 2.0 | 0.3973 | 0.1708 | −0.0553 | −0.1485 |
| 0.7 | 3.0 | 3.0 | 0.0077 | −0.1479 | −0.3668 | −0.5063 |
| 0.7 | 3.0 | 4.0 | −0.2346 | −0.3778 | −0.6525 | −0.8829 |
Tab. 7 reveals that the amount of randomness of the TGFrW distribution, as measured by the Tsallis entropy, is versatile. Indeed, it can take negative values, as well as small or large positive values. The rest of the study focuses on the usefulness of the TGFrW model in a statistical framework.
Estimation: Numerical Study
The ML estimates of the parameters of the TGFrW model, the corresponding CIs and the estimate of the Tsallis entropy can be obtained via the approach described in Subsection 3.7. Here, we provide a numerical study of these statistical objects under the simple random sampling scheme. This scheme is based on the QF defined by (10). The performance of the estimates is assessed through the mean square errors (MSEs), the (average) LBs and UBs of the corresponding 90% and 95% CIs, and the corresponding average lengths (ALs), i.e., AL=UB−LB. The software Mathematica 9 is used in this regard. The following steps are followed.
Step 1: A random sample of values of size n = 100, 200, 300, 1000 and 3000 is generated from the TGFrW distribution.
Step 2: We consider the following sets of parameters: set1: (α=0.5, β=2.0, λ=0.5), set2: (α=0.5, β=2.0, λ=0.3), set3: (α=0.3, β=1.6, λ=0.3) and set4: (α=0.5, β=0.8, λ=0.3).
Step 3: For each of the above sets and each sample of size n, the ML estimates are computed.
Step 4: We repeat the previous steps N times, dealing with different samples, where N = 5000. Then, the MSEs of the estimates are computed.
Step 5: Also, the LBs, UBs and ALs of the 90% and 95% CIs are calculated.
Step 6: Numerical outcomes are given in Tabs. 8–11.
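The steps above can be sketched in code. The study itself was run in Mathematica 9 with the TGFrW QF (10), which is not reproduced here; the sketch below is therefore only an illustration of the same workflow in Python, with the one-parameter unit-scale Weibull distribution swapped in as a stand-in model. The names `weibull_qf`, `fit_shape` and `simulate`, the seed and the reduced number of replications (200 instead of 5000) are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm


def weibull_qf(u, lam):
    """QF of the stand-in model: Q(u) = (-ln(1 - u))^(1/lam)."""
    return (-np.log1p(-u)) ** (1.0 / lam)


def fit_shape(x):
    """ML estimate of lam and its asymptotic standard error."""
    n, logx = x.size, np.log(x)

    def score(lam):  # derivative of the log-likelihood with respect to lam
        return n / lam + logx.sum() - np.sum(x**lam * logx)

    lam_hat = brentq(score, 1e-3, 20.0)
    info = n / lam_hat**2 + np.sum(x**lam_hat * logx**2)  # observed information
    return lam_hat, 1.0 / np.sqrt(info)


def simulate(lam_true, n, n_rep, rng):
    """Average estimate, MSE and average LB, UB, AL of the 90% and 95% CIs."""
    z = {"90": norm.ppf(0.95), "95": norm.ppf(0.975)}
    est, se = np.empty(n_rep), np.empty(n_rep)
    for i in range(n_rep):
        x = weibull_qf(rng.random(n), lam_true)  # inverse transform sampling
        est[i], se[i] = fit_shape(x)
    out = {"est": est.mean(), "mse": np.mean((est - lam_true) ** 2)}
    for level, q in z.items():
        out["lb" + level] = np.mean(est - q * se)
        out["ub" + level] = np.mean(est + q * se)
        out["al" + level] = out["ub" + level] - out["lb" + level]
    return out


rng = np.random.default_rng(2021)
results = {n: simulate(0.5, n, 200, rng) for n in (100, 1000)}
```

For the TGFrW model, `weibull_qf` would be replaced by the QF (10) and `fit_shape` by a three-parameter likelihood maximization, with standard errors taken from the inverse of the observed information matrix.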
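Table 8: Values of ML estimates and CI measures related to the TGFrW model for set1: (α=0.5, β=2.0, λ=0.5). For each n, the rows are listed in the order α, β, λ

| n | Par. | ML Est. | MSE | LB (90%) | UB (90%) | AL (90%) | LB (95%) | UB (95%) | AL (95%) |
|---|---|---|---|---|---|---|---|---|---|
| 100 | α | 0.5704 | 0.3024 | −14.7979 | 15.9387 | 30.7366 | −17.7408 | 18.8815 | 36.6223 |
| | β | 3.1564 | 2.8814 | −297.3980 | 303.7100 | 601.1080 | −354.9510 | 361.2630 | 716.2140 |
| | λ | 0.5225 | 0.1271 | −1.7419 | 2.7869 | 4.5288 | −2.1755 | 3.2205 | 5.3960 |
| 200 | α | 0.5045 | 0.1963 | 0.1065 | 0.9025 | 0.7960 | 0.0303 | 0.9787 | 0.9484 |
| | β | 2.2409 | 1.6482 | −1.1790 | 5.6608 | 6.8398 | −1.8339 | 6.3157 | 8.1496 |
| | λ | 0.5155 | 0.0765 | 0.3840 | 0.6471 | 0.2632 | 0.3588 | 0.6723 | 0.3136 |
| 300 | α | 0.4444 | 0.1383 | 0.1982 | 0.6906 | 0.4924 | 0.1510 | 0.7377 | 0.5867 |
| | β | 1.8926 | 0.9776 | 0.1736 | 3.6115 | 3.4380 | −0.1556 | 3.9407 | 4.0963 |
| | λ | 0.5336 | 0.0564 | 0.4342 | 0.6330 | 0.1988 | 0.4152 | 0.6520 | 0.2368 |
| 1000 | α | 0.5173 | 0.1166 | 0.3472 | 0.6874 | 0.3402 | 0.3147 | 0.7200 | 0.4054 |
| | β | 2.1735 | 0.8951 | 0.9330 | 3.4139 | 2.4809 | 0.6955 | 3.6515 | 2.9560 |
| | λ | 0.4976 | 0.0356 | 0.4407 | 0.5545 | 0.1138 | 0.4298 | 0.5654 | 0.1356 |
| 3000 | α | 0.5015 | 0.0446 | 0.4037 | 0.5993 | 0.1956 | 0.3850 | 0.6180 | 0.2330 |
| | β | 2.0131 | 0.2899 | 1.3450 | 2.6811 | 1.3361 | 1.2171 | 2.8090 | 1.5919 |
| | λ | 0.5013 | 0.0160 | 0.4677 | 0.5348 | 0.0671 | 0.4613 | 0.5412 | 0.0800 |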
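Table 9: Values of ML estimates and CI measures related to the TGFrW model for set2: (α=0.5, β=2.0, λ=0.3). For each n, the rows are listed in the order α, β, λ

| n | Par. | ML Est. | MSE | LB (90%) | UB (90%) | AL (90%) | LB (95%) | UB (95%) | AL (95%) |
|---|---|---|---|---|---|---|---|---|---|
| 100 | α | 0.5499 | 0.2599 | −16.2165 | 17.3163 | 33.5327 | −19.4270 | 20.5268 | 39.9539 |
| | β | 2.7556 | 2.1585 | −190.1750 | 195.6870 | 385.8620 | −227.1190 | 232.6310 | 459.7500 |
| | λ | 0.3052 | 0.0651 | −1.8146 | 2.4249 | 4.2395 | −2.2205 | 2.8308 | 5.0513 |
| 200 | α | 0.5928 | 0.2130 | 0.2001 | 0.9856 | 0.7855 | 0.1249 | 1.0608 | 0.9359 |
| | β | 2.9351 | 2.0150 | −0.6027 | 6.4728 | 7.0755 | −1.2801 | 7.1503 | 8.4304 |
| | λ | 0.2941 | 0.0403 | 0.2219 | 0.3664 | 0.1444 | 0.2081 | 0.3802 | 0.1721 |
| 300 | α | 0.4031 | 0.1666 | 0.1592 | 0.6469 | 0.4877 | 0.1125 | 0.6936 | 0.5811 |
| | β | 1.4359 | 0.9504 | −0.0878 | 2.9597 | 3.0475 | −0.3796 | 3.2515 | 3.6311 |
| | λ | 0.3070 | 0.0386 | 0.2654 | 0.3886 | 0.1233 | 0.2536 | 0.4004 | 0.1469 |
| 1000 | α | 0.5094 | 0.0923 | 0.3336 | 0.6852 | 0.3516 | 0.3000 | 0.7189 | 0.4189 |
| | β | 2.1873 | 0.7859 | 0.8875 | 3.4870 | 2.5995 | 0.6386 | 3.7359 | 3.0973 |
| | λ | 0.2999 | 0.0212 | 0.2642 | 0.3356 | 0.0714 | 0.2574 | 0.3424 | 0.0851 |
| 3000 | α | 0.4844 | 0.0459 | 0.3878 | 0.5810 | 0.1932 | 0.3693 | 0.5995 | 0.2302 |
| | β | 1.9188 | 0.2798 | 1.2746 | 2.5631 | 1.2885 | 1.1512 | 2.6865 | 1.5352 |
| | λ | 0.3037 | 0.0108 | 0.2832 | 0.3242 | 0.0411 | 0.2793 | 0.3282 | 0.0489 |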
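Table 10: Values of ML estimates and CI measures related to the TGFrW model for set3: (α=0.3, β=1.6, λ=0.3). For each n, the rows are listed in the order α, β, λ

| n | Par. | ML Est. | MSE | LB (90%) | UB (90%) | AL (90%) | LB (95%) | UB (95%) | AL (95%) |
|---|---|---|---|---|---|---|---|---|---|
| 100 | α | 0.4413 | 0.2818 | −0.7060 | 1.5886 | 2.2946 | −0.9257 | 1.8083 | 2.7340 |
| | β | 3.1234 | 2.7498 | −11.8344 | 18.0811 | 29.9155 | −14.6986 | 20.9454 | 35.6440 |
| | λ | 0.2837 | 0.0628 | 0.0952 | 0.4722 | 0.3771 | 0.0591 | 0.5083 | 0.4493 |
| 200 | α | 0.4537 | 0.2608 | 0.0842 | 0.8231 | 0.7389 | 0.0135 | 0.8939 | 0.8804 |
| | β | 3.0049 | 2.5383 | −0.7407 | 6.7505 | 7.4911 | −1.4579 | 7.4677 | 8.9256 |
| | λ | 0.2744 | 0.0473 | 0.1972 | 0.3515 | 0.1542 | 0.1825 | 0.3662 | 0.1838 |
| 300 | α | 0.3736 | 0.2031 | 0.0645 | 0.6827 | 0.6182 | 0.0053 | 0.7419 | 0.7366 |
| | β | 2.4107 | 1.9493 | −0.4526 | 5.2741 | 5.7267 | −1.0010 | 5.8224 | 6.8233 |
| | λ | 0.2889 | 0.0395 | 0.2122 | 0.3655 | 0.1533 | 0.1975 | 0.3802 | 0.1827 |
| 1000 | α | 0.2874 | 0.0475 | 0.1648 | 0.4100 | 0.2452 | 0.1413 | 0.4334 | 0.2921 |
| | β | 1.5329 | 0.2688 | 0.7519 | 2.3138 | 1.5619 | 0.6024 | 2.4634 | 1.8610 |
| | λ | 0.3059 | 0.0171 | 0.2664 | 0.3453 | 0.0789 | 0.2588 | 0.3529 | 0.0940 |
| 3000 | α | 0.2951 | 0.0345 | 0.2084 | 0.3419 | 0.1335 | 0.1956 | 0.3547 | 0.1591 |
| | β | 1.5636 | 0.2004 | 1.0464 | 1.8809 | 0.8346 | 0.9665 | 1.9608 | 0.9944 |
| | λ | 0.3089 | 0.0125 | 0.2865 | 0.3313 | 0.0448 | 0.2822 | 0.3356 | 0.0534 |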
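Table 11: Values of ML estimates and CI measures related to the TGFrW model for set4: (α=0.5, β=0.8, λ=0.3). For each n, the rows are listed in the order α, β, λ

| n | Par. | ML Est. | MSE | LB (90%) | UB (90%) | AL (90%) | LB (95%) | UB (95%) | AL (95%) |
|---|---|---|---|---|---|---|---|---|---|
| 100 | α | 0.5612 | 0.2059 | 0.1317 | 0.9907 | 0.8591 | 0.0494 | 1.0730 | 1.0236 |
| | β | 1.3312 | 1.1459 | −1.4695 | 4.1319 | 5.6014 | −2.0058 | 4.6682 | 6.6740 |
| | λ | 0.2974 | 0.0280 | 0.2223 | 0.3724 | 0.1501 | 0.2079 | 0.3868 | 0.1789 |
| 200 | α | 0.4886 | 0.1425 | 0.2378 | 0.7394 | 0.5016 | 0.1898 | 0.7874 | 0.5977 |
| | β | 0.9538 | 1.0425 | −0.4601 | 2.1077 | 2.5678 | −0.7059 | 2.3536 | 3.0595 |
| | λ | 0.3022 | 0.0200 | 0.2539 | 0.3505 | 0.0966 | 0.2447 | 0.3598 | 0.1151 |
| 300 | α | 0.4860 | 0.1510 | 0.2875 | 0.6845 | 0.3971 | 0.2495 | 0.7226 | 0.4731 |
| | β | 0.6807 | 0.7732 | −0.2549 | 1.6162 | 1.8712 | −0.4341 | 1.7954 | 2.2295 |
| | λ | 0.3017 | 0.0312 | 0.2641 | 0.3393 | 0.0752 | 0.2569 | 0.3465 | 0.0896 |
| 1000 | α | 0.4751 | 0.1266 | 0.3623 | 0.5880 | 0.2257 | 0.3407 | 0.6096 | 0.2689 |
| | β | 0.7093 | 0.6065 | 0.1571 | 1.2615 | 1.1045 | 0.0513 | 1.3673 | 1.3159 |
| | λ | 0.3047 | 0.0231 | 0.2830 | 0.3264 | 0.0434 | 0.2788 | 0.3305 | 0.0517 |
| 3000 | α | 0.4615 | 0.1005 | 0.3993 | 0.5236 | 0.1243 | 0.3874 | 0.5355 | 0.1481 |
| | β | 0.7687 | 0.5170 | 0.3819 | 0.9556 | 0.5737 | 0.3269 | 1.0105 | 0.6836 |
| | λ | 0.3066 | 0.0172 | 0.2942 | 0.3191 | 0.0249 | 0.2918 | 0.3215 | 0.0297 |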
For all the considered sets of parameters, the values in Tabs. 8–11 indicate that the ML estimates approach the true values as n increases. Also, as expected, the MSEs and ALs decrease overall and tend to 0 as n becomes large.
Now, we check the numerical performance of the estimate of the Tsallis entropy of the TGFrW model as described in Subsection 3.7. In this regard, Tabs. 12–15 list the values of this estimate under the simulation scenario described above. As criterion, we adopt the relative bias (RB), defined as RB = (Estimate − Exact value)/Exact value; the tables report its absolute value.
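As a small worked example of the RB formula, the sketch below uses the first entry of Tab. 12 (θ=0.5, n=100); the function name `relative_bias` is illustrative. Note that the signed RB is negative here, whereas the tabulated RB values are all nonnegative, which suggests that the tables list magnitudes.

```python
def relative_bias(estimate, exact):
    """Signed relative bias: (estimate - exact) / exact."""
    return (estimate - exact) / exact


# Estimate and exact value taken from Tab. 12 (set 1, theta = 0.5, n = 100)
rb = relative_bias(2.293, 2.9917)
rb_magnitude = abs(rb)
```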
Table 12: Values of the Tsallis entropy estimates related to the TGFrW model for set 1: (α=0.5, β=2.0, λ=0.5)

| n | Est. (θ=0.5) | RB (θ=0.5) | Est. (θ=0.8) | RB (θ=0.8) | Est. (θ=1.5) | RB (θ=1.5) | Est. (θ=2.0) | RB (θ=2.0) |
|---|---|---|---|---|---|---|---|---|
| Exact value | 2.9917 | – | 0.8076 | – | −0.3096 | – | −0.7971 | – |
| 100 | 2.293 | 0.233 | 0.534 | 0.339 | −0.43 | 0.389 | −0.908 | 0.139 |
| 200 | 2.517 | 0.159 | 0.699 | 0.135 | −0.355 | 0.145 | −0.895 | 0.123 |
| 300 | 2.675 | 0.106 | 0.653 | 0.191 | −0.398 | 0.285 | −0.834 | 0.046 |
| 1000 | 2.956 | 0.012 | 0.783 | 0.031 | −0.325 | 0.05 | −0.813 | 0.02 |
| 3000 | 2.976 | 0.0054 | 0.806 | 0.0023 | −0.305 | 0.014 | −0.788 | 0.012 |
Table 13: Values of the Tsallis entropy estimates related to the TGFrW model for set 2: (α=0.5, β=2.0, λ=0.3)

| n | Est. (θ=0.5) | RB (θ=0.5) | Est. (θ=0.8) | RB (θ=0.8) | Est. (θ=1.5) | RB (θ=1.5) | Est. (θ=2.0) | RB (θ=2.0) |
|---|---|---|---|---|---|---|---|---|
| Exact value | 4.4915 | – | 0.6644 | – | −2.3148 | – | −7.3409 | – |
| 100 | 3.94 | 0.123 | 0.443 | 0.334 | −2.8 | 0.209 | −9.798 | 0.335 |
| 200 | 4.148 | 0.077 | 0.528 | 0.205 | −2.31478 | 0.044 | −6.683 | 0.09 |
| 300 | 4.323 | 0.037 | 0.581 | 0.126 | −2.256 | 0.026 | −6.926 | 0.057 |
| 1000 | 4.455 | 0.0081 | 0.605 | 0.089 | −2.354 | 0.017 | −7.607 | 0.036 |
| 3000 | 4.474 | 0.004 | 0.652 | 0.018 | −2.332 | 0.0074 | −7.393 | 0.0071 |
Table 14: Values of the Tsallis entropy estimates related to the TGFrW model for set 3: (α=0.3, β=1.6, λ=0.3)

| n | Est. (θ=0.5) | RB (θ=0.5) | Est. (θ=0.8) | RB (θ=0.8) | Est. (θ=1.5) | RB (θ=1.5) | Est. (θ=2.0) | RB (θ=2.0) |
|---|---|---|---|---|---|---|---|---|
| Exact value | 3.2636 | – | −0.2573 | – | −6.5709 | – | −34.0403 | – |
| 100 | 2.58 | 0.21 | −0.541 | 1.102 | −6.025 | 0.083 | −27.92 | 0.18 |
| 200 | 2.799 | 0.142 | −0.483 | 0.876 | −6.222 | 0.053 | −28.912 | 0.151 |
| 300 | 2.867 | 0.122 | −0.43 | 0.672 | −6.229 | 0.052 | −29.127 | 0.144 |
| 1000 | 3.227 | 0.011 | −0.218 | 0.153 | −6.734 | 0.025 | −33.058 | 0.029 |
| 3000 | 3.243 | 0.0063 | −0.259 | 0.0071 | −6.474 | 0.015 | −33.084 | 0.028 |
Table 15: Values of the Tsallis entropy estimates related to the TGFrW model for set 4: (α=0.5, β=0.8, λ=0.3)

| n | Est. (θ=0.5) | RB (θ=0.5) | Est. (θ=0.8) | RB (θ=0.8) | Est. (θ=1.5) | RB (θ=1.5) | Est. (θ=2.0) | RB (θ=2.0) |
|---|---|---|---|---|---|---|---|---|
| Exact value | 6.1719 | – | 1.5863 | – | −0.9366 | – | −3.2603 | – |
| 100 | 5.841 | 0.054 | 1.422 | 0.104 | −1.157 | 0.236 | −3.847 | 0.18 |
| 200 | 5.864 | 0.05 | 1.43 | 0.099 | −1.108 | 0.183 | −3.749 | 0.15 |
| 300 | 5.94 | 0.038 | 1.472 | 0.072 | −1.013 | 0.082 | −3.171 | 0.027 |
| 1000 | 6.253 | 0.013 | 1.632 | 0.029 | −0.892 | 0.047 | −3.333 | 0.022 |
| 3000 | 6.125 | 0.0075 | 1.573 | 0.0085 | −0.96 | 0.025 | −3.27 | 0.00298 |
For all the considered sets of parameters, the values in Tabs. 12–15 indicate that the estimates of the Tsallis entropy stabilize to the exact values as n increases. Also, the RBs decrease overall and tend to 0 as n becomes large, which is consistent with the expected theoretical convergence.
Data Analysis
Here, we show that the TGFrW model is well suited to fit practical data of various kinds, with better results in comparison to established extended Weibull models. More specifically, the two following data sets are considered.
The first data set, called datasetI, contains 100 observations of the waiting time (in minutes) before a client receives the desired service in a bank. It is: datasetI = {0.8, 0.8, 1.3, 1.5, 1.8, 1.9, 1.9, 2.1, 2.6, 2.7, 2.9, 3.1, 3.2, 3.3, 3.5, 3.6, 4, 4.1, 4.2, 4.2, 4.3, 4.3, 4.4, 4.4, 4.6, 4.7, 4.7, 4.8, 4.9, 4.9, 5.0, 5.3, 5.5, 5.7, 5.7, 6.1, 6.2, 6.2, 6.2, 6.3, 6.7, 6.9, 7.1, 7.1, 7.1, 7.1, 7.4, 7.6, 7.7, 8, 8.2, 8.6, 8.6, 8.6, 8.8, 8.8, 8.9, 8.9, 9.5, 9.6, 9.7, 9.8, 10.7, 10.9, 11.0, 11.0, 11.1, 11.2, 11.2, 11.5, 11.9, 12.4, 12.5, 12.9, 13.0, 13.1, 13.3, 13.6, 13.7, 13.9, 14.1, 15.4, 15.4, 17.3, 17.3, 18.1, 18.2, 18.4, 18.9, 19.0, 19.9, 20.6, 21.3, 21.4, 21.9, 23, 27, 31.6, 33.1, 38.5}. The reference for this data is [30].
The second data set, called datasetII, represents 30 successive values of precipitation (in inches), in one month, in Minneapolis. It is: datasetII = {0.77, 1.74, 0.81, 1.20, 1.95, 1.20, 0.47, 1.43, 3.37, 2.20, 3.00, 3.09, 1.51, 2.10, 0.52, 1.62, 1.31, 0.32, 0.59, 0.81, 2.81, 1.87, 1.18, 1.35, 4.75, 2.48, 0.96, 1.89, 0.90, 2.05}. The reference for this data is [31].
The following competitors are taken into account: truncated Fréchet-Weibull (TFrW) model proposed by [6], odd log-logistic Weibull (OLLW) model introduced by [32], beta Weibull (BW) model by [33], exponentiated Weibull (EW) model introduced by [34], and gamma-exponentiated exponential (GE) model studied by [35].
For all the models, the parameters are estimated via the ML method. We refer to Subsection 3.7 concerning the ML estimates of the TGFrW model. As standard criteria of comparison, the following measures are taken into account: −ℓ̂, AIC, BIC, W, A, KS and p-value (KS), corresponding to the minus estimated log-likelihood function at the data, Akaike information criterion, Bayesian information criterion, Cramér–von Mises statistic, Anderson–Darling statistic, Kolmogorov–Smirnov statistic and the p-value of the Kolmogorov–Smirnov test, respectively. The corresponding mathematical formulas are described below.
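$$\mathrm{AIC}=-2\hat{\ell}+2p,\qquad \mathrm{BIC}=-2\hat{\ell}+\ln(n)\,p,$$
$$W=\left(\frac{1}{2n}+1\right)\left[\frac{1}{12n}+\sum_{i=1}^{n}\left(y_i-\frac{2i-1}{2n}\right)^2\right],$$
$$A=-\left(\frac{9}{4n^2}+\frac{3}{4n}+1\right)\left[n+\frac{1}{n}\sum_{i=1}^{n}(2i-1)\left\{\ln(y_i)+\ln(1-y_{n-i+1})\right\}\right],$$
$$\mathrm{KS}=\max\left(y_i-\frac{i-1}{n},\ \frac{i}{n}-y_i\right),\qquad p\text{-value}=P\left(\sup_{x\in\mathbb{R}}\left|F_n(x)-\hat{F}(x)\right|\geq \mathrm{KS}\right),$$
where n is the number of observations, p is the number of parameters of the considered model, $x_{(1)},\ldots,x_{(n)}$ are the ordered observations, $y_i=\hat{F}(x_{(i)})$, where $\hat{F}(x)$ denotes the estimated CDF of the model involving the ML estimates for the parameters and $F_n(x)$ denotes the random empirical CDF. The details on these statistical measures can be found in [36,37].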
It is admitted that the smaller the values of AIC, BIC, W, A and KS and the greater the values of p-value (KS), the better the model is to fit to the considered data. The software R is used for all the calculations.
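The criteria above can be sketched in code. The paper's computations were carried out in R; the Python functions below are only an illustrative transcription of the formulas, with hypothetical names (`aic`, `bic`, `gof_statistics`). Here `neg_loglik` stands for −ℓ̂ and `y` for the fitted CDF values $y_i=\hat{F}(x_{(i)})$.

```python
import numpy as np


def aic(neg_loglik, p):
    """AIC = -2*loglik + 2p, written in terms of the minus log-likelihood."""
    return 2.0 * neg_loglik + 2.0 * p


def bic(neg_loglik, p, n):
    """BIC = -2*loglik + ln(n)*p."""
    return 2.0 * neg_loglik + np.log(n) * p


def gof_statistics(y):
    """W, A and KS computed from the fitted CDF values at the data."""
    y = np.sort(np.asarray(y, dtype=float))  # y_1 <= ... <= y_n
    n = y.size
    i = np.arange(1, n + 1)
    w = (1.0 / (2 * n) + 1.0) * (1.0 / (12 * n) + np.sum((y - (2 * i - 1) / (2 * n)) ** 2))
    # y[::-1] holds y_{n-i+1} at position i
    s = np.sum((2 * i - 1) * (np.log(y) + np.log(1.0 - y[::-1]))) / n
    a = -(9.0 / (4 * n**2) + 3.0 / (4 * n) + 1.0) * (n + s)
    ks = np.max(np.maximum(y - (i - 1) / n, i / n - y))
    return w, a, ks


# AIC and BIC of the TGFrW model for datasetI (minus log-likelihood 320.2373, p = 3, n = 100)
aic_tgfrw = aic(320.2373, 3)
bic_tgfrw = bic(320.2373, 3, 100)
```

With the minus log-likelihood reported for the TGFrW model on datasetI, these functions return the AIC and BIC values listed in Tab. 18 up to rounding.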
For the considered models, the ML estimates with their related standard errors (SEs) are reported in Tabs. 16 and 17 for datasetI and datasetII, respectively.
Table 16: Values of the ML estimates and SEs (in parentheses) for datasetI

| Model | Parameters | 1st | 2nd | 3rd | 4th |
|---|---|---|---|---|---|
| TGFrW | (α,β,λ) | 9.6321 (0.8984) | 618.6199 (3.0245) | 0.4942 (0.0219) | – |
| TFrW | (a,b,k,λ) | 39.9630 (18.96786) | 80.1455 (21.2212) | 0.1505 (0.2917) | 6.3061 (0.0978) |
| OLLW | (α,β,γ,λ) | 2.2904 (36.4870) | 4.4102 (7.4534) | 1.2739 (0.5479) | 0.0125 (0.0412) |
| BW | (α,β,θ,λ) | 7.3516 (2.1070) | 0.1251 (0.0137) | 1.3381 (0.0454) | 0.8985 (0.0354) |
| EW | (α,β,λ) | 2.7159 (1.1209) | 0.2897 (0.2110) | 85.3984 (1.1282) | – |
Table 17: Values of the ML estimates and SEs (in parentheses) for datasetII

| Model | Parameters | 1st | 2nd | 3rd | 4th |
|---|---|---|---|---|---|
| TGFrW | (α,β,λ) | 4.7180 (1.0310) | 622.2116 (7.8857) | 0.5200 (0.1425) | – |
| TFrW | (a,b,k,λ) | 21.7391 (0.8977) | 3.8465 (6.0870) | 0.3587 (1.0098) | 4.3312 (0.4565) |
| OLLW | (α,β,γ,λ) | 30.0389 (16.4171) | 39.1226 (0.9114) | 1.7002 (1.9921) | 0.0085 (0.5161) |
| GE | (α,β,λ) | 0.4278 (0.2033) | 1.0293 (0.4740) | 1.3365 (0.7082) | – |
| EW | (α,β,λ) | 4.3770 (0.8867) | 0.3623 (0.4754) | 91.6295 (0.0755) | – |
In particular, for datasetI, the parameters α, β and λ of the TGFrW model are estimated by α̃=9.6321, β̃=618.6199 and λ̃=0.4942, respectively, and for datasetII, they are estimated by α̃=4.7180, β̃=622.2116 and λ̃=0.5200, respectively. We remark that the novel parameter β is estimated far from 1, making a strong difference between the estimated TGFrW model and the former estimated TFrW model.
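From Tabs. 18 and 19, the TGFrW model has the smallest AIC, BIC and KS values and the largest p-values (KS) for both data sets, with p-values (KS) of 0.8001 and 0.9217, respectively. For datasetI, the BW model attains slightly smaller values of −ℓ̂, W and A, but with one more parameter. As an important remark, the TGFrW model surpasses the former TFrW model on all the considered criteria, justifying the importance of the generalization.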
Table 18: Values of the considered criteria for datasetI

| Distribution | −ℓ̂ | AIC | BIC | W | A | KS | p-value (KS) |
|---|---|---|---|---|---|---|---|
| TGFrW | 320.2373 | 646.4747 | 654.2902 | 0.0781 | 0.5756 | 0.0644 | 0.8001 |
| TFrW | 327.9006 | 663.8012 | 674.2219 | 0.2428 | 1.64581 | 0.0929 | 0.3531 |
| OLLW | 389.4066 | 786.8133 | 797.2340 | 0.5317 | 3.2213 | 0.5161 | 0.0021 |
| BW | 319.7962 | 647.5924 | 658.0131 | 0.0644 | 0.4826 | 0.0890 | 0.4058 |
| EW | 322.6523 | 651.3046 | 659.1201 | 0.1292 | 0.9139 | 0.0726 | 0.6663 |
Table 19: Values of the considered criteria for datasetII

| Distribution | −ℓ̂ | AIC | BIC | W | A | KS | p-value (KS) |
|---|---|---|---|---|---|---|---|
| TGFrW | 38.9692 | 83.9384 | 88.1420 | 0.0406 | 0.2589 | 0.1006 | 0.9217 |
| TFrW | 39.4797 | 86.95941 | 92.5642 | 0.0622 | 0.3877 | 0.1211 | 0.7708 |
| OLLW | 60.2569 | 128.5138 | 134.1186 | 0.1585 | 1.0085 | 0.5327 | 0.00084 |
| GE | 39.9177 | 85.83549 | 90.03909 | 0.0491 | 0.3552 | 0.1121 | 0.8451 |
| EW | 39.30276 | 84.60552 | 88.80911 | 0.0561 | 0.3499 | 0.1137 | 0.8323 |
Several kinds of fits of the TGFrW model are shown in Figs. 3 and 4 for datasetI and datasetII, respectively. Specifically, the estimated PDFs of the TGFrW distribution are plotted over the corresponding histograms and the estimated CDFs are plotted over the empirical CDFs. The empirical probabilities versus estimated probabilities (P-P) plots and the empirical quantiles versus estimated quantiles (Q-Q) plots are also shown. In all the cases, a near perfect fit is observed, validating the remarkable performance of the TGFrW model.
Figure 3: Various fits of the TGFrW model for datasetI: (a) estimated PDF, (b) estimated CDF, (c) P-P plot and (d) Q-Q plot
Figure 4: Various fits of the TGFrW model for datasetII: (a) estimated PDF, (b) estimated CDF, (c) P-P plot and (d) Q-Q plot
Conclusion
We have motivated the use of the truncated generalized Fréchet distribution to define a new generalized family of continuous distributions, called the truncated generalized Fréchet generated (TGFr-G) family. Diverse mathematical and practical investigations show the potential of the new family, supported by detailed graphical and numerical evidence. A focus is put on the truncated generalized Fréchet Weibull (TGFrW) distribution, with a complete statistical treatment of the related model. Comparative fits are performed on two practical data sets, with results favorable to the new model in comparison to other popular extended Weibull models. In particular, under a comparable setting, the new model surpasses the former truncated Fréchet model. As perspectives for future work, other special models of the TGFr-G family may be the subject of further investigation, especially those with support on ℝ. Also, the bivariate extensions of the TGFr-G family can be explored further, with applications in the fields of regression and clustering, for instance. Finally, applications in physics remain an interesting challenge, exploring the possible randomness of various networks [38] and various differential equations [39].
We thank the reviewers for their thorough comments and remarks, which contributed to improving the quality of the paper. This work was funded by the Deanship of Scientific Research (DSR), King AbdulAziz University, Jeddah, under Grant No. (G:531-305-1441). The authors gratefully acknowledge the DSR technical and financial support.
Funding Statement: This work was funded by the Deanship of Scientific Research (DSR), King AbdulAziz University, Jeddah, under Grant No. G:531-305-1441. The authors gratefully acknowledge the DSR technical and financial support.
References
[1] Tahir, M. H., Cordeiro, G. M. (2016). Compounding of distributions: A survey and new generalized classes. 3(1), 37. DOI 10.1186/s40488-016-0052-1.
[2] Brito, C., Rêgo, L., Oliveira, W., Gomes-Silva, F. (2019). Method for generating distributions and classes of probability distributions: The univariate case. 48(3), 897–930.
[3] Ahmad, Z., Hamedani, G. G., Butt, N. S. (2019). Recent developments in distribution theory: A brief survey and some new generalized classes of distributions. 15(1), 87–110. DOI 10.18187/pjsor.v15i1.2803.
[4] Mahdavi, A., Silva, G. (2016). A method to expand family of continuous distributions based on truncated. 13, 231–247.
[5] Barreto-Souza, W., Simas, A. B. (2013). The exp-G family of probability distributions. 27(1), 84–109. DOI 10.1214/11-BJPS157.
[6] Abid, A., Abdulrazak, R. (2017). [0,1] truncated Fréchet-G generator of distributions. 7, 51–66.
[7] Bantan, R., Jamal, F., Chesneau, C., Elgarhy, M. (2019). Truncated inverted Kumaraswamy generated family of distributions with applications. 21(1089), 1–22.
[8] Najarzadegan, H., Alamatsaz, M. H., Hayati, S. (2017). Truncated Weibull-G more flexible and more reliable than beta-G distribution. 6(5), 1–17. DOI 10.5539/ijsp.v6n5p1.
[9] Aldahlan, M., Jamal, F., Chesneau, C., Elgarhy, M., Elbatal, I. (2019). The truncated Cauchy power family of distributions with inference and applications. 22(346), 1–24. DOI 10.3390/e22010001.
[10] Jamal, F., Bakouch, H., Nasir, M. (2020). A truncated general-G class of distributions with application to truncated burr-g family. (in press).
[11] Aldahlan, M. (2019). Type II truncated Fréchet generated family of distributions. 7, 221–228.
[12] Akbarinasab, M., Arabpour, A., Mahdavi, A. (2019). Truncated log-logistic family of distributions. 5(2), 137–147.
[13] Alzaatreh, A., Aljarrah, M. A., Smithson, M., Shahbaz, S. H., Shahbaz, M. Q. et al. (2020). Truncated family of distributions with applications to time and cost to start a business. 42(6), 547. DOI 10.1007/s11009-020-09801-1.
[14] Hassan, A., Sabry, M., Elsehetry, A. (2020). A new family of upper-truncated distributions: Properties and estimation. 18(2), 196–214.
[15] Nadarajah, S., Kotz, S. (2003). The exponentiated Fréchet distribution. 1–7.
[16] Okorie, I. E., Akpanta, A. C., Ohakwe, J. (2016). The exponentiated Gumbel type-2 distribution: Properties and application. 2016(2), 1–10. DOI 10.1155/2016/5898356.
[17] Gupta, R., Kundu, D. (2001). Exponentiated exponential family: An alternative to Gamma and Weibull distributions. 43, 117–130.
[18] Mansour, M., Aryal, G., Afify, A., Ahmad, M. (2018). The Kumaraswamy exponentiated Fréchet distribution. 34(3), 177–193.
[19] Surles, J. G., Padgett, W. J. (2001). Inference for reliability and stress-strength for a scaled Burr-type X distribution. 7(2), 187–200. DOI 10.1023/A:1011352923990.
[20] Kotz, S., Lumelskii, Y., Pensky, M. (2003). Singapore: World Scientific.
[21] Cordeiro, G., Silva, R., Nascimento, A. (2020). Sharjah, UAE: Bentham Books.
[22] Amigo, J., Balogh, S., Hernandez, S. (2018). A brief review of generalized entropies. 20(11), 813. DOI 10.3390/e20110813.
[23] Rényi, A. (1961). On measures of entropy and information. vol. 1, pp. 547–561, Univ. of California Press.
[24] Havrda, J., Charvát, F. (1967). Quantification method of classification processes, concept of structural α-entropy. 3(1), 30–35.
[25] Arimoto, S. (1971). Information-theoretical considerations on estimation problems. 19(3), 181–194. DOI 10.1016/S0019-9958(71)90065-9.
[26] Awad, A., Alawneh, A. (1987). Application of entropy to a life-time model. 4(2), 143–148. DOI 10.1093/imamci/4.2.143.
[27] Tsallis, C. (1988). Possible generalization of Boltzmann-Gibbs statistics. 52(1–2), 479–487. DOI 10.1007/BF01016429.
[28] Casella, G., Berger, R. (1990). Bel Air, CA, USA: Brooks/Cole Publishing Company.
[29] Nelsen, R. (2006). 2nd edition, New York: Springer-Verlag.
[30] Ghitany, M. E., Atieh, B., Nadarajah, S. (2008). Lindley distribution and its application. 78(4), 493–506. DOI 10.1016/j.matcom.2007.06.007.
[31] Hinkley, D. (1977). On quick choice of power transformations. 26, 67–69.
[32] Cordeiro, G. M., Alizadeh, M., Ozel, G., Hosseini, B., Ortega, E. M. M. et al. (2016). The generalized odd log-logistic family of distributions: Properties, regression models and applications. 87(5), 908–932. DOI 10.1080/00949655.2016.1238088.
[33] Lee, C., Famoye, F., Olumolade, O. (2007). Beta-weibull distribution: Some properties and applications to censored data. 6(1), 173–186. DOI 10.22237/jmasm/1177992960.
[34] Pal, M., Ali, M., Woo, J. (2006). Exponentiated Weibull distribution. 66(2), 139–147.
[35] Ristić, M. M., Balakrishnan, N. (2012). The gamma-exponentiated exponential distribution. 82(8), 1191–1206. DOI 10.1080/00949655.2011.574633.
[36] Chen, G., Balakrishnan, N. (2018). A general purpose approximate goodness-of-fit test. 27(2), 154–161. DOI 10.1080/00224065.1995.11979578.
[37] Massey, F. J. Jr. (1951). The Kolmogorov-Smirnov test for goodness of fit. 46(253), 68–78. DOI 10.1080/01621459.1951.10500769.
[38] Liu, J. B., Zhao, J., Cai, Z. Q. (2020). On the generalized adjacency, Laplacian and signless Laplacian spectra of the weighted edge corona networks. 540, 123073. DOI 10.1016/j.physa.2019.123073.
[39] Akgül, A. (2018). A novel method for a fractional derivative with non-local and non-singular kernel. 114, 478–482. DOI 10.1016/j.chaos.2018.07.032.