Statistical Hypothesis Test In Three Factor ANOVA Model Under Fuzzy Environments Using Trapezoidal Fuzzy Numbers

This paper deals with the three-factor ANOVA model (Latin Square Design, LSD) test using trapezoidal fuzzy numbers (tfns.). The proposed test is analysed under various types of trapezoidal fuzzy models such as Alpha Cut Interval, Membership Function, Ranking Function, Total Integral Value and Graded Mean Integration Representation. Finally, a comparative view of the conclusions obtained from the various tests is given. Moreover, two numerical examples having different conclusions are given for a concrete comparative study.


Introduction
Fuzzy set theory [29] has been applied to many areas which need to manage uncertain and vague data. Such areas include approximate reasoning, decision making, optimization, control and so on. In traditional statistical testing [11], the observations of the sample are crisp and a statistical test leads to a binary decision. However, in real life, the data sometimes cannot be recorded or collected precisely. Statistical hypothesis testing under fuzzy environments has been studied by many authors using the fuzzy set theory concepts introduced by Zadeh [29]. Viertl [23] investigated some methods to construct confidence intervals and statistical tests for fuzzy data. Wu [27] proposed some approaches to construct fuzzy confidence intervals for the unknown fuzzy parameter. A new approach to the problem of testing statistical hypotheses was introduced by Chachi et al. [8]. Mikihiko Konishi et al. [15] proposed a method of ANOVA for fuzzy interval data by using the concept of fuzzy sets. Hypothesis testing of the one factor ANOVA model for fuzzy data was proposed by Wu [26,28] using the h-level set and the notions of pessimistic degree and optimistic degree by solving optimization problems. Gajivaradhan and Parthiban analysed the one-way ANOVA test using the alpha cut interval method for trapezoidal fuzzy numbers [16] and presented a comparative study of the 2-factor ANOVA test under fuzzy environments using various methods [17]. Liou and Wang ranked fuzzy numbers with total integral value [14]. Wang et al. presented the method for centroid formulae for a generalized fuzzy number [25]. Iuliana Carmen BĂRBĂCIORU dealt with statistical hypothesis testing using the membership function of fuzzy numbers [12]. Salim Rezvani analysed the ranking functions with trapezoidal fuzzy numbers [20]. Wang arrived at a different approach for ranking trapezoidal fuzzy numbers [25]. Thorani et al. approached the ranking function of a trapezoidal fuzzy number with some modifications [21].
Salim Rezvani and Mohammad Molani presented the shape function and Graded Mean Integration Representation for trapezoidal fuzzy numbers [19]. Liou and Wang proposed the Total Integral Value of the trapezoidal fuzzy number with the index of optimism and pessimism [14].
In this paper, we propose a new statistical fuzzy hypothesis test of ANOVA for three factors of classification (Latin Square Design, LSD) in which the designated samples are fuzzy data (trapezoidal fuzzy numbers). The main idea of the proposed approach is: when we have some vague data about an experiment, what is the result when the centroid point/ranking grades of those imprecise data are employed in hypothesis testing? For this reason, we use the centroid/ranking grades of trapezoidal fuzzy numbers (tfns.) in hypothesis testing.
Suppose the observed samples are in terms of tfns.; we can then use the centroid/ranking grades of the tfns. for statistical hypothesis testing. In arriving at the centroid/ranking grades of tfns., various methods are used to test which could be the best fit. Therefore, in the proposed approach, the centroid point/ranking grades of tfns. are used in LSD. Moreover, we provide the decision rules which are used to accept or reject the fuzzy null and alternative hypotheses. In fact, we would like to counter an argument that the alpha cut interval method can be general enough to deal with the 3-factor ANOVA method (LSD) under fuzzy environments. In the decision rules of the proposed testing technique, degrees of optimism, pessimism and h-level sets are not used, although they are used in Wu [26]. For a better understanding of the proposed fuzzy hypothesis testing technique of LSD using tfns., two different kinds of numerical examples are illustrated for each model. The same concept can also be used when we have samples in terms of triangular fuzzy numbers [5,26].
where a, b, c, d are real numbers such that $a < b \le c < d$.

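Definition 2.2. A fuzzy set $\tilde{A}$ is called a normal fuzzy set if there exists an element (member) 'x' such that $\mu_{\tilde{A}}(x) = 1$. For $x \in X$ and $\alpha \in [0, 1]$, the set $\tilde{A}_\alpha = \{x \in X : \mu_{\tilde{A}}(x) \ge \alpha\}$ is said to be the α-cut of the fuzzy set $\tilde{A}$.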

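Definition 2.3. A fuzzy subset $\tilde{A}$ of the real line $\mathbb{R}$ with membership function $\mu_{\tilde{A}} : \mathbb{R} \to [0, 1]$ is called a fuzzy number if (i) $\tilde{A}$ is normal, (ii) $\tilde{A}$ is fuzzy convex, (iii) $\mu_{\tilde{A}}$ is upper semi-continuous and (iv) $\mathrm{supp}(\tilde{A})$ is bounded, where $\mathrm{supp}(\tilde{A}) = \mathrm{cl}\{x \in \mathbb{R} : \mu_{\tilde{A}}(x) > 0\}$ and 'cl' is the closure operator.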
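It is known that for a normalized tfn. $\tilde{A} = (a, b, c, d; 1)$, there exist four numbers $a, b, c, d \in \mathbb{R}$ and two functions $L(x), R(x) : \mathbb{R} \to [0, 1]$, where $L(x)$ and $R(x)$ are non-decreasing and non-increasing functions respectively. Its membership function is defined as follows:
$$\mu_{\tilde{A}}(x) = \begin{cases} L(x) = \dfrac{x-a}{b-a}, & a \le x \le b; \\ 1, & b \le x \le c; \\ R(x) = \dfrac{x-d}{c-d}, & c \le x \le d; \\ 0, & \text{otherwise.} \end{cases}$$
The functions $L(x)$ and $R(x)$ are also called the left and right sides of the fuzzy number $\tilde{A}$ respectively [9]. In this paper, we assume that $\int_{-\infty}^{\infty} \mu_{\tilde{A}}(x)\,dx < +\infty$, and it is known that the α-cut of a fuzzy number is $\tilde{A}_\alpha = \{x \in \mathbb{R} : \mu_{\tilde{A}}(x) \ge \alpha\}$ for $\alpha \in (0, 1]$. According to the definition of a fuzzy number, it is seen at once that every α-cut of a fuzzy number is a closed interval. Hence, for a fuzzy number $\tilde{A}$, we have $\tilde{A}_\alpha = [\tilde{A}_L(\alpha), \tilde{A}_U(\alpha)]$, where $\tilde{A}_L(\alpha) = \inf\{x \in \mathbb{R} : \mu_{\tilde{A}}(x) \ge \alpha\}$ and $\tilde{A}_U(\alpha) = \sup\{x \in \mathbb{R} : \mu_{\tilde{A}}(x) \ge \alpha\}$. More generally, for a shape parameter $n > 0$, the sides $L(x) = \left(\dfrac{x-a}{b-a}\right)^n$ and $R(x) = \left(\dfrac{d-x}{d-c}\right)^n$ can also be termed the left and right spread of the tfn. [Dubois and Prade in 1981].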
When n = 1 and b = c, we get a triangular fuzzy number. The conditions n = 1, a = b and c = d imply the closed interval, and in the case n = 1, a = b = c = d = t (some constant), we get the crisp number 't'. Since a trapezoidal fuzzy number is completely characterized by n = 1 and four real numbers $a \le b \le c \le d$, it is often denoted as $\tilde{A} = (a, b, c, d)$, and the family of trapezoidal fuzzy numbers will be denoted by $F^T(\mathbb{R})$. Now, for n = 1 we have a normal trapezoidal fuzzy number $\tilde{A} = (a, b, c, d)$ and the corresponding α-cut is defined by
$$\tilde{A}_\alpha = [\,a + \alpha(b - a),\; d - \alpha(d - c)\,], \quad \alpha \in [0, 1].$$
We also need the following results, which can be found in [11,13].
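The α-cut of a normal trapezoidal fuzzy number can be illustrated with a short sketch. It assumes the usual closed form $[a + \alpha(b-a),\, d - \alpha(d-c)]$ for n = 1; the function name `alpha_cut` and the tuple representation are illustrative choices, not notation from the paper.

```python
def alpha_cut(tfn, alpha):
    """Alpha-cut [lower, upper] of a normal trapezoidal fuzzy number (a, b, c, d).

    Assumes the n = 1 (linear sides) case: [a + alpha*(b - a), d - alpha*(d - c)].
    """
    a, b, c, d = tfn
    if not (a <= b <= c <= d):
        raise ValueError("expected a <= b <= c <= d")
    if not (0.0 <= alpha <= 1.0):
        raise ValueError("alpha must lie in [0, 1]")
    lower = a + alpha * (b - a)   # left end point moves from a (alpha=0) to b (alpha=1)
    upper = d - alpha * (d - c)   # right end point moves from d (alpha=0) to c (alpha=1)
    return lower, upper
```

At α = 0 the cut is the support [a, d], and at α = 1 it is the core [b, c].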

Latin Square Design (LSD)
A Latin square is an arrangement of letters (varieties) in a square in such a way that each letter occurs once and only once in each row and each column. A Latin square of order n is an arrangement of n symbols in an $n \times n$ square such that each symbol occurs once and only once in each row and column. There are 'n' rows, 'n' columns and 'n' varieties, every symbol appearing 'n' times in a Latin square. In other words, we consider an agricultural experiment in which $n^2$ plots are taken and arranged in the form of an $n \times n$ square such that the plots in each row are as homogeneous as possible with respect to one factor of classification, say soil fertility, and the plots in each column are as homogeneous as possible with respect to another factor of classification, say seed quality. Then 'n' treatments are given to these plots such that each treatment occurs only once in each row and only once in each column. The various possible arrangements obtained in this manner are known as Latin squares of order 'n'. This design of experiment is called the Latin Square Design (LSD).
Let the $n^2$ variate values $x_{ij}$, representing the yield of paddy, be classified according to three factors. Let the rows, columns and letters stand for the three factors, say soil fertility, seed quality and treatment respectively. We wish to test the null hypothesis that the rows, columns and letters are homogeneous, viz., there is no difference in the yield of paddy between the rows (due to soil fertility), between the columns (due to seed quality) and between the letters (due to treatments). Let $x_{ij}$ be the variate value corresponding to the i-th row, j-th column and k-th letter. Let $Q$, $Q_1$, $Q_2$, $Q_3$ and $Q_4$ denote the total, row, column, letter (treatment) and residual sums of squares respectively. Then $Q_1/(n-1)$, $Q_2/(n-1)$, $Q_3/(n-1)$, $Q_4/[(n-1)(n-2)]$ and $Q/(n^2-1)$ are unbiased estimates of the population variance $\sigma^2$ with degrees of freedom $(n-1)$, $(n-1)$, $(n-1)$, $(n-1)(n-2)$ and $(n^2-1)$ respectively. If the sampled population is assumed to be normal, all these estimates are independent.
Therefore, each of $\dfrac{Q_1/(n-1)}{Q_4/[(n-1)(n-2)]}$, $\dfrac{Q_2/(n-1)}{Q_4/[(n-1)(n-2)]}$ and $\dfrac{Q_3/(n-1)}{Q_4/[(n-1)(n-2)]}$ follows an F-distribution with $((n-1), (n-1)(n-2))$ degrees of freedom. Then the F-tests are applied and the significance of the difference between rows, columns and treatments is analysed. The descriptions of $Q$, $Q_1$, $Q_2$, $Q_3$ and $Q_4$ are given below.
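The F-ratios described above can be sketched in code. This is a minimal illustration that assumes the standard textbook computing forms for the sums of squares (correction factor $T^2/n^2$, with row, column and letter totals), since the paper's own expressions are not reproduced here. The names `lsd_anova` and `decide` are hypothetical, and the tabulated F value is supplied by the caller.

```python
def lsd_anova(x, letters):
    """Sums of squares and F-ratios for an n x n Latin square (n >= 3).

    x       : n x n list of crisp observations
    letters : n x n list of treatment labels, each label once per row and column
    Uses the standard computing forms (an assumption, not quoted from the paper).
    """
    n = len(x)
    total = sum(sum(row) for row in x)
    cf = total ** 2 / (n * n)                       # correction factor T^2 / n^2

    row_tot = [sum(row) for row in x]
    col_tot = [sum(x[i][j] for i in range(n)) for j in range(n)]
    trt_tot = {}
    for i in range(n):
        for j in range(n):
            trt_tot[letters[i][j]] = trt_tot.get(letters[i][j], 0.0) + x[i][j]

    q = sum(v * v for row in x for v in row) - cf   # total, n^2 - 1 d.f.
    q1 = sum(t * t for t in row_tot) / n - cf       # rows, n - 1 d.f.
    q2 = sum(t * t for t in col_tot) / n - cf       # columns, n - 1 d.f.
    q3 = sum(t * t for t in trt_tot.values()) / n - cf  # letters, n - 1 d.f.
    q4 = q - q1 - q2 - q3                           # residual, (n-1)(n-2) d.f.

    ms_err = q4 / ((n - 1) * (n - 2))
    f = {
        "rows": (q1 / (n - 1)) / ms_err,
        "columns": (q2 / (n - 1)) / ms_err,
        "treatments": (q3 / (n - 1)) / ms_err,
    }
    return {"Q": q, "Q1": q1, "Q2": q2, "Q3": q3, "Q4": q4, "F": f}


def decide(f_value, f_table):
    """Null hypothesis is accepted when the calculated F is below the table F."""
    return "accepted" if f_value < f_table else "rejected"
```

For a 3 × 3 square the table value at the 5% level with (2, 2) degrees of freedom is 19.00, as used in the examples of this paper.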

The ANOVA table for three factors of classification
A Latin square is useful when one wishes to remove from an analysis of data the effect of a factor which is not of interest, but which is known to be significant. Latin square designs are used in industrial, laboratory, field, greenhouse, educational, medical, marketing and sociological experimentation in addition to agricultural problems. Some advantages of the LSD over other designs are: (i) it controls more of the variation than the completely randomized block design [16] with a two-way stratification; (ii) the analysis is simple; (iii) even with missing data, the analysis remains relatively simple. The assumption made in the LSD model is that the interactions between treatments, row and column groupings are non-existent.

Three-factor ANOVA test with tfns. using alpha cut interval method
The fuzzy test of hypotheses of the three-factor ANOVA model, where the sample data are trapezoidal fuzzy numbers, is given here. Using the α-cut relation, we transform the fuzzy ANOVA model into an interval ANOVA model. Using the upper limit of the fuzzy interval, we construct the upper level crisp ANOVA model, and using the lower limit of the fuzzy interval, we construct the lower level crisp ANOVA model. Thus, in this approach, two crisp ANOVA models are designated in terms of upper and lower levels. Finally, we analyse the lower and upper level models using the crisp three-factor ANOVA technique. For each level model obtained from the α-cut intervals of the tfns., if $F < F_t$ at the 'r' level of significance with $((n-1), (n-1)(n-2))$ degrees of freedom, then the null hypothesis is accepted for that model; otherwise it is rejected.
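The construction of the two crisp models can be sketched as follows. This is a minimal illustration under two stated assumptions: the α-cut of a normal tfn. (a, b, c, d) is taken as $[a + \alpha(b-a),\, d - \alpha(d-c)]$, and a single value of α is fixed, whereas the paper carries α through the calculation symbolically. The combined rule follows the convention used later in the paper, where a rejection in either level model leads to rejection. The function names are hypothetical.

```python
def level_models(tfn_table, alpha):
    """Split a table of tfns. (a, b, c, d) into lower and upper level crisp tables."""
    lower = [[a + alpha * (b - a) for (a, b, c, d) in row] for row in tfn_table]
    upper = [[d - alpha * (d - c) for (a, b, c, d) in row] for row in tfn_table]
    return lower, upper


def combined_decision(f_lower, f_upper, f_table):
    """Accept the fuzzy null hypothesis only if both level models accept it."""
    accept_lower = f_lower < f_table
    accept_upper = f_upper < f_table
    return "accepted" if (accept_lower and accept_upper) else "rejected"
```

Each crisp table returned by `level_models` can then be analysed with an ordinary Latin square ANOVA, and the two resulting F values passed to `combined_decision`.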

Example-1
The following observed data are the yields (in kg) of paddy, where $\tilde{A}_i$, i = 1, 2, 3, 4, denote the different methods of cultivation. Due to some imprecise observations, the data recorded are in terms of trapezoidal fuzzy numbers. We examine whether the different methods of cultivation have given significantly different yields.

Example-2
The effectiveness of three teaching methods $A_1$, $A_2$ and $A_3$ is to be compared from the achievement scores given below, tabulated age-wise and aptitude-wise. The collected data are in terms of trapezoidal fuzzy numbers due to some vague observations. We perform the analysis of variance, taking $A_1$, $A_2$ and $A_3$ into account, to test whether there is a significant difference among the 3 teaching methods.
Three-way ANOVA test using alpha cut interval method
Example 5.1. Let us consider example-1; the interval form of the given tfns. using the α-cut method is given below. The upper level model and lower level model [16,17] can be constructed using the descriptions (3.1). Here we note only the calculated three-way ANOVA results, omitting repeated tables and surplus explanations.
For lower level model:
Between rows: The null hypothesis $H_0^L$ is accepted at 5% level of significance.
⇒ The difference between rows is not significant.
Between columns: ⇒ The difference between columns is not significant.
Bulletin of Mathematical Sciences and Applications Vol. 14
Between treatments: The null hypothesis $H_0^L$ is rejected at 5% level of significance. ⇒ The difference between treatments is significant. ⇒ There is a significant difference among the methods of cultivation.
For upper level model:
Between rows: The null hypothesis $H_0^U$ is accepted at 5% level of significance.
⇒ The difference between rows is not significant.
Between columns: ⇒ The difference between columns is not significant.
Between treatments: The null hypothesis $H_0^U$ is rejected at 5% level of significance. ⇒ The difference between treatments is significant. ⇒ There is a significant difference among the methods of cultivation.
Hence, from the decisions obtained from both lower and upper level models, we conclude that there is a significant difference among the methods of cultivation.
Between rows: $Q_1 = (2\alpha^2 + 6\alpha + 42)/9$, $(n-1)(n-2) = 2$.
Here, the decisions obtained through the lower and upper level models do not agree. In the lower level model, between treatments, the null hypothesis is rejected, and in the upper level model, between treatments, the null hypothesis is accepted. Hence the null hypothesis is rejected between treatments.
Now, the CRD, RBD and LSD are independent of the origin, which implies that the same arithmetic operation (addition or subtraction of a constant, or multiplication or division by a non-zero constant) can be applied uniformly to all entries of the observed data in order to simplify the calculations when the observed data are numerically large. This indicates that the ANOVA test depends only on the relative magnitudes of the sample observations. Another idea in this paper is that, when the test is conducted using natural and vague observations such as fuzzy numbers, we may assign ranking grades to all observed fuzzy numbers by a single method without damaging the magnitude ratios among the fuzzy samples. In fact, the ranking grades of all fuzzy numbers obtained by a fuzzy analytic method are crisp in nature, so we perform the LSD test as usual and better decisions can be obtained.
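The invariance of the F-ratio under a uniform change of origin and scale can be checked numerically. The sketch below is only an illustration: it assumes the standard computing forms for the Latin square sums of squares, and the helper name `treatment_f`, the data and the constants 0.01 and 2.0 are made up for the demonstration.

```python
def treatment_f(x, letters):
    """F-ratio for treatments in an n x n Latin square (standard computing forms)."""
    n = len(x)
    cf = sum(map(sum, x)) ** 2 / (n * n)            # correction factor T^2 / n^2

    def ss(totals):
        # sum of squares from a collection of n group totals
        return sum(t * t for t in totals) / n - cf

    rows = [sum(r) for r in x]
    cols = [sum(c) for c in zip(*x)]
    trt = {}
    for r_vals, r_lets in zip(x, letters):
        for v, l in zip(r_vals, r_lets):
            trt[l] = trt.get(l, 0.0) + v

    q = sum(v * v for r in x for v in r) - cf       # total sum of squares
    q3 = ss(trt.values())                           # treatment sum of squares
    q4 = q - ss(rows) - ss(cols) - q3               # residual sum of squares
    return (q3 / (n - 1)) / (q4 / ((n - 1) * (n - 2)))


# Hypothetical data and Latin square layout, for illustration only.
x = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
letters = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"]]

# Same data after a uniform change of scale (0.01) and origin (+2.0).
y = [[0.01 * v + 2.0 for v in row] for row in x]

f_original = treatment_f(x, letters)
f_transformed = treatment_f(y, letters)
```

Both calls give the same F value up to rounding, because every sum of squares is multiplied by the same factor and the shift cancels in the deviations.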

Wang's centroid point and ranking method
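Wang et al. [25] found that the centroid formulae proposed by Cheng are incorrect and have led to some misapplications, such as by Chu and Tsao. They presented the correct centroid formulae for a generalized fuzzy number $\tilde{A} = (a, b, c, d; w)$ as
$$x_0 = \frac{1}{3}\left[a + b + c + d - \frac{dc - ab}{(d + c) - (a + b)}\right], \qquad y_0 = \frac{w}{3}\left[1 + \frac{c - b}{(d + c) - (a + b)}\right] \qquad (6.1)$$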

And the ranking function associated with $\tilde{A}$ is
$$R(\tilde{A}) = \sqrt{x_0^2 + y_0^2} \qquad (6.2)$$
For a normalized tfn, we put w = 1 in equations (6.1), so we have
$$x_0 = \frac{1}{3}\left[a + b + c + d - \frac{dc - ab}{(d + c) - (a + b)}\right], \qquad y_0 = \frac{1}{3}\left[1 + \frac{c - b}{(d + c) - (a + b)}\right] \qquad (6.3)$$
And the ranking function associated with $\tilde{A}$ is
$$R(\tilde{A}) = \sqrt{x_0^2 + y_0^2} \qquad (6.4)$$
Example 6.1. Let us consider example 1. Using the above relations (6.3) and (6.4), we obtain the ranks of the tfns., which are tabulated below. From the ANOVA table, $F_{Treat.} > F_{t(5\%)}$. ⇒ The null hypothesis $\tilde{H}_0$ is rejected at 5% level of significance. ⇒ The difference between treatments is significant. ⇒ There is a significant difference among the methods of cultivation.
Example 6.2. Let us consider example 2. Using the above relations (6.3) and (6.4), we obtain the ranks of the tfns., which are tabulated below. From the ANOVA table, the null hypothesis $\tilde{H}_0$ is accepted at 5% level of significance. ⇒ The difference between treatments is not significant. ⇒ The difference among the three teaching methods is not significant.
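The centroid-based ranking of this section can be sketched in code. Two assumptions are made here: the centroid is computed with the commonly cited form of Wang et al.'s formulae for a generalized trapezoidal fuzzy number, and the rank is taken as the Euclidean distance of the centroid from the origin. The function names are hypothetical.

```python
import math


def wang_centroid(a, b, c, d, w=1.0):
    """Centroid (x0, y0) of a generalized tfn. (a, b, c, d; w), Wang et al. form (assumed)."""
    denom = (d + c) - (a + b)
    if denom == 0:
        # a = b = c = d: the number is crisp and the formula is undefined
        raise ValueError("degenerate fuzzy number: (d + c) - (a + b) is zero")
    x0 = (a + b + c + d - (d * c - a * b) / denom) / 3.0
    y0 = (w / 3.0) * (1.0 + (c - b) / denom)
    return x0, y0


def wang_rank(a, b, c, d, w=1.0):
    """Ranking grade taken as the distance of the centroid from the origin (assumed)."""
    x0, y0 = wang_centroid(a, b, c, d, w)
    return math.hypot(x0, y0)
```

For the symmetric tfn. (1, 2, 3, 4; 1) the centroid is (2.5, 5/12), which agrees with the geometric centroid of a trapezoid with bases 3 and 1 and height 1. The crisp ranks obtained this way replace the fuzzy entries of the Latin square before the usual ANOVA is run.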

Rezvani's ranking function of tfns.
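The ranking function of a generalized trapezoidal fuzzy number $\tilde{A} = (a, b, c, d; w)$ proposed by Rezvani [20] is taken as the Euclidean distance from the incenter of the centroids. For a normalized tfn, we put w = 1 in equations (1), (2) and (3).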

Three-way ANOVA test using Rezvani's ranking function
We now analyse the three-way ANOVA test by assigning a rank to each normalized trapezoidal fuzzy number, and the decisions are based on the ranking grades.
Example 7.1. Let us consider example 1. Using the above relations (7.4), (7.5) and (7.6), we get the ranks of each tfn. $\tilde{A}_i$, on which the ANOVA test is performed.
For example 2, $F_{t(5\%)}(2,2) = 19.00$ and $F_{Treat.} < F_{t(5\%)}$. ⇒ The null hypothesis $\tilde{H}_0$ is accepted at 5% level of significance. ⇒ The difference between treatments is not significant. ⇒ The difference among the three teaching methods is not significant.

Graded mean integration representation (GMIR)
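Let $\tilde{A} = (a, b, c, d; w)$ be a generalized trapezoidal fuzzy number; then the GMIR [19] of $\tilde{A}$ with shape parameter n is
$$P(\tilde{A}) = \frac{a + d}{2} + \frac{n}{2n + 1}\,(b - a - d + c),$$
which reduces to $(a + 2b + 2c + d)/6$ for n = 1.
Proof: For a trapezoidal fuzzy number $\tilde{A} = (a, b, c, d; 1)_n$, we have $L^{-1}(h) = a + (b - a)h^{1/n}$ and $R^{-1}(h) = d - (d - c)h^{1/n}$, so that
$$P(\tilde{A}) = \frac{\int_0^1 h\,\dfrac{L^{-1}(h) + R^{-1}(h)}{2}\,dh}{\int_0^1 h\,dh} = \frac{a + d}{2} + \frac{n}{2n + 1}\,(b - a - d + c).$$
For example 2, the results are as follows.
Between rows: The null hypothesis $\tilde{H}_0$ is accepted at 5% level of significance. ⇒ The difference between rows is not significant.
Between columns: $F_{Col.} = 5.6436$, $F_{t(5\%)}(2,2) = 19.00$ and $F_{Col.} < F_{t(5\%)}$. ⇒ The null hypothesis $\tilde{H}_0$ is accepted at 5% level of significance. ⇒ The difference between columns is not significant.
Between treatments: $F_{Treat.} = 17.0153$, $F_{t(5\%)}(2,2) = 19.00$ and $F_{Treat.} < F_{t(5\%)}$. ⇒ The null hypothesis $\tilde{H}_0$ is accepted at 5% level of significance. ⇒ The difference between treatments is not significant. ⇒ The difference among the three teaching methods is not significant.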
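The GMIR ranking grade can be sketched as below. It assumes the classical form $(a + 2b + 2c + d)/6$ for a normalized tfn. with linear sides, written through an optional shape parameter n (an assumption following the shape-function form in the literature). The function name `gmir` is hypothetical.

```python
def gmir(a, b, c, d, n=1):
    """Graded mean integration representation of a normalized tfn. (a, b, c, d).

    For n = 1 this equals (a + 2b + 2c + d) / 6; n is an assumed shape parameter.
    """
    return (a + d) / 2.0 + (n / (2.0 * n + 1.0)) * (b - a - d + c)
```

Each fuzzy entry of the Latin square is replaced by its GMIR value, after which the crisp three-way ANOVA is applied.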

 ( )
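Proposition 9.1. If $\tilde{A} = (a, b, c, d; w)$ is a generalized trapezoidal fuzzy number and 'k' is a scalar, then (a) for $k \ge 0$, $\tilde{y} = k\tilde{A}$ is a fuzzy number with $(ka, kb, kc, kd; w)$, and (b) for $k < 0$, $\tilde{y} = k\tilde{A}$ is a fuzzy number with $(kd, kc, kb, ka; w)$.
Proof: (a) When $k \ge 0$, with the transformation $y = kA$ we can find the membership function of the fuzzy set $\tilde{y} = k\tilde{A}$ by the α-cut method. The α-cut interval of $\tilde{A}$ is $\left[a + \frac{\alpha}{w}(b - a),\; d - \frac{\alpha}{w}(d - c)\right]$, so that
$$y = ka + \frac{\alpha}{w}(kb - ka) \quad (1) \qquad \text{and} \qquad y = kd - \frac{\alpha}{w}(kd - kc) \quad (2)$$
From (1) and (2), we have the membership function of $\tilde{y} = k\tilde{A}$ as follows:
$$\mu_{\tilde{y}}(y) = \begin{cases} w\,\dfrac{y - ka}{kb - ka}, & ka \le y \le kb; \\ w, & kb \le y \le kc; \\ w\,\dfrac{y - kd}{kc - kd}, & kc \le y \le kd; \\ 0, & \text{otherwise.} \end{cases} \qquad (3)$$
Similarly, we can prove (b): if $y = kA$, $k < 0$, then $\tilde{y} = (kd, kc, kb, ka; w)$ is a fuzzy number with membership function
$$\mu_{\tilde{y}}(y) = \begin{cases} w\,\dfrac{y - kd}{kc - kd}, & kd \le y \le kc; \\ w, & kc \le y \le kb; \\ w\,\dfrac{y - ka}{kb - ka}, & kb \le y \le ka; \\ 0, & \text{otherwise.} \end{cases} \qquad (4)$$
And for a normalized trapezoidal number, we put w = 1 in equations (3) and (4).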
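The scalar multiplication rule and the resulting membership function can be illustrated with a small sketch. The tuple layout `(a, b, c, d, w)` and the function names are illustrative assumptions. For k < 0 the end points are reversed so that the result is again ordered.

```python
def scale_tfn(tfn, k):
    """Multiply a generalized tfn. (a, b, c, d, w) by a scalar k."""
    a, b, c, d, w = tfn
    if k >= 0:
        return (k * a, k * b, k * c, k * d, w)
    return (k * d, k * c, k * b, k * a, w)   # order reverses for negative k


def membership(tfn, x):
    """Membership grade of x in a generalized tfn. (a, b, c, d, w)."""
    a, b, c, d, w = tfn
    if b <= x <= c:
        return w                               # flat top
    if a <= x < b:
        return w * (x - a) / (b - a)           # rising left side
    if c < x <= d:
        return w * (x - d) / (c - d)           # falling right side
    return 0.0
```

Multiplying every observation by the same scalar, such as 0.01, keeps the trapezoidal shape and only rescales the four defining points.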

Calculation of membership function of tfns.
The membership grades for a normalized tfn. $\tilde{A} = (a, b, c, d; 1)$ are calculated from its membership function. We transform the tfns. in example-2 by multiplying each member by 0.01 using proposition-9.1, and the membership grades are tabulated below. From the ANOVA table, $F_{t(5\%)}(2,2) = 19.00$ and $F_{Treat.} < F_{t(5\%)}$. ⇒ The null hypothesis $\tilde{H}_0$ is accepted at 5% level of significance. ⇒ The difference between treatments is not significant. ⇒ The difference among the three teaching methods is not significant.

Liou and Wang's centroid point method
Liou and Wang [14] ranked fuzzy numbers with total integral value. For a fuzzy number defined by definition (2.3), the total integral value is defined as
$$I_T^{\alpha}(\tilde{A}) = \alpha\, I_R(\tilde{A}) + (1 - \alpha)\, I_L(\tilde{A}),$$
where $I_R(\tilde{A})$ and $I_L(\tilde{A})$ are the right and left integral values of $\tilde{A}$ respectively.
(i) $0 \le \alpha \le 1$ is the index of optimism, which represents the degree of optimism of a decision maker.
(ii) If $\alpha = 0$, then the total integral value represents a pessimistic decision maker's view point and is equal to the left integral value.
(iii) If $\alpha = 1$, then the total integral value represents an optimistic decision maker's view point and is equal to the right integral value.
(iv) If $\alpha = 0.5$, then the total integral value represents a moderate decision maker's view point and is equal to the mean of the right and left integral values.
For a decision maker, the larger the value of α, the higher the degree of optimism.
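The total integral value can be sketched for a normalized tfn. The closed forms used below, $I_L = (a + b)/2$ and $I_R = (c + d)/2$, are an assumption that holds for linear left and right sides. The function name is hypothetical.

```python
def total_integral_value(a, b, c, d, alpha=0.5):
    """Liou and Wang's total integral value for a normalized tfn. (a, b, c, d).

    alpha is the index of optimism in [0, 1]; linear sides are assumed.
    """
    if not (0.0 <= alpha <= 1.0):
        raise ValueError("alpha must lie in [0, 1]")
    left = (a + b) / 2.0    # left integral value (pessimistic view)
    right = (c + d) / 2.0   # right integral value (optimistic view)
    return alpha * right + (1.0 - alpha) * left
```

With α = 0, 1 and 0.5 the function returns the left value, the right value and their mean respectively, matching cases (ii) to (iv).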

The null hypothesis $\tilde{H}_0$ is accepted at 5% level of significance. ⇒ The difference between treatments is not significant. ⇒ The difference among the three teaching methods is not significant.

Thorani's ranking method
As per the description in Salim Rezvani's ranking method, Thorani et al. [21] presented a different kind of centroid point and ranking function of tfns., based on the incenter of the centroids. For a normalized tfn., we put w = 1 in equations (1) and (2).
Example 11.1. Let us consider example 1. Between treatments, the null hypothesis $\tilde{H}_0$ is rejected at 5% level of significance. ⇒ The difference between treatments is significant. ⇒ The difference between the methods of cultivation is significant.
Example 11.2. Let us consider example 2. Using the above relations (11.3) and (11.4), we get the ranks of each tfn. $\tilde{A}_i$, which are tabulated below. From the ANOVA table, $F_{t(5\%)}(2,2) = 19.00$ and $F_{Treat.} < F_{t(5\%)}$. ⇒ The null hypothesis $\tilde{H}_0$ is accepted at 5% level of significance. ⇒ The difference between treatments is not significant. ⇒ The difference among the three teaching methods is not significant.

Conclusion
The decisions obtained from various methods are tabulated below for the null hypothesis.
Observing the decisions obtained from α-cut interval method, for example-1, the difference between rows and columns is not significant and there is a significant difference between treatments. For example-2, the difference between rows and columns is not significant and there is a significant difference between treatments. Moreover, the membership function and Liou & Wang's method (L&W) do not provide reliable results as they accept the null hypotheses for all cases. Also, decisions from ranking grades of Wang, Rezvani, Thorani and GMIR provide parallel discussions.