## 25th Anniversary Events

Publication Date: 10-05-2019

The program of the events to be held to mark the 25th anniversary of our department's first year of teaching, with the participation of many distinguished scientists from Turkey and abroad, is available here.

Event photos are available by clicking the “25th Anniversary Events” heading.

Seminar abstracts:

+ Prof. Dr. Olcay ARSLAN, Robust parameter estimation and variable selection in linear regression models
  28.05.2019, 09:30, Ömer KÖSE Hall

  ABSTRACT: Two challenging problems in regression analysis are parameter estimation and variable selection. For parameter estimation, the least squares (LS) method is usually used. However, it is well known that LS regression estimators are highly sensitive to outliers. Instead of the classical LS estimation method, robust parameter estimation methods are often used to obtain regression estimators that are resistant to outliers in the data. For variable selection, a large number of explanatory variables are usually introduced at the initial stage of the regression model to attenuate possible modeling biases. However, including unnecessary predictors can degrade the efficiency of the resulting estimation procedure and yield less accurate predictions. On the other hand, omitting an important explanatory variable may produce biased parameter estimates and prediction results. Since selecting the significant explanatory variables is therefore an important task, a number of methods, including information criteria (AIC, BIC, ICOMP), have been proposed and are well established. As an alternative to the existing variable (model) selection methods, which are essentially carried out after parameter estimation, the recent trend is to combine variable selection with parameter estimation. One of the first such combinations is the LASSO regression method, which carries out variable selection and parameter estimation simultaneously. After its original introduction, many researchers proposed modifications of the LASSO method. Since the LASSO method is based on LS estimation, it is often criticized for not being robust against outliers. Therefore, robust LASSO methods have also been proposed to conduct robust parameter estimation and variable selection simultaneously. As an alternative to the robust LASSO methods, the purpose of this study is to combine a high breakdown point regression estimation method with an appropriate variable selection method to obtain regression estimators that are resistant to outliers and leverage points. To achieve this, we carry out penalized regression estimation using the MM robust regression objective function and the bridge penalty function. The MM regression method produces estimators that are resistant to outliers in any direction, and the bridge penalty performs variable selection. We study asymptotic properties of the proposed estimator, suggest a computation algorithm, and explore the finite sample behavior with a simulation study and a real data example.

+ Prof. Dr. Hamparsum BOZDOGAN, A new class of information-theoretic measure of complexity for model selection in high-dimensions
  28.05.2019, 10:30, Ömer KÖSE Hall

  ABSTRACT: In this presentation, we introduce a new class of entropic or information-theoretic measures of complexity called ICOMP (for “information complexity”), developed by this author, as a decision rule for model selection in high dimensions. The development and construction of ICOMP is based on a generalization of the covariance complexity index introduced by van Emden (1971). ICOMP measures the fit between an entertained linear or nonlinear model and observed data in high dimensions. In high-dimensional statistical modelling and model evaluation problems, the concept of model complexity plays an important role. At the philosophical level, complexity involves notions such as connectivity patterns and the interactions of model components. Without a measure of overall model complexity, predicting model behaviour and assessing model quality are difficult.
  This requires detailed statistical analysis and computation to choose the best fitting model among a portfolio of competing models for a given finite sample. In this talk, we first start with the information complexity of the estimated inverse-Fisher information matrix (IFIM), which is also known as the celebrated Cramér-Rao lower bound (CRLB) matrix. This directly establishes a link between the Fisher information and information-theoretic quantities for the first time in model selection. Such an approach provides an achievable accuracy of the parameter estimates by considering the entire parameter space of the model. Later, we introduce ICOMP as a Bayesian criterion in maximizing a posterior expected utility (PEU). To protect the researcher against model misspecification, we generalize ICOMP to the case of a misspecified model and make ICOMP both robust and misspecification resistant. To make ICOMP coordinate invariant, we provide ICOMP based on the complexity of the total Kullback-Leibler (tKL) distance. In addition, we construct and introduce the consistent ICOMP as well as ICOMP as a fully Bayesian criterion. Our approach generalizes the classic Akaike (1973) Information Criterion (AIC) as well as the Schwarz (1978) Bayesian Criterion (SBC) for model selection in high dimensions. The practical utility and importance of ICOMP are illustrated with real numerical examples in high dimensions that elucidate the current inferential problems often found in classical statistical modeling.

  KEYWORDS: A New Class of ICOMP Criteria; Entropic Complexity Measures; Estimated Inverse-Fisher Information Matrix (IFIM); Total Kullback-Leibler (tKL) Distance; Posterior Expected Utility (PEU); Consistent ICOMP; Fully Bayesian ICOMP; AIC; SBC; Model Selection.

+ Prof. Dr. Michael GREENACRE, Effect sizes and their significances in canonical correspondence analysis
  28.05.2019, 14:00, Ömer KÖSE Hall

  ABSTRACT: Canonical correspondence analysis (CCA) is a popular method in ecology for relating a multivariate set of responses, usually species abundances, to a multivariate set of explanatory variables, usually environmental variables. The results are usually reported graphically in the form of a triplot showing the samples, the response variables and the explanatory variables, with an interpretation that is descriptive rather than a specific quantification of the effect of the explanatory variables on the responses. Permutation tests are used to test whether the explanatory variables explain a significant percentage of the variance of the responses, and also whether the dimensions of the triplot explain significant variance. In this talk I show how effect sizes can be derived and tested in a CCA, giving more specific results and interpretation.

+ Prof. Dr. Mehmet ŞAHİNOĞLU, Selecting Type-I and Type-II Error Probabilities in Hypothesis Testing with Game Theory
  28.05.2019, 15:00, Ömer KÖSE Hall

+ Prof. Dr. İsmihan BAYRAMOĞLU, On the joint distribution of a random sample and an order statistic: Applications in Reliability Analysis
  29.05.2019, 09:30, Ömer KÖSE Hall

  ABSTRACT: We consider the joint distribution of the elements of a random sample and an order statistic of the same sample. The theoretical results are applied to solve an important problem in reliability analysis. In particular, we are interested in estimating the number of inspections needed to detect failed components in a coherent system.

+ Prof. Dr. Michael GREENACRE, A practical evaluation of the isometric logratio transformation in compositional data analysis
  29.05.2019, 10:30, Ömer KÖSE Hall

  ABSTRACT: The isometric logratio (ILR) transformation, which is a logratio of geometric means, has been promoted by several authors as the correct way, from a theoretical viewpoint, to contrast groups of compositional parts and form a set of new coordinates for analysing a compositional data set. However, the interpretation of ILRs is complicated by the fact that each geometric mean depends on the relative values of all the parts included in it. In this talk I will show that the ILR transformation, with its elegant mathematical properties, is not a prerequisite for performing compositional data analysis in practice. The simpler approach using pairwise logratios can explain the structure of a compositional data set, with a simpler and more parsimonious interpretation. Isometric logratios that contrast parts can also be replaced by logratios of amalgamated parts, based on substantive knowledge; these are logratios of sums of parts, which are more intuitive than geometric means.

+ Prof. Dr. Hamparsum BOZDOGAN, A novel computer-aided detection of breast cancer: stalking the serial killer
  29.05.2019, 14:00, Ömer KÖSE Hall

  ABSTRACT: Breast cancer is the second-leading cause of death among women worldwide, killing half a million women every year. Radiologists still miss up to 30 percent of breast lesions in mammograms. What can statistical modeling and data mining do? In this presentation, we present novel statistical modeling and data mining techniques for computer-aided detection (CAD) of breast cancer by introducing and developing several flexible supervised and unsupervised classification methods.
  For the supervised classification method, we develop what is called Hybrid Radial Basis Function Neural Networks (HRBF-NN), and for the unsupervised classification case we develop Mixture-Model Cluster Analysis (MMCA) under different covariance structures. With both methods, our goal is to classify the signs of diseased tissue on the resulting digital radiographic images (i.e., mammograms) in order to help radiologists reach diagnostic decisions, acting as a second eye. Mammography screening programs have been adopted worldwide to look for signs of breast cancer in asymptomatic patients at an early stage, especially when the chance of survival is highest. Mixture-model cluster analysis is an unsupervised classification model that learns the actual number of clusters without knowing a priori the classification of cancerous lesions or labels. Both approaches use a model selection criterion based on the information-theoretic measure of complexity (ICOMP) index introduced by this author, which allows robust statistical inference for the classification and diagnosis of cancerous lesions. An experimental case study of both methods is presented through a detailed analysis of a real data set on two breast cancer groups (“Benign”/“Malignant”) composed of n = 1,269 Italian patients with p = 132 continuous features (or dimensions). The efficiency and robustness of our two approaches are presented and compared with results obtained using the Support Vector Machines (SVMs) used in computer-aided detection (CAD) of breast tumors. It is shown that our proposed methods provide a novel approach that radiologists can use as a second eye. The proposed approaches generalize to many other applications, not only in biomedical and health informatics, but also in a variety of business applications (e.g., detecting potential fraud and bankruptcy, performing customer profiling and market segmentation, auditing accounting practices, and detecting potential threats). Our results elucidate the current inferential problems often found in classical statistical data mining as a first step toward the specification of a robust classification model for breast cancer detection through image modeling.

  KEYWORDS: Breast Cancer Detection; Hybrid Radial Basis Function Neural Networks (HRBF-NN) Classification; Unsupervised Classification; Information Complexity.

+ Prof. Dr. Olcay ARSLAN, Robust empirical likelihood (EL) regression estimation
  29.05.2019, 15:00, Ömer KÖSE Hall

  ABSTRACT: The parameters of a linear regression model can be estimated using the empirical likelihood (EL) estimation method when no distributional assumption on the error term is appropriate. In the EL method, unknown probabilistic weights associated with each observation are defined, and the method estimates these weights. To carry out the estimation, an EL function of the unknown probabilistic weights is formed, and this function is maximized under constraints involving the parameters of interest. In the classical EL procedure, the constraints are formed using the likelihood scores of the ML estimation method under the assumption of normally distributed errors. However, constraints created under the assumption of normality are influenced by outlying observations in the data or by nonnormality of the data. Therefore, the EL estimators obtained using these constraints are influenced in the same way. In this study, our main aim is to modify the EL estimation method to obtain robust estimators for the regression parameters.
  This robust modification is done using robust constraints borrowed from the robust regression estimating equations. After these arrangements, we consider the estimation procedure to obtain the estimators for the parameters of interest. We also study asymptotic properties of these estimators, and observe that they have asymptotic properties similar to those of the robust estimators whose estimating equations we use. To illustrate the capacity of the proposed estimators, we provide simulation studies and real data examples using well-known data sets that are used as test data to assess the robustness of the proposed estimators. All of these findings show that robustification of the EL estimation method is needed if we suspect outliers in the data or are not sure of the normality of the data. This is joint work with Dr. Şenay Özdemir from Afyon Kocatepe University.

+ Prof. Dr. İsmihan BAYRAMOĞLU, On Fibonacci sequences of random variables
  30.05.2019, 10:30, Ömer KÖSE Hall

  ABSTRACT: We consider a sequence of random variables constructed on the basis of Fibonacci sequences and some initial random variables. The initial random variables are assumed to be absolutely continuous with a given joint probability density function. We study some distributional and asymptotic properties of these sequences and also deal with the prediction of future values.
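
As a rough illustration of the last abstract, the Python sketch below simulates one possible Fibonacci-type sequence of random variables. The abstract does not state the construction, so the recurrence X_k = X_{k-1} + X_{k-2}, the independent standard normal initial variables X_0 and X_1, and all function names are assumptions made only for this example. Under that assumed recurrence, every term is a fixed linear combination of the two initial variables, and the ratio of consecutive terms approaches the golden ratio almost surely.

```python
import random


def fibonacci_numbers(n):
    """Return [F_0, F_1, ..., F_n] with F_0 = 0 and F_1 = 1."""
    fib = [0, 1]
    for _ in range(2, n + 1):
        fib.append(fib[-1] + fib[-2])
    return fib[: n + 1]


def fibonacci_random_sequence(x0, x1, n):
    """Return [X_0, ..., X_n] under the ASSUMED recurrence X_k = X_{k-1} + X_{k-2}."""
    xs = [x0, x1]
    for _ in range(2, n + 1):
        xs.append(xs[-1] + xs[-2])
    return xs[: n + 1]


rng = random.Random(2019)  # fixed seed so the run is reproducible

# Assumption: independent standard normal initial variables. The abstract only
# requires an absolutely continuous joint distribution for them.
x0, x1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)

n = 30
xs = fibonacci_random_sequence(x0, x1, n)
fib = fibonacci_numbers(n)

# Under this recurrence, X_k = F_{k-1} * X_0 + F_k * X_1 for k >= 1, so the
# distribution of X_k follows from the joint distribution of (X_0, X_1).
closed_form = [fib[k - 1] * x0 + fib[k] * x1 for k in range(1, n + 1)]

golden_ratio = (1 + 5 ** 0.5) / 2
print("X_30 / X_29  =", xs[n] / xs[n - 1])
print("golden ratio =", golden_ratio)
```

Because each term depends only on (X_0, X_1), predicting future values in this simplified setting reduces to inference about the two initial variables; the talk itself may use a different or more general construction.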