Data Transformation (Log, square root, cube root, Tukey Ladder, and Box-Cox methods) in RStudio
Data Transformation

Most parametric tests require that residuals be normally distributed and homoscedastic. One approach when residuals fail to meet these conditions is to transform one or more variables so that they better follow a normal distribution. Often, only the dependent variable in a model needs to be transformed. However, in complex models and multiple regression, it is sometimes helpful to transform both dependent and independent variables that deviate greatly from a normal distribution.

There is nothing illicit in transforming variables, but you must be careful about how results from analyses with transformed variables are reported. For example, looking at the turbidity of water across three locations, you might report, "Locations showed a significant difference in log-transformed turbidity." To present means or other summary statistics, you might present the mean of the transformed values, or back-transform the means to their original units.

Some measurements in nature are naturally normally distributed. Others are naturally log-normally distributed. These include some natural pollutants in water: there may be many low values, fewer high values, and even fewer very high values.

For right-skewed data (tail on the right, positive skew), common transformations include square root, cube root, and log. For left-skewed data (tail on the left, negative skew), common transformations include square root (constant - x), cube root (constant - x), and log (constant - x).

Because log(0) is undefined, as is the log of any negative number, a constant should be added to all values to make them positive before a log transformation. It is also sometimes helpful to add a constant when using other transformations.

Another approach is to use a general power transformation, such as Tukey's Ladder of Powers or a Box-Cox transformation.
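The points about adding a constant before a log transformation, reflecting left-skewed data, and back-transforming a mean can be sketched in base R as follows. This is a minimal sketch, not part of the original script: the vectors x and y are made-up example values, and the choices const = 1 and k = max(y) + 1 are assumptions made only for illustration.

```r
# Hypothetical right-skewed values that include a zero (not from the original script)
x <- c(0, 1, 3, 4, 5, 6, 100, 233, 1000)

# log(0) is -Inf, so shift all values first; the constant 1 is an assumption
const <- 1
x_log <- log(x + const)

# Mean on the log scale, then back-transformed to the original units.
# The result is a shifted geometric mean, not the arithmetic mean of x.
mean_log  <- mean(x_log)
mean_back <- exp(mean_log) - const

# Hypothetical left-skewed values: reflect about a constant above the maximum,
# so that (k - y) is strictly positive, then apply the usual transformations
y <- c(1, 60, 85, 90, 95, 96, 97, 99, 100)
k <- max(y) + 1
y_sqrt <- sqrt(k - y)
y_log  <- log(k - y)

print(c(mean_original = mean(x), mean_back_transformed = mean_back))
```

Note that reflecting the data reverses the order of the values (the largest original value becomes the smallest transformed value), which matters when interpreting the direction of an effect.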
Tukey's Ladder of Powers and Box-Cox each determine a lambda value, which is used as the power coefficient to transform values: X.new = X ^ lambda for Tukey, and X.new = (X ^ lambda - 1) / lambda for Box-Cox. The function transformTukey in the rcompanion package finds the lambda that makes a single vector of values (that is, one variable) as normally distributed as possible with a simple power transformation. The Box-Cox procedure is available in the MASS package through the function boxcox. It uses a log-likelihood procedure to find the lambda with which to transform the dependent variable of a linear model (such as an ANOVA or linear regression). It can also be used on a single vector.

Packages used in this tutorial

The packages used in this tutorial include:
• MASS
• rcompanion
• psych

The following commands will install these packages if they are not already installed:

if(!require(MASS)){install.packages("MASS")}
if(!require(rcompanion)){install.packages("rcompanion")}
if(!require(psych)){install.packages("psych")}

The script for this tutorial:

# Data transformation
data = c(1,3,4,5,6,100,233,1000,1500,2000,10000,45000,9000,12000,20000)
library(rcompanion)
plotNormalHistogram(data)
qqnorm(data)
qqline(data, col="blue")

# Square root transformation
data_sqrt = sqrt(data)
library(psych)
skew(data)
plotNormalHistogram(data_sqrt)

# Cube root transformation
data_cub = sign(data) * abs(data)^(1/3)
plotNormalHistogram(data_cub)

# Log transformation
data_log = log(data)
plotNormalHistogram(data_log)

# Tukey's Ladder of Powers transformation
data_tuk = transformTukey(data, plotit=TRUE)
plotNormalHistogram(data_tuk)

# Box-Cox transformation
library(MASS)
Box = boxcox(data ~ 1, lambda = seq(-2, 2, 0.1))
# Create a data frame with the results
Cox = data.frame(Box$x, Box$y)
# Order the new data frame by decreasing y
Cox2 = Cox[with(Cox, order(-Cox$Box.y)),]
# Display the lambda with the greatest log likelihood
Cox2[1,]
# Extract that lambda
lambda = Cox2[1,"Box.x"]
# Transform the original data
data_box = (data^lambda - 1)/lambda
plotNormalHistogram(data_box)
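One detail the formulas above leave open is lambda = 0, where (X ^ lambda - 1) / lambda becomes 0 / 0 and X ^ lambda is constant. The usual convention is to use log(X) in that case. The sketch below uses base R only. The helper names box_cox_transform and tukey_transform are hypothetical and are not functions from MASS or rcompanion. Negating the result for negative lambda is, as far as I know, how transformTukey keeps the order of values, but treat that as an assumption and check the package documentation.

```r
# Hypothetical helper (not part of MASS): Box-Cox transform with the lambda = 0 case
box_cox_transform <- function(x, lambda, tol = 1e-8) {
  stopifnot(all(x > 0))                 # Box-Cox needs strictly positive values
  if (abs(lambda) < tol) log(x) else (x^lambda - 1) / lambda
}

# Hypothetical helper: simple power transform in the style of Tukey's ladder
tukey_transform <- function(x, lambda, tol = 1e-8) {
  stopifnot(all(x > 0))
  if (abs(lambda) < tol) {
    log(x)
  } else if (lambda > 0) {
    x^lambda
  } else {
    -1 * x^lambda                       # negated so larger x still gives larger output
  }
}

# Same vector as in the script above
data <- c(1,3,4,5,6,100,233,1000,1500,2000,10000,45000,9000,12000,20000)

# As lambda approaches 0, the Box-Cox power formula approaches log(x)
bc_small <- box_cox_transform(data, 1e-6)   # above tol, so the power formula is used
bc_zero  <- box_cox_transform(data, 0)      # below tol, so log(x) is used
max_gap  <- max(abs(bc_small - bc_zero))
print(max_gap)
```

With helpers like these, the last transformation step of the script could be written as box_cox_transform(data, lambda), which also covers the case where the lambda picked from the grid is zero or very close to it.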