José Ramón Cuesta
October 3, 2022
Let's see what we have in the workspace from the previous posts:
load("C:/BLOG/Workspaces/NIT Tutorial/NIT_ws3.RData")
ls()
[1] "absorp" "cor_rawspec" "cor_rawspec_fat" "cor_rawspec_moi"
[5] "cor_rawspec_prot" "endpoints" "meats" "meats_longer"
[9] "tecator"
We can remove some objects we don't need.
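The cleanup code itself is not shown in this copy of the post, so which objects were removed is unknown. As a minimal sketch, using hypothetical objects created only for the illustration, `rm()` removes objects from the workspace by name:

```r
# Hypothetical objects, created only for this illustration
tmp_a <- 1:10
tmp_b <- letters

# Remove them by name; `list` takes a character vector of object names
rm(list = c("tmp_a", "tmp_b"))

# Check that they are no longer in the workspace
exists("tmp_a")
exists("tmp_b")
```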
Now we load the libraries we will use:
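The list of packages is not shown in this copy of the post. Judging only from the functions used further down (`%>%`, `mutate()`, `pivot_longer()`, `ggplot()`), at least the following would be needed. This is an assumption, not necessarily the exact list used originally:

```r
# Assumed package list, inferred from the functions used later in the post
library(dplyr)    # %>% and mutate()
library(tidyr)    # pivot_longer()
library(ggplot2)  # ggplot() and geom_line()
```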
The idea now is to apply some math treatments to the raw spectra and check which one improves the correlation with the parameters of interest. These are the common scatter-correction algorithms I normally use:
Standard Normal Variate (SNV)
Detrend (linear or quadratic)
SNV + Detrend (linear or quadratic)
Multiplicative Scatter Correction (MSC)
Some R packages provide these math treatments under these or similar names, or we can write our own functions to apply the algorithms to the spectra matrix.
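As a minimal sketch of the second option, the treatments listed above could be written in base R roughly as follows. The function names `snv()`, `detrend()` and `msc()` and the toy data are hypothetical, each row is assumed to be one spectrum, and the detrend uses the column index as the x-axis (which gives the same residuals as the wavelengths when they are equally spaced):

```r
# SNV: center and scale every spectrum (row) by its own mean and sd
snv <- function(spectra) {
  t(scale(t(spectra), center = TRUE, scale = TRUE))
}

# Detrend: subtract a polynomial baseline fitted to every spectrum
# (degree = 1 for linear, degree = 2 for quadratic)
detrend <- function(spectra, degree = 2) {
  x <- seq_len(ncol(spectra))
  out <- t(apply(spectra, 1, function(s) residuals(lm(s ~ poly(x, degree)))))
  dimnames(out) <- dimnames(spectra)
  out
}

# MSC: regress every spectrum on a reference (here the mean spectrum)
# and remove the fitted offset and slope
msc <- function(spectra, reference = colMeans(spectra)) {
  out <- t(apply(spectra, 1, function(s) {
    fit <- coef(lm(s ~ reference))   # s is approximated by a + b * reference
    (s - fit[1]) / fit[2]
  }))
  dimnames(out) <- dimnames(spectra)
  out
}

# Toy spectra: one base curve with a different offset and slope per sample
base_curve <- 2 + sin(seq(0, pi, length.out = 20))
mult <- c(0.8, 1.0, 1.2, 1.5, 0.9)
add  <- c(0.1, 0.0, -0.1, 0.2, 0.3)
toy  <- outer(mult, base_curve) + add

toy_snv     <- snv(toy)
toy_dt      <- detrend(toy, degree = 1)
toy_snv_dt  <- detrend(snv(toy), degree = 2)   # SNV + Detrend
toy_msc     <- msc(toy)
```

Because the toy samples differ only by an offset and a slope, both `snv()` and `msc()` collapse them onto a single curve, which is the effect we hope to approximate on real spectra.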
Let's start with SNV, which centers every spectrum (subtracting its mean) and scales it (dividing by its standard deviation):
# scale() works column-wise, so we transpose the matrix to put each spectrum in a column
absorp_snv <- scale(t(absorp), center = TRUE, scale = TRUE)
# Transpose back so that each row is a spectrum again
absorp_snv <- t(absorp_snv)
matplot(as.numeric(colnames(absorp_snv)), t(absorp_snv), type = "l", xlab = "Wavelength (nm)", ylab = "Absorbance", main = "SNV Meat Spectra")
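Written as a formula, the SNV step above does the following for spectrum i at wavelength j, with p wavelengths per spectrum. The symbols are introduced here for illustration only; the n - 1 style denominator matches what `scale()` and `sd()` use:

```latex
x^{\mathrm{SNV}}_{ij} = \frac{x_{ij} - \bar{x}_i}{s_i},
\qquad
\bar{x}_i = \frac{1}{p}\sum_{j=1}^{p} x_{ij},
\qquad
s_i = \sqrt{\frac{1}{p-1}\sum_{j=1}^{p}\left(x_{ij} - \bar{x}_i\right)^2}
```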
We can add the SNV-treated matrix to the tecator data frame.
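The code for this step is not shown in this copy of the post. Since the correlations below use `tecator$snv_spec`, one plausible way (an assumption) is to store the whole matrix as a single column, i.e. something like `tecator$snv_spec <- absorp_snv`. A minimal sketch with hypothetical toy data:

```r
# Toy stand-ins for the spectra matrix and the data frame (hypothetical values)
toy_spec <- matrix(c(2.6, 2.8, 3.1,
                     3.0, 3.3, 3.9,
                     2.7, 2.8, 3.0,
                     3.2, 3.6, 4.1),
                   nrow = 4, byrow = TRUE,
                   dimnames = list(NULL, c("850", "852", "854")))
toy_df <- data.frame(Moisture = c(60.5, 46.0, 71.0, 72.8))

# SNV as in the post, then store the whole matrix as one column
toy_snv <- t(scale(t(toy_spec), center = TRUE, scale = TRUE))
toy_df$snv_spec <- toy_snv

# The data frame still has two columns; the second one is a 4 x 3 matrix
ncol(toy_df)
dim(toy_df$snv_spec)

# cor() then returns one correlation per wavelength
cor(toy_df$Moisture, toy_df$snv_spec)
```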
Now we can check whether the correlation has improved:
# Correlation of each constituent with every wavelength of the SNV spectra
cor_snvspec_moi <- cor(tecator$Moisture, tecator$snv_spec)
cor_snvspec_fat <- cor(tecator$Fat, tecator$snv_spec)
cor_snvspec_prot <- cor(tecator$Protein, tecator$snv_spec)
# One row per constituent, one column per wavelength
cor_snvspec <- as.data.frame(rbind(cor_snvspec_moi, cor_snvspec_fat, cor_snvspec_prot))
cor_snvspec <- cor_snvspec %>%
  mutate(Parameter = as.factor(c("Moisture", "Fat", "Protein")))
# Reshape to long format and plot the three correlation spectra
cor_snvspec %>%
  pivot_longer(cols = c(1:100), names_to = "Wavelength", values_to = "Correlation") %>%
  mutate(Wavelength = as.integer(Wavelength)) %>%
  ggplot(aes(x = Wavelength, y = Correlation, group = Parameter, col = Parameter)) +
  geom_line()
Besides the better correlation, the positive and negative correlation bands are now more clearly defined, and the correlation spectra confirm what we saw earlier in the correlations between the parameters.
As always, we save the workspace for future use:
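The original save call and its path are not shown in this copy of the post. A minimal sketch, assuming a hypothetical file name in a temporary directory, could look like this:

```r
# Hypothetical target file; replace with the real workspace path
ws_file <- file.path(tempdir(), "NIT_ws_example.RData")

# Save every object in the global environment to that file
save.image(file = ws_file)

# Confirm that the file was written
file.exists(ws_file)
```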