FTIR fingerprint — testing a new representation of the binary fingerprint based on FTIR spectra in the prediction of physicochemical properties


  • Kacper Tomaszewski, University of Applied Sciences in Tarnow, Faculty of Mathematics and Natural Sciences, Department of Chemistry, Poland
  • Rafał Kurczab, Maj Institute of Pharmacology – Polish Academy of Sciences, Poland, https://orcid.org/0000-0002-9555-3905




binary fingerprint, FTIR spectroscopy, Savitzky-Golay filter, FEDS, prediction models, physicochemical properties


This paper presents a new method for generating binary fingerprints based on the Savitzky-Golay (SG) filter and first-order derivatives of FTIR spectra; the fingerprints are then used to build prediction models for selected physicochemical properties of chemical compounds. Models based on the FEDS (Functionally-Enhanced Derivative Spectroscopy) transformation and on raw spectra served as references to determine whether the SG filter with first-order derivatives is worth developing further. The FTIR spectra of 103 compounds with theoretically determined values of logP, logD and logS were studied. The Tanimoto coefficient and the correlation coefficient were used to compare the resulting fingerprints, while the root mean square error (RMSE) was used to assess the quality of the prediction models. The results show that the SG filter and derivatives improved the quality of the prediction models for logP and logS and reduced the quality of the models for logD, compared with models based on the original spectra and the FEDS transformation.
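The steps named in the abstract (SG first derivative, binarization, Tanimoto comparison, RMSE scoring) can be illustrated with a minimal Python sketch. It is not the authors' KNIME workflow. The 5-point quadratic window, the threshold rule for setting bits, the zero-padding at the edges, and all function names are assumptions made here for illustration only; the abstract does not specify these details.

```python
import math

# Standard tabulated 5-point Savitzky-Golay first-derivative weights for a
# quadratic fit. The window size is an assumption, not taken from the paper.
SG_D1_WEIGHTS = (-2.0, -1.0, 0.0, 1.0, 2.0)
SG_D1_NORM = 10.0


def sg_first_derivative(y, step=1.0):
    """First derivative of evenly spaced data via a 5-point SG filter.

    Points without a full window (two at each edge) are left at 0.0,
    which is a simplification chosen for this sketch.
    """
    half = len(SG_D1_WEIGHTS) // 2
    out = [0.0] * len(y)
    for i in range(half, len(y) - half):
        acc = sum(w * y[i + k - half] for k, w in enumerate(SG_D1_WEIGHTS))
        out[i] = acc / (SG_D1_NORM * step)
    return out


def binarize(values, threshold):
    """Hypothetical bit rule: 1 where |derivative| exceeds the threshold."""
    return [1 if abs(v) > threshold else 0 for v in values]


def tanimoto(a, b):
    """Tanimoto coefficient of two equal-length bit vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two all-zero fingerprints are treated as identical by convention here.
    return both / either if either else 1.0


def rmse(predicted, observed):
    """Root mean square error between predicted and reference values."""
    total = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(total / len(observed))


if __name__ == "__main__":
    # Toy absorbance values, invented purely for demonstration.
    spectrum_a = [0.0, 0.0, 0.1, 0.5, 1.0, 0.5, 0.1, 0.0, 0.0, 0.0]
    spectrum_b = [0.0, 0.1, 0.5, 1.0, 0.5, 0.1, 0.0, 0.0, 0.0, 0.0]
    fp_a = binarize(sg_first_derivative(spectrum_a), threshold=0.05)
    fp_b = binarize(sg_first_derivative(spectrum_b), threshold=0.05)
    print("fingerprint A:", fp_a)
    print("fingerprint B:", fp_b)
    print("Tanimoto:", tanimoto(fp_a, fp_b))
```

In a real setting the threshold and window length would need tuning against the measured spectra, and the resulting bit vectors would be passed to a regression model whose logP, logD or logS predictions are then scored with `rmse`.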




The KNIME workflow used in the study




How to Cite

Tomaszewski, K., & Kurczab, R. (2023). FTIR fingerprint — testing a new representation of the binary fingerprint based on FTIR spectra in the prediction of physicochemical properties. Science, Technology and Innovation, 17(1-2), 9–29. https://doi.org/10.55225/sti.492



Original articles