A REVIEW ON CLASSIFICATION OF OPERATION DURATIONS OF TIME SERIES USING DIFFERENT ALGORITHMS


Elmalı C.

5TH INTERNATIONAL ISTANBUL CURRENT SCIENTIFIC RESEARCH CONGRESS, RES. ASST. GAMZE TURUN, LECTURER AGİT FERHAT ÖZEL, Editors, İksad Yayınevi, Ankara, pp. 250-275, 2024

  • Publication Type: Book Chapter / Research Book
  • Publication Date: 2024
  • Publisher: İksad Yayınevi
  • City of Publication: Ankara
  • Pages: pp. 250-275
  • Editors: RES. ASST. GAMZE TURUN, LECTURER AGİT FERHAT ÖZEL
  • Marmara University Affiliated: Yes

Abstract

A time series is a sequence of data representing the values of a phenomenon, collected with digital or analog tools at certain time intervals. Time series are used in every field that involves time data, such as engineering, industry, health, finance, meteorology, and traffic. Time series analysis is regarded as a sub-branch of machine learning, which, together with data mining, has a wide range of real-life applications. In industries that rely on human-oriented production, two things cause many problems: workers' operation time series are collected by analog means, and each worker performs the operation within dynamic time intervals. With time series collected by analog tools, it is almost impossible to determine time intervals accurately because the data are limited. Incorrectly determined time intervals lead to problems such as bottlenecks in the assembly line, inventory problems, and unrealistic production targets. Research in this field suggests classifying time series on large digital data sets and performing data mining with appropriate algorithms.

Because the input data have a variable, unprocessed temporal structure, machine learning algorithms applied directly to raw data do not produce accurate results. In recent years, many new algorithms have been developed to improve the scalability and predictive performance of classification and analysis on raw data. Notably, the studies conducted prefer hybrid models, which increase the versatility of the algorithms. This study evaluates algorithms implemented in different programming languages and based on different models, using subsets of 47 data sets from the time-series classification archive of the University of California database. The algorithms are examined under five headings: metric-based, interval-based, shape-based, dictionary-based, and hybrid. Based on the data sets examined, the algorithms are compared to determine which would be most appropriate for the operation time data of workers in any industrial field, collected through sensors on the machines in the work areas. The aim is to minimize potential problems by using data mining on the collected data sets to assign workers with high operational skills to the machines.
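
Of the five algorithm families named in the abstract, the metric (distance-based) family is the easiest to illustrate in a few lines. The sketch below is not taken from the chapter: it assumes dynamic time warping (DTW) as the distance and a one-nearest-neighbour rule as the classifier, a common pairing in this family. The function names, the sample operation durations, and the labels "fast" and "slow" are hypothetical and invented purely for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b[j-1] is reused
                cost[i][j - 1],      # b advances, a[i-1] is reused
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify_1nn(train, query):
    """Return the label of the training series closest to `query` under DTW."""
    best_label, best_dist = None, inf
    for series, label in train:
        dist = dtw_distance(series, query)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Hypothetical operation durations (seconds per cycle); not real measurements.
train = [
    ([30, 31, 30, 32, 31], "fast"),
    ([45, 47, 46, 48, 47], "slow"),
]
query = [31, 30, 30, 33, 31, 32]  # a different length is fine for DTW

print(classify_1nn(train, query))  # prints: fast
```

Because DTW allows one point to align with several points in the other series, the query does not need the same length as the training series, which suits workers who complete the same operation over varying time intervals. The quadratic cost per comparison is one reason the interval, shape-based, dictionary-based, and hybrid families are considered for larger data sets.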