
Realiu laiku generuojamų didžiųjų duomenų saugojimas ir apdorojimas organizacijoje / Storage and processing of real-time big data in an enterprise.

  • Authors: Detkova, Julija
  • Record type:
    bachelor thesis
  • Language:
    Lithuanian
    English
  • Additional information
    • Publication information:
      Institutional Repository of Vilnius University
    • Subject:
      2022
    • Collection:
      Vilnius University Virtual Library (VU VL) / Vilniaus universitetas virtuali biblioteka
    • Abstract:
      Today, the increased use of social networks, IoT, and other devices generates massive amounts of heterogeneous, high-velocity data, otherwise known as Big Data. Big Data is a term characterized by the fundamental 3V paradigm: volume, velocity, and variety. Facing an increasing demand for Big Data, organizations have difficulty finding an efficient solution for storing and processing high volumes of low-density semi-structured and unstructured data. Therefore, the main goal of this paper is to analyze data storage systems that can handle the requirements of Big Data and to propose real-time Big Data pipelines based on the Lambda and Kappa architectures. The analysis of storage compatibility with semi-structured data, unstructured repetitive and non-repetitive data, horizontal and vertical scaling, and the Kappa and Lambda architectures leads to the conclusion that older technologies and strategies are not sufficient to store and process semi-structured and unstructured Big Data; hence, new platforms and technologies should be used instead. In addition, NoSQL and Data Lakes proved to be a good solution for storing unstructured data, whereas relational data model storage has to be integrated with other technologies to store unstructured data efficiently. Lastly, the developed Kappa and Lambda real-time Big Data processing pipelines were tested only with semi-structured data because of the organization's limited data sources. However, the Kappa real-time Big Data pipeline is, in theory, fully compatible with unstructured data because of the storage systems that were selected.
    • File Description:
      application/pdf
    • Relation:
      https://epublications.vu.lt/object/elaba:146241189/146241189.pdf; https://repository.vu.lt/VU:ELABAETD146241189&prefLang=en_US
    • Online access:
      https://repository.vu.lt/VU:ELABAETD146241189&prefLang=en_US
    • Rights:
      info:eu-repo/semantics/openAccess
    • Accession number:
      edsbas.31343DA2