
SpannerLib: Embedding Declarative Information Extraction in an Imperative Workflow

  • Additional Information
    • Publication Information:
      Preprint
    • Publication Information:
      Association for Computing Machinery (ACM), 2024.
    • Subject:
      2024
    • Abstract:
      Document spanners have been proposed as a formal framework for declarative Information Extraction (IE) from text, following IE products from industry and academia. Over the past decade, the framework has been studied thoroughly in terms of expressive power, complexity, and the ability to naturally combine text analysis with relational querying. This demonstration presents SpannerLib, a library for embedding document spanners in Python code. SpannerLib facilitates the development of IE programs by providing an implementation of Spannerlog (Datalog-based document spanners) that interacts with Python code in two directions: rules can be embedded inside Python, and they can invoke custom Python code (e.g., calls to ML-based NLP models) via user-defined functions. The demonstration scenarios showcase IE programs of increasing complexity within Jupyter Notebook.
    • ISSN:
      2150-8097
    • Identifier:
      10.14778/3685800.3685855
    • Identifier:
      10.48550/arxiv.2409.01736
    • Rights:
      CC BY
    • Identifier:
      edsair.doi.dedup.....8250f58f59b2c23f675402d5d21ea1c6
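
The abstract describes document spanners as functions that map text to relations of spans, combined with user-defined Python functions. The sketch below illustrates that general idea using only the Python standard library. It does not use SpannerLib or Spannerlog syntax, and every name in it (`Span`, `regex_spanner`, `span_text`, `is_title`, the sample sentence) is a hypothetical chosen for illustration. The `is_title` function merely stands in for the kind of custom code, such as a call to an NLP model, that the abstract says rules can invoke.

```python
import re
from typing import Callable, Iterator, NamedTuple, Tuple


class Span(NamedTuple):
    """A half-open interval [start, end) of character offsets in a document."""
    start: int
    end: int


def regex_spanner(pattern: str) -> Callable[[str], Iterator[Tuple[Span, ...]]]:
    """Build a function that maps a document to a relation of span tuples.

    Each regex match yields one tuple, with one Span per capture group.
    """
    compiled = re.compile(pattern)

    def extract(doc: str) -> Iterator[Tuple[Span, ...]]:
        for match in compiled.finditer(doc):
            yield tuple(
                Span(*match.span(group))
                for group in range(1, compiled.groups + 1)
            )

    return extract


def span_text(doc: str, span: Span) -> str:
    """Return the substring of the document covered by a span."""
    return doc[span.start:span.end]


def is_title(word: str) -> bool:
    """Hypothetical user-defined predicate; a real one might call an ML model."""
    return word in {"Dr", "Prof"}


doc = "Dr Alice met Prof Bob and Mr Carol."

# Extract (first word, second word) pairs of adjacent capitalized words.
capitalized_pairs = regex_spanner(r"\b([A-Z][a-z]+) ([A-Z][a-z]+)\b")
relation = list(capitalized_pairs(doc))

# Relational-style filter: keep rows whose first span passes the predicate.
titled = [
    (span_text(doc, first), span_text(doc, second))
    for first, second in relation
    if is_title(span_text(doc, first))
]
print(titled)
```

Keeping spans as offsets rather than strings preserves where each extraction came from, which is what lets a span relation be joined or filtered like an ordinary table.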