HANCOM

AI

Hancom makes its technologies accessible to both individuals and businesses
by combining new AI technologies with years of accumulated data.

Hancom Data Loader
Document data extraction SDK for AI

Hancom Data Loader is a document data extraction SDK that turns various document formats into data you can then put to use. This SDK is a core technology for building retrieval-augmented generation (RAG) solutions.

    1. Accurately extract and split data from documents
    2. Extract metadata to segment documents by semantic units
    3. Support output formats such as JSON and CSV for a range of uses

  • AI features for generating ideas about documents
  • A chatbot that handles your needs automatically
  • Customized working environments just for you
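
To make the JSON and CSV output concrete, here is a minimal sketch that uses only the Python standard library. It does not use the Hancom Data Loader API, which this page does not document; the record fields (`type`, `page`, `content`) and the helper names `to_json` and `to_csv` are assumptions chosen for illustration.

```python
import csv
import io
import json

# Hypothetical extracted records; the field names are illustrative only
# and are not taken from the Hancom Data Loader SDK.
records = [
    {"type": "text", "page": 1, "content": "Quarterly revenue grew."},
    {"type": "table", "page": 2, "content": "Region,Revenue"},
]


def to_json(rows):
    """Serialize records as a JSON array, keeping non-ASCII text readable."""
    return json.dumps(rows, ensure_ascii=False, indent=2)


def to_csv(rows):
    """Serialize records as CSV, with a header row taken from the first record."""
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_json(records))
    print(to_csv(records))
```

JSON keeps value types and nesting, which suits downstream RAG pipelines, while CSV flattens every value to text and is easier to inspect in a spreadsheet.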

Hancom's AI document data extraction technology

Effectively turn the text, tables, charts, and images in documents into data and provide it as metadata for AI training and RAG

Extract data such as text, tables, charts, and images of documents

Types of metadata
  • Passage

    • Page number, location, and paragraph information
    • Document metadata (e.g. last modified date)
  • Text

    • Text extraction
    • Text processing by document layout element, such as paragraphs, tables, headers, and footers
    • Categorization by multi-column type
  • Table

    • Identify and process merged cells in rows/columns
    • Identify and process nested tables within a table
    • Recognize tables that continue beyond page boundaries
    • Process tables without borders
  • Image

    • Extract text and table information from images
    • Metadata for image search
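
As an illustration of what processing merged cells (listed under Table above) can involve, the sketch below expands cells that span several rows or columns into a dense grid, so every grid position carries a value. This is not Hancom's implementation; the `Cell` structure and the `expand_merged_cells` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical table cell: top-left position, text, and span sizes."""
    row: int
    col: int
    text: str
    rowspan: int = 1
    colspan: int = 1


def expand_merged_cells(cells, n_rows, n_cols):
    """Copy each cell's text into every grid position that the cell covers."""
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for cell in cells:
        # Clamp spans to the grid so malformed spans cannot index out of range.
        for r in range(cell.row, min(cell.row + cell.rowspan, n_rows)):
            for c in range(cell.col, min(cell.col + cell.colspan, n_cols)):
                grid[r][c] = cell.text
    return grid


if __name__ == "__main__":
    header = [
        Cell(0, 0, "Region", rowspan=2),  # merged vertically
        Cell(0, 1, "Sales", colspan=2),   # merged horizontally
        Cell(1, 1, "Q1"),
        Cell(1, 2, "Q2"),
    ]
    for row in expand_merged_cells(header, n_rows=2, n_cols=3):
        print(row)
```

Repeating the merged value in each covered position is one common design choice: it keeps every row self-describing when the table is later split into passages.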

Hancom Data Loader is a core technology for building RAG solutions. It dramatically improves accuracy and processing time during preprocessing (Load - Split).
Data Loader Process
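
The Split half of the Load - Split preprocessing step can be sketched as follows. This is a simplified illustration under stated assumptions, not the SDK's algorithm: it treats blank lines as paragraph boundaries, packs paragraphs into passages up to a character budget, and attaches the page number as metadata. The function name `split_passages` and the `max_chars` parameter are hypothetical.

```python
def split_passages(pages, max_chars=200):
    """Split page texts into passages, each tagged with its page number.

    pages: list of strings, one per page (the output of a "Load" step).
    Paragraphs are never cut; a paragraph longer than max_chars becomes
    its own passage.
    """
    passages = []
    for page_no, text in enumerate(pages, start=1):
        buf = ""
        for para in (p.strip() for p in text.split("\n\n")):
            if not para:
                continue
            if buf and len(buf) + len(para) + 1 > max_chars:
                # The budget would be exceeded: close the current passage.
                passages.append({"page": page_no, "text": buf})
                buf = para
            else:
                buf = f"{buf}\n{para}" if buf else para
        if buf:
            passages.append({"page": page_no, "text": buf})
    return passages


if __name__ == "__main__":
    pages = ["Intro paragraph.\n\nSecond paragraph.", "Text on page two."]
    for passage in split_passages(pages, max_chars=20):
        print(passage)
```

Keeping the page number with each passage lets a RAG system cite where a retrieved passage came from.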

Core technology for building RAG solutions