Trillo Doc AI Architecture
Last updated
Trillo Doc AI processes documents using Large Language Models (LLMs) and Document AI processors. It first reads each document's content using OCR or a parsing library such as a PDF library. It then classifies the document using an LLM-based classifier. Based on the classification, it either uses a processor to extract structured data from the content or processes the content as a text document.
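The classify-then-route flow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not Trillo's implementation: the names `classify`, `STRUCTURED_PROCESSORS`, `process`, and `ProcessedDocument` are hypothetical, and the classifier is a keyword heuristic standing in for the LLM-based classifier so that the example runs without any external service.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ProcessedDocument:
    doc_class: str   # label produced by the classifier
    route: str       # "structured" or "text"
    payload: dict    # extracted fields, or the raw text for text documents


def classify(text: str) -> str:
    """Stand-in for the LLM-based classifier (assumption: keyword matching only)."""
    lowered = text.lower()
    if "purchase order" in lowered:
        return "purchase_order"
    if "invoice" in lowered:
        return "invoice"
    return "text"


def extract_invoice(text: str) -> dict:
    # Placeholder for a structured-data processor; a real one would return parsed fields.
    return {"document_type": "invoice", "length": len(text)}


def extract_purchase_order(text: str) -> dict:
    # Placeholder for a structured-data processor.
    return {"document_type": "purchase_order", "length": len(text)}


# Hypothetical registry: document class -> processor that extracts structured data.
STRUCTURED_PROCESSORS: Dict[str, Callable[[str], dict]] = {
    "invoice": extract_invoice,
    "purchase_order": extract_purchase_order,
}


def process(text: str) -> ProcessedDocument:
    """Classify the content, then either extract structured data or treat it as text."""
    doc_class = classify(text)
    processor = STRUCTURED_PROCESSORS.get(doc_class)
    if processor is not None:
        return ProcessedDocument(doc_class, "structured", processor(text))
    return ProcessedDocument(doc_class, "text", {"text": text})
```

Keeping the processors in a registry keyed by document class means a new document type only needs a new classifier label and one registry entry; the routing logic itself stays unchanged.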
Text documents are summarized, chunked, and converted to vectors (text embeddings). These vectors are stored in a vector database for semantic matching.
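As a rough illustration of the chunk, embed, and match steps, the sketch below uses only the Python standard library. Everything in it is an assumption made for illustration: `chunk_text`, `toy_embed`, and `InMemoryVectorStore` are hypothetical names, the summarization step is omitted, and the hashed bag-of-words embedding captures word overlap only. A real pipeline would call an embedding model and a vector database instead, which is what makes the matching semantic rather than lexical.

```python
import hashlib
import math
from typing import List, Tuple


def chunk_text(text: str, max_words: int = 50) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text: str, dim: int = 256) -> List[float]:
    """Toy embedding: hash each word into a bucket, then L2-normalize.

    Assumption: this replaces a real text-embedding model for demonstration only.
    """
    vec = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two already-normalized vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self) -> None:
        self._rows: List[Tuple[str, List[float]]] = []

    def add(self, chunk: str) -> None:
        self._rows.append((chunk, toy_embed(chunk)))

    def search(self, query: str, k: int = 1) -> List[Tuple[float, str]]:
        """Return the k stored chunks most similar to the query, best first."""
        q = toy_embed(query)
        scored = [(cosine(q, vec), chunk) for chunk, vec in self._rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]
```

A typical use is to call `store.add(chunk)` for each chunk returned by `chunk_text`, then call `store.search(query, k)` at question time to retrieve the closest chunks.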
The Trillo Doc AI processing pipeline is a specific implementation of the generic document processing pipeline, built on GCP services: Google Vertex AI (which includes Gen AI and LLM APIs), Google Document AI (NLP-based parsers for forms, tables, purchase orders, invoices, etc.), AlloyDB (vector database), Cloud SQL (managed relational database), and Google Cloud Storage (buckets).
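One way to picture this mapping is as a small lookup table from pipeline role to the GCP service that fills it. The sketch below is purely illustrative: the role names, `ServiceBinding`, `PIPELINE_SERVICES`, and `service_for` are hypothetical labels chosen for this example and are not taken from Trillo's configuration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ServiceBinding:
    role: str         # hypothetical name for a pipeline responsibility
    gcp_service: str  # GCP service named in the text above


# Roles are illustrative labels; the services are those listed in the text.
PIPELINE_SERVICES: Tuple[ServiceBinding, ...] = (
    ServiceBinding("llm_and_gen_ai", "Google Vertex AI"),
    ServiceBinding("structured_parsing", "Google Document AI"),
    ServiceBinding("vector_database", "AlloyDB"),
    ServiceBinding("relational_database", "Cloud SQL"),
    ServiceBinding("object_storage", "Google Cloud Storage"),
)


def service_for(role: str) -> str:
    """Look up which GCP service backs a given pipeline role."""
    for binding in PIPELINE_SERVICES:
        if binding.role == role:
            return binding.gcp_service
    raise KeyError(f"unknown pipeline role: {role}")
```

Expressing the bindings as data rather than hard-coding service names throughout the pipeline reflects the distinction the text draws between the generic pipeline and this GCP-specific implementation.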