NVIDIA-Ingest
Overview
NVIDIA-Ingest is a scalable, high-performance microservice for extracting content and metadata from documents. It parses PDF, Word, and PowerPoint files and uses NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts, and images for downstream generative applications. Several extraction methods are available for each document type. The project is currently in early access, and its codebase is updated frequently.
Target Users
Target users are organizations and individuals, such as business data analysts and researchers, who need to process large volumes of complex, unstructured PDFs and other enterprise documents and convert them into metadata and text suitable for retrieval systems. The service extracts this information from a range of document types to support their data processing and analysis work.
Total Visits: 474.6M
Top Region: US (19.34%)
Website Views: 49.7K
Use Cases
Companies extracting key information from large volumes of business documents to build knowledge graphs
Research institutions extracting data from academic literature to support scientific research
Data analysts using extracted text data for subsequent data analysis and mining
Features
Accept JSON job descriptions containing document payloads and ingestion tasks
Allow retrieval of job results, which are JSON dictionaries containing metadata of extracted objects and processing notes
Support multiple document types such as PDF, DOCX, PPTX, and images
Support various extraction methods for each document type, including pdfium, Unstructured.io, and Adobe Content Extraction Services for PDFs
Support preprocessing and postprocessing operations including text segmentation, transformation, filtering, and embedding generation
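To make the first feature concrete, here is a minimal sketch of how a JSON job description combining a document payload with a list of ingestion tasks might be assembled. The field names (job_payload, content, document_type, tasks) and the task options are assumptions chosen for illustration; they are not taken from the NVIDIA-Ingest schema, so check the project's own documentation for the real format.

```python
import base64
import json


def build_job_description(content: bytes, document_type: str, tasks: list[dict]) -> str:
    """Return a JSON job description.

    The schema used here is hypothetical and only illustrates the idea of
    pairing a document payload with an ordered list of ingestion tasks.
    """
    job = {
        "job_payload": {
            # Binary documents are base64-encoded so they survive JSON transport.
            "content": base64.b64encode(content).decode("ascii"),
            "document_type": document_type,
        },
        # Tasks run in the order given: extraction first, then post-processing.
        "tasks": tasks,
    }
    return json.dumps(job)


# Placeholder bytes stand in for a real PDF file read from disk.
job_json = build_job_description(
    b"%PDF-1.7 placeholder bytes",
    "pdf",
    [
        {"type": "extract", "method": "pdfium", "extract_text": True, "extract_tables": True},
        {"type": "split", "chunk_size": 300},  # hypothetical text segmentation option
        {"type": "embed"},  # hypothetical embedding generation task
    ],
)
```

Keeping the tasks as an ordered list mirrors the feature list above: one extraction method per document, followed by optional segmentation, filtering, or embedding steps.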
How to Use
1. Launch the NIM microservice
2. Install NVIDIA Ingest client dependencies in a Python environment
3. Submit ingestion jobs
4. Review and utilize results
5. Optional: Deploy the library directly
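Steps 3 and 4 above can be sketched as a submit-then-poll loop followed by a quick summary of the returned objects. This is an illustration only: it does not use the NVIDIA Ingest client library, the submit and fetch callables stand in for whatever transport the real client provides, and FakeService, the result layout, and the document_type field are hypothetical.

```python
import time
from collections import Counter
from typing import Callable, Optional


def submit_and_wait(
    submit: Callable[[dict], str],
    fetch: Callable[[str], Optional[list]],
    job: dict,
    timeout_s: float = 60.0,
    poll_interval_s: float = 0.5,
) -> list:
    """Submit a job, then poll until a result is available or the timeout expires."""
    job_id = submit(job)
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = fetch(job_id)
        if result is not None:
            return result
        time.sleep(poll_interval_s)
    raise TimeoutError(f"job {job_id} did not finish within {timeout_s} s")


def count_by_type(result: list) -> Counter:
    """Count extracted objects per type (the 'document_type' key is an assumption)."""
    return Counter(item.get("document_type", "unknown") for item in result)


class FakeService:
    """In-memory stand-in for the service; it returns a result on the third poll."""

    def __init__(self) -> None:
        self._polls: dict = {}

    def submit(self, job: dict) -> str:
        job_id = f"job-{len(self._polls)}"
        self._polls[job_id] = 0
        return job_id

    def fetch(self, job_id: str) -> Optional[list]:
        self._polls[job_id] += 1
        if self._polls[job_id] < 3:
            return None  # job still running
        return [
            {"document_type": "text", "metadata": {"content": "Quarterly revenue rose."}},
            {"document_type": "structured", "metadata": {"content": "| Q1 | Q2 |"}},
            {"document_type": "text", "metadata": {"content": "Outlook remains stable."}},
        ]


service = FakeService()
result = submit_and_wait(
    service.submit, service.fetch, {"tasks": []}, timeout_s=5.0, poll_interval_s=0.01
)
counts = count_by_type(result)
```

Passing the transport in as callables keeps the polling logic testable without a running service; in a real deployment those two callables would wrap the client calls that submit a job and retrieve its result.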