

wdoc
Overview
wdoc is a RAG (retrieval-augmented generation) system developed by Olicorne, a medical student, for querying and summarizing documents. It supports many file types (PDFs, web pages, YouTube videos, and more) and combines several language models to aim for both high recall and high precision in query results. Its main strengths are broad file-type support, efficient retrieval, and extensibility. wdoc is still under development, and the developer welcomes user feedback and feature requests.
Target Users
wdoc is aimed at researchers, students, and professionals who need to process large volumes of diverse documents (PDFs, web pages, audio, video, and more). It speeds up information retrieval and summarization, and is most useful when a query or summary has to span several file types at once.
Use Cases
Query specific content within a PDF file and obtain detailed answers.
Summarize YouTube videos, extracting the key information into a Markdown summary.
Query a personal knowledge base (such as Anki cards) to quickly retrieve and summarize card content.
Features
Supports 15+ file types (such as PDFs, web pages, YouTube videos, etc.) and can query multiple file types simultaneously.
Uses LangChain to process documents, supporting over 100 language models, including local and private LLMs.
Employs advanced RAG techniques, generating high-quality answers through embedding-based retrieval and semantic clustering.
Provides a powerful summarization feature, compressing the document's reasoning process and arguments into an easy-to-read Markdown format.
Supports local and private modes, which help keep data secure and reduce the risk of information leakage.
Supports multiple tasks, such as querying, searching, summarizing, and querying after summarization.
Provides detailed documentation and command-line help, making it easy for users to get started quickly.
Highly extensible, supporting integration as a tool or library into other projects.
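To make the idea of embedding-based retrieval concrete, here is a minimal toy sketch in Python. It is not wdoc's implementation: the `embed` function is a hypothetical stand-in that uses word counts instead of a real embedding model, and the sample chunks are invented for illustration. It only shows the general pattern of scoring chunks against a query by cosine similarity and keeping the top `k`.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


# Invented sample chunks, standing in for pieces of parsed documents.
chunks = [
    "Aspirin inhibits platelet aggregation.",
    "The mitochondria produce most cellular ATP.",
    "Platelet aggregation is reduced by aspirin therapy.",
]
top = retrieve("aspirin platelet aggregation", chunks, top_k=2)
for chunk in top:
    print(chunk)
```

In a full RAG pipeline, the retrieved chunks would then be passed to a language model along with the question; that step is omitted here.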
How to Use
1. Install wdoc with pip: `pip install wdoc`.
2. Set up environment variables: Add the API key of your chosen language model as an environment variable.
3. Run a query: `wdoc --task=query --path=document_path --filetype=file_type`.
4. Generate a summary: `wdoc --task=summarize --path=document_path --filetype=file_type`.
5. Save and load indexes: Use `--save_embeds_as` to save the index and `--load_embeds_from` to load it again, which speeds up later queries.
6. Use advanced features: Optimize query results by combining parameters such as `--query_retrievers` and `--top_k`.
7. View the help documentation: Run `wdoc --help` to view detailed commands and parameter explanations.
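The steps above can be sketched as a small Python helper that assembles the command line without running it. This is only an illustration: `build_wdoc_command` is a hypothetical helper, the values `paper.pdf`, `pdf`, and `paper_index` are placeholders, and passing `--save_embeds_as` and `--load_embeds_from` in `--flag=value` form is an assumption modeled on the other flags. Check `wdoc --help` for the accepted values.

```python
import shlex


def build_wdoc_command(task, path, filetype,
                       save_embeds_as=None, load_embeds_from=None):
    """Assemble a wdoc argument list from the flags named in the steps above."""
    argv = ["wdoc", f"--task={task}", f"--path={path}", f"--filetype={filetype}"]
    # Optional index flags; the --flag=value form is assumed here.
    if save_embeds_as is not None:
        argv.append(f"--save_embeds_as={save_embeds_as}")
    if load_embeds_from is not None:
        argv.append(f"--load_embeds_from={load_embeds_from}")
    return argv


# First run: query a document and save the index (placeholder values).
first_run = build_wdoc_command("query", "paper.pdf", "pdf",
                               save_embeds_as="paper_index")
# Later run: summarize the same document, reusing the saved index.
later_run = build_wdoc_command("summarize", "paper.pdf", "pdf",
                               load_embeds_from="paper_index")

print(shlex.join(first_run))
print(shlex.join(later_run))
```

With wdoc installed and the API key set, either list could be passed to `subprocess.run(first_run, check=True)` to execute it.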
Featured AI Tools

MyReader AI
MyReader is an AI-powered tool that reads books for you. You can upload any book or document (PDF, EPUB), ask questions, and get answers along with the relevant passage for reference. You can also browse the contents of uploaded books, view related chapters, and jump to specific pages to continue reading. MyReader lets you create separate contexts, such as philosophy, finance, and healthcare, and refer back to your uploaded books at any time, with a maximum upload limit of 20,000 pages. See the MyReader website for pricing details.
Knowledge Management
606.4K

Google NotebookLM
NotebookLM is a personalized AI assistant designed to help users with thinking, summarizing, and brainstorming. Users can create notebooks, add Google Docs, PDFs, or copied text as information sources, and then ask NotebookLM questions to assist with explanation, summarization, and brainstorming. Users can also click on information sources to automatically generate summaries and key themes. NotebookLM's strength lies in its personalized assistance, allowing users to trust the information it provides and build upon it for their work.
Knowledge Management
348.0K