DocInsights platform
Overview
DocInsights is a platform designed to help you be more productive with your documents. For example, you may have hundreds of documents and not know where to look for the answer to a question, or you may be unsure how to phrase a query so that it matches your computer's keyword-based search. DocInsights "understands" the content of your documents and lets you search them using natural language. This is also useful if, for example, you want to read research papers or quiz yourself in preparation for an exam.
You can sign up for DocInsights here. To learn more about DocInsights you can also take a look at the landing page coretext.dev.
Features
Here is a brief overview of the app. To sign in, go to the sign-in page. If you don't have an account, create one with your email address.
Once you are signed in, you can upload documents to your library in the library view. Here you can also remove entries if they are no longer required.
Once you have documents in your library, you can start asking questions about them. DocInsights gives you an answer and provides the sources it has used to generate the answer.
Implementation
Here are some details on the implementation of DocInsights. DocInsights is implemented as a set of microservices and hosted on Google Cloud. Specifically, the following components are used.
Core backend
The engine of the app, which provides the logic for handling queries, uploading documents, etc. The engine uses my open-source library RAGCore, which you can find here. Written in modern Python.
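To give a rough idea of what "handling queries" with sources involves, here is a minimal, self-contained sketch. It is not the RAGCore API: the names `Chunk` and `retrieve_with_sources` are hypothetical, and a simple bag-of-words cosine similarity stands in for the embedding-based search a real engine would use.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """A passage of a document (hypothetical structure for illustration)."""
    source: str  # e.g. the title or file name of the uploaded document
    text: str


def _bag_of_words(text: str) -> Counter:
    # Lowercase word counts; a stand-in for a real embedding model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_with_sources(question: str, chunks: list[Chunk], top_k: int = 2) -> list[Chunk]:
    """Return the chunks most similar to the question, best match first.

    Chunks with no overlap at all are dropped, so unrelated documents
    are never reported as sources.
    """
    query = _bag_of_words(question)
    scored = [(_cosine(query, _bag_of_words(c.text)), c) for c in chunks]
    scored = [(score, c) for score, c in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]
```

In a full pipeline, the retrieved chunks would be passed to a language model as context, and their `source` fields would be shown next to the generated answer.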
Web frontend
A modern web frontend written in TypeScript with Next.js.
Vector database for documents
A Pinecone vector database. Namespaces are used to isolate user data.
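The idea of namespace-based isolation can be sketched with a small in-memory stand-in. This is not the Pinecone client: `NamespacedVectorStore` and its methods are hypothetical, and using the user's UID as the namespace key is an assumption made for illustration.

```python
from collections import defaultdict


def _dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


class NamespacedVectorStore:
    """Toy vector store in which every operation is scoped to one namespace."""

    def __init__(self) -> None:
        # namespace -> {item_id: vector}
        self._data: dict[str, dict[str, list[float]]] = defaultdict(dict)

    def upsert(self, namespace: str, item_id: str, vector: list[float]) -> None:
        self._data[namespace][item_id] = list(vector)

    def query(self, namespace: str, vector: list[float], top_k: int = 3) -> list[str]:
        # Only vectors stored under the given namespace are ever considered,
        # so one user's query cannot return another user's documents.
        items = self._data.get(namespace, {})
        ranked = sorted(items.items(), key=lambda kv: _dot(kv[1], vector), reverse=True)
        return [item_id for item_id, _ in ranked[:top_k]]

    def delete_namespace(self, namespace: str) -> None:
        # Removing a namespace removes all of that user's vectors at once.
        self._data.pop(namespace, None)
```

Scoping every read and write to a namespace means that isolation does not depend on each query remembering to add a per-user filter.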
Database for user data
A Firestore instance (NoSQL).
Authentication/authorization service
Firebase Authentication service to handle user accounts. Throughout the app, users are identified only by their UIDs, so no user names or emails are visible outside of the authentication service.
Data cleanup service
A service that runs periodically to fully remove all of a user's data after they delete their account. Written in async Python.
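A periodic cleanup of this kind might look roughly like the following `asyncio` sketch. All names here (`InMemoryStore`, `purge_user`, `run_cleanup_once`, `run_periodically`) are hypothetical, and the in-memory stores stand in for the real vector database and user database. How the actual service learns which accounts were deleted is not described here, so a plain set of pending UIDs is assumed.

```python
import asyncio


class InMemoryStore:
    """Stand-in for a backing store (e.g. vector DB or user DB), keyed by UID."""

    def __init__(self, name: str, data: dict) -> None:
        self.name = name
        self.data = dict(data)

    async def delete_user_data(self, uid: str) -> None:
        await asyncio.sleep(0)  # yield control, as a real network call would
        self.data.pop(uid, None)


async def purge_user(uid: str, stores: list[InMemoryStore]) -> None:
    # Delete from all stores concurrently.
    await asyncio.gather(*(store.delete_user_data(uid) for store in stores))


async def run_cleanup_once(pending_uids: set[str], stores: list[InMemoryStore]) -> list[str]:
    """Purge every pending UID and return the UIDs that were processed."""
    purged = []
    for uid in sorted(pending_uids):
        await purge_user(uid, stores)
        pending_uids.discard(uid)
        purged.append(uid)
    return purged


async def run_periodically(
    pending_uids: set[str],
    stores: list[InMemoryStore],
    interval_seconds: float,
    rounds: int,
) -> None:
    # A bounded loop for illustration; a deployed service would typically be
    # triggered by a scheduler or loop until shut down.
    for _ in range(rounds):
        await run_cleanup_once(pending_uids, stores)
        await asyncio.sleep(interval_seconds)
```

Because each store exposes the same `delete_user_data` coroutine in this sketch, adding another data store to the cleanup only requires appending it to the `stores` list.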