Chat with PDF

Engage in conversations with your PDF documents using AI to extract insights and answer questions.

The Chat with PDF demo application lets you interact with document content through a conversational AI interface. Upload a PDF file, then ask questions about it, request summaries, and extract key information in natural language.

Features

Transform how you interact with document content through these powerful capabilities:

PDF upload

Easily upload PDF files directly into the application for analysis.

Contextual conversation

Chat with an AI that understands the content of your uploaded PDF, providing relevant answers based on the text.

Information extraction

Quickly find specific information, key points, or summaries within the document through natural language queries.

Source highlighting (coming soon)

See which document sections informed each AI response, highlighted directly in the source.

Multi-document intelligence (coming soon)

Conduct sophisticated conversations spanning multiple uploaded documents, enabling cross-document analysis and comparison.

Setup

To implement the "Chat with PDF" application in your project, configure these essential backend services:

Database

Set up PostgreSQL with the pgvector extension to efficiently store conversation history, document metadata, and vector embeddings for semantic search.

Storage

Configure S3-compatible cloud storage for secure management of uploaded PDF documents.

You'll also need to obtain API keys for both the conversational AI models and the embedding models used for text processing.

AI models

This application leverages two complementary AI model types working together:

Large Language Models (LLMs): Provide sophisticated natural language understanding to interpret your questions and generate contextually appropriate responses based on document content.
Embedding Models: Convert document text segments into numerical vector representations that enable efficient semantic similarity search and Retrieval-Augmented Generation (RAG).
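
To make the retrieval step concrete, the following is a minimal in-memory sketch of semantic similarity search, assuming cosine similarity as the metric. The names `Chunk`, `cosineSimilarity`, and `topK` are hypothetical and not taken from the application; in practice the vectors come from an embedding model and the search runs in the database.

```typescript
// Illustrative only: real embeddings have hundreds or thousands of dimensions.
// All names in this block are hypothetical.
interface Chunk {
  text: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|), a value in [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query embedding, best match first.
function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

In a RAG flow, the text of the returned chunks would be placed in the LLM prompt as context for answering the user's question.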

Configure the providers for the models you wish to use:

OpenAI

Use GPT models for conversational AI and OpenAI embedding models for vector representation.

Anthropic

Implement Claude models for sophisticated reasoning and nuanced document understanding.

Google AI

Leverage Gemini models for powerful conversational capabilities with document content.

Replicate

Access diverse open-source embedding models for flexible implementation options.

For comprehensive configuration details, consult the AI SDK documentation covering provider setup and model selection.

Data persistence

The application stores data related to chats, documents, and embeddings to provide a persistent experience.

Database

Learn more about database services in TurboStarter AI.

Application data is organized within a dedicated PostgreSQL schema named pdf:

chats: captures essential metadata for each document-specific conversation session.
messages: stores all user queries and AI responses within conversation threads.
documents: maintains comprehensive tracking of uploaded PDF files, including filenames and storage locations.
embeddings: contains text segments extracted from PDFs along with their vector representations (using pgvector's vector data type). To speed up the similarity searches that RAG depends on, the system creates an HNSW index named embeddingIndex on the embedding column.
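
As a rough illustration of the embeddings table and its index, the sketch below holds pgvector SQL in TypeScript strings. Only the pdf schema, the embeddings table, the embedding column, and the embeddingIndex name come from the description above; the other columns, the 1536 dimension, the cosine operator class, and the helper names are assumptions.

```typescript
// Hypothetical DDL: columns other than `embedding`, the vector dimension,
// and the choice of cosine distance are assumptions, not the shipped schema.
const createEmbeddingsSql = `
CREATE EXTENSION IF NOT EXISTS vector;
CREATE SCHEMA IF NOT EXISTS pdf;
CREATE TABLE IF NOT EXISTS pdf.embeddings (
  id bigserial PRIMARY KEY,
  document_id text NOT NULL,
  content text NOT NULL,
  embedding vector(1536) NOT NULL
);
CREATE INDEX IF NOT EXISTS "embeddingIndex"
  ON pdf.embeddings USING hnsw (embedding vector_cosine_ops);
`;

// pgvector accepts vectors as text literals such as '[0.1,0.2,0.3]'.
function toVectorLiteral(values: number[]): string {
  if (values.some((v) => !Number.isFinite(v))) {
    throw new Error("Embedding values must be finite numbers");
  }
  return `[${values.join(",")}]`;
}

// Build a parameterized nearest-neighbour query for one document.
// `<=>` is pgvector's cosine distance operator (smaller means more similar).
function buildSimilarityQuery(
  documentId: string,
  queryEmbedding: number[],
  limit: number,
): { text: string; values: unknown[] } {
  return {
    text: [
      "SELECT content, embedding <=> $1::vector AS distance",
      "FROM pdf.embeddings",
      "WHERE document_id = $2",
      "ORDER BY embedding <=> $1::vector",
      "LIMIT $3",
    ].join("\n"),
    values: [toVectorLiteral(queryEmbedding), documentId, limit],
  };
}
```

The returned `text` and `values` pair matches the shape most PostgreSQL clients accept for parameterized queries, which keeps user input out of the SQL string.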

Storage

Learn more about cloud storage services in TurboStarter AI.

The PDF files uploaded by users are securely stored in your configured cloud storage bucket. The path field in the documents table maintains the precise reference to each file's location.

Structure

The "Chat with PDF" feature is split across the monorepo so that core logic is written once and reused by each app:

Core

The @turbostarter/ai package (packages/ai) contains the essential logic under modules/pdf:

Comprehensive types, validation schemas, and constants specific to PDF processing
Advanced document parsing, text segmentation, and embedding generation utilities
Core API logic for managing conversations, performing RAG-based lookups, and interacting with LLMs
Database operations for storing and retrieving conversations, documents, and embeddings
Shared utilities for managing PDF file uploads and downloads
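
Text segmentation is the step that turns extracted PDF text into pieces small enough to embed. One common approach, shown here as a sketch and not as the package's actual implementation, is fixed-size chunks with overlap so that a sentence cut at a boundary still appears whole in a neighbouring chunk. The function name and the default sizes are hypothetical.

```typescript
// Split text into fixed-size character chunks that overlap their neighbours.
// `chunkSize` and `overlap` are in characters; the defaults are illustrative.
function splitIntoChunks(text: string, chunkSize = 1000, overlap = 200): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be at least 0 and smaller than chunkSize");
  }

  // Collapse runs of whitespace, which PDF extraction tends to produce.
  const normalized = text.replace(/\s+/g, " ").trim();
  const chunks: string[] = [];
  const step = chunkSize - overlap;

  for (let start = 0; start < normalized.length; start += step) {
    chunks.push(normalized.slice(start, start + chunkSize));
    // Stop once a chunk has reached the end of the text.
    if (start + chunkSize >= normalized.length) break;
  }
  return chunks;
}
```

Each returned chunk would then be passed to the embedding model and stored alongside its vector.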

API

The packages/api package defines the backend API endpoints using Hono:

src/modules/ai/pdf/pdf.router.ts: implements Hono RPC routes for document upload and conversation management, handles input validation, applies middleware (authentication, credit management), and invokes the core functionality from @turbostarter/ai.
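
The flow through such a route can be sketched without any framework code. The example below deliberately avoids the Hono API and models only the sequence of checks; every name in it is hypothetical, and the order (authentication, validation, credits, core call) and status codes are assumptions.

```typescript
// All types and functions here are hypothetical stand-ins for the real router.
interface ChatRequest {
  userId: string | null;
  documentId: string;
  message: string;
}

interface ChatResult {
  status: number;
  body: string;
}

// A step returns an early response to stop the request, or null to continue.
type Step = (req: ChatRequest) => ChatResult | null;

const requireAuth: Step = (req) =>
  req.userId === null ? { status: 401, body: "Unauthorized" } : null;

const validateInput: Step = (req) =>
  req.documentId.length === 0 || req.message.trim().length === 0
    ? { status: 400, body: "Invalid input" }
    : null;

// Deduct `cost` credits per request from an in-memory balance table.
function makeCreditCheck(credits: Map<string, number>, cost: number): Step {
  return (req) => {
    const userId = req.userId;
    if (userId === null) return { status: 401, body: "Unauthorized" };
    const balance = credits.get(userId) ?? 0;
    if (balance < cost) return { status: 402, body: "Insufficient credits" };
    credits.set(userId, balance - cost);
    return null;
  };
}

// Run each step in order, then hand the request to the core logic.
function handle(
  req: ChatRequest,
  steps: Step[],
  core: (req: ChatRequest) => string,
): ChatResult {
  for (const step of steps) {
    const early = step(req);
    if (early !== null) return early;
  }
  return { status: 200, body: core(req) };
}
```

Running the credit check last means a request that fails authentication or validation never consumes credits.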

Web

The Next.js application (apps/web) delivers an intuitive user interface:

src/app/[locale]/(apps)/pdf/**: contains the Next.js App Router pages and layouts for the document conversation experience
src/components/pdf/**: houses reusable React components specific to the PDF interaction UI (document upload, conversation interface, message display)

Mobile

The Expo/React Native application (apps/mobile) provides a native mobile experience:

src/app/pdf/**: defines the screens for the mobile document conversation interface
src/components/pdf/**: contains React Native components optimized for mobile document interaction
API integration: utilizes the same Hono RPC client (packages/api) as the web app for consistent backend communication

This architecture ensures that core AI processing and data handling logic is shared across platforms, while enabling optimized UI implementations tailored to each environment.
