
# Knowledge Base (RAG)

> Give your AI personas access to your documents using semantic search and Retrieval-Augmented Generation

## Overview

Anam's Knowledge Base lets your AI personas search and retrieve information from your documents using Retrieval-Augmented Generation (RAG). Instead of relying solely on the LLM's training data, your personas can access up-to-date, organization-specific information from your uploaded documents.

<Warning>
  **Beta Feature**: Knowledge Base is currently in beta. You may encounter some
  issues as we continue to improve the feature. Please report any feedback or
  issues to help us make it better.
</Warning>

With this feature, personas can:

* Answer questions from your documentation
* Provide information about your products, policies, or procedures
* Stay current with your latest content updates
* Ground responses in verified sources

## How It Works

Knowledge tools let your AI persona search your documents to answer questions. Before a document can be searched, it goes through three stages:

<Steps>
  <Step title="Upload">
    Upload a file using a three-step signed upload process: request an upload URL, send the file to that URL, then confirm the upload.

    <CodeGroup>
      ```javascript Signed URL Upload theme={"system"}
      async function uploadDocument(file, folderId, apiKey) {
        try {
          // Step 1: Get an upload URL from Anam
          const response = await fetch(
            `/v1/knowledge/groups/${folderId}/documents/presigned-upload`,
            {
              method: 'POST',
              headers: {
                'Authorization': `Bearer ${apiKey}`,
                'Content-Type': 'application/json'
              },
              body: JSON.stringify({
                filename: file.name,
                contentType: file.type,
                fileSize: file.size
              })
            }
          );

          if (!response.ok) {
            throw new Error('Failed to get upload URL');
          }

          const { uploadUrl, documentId } = await response.json();

          // Step 2: Upload your file directly to the URL
          const uploadResponse = await fetch(uploadUrl, {
            method: 'PUT',
            body: file
          });

          if (!uploadResponse.ok) {
            throw new Error('Failed to upload file to storage');
          }

          // Step 3: Confirm the upload with Anam
          const confirmResponse = await fetch(
            `/v1/knowledge/documents/${documentId}/confirm-upload`,
            {
              method: 'POST',
              headers: {
                'Authorization': `Bearer ${apiKey}`,
                'Content-Type': 'application/json'
              },
              body: JSON.stringify({ fileSize: file.size })
            }
          );

          if (!confirmResponse.ok) {
            throw new Error('Failed to confirm upload');
          }

          return await confirmResponse.json();
        } catch (error) {
          console.error('Upload failed:', error.message);
          throw error;
        }
      }
      ```
    </CodeGroup>

    This process stores the document and begins the processing pipeline.
  </Step>

  <Step title="Processing">
    The document status changes to `PROCESSING` while a background job makes the document searchable. This typically takes \~30 seconds.
  </Step>

  <Step title="Ready">
    Once processing completes, the document status changes to `READY` and the document becomes searchable.

    <Check>
      You can now attach this folder to knowledge tools and start searching.
    </Check>
  </Step>
</Steps>

<Warning>
  Documents must be in `READY` status to be searchable. If a document fails
  processing, its status is set to `FAILED` with an error message.
</Warning>
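
The status flow above (`PROCESSING`, then `READY` or `FAILED`) lends itself to polling. The sketch below is one way to wait for a document to become searchable. This page does not name a status endpoint, so `getStatus` is a hypothetical function you supply that returns the document's current status; the interval and timeout defaults are illustrative values, not documented ones.

```javascript
// Poll a document's status until it is READY, FAILED, or the timeout elapses.
// `getStatus` is a hypothetical caller-supplied async function that resolves
// to 'PROCESSING', 'READY', or 'FAILED' (for example, by calling whichever
// document endpoint your integration uses).
async function waitForReady(
  getStatus,
  { intervalMs = 2000, timeoutMs = 120000 } = {}
) {
  const deadline = Date.now() + timeoutMs;

  while (true) {
    const status = await getStatus();

    if (status === 'READY') {
      return status;
    }
    if (status === 'FAILED') {
      throw new Error('Document processing failed');
    }
    if (Date.now() >= deadline) {
      throw new Error(`Timed out waiting for READY (last status: ${status})`);
    }

    // Pause between checks so the API is not called in a tight loop.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Passing the status lookup in as a function keeps the waiting logic separate from any particular endpoint, so you can swap in the real call once you know it.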

## Document Processing

How a document is split into searchable units depends on its file type:

| File Type          | How It's Processed                               |
| ------------------ | ------------------------------------------------ |
| PDF, TXT, MD, DOCX | Split into paragraphs for precise search results |
| CSV                | Each row is searchable independently             |
| JSON               | Entire file kept intact                          |
| LOG                | Each line is searchable independently            |

### Optimizing Document Structure

For best search results, structure your documents with:

**Clear headings and sections**:

```markdown theme={"system"}
# Installation Guide

## Prerequisites

Before installing, ensure you have...

## Step 1: Download the software

Navigate to our downloads page...

## Step 2: Run the installer

Double-click the downloaded file...
```

**Self-contained paragraphs**:
Each paragraph should make sense independently, as it may be retrieved without surrounding context.

**Descriptive filenames**:

* `product-installation-guide.pdf`
* `billing-faq-2024.md`

Avoid generic names like `document1.pdf` or `untitled.txt`.

## Storage Limits and Quotas

### Upload Limits

Document uploads are subject to file size and monthly storage limits based on your plan.

<Info>
  **Need higher limits?** Contact us about Enterprise plans with custom upload
  limits tailored to your needs.
</Info>

**File uploads**:

* Maximum file size per document varies by plan
* Batch uploads supported (multiple files at once)
* Storage quotas count only non-deleted documents

Deleting documents frees up quota for new uploads.

### Checking Your Usage

You can view your current usage in the Anam Lab UI at `/knowledge` or via the API:

```javascript theme={"system"}
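const response = await fetch("/v1/knowledge/usage", {
  headers: {
    Authorization: `Bearer ${apiKey}`,
  },
});

// Surface HTTP errors instead of parsing an error body as usage data
if (!response.ok) {
  throw new Error(`Failed to fetch usage: ${response.status}`);
}

const usage = await response.json();
console.log(`Used: ${usage.totalBytes} / ${usage.quotaBytes} bytes`);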
```
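
Before a large upload, it can help to check whether the file fits in your remaining quota. The sketch below assumes the usage response contains the `totalBytes` and `quotaBytes` fields used in the example above; `checkQuota` is a hypothetical helper, not part of the API.

```javascript
// Given a usage object and a file size in bytes, report the remaining quota
// and whether the file would fit. Assumes `usage.totalBytes` and
// `usage.quotaBytes` are numbers, as in the usage example above.
function checkQuota(usage, fileSize) {
  const remainingBytes = Math.max(0, usage.quotaBytes - usage.totalBytes);

  return {
    remainingBytes,
    fits: fileSize <= remainingBytes,
    // Guard against division by zero when no quota is reported.
    percentUsed:
      usage.quotaBytes > 0
        ? Math.round((usage.totalBytes / usage.quotaBytes) * 100)
        : 100,
  };
}
```

If `fits` is false, deleting outdated documents frees quota for the new upload.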

## Search Performance

### How Search Works

When a persona invokes a knowledge tool, the system runs a semantic search over the attached folders, ranks the matching chunks by relevance, and passes the best matches to the LLM, which uses them to answer the user's question.
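
This page does not describe how relevance is computed. As a general illustration of semantic ranking, the sketch below scores chunks by cosine similarity between embedding vectors, a common approach that may differ from what Anam uses. The function names and the tiny three-dimensional vectors are made up for the example; real embeddings come from a model and have many more dimensions.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0; // A zero vector has no direction, so treat it as unrelated.
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query and keep the `k` best matches.
function rankChunks(queryEmbedding, chunks, k = 3) {
  return chunks
    .map((chunk) => ({
      text: chunk.text,
      score: cosineSimilarity(queryEmbedding, chunk.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy data: hypothetical chunks with hand-written embeddings.
const chunks = [
  { text: 'To configure SSL, upload your certificate.', embedding: [0.9, 0.1, 0.0] },
  { text: 'Billing runs on the first of each month.', embedding: [0.0, 0.2, 0.9] },
  { text: 'TLS certificates renew automatically.', embedding: [0.8, 0.3, 0.1] },
];

// A query vector pointing in the "SSL" direction ranks the two
// certificate-related chunks ahead of the billing chunk.
const topMatches = rankChunks([1, 0, 0], chunks, 2);
```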

### Improving Search Results

**Use descriptive folder names and document titles**:
Metadata helps the system understand context.

**Keep documents focused on specific topics**:
Instead of one 500-page manual, upload focused documents on individual topics.

**Update documents regularly**:
Delete outdated documents and upload current versions.

**Organize by knowledge domain**:
Create separate folders for different areas:

* Product documentation
* FAQs
* Policies
* Troubleshooting guides

## Using Knowledge Tools

Once your documents are uploaded and processed, create knowledge tools to enable search:

```typescript theme={"system"}
// Knowledge tool that searches two document folders
const searchProductDocs = {
  type: 'server',
  subtype: 'knowledge',
  name: 'search_product_docs',
  description: 'Search product documentation when users ask technical questions about features, installation, or usage',
  documentFolderIds: ['550e8400-e29b-41d4-a716-446655440000', '6ba7b810-9dad-11d1-80b4-00c04fd430c8']
};
```

Attach the tool to a persona, and the LLM will invoke it when relevant:

**User**: "How do I configure SSL?"

**LLM internal process**:

1. Recognizes this is a technical question
2. Invokes `search_product_docs` with query "SSL configuration"
3. Receives relevant chunks from documentation
4. Generates response: "To configure SSL, you'll need to..."

<Note>
  Learn more about creating and using knowledge tools in the [Tools
  documentation](/personas/tools/overview).
</Note>
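
If you organize folders by knowledge domain, one tool per domain gives the LLM a clearer signal about which folder to search. A minimal sketch: `makeKnowledgeTool` is a hypothetical helper that reuses the field names from the tool definition above, its checks are local sanity checks rather than documented API rules, and the second tool's name and description are invented for illustration.

```javascript
// Build a knowledge tool definition with the same fields as the example above.
function makeKnowledgeTool({ name, description, documentFolderIds }) {
  if (!name || !description) {
    throw new Error('name and description are required');
  }
  if (!Array.isArray(documentFolderIds) || documentFolderIds.length === 0) {
    throw new Error('at least one folder ID is required');
  }

  return {
    type: 'server',
    subtype: 'knowledge',
    name,
    description,
    // Copy the array so later changes to the input do not alter the tool.
    documentFolderIds: [...documentFolderIds],
  };
}

// One tool per knowledge domain, each pointing at its own folder.
const tools = [
  makeKnowledgeTool({
    name: 'search_product_docs',
    description: 'Search product documentation when users ask technical questions about features, installation, or usage',
    documentFolderIds: ['550e8400-e29b-41d4-a716-446655440000'],
  }),
  makeKnowledgeTool({
    name: 'search_billing_faq',
    description: 'Search the billing FAQ when users ask about invoices, payments, or refunds',
    documentFolderIds: ['6ba7b810-9dad-11d1-80b4-00c04fd430c8'],
  }),
];
```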

## Security and Privacy

### Organization Isolation

All knowledge base data is organization-scoped:

* Users can only access their organization's folders and documents
* API requests are filtered by organization ID at the database level
* Even with knowledge of folder IDs, users cannot access other organizations' data

<Warning>
  Always use HTTPS for API requests. Never commit API keys to version control or
  expose them client-side.
</Warning>
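
One common way to keep keys out of source control and client bundles is to read them from an environment variable on your server. The sketch below assumes a Node.js backend; `ANAM_API_KEY` is a hypothetical variable name, and both helpers are illustrative rather than part of any SDK.

```javascript
// Read the API key from the server's environment (Node.js).
// ANAM_API_KEY is a hypothetical variable name; use whatever your
// deployment defines. `env` is injectable to make the function testable.
function getApiKey(env = process.env) {
  const apiKey = env.ANAM_API_KEY;
  if (!apiKey) {
    throw new Error('ANAM_API_KEY is not set');
  }
  return apiKey;
}

// Build the request headers used by the API examples on this page.
function authHeaders(apiKey) {
  return {
    Authorization: `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
  };
}
```

Your browser code can then call your own backend, which adds these headers before forwarding requests, so the key never reaches the client.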

## Troubleshooting

### No Search Results

**Possible causes**:

* Documents still in `PROCESSING` status (wait \~30 seconds)
* Query semantically unrelated to document content
* Folder contains no `READY` documents
* Documents in the wrong folder (check folder assignments)

**Solutions**:

1. Check the document status in the UI or via the API
2. Test the query using the debug modal (`Ctrl+Shift+K`)
3. Try rephrasing the query with more specific terms
4. Verify that the folder contains relevant documents
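
The first and last checks above can be scripted once you have a folder's document list. A rough sketch: `diagnoseFolder` is a hypothetical helper, and the `{ filename, status }` shape of each document is an assumption about how you represent the list, not a documented response format.

```javascript
// Summarize a folder's documents and suggest the most likely fix.
// `documents` is assumed to be an array of { filename, status } objects,
// where status is 'PROCESSING', 'READY', or 'FAILED'.
function diagnoseFolder(documents) {
  const count = (status) =>
    documents.filter((doc) => doc.status === status).length;

  const ready = count('READY');
  const processing = count('PROCESSING');
  const failed = count('FAILED');

  let advice;
  if (documents.length === 0) {
    advice = 'Folder has no documents: check folder assignments.';
  } else if (ready === 0 && processing > 0) {
    advice = 'Documents are still processing: retry shortly.';
  } else if (ready === 0) {
    advice = 'No READY documents: re-upload the failed files.';
  } else {
    advice = 'Folder has READY documents: try rephrasing the query.';
  }

  return { ready, processing, failed, advice };
}
```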

### Upload Failures

**File too large** (over 50MB, or your plan's per-document limit):

* Split the document into smaller files
* Remove images from PDFs
* Remove unnecessary pages

**Processing failed**:

* Check that the file isn't corrupted
* Verify that the file format is supported
* Try re-uploading the file
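
Some of these failures can be caught before the upload starts. The sketch below is a client-side pre-check under stated assumptions: the extension list is taken from the Document Processing table and may not be exhaustive, the 50MB limit is treated as 50 × 1024 × 1024 bytes and should be adjusted to your plan, and `validateForUpload` is a hypothetical helper.

```javascript
// Extensions listed in the Document Processing table (may not be exhaustive).
const SUPPORTED_EXTENSIONS = ['pdf', 'txt', 'md', 'docx', 'csv', 'json', 'log'];

// Assumption: 50MB limit, interpreted as mebibytes. Adjust to your plan.
const MAX_FILE_BYTES = 50 * 1024 * 1024;

// Return a list of problems found; an empty list means the file looks OK.
// `file` needs only `name` and `size`, so a browser File object works.
function validateForUpload(file) {
  const problems = [];

  const hasExtension = file.name.includes('.');
  const extension = hasExtension
    ? file.name.split('.').pop().toLowerCase()
    : '';

  if (!SUPPORTED_EXTENSIONS.includes(extension)) {
    problems.push(`Unsupported file type: ${extension || '(none)'}`);
  }
  if (file.size === 0) {
    problems.push('File is empty');
  }
  if (file.size > MAX_FILE_BYTES) {
    problems.push(`File exceeds ${MAX_FILE_BYTES} bytes`);
  }

  return problems;
}
```

The server remains the source of truth for limits and supported formats, so treat an empty result as "worth trying" rather than a guarantee.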

### Slow Processing

**Normal processing time**: \~30 seconds for typical documents

**Longer processing times** may occur with:

* Very large files (40-50MB)
* Complex PDFs with many images
* High system load

<Info>
  You can upload multiple documents simultaneously. The system processes up to 4
  documents concurrently.
</Info>
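
When uploading many files from your own code, capping the number of uploads in flight keeps the client from starting everything at once. The sketch below is a generic concurrency limiter, not an Anam API. The default of 4 simply mirrors the processing concurrency mentioned above and is a judgment call rather than a documented requirement.

```javascript
// Run async task functions with at most `limit` in flight at once.
// `tasks` is an array of functions that each return a promise.
// Results are returned in the same order as `tasks`. If any task rejects,
// the returned promise rejects with that error.
async function runWithConcurrency(tasks, limit = 4) {
  const results = new Array(tasks.length);
  let nextIndex = 0;

  // Each worker repeatedly claims the next unstarted task until none remain.
  async function worker() {
    while (nextIndex < tasks.length) {
      const index = nextIndex++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Example usage with the uploadDocument function shown earlier on this page
// (inside an async function):
//
//   const results = await runWithConcurrency(
//     files.map((file) => () => uploadDocument(file, folderId, apiKey)),
//     4
//   );
```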

## Next Steps

<CardGroup cols={2}>
  <Card title="Cookbook: RAG Knowledge Base" icon="book-open" href="https://anam.ai/cookbook/rag-knowledge-base">
    Tutorial for adding a RAG knowledge base to your avatar agent
  </Card>

  <Card title="Knowledge Base Setup" icon="upload" href="/personas/knowledge/setup">
    Step-by-step guide to creating folders and uploading documents
  </Card>

  <Card title="Uploading Documents" icon="file-arrow-up" href="/personas/knowledge/uploading-documents">
    Detailed guide on the upload process and batch uploads
  </Card>

  <Card title="Tools Overview" icon="wrench" href="/personas/tools/overview">
    Learn about all available tool types including knowledge tools
  </Card>
</CardGroup>
