Adding Documents To Your Knowledge Base
Add content to your knowledge base by indexing web pages or uploading files. Both methods process documents automatically, and most content becomes searchable and available to Nobi within a few minutes.
Adding Documents from URLs
Index web pages by providing their URLs. Nobi fetches the content, extracts text, and processes it for semantic search.
How to Add a URL
- Go to Knowledge Base in your dashboard
- Click "Add Document"
- Make sure the Index URL tab is selected (it should open by default)
- Enter the full URL (must start with https://)
- Click "Add Document"
The document begins processing immediately. You'll see it appear in your document list with an "Indexing" status.
What Gets Indexed from Web Pages
Nobi extracts:
- Main content - Article text, documentation, policy information
- Headings and structure - Helps understand content organization
- Lists and tables - Preserves structured information
- Metadata - Page title, description for document preview
Nobi automatically filters out:
- Navigation menus
- Footers and boilerplate
- Sidebar content
- JavaScript and CSS
- Advertisements
This ensures only substantive content is indexed.
Best URLs to Index
Good candidates:
- Help center articles
- Policy pages (shipping, returns, privacy)
- Product documentation
- FAQ pages
- Size guides and fit information
- About us and company information
- Blog posts with evergreen content
Avoid indexing:
- Product listing pages (use product catalog instead)
- Cart and checkout pages
- Login/account pages
- Pages that are primarily navigation
- Admin or internal-only pages
- Pages that change frequently
URL Requirements
URLs must:
- Start with https:// (HTTP is not supported)
- Be publicly accessible (no login required)
- Contain text content (not just images or videos)
- Be from a domain you own or have permission to index
Nobi cannot index:
- Pages behind login walls
- Password-protected content
- Private Google Docs or similar
- Pages that require authentication
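The requirements above can be partly checked before you submit a URL. The following is a minimal sketch, not part of Nobi: `check_url` and the `AVOID_SEGMENTS` set are hypothetical names, and the path segments are an assumed heuristic based on the "Avoid indexing" list. It only inspects the URL string, so it cannot confirm that a page is publicly accessible or contains text.

```python
from urllib.parse import urlparse

# Hypothetical heuristic: path segments that usually point at pages
# the guidance above says to avoid (cart, checkout, login, admin).
AVOID_SEGMENTS = {"cart", "checkout", "login", "account", "admin"}


def check_url(url: str) -> list[str]:
    """Return a list of problems; an empty list means the local checks passed."""
    problems = []
    parsed = urlparse(url.strip())

    # urlparse lowercases the scheme, so "HTTPS://" also passes.
    if parsed.scheme != "https":
        problems.append("URL must start with https://")
    elif not parsed.netloc:
        problems.append("URL is missing a domain")

    segments = {s for s in parsed.path.lower().split("/") if s}
    flagged = sorted(segments & AVOID_SEGMENTS)
    if flagged:
        problems.append("path looks like a page to avoid: " + ", ".join(flagged))

    return problems


if __name__ == "__main__":
    for candidate in (
        "https://example.com/pages/shipping-policy",
        "http://example.com/faq",
        "https://example.com/account/login",
    ):
        print(candidate, "->", check_url(candidate) or "ok")
```

A URL that passes these checks can still fail indexing, for example if the page redirects to a login screen.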
Processing Time
Most web pages are fully indexed within:
- Simple pages - 30-60 seconds
- Long-form content - 1-2 minutes
- Large documents - Up to 3-4 minutes
You'll see the status update from "Indexing" to "Ready" when processing completes.
Uploading Documents
Upload PDFs and other supported files directly through the dashboard. Files are stored securely in your private storage and processed the same way as web pages.
How to Upload a Document
- Go to Products → Knowledge Base in your dashboard
- Click "Add Document"
- Select "Upload File"
- Drag and drop a PDF, or click to browse files
- Select your file
- Click "Upload"
The document begins processing immediately. You'll see it appear in your document list with an "Indexing" status.
Supported File Types
Currently supported:
- PDF - All standard PDF documents
- TXT - Plain text files
- DOC/DOCX (beta) - Word documents, currently in beta
What Gets Indexed from PDFs
Nobi extracts:
- All text content - Including body text, headings, captions
- Document structure - Preserves section organization
- Tables - Extracts tabular data where possible
- Metadata - Document title, author, creation date
Nobi automatically filters out:
- Headers and footers
- Page numbers
- Watermarks
- Form fields (unless filled in)
Best Documents to Upload
Good candidates:
- Product manuals and guides
- Company policy documents
- Return and warranty information
- Technical specifications
- Training materials
- White papers and research
- Care instructions
Avoid uploading:
- Scanned images without OCR text
- Forms meant to be filled out
- Documents that are primarily images or diagrams
- Internal-only or confidential information
- Documents that change frequently
File Requirements
PDF files must:
- Contain selectable text (scanned, image-only PDFs that would need OCR are not yet supported)
- Be under 25 MB in size
- Not be password-protected
- Be readable (not corrupted)
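Some of these requirements can be checked locally before uploading. The sketch below is an illustration under stated assumptions, not Nobi's own validation: `check_pdf` is a hypothetical helper, "25 MB" is assumed to mean 25,000,000 bytes, and the search for `/Encrypt` is a rough heuristic for password protection. It does not verify that the PDF contains selectable text, which would need a PDF parsing library.

```python
from pathlib import Path

# Assumption: "25 MB" is treated as 25,000,000 bytes (the stricter reading).
MAX_UPLOAD_BYTES = 25_000_000


def check_pdf(path: str) -> list[str]:
    """Return a list of problems; an empty list means the local checks passed."""
    file = Path(path)
    if not file.is_file():
        return ["file not found"]

    problems = []
    if file.stat().st_size >= MAX_UPLOAD_BYTES:
        problems.append("file is not under 25 MB")

    # Read at most the size limit; larger files are already flagged above.
    with file.open("rb") as fh:
        data = fh.read(MAX_UPLOAD_BYTES)

    if not data.startswith(b"%PDF-"):
        problems.append("missing %PDF- signature; file may be corrupted or not a PDF")
    elif b"/Encrypt" in data:
        # Heuristic only: encrypted PDFs reference an /Encrypt dictionary.
        problems.append("file appears to be encrypted or password-protected")

    return problems
```

Calling `check_pdf("manual.pdf")` on a healthy, unprotected PDF under the size limit returns an empty list.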
Storage and Security
Uploaded documents are:
- Stored in private, secure cloud storage
- Organized by your merchant ID
- Not publicly accessible
- Only viewable through authenticated dashboard links
When customers view document sources, they see short-lived preview links that expire after viewing. The original files remain private.
Processing Time
PDF processing time depends on file size and complexity:
- Short PDFs (under 10 pages) - 30-60 seconds
- Medium PDFs (10-50 pages) - 1-3 minutes
- Long PDFs (50+ pages) - 3-5 minutes
Complex PDFs with tables, images, or unusual formatting may take slightly longer.
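The ranges above can be written as a small lookup, for example to set expectations in an internal checklist. This is only a sketch: `estimated_processing_time` is a hypothetical helper, and because the "10-50" and "50+" ranges overlap at exactly 50 pages, it assumes a 50-page PDF counts as long.

```python
def estimated_processing_time(pages: int) -> str:
    """Map a PDF page count to the typical processing range listed above."""
    if pages < 1:
        raise ValueError("pages must be at least 1")
    if pages < 10:
        return "30-60 seconds"
    if pages < 50:
        return "1-3 minutes"
    # Assumption: exactly 50 pages falls in the "50+ pages" range.
    return "3-5 minutes"
```

These are typical ranges, not guarantees; complex PDFs may exceed them.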
Document Limits
There are no hard limits on the number of documents you can index. However:
- Each document must be under 25 MB (for uploads)
- Processing is queued, so adding many documents at once may take time
- Very large knowledge bases (1000+ documents) may affect search performance
For best results, focus on quality over quantity. A curated set of 20-50 high-value documents often outperforms hundreds of marginally relevant ones.
Checking Document Status
After adding a document, monitor its status in the document list:
Indexing - Document is being processed. This usually takes 1-3 minutes.
Ready - Document is fully indexed and available for search. Nobi can now use this content to answer questions.
Error - Processing failed. Common causes:
- URL is not accessible or requires login
- PDF is password-protected or corrupted
- Document has no extractable text
- Server timeout or connection issue
If a document shows an error, try:
- Verifying the URL is publicly accessible
- Checking the PDF opens correctly
- Ensuring the file contains text (not just images)
- Re-uploading or re-indexing the document
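The status flow above (Indexing, then Ready or Error) can be modeled as a simple polling loop. This guide describes checking status in the dashboard only and does not document a status API, so the sketch below takes a caller-supplied `get_status` function as a purely hypothetical stand-in. The 300-second default timeout is an assumption based on the longest processing time listed in this guide.

```python
import time
from typing import Callable


def wait_until_done(
    get_status: Callable[[], str],
    timeout_s: float = 300.0,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the status is "Ready" or "Error", or raise TimeoutError.

    get_status is a hypothetical callable returning one of the three
    statuses named above: "Indexing", "Ready", or "Error".
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in ("Ready", "Error"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"document still {status!r} after {timeout_s} seconds")
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated status sequence standing in for a real lookup.
    fake = iter(["Indexing", "Indexing", "Ready"])
    print(wait_until_done(lambda: next(fake), sleep=lambda _s: None))
```

If the result is "Error", work through the troubleshooting steps above before re-adding the document.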
Best Practices
Start with Core Content
Begin with documents customers reference most:
- Shipping and delivery information
- Return and exchange policies
- FAQ pages
- Product care guides
Add more specialized content as needed.
Use Clear Document Structure
Well-structured documents work best:
- Clear headings - Help Nobi understand content organization
- Logical sections - Group related information together
- Concise paragraphs - Make information easy to extract
- Descriptive lists - Use bullet points for steps or options
Avoid dense walls of text without structure.
Keep URLs Current
For web pages:
- Index pages that are regularly maintained
- Avoid indexing pages that might be removed or moved
- Use permalinks rather than temporary URLs
- Update indexed URLs if page locations change
Test After Adding
After a document is indexed:
- Ask Nobi a question you know is answered in the document
- Verify the response includes accurate information
- Check that citations reference the correct document
- Try variations of the question to ensure consistent retrieval
This confirms content is properly accessible.
Organize in Batches
When adding multiple documents:
- Group related content (e.g., all policy documents)
- Add high-priority content first
- Wait for processing to complete before testing
- Review results before adding more
This makes it easier to identify issues and measure impact.
Regular Maintenance
Review your indexed documents periodically:
- Remove outdated content
- Add new documentation as it is created
- Update existing documents when policies change
- Force refresh after making significant updates
A well-maintained knowledge base provides better results than a neglected one.