Testing
Running and writing tests for SurfSense
SurfSense uses pytest with two test layers: unit tests (no database) and integration tests (require PostgreSQL + pgvector). Tests are self-bootstrapping — they configure the test database, register a user, and clean up automatically.
Prerequisites
- PostgreSQL + pgvector running locally (the database `surfsense_test` will be used)
- `REGISTRATION_ENABLED=TRUE` in your `.env` (this is the default)
- A working LLM model with a valid API key in `global_llm_config.yaml` (for integration tests)
No Redis or Celery is required — integration tests use an inline task dispatcher.
Running Tests
Run all tests:
```bash
uv run pytest
```

Run by marker:

```bash
uv run pytest -m unit         # fast, no DB needed
uv run pytest -m integration  # requires PostgreSQL + pgvector
```

Available markers:
| Marker | Description |
|---|---|
| `unit` | Pure logic tests, no DB or external services |
| `integration` | Tests that require a real PostgreSQL database |
Useful flags:
| Flag | Description |
|---|---|
| `-s` | Show live output (useful for debugging polling loops) |
| `--tb=long` | Full tracebacks instead of short summaries |
| `-k "test_name"` | Run a single test by name |
| `-o addopts=""` | Override default flags from `pyproject.toml` |
Configuration
Default pytest options are defined in `surfsense_backend/pyproject.toml`:
```toml
[tool.pytest.ini_options]
addopts = "-v --tb=short -x --strict-markers -ra --durations=5"
```

- `-v` — verbose test names
- `--tb=short` — concise tracebacks on failure
- `-x` — stop on first failure
- `--strict-markers` — reject unregistered markers
- `-ra` — show a summary of all non-passing tests
- `--durations=5` — show the 5 slowest tests
Environment Variables
| Variable | Default | Description |
|---|---|---|
| `TEST_DATABASE_URL` | `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense_test` | Database URL for tests |
The test suite forces `DATABASE_URL` to point at the test database, so your production database is never touched.
Unit Tests
Pure logic tests that run without a database. They cover model validation, chunking, hashing, and summarization.
Integration Tests
Require PostgreSQL + pgvector. Split into two suites:
- `document_upload/` — Tests the HTTP API through public endpoints: upload, multi-file, duplicate detection, auth, error handling, page limits, and file size limits. Uses an in-process FastAPI client with `ASGITransport`.
- `indexing_pipeline/` — Tests pipeline internals directly: `prepare_for_indexing`, `index()`, and `index_uploaded_file()`, covering chunking, embedding, summarization, fallbacks, and error handling.
External boundaries (LLM, embedding, chunking, Redis) are mocked in both suites.
How It Works
- Database setup — `TEST_DATABASE_URL` defaults to `surfsense_test`. Tables and extensions (`vector`, `pg_trgm`) are created once per session and dropped after.
- Transaction isolation — Each test runs inside a savepoint that rolls back, so tests don't affect each other.
- User creation — Integration tests register a test user via `POST /auth/register` on first run, then log in for subsequent requests.
- Search space discovery — Tests call `GET /api/v1/searchspaces` and use the first available space.
- Cleanup — A session fixture purges stale documents before tests run. Per-test cleanup deletes documents via API, falling back to direct DB access for stuck records.
Writing New Tests
- Create a test file in the appropriate directory (`unit/` or `integration/`).
- Add the marker at the top of the file:

  ```python
  import pytest

  pytestmark = pytest.mark.integration  # or pytest.mark.unit
  ```

- Use fixtures from `conftest.py` — `client`, `headers`, `search_space_id`, and `cleanup_doc_ids` are available to integration tests. Unit tests get `make_connector_document` and sample ID fixtures.
- Register any new markers in `pyproject.toml` under `markers`.