# Environment Variables

## Database Configuration

- `DATABASE_URL`: PostgreSQL connection string
  - Example: `postgresql://username:password@localhost:5432/claim_guard_db`

## OpenAI Configuration

- `OPENAI_API_KEY`: Your OpenAI API key, used to generate embeddings
- `OPENAI_API_MODEL`: OpenAI model used for embeddings (default: `text-embedding-ada-002`)

## Vector Search Configuration

- `VECTOR_SIMILARITY_THRESHOLD`: Minimum similarity score a result must reach to be returned by vector search (default: `0.85`)
  - Range: 0.0 to 1.0
  - Higher values mean stricter matching
  - Recommended: 0.85 for production, 0.7 for development

## Application Configuration

- `PORT`: Application port (default: `3000`)
- `NODE_ENV`: Environment mode (`development` or `production`)

## Example .env file

```bash
# Database
DATABASE_URL="postgresql://username:password@localhost:5432/claim_guard_db"

# OpenAI
OPENAI_API_KEY="your-openai-api-key-here"
OPENAI_API_MODEL="text-embedding-ada-002"

# Vector Search
VECTOR_SIMILARITY_THRESHOLD=0.85

# App
PORT=3000
NODE_ENV=development
```

## Similarity Threshold Guidelines

### Production Environment

- **High Precision**: 0.90 - 0.95 (very strict matching)
- **Standard**: 0.85 - 0.90 (recommended for most use cases)
- **Balanced**: 0.80 - 0.85 (good balance between precision and recall)

### Development Environment

- **Testing**: 0.70 - 0.80 (more lenient for testing)
- **Debugging**: 0.60 - 0.70 (very lenient for development)

### How to Set Threshold

#### Via Environment Variable

```bash
export VECTOR_SIMILARITY_THRESHOLD=0.90
```

#### Via .env file

```bash
VECTOR_SIMILARITY_THRESHOLD=0.90
```

#### Via API (Runtime)

```bash
POST /api/pgvector/threshold
{
  "threshold": 0.90
}
```

## Impact of Threshold Changes

- **Higher Threshold (0.90+)**: Fewer results, higher precision, more relevant matches
- **Lower Threshold (0.70 and below)**: More results, lower precision, may include less relevant matches
- **Optimal Range (0.80-0.90)**: Good balance between precision and recall for most medical coding use cases
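
## Example: Validating the Threshold

The threshold rules above (range 0.0 to 1.0, default `0.85`, higher values returning fewer matches) can be sketched in TypeScript as follows. This is a minimal illustration, not the application's actual code: `parseSimilarityThreshold`, `filterBySimilarity`, and the `similarity` field are hypothetical names, and treating the threshold as an inclusive minimum (`>=`) is an assumption based on the "minimum similarity" wording.

```typescript
// Hypothetical sketch: names below are illustrative, not part of the project.
type Env = Record<string, string | undefined>;

// Documented default for VECTOR_SIMILARITY_THRESHOLD.
const DEFAULT_THRESHOLD = 0.85;

// Read the threshold from an env-like object, falling back to the default
// when the variable is unset or blank, and rejecting values outside 0.0-1.0.
function parseSimilarityThreshold(env: Env): number {
  const raw = env["VECTOR_SIMILARITY_THRESHOLD"];
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_THRESHOLD;
  }
  const value = Number(raw);
  if (!Number.isFinite(value) || value < 0 || value > 1) {
    throw new Error(
      `VECTOR_SIMILARITY_THRESHOLD must be a number between 0.0 and 1.0, got "${raw}"`
    );
  }
  return value;
}

// Keep only matches whose score meets the threshold (assumed inclusive).
function filterBySimilarity<T extends { similarity: number }>(
  matches: T[],
  threshold: number
): T[] {
  return matches.filter((m) => m.similarity >= threshold);
}

// Usage with made-up scores: a stricter threshold keeps fewer matches.
const sampleMatches = [
  { code: "A", similarity: 0.93 },
  { code: "B", similarity: 0.86 },
  { code: "C", similarity: 0.72 },
];
const strict = filterBySimilarity(sampleMatches, 0.9);
const lenient = filterBySimilarity(sampleMatches, 0.7);
console.log(strict.length, lenient.length);
```

In a Node.js service, the same helper could be called once at startup with `process.env`, so that a misconfigured threshold fails fast instead of silently changing search results.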