# Environment Variables

## Database Configuration
- `DATABASE_URL`: PostgreSQL connection string
  - Example: `postgresql://username:password@localhost:5432/claim_guard_db`

## OpenAI Configuration
- `OPENAI_API_KEY`: OpenAI API key used to generate embeddings
- `OPENAI_API_MODEL`: OpenAI embedding model (default: `text-embedding-ada-002`)

## Vector Search Configuration
- `VECTOR_SIMILARITY_THRESHOLD`: Minimum similarity threshold for vector search (default: `0.85`)
  - Range: 0.0 to 1.0
  - Higher values mean stricter matching
  - Recommended: 0.85 for production, 0.7 for development
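
As a minimal sketch of how the threshold could be read and validated at startup: the helper name `parseSimilarityThreshold` is hypothetical, and throwing on invalid input (rather than silently falling back to the default) is an assumption, not documented application behavior.

```typescript
// Documented default for VECTOR_SIMILARITY_THRESHOLD.
const DEFAULT_THRESHOLD = 0.85;

// Hypothetical helper: parse the raw environment value into a number in [0.0, 1.0].
function parseSimilarityThreshold(raw: string | undefined): number {
  // Unset or empty: use the documented default.
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_THRESHOLD;
  }
  const value = Number(raw);
  // Reject non-numeric input and anything outside the documented 0.0 to 1.0 range.
  if (!Number.isFinite(value) || value < 0 || value > 1) {
    throw new RangeError(
      `VECTOR_SIMILARITY_THRESHOLD must be a number between 0.0 and 1.0, got "${raw}"`
    );
  }
  return value;
}

// Usage in a Node.js app:
// const threshold = parseSimilarityThreshold(process.env.VECTOR_SIMILARITY_THRESHOLD);
```

Failing fast on a malformed value makes a typo in the `.env` file visible at startup instead of surfacing later as unexpectedly empty or noisy search results.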

## Application Configuration
- `PORT`: Application port (default: `3000`)
- `NODE_ENV`: Environment mode (`development` or `production`)

## Example .env file
```bash
# Database
DATABASE_URL="postgresql://username:password@localhost:5432/claim_guard_db"

# OpenAI
OPENAI_API_KEY="your-openai-api-key-here"
OPENAI_API_MODEL="text-embedding-ada-002"

# Vector Search
VECTOR_SIMILARITY_THRESHOLD=0.85

# App
PORT=3000
NODE_ENV=development
```
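
The following sketch shows one way these variables might be loaded into a typed configuration object. The `AppConfig` shape, the `loadConfig` and `requireVar` names, treating `DATABASE_URL` and `OPENAI_API_KEY` as required, and the `development` fallback for `NODE_ENV` are all assumptions made for illustration. The similarity threshold is left out to keep the sketch short.

```typescript
// Hypothetical config shape; field names are illustrative, not taken from the application.
interface AppConfig {
  databaseUrl: string;
  openaiApiKey: string;
  openaiApiModel: string;
  port: number;
  nodeEnv: string;
}

type Env = Record<string, string | undefined>;

// Return the variable's value, or throw if it is missing or empty.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

function loadConfig(env: Env): AppConfig {
  const rawPort = env["PORT"] ?? "3000"; // documented default
  const port = Number(rawPort);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${rawPort}"`);
  }
  return {
    databaseUrl: requireVar(env, "DATABASE_URL"),
    openaiApiKey: requireVar(env, "OPENAI_API_KEY"),
    openaiApiModel: env["OPENAI_API_MODEL"] ?? "text-embedding-ada-002", // documented default
    port,
    nodeEnv: env["NODE_ENV"] ?? "development", // assumed fallback; no default is documented
  };
}

// Usage in a Node.js app:
// const config = loadConfig(process.env);
```

Passing the environment in as an argument, instead of reading `process.env` inside the function, keeps the loader easy to exercise with plain objects in tests.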

## Similarity Threshold Guidelines

### Production Environment
- **High Precision**: 0.90 - 0.95 (very strict matching)
- **Standard**: 0.85 - 0.90 (recommended for most use cases)
- **Balanced**: 0.80 - 0.85 (good balance between precision and recall)

### Development Environment
- **Testing**: 0.70 - 0.80 (more lenient for testing)
- **Debugging**: 0.60 - 0.70 (very lenient for development)
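
The tiers above can be expressed as a small lookup, for example to log which guideline a configured threshold falls under. This is only a sketch: `describeThreshold` is a hypothetical helper, and because adjacent ranges share a boundary value, it assumes each range includes its lower bound, so `0.85` counts as Standard and `0.90` as High Precision.

```typescript
// Guideline tiers ordered from strictest to most lenient.
// Assumption: each range includes its lower bound, so a shared boundary
// value falls into the stricter tier.
const TIERS: Array<{ name: string; min: number }> = [
  { name: "High Precision", min: 0.9 },
  { name: "Standard", min: 0.85 },
  { name: "Balanced", min: 0.8 },
  { name: "Testing", min: 0.7 },
  { name: "Debugging", min: 0.6 },
];

const OUTSIDE = "Outside documented ranges";

// Hypothetical helper: name the guideline tier for a given threshold.
function describeThreshold(threshold: number): string {
  // The guidelines only cover 0.60 through 0.95.
  if (threshold > 0.95 || threshold < 0.6) {
    return OUTSIDE;
  }
  for (const tier of TIERS) {
    if (threshold >= tier.min) {
      return tier.name;
    }
  }
  // Reached only for NaN, which fails every comparison above.
  return OUTSIDE;
}
```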

### How to Set Threshold

#### Via Environment Variable
```bash
export VECTOR_SIMILARITY_THRESHOLD=0.90
```

#### Via .env file
```bash
VECTOR_SIMILARITY_THRESHOLD=0.90
```

#### Via API (Runtime)
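```bash
# Assumes the app is running locally on the default PORT (3000); adjust host and port as needed.
curl -X POST "http://localhost:3000/api/pgvector/threshold" \
  -H "Content-Type: application/json" \
  -d '{"threshold": 0.90}'
```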

## Impact of Threshold Changes

- **Higher Threshold (0.90 and above)**: Fewer results, higher precision; the matches returned are more likely to be relevant
- **Lower Threshold (0.70 and below)**: More results, lower precision; results may include less relevant matches
- **Optimal Range (0.80 - 0.90)**: Good balance between precision and recall for most medical coding use cases
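
To make the trade-off concrete, the sketch below applies three thresholds to the same candidate list. The `Match` shape, the `filterByThreshold` helper, and the scores are made up for illustration. It also assumes a match is kept when its similarity is greater than or equal to the threshold; the application may apply the cutoff inside the database query instead.

```typescript
// Hypothetical match shape: a candidate code plus its similarity score in [0, 1].
interface Match {
  code: string;
  similarity: number;
}

// Keep only matches whose similarity meets the minimum threshold.
function filterByThreshold(matches: Match[], threshold: number): Match[] {
  return matches.filter((m) => m.similarity >= threshold);
}

// Made-up scores for illustration only; the codes are placeholders.
const candidates: Match[] = [
  { code: "A", similarity: 0.93 },
  { code: "B", similarity: 0.87 },
  { code: "C", similarity: 0.78 },
  { code: "D", similarity: 0.66 },
];

const strict = filterByThreshold(candidates, 0.9); // keeps A
const standard = filterByThreshold(candidates, 0.85); // keeps A and B
const lenient = filterByThreshold(candidates, 0.7); // keeps A, B, and C
```

Raising the threshold from 0.70 to 0.90 shrinks the result set from three candidates to one, which is the precision and recall trade-off described above.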