Mirror of https://github.com/ferdzo/iotDashboard.git (synced 2026-04-05 17:16:26 +00:00)
# Database Writer Service

A robust, production-ready service that reads sensor data from Redis streams and writes it to PostgreSQL/TimescaleDB. Part of the IoT Dashboard project.

## Features

- ✅ **Reliable consumption** from Redis streams using consumer groups
- ✅ **Batch processing** for high throughput
- ✅ **At-least-once delivery** with message acknowledgments
- ✅ **Dead letter queue** for failed messages
- ✅ **Connection pooling** for database efficiency
- ✅ **Graceful shutdown** handling
- ✅ **Versioned schema** managed with Alembic migrations
- ✅ **Structured logging** with JSON output
- ✅ **Health checks** for monitoring
- ✅ **TimescaleDB support** for time-series optimization

## Architecture

```
Redis Streams → Consumer Group → Transform → Database → Acknowledge
                                     ↓
                              Failed messages
                                     ↓
                             Dead Letter Queue
```

### Components

- **`main.py`**: Service orchestration and processing loop
- **`redis_reader.py`**: Redis stream consumer with fault tolerance
- **`db_writer.py`**: Database operations with connection pooling
- **`models.py`**: SQLAlchemy models defining the database schema
- **`schema.py`**: Data transformation and validation
- **`config.py`**: Configuration management
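The loop in `main.py` is more elaborate, but the write/ack/dead-letter flow above can be sketched roughly as follows. The function and parameter names here are illustrative, not the service's actual API; the I/O callbacks stand in for the database writer, Redis `XACK`, and the DLQ producer:

```python
def process_batch(messages, write_fn, ack_fn, dead_letter_fn):
    """Write a batch to the database, ack successes, dead-letter failures.

    messages: list of (message_id, row_dict) pairs already transformed.
    write_fn / ack_fn / dead_letter_fn: injected I/O callbacks (illustrative).
    """
    written, failed = 0, 0
    for msg_id, row in messages:
        try:
            write_fn(row)               # INSERT into sensor_readings
            ack_fn(msg_id)              # XACK: remove from the pending list
            written += 1
        except Exception as exc:        # failed rows go to the DLQ instead
            dead_letter_fn(msg_id, row, str(exc))
            failed += 1
    return {"written": written, "failed": failed}
```

Injecting the callbacks keeps the batch logic testable without a live Redis or database connection.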

## Quick Start

### Prerequisites

- Python 3.13+
- [uv](https://github.com/astral-sh/uv) package manager
- Redis server with streams
- PostgreSQL or TimescaleDB
### Installation

1. **Navigate to the service directory**:

   ```bash
   cd services/db_write
   ```

2. **Copy and configure environment variables**:

   ```bash
   cp .env.example .env
   # Edit .env with your DATABASE_URL and other settings
   ```

3. **Install dependencies**:

   ```bash
   uv sync
   ```

4. **Set up the database schema** (IMPORTANT - do this before running):

   ```bash
   # Review the schema in models.py first
   cat models.py

   # Create the initial migration
   chmod +x migrate.sh
   ./migrate.sh create "initial schema"

   # Review the generated migration
   ls -lt alembic/versions/

   # Apply migrations
   ./migrate.sh upgrade
   ```

5. **Run the service**:

   ```bash
   uv run main.py
   ```

Or use the standalone script:

```bash
chmod +x run-standalone.sh
./run-standalone.sh
```

### ⚠️ Important: Schema Management

This service uses **Alembic** for database migrations; it will NOT create tables automatically.

- The schema is defined in `models.py`
- Migrations are managed with `./migrate.sh` or `alembic` commands
- See `SCHEMA_MANAGEMENT.md` for a detailed guide

## Schema Management

This service uses **SQLAlchemy** for models and **Alembic** for migrations.

### Key Files

- **`models.py`**: Define your database schema here (SQLAlchemy models)
- **`alembic/`**: Migration scripts directory
- **`migrate.sh`**: Helper script for common migration tasks
- **`SCHEMA_MANAGEMENT.md`**: Comprehensive migration guide

### Quick Migration Commands

```bash
# Create a new migration after editing models.py
./migrate.sh create "add new column"

# Apply pending migrations
./migrate.sh upgrade

# Check migration status
./migrate.sh check

# View migration history
./migrate.sh history

# Roll back the last migration
./migrate.sh downgrade 1
```

**See `SCHEMA_MANAGEMENT.md` for detailed documentation.**

## Configuration

All configuration is done via environment variables. See `.env.example` for all available options.

### Required Settings

```bash
# Redis connection
REDIS_HOST=localhost
REDIS_PORT=6379

# Database connection
DATABASE_URL=postgresql://user:password@localhost:5432/iot_dashboard
```

### Optional Settings

```bash
# Consumer configuration
CONSUMER_GROUP_NAME=db_writer    # Consumer group name
CONSUMER_NAME=worker-01          # Unique consumer name
BATCH_SIZE=100                   # Messages per batch
BATCH_TIMEOUT_SEC=5              # Read timeout
PROCESSING_INTERVAL_SEC=1        # Delay between batches

# Stream configuration
STREAM_PATTERN=mqtt_stream:*     # Stream name pattern
DEAD_LETTER_STREAM=mqtt_stream:failed

# Database
TABLE_NAME=sensor_readings       # Target table name
ENABLE_TIMESCALE=false           # Use TimescaleDB features

# Logging
LOG_LEVEL=INFO                   # DEBUG, INFO, WARNING, ERROR
LOG_FORMAT=json                  # json or console
```
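`config.py`'s actual implementation may differ, but reading these variables with typed defaults can be sketched like this (the `Settings` fields shown are a subset, chosen for illustration):

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Illustrative subset of the service's settings; config.py may differ."""
    redis_host: str
    redis_port: int
    batch_size: int
    stream_pattern: str


def load_settings() -> Settings:
    """Build Settings from environment variables, with sane defaults."""
    env = os.environ
    return Settings(
        redis_host=env.get("REDIS_HOST", "localhost"),
        redis_port=int(env.get("REDIS_PORT", "6379")),
        batch_size=int(env.get("BATCH_SIZE", "100")),
        stream_pattern=env.get("STREAM_PATTERN", "mqtt_stream:*"),
    )
```

Converting at load time (`int(...)`) means a malformed value fails fast at startup rather than mid-batch.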

## Data Flow

### Input (Redis Streams)

The service reads from Redis streams named with the format:

```
mqtt_stream:{device_id}:{sensor_type}
```

Each message contains:

```
{
  "value": "23.5",
  "timestamp": "2023-10-18T14:30:00Z",
  "metadata": "{...}"          (optional)
}
```
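The stream key and message fields map onto a table row roughly as follows. This is a simplified sketch of what `schema.py` does; the real transform also validates input and handles the optional `metadata` field:

```python
def parse_stream_key(key: str) -> tuple[str, str]:
    """Split 'mqtt_stream:{device_id}:{sensor_type}' into its parts."""
    prefix, device_id, sensor_type = key.split(":", 2)
    if prefix != "mqtt_stream":
        raise ValueError(f"unexpected stream key: {key!r}")
    return device_id, sensor_type


def to_row(stream_key: str, message: dict) -> dict:
    """Build a sensor_readings row from one stream message."""
    device_id, sensor_type = parse_stream_key(stream_key)
    return {
        "timestamp": message["timestamp"],
        "device_id": device_id,
        "sensor_type": sensor_type,
        "value": float(message["value"]),   # stored as DOUBLE PRECISION
    }
```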

### Output (Database)

Data is written to the `sensor_readings` table:

```sql
CREATE TABLE sensor_readings (
    id BIGSERIAL PRIMARY KEY,
    timestamp TIMESTAMPTZ NOT NULL,
    device_id VARCHAR(100) NOT NULL,
    sensor_type VARCHAR(100) NOT NULL,
    value DOUBLE PRECISION NOT NULL,
    metadata JSONB,
    created_at TIMESTAMPTZ DEFAULT NOW()
);
```

**Note**: The table is created by the Alembic migrations (see Schema Management); the service does not create it automatically.

## Running with Docker

### Build the image

```bash
docker build -t db-writer:latest .
```

### Run the container

```bash
docker run -d \
  --name db-writer \
  -e REDIS_HOST=redis \
  -e DATABASE_URL=postgresql://user:pass@postgres:5432/iot \
  db-writer:latest
```

## Consumer Groups

The service uses Redis consumer groups for reliable, distributed processing:

- **Multiple instances**: Run multiple workers for load balancing
- **Fault tolerance**: Messages are not lost if a consumer crashes
- **Acknowledgments**: Messages are only removed after successful processing
- **Pending messages**: Unacknowledged messages can be reclaimed

### Running Multiple Workers

```bash
# Terminal 1
CONSUMER_NAME=worker-01 uv run main.py

# Terminal 2
CONSUMER_NAME=worker-02 uv run main.py
```

All workers in the same consumer group share the load.

## Error Handling

### Dead Letter Queue

Failed messages are sent to the dead letter stream (`mqtt_stream:failed`) along with error information:

```
{
  "original_stream": "mqtt_stream:esp32:temperature",
  "original_id": "1634567890123-0",
  "device_id": "esp32",
  "sensor_type": "temperature",
  "value": "23.5",
  "error": "Database connection failed",
  "failed_at": "1634567890.123"
}
```
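Building such an entry from a failed message can be sketched as follows; the helper name is illustrative, and the service's actual DLQ code may carry additional fields:

```python
import time


def build_dlq_entry(stream: str, msg_id: str, fields: dict,
                    error: Exception) -> dict:
    """Flatten a failed message into the dead-letter format shown above."""
    # stream looks like 'mqtt_stream:{device_id}:{sensor_type}'
    device_id, sensor_type = stream.split(":", 2)[1:]
    return {
        "original_stream": stream,
        "original_id": msg_id,
        "device_id": device_id,
        "sensor_type": sensor_type,
        "value": fields.get("value", ""),
        "error": str(error),
        "failed_at": f"{time.time():.3f}",   # Unix timestamp, as a string
    }
```

Every value is a string, because Redis stream fields are stored as flat string pairs.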

### Retry Strategy

- **Transient errors**: Automatic retry with backoff
- **Data errors**: Sent straight to the DLQ
- **Connection errors**: Reconnection attempts
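A minimal sketch of the transient-error path; the attempt count and backoff constants here are illustrative, not the service's actual values:

```python
import time


def retry_with_backoff(operation, max_attempts=3, base_delay=0.1):
    """Retry a transient operation, doubling the delay between attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise                # exhausted: let the caller dead-letter it
            time.sleep(base_delay * 2 ** (attempt - 1))
```

Data errors (bad values, schema mismatches) are deliberately not caught here; retrying them would never succeed, so they go straight to the DLQ.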

## Monitoring

### Health Checks

Check service health programmatically:

```python
from main import DatabaseWriterService

service = DatabaseWriterService()
health = service.health_check()
print(health)
# {
#     'running': True,
#     'redis': True,
#     'database': True,
#     'stats': {...}
# }
```

### Logs

The service emits structured logs:

```json
{
  "event": "Processed batch",
  "rows_written": 100,
  "messages_acknowledged": 100,
  "timestamp": "2023-10-18T14:30:00Z",
  "level": "info"
}
```
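The service's logging setup isn't shown here, and it may use a dedicated structured-logging library; one way to get similar JSON output with only the standard library is:

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "event": record.getMessage(),
            "level": record.levelname.lower(),
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%SZ"),
        }
        # Extra key/value pairs attached via extra={"fields": {...}}
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("db_writer")
log.addHandler(handler)
log.setLevel(logging.INFO)

log.info("Processed batch", extra={"fields": {"rows_written": 100}})
```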

### Statistics

The service tracks runtime statistics:

- `messages_read`: Total messages consumed
- `messages_written`: Total rows inserted
- `messages_failed`: Failed messages sent to the DLQ
- `batches_processed`: Number of successful batches
- `errors`: Total errors encountered

## Development

### Project Structure

```
db_write/
├── config.py              # Configuration management
├── db_writer.py           # Database operations
├── redis_reader.py        # Redis stream consumer
├── schema.py              # Data models and transformation
├── models.py              # SQLAlchemy models
├── main.py                # Service entry point
├── alembic/               # Migration scripts
├── migrate.sh             # Migration helper script
├── pyproject.toml         # Dependencies
├── .env.example           # Configuration template
└── README.md              # This file
```
### Adding Dependencies

```bash
uv add package-name
```

### Running Tests

```bash
uv run pytest
```

## Troubleshooting

### Service won't start

1. **Check configuration**: Verify all required environment variables are set
2. **Test connections**: Ensure Redis and PostgreSQL are accessible
3. **Check logs**: Look for specific error messages

### No messages being processed

1. **Check that streams exist**: `redis-cli KEYS "mqtt_stream:*"`
2. **Verify the consumer group**: The service creates it automatically, but check the Redis logs
3. **Check the stream pattern**: Ensure `STREAM_PATTERN` matches your stream names

### Messages going to the dead letter queue

1. **Check the DLQ**: `redis-cli XRANGE mqtt_stream:failed - + COUNT 10`
2. **Review error messages**: Each DLQ entry contains the error reason
3. **Validate the data format**: Ensure messages match the expected schema

### High memory usage

1. **Reduce the batch size**: Lower `BATCH_SIZE` in the configuration
2. **Check the connection pool**: You may need to adjust the pool size
3. **Monitor pending messages**: Use `XPENDING` to check the backlog

## Performance Tuning

### Throughput Optimization

- **Increase batch size**: Process more messages per batch
- **Multiple workers**: Run more consumer instances
- **Connection pooling**: Adjust the pool size based on load
- **Processing interval**: Reduce the delay between batches
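These batching knobs boil down to a flush policy: write when the batch is full, or when it has waited too long. An illustrative sketch of that policy (not the service's actual batching code; the injectable `clock` exists only to make it testable):

```python
import time


class BatchBuffer:
    """Accumulate items and flush when the batch is full or stale."""

    def __init__(self, flush_fn, batch_size=100, timeout_sec=5.0, clock=None):
        self._flush_fn = flush_fn
        self._batch_size = batch_size
        self._timeout = timeout_sec
        self._clock = clock or time.monotonic
        self._items = []
        self._started = None          # when the current batch began filling

    def add(self, item):
        if self._started is None:
            self._started = self._clock()
        self._items.append(item)
        full = len(self._items) >= self._batch_size
        stale = self._clock() - self._started >= self._timeout
        if full or stale:
            self.flush()

    def flush(self):
        if self._items:
            self._flush_fn(self._items)
            self._items, self._started = [], None
```

A larger `batch_size` raises throughput at the cost of latency; a smaller `timeout_sec` does the opposite, which is the trade-off the two subsections here describe.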

### Latency Optimization

- **Decrease batch size**: Process smaller batches more frequently
- **Reduce the timeout**: Lower `BATCH_TIMEOUT_SEC`
- **Single worker**: Avoid consumer-group coordination overhead

## Production Deployment

### Recommended Settings

```bash
BATCH_SIZE=500
PROCESSING_INTERVAL_SEC=0.1
LOG_LEVEL=INFO
LOG_FORMAT=json
ENABLE_TIMESCALE=true
```

### Monitoring

- Monitor consumer lag with Redis `XPENDING`
- Track database insert latency
- Alert when the error rate exceeds 5%
- Monitor DLQ depth

### Scaling

1. **Horizontal**: Add more consumer instances, each with a unique `CONSUMER_NAME`
2. **Vertical**: Increase resources for database writes
3. **Database**: Use TimescaleDB for better time-series performance

## License

Part of the IoT Dashboard project.