Database Monitoring
Learn how to monitor your Solo Kit database for optimal performance, health, and reliability. This guide covers health checks, performance metrics, alerting strategies, and comprehensive observability practices.
Monitoring Overview
Solo Kit Monitoring Architecture
Solo Kit implements comprehensive database observability with:
- Real-time health checks: Continuous database connectivity monitoring
- Performance metrics: Query response times and resource utilization
- Connection monitoring: Connection pool health and usage patterns
- Error tracking: Automated error detection and alerting
- Operational dashboards: Visual monitoring and trending
Monitoring Components
Health Check APIs (/api/health/database, /api/healthz)
  ↓
Performance Metrics (latency, connection count, query stats)
  ↓
Alerting & Notifications (Sentry, custom alerts)
  ↓
Dashboards & Visualization (cloud provider dashboards)
Three-tier monitoring:
- Application-level: Health checks and performance metrics
- Database-level: PostgreSQL statistics and logs
- Infrastructure-level: Cloud provider monitoring (Neon, Supabase, etc.)
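Both health endpoints can be exercised directly during development. A minimal smoke-test script (the localhost:3000 base URL and the script location are assumptions for a local Next.js dev server):
// scripts/check-health.ts (hypothetical) - quick smoke test of the built-in health endpoints
const BASE_URL = process.env.HEALTH_BASE_URL ?? 'http://localhost:3000';
async function checkEndpoint(path: string) {
  const response = await fetch(`${BASE_URL}${path}`);
  const body = await response.json();
  console.log(`${path} -> HTTP ${response.status}`, JSON.stringify(body, null, 2));
}
async function main() {
  await checkEndpoint('/api/health/database');
  await checkEndpoint('/api/healthz?type=liveness');
  await checkEndpoint('/api/healthz'); // full readiness check
}
main().catch((error) => {
  console.error('Health smoke test failed:', error);
  process.exit(1);
});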
Health Check Implementation
Built-in Health Check System
Solo Kit provides comprehensive health monitoring endpoints:
// packages/database/lib/connection.ts
export async function healthCheck(): Promise<{
status: 'healthy' | 'unhealthy' | 'not_configured';
latency?: number;
error?: string;
timestamp: Date;
}> {
const timestamp = new Date();
if (!isDatabaseConfigured()) {
return {
status: 'not_configured',
timestamp,
};
}
const startTime = Date.now();
try {
await withConnectionRetry(async () => {
const db = await getDb();
if (!db) {
throw new Error('Database instance is null');
}
// Test connection (different approaches for different drivers)
if (shouldUseNeonServerlessDriver()) {
// Connection test was done during initialization for Neon serverless
return true;
} else {
// Test with a simple query for postgres.js
await db.execute('SELECT 1');
return true;
}
}, 2); // Fewer retries for health checks
const latency = Date.now() - startTime;
return {
status: 'healthy',
latency,
timestamp,
};
} catch (error) {
return {
status: 'unhealthy',
error: error instanceof Error ? error.message : 'Unknown error',
timestamp,
};
}
}
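A typical call site branches on the three possible statuses, for example in a startup script or an admin command. A minimal sketch, assuming the package export shown later in this guide:
import { healthCheck } from '@packages/database';
async function assertDatabaseHealthy() {
  const health = await healthCheck();
  if (health.status === 'not_configured') {
    console.warn('DATABASE_URL is not set - skipping database checks');
    return;
  }
  if (health.status === 'unhealthy') {
    throw new Error(`Database unhealthy: ${health.error ?? 'unknown error'}`);
  }
  console.log(`Database healthy (latency: ${health.latency}ms)`);
}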
Health Check API Endpoints
Database-specific health check:
// apps/web/app/api/(dev)/health/database/route.ts
export async function GET() {
try {
const health = await healthCheck();
// Return appropriate HTTP status based on database health
const status =
health.status === 'healthy'
? 200
: health.status === 'not_configured'
? 503
: 500;
return NextResponse.json(
{
database: {
status: health.status,
latency: health.latency,
error: health.error,
timestamp: health.timestamp,
message: getStatusMessage(health.status),
},
environment: process.env.NODE_ENV,
configured: health.status !== 'not_configured',
},
{
status,
headers: {
'Cache-Control': 'no-cache, no-store, must-revalidate',
Pragma: 'no-cache',
Expires: '0',
},
}
);
} catch (error) {
console.error('Health check endpoint error:', error);
return NextResponse.json(
{
database: {
status: 'unhealthy',
error: 'Health check failed',
timestamp: new Date(),
message: 'Database health check encountered an unexpected error',
},
environment: process.env.NODE_ENV,
configured: false,
},
{ status: 500 }
);
}
}
function getStatusMessage(status: string): string {
switch (status) {
case 'healthy':
return 'Database is connected and responding normally';
case 'unhealthy':
return 'Database is configured but not responding';
case 'not_configured':
return 'Database is not configured - setup required';
default:
return 'Unknown database status';
}
}
Comprehensive application health check:
// apps/web/app/api/(dev)/healthz/route.ts
export async function GET(request: NextRequest) {
const { searchParams } = new URL(request.url);
const checkType = searchParams.get('type') || 'readiness';
try {
// Basic liveness check - just verify process is running
if (checkType === 'liveness') {
return Response.json({
status: 'healthy',
timestamp: new Date().toISOString(),
version: process.env.npm_package_version || '0.1.0',
environment: process.env.NODE_ENV || 'development',
uptime: getUptime(),
type: 'liveness',
});
}
// Full readiness check - verify all dependencies
const [databaseHealth, memoryHealth] = await Promise.all([
checkDatabaseHealth(),
Promise.resolve(getMemoryUsage()),
]);
const overallStatus = calculateOverallStatus(
databaseHealth.status,
memoryHealth.status
);
const healthResponse = {
status: overallStatus,
timestamp: new Date().toISOString(),
version: process.env.npm_package_version || '0.1.0',
environment: process.env.NODE_ENV || 'development',
uptime: getUptime(),
checks: {
database: databaseHealth,
memory: memoryHealth,
},
};
// Return appropriate HTTP status code based on health
const httpStatus = overallStatus === 'unhealthy' ? 503 : 200;
return Response.json(healthResponse, { status: httpStatus });
} catch (error) {
return errorHandler(error);
}
}
async function checkDatabaseHealth() {
try {
const { healthCheck, isDatabaseConfigured } = await import('@packages/database');
if (!isDatabaseConfigured()) {
return {
status: 'unhealthy' as const,
enabled: false,
provider: 'none',
responseTime: 0,
error: 'DATABASE_URL not configured',
};
}
const startTime = Date.now();
const health = await healthCheck();
const responseTime = Date.now() - startTime;
// Alert for slow database responses
if (responseTime > 2000) {
console.warn(`Slow database health check: ${responseTime}ms`);
// Optionally send to monitoring service
}
return {
status: health.status === 'healthy' ? ('healthy' as const) : ('unhealthy' as const),
enabled: true,
provider: 'postgresql',
responseTime,
error: health.error,
};
} catch (error) {
return {
status: 'unhealthy' as const,
enabled: true,
provider: 'postgresql',
responseTime: 0,
error: error instanceof Error ? error.message : 'Health check failed',
};
}
}
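The healthz route also relies on getUptime, getMemoryUsage, and calculateOverallStatus, which are not shown above. One plausible minimal implementation (a sketch, not necessarily what Solo Kit ships; the 90% heap threshold is an assumption):
function getUptime(): number {
  // Process uptime in seconds
  return Math.round(process.uptime());
}
function getMemoryUsage() {
  const usage = process.memoryUsage();
  const percentUsed = Math.round((usage.heapUsed / usage.heapTotal) * 100);
  return {
    // Flag memory as degraded when heap usage passes an arbitrary 90% threshold
    status: percentUsed > 90 ? ('degraded' as const) : ('healthy' as const),
    heapUsedMb: Math.round(usage.heapUsed / 1024 / 1024),
    heapTotalMb: Math.round(usage.heapTotal / 1024 / 1024),
    percentUsed,
  };
}
function calculateOverallStatus(
  databaseStatus: 'healthy' | 'unhealthy',
  memoryStatus: 'healthy' | 'degraded'
): 'healthy' | 'degraded' | 'unhealthy' {
  if (databaseStatus === 'unhealthy') return 'unhealthy';
  if (memoryStatus === 'degraded') return 'degraded';
  return 'healthy';
}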
Performance Monitoring
Response Time Tracking
Monitor database query performance:
// Query performance monitoring wrapper
export class QueryPerformanceMonitor {
private static instance: QueryPerformanceMonitor;
private queryStats: Map<string, QueryStats> = new Map();
static getInstance(): QueryPerformanceMonitor {
if (!QueryPerformanceMonitor.instance) {
QueryPerformanceMonitor.instance = new QueryPerformanceMonitor();
}
return QueryPerformanceMonitor.instance;
}
async monitorQuery<T>(
queryName: string,
queryFn: () => Promise<T>
): Promise<T> {
const startTime = Date.now();
try {
const result = await queryFn();
const duration = Date.now() - startTime;
this.recordQuery(queryName, duration, true);
// Alert on slow queries
if (duration > 1000) {
console.warn(`Slow query detected: ${queryName} took ${duration}ms`);
await this.alertSlowQuery(queryName, duration);
}
return result;
} catch (error) {
const duration = Date.now() - startTime;
this.recordQuery(queryName, duration, false);
throw error;
}
}
private recordQuery(queryName: string, duration: number, success: boolean) {
const stats = this.queryStats.get(queryName) || {
totalQueries: 0,
totalDuration: 0,
successCount: 0,
errorCount: 0,
averageDuration: 0,
maxDuration: 0,
minDuration: Infinity,
};
stats.totalQueries++;
stats.totalDuration += duration;
if (success) {
stats.successCount++;
} else {
stats.errorCount++;
}
stats.averageDuration = stats.totalDuration / stats.totalQueries;
stats.maxDuration = Math.max(stats.maxDuration, duration);
stats.minDuration = Math.min(stats.minDuration, duration);
this.queryStats.set(queryName, stats);
}
getQueryStats(): Record<string, QueryStats> {
const result: Record<string, QueryStats> = {};
this.queryStats.forEach((stats, queryName) => {
result[queryName] = { ...stats };
});
return result;
}
private async alertSlowQuery(queryName: string, duration: number) {
// Send to monitoring service
try {
const Sentry = await import('@sentry/nextjs');
Sentry.captureMessage('Slow database query detected', {
tags: {
query_name: queryName,
performance_issue: 'true'
},
extra: {
duration_ms: duration,
threshold_ms: 1000
},
});
} catch {
// Sentry not available
}
}
}
interface QueryStats {
totalQueries: number;
totalDuration: number;
successCount: number;
errorCount: number;
averageDuration: number;
maxDuration: number;
minDuration: number;
}
// Usage in repository methods
export class UserRepository {
private monitor = QueryPerformanceMonitor.getInstance();
async findById(id: string): Promise<User | null> {
return this.monitor.monitorQuery('user_find_by_id', async () => {
const [user] = await this.db
.select()
.from(users)
.where(eq(users.id, id))
.limit(1);
return user || null;
});
}
}
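During development it can be handy to dump the collected stats and spot the slowest operations. A small helper on top of getQueryStats() (a sketch):
// Log the slowest monitored queries, sorted by average duration
export function logSlowestQueries(limit = 5) {
  const stats = QueryPerformanceMonitor.getInstance().getQueryStats();
  const slowest = Object.entries(stats)
    .sort(([, a], [, b]) => b.averageDuration - a.averageDuration)
    .slice(0, limit);
  for (const [name, s] of slowest) {
    console.log(
      `${name}: avg ${s.averageDuration.toFixed(1)}ms, ` +
        `max ${s.maxDuration}ms, errors ${s.errorCount}/${s.totalQueries}`
    );
  }
}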
Connection Pool Monitoring
Monitor connection pool health:
// Connection pool monitoring
export class ConnectionPoolMonitor {
private poolStats = {
totalConnections: 0,
activeConnections: 0,
idleConnections: 0,
queuedRequests: 0,
errors: 0,
};
async getPoolStats() {
try {
const database = await getDb();
if (!database) {
return { status: 'not_configured' };
}
// For PostgreSQL, query connection statistics
const connectionStats = await database.execute(sql`
SELECT
count(*) as total_connections,
count(*) FILTER (WHERE state = 'active') as active_connections,
count(*) FILTER (WHERE state = 'idle') as idle_connections
FROM pg_stat_activity
WHERE datname = current_database()
`);
const stats = connectionStats[0];
return {
status: 'healthy',
connections: {
total: parseInt(stats.total_connections),
active: parseInt(stats.active_connections),
idle: parseInt(stats.idle_connections),
},
timestamp: new Date().toISOString(),
};
} catch (error) {
console.error('Failed to get connection pool stats:', error);
return {
status: 'error',
error: error instanceof Error ? error.message : 'Unknown error',
timestamp: new Date().toISOString(),
};
}
}
async checkPoolHealth(): Promise<{
healthy: boolean;
warnings: string[];
}> {
const stats = await this.getPoolStats();
const warnings: string[] = [];
if (stats.status === 'not_configured') {
return { healthy: false, warnings: ['Database not configured'] };
}
if (stats.status === 'error') {
return { healthy: false, warnings: [stats.error] };
}
// Check for connection pool issues
if (stats.connections) {
const { total, active, idle } = stats.connections;
// Too many active connections
if (active > total * 0.8) {
warnings.push(`High active connection usage: ${active}/${total} (${Math.round(active/total * 100)}%)`);
}
// No idle connections available
if (idle === 0 && active > 0) {
warnings.push('No idle connections available - potential bottleneck');
}
// Unusual connection patterns
if (total === 0) {
warnings.push('No database connections found');
}
}
return {
healthy: warnings.length === 0,
warnings,
};
}
}
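Checking pool health is a single async call, for example from a debug script or an admin route (a sketch):
async function reportPoolHealth() {
  const poolMonitor = new ConnectionPoolMonitor();
  const { healthy, warnings } = await poolMonitor.checkPoolHealth();
  if (!healthy) {
    console.warn('Connection pool warnings:', warnings.join('; '));
  } else {
    console.log('Connection pool looks healthy');
  }
}
Note that the query above reads pg_stat_activity, so the numbers describe server-side sessions on the database rather than any client-side pool.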
Database Metrics Collection
PostgreSQL Statistics Monitoring
Collect comprehensive database metrics:
// Database metrics collector
export class DatabaseMetricsCollector {
async collectMetrics(): Promise<DatabaseMetrics> {
try {
const database = await getDb();
if (!database) {
throw new Error('Database not available');
}
// Collect various PostgreSQL statistics
const [
tableStats,
indexStats,
connectionStats,
queryStats,
] = await Promise.all([
this.getTableStatistics(database),
this.getIndexStatistics(database),
this.getConnectionStatistics(database),
this.getQueryStatistics(database),
]);
return {
timestamp: new Date().toISOString(),
tables: tableStats,
indexes: indexStats,
connections: connectionStats,
queries: queryStats,
};
} catch (error) {
console.error('Failed to collect database metrics:', error);
throw error;
}
}
private async getTableStatistics(database: any) {
const result = await database.execute(sql`
SELECT
schemaname,
tablename,
n_tup_ins as inserts,
n_tup_upd as updates,
n_tup_del as deletes,
n_live_tup as live_tuples,
n_dead_tup as dead_tuples,
seq_scan as sequential_scans,
seq_tup_read as sequential_reads,
idx_scan as index_scans,
idx_tup_fetch as index_reads
FROM pg_stat_user_tables
ORDER BY seq_scan + idx_scan DESC
LIMIT 10
`);
return result.map((row: any) => ({
schema: row.schemaname,
table: row.tablename,
operations: {
inserts: parseInt(row.inserts),
updates: parseInt(row.updates),
deletes: parseInt(row.deletes),
},
tuples: {
live: parseInt(row.live_tuples),
dead: parseInt(row.dead_tuples),
},
scans: {
sequential: parseInt(row.sequential_scans),
index: parseInt(row.index_scans),
},
reads: {
sequential: parseInt(row.sequential_reads),
index: parseInt(row.index_reads),
},
}));
}
private async getIndexStatistics(database: any) {
const result = await database.execute(sql`
SELECT
schemaname,
tablename,
indexname,
idx_scan as scans,
idx_tup_read as reads,
idx_tup_fetch as fetches
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC
LIMIT 10
`);
return result.map((row: any) => ({
schema: row.schemaname,
table: row.tablename,
index: row.indexname,
scans: parseInt(row.scans),
reads: parseInt(row.reads),
fetches: parseInt(row.fetches),
efficiency: row.reads > 0 ? Number(((row.fetches / row.reads) * 100).toFixed(2)) : 0,
}));
}
private async getConnectionStatistics(database: any) {
const result = await database.execute(sql`
SELECT
state,
count(*) as connection_count,
max(now() - backend_start) as max_connection_age,
avg(now() - backend_start) as avg_connection_age
FROM pg_stat_activity
WHERE datname = current_database()
GROUP BY state
`);
return result.map((row: any) => ({
state: row.state,
count: parseInt(row.connection_count),
maxAge: row.max_connection_age,
avgAge: row.avg_connection_age,
}));
}
private async getQueryStatistics(database: any) {
// Requires pg_stat_statements extension
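// Note: PostgreSQL 13+ renames these timing columns to total_exec_time / mean_exec_time;
// adjust the query below to match your server version.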
try {
const result = await database.execute(sql`
SELECT
query,
calls,
total_time,
mean_time,
rows
FROM pg_stat_statements
WHERE query NOT LIKE '%pg_stat_statements%'
ORDER BY mean_time DESC
LIMIT 5
`);
return result.map((row: any) => ({
query: row.query.substring(0, 100) + '...', // Truncate for display
calls: parseInt(row.calls),
totalTime: parseFloat(row.total_time),
meanTime: parseFloat(row.mean_time),
rows: parseInt(row.rows),
}));
} catch (error) {
// pg_stat_statements not available
return [];
}
}
}
interface DatabaseMetrics {
timestamp: string;
tables: any[];
indexes: any[];
connections: any[];
queries: any[];
}
Alerting & Notifications
Performance-Based Alerting
Automated alerting for performance issues:
// Alert system for database monitoring
export class DatabaseAlertSystem {
private alertRules = {
slowQuery: { threshold: 1000, severity: 'warning' },
verySlowQuery: { threshold: 5000, severity: 'critical' },
highConnectionUsage: { threshold: 80, severity: 'warning' },
criticalConnectionUsage: { threshold: 95, severity: 'critical' },
lowCacheHitRatio: { threshold: 95, severity: 'warning' },
highErrorRate: { threshold: 5, severity: 'critical' },
};
async checkAlerts(): Promise<Alert[]> {
const alerts: Alert[] = [];
try {
// Check various metrics and generate alerts
const poolHealth = await new ConnectionPoolMonitor().checkPoolHealth();
const queryStats = QueryPerformanceMonitor.getInstance().getQueryStats();
const metrics = await new DatabaseMetricsCollector().collectMetrics();
// Connection usage alerts
if (!poolHealth.healthy) {
alerts.push({
type: 'connection_pool',
severity: 'warning',
message: `Connection pool issues: ${poolHealth.warnings.join(', ')}`,
timestamp: new Date().toISOString(),
});
}
// Query performance alerts
Object.entries(queryStats).forEach(([queryName, stats]) => {
if (stats.averageDuration > this.alertRules.verySlowQuery.threshold) {
alerts.push({
type: 'query_performance',
severity: 'critical',
message: `Very slow query detected: ${queryName} avg ${stats.averageDuration.toFixed(2)}ms`,
timestamp: new Date().toISOString(),
metadata: { queryName, averageDuration: stats.averageDuration },
});
} else if (stats.averageDuration > this.alertRules.slowQuery.threshold) {
alerts.push({
type: 'query_performance',
severity: 'warning',
message: `Slow query detected: ${queryName} avg ${stats.averageDuration.toFixed(2)}ms`,
timestamp: new Date().toISOString(),
metadata: { queryName, averageDuration: stats.averageDuration },
});
}
// Error rate alerts
const errorRate = (stats.errorCount / stats.totalQueries) * 100;
if (errorRate > this.alertRules.highErrorRate.threshold) {
alerts.push({
type: 'error_rate',
severity: 'critical',
message: `High error rate for ${queryName}: ${errorRate.toFixed(2)}%`,
timestamp: new Date().toISOString(),
metadata: { queryName, errorRate, totalQueries: stats.totalQueries },
});
}
});
// Send alerts if any found
if (alerts.length > 0) {
await this.sendAlerts(alerts);
}
return alerts;
} catch (error) {
console.error('Failed to check database alerts:', error);
return [];
}
}
private async sendAlerts(alerts: Alert[]) {
for (const alert of alerts) {
// Send to monitoring service
try {
const Sentry = await import('@sentry/nextjs');
const level = alert.severity === 'critical' ? 'error' : 'warning';
Sentry.captureMessage(alert.message, {
level,
tags: {
alert_type: alert.type,
severity: alert.severity,
},
extra: alert.metadata || {},
});
} catch {
// Sentry not available
}
// Log locally
console.log(`DATABASE ALERT [${alert.severity.toUpperCase()}]: ${alert.message}`);
// Could also send to:
// - Slack webhook
// - Email notifications
// - PagerDuty
// - Discord webhook
// - Custom notification service
}
}
}
interface Alert {
type: string;
severity: 'warning' | 'critical';
message: string;
timestamp: string;
metadata?: Record<string, any>;
}
Health Check Automation
Automated monitoring with scheduled health checks:
// Scheduled monitoring service
export class MonitoringScheduler {
private intervals: NodeJS.Timeout[] = [];
start() {
console.log('Starting database monitoring scheduler...');
// Health check every 30 seconds
const healthCheckInterval = setInterval(async () => {
try {
await this.runHealthCheck();
} catch (error) {
console.error('Scheduled health check failed:', error);
}
}, 30000);
// Performance metrics every 5 minutes
const metricsInterval = setInterval(async () => {
try {
await this.collectAndLogMetrics();
} catch (error) {
console.error('Scheduled metrics collection failed:', error);
}
}, 300000);
// Alert checking every minute
const alertsInterval = setInterval(async () => {
try {
await this.checkAndSendAlerts();
} catch (error) {
console.error('Scheduled alert check failed:', error);
}
}, 60000);
this.intervals.push(healthCheckInterval, metricsInterval, alertsInterval);
}
stop() {
console.log('Stopping database monitoring scheduler...');
this.intervals.forEach(interval => clearInterval(interval));
this.intervals = [];
}
private async runHealthCheck() {
const health = await healthCheck();
if (health.status !== 'healthy') {
console.warn(`Health check warning: ${health.status}`, health.error);
} else if (health.latency && health.latency > 1000) {
console.warn(`Slow health check: ${health.latency}ms`);
}
}
private async collectAndLogMetrics() {
const collector = new DatabaseMetricsCollector();
const metrics = await collector.collectMetrics();
// Log key metrics
console.log('Database metrics:', {
timestamp: metrics.timestamp,
tableCount: metrics.tables.length,
indexCount: metrics.indexes.length,
activeConnections: metrics.connections.find(c => c.state === 'active')?.count || 0,
slowestQuery: metrics.queries[0]?.meanTime || 0,
});
}
private async checkAndSendAlerts() {
const alertSystem = new DatabaseAlertSystem();
const alerts = await alertSystem.checkAlerts();
if (alerts.length > 0) {
console.log(`Generated ${alerts.length} database alerts`);
}
}
}
// Start monitoring in production
if (process.env.NODE_ENV === 'production') {
const scheduler = new MonitoringScheduler();
scheduler.start();
// Graceful shutdown
process.on('SIGTERM', () => scheduler.stop());
process.on('SIGINT', () => scheduler.stop());
}
Monitoring Dashboard API
API endpoint for monitoring dashboard:
// API route for monitoring dashboard
// apps/web/app/api/(dev)/monitoring/database/route.ts
export async function GET() {
try {
const [
health,
poolStats,
queryStats,
metrics,
alerts
] = await Promise.all([
healthCheck(),
new ConnectionPoolMonitor().getPoolStats(),
Promise.resolve(QueryPerformanceMonitor.getInstance().getQueryStats()),
new DatabaseMetricsCollector().collectMetrics(),
new DatabaseAlertSystem().checkAlerts(),
]);
return NextResponse.json({
status: 'success',
timestamp: new Date().toISOString(),
data: {
health,
connectionPool: poolStats,
queryPerformance: queryStats,
metrics,
alerts,
summary: {
healthy: health.status === 'healthy',
latency: health.latency,
alertCount: alerts.length,
criticalAlerts: alerts.filter(a => a.severity === 'critical').length,
},
},
});
} catch (error) {
console.error('Monitoring dashboard API error:', error);
return NextResponse.json(
{
status: 'error',
error: error instanceof Error ? error.message : 'Monitoring failed',
timestamp: new Date().toISOString(),
},
{ status: 500 }
);
}
}
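The response can be consumed by any dashboard. A minimal polling client that just logs the summary block (a sketch, assuming the route above is reachable at /api/monitoring/database):
function pollDatabaseMonitoring(intervalMs = 60_000) {
  setInterval(async () => {
    try {
      const response = await fetch('/api/monitoring/database');
      const { data } = await response.json();
      console.log(
        `DB healthy: ${data.summary.healthy}, latency: ${data.summary.latency}ms, ` +
          `alerts: ${data.summary.alertCount} (${data.summary.criticalAlerts} critical)`
      );
    } catch (error) {
      console.error('Monitoring poll failed:', error);
    }
  }, intervalMs);
}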
Cloud Provider Monitoring
Neon Monitoring
Leverage Neon's built-in monitoring:
- Metrics Dashboard: Connection count, query performance, storage usage
- Query Insights: Slow query identification and optimization suggestions
- Resource Monitoring: CPU, memory, and storage utilization
- Branch Analytics: Per-branch performance metrics
- Alerting: Custom alerts via webhooks or email
Supabase Monitoring
Utilize Supabase monitoring features:
- Database Health: Connection metrics and query performance
- Resource Usage: CPU, memory, and storage monitoring
- Logs: Real-time database logs and error tracking
- Performance Insights: Query optimization recommendations
- Custom Metrics: API for custom monitoring integration
Railway Monitoring
Access Railway's monitoring tools:
- Resource Metrics: CPU, memory, and network usage
- Database Analytics: Query performance and connection metrics
- Observability: Integrated logging and metrics
- Health Checks: Automated uptime monitoring
- Alerting: Notification integration with various services
Best Practices
1. Monitoring Strategy
Implement the four golden signals:
- Latency: How long requests take
- Traffic: How many requests you're getting
- Errors: Rate of requests that fail
- Saturation: How full your service is
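The pieces built earlier in this guide map directly onto these signals. A sketch of a combined snapshot, assuming the monitor classes above:
async function goldenSignalsSnapshot() {
  const health = await healthCheck();
  const queryStats = QueryPerformanceMonitor.getInstance().getQueryStats();
  const pool = await new ConnectionPoolMonitor().getPoolStats();
  const perQuery = Object.values(queryStats);
  const totalQueries = perQuery.reduce((sum, s) => sum + s.totalQueries, 0);
  const totalErrors = perQuery.reduce((sum, s) => sum + s.errorCount, 0);
  return {
    latency: health.latency ?? null, // Latency: health check round trip
    traffic: totalQueries, // Traffic: monitored query volume
    errors: totalQueries > 0 ? totalErrors / totalQueries : 0, // Errors: failure rate
    saturation: // Saturation: share of connections that are active
      'connections' in pool && pool.connections
        ? pool.connections.active / Math.max(pool.connections.total, 1)
        : null,
  };
}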
2. Alert Fatigue Prevention
Design meaningful alerts:
// Good: Actionable alerts with context
if (responseTime > 5000) {
alert(`CRITICAL: Database response time ${responseTime}ms exceeds 5s threshold`);
}
// Bad: Noisy alerts without context
if (responseTime > 100) {
alert('Slow database');
}
3. Monitoring Documentation
Document your monitoring setup:
- Runbooks: Step-by-step incident response procedures
- Alert definitions: What each alert means and how to respond
- Metric explanations: What each metric measures and why it matters
- Dashboard guides: How to interpret monitoring dashboards
Next Steps
Complete your database mastery with the final guide:
- Advanced - Advanced PostgreSQL features and patterns
- Security - Advanced security monitoring
- Performance - Performance monitoring integration
- Backup & Restore - Backup monitoring
Comprehensive database monitoring ensures your Solo Kit application maintains optimal performance, reliability, and user experience at scale.