Blog
Articles
Essays on software engineering, technology, and the industry.
Load shedding for HTTP APIs: prioritization, degradation tiers, and backpressure
Autoscaling lags behind spikes. Load shedding rejects low-priority work so critical APIs stay fast: signals, priority buckets, Node admission sketches, HTTP semantics, and pitfalls.
Software engineering, Backend, API design, Site reliability engineering, Node.js, Kubernetes
Request coalescing and single-flight: stopping cache stampedes before they flatten your database
Why synchronized TTL expiry causes thundering herds, how in-process and distributed single-flight deduplicate work, and trade-offs when scaling Node.js and Redis-backed caches.
Software engineering, Backend, Node.js, Performance, Redis, Architecture
OpenAPI as the source of truth: codegen, drift detection, and breaking-change gates
Using OpenAPI as an executable contract—typed clients, server validation, CI diff gates, and workflows that keep specs and implementations aligned in production API teams.
OpenAPI, API design, CI/CD, TypeScript, Developer experience, Backend
GitHub Actions OIDC with AWS: short-lived CI credentials without static access keys
Replace static AWS keys in CI with OIDC to IAM roles: GitHub issuer registration, trust policies on JWT claims, least-privilege roles, and common integration failures.
Software engineering, DevOps, Security, GitHub Actions, AWS, CI/CD
Distributed scheduled jobs: leases, idempotency, and why clocks lie
Run cron-like work safely across replicas: leader leases, idempotent handlers, scheduling semantics with skew, and operational patterns for Node.js and Postgres-backed workers.
Software engineering, Backend, Distributed systems, PostgreSQL, Reliability, Node.js
HTTP API admission control: concurrency limits, queues, and load shedding
Cap concurrent HTTP handlers per instance, add bounded queue waits or fail fast with 503s, and align limits with database pools and Kubernetes readiness. Focused on Node.js production APIs.
Software engineering, Backend, API design, Node.js, TypeScript, Reliability, Kubernetes
Presigned URL uploads to object storage: security boundaries, pipelines, and async verification
Ship browser-to-bucket uploads without proxying bytes through your API. Policy fields, content-type constraints, key layout, malware scanning hooks, and the failure modes teams hit in production.
Software engineering, Backend, API design, Security, Amazon Web Services, Node.js
API contract testing with Pact: consumer-driven workflows for polyglot services
Pact contract tests for HTTP APIs: consumer expectations, provider verification with states, broker workflows, and where they complement—not replace—integration tests.
Software engineering, Backend, API design, Testing, Microservices, TypeScript, DevOps
Node.js AsyncLocalStorage: request-scoped context without globals
Carry correlation IDs, auth, and tenancy through async call chains in Node.js using AsyncLocalStorage. CLS patterns, pitfalls with worker threads, and testing strategies for production APIs.
Node.js, TypeScript, Backend, Observability, Software architecture, API design
RFC 9457 Problem Details for HTTP APIs: stable errors clients can rely on
Use application/problem+json to return structured API errors with types, titles, and extension fields. Mapping exceptions, validation, and proxy behavior for production HTTP services.
Software engineering, API design, Backend, HTTP, REST, TypeScript
PostgreSQL LISTEN/NOTIFY: real-time invalidation, delivery semantics, and when to stop
Use LISTEN/NOTIFY for lightweight pub/sub from Postgres to app servers: payload limits, no persistence, connection pooling pitfalls, and patterns that stay correct under load.
PostgreSQL, Backend, Software engineering, Node.js, Architecture, Caching
JWT vs opaque API tokens: sessions, revocation, and scalability trade-offs
Compare signed JWTs with opaque server-side tokens for APIs: verification cost, revocation semantics, storage, and patterns that hold up under mobile, B2B, and high-QPS backends.
Software engineering, Backend, API design, Security, OAuth 2.0, TypeScript
PostgreSQL advisory locks: coordination without touching rows
When and how to use session and transaction advisory locks in Postgres for jobs, migrations, and cross-request serialization, with pooling caveats and a Node.js example.
PostgreSQL, Backend, Software engineering, Concurrency, Architecture
LLM-as-judge evaluation: rubrics, calibration, and production pitfalls
Using models to score or rank other model outputs: rubric design, calibration against humans, bias risks, and how to combine automated judges with spot checks in shipping AI features.
Artificial intelligence, Machine learning operations, Quality assurance, Software engineering
PostgreSQL connection pooling in Kubernetes: sizing pools, PgBouncer, and Node.js pitfalls
Right-size Node.js DB pools behind replicas so you do not exhaust PostgreSQL max_connections. PgBouncer modes, prepared statements, checkout timeouts, and what to measure in production.
Software engineering, Backend, PostgreSQL, Kubernetes, Node.js, Infrastructure
Hybrid logical clocks: ordering events when wall clocks lie
Use HLCs to assign monotonic, causally aware timestamps across nodes without tight clock sync. How they work, comparison to Lamport clocks, and production patterns for logs and APIs.
Software engineering, Distributed systems, Backend, Architecture, PostgreSQL, Observability
OAuth 2.1 for production APIs: authorization code, PKCE, refresh rotation, and M2M boundaries
Ship browser and mobile clients without long-lived secrets in the bundle. How OAuth 2.1 tightens the authorization code flow, why refresh rotation matters, and when client credentials fit.
Software engineering, Backend, API design, Security, OAuth 2.1, OpenID Connect
Request deadlines, cancellation, and backpressure in Node.js HTTP services
Propagate timeouts with AbortSignal, stop wasted work when clients disconnect, and align server deadlines with upstream calls. Patterns for fetch, pools, and long handlers in production APIs.
Software engineering, Backend, Node.js, TypeScript, API design, Reliability
Optimistic concurrency for HTTP APIs: ETags, If-Match, and conflict design
Prevent lost updates without pessimistic locks: versioned resources, conditional requests, 412 vs 409, and how optimistic concurrency interacts with caching, BFFs, and mobile retries.
Software engineering, API design, Backend, HTTP, Reliability, Architecture
HTTP streaming in production backends: SSE, chunked transfer, proxies, and backpressure
Designing long-lived HTTP responses for live dashboards and LLM-style token delivery: Server-Sent Events vs chunked bodies, intermediary buffering, timeouts, and safe client reconnection.
Software engineering, Backend, HTTP, Node.js, API design, Reliability, Real-time systems
Service level objectives and error budgets: turning reliability into a product decision
How SLIs, SLOs, and error budgets connect user-visible reliability to engineering trade-offs: choosing indicators, setting targets, and using burn alerts without drowning in metrics.
Software engineering, Site reliability engineering, Observability, Backend, DevOps
Multi-turn LLM backends: session state, retrieval, and tool-call loops without losing the thread
Session-backed LLM APIs: durable turns, bounded context, RAG outside the transcript, tool-call round-trips, and per-session isolation—patterns from production assistants.
Software engineering, Artificial intelligence, Backend, API design, PostgreSQL, TypeScript
Distributed locks and fencing tokens: why TTL alone is not enough
Lease-based locks in Redis or etcd prevent most double execution—but a delayed process can still corrupt shared state. Fencing tokens from a linearizable store close the gap. Patterns, SQL integration, and pitfalls.
Software engineering, Distributed systems, Backend, Architecture, Redis, PostgreSQL
Server-side feature flags in distributed backends: evaluation, consistency, and kill switches
How to use feature flags beyond the frontend: where to evaluate rules, how to keep behavior consistent across services, and operational patterns for safe rollouts and instant rollback.
Software engineering, Backend, Architecture, DevOps, API design, TypeScript
Dead-letter queues for async backends: when to use them, how to design them, and how not to drown in poison messages
Turn poison messages and partial failures into safe replays: DLQ naming, redrive policies, idempotency, observability, and cases where a DLQ is the wrong abstraction.
Software engineering, Backend, Distributed systems, Architecture, Observability, Messaging
Read-your-writes consistency: when CDNs and caches lie (and how to fix it)
Users refresh and still see stale data after a successful write. This article explains read-your-writes consistency, cache-control patterns, surrogate keys, and tokenized URLs for edge-cached APIs.
Software engineering, Backend, Frontend, API design, Architecture, CDN
PostgreSQL Row-Level Security: tenant isolation that survives application bugs
How PostgreSQL RLS enforces tenant isolation in the database: session variables, policy patterns, indexing trade-offs, and pitfalls in multi-tenant SaaS.
PostgreSQL, Software engineering, Backend, Security, Architecture, Multi-tenancy
Read-your-writes consistency: replicas, routing, and session tokens
After a write, users expect to see their change immediately. Async replication breaks that illusion—here is how sticky routing, monotonic tokens, and cache discipline restore read-your-writes without giving up scale.
Software engineering, Backend, Distributed systems, Database, API design, Architecture
LLM trust boundaries: prompt injection, tool abuse, and defense in depth
Treat LLM inputs as untrusted code: separate system from user content, constrain tools, validate outputs, and layer controls so assistants cannot exfiltrate secrets or hijack workflows.
Software engineering, Artificial intelligence, Security, Backend, API design, Architecture
Zero-downtime database migrations: the expand-contract pattern in practice
Ship relational schema changes without maintenance windows using expand-contract phases, backfills, and safe cutovers—patterns that keep production APIs available under load.
Software engineering, Architecture, Backend, PostgreSQL, DevOps
Graceful shutdown for HTTP services: signals, draining, and Kubernetes
Stop Node and containerized APIs without 502 spikes: SIGTERM semantics, draining in-flight requests, readiness vs liveness, and background job coordination.
Software engineering, Backend, Node.js, Kubernetes, Reliability, DevOps
Semantic caching for LLM APIs: cost, latency, and correctness
Reuse model outputs when prompts are paraphrases, not byte-identical strings. Embedding-based cache keys, TTLs, invalidation, and guardrails for production LLM backends.
Software engineering, Artificial intelligence, Backend, API design, Redis, Embeddings
Keyset pagination: building stable, scalable list APIs
Why OFFSET-based pagination breaks at scale and how cursor (keyset) pagination uses indexed columns for predictable latency, with SQL patterns and API design.
Software engineering, Backend, API design, PostgreSQL, Performance, Architecture
Optimistic UI with server reconciliation: patterns that survive production
How to ship instant-feeling interfaces without lying to users: mutation lifecycles, rollback rules, idempotency, and conflict handling when the server is the source of truth.
Software engineering, Frontend, Backend, API design, TypeScript, React
Production webhook receivers: signatures, replay protection, and idempotent delivery
How to build webhook HTTP endpoints that survive retries, clock skew, and malicious traffic: HMAC verification, timestamp windows, deduplication, and clear response contracts.
Software engineering, Backend, API design, Security, Reliability, Webhooks
Distributed sagas: choreography vs orchestration in production systems
Compare event-driven choreography and orchestrator-led sagas for multi-step workflows: coordination models, failure handling, observability, and when to choose each.
Software engineering, Architecture, Distributed systems, Microservices, Backend, Reliability
Production LLM API integration: streaming, structured outputs, and resilience
How to ship LLM features behind your API: SSE streaming, JSON schema contracts, retries, timeouts, and cost controls—without turning your backend into an unreliable demo.
Software engineering, Artificial intelligence, Backend, API design, TypeScript
RAG pipelines in production: chunking, retrieval, and evaluation
How to design retrieval-augmented generation for real systems: text splitting, embedding strategy, reranking, guardrails, and how to measure quality before users do.
Software engineering, Artificial intelligence, Architecture, Backend, Machine learning, Reliability
Distributed tracing with W3C Trace Context and OpenTelemetry
How W3C trace context and OpenTelemetry connect spans across services: propagation, span design, sampling strategies, and production pitfalls for backend systems.
Software engineering, Architecture, Observability, OpenTelemetry, Backend
API rate limiting: token buckets, sliding windows, and distributed fairness
How to choose algorithms for HTTP APIs, implement limits that survive restarts and multiple nodes, and communicate limits clearly to clients—without turning throttling into guesswork.
Software engineering, Architecture, Backend, API design, Reliability, Redis
The transactional outbox: publishing events without losing data or doubling work
Why dual writes fail between databases and message brokers, how the outbox pattern fixes it with atomic commits, and how to run workers, polling, and trade-offs in production.
Software engineering, Architecture, Backend, Reliability, Messaging, PostgreSQL
API evolution and backward compatibility
A structured look at versioning strategies, deprecation, and communication patterns that keep integrations stable while software continues to change.
Software engineering, Architecture, TypeScript, Developer experience
Operational observability for production services
How structured signals—logs, metrics, and traces—support incident response and steady improvement without overwhelming engineering teams.
Software engineering, Developer experience, Quality assurance, Architecture
Circuit breakers, bulkheads, and timeouts: isolating failure in distributed systems
How to combine circuit breakers, resource isolation, and explicit timeouts so one slow dependency does not take down your API—patterns, trade-offs, and Node.js-oriented examples.
Software engineering, Architecture, Backend, Reliability, Node.js
Idempotency keys: designing safe retries for HTTP APIs
How idempotency keys let clients retry network failures without duplicate side effects, with storage patterns, HTTP semantics, and production pitfalls.
Software engineering, Architecture, Backend, API design, Reliability
Governing AI-assisted software development
How large language models shift cost from authoring to verification, and which governance patterns preserve review quality, security, and clear ownership in engineering teams.
Artificial intelligence, Software engineering, Developer experience, Quality assurance
TypeScript discipline at scale
Why gradual typing pays off for teams shipping production web software, and how conventions keep complexity manageable.
TypeScript, Engineering practices, PayPal