Speaker Topics - No Fluff Just Stuff

Scaling APIs for Millions of AI-Driven Calls

AI agents are becoming a new class of API consumers. Unlike human users, agents can create bursty traffic, retry aggressively, call multiple tools in parallel, and accidentally amplify downstream failures. A single user request can become a large chain of API calls, model calls, vector searches, database lookups, and workflow events.

This talk explains how to design APIs for this new reality.

We will cover agent-aware rate limiting, budget-aware throttling, backpressure, load shedding, idempotency, deduplication, deterministic caching, async workflows, event-driven APIs, tail-latency SLOs, and cost observability.
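As a rough illustration of agent-aware rate limiting, here is a minimal sketch of a token bucket kept per agent. The class names, the `agent_id` key, and the rate and capacity values are assumptions made for this example and are not taken from the talk. A production limiter would normally live in a gateway or a shared store, not in process memory.

```python
import time


class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity` tokens."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so a short burst is allowed
        self.updated = now

    def allow(self, now, cost=1.0):
        # Add tokens for the time elapsed since the last check, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class AgentRateLimiter:
    """Keeps one bucket per agent so a noisy agent cannot starve the others."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def allow(self, agent_id, now=None, cost=1.0):
        # `now` is injectable for testing; it defaults to a monotonic clock.
        now = time.monotonic() if now is None else now
        bucket = self.buckets.get(agent_id)
        if bucket is None:
            bucket = self.buckets[agent_id] = TokenBucket(self.rate, self.capacity, now)
        return bucket.allow(now, cost)
```

The `cost` parameter is one possible hook for budget-aware throttling: an expensive tool call could be charged more tokens than a cheap lookup.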

Participants will learn how to tag and trace agent traffic, control runaway tool calls, prevent retry amplification, design graceful degradation, and build runbooks for cache storms, retry storms, dependency brownouts, and cost spikes.
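One common way to limit retry amplification is to bound the number of attempts and spread retries out with jittered exponential backoff. The sketch below assumes a hypothetical `RetryableError` exception and illustrative delay values; it is not the speaker's implementation.

```python
import random
import time


class RetryableError(Exception):
    """Hypothetical marker for failures that are safe to retry."""


def call_with_bounded_retries(fn, max_attempts=3, base_delay=0.1, max_delay=2.0,
                              sleep=time.sleep, rng=random.random):
    """Call fn() at most max_attempts times, backing off with full jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # attempts exhausted: surface the failure to the caller
            # Full jitter: wait a random time below an exponentially growing cap.
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(rng() * cap)
```

The `sleep` and `rng` parameters exist only so the function can be tested without real delays. Only errors explicitly marked retryable are retried; anything else propagates immediately.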

The core message:

APIs exposed to AI agents must be contract-safe, retry-safe, cost-aware, observable, and degradation-ready.
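To make "retry-safe" concrete, the following is a minimal sketch of idempotency-key handling, assuming the agent sends the same key on every retry of the same logical request. The `IdempotentHandler` name and the in-memory dictionary are illustrative assumptions; a real service would more likely use a shared store with expiry and an atomic insert.

```python
class IdempotentHandler:
    """Runs the wrapped handler at most once per idempotency key."""

    def __init__(self, handler):
        self.handler = handler
        self.results = {}  # assumption: in-memory and single-process, for illustration only

    def handle(self, idempotency_key, payload):
        # A repeated key returns the stored result without redoing the work.
        if idempotency_key in self.results:
            return self.results[idempotency_key]
        result = self.handler(payload)
        self.results[idempotency_key] = result
        return result
```

With this shape, an agent that retries a payment or a write after a timeout receives the original result, and the side effect happens once.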

Classic API scaling assumed relatively predictable traffic.

AI-driven API traffic is different because:

  • One prompt can create many downstream API calls.
  • Agents can retry, loop, and fan out.
  • Tool-calling creates bursty and non-human traffic patterns.
  • Cost grows with requests, retries, context size, model calls, and downstream work.
  • Failures can amplify quickly across gateways, SDKs, queues, databases, and model APIs.
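A back-of-the-envelope calculation shows how quickly the points above compound. The sketch assumes a simplified worst case: one prompt fans out to `fan_out` tool calls, and each call passes through `layers` components (for example an SDK, a gateway, and a service) that each retry a failing call up to `max_retries` times. The function name and the numbers are illustrative only.

```python
def worst_case_calls(fan_out, max_retries, layers):
    """Upper bound on attempts that reach the dependency below the last retrying layer."""
    # Each layer makes up to (max_retries + 1) attempts, and every attempt at one
    # layer can trigger a full set of attempts at the next, so the factors multiply.
    return fan_out * (max_retries + 1) ** layers


print(worst_case_calls(fan_out=5, max_retries=2, layers=3))  # prints 135
```

Under these assumptions, 5 tool calls with 2 retries at each of 3 layers can become up to 135 attempts against a struggling dependency, which is why bounding retries at a single layer is often recommended.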

Agenda

  1. Why AI Changes API Scaling
    Human traffic versus agent traffic, tool chains, fan-out, retries, and burst patterns.
  2. New Failure Modes
    Retry storms, cache-miss storms, malformed tool calls, version drift, DB saturation, and cost spikes.
  3. Traffic Control for AI Agents
    Agent-aware rate limits, per-tenant budgets, per-tool quotas, fair queuing, and adaptive backpressure.
  4. Resilience Patterns
    Idempotency keys, deduplication, bounded retries, circuit breakers, bulkheads, timeouts, and load shedding.
  5. Caching for AI Workloads
    Deterministic-result caching, semantic-aware caching, stale-while-revalidate, negative caching, and cache warming.
  6. Async and Event-Driven APIs
    Queue-first design, workflows, webhooks, streaming responses, outbox patterns, and dead-letter handling.
  7. Observability and Cost Governance
    Chain IDs, tool IDs, agent IDs, tail-latency SLOs, per-agent cost attribution, anomaly detection, and loop detection.
  8. Runbooks and Readiness
    Playbooks for retry storms, cache storms, provider brownouts, cost spikes, and safe degradation.
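As one small example of the caching item in the agenda, deterministic-result caching depends on identical tool calls mapping to the same key. The sketch below assumes that tool arguments are JSON-serializable and that the tool is a pure function of its arguments; `cache_key` and `DeterministicCache` are hypothetical names.

```python
import hashlib
import json


def cache_key(tool, args):
    """Build a stable key: the same tool and arguments always hash the same way."""
    # sort_keys makes the encoding independent of dictionary insertion order.
    canonical = json.dumps({"tool": tool, "args": args},
                           sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class DeterministicCache:
    """In-memory cache for results of deterministic tool calls."""

    def __init__(self):
        self.store = {}

    def get_or_compute(self, tool, args, compute):
        key = cache_key(tool, args)
        if key not in self.store:
            self.store[key] = compute(args)  # runs only on a cache miss
        return self.store[key]
```

Calls whose results depend on time, user state, or model sampling would need additional key fields or should bypass a cache like this.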

About Rohit Bhardwaj

Rohit Bhardwaj is a Director of Architecture at Salesforce. He has extensive experience architecting multi-tenant, cloud-native solutions built on resilient microservices and service-oriented architectures using the AWS stack. He also has a proven ability to design solutions and to execute and deliver transformational programs that reduce costs and increase efficiency.

As a trusted advisor, leader, and collaborator, Rohit applies problem-resolution, analytical, and operational skills to every initiative, and develops strategic requirements and solution analysis through all stages of the project life cycle, from product readiness to execution.

Rohit excels in designing scalable cloud microservice architectures with Spring Boot and Netflix OSS technologies on AWS and Google Cloud. As a Security Ninja, Rohit looks for ways to resolve application security vulnerabilities using ethical hacking and threat modeling. Rohit is excited about architecting cloud solutions with Docker, Redis, NGINX, RightScale, RabbitMQ, Apigee, Azul Zing, Actuate BIRT reporting, Chef, Splunk, Rest-Assured, SoapUI, Dynatrace, and EnterpriseDB. In addition, Rohit has developed lambda architecture solutions using Apache Spark, Cassandra, and Camel for real-time analytics and integration projects.

Rohit holds an MBA in Corporate Entrepreneurship from Babson College and a Master's in Computer Science from Boston University and Harvard University. Rohit is a regular speaker at No Fluff Just Stuff, UberConf, RichWeb, GIDS, and other international conferences.

Rohit loves to connect at http://www.productivecloudinnovation.com, on LinkedIn at http://linkedin.com/in/rohit-bhardwaj-cloud, or on Twitter at rbhardwaj1.
