Why Hotel API Integration is So Hard? (3) Rate Limiting Nightmare: Blocked 3 Times on Day 1
The third nightmare of hotel API integration: rate limiting. Five suppliers, five different sets of rate-limiting rules — QPS limits, time windows, hot spot detection... We got blocked three times on day 1, and the search feature became completely unavailable.
This is part 3 of the “Why Hotel API Integration is So Hard?” series.
Part 1: Authentication Hell | Part 2: Data Chaos | Part 4: Error Handling (Coming Soon)
Has Your Search Feature Been Blocked by a Supplier on Day 1?
Real Scenario
Your hotel search feature is live!
Day 1:
- 9:00 AM: Everything works fine, search response 300ms
- 10:00 AM: User traffic increases, search QPS reaches 100
- 10:05 AM: Supplier A starts returning 429 Too Many Requests
- 10:10 AM: Supplier B returns 403 Forbidden
- 10:15 AM: Supplier C returns 429 Too Many Requests
User complaints:
- “Why can’t I find XX hotel?”
- “Why is the search always spinning?”
- “Why does it show error?”
You open the supplier backend and see this:
Rate Limit Status: EXCEEDED
Next Reset Time: 2026-02-06 16:00:00 UTC (3 hours later)
Your API Key has been temporarily suspended.
The Problem
You integrated only 5 suppliers, yet got blocked 3 times on day 1.
- Why?
- What are the supplier’s rate limiting rules?
- Why are some hotels never searchable?
Pain Point 1: Different Rate Limiting Rules per Supplier
5 Suppliers, 5 Different Rate Limiting Rules
| Supplier | Request Limit | Time Window | Rate Limiting Strategy | What happens when exceeded |
|---|---|---|---|---|
| A | 10 | Per second | Fixed window | 429 error |
| B | 100 | Per minute | Sliding window | 403 error |
| C | 1000 | Per hour | Token bucket | 429 error |
| D | 500 | Per minute | Sliding window | API Key suspended |
| E | No published limit | N/A | N/A | Backend times out under load |
Differences in Rate Limiting Strategies
Fixed Window (Supplier A)
Time window: Per second
Limit: 10 requests/second
Problem: the counter resets at each window boundary, which permits bursts:
- ❌ t = 0.99s (end of window 1): send 10 requests → OK
- ❌ t = 1.01s (start of window 2): send 10 more requests → OK
- ❌ Net effect: 20 requests within ~0.02s → sudden burst traffic
Supplier’s perspective: you technically respected the limit, but the backend still absorbed a burst.
User’s perspective: why is the search sometimes fast, sometimes slow?
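The boundary problem is easiest to see in code. Below is a minimal client-side sketch of a fixed-window counter; the class name and parameters are illustrative (modeled on Supplier A's 10 requests/second), not a real supplier API:

```python
import time

class FixedWindowLimiter:
    """Minimal fixed-window counter (illustrative sketch)."""

    def __init__(self, limit=10, window=1.0):
        self.limit = limit          # max requests per window
        self.window = window        # window length in seconds
        self.window_start = 0.0
        self.count = 0

    def allow(self, now=None):
        now = time.time() if now is None else now
        # Crossing a window boundary resets the counter to zero.
        # This reset is exactly what permits 10 requests at t=0.99s
        # and 10 more at t=1.01s: 20 requests in ~0.02s.
        if now - self.window_start >= self.window:
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```

Simulating the boundary burst: 10 calls at t=0.99s all pass, an 11th in the same window is rejected, and 10 more at t=1.01s pass again — the supplier's backend just saw 20 requests in about 20 milliseconds.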
Sliding Window (Supplier B)
Time window: Per minute
Limit: 100 requests/minute
More precise, but more complex:
- Record timestamp for each request
- Count requests within the past 60 seconds
- If exceed 100, reject
Problem:
- ⚠️ Need to maintain request timestamp list
- ⚠️ High memory usage (if QPS is high)
- ⚠️ Complex to implement
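The three bullets above translate almost directly into code. Here is a sketch of a sliding-window log, assuming a simple in-memory deque of timestamps (all names are illustrative):

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Minimal sliding-window log (illustrative), e.g. 100 requests / 60 s."""

    def __init__(self, limit=100, window=60.0):
        self.limit = limit
        self.window = window
        self.timestamps = deque()   # one timestamp per accepted request

    def allow(self, now=None):
        now = time.time() if now is None else now
        # Evict timestamps older than the window -- this list is the
        # memory cost the bullets warn about at high QPS.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

Unlike the fixed window, there is no boundary reset: any 60-second span contains at most `limit` requests, at the cost of storing one timestamp per accepted request.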
Token Bucket (Supplier C)
Time window: Per hour
Limit: 1000 requests/hour
Token refill rate: ~16.7 tokens/minute (≈0.28 tokens/second)
Bucket capacity: 1000 tokens
Working principle:
- Tokens trickle into the bucket at the refill rate
- Each request consumes 1 token
- If the bucket is empty, the request is rejected
Problem:
- ⚠️ Burst traffic: If bucket is full, can send 1000 requests instantly
- ⚠️ Cold start: Bucket may be empty initially
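Both problems — the full-bucket burst and the cold start — fall out of a few lines of code. A sketch of a token bucket (illustrative names; the `now` parameter exists only to make the example deterministic):

```python
import time

class TokenBucketLimiter:
    """Minimal token bucket (illustrative), e.g. 1000 tokens refilled per hour."""

    def __init__(self, capacity=1000, refill_per_sec=1000 / 3600.0, now=None):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)   # starts full -> bursts are possible
        self.last_refill = time.time() if now is None else now

    def allow(self, now=None):
        now = time.time() if now is None else now
        # Refill proportionally to elapsed time, capped at capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_sec)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With a full bucket, `capacity` requests can pass back-to-back — the burst problem; initialize `self.tokens = 0.0` instead and you get the cold-start problem.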
API Key Suspension (Supplier D)
- First exceed: API Key suspended for 1 hour
- Second exceed: suspended for 24 hours
- Third exceed: API Key permanently disabled
Problem:
- ❌ A single exceed takes you offline for a full hour
- ❌ There is no fast recovery path
- ❌ Your search feature becomes completely unavailable
Pain Point 2: Hot Hotels Have Stricter Rate Limits
Real Scenario
Your users are searching for London’s hot hotel:
- London Central Hotel (hot, 100x search traffic of ordinary hotels)
- London Budget Hotel (cold, very low search traffic)
Supplier’s rate limiting rules:
- Global limit: 100 requests/minute
- Per-hotel limit: 10 requests/minute
The Problem
First user searches:
- London Central Hotel → Search successful
- London Budget Hotel → Search successful
The 100th user that minute searches:
- London Central Hotel → 429 Too Many Requests
- London Budget Hotel → Search successful
User complaints:
- “Why can’t I find London Central Hotel?”
- “Why can I find it sometimes but not others?”
The Importance of Hot Spot Detection
Supplier’s perspective:
- Hot hotels: Too much search traffic, protect supplier’s backend
- Per-hotel rate limiting: prevent one hot hotel from dragging down the entire system
Your problem:
- You don’t know which are hot hotels
- You don’t know each supplier’s single hotel rate limiting rules
- You don’t know how to avoid hot spots
Pain Point 3: Burst Traffic Blows Through the Limits
Real Scenario
Your search feature is stable under normal conditions:
- Average QPS: 50
- Response time: 300ms
- Not rate limited
One morning at 10 AM:
- An influencer posted a recommendation for your website
- Suddenly 10000 users flood in
- Instant QPS reaches 500
Result:
- Supplier A: 429 Too Many Requests
- Supplier B: 403 Forbidden
- Supplier C: 429 Too Many Requests
- Supplier D: API Key suspended
- Supplier E: Timeout (backend overloaded)
User complaints:
- “Why is the website so slow?”
- “Why does it keep showing errors?”
- “I can’t use it!”
The Problem
Your system design assumes average QPS is 50, but burst traffic is 500.
- What should you do?
- Estimate traffic in advance?
- Implement caching?
- Queue management?
Pain Point 4: Caching Causes Stale Prices
Real Scenario
You implemented caching:
- Search results cached for 5 minutes
- User searches the same query, return directly from cache
Effect:
- Response time: 300ms → 50ms
- QPS: 500 → 50
- Supplier API calls reduced by 90%
But…
Suddenly the price of a certain hot hotel changes:
- User A searches: $150 (cached)
- User B searches: $150 (cached)
- User C searches: $150 (cached)
- Supplier price changes: $180
User D searches:
- The cache entry has expired
- Call supplier API
- Price: $180
User complaints:
- “Why did the price change?”
- “My friend saw $150, why do I see $180?”
The Problem
If you don’t cache:
- Slow response time (300ms)
- High QPS (500)
- Easily rate limited
If you cache:
- Fast response time (50ms)
- Low QPS (50)
- But inaccurate prices
How do you balance the two?
Pain Point 5: Concurrent Multi-Supplier Calls Multiply QPS
Real Scenario
Your search feature calls 5 suppliers concurrently:
```python
async def search_hotel(hotel_id):
    # Fan out to all suppliers concurrently
    tasks = [
        asyncio.create_task(supplier.search(hotel_id))
        for supplier in suppliers
    ]
    return await asyncio.gather(*tasks)
```
Single user search:
- Concurrently call 5 suppliers
- Each supplier receives 1 request
- QPS: 5
100 users search simultaneously:
- 500 concurrent supplier calls (5 suppliers × 100 users)
- Each supplier receives 100 requests
- QPS per supplier: 100
500 users search simultaneously:
- 2500 concurrent supplier calls (5 suppliers × 500 users)
- Each supplier receives 500 requests
- QPS per supplier: 500
Supplier A’s rate limiting rule:
- 10 QPS (per second)
Result:
- 100 users search simultaneously: Exceeded (100 > 10)
- 500 users search simultaneously: Exceeded (500 > 10)
The Problem
Your user traffic grows fast, but you’re rate limited.
- Should you limit user concurrency?
- Should you implement queuing?
- Should you cache?
Our Solution
Core Idea: Intelligent Rate Limiting Management
Problem: 5 suppliers, 5 different rate limiting rules, easy to exceed.
Solution: Intelligent rate limiting management + hot spot detection + multi-level caching + queue management
User Request → Queue Management → Intelligent Rate Limiting → Hot Spot Detection → Multi-level Cache → Supplier API
1. Intelligent Rate Limiting Management
Automatically Identify Rate Limiting Rules
```python
class RateLimiter:
    def __init__(self, supplier):
        self.supplier = supplier
        self.rules = self._detect_rate_limit_rules()

    def _detect_rate_limit_rules(self):
        # 1. Read the documented rate limiting rules
        docs_rules = self.supplier.rate_limit_docs
        # 2. Probe the actual limits (progressive testing)
        test_rules = self._test_rate_limits()
        # 3. Keep the strictest of the two
        return self._merge_rules(docs_rules, test_rules)

    def _test_rate_limits(self):
        # Progressive probing: start at a low QPS, double until rejected
        qps = 1
        max_qps = 1000
        while qps <= max_qps:
            if not self._test_qps(qps):
                break
            qps *= 2
        return {
            "qps_limit": qps // 2,  # last QPS that still succeeded
            "time_window": self._detect_time_window(),
            "strategy": self._detect_strategy(),
        }
```
Dynamically Adjust Request Rate
```python
_rate_limiters = {}  # one shared RateLimiter per supplier, not one per request

async def request_with_rate_limit(supplier, request):
    if supplier not in _rate_limiters:
        _rate_limiters[supplier] = RateLimiter(supplier)
    rate_limiter = _rate_limiters[supplier]
    # 1. Check the current request rate
    current_qps = rate_limiter.get_current_qps()
    max_qps = rate_limiter.get_max_qps()
    # 2. Back off when approaching the limit (80% threshold)
    if current_qps > max_qps * 0.8:
        await asyncio.sleep(1.0 / (max_qps * 0.2))
    # 3. Send the request
    response = await supplier.request(request)
    # 4. If rate limited anyway, honor Retry-After and retry
    if response.status_code == 429:
        retry_after = int(response.headers.get("Retry-After", 60))
        await asyncio.sleep(retry_after)
        return await request_with_rate_limit(supplier, request)
    return response
```
2. Hot Spot Detection
Real-time Identify Hot Hotels
```python
class HotSpotDetector:
    def __init__(self):
        self.request_counts = {}  # hotel_id -> (count, last_timestamp)
        self.hotspots = set()

    def record_request(self, hotel_id):
        now = time.time()
        # 1. Record the request
        count, _ = self.request_counts.get(hotel_id, (0, now))
        self.request_counts[hotel_id] = (count + 1, now)
        # 2. Drop data older than 1 minute
        self._cleanup_expired_data(now - 60)
        # 3. Re-detect hot spots
        self._detect_hotspots()

    def _detect_hotspots(self):
        if not self.request_counts:
            return
        # 1. Per-hotel request rate over the last 60-second window
        hotel_qps = {
            hotel_id: count / 60.0  # requests per second
            for hotel_id, (count, _) in self.request_counts.items()
        }
        # 2. A hot spot is anything above 10x the average rate
        avg_qps = sum(hotel_qps.values()) / len(hotel_qps)
        self.hotspots = {
            hotel_id for hotel_id, qps in hotel_qps.items()
            if qps > avg_qps * 10
        }

    def is_hotspot(self, hotel_id):
        return hotel_id in self.hotspots

    def _cleanup_expired_data(self, cutoff):
        self.request_counts = {
            h: (c, t) for h, (c, t) in self.request_counts.items()
            if t >= cutoff
        }
```
Rate Limiting Strategy for Hot Hotels
```python
hotspot_detector = HotSpotDetector()  # shared instance -- a fresh one per
                                      # request would never flag anything

async def search_hotel_with_hotspot_protection(hotel_id):
    # 1. Record the request
    hotspot_detector.record_request(hotel_id)
    # 2. Throttle hot hotels
    if hotspot_detector.is_hotspot(hotel_id):
        await asyncio.sleep(1.0)  # at most ~1 request/second for hot hotels
    # 3. Search the hotel
    return await search_hotel(hotel_id)
```
3. Multi-level Caching
Cache Hierarchy
L1: Local memory cache (hot data, second-level TTL)
↓ miss
L2: Redis cache (warm data, minute-level TTL)
↓ miss
L3: Supplier API (real-time data)
Adaptive TTL
```python
def adaptive_ttl(hotel_id, check_in_date):
    # 1. Base TTL: 1 minute
    base_ttl = 60  # seconds
    # 2. The closer the check-in date, the more volatile the price,
    #    so the shorter the TTL
    days_until = (check_in_date - datetime.now()).days
    time_factor = min(1.0, days_until / 30.0)
    # 3. Hot hotels get their TTL halved
    #    (uses the shared HotSpotDetector, not a fresh instance)
    if hotspot_detector.is_hotspot(hotel_id):
        hotspot_factor = 0.5
    else:
        hotspot_factor = 1.0
    # 4. Final TTL, floored at 1 second
    return max(1, int(base_ttl * time_factor * hotspot_factor))
```
Cache Implementation
```python
async def search_hotel_with_cache(hotel_id, check_in_date):
    cache_key = f"hotel:{hotel_id}:{check_in_date}"
    # 1. L1: local in-process cache
    if cache_key in local_cache:
        return local_cache[cache_key]
    # 2. L2: Redis cache
    cached = await redis.get(cache_key)
    if cached:
        data = json.loads(cached)
        local_cache[cache_key] = data  # promote to L1
        return data
    # 3. L3: supplier API
    result = await search_hotel_from_supplier(hotel_id)
    # 4. Write back to both cache levels
    ttl = adaptive_ttl(hotel_id, check_in_date)
    await redis.setex(cache_key, ttl, json.dumps(result))
    local_cache[cache_key] = result
    return result
```
4. Queue Management
Request Queue
```python
class RequestQueue:
    def __init__(self, max_concurrent=10):
        self.queue = asyncio.Queue()
        self.max_concurrent = max_concurrent
        self.semaphore = asyncio.Semaphore(max_concurrent)
        self.workers = []

    async def enqueue(self, request):
        await self.queue.put(request)

    def qsize(self):
        return self.queue.qsize()

    async def start_workers(self):
        for i in range(self.max_concurrent):
            self.workers.append(asyncio.create_task(self._worker(i)))

    async def _worker(self, worker_id):
        while True:
            request = await self.queue.get()
            async with self.semaphore:
                try:
                    result = await process_request(request)
                    request.complete(result)
                except Exception as e:
                    request.fail(e)
                finally:
                    self.queue.task_done()

    async def stop(self):
        for worker in self.workers:
            worker.cancel()
```
Burst Traffic Protection
```python
async def search_hotel_with_queue_protection(hotel_id):
    # 1. Reject new requests when the queue is too long
    if request_queue.qsize() > 1000:
        raise TooManyRequests("Please try again later")
    # 2. Enqueue the request; a worker will resolve the future
    future = asyncio.get_running_loop().create_future()
    await request_queue.enqueue(SearchRequest(
        hotel_id=hotel_id,
        future=future,
    ))
    # 3. Wait for the result
    return await future
```
5. Graceful Degradation
```python
async def search_hotel_with_fallback(hotel_id):
    # 1. Try the primary supplier
    try:
        return await search_hotel_from_supplier(supplier_a, hotel_id)
    except RateLimitError:
        pass  # rate limited, fall through to the backup
    # 2. Try the backup supplier
    try:
        return await search_hotel_from_supplier(supplier_b, hotel_id)
    except RateLimitError:
        pass  # backup also rate limited, fall back to cache
    # 3. Serve from cache (possibly stale)
    cached = await redis.get(f"hotel:{hotel_id}")
    if cached:
        return json.loads(cached)
    # 4. Nothing left to try
    raise ServiceUnavailable("All suppliers are rate limited")
```
Our Advantages
1. Intelligent Rate Limiting Management
- Automatically discovers each supplier’s rate limiting rules
- Dynamically adjusts the request rate
- Stays below the limits instead of hitting them
- Automatically retries rate-limited requests
2. Hot Spot Detection
- Identifies hot hotels in real time
- Automatically reduces request frequency for hot hotels
- Avoids per-hotel rate limits
3. Multi-level Caching
- L1 local cache + L2 Redis cache
- Adaptive TTL (shorter for hot hotels)
- Cuts supplier API calls by ~90%
4. Queue Management
- A request queue absorbs burst traffic
- Concurrency limits protect suppliers
- Graceful degradation (backup supplier → cache)
5. Comprehensive Monitoring
- Per-supplier QPS in real time
- Rate limit hit counts
- Cache hit rate
- Queue length
Call to Action
Has Your Search Feature Been Blocked on Day 1?
Problems:
- 5 suppliers, 5 different rate limiting rules
- Hot hotels have stricter rate limiting
- Burst traffic blows through the limits
- Concurrent multi-supplier calls multiply QPS
- Caching causes stale prices
Our Solution:
- Intelligent rate limiting management (auto-detect rules, dynamically adjust rate)
- Hot spot detection (identify hot hotels in real time, reduce their request frequency)
- Multi-level caching (L1 + L2, cuts ~90% of API calls)
- Queue management (absorbs burst traffic, graceful degradation)
You Only Need:
```go
import "github.com/hotelbyte-com/sdk-go/hotelbyte"

client := hotelbyte.NewClient("YOUR_API_KEY")

// Search hotels (rate limiting, hot spots, and caching handled automatically)
result, err := client.SearchHotels(&hotelbyte.SearchRequest{
    Destination: "London",
    CheckIn:     time.Date(2026, 2, 10, 0, 0, 0, 0, time.Local),
    CheckOut:    time.Date(2026, 2, 12, 0, 0, 0, 0, time.Local),
    Guests:      2,
})
if err != nil {
    log.Fatal(err)
}

// We automatically handle:
// - Rate limiting management
// - Hot spot detection
// - Multi-level caching
// - Queue management
// - Graceful degradation
for _, hotel := range result.Hotels {
    fmt.Printf("%s: $%.2f\n", hotel.Name, hotel.TotalPrice)
}
```
No rate limiting nightmare. No 429 errors. Only stable search.
Next Steps
Free Trial
- 30 days free trial
- No credit card required
- Start testing immediately
View Documentation
- API docs: openapi.hotelbyte.com
- SDK docs: docs.hotelbyte.com
- Best practices: blog.hotelbyte.com
Contact Us
Have questions? Contact our engineers directly.
Series Navigation
- Part 1: Authentication Hell
- Part 2: Data Chaos
- Part 3: Rate Limiting Nightmare (This Article)
- Part 4: Error Handling (Coming Soon)
Next article preview: Error Handling - Same error, 5 different HTTP status codes and error messages
Reading time: ~15 minutes · Difficulty: Medium (requires understanding of API rate limiting, caching, and queue management)