2 years ago

#76420

Distdev

ECS t3.micro Bottlerocket performance degradation

We have an ECS service (2 tasks) running on Bottlerocket on 2 t3.micro instances (1 task per instance). It's a PHP app on Apache. Under low load (1 request per second) the average response time (reported as target response time by the ALB) is around 150 ms; the app makes a few network calls (to ElastiCache, SNS, DynamoDB, etc.). However, when we increase the load for a test to about 20 requests per second, performance degrades: the average response time is about 1 second. Profiling the PHP app shows that most of the time is spent in network calls (e.g. a call to DynamoDB now takes around 170 ms).

My first thought was that we were hitting the open files limit, but increasing the nofile ulimit didn't help much: it improved things by only 10-15%. Switching the network mode from bridge to awsvpc didn't help at all.

Then I compared simple curl timings.

Low load:

# time curl -w "@/curl.txt" -o /dev/null -s "https://dynamodb.eu-west-1.amazonaws.com"
     time_namelookup:  0.001612s
        time_connect:  0.002452s
     time_appconnect:  0.021202s
    time_pretransfer:  0.021243s
       time_redirect:  0.000000s
  time_starttransfer:  0.022665s
          time_total:  0.022717s
                     ----------

real    0m0.040s
user    0m0.018s
sys     0m0.007s

vs. increased load:

# time curl -w "@/curl.txt" -o /dev/null -s "https://dynamodb.eu-west-1.amazonaws.com"
     time_namelookup:  0.007911s
        time_connect:  0.008654s
     time_appconnect:  0.106077s
    time_pretransfer:  0.106182s
       time_redirect:  0.000000s
  time_starttransfer:  0.111556s
          time_total:  0.111619s
                     ----------

real    0m0.233s
user    0m0.032s
sys     0m0.000s
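
To see which phase of the request actually grows, the cumulative curl timings can be split into per-phase durations. This is only a rough sketch: `phase_breakdown` is a hypothetical helper name, and it assumes the `-w` output has the same `label:  value s` layout as the output shown above.

```shell
#!/usr/bin/env bash
# Split cumulative curl -w timings into per-phase durations (in seconds).
# Reads lines such as "time_connect:  0.002452s" on stdin.
# phase_breakdown is a hypothetical helper, not part of curl.
phase_breakdown() {
  awk '
    /time_[a-z]+:/ {
      # Field 1 is the label with a trailing colon,
      # field 2 is the value with a trailing "s".
      key = $1; sub(/:$/, "", key)
      val = $2; sub(/s$/, "", val)
      t[key] = val + 0
    }
    END {
      printf "dns=%.6f\n",  t["time_namelookup"]
      printf "tcp=%.6f\n",  t["time_connect"] - t["time_namelookup"]
      printf "tls=%.6f\n",  t["time_appconnect"] - t["time_connect"]
      printf "wait=%.6f\n", t["time_starttransfer"] - t["time_pretransfer"]
    }
  '
}
```

Usage would be something like `curl -w "@/curl.txt" -o /dev/null -s "https://dynamodb.eu-west-1.amazonaws.com" | phase_breakdown`. Applied to the two runs above, the TCP connect stays under 1 ms in both, while the TLS handshake (time_appconnect minus time_connect) goes from about 19 ms to about 97 ms.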

So now I think it's something related to CPU; however, CPU utilization doesn't go over 40% in the metrics (and, just in case, memory doesn't go over 50%).
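
One thing that might be worth checking alongside the utilization metric is CPU steal time, since t3.micro is a burstable instance type and throttling can show up as steal rather than as high utilization. Nothing in the measurements above confirms this; it is an assumption. A minimal sketch, assuming a shell with access to `/proc/stat` (for example from inside the task container); `steal_percent` and `sample_steal` are hypothetical helper names:

```shell
#!/usr/bin/env bash
# Share of CPU time spent in "steal" between two aggregate "cpu" lines
# from /proc/stat. Both function names are hypothetical helpers.
steal_percent() {
  # $1 and $2 are "cpu ..." lines sampled some seconds apart.
  printf '%s\n%s\n' "$1" "$2" | awk '
    {
      total = 0
      # Fields 2..9: user nice system idle iowait irq softirq steal
      for (i = 2; i <= 9; i++) total += $i
      tot[NR] = total
      st[NR]  = $9
    }
    END {
      dt = tot[2] - tot[1]
      if (dt <= 0) { print "0.0"; exit }
      printf "%.1f\n", 100 * (st[2] - st[1]) / dt
    }
  '
}

# Sample /proc/stat twice, N seconds apart (default 5), and print steal %.
sample_steal() {
  local before after
  before=$(grep '^cpu ' /proc/stat)
  sleep "${1:-5}"
  after=$(grep '^cpu ' /proc/stat)
  steal_percent "$before" "$after"
}
```

Running `sample_steal 10` during the 20 requests per second test and again at low load would show whether steal rises with load. The CloudWatch CPUCreditBalance metric for the instances is another place to look.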

Any suggestions on what to check next, or are we hitting some "known" limit?

amazon-web-services

amazon-ec2

amazon-ecs

0 Answers
