Pricing of Google App Engine Flexible env, a $500 lesson

node.js google-app-engine google-cloud-platform

31,760

Solution 1

After a lot of back and forth with Google, and hours of reading blogs and looking at reports, I've finally found an explanation for what happened. I will post it here with my suggestions so that other people do not fall victim to the same problem.

Note, this may seem obvious to some, but as a new GAE user, all of this was brand new to me.

In short, deploying to GAE with the command "$ gcloud app deploy" creates a new version and sets it as the default but, more importantly, it does NOT remove the previously deployed version.

More info about versions and instances can be found here: https://cloud.google.com/appengine/docs/standard/python/an-overview-of-app-engine

So in my case, without knowing it, I had created multiple versions of my simple node app. These versions are still running in case one needs to switch following an error. But these versions also require instances, and the default, unless stated in the app.yaml, is 2 instances.

Google says:

App Engine by default scales the number of instances running up and down to match the load, thus providing consistent performance for your app at all times while minimizing idle instances and thus reducing cost.

However, from my experience, this was not the case. As I said earlier, I pushed my node app with nodemon, which, it seems, was causing errors.

In the end, by following the tutorial and not shutting down the project, I had 4 versions, each with 2 instances, running full-time for 1.5 months, serving 0 requests and generating lots of error messages. It cost me $500.
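As a rough sanity check, the figures reported on this page are consistent with that count. The sketch below takes the 5,948.774 core hours from the October 2017 invoice quoted in the question and the $0.0526 per core-hour (Iowa) rate mentioned in the comments. It assumes one vCPU per instance; the variable names are only illustrative.

```python
# Sanity check of the October 2017 invoice quoted in the question.
# Assumptions: one vCPU per instance, and the $0.0526 per core-hour
# (Iowa) rate mentioned in the comments.
HOURS_IN_OCTOBER = 31 * 24      # 744 hours in the billing month
BILLED_CORE_HOURS = 5948.774    # "App Engine Flex Instance Core Hours"
RATE_PER_CORE_HOUR = 0.0526     # USD, assumed rate

# How many always-on, single-core instances does the bill imply?
implied_instances = BILLED_CORE_HOURS / HOURS_IN_OCTOBER

# Cost of one always-on instance vs. the billed core hours.
one_instance_cost = HOURS_IN_OCTOBER * RATE_PER_CORE_HOUR
billed_core_cost = BILLED_CORE_HOURS * RATE_PER_CORE_HOUR

print(f"implied instances: {implied_instances:.2f}")
print(f"one instance, full month: ${one_instance_cost:.2f}")
print(f"billed core hours: ${billed_core_cost:.2f}")
```

Under these assumptions the bill corresponds to roughly 8 instances running non-stop, which is what 4 versions with 2 instances each would produce. A single instance would have cost about $39 in core hours, close to the ~$40 originally expected.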

RECOMMENDATIONS IF YOU STILL WANT TO USE GAE FLEX ENV:

First and foremost, set up a billing budget and alerts so that you are not surprised by an expensive invoice that is automatically charged to your credit card: https://cloud.google.com/billing/docs/how-to/budgets
In a testing env, you most likely do not need multiple versions, so while deploying use the following command:
$ gcloud app deploy --version v1
Update your app.yaml to force only 1 instance with minimal resources:

runtime: nodejs
env: flex

# This sample incurs costs to run on the App Engine flexible environment.
# The settings below are to reduce costs during testing and are not appropriate
# for production use. For more information, see:
# https://cloud.google.com/appengine/docs/flexible/nodejs/configuring-your-app-with-app-yaml
manual_scaling:
  instances: 1
resources:
  cpu: 1
  memory_gb: 0.5
  disk_size_gb: 10

Set a daily spending limit

See this blog post for more info: https://medium.com/google-cloud/three-simple-steps-to-save-costs-when-prototyping-with-app-engine-flexible-environment-104fc6736495

I wish some of these steps had been included in the tutorial in order to protect those who are trying to learn and experiment, but they were not.

Google App Engine Flex env can be tricky if one does not know all these details. A friend pointed me to Heroku, which has both set pricing and Free/Hobby offers. I was able to quickly push a new node app there, and it worked like a charm! https://www.heroku.com/pricing

It "only" cost me $500 to learn this lesson, but I do hope this helps others looking at Google App Engine Flex Env.

Solution 2

If you want to reduce your GAE costs please do not use manual_scaling as suggested in this article or the accepted answer!

The beautiful thing about Google App Engine is that it can scale up and down to hundreds of machines within milliseconds based on demand. And you only pay for instances that are running.

To be able to optimize your costs you need to understand the different scaling options and instance types:

1. App Engine flex vs. standard:

The details about differences can be found here, but one important difference relevant for this question is:

[Standard is] Intended to run for free or at very low cost, where you pay only for what you need and when you need it. For example, your application can scale to 0 instances when there is no traffic.

2. Scaling Options:

Automatic scaling: Google will scale your app depending on demand and the configuration you provided.
Manual scaling: No scaling at all. GAE will run the exact number of instances you asked for, all the time (very misleading naming).
Basic scaling: It will scale up to the limit you set and will also scale down after a certain idle time.

3. Instance Types: There are 2 instance types, and they basically differ in the time it takes to spin up a new instance. F class instances (used in automatic scaling) can be created on demand within ~0.1 seconds, and B class instances (used in manual/basic scaling) within ~0.7 seconds.

Now that you understand the basics, let's go back to the accepted answer:

manual_scaling:
  instances: 1
resources:
  cpu: 1
  memory_gb: 0.5
  disk_size_gb: 10

This instructs GAE to run a custom instance class (more costly) all the time. Obviously this is not the cheapest option: a B1/F1 instance type (which has lower specs) could be used instead, and the instance is also running constantly.

The cheapest option is to turn off the instance when there is no traffic. If you don't mind the ~0.1 second spin-up time, you could go with this instead:

instance_class: F1
automatic_scaling:
  max_instances: 1  # you can adjust this as you wish
  min_instances: 0  # will scale to 0 when there is no traffic, so won't incur costs

This will fall within the free quotas Google provides, and it should not cost you anything if you don't have any real traffic.

PS: It's also highly recommended to set up a daily spending limit in case you forget something running or have some costly settings somewhere (daily spending limits are deprecated but will be available until July 24, 2021).

Solution 3

We had code deployed to GAE FE go absolutely nuts due to a cascading, exponential failure (bounced emails generated bounced-email emails, etc.) and we could NOT turn off the GAE instances that were bugged. After 4+ hours and 1M+ emails sent (Mailgun just would NOT let us disable the account; it said "Please wait up to 24 hours for the password change to go into effect", and revoking API keys did nothing), with the redis VM stopped, the DB down, and all the site's code reduced to a single "Down For Maintenance" static 503 page, the emails kept being sent.

I determined that GAE FE just simply does not end either docker VMs or Cloud Compute VMs (redis) that are under CPU load. Maybe never! Once we actually deleted the Compute VM (instead of "merely" stopping it), the emails instantly stopped.

But, our DB continued to get filled with "could not send email" notices for up to 2 more hours, despite the GAE app reporting 100% of the versions and instances to be "Stopped". I ended up having to change the Google Cloud SQL password.

We kept checking the bill, and the 7 rogue instances kept using up CPU, so we cancelled the card used on that account. The site did, in fact, go down when the bill was past due, but so did the rogue instances. We never were able to resolve the situation with GAE email support.

Update (30 Sep 2020): This is still the worst moment of my 22-year career!! An entire company of 15 crack genius devs couldn't figure out how to turn off GAE. We knew customers were receiving MILLIONS of emails when one of my devs couldn't access her GMail account. Couldn't unplug it, couldn't turn it off. It was quite a "Terminator" moment!

It wouldn't have been nearly so bad, except for expenses, if MailGun had allowed us to actually disable the API access or change the password. But it would have still been bad expense-wise on GAE.

I no longer trust servers I can't issue a reboot on.

In the end, MailGun only charged us about $50. GAE, however... If I had just assumed "OK, mails stopped, we can stop", we could have ended up with a $20,000 excess bill! As it was, it "only" cost $1,500. And we never could get in contact with anyone to dispute it. So the CEO just ate it.

Solution 4

Also note that if you still want your app to have automatic scaling but you don't want the default minimum of 2 instances running at all times, you can configure your app.yaml like so:

runtime: nodejs
env: flex
automatic_scaling:
  min_num_instances: 1

Solution 5

Since no one mentioned them, here are the gcloud commands for managing versions:

# List all versions
$ gcloud app versions list

SERVICE  VERSION.ID       TRAFFIC_SPLIT  LAST_DEPLOYED              SERVING_STATUS
default  20200620t174631  0.00           2020-06-20T17:46:56+03:00  SERVING
default  20200620t174746  0.00           2020-06-20T17:48:12+03:00  SERVING
default  prod             1.00           2020-06-20T17:54:51+03:00  SERVING

# Delete these 2 versions (you can't delete all versions, you have to have at least one remaining)
$ gcloud app versions delete 20200620t174631 20200620t174746

# Help
$ gcloud app versions --help
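
To clean up in bulk, one could parse the listing and collect every version that receives no traffic. The sketch below is a hypothetical helper (the name `idle_versions` is mine, not part of gcloud). It assumes a single service and the tabular output format shown above, and it only prints the delete command rather than running it.

```python
# Hypothetical helper: pick versions with no traffic from the table
# printed by `gcloud app versions list` (format assumed as shown above).
SAMPLE_OUTPUT = """\
SERVICE  VERSION.ID       TRAFFIC_SPLIT  LAST_DEPLOYED              SERVING_STATUS
default  20200620t174631  0.00           2020-06-20T17:46:56+03:00  SERVING
default  20200620t174746  0.00           2020-06-20T17:48:12+03:00  SERVING
default  prod             1.00           2020-06-20T17:54:51+03:00  SERVING
"""


def idle_versions(listing: str) -> list:
    """Return the IDs of versions whose TRAFFIC_SPLIT is zero."""
    rows = [line.split() for line in listing.strip().splitlines()]
    header, body = rows[0], rows[1:]
    id_col = header.index("VERSION.ID")
    traffic_col = header.index("TRAFFIC_SPLIT")
    return [row[id_col] for row in body if float(row[traffic_col]) == 0.0]


to_delete = idle_versions(SAMPLE_OUTPUT)
# Print the command instead of executing it, so it can be reviewed first.
print("gcloud app versions delete " + " ".join(to_delete))
```

With the sample listing, this selects the two timestamped versions and leaves `prod`, which serves all the traffic, untouched.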


ddallala

Updated on December 29, 2021

Comments

ddallala over 2 years
I followed the Nodejs on App Engine Flexible env tutorial: https://cloud.google.com/nodejs/getting-started/hello-world

Having successfully deployed and tested the tutorial, I changed the code to experiment a little and successfully deployed it... and then left it running since this was a testing environment (not public).

A month later, I received a bill from Google for over $370!

In the transaction details I see the following:

Oct 1 – 31, 2017 App Engine Flex Instance RAM: 5948.774 Gibibyte-hours ([MYPROJECT]) $42.24

Oct 1 – 31, 2017 App Engine Flex Instance Core Hours: 5948.774 Hours ([MYPROJECT]) $312.91

How did this testing environment with almost 0 requests require about 6,000 hours of resources? In the worst case, I would have assumed that 720 hrs running full-time for a month @ $0.05 per hour would cost me ~$40. https://cloud.google.com/appengine/pricing

Can someone help shed light on this? I have not been able to find out why so many resources were needed.

Thanks for the help!

For more data, this is the traffic over the last month (basically 0):

And instance data

UPDATE: Note that I did make one modification to the package.json: I added nodemon as a dependency and added it as part of my "npm start" script. Though I doubt this explains the 6000 hours of resources:
```
  "scripts": {
    "deploy": "gcloud app deploy",
    "start": "nodemon app.js",
    "dev": "nodemon app js",
    "lint": "samples lint",
    "pretest": "npm run lint",
    "system-test": "samples test app",
    "test": "npm run system-test",
    "e2e-test": "samples test deploy"
  },
```
app.yaml (default, no change from the tutorial)
```
runtime: nodejs
env: flex
```
- BrettJ over 6 years
  
  You should contact GCP support for help with billing: support.google.com/cloud/contact/cloud_platform_billing
- ddallala over 6 years
  
  Thanks for the response @BrettJ, I had already contacted them and this is what they told me: "As mentioned, we do not have any capability to view the detailed report of the usage that's why I provided the links so you can post as well on the community forum and again there will be experienced developers can help you with your technical questions."
- Dan Cornilescu over 6 years
  
  Your expectations appear to be based on standard env pricing (and only a B1 class instance). But you're using the flex env, which has different pricing. Check your app.yaml for the CPU and GB of memory configs: those are your per-instance-hour multipliers. Then you multiply by 2, the number of instances you had running.
- ddallala over 6 years
  
  Hi @DanCornilescu pricing is still at ~ $0.05 even for flex envs ... vCPU per core hour $0.0526 (Iowa). I pasted my app.yaml ... in short, didn't modify it from the tutorial.
- Dan Cornilescu over 6 years
  
  OK, now you have better datapoints to communicate to GCP billing support.
- ddallala over 6 years
  
  I've provided an answer below to what happened, hope this helps others
- Nano Miratus over 5 years
  
  Hi. The same happened to me, with $400. I contacted support and they are trying to get me a "one time courtesy adjustment". So there's hope. Does anyone know something about that? Does it work? What exactly is it?
Drazen Bjelovuk over 6 years

Google really seems to have the market cornered on lousy documentation. It's unfortunate that you were slapped with a $500 bill, but you've taken the bullet for many others I'm sure by offering your insights, so much appreciated!
DeividasV about 6 years

another possibility "gcloud app deploy app.yaml --stop-previous-version"
Kartik over 5 years

Thanks, very helpful. Billing alerts/limits are a must. Faced a similar issue just recently
Dominic about 5 years

I think you mean max_num_instances?
Theodore R. Smith about 5 years

There is definitely no option to limit instances. Spinning up 1,000 instances during a DDoS attack and billing the customer $1000s of dollars is a business strategy of GCP.
zardilior almost 5 years

@TheodoreR.Smith actually with max you can and also setting a daily limit
jon_wu over 4 years

@Dominic min_num_instances is correct here if you want to save money while idle at the cost of redundancy. @Theodore There's also max_num_instances to limit instances, but you can't set a daily spending limit on App Engine flexible (but you can on standard). You can however set up budgets and alerts.
yorbro over 4 years

You can't set min_instances to 0. Per the documentation: The minimum number of instances given to your service. When a service is deployed, it is given this many instances and scales according to traffic. Must be 1 or greater, default is 2 to reduce latency.
Caner over 4 years

@yorbro thanks for pointing that out, min_instances is for standard environment, the document you linked refers to different parameter min_num_instances which is for flex environment. I will update my answer to clearly reflect this.
yorbro over 4 years

Ah my bad. Thanks for the quick reply!
Caner over 4 years

this is definitely not the cheapest way, because it'd constantly run a single instance. please see my answer
Pete Nice over 4 years

In the documentation for min_instances it says: "Warning: For this feature to function properly, you must make sure that warmup requests are enabled and that your application handles warmup requests." Does this have to be enabled? What impact will it have on latency if this isn't implemented? I'm trying to reduce my running costs for an app that has about 600 users, so I'm trying to figure out what the best scaling settings are.
Caner over 4 years

that warning seems to be new, I haven't seen it before. That being said, don't know about the performance impact. details here: cloud.google.com/appengine/docs/standard/python/…
John Doe about 4 years

Can we potentially expect the same bad surprise with AppEngine standard env ? Or do the issues OP mentioned occur only in the flex env ?
Theodore R. Smith almost 4 years

Now that I've long ago left that company, I can tell you that the monthly bill was to the tune of some $5,000, normally about $300.
ingernet almost 4 years

I’ve used GCP and AWS for the last few years, and stories like this make me want to run screaming into the arms of AWS full time. The holes in GCP’s documentation and error checking are wretched - improving, but still wretched. It’s cheap for a reason. That said, I’m about to deploy an app to GAE, hold my beer
Theodore R. Smith almost 4 years

It's literally impossible to get in touch with anyone at Google if you have a SERIOUS Problem with GCP. We tried for months to contact them about gross instability issues. No go.
ingernet almost 4 years

I’ve had ok luck with their tech support, but my company also pays for a support account, soooo
SkrewEverything almost 4 years

@JohnDoe It only happens in Flex env. In standard env, if there are no requests, the instance will be shutdown. You can read it in the official docs -> scaling to zero row
Theodore R. Smith over 3 years

This is still the worst moment of my 22-year career!! An entire company of 15 crack genius devs couldn't figure out how to turn off GAE. We knew customers were receiving MILLIONS of emails when one of my devs couldn't access her GMail account. Couldn't unplug it
J W over 3 years

Daily spending limits have been deprecated ¯_(ツ)_/¯
John Balvin Arias over 3 years

what do you mean you couldn't turn off app engine? there is literally a button for it
Ankit Bindal over 3 years

Not all heroes wear a cape
Alex about 3 years

@SkrewEverything - You should not assume you're safe in standard env. Instance shutdown behavior depends on the app.yaml config. While the default config generally shuts down instances when there is no traffic, that does not mean you cannot end up in the exact same situation in standard env.
m4heshd about 2 years

Be careful. Heroku suspends accounts out of the blue for no reason and with no notification of suspension. People including myself learned that the hard way. They have an awful automated flagging system. They'll also never respond to your mails.
Theodore R. Smith about 2 years

1. Create a C app that counts forever, using 100% CPU. 2. Run this on GAE (at least in 2020, not sure now). 3. Delete the instance in GAE. It'll still charge you, because the instance isn't actually killed.