Recently I inherited a complex Google Cloud (GCP) App Engine configuration. App Engine is the managed web server platform, which does not require you to provision/configure/manage your own servers. Some call that serverless.
As I said, it was very complex, and some components (services) were probably not being used at all. The challenge was to turn that "probably" into a certainty, so I could feel safe cleaning up the unused parts (which reduces costs). In addition, a very large number of old versions of the services were still running, up to 64 in one case.
Monitoring for App Engine goes through the universal Stackdriver tool. You can use the Metrics Explorer to examine the metrics for a service. Response count is a pretty useful metric for determining whether a service is getting any HTTP requests. It counts all responses, whether HTTP 200 or any other status. Metrics Explorer can display graphs that look like this:
That is hard to read. Can you tell if version 20180912t175909 is getting any requests? With data this sparse, I can't. I would rather just see a list of responses and their timestamps. I wanted the data behind this graph, and for the full 90 days of retention that Stackdriver has available.
So, I set out on one of my quests. There was no way to do this built into the GCP web interface. In addition, I could not find documentation on how to get the data using the CLI (gcloud) or a Google-supported SDK. After a support request, I got a tiny foothold for building a Python script. However, it still took a lot more work on my part. Here is the script as it stands today:
This is example output:
servic1-dev, 20190617t190438, 200: 144, 304: 117, 404: 23,
servic1-prod, 20190617t193131, 200: 340, 204: 10, 304: 510, 404: 115,
servic2-staging, 20190617t192628, 200: 47, 304: 22, 404: 17, 204: 4,
servic3api-prod, 20190618t015039, 204: 313, 200: 217, 304: 145, 401: 25, 404: 3,
service4lytics, 20190923t184827, 200: 312, 304: 159, 204: 75, 302: 37, 404: 31,
, 20190319t164724, 200: 5,
service4lytics-staging, 20190923t184814, 304: 13, 200: 70, 204: 2, 302: 27, 404: 19,
default, 20180912t191450, 200: 66, 404: 56,
, 20180912t175909, 200: 7,
The first column is the service name and the second is the version. After that come the HTTP response codes, each followed by a colon and the number of occurrences of that code.
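To go from this output to a cleanup decision, it helps to total the responses per version. What follows is only a minimal sketch, not part of the script above. It assumes the report has one row per line, and it assumes that a blank service column means the row belongs to the same service as the row before it. The names parse_report and total_requests are made up for this illustration.

```python
def parse_report(text):
    """Parse 'service, version, code: count, ...' rows.

    Returns a dict keyed by (service, version) whose values map
    HTTP status code -> response count.
    """
    report = {}
    current_service = None
    for line in text.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) < 2 or not fields[1]:
            continue  # skip blank or malformed rows
        # Assumption: an empty service column repeats the previous service.
        service = fields[0] or current_service
        current_service = service
        version = fields[1]
        counts = {}
        for field in fields[2:]:
            if not field:
                continue  # the trailing comma leaves one empty field
            code, _, count = field.partition(":")
            counts[int(code.strip())] = int(count.strip())
        report[(service, version)] = counts
    return report


def total_requests(report):
    """Sum the counts of all status codes for each (service, version)."""
    return {key: sum(counts.values()) for key, counts in report.items()}


# Two rows copied from the example output above.
SAMPLE = """\
default, 20180912t191450, 200: 66, 404: 56,
, 20180912t175909, 200: 7,
"""

if __name__ == "__main__":
    totals = total_requests(parse_report(SAMPLE))
    for (service, version), total in totals.items():
        print(f"{service} {version}: {total} responses")
```

Versions whose total is zero, or that do not show up in the report at all, would be the first candidates to look at for cleanup. A low total on its own is only a hint that a version deserves a closer look.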