Profiling a Java application in Google App Engine Flex

There are many ways to profile a Java application's memory and CPU usage to find memory leaks or the block of code where the app spends most of its time. In Google App Engine Flex, the application code (JAR or WAR) runs inside a Docker container, which in turn runs inside a VM dedicated to a service/module. While Google Cloud Profiler can be used to analyse live applications, it only supports CPU profiling for Java on App Engine Flex. Memory profiling is supported for Java only on App Engine Standard, and for Python and Go on App Engine Flex. So, as of now, you have to capture and inspect the JVM heap manually.
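
For the CPU side, Cloud Profiler needs its Java agent attached to the JVM. A minimal sketch for a custom Flex runtime, assuming the agent is installed under /opt/cprof; the service name and version shown are placeholders for your own:

    # Download the Cloud Profiler Java agent into the image (e.g. from the Dockerfile)
    $ mkdir -p /opt/cprof && wget -q -O- https://storage.googleapis.com/cloud-profiler/java/latest/profiler_java_agent.tar.gz | tar xzv -C /opt/cprof

    # Attach the agent when starting the JVM (service name/version are illustrative)
    $ java -agentpath:/opt/cprof/profiler_java_agent.so=-cprof_service=default,-cprof_service_version=20190225t133842 -jar app.jar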

SSH to the AppEngine instance

If you have the Editor role, you can SSH into a specific App Engine Flex instance using the gcloud utility, or by clicking the SSH button next to the instance on the Instances page of the App Engine section of the GCP console.
  1.  SSH to the instance

    $ gcloud --project "demoneil" app instances ssh "aef-default-20190225t133842-tlmz" --service "default" --version "20190225t133842"
  2. Find the docker processes.


    $ docker ps
    CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
    d388874ebe58 asia.gcr.io/demoneil/appengine/default.20190225t133842@sha256:cad540ad23c872496cfc73721a096f27481b6360a7cfb7867796e627f6f2c25e "/docker-entrypoin..." 41 minutes ago Up 41 minutes 172.17.0.1:8080->8080/tcp gaeapp
    f01675bb33c8 gcr.io/google-appengine/api-proxy "/proxy" 41 minutes ago Up 41 minutes api
    8ef572057a10 gcr.io/google-appengine/nginx-proxy "/var/lib/nginx/bi..." 42 minutes ago Up 42 minutes 8080/tcp, 0.0.0.0:8443->8443/tcp, 8090/tcp, 0.0.0.0:10402->10402/tcp nginx_proxy
    69fce23569a2 gcr.io/google-appengine/iap-watcher "./start_iap_watch..." 42 minutes ago Up 42 minutes iap_watcher
    aca67e06d5ed gcr.io/google-appengine/fluentd-logger "/opt/google-fluen..." 42 minutes ago Up 42 minutes fluentd_logger

  3. Get a shell in the Docker container running your code. It usually has an image URL from gcr.io and is named "gaeapp", like the first one here.
    $ docker exec -it gaeapp /bin/bash

  4. Find the Java process and switch to the user running the Jetty server, which is usually "jetty".

    root@d388874ebe58:/var/lib/jetty# ps -aux

    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    jetty 1 0.1 11.0 2509224 113236 ? Ssl 08:13 0:05 java -showversion -Djava.io.tmpdir=/tmp/jetty -agentpath:/opt/cdbg/cdbg_java_agent.so=--log_dir=/var/log/app_engine,--alsologtostderr=tru
    root 187 0.0 0.3 18140 3268 ? Ss 09:01 0:00 /bin/bash
    root 191 0.0 0.2 36636 2816 ? R+ 09:02 0:00 ps -aux


    $ su jetty
  5. Create a directory to write the heap dump to, as the current directory may not have write access.

    mkdir dump_dir
    chmod 777 dump_dir
  6. Generate the heap dump for the process ID of the Java process found above (PID 1 in the ps output); an alternative using jcmd is sketched after this list.

    jmap -dump:live,format=b,file=dump_dir/mem.prof 1

  7. Exit the container and copy the dump file from within the container to the VM

    docker cp gaeapp:/var/lib/jetty/dump_dir/mem.prof .

  8. Exit the VM and copy the file from the VM to the local machine

    gcloud app instances scp --service default --version 20190225t133842 aef-default-20190225t133842-tlmz:/home/neil/mem.prof .
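
As an alternative to jmap in step 6, jcmd can produce the same kind of heap dump. A sketch, assuming the Java process is PID 1 as in the ps output above and that jcmd is available in the container's JDK:

    # Dumps only live objects (it triggers a full GC first); use an absolute path,
    # because the file is written by the target JVM, not by your shell
    $ jcmd 1 GC.heap_dump /var/lib/jetty/dump_dir/mem.prof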

Analyse the dump

There are several tools, such as VisualVM and YourKit, for analysing the heap dump file to find out where the memory issues are. I used heaphero.io, which is publicly hosted and good for small dump files.
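
For larger dumps it is usually more practical to load the file into a local tool. A sketch for VisualVM, assuming it is installed and on your PATH; the file name is the one used in the steps above:

    # Most heap-dump viewers expect the .hprof extension
    $ mv mem.prof mem.hprof
    # Open the dump in VisualVM's heap viewer
    $ visualvm --openfile mem.hprof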

Here is a sample report for the getting-started app.