Sentry
Sentry is our automated error tracking tool. New exceptions in the API cause Sentry to send us an email alert. We also audit existing errors when fixing tech debt or when silencing errors that are expected, i.e. unexceptional, such as when sub-systems return validation or not-found errors as 500s rather than 422s or 404s.
All of our classes are namespaced under a 'Mobile' Ruby module, so when auditing errors, a custom search for 'Mobile' filters out other vets-api errors.
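As a minimal sketch (the class and submodule names below are hypothetical), the convention looks like this, which is why every event our code raises can be matched with a 'Mobile' search:

```ruby
# Hypothetical class, shown only to illustrate the convention: everything is
# nested under the Mobile module, so the fully qualified class name that
# appears in Sentry events always contains 'Mobile'.
module Mobile
  module V0
    class ExampleService
      def call!
        raise StandardError, 'example failure'
      end
    end
  end
end

Mobile::V0::ExampleService.name # => "Mobile::V0::ExampleService"
```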
To further refine the search to only 'error'-level events, rather than 'warn' or 'info', toggle open the search builder bar.
Then select 'error' from the list.
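If you prefer typing the query directly, the two steps above are roughly equivalent to this raw search (assuming standard Sentry issue-search syntax, where bare text matches the event and `level:` filters by severity):

```
Mobile level:error
```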
Once you've found an error, or have been linked to one directly from an alert email, you'll be taken to the error details page.
The majority of our errors occur during HTTP responses, and they often first present themselves deep within the API framework's base classes.
Selecting the 'Full' tab reveals the full call stack and, as seen below, the true source of the error.
Once the location of the error has been found, the next step is determining the cause. Errors from bugs we have introduced (500s in our API responses) will have clear Ruby errors such as 'NoMethodError'. For errors from sub-systems, you'll need to check the 'ADDITIONAL DATA' section of the error details page.
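As a contrived sketch of the first category (the attribute name is made up), a nil that slips through our own code surfaces as a NoMethodError:

```ruby
# Hypothetical bug of our own making: a lookup quietly returns nil and the
# next call on it raises NoMethodError, which Sentry reports as a 500 in our
# API rather than as an upstream failure.
appointment = nil # e.g. a record lookup that found nothing
begin
  appointment.start_date
rescue NoMethodError => e
  puts e.message # e.g. "undefined method 'start_date' for nil"
end
```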
For the sub-system case, as seen below, the 'ADDITIONAL DATA' section includes the errors from the upstream service as well as any custom tags added for that specific service. In this case, 'ICN' and 'MHV Correlation ID' can be used to cross-reference Loki-based logs to get an idea of the full request flow and any user actions that led up to the error.
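As a sketch only, the cross-reference could be a LogQL line filter on the user's ICN; the `app` label and the placeholder value here are assumptions, not our actual Loki labels:

```
{app="vets-api"} |= "<user-icn>"
```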