Introduction¶

Background¶

The government contracting model and the sheer size of VA mean hundreds of teams have created or are creating APIs for the organization. As methodologies, architectures, and documentation standards vary from team to team, it's unsurprising that in the past the department provided fractured and inconsistent APIs to its consumers.

Purpose¶

The focus of the standards is the 'I' in API: the interface. This guide is not an application design guide or a programming manual. How teams develop an API behind that interface is up to them. The guide's main goal is to outline common interface and documentation standards for APIs deployed at VA.

This guide is not intended to be read front-to-back. That said, the first section, General Guidelines, applies to all APIs at VA, and implementing teams should read it in its entirety.

The remaining sections are more of a field manual or cookbook that is skimmed initially. Later, when presented with a problem whose solution is ambiguous, teams can reference this guide for VA's recommended solution or design pattern.

Conventions¶

Some items in the API standards are hard-and-fast rules; others are recommendations. RFC 2119 describes the keywords that signify the requirements in this guide. Those keywords appear in bold within their respective highlighted admonitions, as shown below:

Requirement

Things you must and must not do as an absolute requirement of publishing VA APIs.

Guidance

API best practices that are recommended and those that are not recommended.

Info

Additional information and tips.

General Guidelines¶

This section outlines the general guidelines for the architecture and documentation standards that all APIs at VA should follow.

Recommendations follow established industry API standards. While developing APIs for the Department of Veterans Affairs (VA) and the government presents unique challenges, API teams can leverage existing API guidelines refined in the open-source community and the private sector.

Not all consumers of VA APIs will be familiar with the VA API landscape. Following industry standards and best practices will ensure that VA APIs share a common interface and behave similarly to best-of-breed APIs outside VA.

Using these standards when developing reduces the research that teams typically do before starting and lets developers access the more in-depth, industry-standard documentation elsewhere when necessary.

In this section:

API-first: Create an API contract before beginning development.
Architecture: REST is the prescribed architectural style for distributed systems.
Data interchange: JSON is the recommended format for API request and response payloads.
Documentation: Document the API in a format both humans and computers understand.
Development process: Development approach for building APIs.
Development environments: Development environments for successful delivery of APIs.
Test data: Requirements and best practices for test data.
Performance & Availability: Expectations for APIs in the consumer integration and production environments.

API-first¶

Guidance

Providers should design their API before beginning implementation.

An API-first approach means that a team designs and iterates upon the API before beginning development and treats the API as a first-class product rather than just another component of the system. This approach:

Makes implementation go more smoothly
Reduces implementation time
Increases the chances implementation is correct by centering the API around the consumer
Does not lessen the importance of implementation

The resulting speed and accuracy benefit not only the providing team but also the consuming teams by allowing them to begin work earlier and in parallel when they have a design to build against. Mock response data can be a placeholder in specs and tests while the provider fleshes out the implementation details.

Finally, developers often neglect documentation in software projects. A fortunate side effect of the API-first approach is that your design is also documentation. Documentation will be complete before you start implementation and will require only minor tweaks along the way to reflect your API when development is complete.

Architecture¶

Requirement

APIs must be stateless
APIs must be cache compatible
APIs must be able to work as a component of a layered system

Guidance

APIs should use a RESTful architecture
SOAP is not recommended.

REST¶

REST is the most widely adopted architectural style for distributed systems. Much of its popularity comes from its consistency and the fact that its underlying interface is HTTP, the foundation of data communication for the web, which nearly all clients and servers understand.

REST provides a second layer of uniformity by placing 'Resources' and the manipulation of their state via identifiers at the center of client/server interactions. This reduces friction for those developing and implementing the API.

Endpoint paths for resources are one example of a REST convention that settles a question that would be ambiguous in other architectures. For example:

What's the path for a list of resources?

REST prescribes using the HTTP GET method and the plural name of the resource:

request

GET ../rx/v0/prescriptions

What if we then want to update a specific resource from that list?

Use the HTTP PUT method and append its identifier:

request

PUT ../rx/v0/prescriptions/e526c85d-29fc-432c-a16d-df5cdfce2a62
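
As a rough illustration of the convention above, the following Python sketch builds the method and path pairs for a collection and for a single resource. The helper name `resource_path` and the base path `/rx/v0` are assumptions made for this example; they are not part of the standards.

```python
from typing import Optional
from urllib.parse import quote


def resource_path(base: str, resource: str, identifier: Optional[str] = None) -> str:
    """Build a REST-style path: plural resource name, optionally followed by an identifier."""
    path = f"{base.rstrip('/')}/{resource}"
    if identifier is not None:
        # Percent-encode the identifier so it is safe as a single path segment.
        path = f"{path}/{quote(identifier, safe='')}"
    return path


# Listing a collection uses GET on the plural resource name.
list_request = ("GET", resource_path("/rx/v0", "prescriptions"))

# Updating one resource uses PUT on the collection path plus the resource identifier.
update_request = (
    "PUT",
    resource_path("/rx/v0", "prescriptions", "e526c85d-29fc-432c-a16d-df5cdfce2a62"),
)
```

Because the path is derived from the resource name and identifier alone, a consumer who knows one endpoint can usually predict the others.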

Constraints of a RESTful system¶

Roy Fielding designed REST to facilitate building distributed systems that are efficient, scalable, and reliable. His PhD dissertation, "Architectural Styles and the Design of Network-based Software Architectures", explains this in great detail. This guide does not reproduce that dissertation; however, an API is 'RESTful' if it meets the following constraints¹:

Client/server separation: There must be a clear separation of concerns when components like a web or mobile application and its back-end API server work together and communicate.
Statelessness: A request contains all the information needed for the server to execute it. The server does not store client context in a session between requests.
Cacheability: A response to a request must disclose if the client can store and reuse it and for how long.
Layered system: The client is only aware of the details of the server. It does not know or need to know about systems two or more layers down supporting the server.
Uniform interface: Resources and the manipulation of their state via standard HTTP methods guide the development of a consistent interface.

¹ An optional sixth constraint, 'Code on demand', in which the server transfers executable code to the client, must not be implemented in VA APIs due to security concerns.
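
To make the cacheability constraint more concrete, here is a minimal Python sketch of how a client might read what a response discloses about reuse. It is deliberately simplified: it looks only at the `no-store` and `max-age` directives of the standard `Cache-Control` header and ignores everything else HTTP caching defines. The function name `cache_lifetime_seconds` is hypothetical.

```python
from typing import Mapping, Optional


def cache_lifetime_seconds(headers: Mapping[str, str]) -> Optional[int]:
    """Return how long a response may be reused, or None if it should not be stored.

    Simplified: only the no-store and max-age directives are considered.
    """
    # Header names are case-insensitive, so normalize them before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    directives = [d.strip().lower() for d in normalized.get("cache-control", "").split(",")]

    if "no-store" in directives:
        return None
    for directive in directives:
        if directive.startswith("max-age="):
            value = directive.split("=", 1)[1]
            if value.isdigit():
                return int(value)
    # No explicit lifetime was disclosed; this sketch treats that as not reusable.
    return None
```

A response sent with `Cache-Control: public, max-age=300` would be reported as reusable for 300 seconds, while one sent with `no-store` would not be stored at all.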

Data Interchange¶

Guidance

VA APIs should interchange data serialized as JSON.
APIs may return binary data for files (images, PDFs, etc).
XML may be used when interfacing with legacy systems.
Binary data formats (ProtoBufs, Avro, Thrift) are not recommended for public-facing APIs.

JSON¶

Requirement

JSON must conform to the JSON Data Interchange Format described in RFC 7159
Request and response bodies must be valid against JSON Schema Draft v4 or higher.

As REST is the dominant web API architecture, JSON is the dominant web data-interchange format.

REST APIs revolve around resources. Resources are better described with JSON's structured data than with a markup language such as XML.

JSON's advantages over XML include:

Its data model is shared, most obviously with JavaScript, but also with any language that implements booleans, numbers, strings, arrays, and objects (or object-like data structures such as dictionaries or hashes).
It produces a smaller payload than XML.
In most languages, it compresses (gzips) faster.
It is more human-readable than XML.

Information models¶

Guidance

We recommend that APIs default to using JSON:API as an information model.
Industry-specific standards, such as FHIR for health data, may be used.
Information/hypermedia models such as OData, HAL, JSON-LD, and SIREN are not recommended.

An information model specifies a set of shared conventions for JSON data to ensure formatting is consistent across a set of APIs.

For example, even if APIs agree on using JSON, they could each still disagree on how they format and identify resources. Additionally, there are varied ways one could render related resources, metadata, errors, and design patterns such as pagination and filtering. With an information model in place, figuring out how to render errors or paginate a list is a matter of looking up the recommendation from the model’s specification.
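
As an illustration of what an information model settles, the Python sketch below renders a resource, a list of resources, and an error using the top-level `data` and `errors` members and the `type`, `id`, and `attributes` members that JSON:API defines. The helper names are hypothetical, and the sketch covers only a small part of the JSON:API specification.

```python
import json
from typing import Any, Dict, List


def to_resource(resource_type: str, resource_id: str, attributes: Dict[str, Any]) -> Dict[str, Any]:
    """Render one resource object: identity in type/id, everything else in attributes."""
    return {"type": resource_type, "id": resource_id, "attributes": attributes}


def to_collection(resources: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Render a list response; an empty result is an empty list, not an error."""
    return {"data": resources}


def to_errors(status: int, title: str, detail: str) -> Dict[str, Any]:
    """Render an error document. JSON:API expresses status as a string value."""
    return {"errors": [{"status": str(status), "title": title, "detail": detail}]}


pharmacies = to_collection([
    to_resource(
        "Pharmacy",
        "6e976911-2707-4018-a6db-0d1342326379",
        {"name": "Cheyenne VA Medical Center Pharmacy", "city": "Cheyenne", "state": "WY"},
    )
])
print(json.dumps(pharmacies, indent=2))
```

With helpers like these shared across endpoints, every list and every error has the same shape, which is the consistency the information model is meant to provide.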

Documentation¶

Requirement

API teams must add their API to the CODE VA catalog.
API teams must document their API using OpenAPI Specification version 3.0.x.
OpenAPI docs must be valid YAML or JSON so they are machine-readable.

Guidance

Use industry tools for your language and technical stack to automate the generation and upkeep of your OAS.

CODE VA¶

Your API must be added to the CODE VA catalog, VA's Catalog of Developer Essentials. This helps others within VA discover your API, which is useful even if your API was written specifically for just one consumer. Documenting the intended use of your API will make others within VA aware of your API’s existing capabilities.

OpenAPI specification¶

OpenAPI Specification (OAS) is the predominant industry standard for describing HTTP APIs. It lets people understand how the API works, generate client code, and know what to expect when using the API. All teams at VA must document their APIs using the OpenAPI Specification version 3.0.x.

Documentation for humans and computers¶

An OAS serves as documentation for human consumers to read when evaluating or using an API, and for machines to use when monitoring and verifying consistency between the API’s specification and its behavior.

OAS documents must pass validation in tools such as the Swagger Editor.

Your API's documentation is most useful when it is available from a URL that your API hosts, so that both machines and humans can consume it. For example, if your service is called "rx", then https://{hostname}/rx/{version}/openapi.json would be a good location to host your API documentation. Using openapi.json or openapi.yaml follows the OpenAPI Specification recommendation.

We highly recommend using tools such as Springdoc or Express OAS Generator to simplify the creation and maintenance of the OAS. You are free to use any tool that works best for your code, language, and situation.

See the guidance under Production Management for keeping the documentation in sync with the API releases.

Example OAS document¶

Below is an example OAS doc for a fictitious 'Rx' API. The numbered comments, such as # (1), correspond in order to the annotations listed after the example.

---
openapi: 3.0.1 # (1)
info: # (15)
  title: Rx
  description: An example 'Rx' API that follows the VA API Standards.
  contact:
    name: support@va.gov #(20)
  version: 0.0.0 #(2) 
servers: # (3)
  - url: https://dev-api.va.gov/services/rx/{version}
    description: Development environment
    variables:
      version:
        default: v0
  - url: https://staging-api.va.gov/services/rx/{version}
    description: Staging environment
    variables:
      version:
        default: v0
  - url: https://sandbox-api.va.gov/services/rx/{version}
    description: Sandbox environment
    variables:
      version:
        default: v0
  - url: https://api.va.gov/services/rx/{version}
    description: Production environment
    variables:
      version:
        default: v0
paths:
  /pharmacies:
    get:
      tags:
        - pharmacy
      summary: Returns a list of VA facilities with pharmacies. # (16)
      description: Returns a paginated list of all VA facilities that provide
        pharmacological services.
      operationId: getPharmacies
      responses:
        "200":
          description: Returns a paginated list of VA facilities with pharmacies, and an empty list if none are found for the given criteria.
          content:
            application/json:
              schema: # (4)
                $ref: "#/components/schemas/PharmacyList"
        "500": # (5)
          $ref: "#/components/responses/ErrorInternalServerError"
        "502": # (6)
          $ref: "#/components/responses/ErrorBadGateway"
        "503": # (7)
          $ref: "#/components/responses/ErrorServiceUnavailable"
      security: # (8)
        - {}
  /prescriptions:
    get:
      tags:
        - prescription
      summary: Returns a list of a Veteran's prescriptions
      description: Given a Veteran's ICN, return a list of their
        prescriptions.
      operationId: getPrescriptions
      parameters: # (17)
        - in: query
          name: icn
          description: MPI ICN
          required: true
          schema:
            maxLength: 17
            minLength: 17
            pattern: ^\d{10}V\d{6}$
            type: string
            example: ["1012667145V762142"]
      responses:
        "200":
          description: Return a list of the Veteran's prescriptions and an empty list if none are found.
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/PrescriptionList"
        "401": # (18)
          $ref: "#/components/responses/ErrorUnauthorized"
        "403":
          $ref: "#/components/responses/ErrorForbidden"
        "404":
          $ref: "#/components/responses/ErrorNotFound"
        "422":
          description: Unprocessable Entity
          content:
            application/json:
              schema:
                type: object
                example:
                  - errors:
                      - status: 422
                        title: Unprocessable Entity
                        detail: Entity given was unprocessable.
        "500":
          $ref: "#/components/responses/ErrorInternalServerError"
        "502":
          $ref: "#/components/responses/ErrorBadGateway"
        "503":
          $ref: "#/components/responses/ErrorServiceUnavailable"
      security: # (12)
        - production:
            - prescription.read # (19)
components: # (9)
  securitySchemes:
    production:
      type: oauth2
      description: This API uses OAuth2 with the authorization code grant flow.
      flows:
        authorizationCode:
          authorizationUrl: https://api.va.gov/oauth2/authorization
          tokenUrl: https://api.va.gov/oauth2/token
          scopes:
            prescription.read: Retrieve prescription data
    sandbox:
      type: oauth2
      description: This API uses OAuth 2 with the authorization code grant flow.
      flows:
        authorizationCode:
          authorizationUrl: https://sandbox-api.va.gov/oauth2/authorization
          tokenUrl: https://sandbox-api.va.gov/oauth2/token
          scopes:
            prescription.read: Retrieve prescription data
  responses: # (10)
    ErrorUnauthorized:
      description: Unauthorized
      content:
        application/json:
          schema:
            type: object
            example:
              - errors:
                  - status: 401
                    title: Unauthorized
                    detail: Invalid credentials. The access token has expired.
    ErrorForbidden:
      description: Forbidden
      content:
        application/json:
          schema:
            type: object
            example:
              - errors:
                  - status: 403
                    title: Forbidden
                    detail: You do not have access to the requested resource.
    ErrorNotFound:
      description: Not Found
      content:
        application/json:
          schema:
            type: object
            example:
              - errors:
                  - status: 404
                    title: Not Found
                    detail: The requested resource could not be found.
    ErrorInternalServerError:
      description: Internal Server Error
      content:
        application/json:
          schema:
            type: object
            example:
              - errors:
                  - status: 500
                    title: Internal Server Error
                    detail: An internal API error occurred.
    ErrorBadGateway:
      description: Bad Gateway
      content:
        application/json:
          schema:
            type: object
            example:
              - errors:
                  - status: 502
                    title: Bad Gateway
                    detail: An upstream service the API depends on returned an error.
    ErrorServiceUnavailable:
      description: Service Unavailable
      content:
        application/json:
          schema:
            type: object
            example:
              - errors:
                  - status: 503
                    title: Service Unavailable
                    detail: An upstream service is unavailable.
  schemas:
    PharmacyList:
      type: object
      required:
        - data
      properties:
        data:
          type: array
          items:
            $ref: "#/components/schemas/Pharmacy" # (11)
    Pharmacy:
      type: object
      required:
        - type
        - id
        - attributes
      properties:
        type:
          type: string
          example: ["Pharmacy"]
        id:
          type: string
          example: ["6e976911-2707-4018-a6db-0d1342326379"]
        attributes:
          type: object
          required:
            - id
            - name
            - city
            - state
          properties:
            id:
              type: string
              example: ["358"]
            name:
              type: string
              example: ["Cheyenne VA Medical Center Pharmacy"]
            city:
              type: string
              example: ["Cheyenne"]
            state:
              $ref: "#/components/schemas/State"
    PrescriptionList:
      type: object
      required:
        - data
      properties:
        data:
          type: array
          items:
            $ref: "#/components/schemas/Prescription"
    Prescription:
      type: object
      required:
        - type
        - id
        - attributes
      properties:
        type:
          type: string
          example: ["Prescription"]
        id:
          type: string
          example: ["db8a52f0-b3d2-4cc9-bcab-7053d88737d5"]
        attributes:
          type: object
          required:
            - productNumber
            - referenceDrug
            - brandName
            - activeIngredients
            - referenceStandard
            - dosageForm
            - route
          properties:
            productNumber:
              type: string
              example: ["001"]
            referenceDrug:
              type: boolean
              example: [false]
            brandName:
              type: string
              example: ["FAMOTIDINE PRESERVATIVE FREE"]
            activeIngredients:
              type: object
              required:
                - name
                - strength
              properties:
                name:
                  type: string
                  example: ["FAMOTIDINE"]
                strength:
                  type: string
                  example: ["10MG/ML"]
            referenceStandard:
              type: boolean
              example: [false]
            dosageForm:
              type: string
              enum: # (13)
                - INJECTABLE
                - TABLET
              example: ["TABLET"]
            route:
              type: string
              enum:
                - INJECTION
                - ORAL
              example: ["ORAL"]
    State: # (14)
      type: string
      enum:
        - AK
        - AL
        - AR
        - AZ
        - CA
        - CO
        - CT
        - DE
        - FL
        - GA
        - HI
        - IA
        - ID
        - IL
        - IN
        - KS
        - KY
        - LA
        - MA
        - MD
        - ME
        - MI
        - MN
        - MO
        - MS
        - MT
        - NC
        - ND
        - NE
        - NH
        - NJ
        - NM
        - NV
        - NY
        - OH
        - OK
        - OR
        - PA
        - RI
        - SC
        - SD
        - TN
        - TX
        - UT
        - VA
        - VT
        - WA
        - WI
        - WV
        - WY
      example: ["WY"]

(1) OpenAPI Spec 3.0.x is the required OAS specification version when writing your API specification.
(2) This version represents the version of the API and should be updated as releases are delivered.
(3) The servers section should reflect the location of your API hosted domain and API base path. All endpoints should then be relative to this URL. For example, you might have one server for each environment, such as dev, staging, sandbox, and production. In this example Rx API for the VA, the servers property has been filled out for all 4 environments. It is recommended to NOT provide IP addresses here.
(4) Using $ref properties (references) to link to response body schema definitions helps keep your endpoint definitions short.
(5) Errors happen. We require that all APIs let consumers know the error schema the API will return should it encounter an unexpected error.
(6) Many VA APIs rely on one or more upstream services for their data. If your API could throw a 500 when an upstream service returns an error, consider returning a 502 error instead, so both you and the consumer know that a dependency, rather than the API, is causing the issue.
(7) If your API could throw a 500 when an upstream service is down, consider returning a 503 error instead, so both you and the consumer know that a dependency, rather than the API, is causing the issue.
(8) All endpoints must have a security property if one is not set globally for all endpoints. In this contrived example, the endpoint does not require authentication. To handle this, the security property is added with an empty object. This informs the consumer of the endpoint that security is optional.
(9) Often, multiple API operations have some common parameters or return the same response structure. To avoid code duplication, you can place the common definitions in the global components section and reference them using $ref.
(10) Responses, and specifically errors, should have the same layout or shape for all endpoints. Defining them within components/responses can help reduce the size of your path definitions.
(11) $refs can have $refs. Think of schemas like their database counterparts. Each model and its relations should be a distinct schema.
(12) This endpoint returns a Veteran's prescription history, which qualifies as protected health information (PHI). All endpoints that return PII or PHI must use OAuth.
(13) Enums should be considered constants. As in 'C' style languages, they should be UPPER_CASE with underscores for spaces.
(14) Components don't have to be resources. Any data that appears in multiple locations, such as a list of states, can be a component.
(15) Info tags must include a title and a description. The contact property should not contain an individual's personal or work email address; instead, use a generic contact if available.
(16) Path methods must contain a shorter summary and a longer description that explains the purpose and function of each operation.
(17) Requests should use URL or body parameters rather than headers to pass along requisite data unique to that endpoint.
(18) Protected endpoints must define 401 Unauthorized and 403 Forbidden responses.
(19) The prescription.read scope is required for this endpoint, so it is listed in the security section.
(20) Contact should be documented within the OAS, and this should be a group email list, Microsoft Teams, or Slack channel where a consumer can send their questions. This must be an actively monitored group email or channel.
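
Several of the annotations above (the OAS version, the info title and description, operation summaries and descriptions, and the security property) lend themselves to automated checking. The Python sketch below shows one possible approach, assuming the OAS document has already been parsed into a dictionary (for example with `json.load`). The function name `check_oas_basics` is hypothetical, and the checks are a small subset of what a full validator such as the Swagger Editor performs; they do not replace it.

```python
from typing import Any, Dict, List

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def check_oas_basics(spec: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems: List[str] = []

    if not str(spec.get("openapi", "")).startswith("3.0."):
        problems.append("openapi version must be 3.0.x")

    info = spec.get("info", {})
    for field in ("title", "description"):
        if not info.get(field):
            problems.append(f"info.{field} is missing")

    has_global_security = "security" in spec
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as 'parameters'
            label = f"{method.upper()} {path}"
            for field in ("summary", "description"):
                if not operation.get(field):
                    problems.append(f"{label}: {field} is missing")
            if "security" not in operation and not has_global_security:
                problems.append(f"{label}: security is missing")
    return problems
```

A check like this could run in a build pipeline so that a missing description or security property is caught before the document is published.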

Development Process¶

Guidance

API Teams should determine requirements.
API Teams should develop a design to handle those requirements.
API Teams should practice an iterative development cycle, which often means revisiting requirements and refining design as the development is completed.
API design should be a generic interface, and not specialized to a single consumer.
Development of an API should take place in the development environment.

Requirements¶

API development starts with understanding the requirements to know why this effort is worthwhile:

What functionality does the API provide?
What potential applications might be built with the API?
Who will the expected consumers be?
How will consumers interact with the interface?
Where will the API be hosted?
What is the expected load initially, and growth over time?

Next, the focus turns to the design of the API:

Structure of the endpoints and endpoint naming
Structure of the requests and response objects
Authentication and authorization process
Error handling practices and response objects
Documentation of the interface contract
Performance and scalability strategy
Test data approach

Design¶

Early in the design phase, set up a development environment to handle a simple request to validate the infrastructure functions as expected. The request should traverse all layers of your infrastructure, including:

authentication and authorization layer
cache layer (if any will exist)
endpoint:
- controller
- validation layer
- business layer(s)
- persistent data store(s)
- response layer
  - return a well-formed response
logging integration
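
The following Python sketch shows what such a simple end-to-end request might look like when every layer is a stand-in. Everything in it is an assumption for illustration: the fixed `test-token`, the in-memory dictionaries standing in for the cache and data store, and the function name `handle_request`. A real skeleton would call the actual infrastructure at each step.

```python
import logging
from typing import Any, Dict, Optional

logger = logging.getLogger("rx.skeleton")

_CACHE: Dict[str, Dict[str, Any]] = {}      # stand-in for the cache layer
_STORE = {"123": {"name": "FAMOTIDINE"}}    # stand-in for the persistent data store


def handle_request(token: Optional[str], prescription_id: str) -> Dict[str, Any]:
    """Controller: pass one request through each layer and return a well-formed response."""
    if token != "test-token":
        # Authentication and authorization layer (hypothetical fixed token).
        response = {"status": 401, "body": {"errors": [{"title": "Unauthorized"}]}}
    elif prescription_id in _CACHE:
        # Cache layer: reuse a previously built response.
        response = _CACHE[prescription_id]
    elif not prescription_id.isdigit():
        # Validation layer.
        response = {"status": 422, "body": {"errors": [{"title": "Unprocessable Entity"}]}}
    else:
        # Business layer and persistent data store.
        record = _STORE.get(prescription_id)
        if record is None:
            response = {"status": 404, "body": {"errors": [{"title": "Not Found"}]}}
        else:
            # Response layer: wrap the record in a well-formed body and cache it.
            body = {"data": {"type": "Prescription", "id": prescription_id, "attributes": record}}
            response = {"status": 200, "body": body}
            _CACHE[prescription_id] = response
    # Logging integration.
    logger.info("prescription %s -> %s", prescription_id, response["status"])
    return response
```

Once a request like this succeeds in the development environment, each stand-in can be replaced with the real layer one at a time.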

Iterative Development¶

Establish automated workflow processes to build, validate and deploy the code base as changes are completed.
Stub out the business behavior layers and any interactions with other services during early development, building the layers iteratively as designs materialize.

Automate validation¶

Validate the expected endpoint behavior and the request and response object structures as development is iterating.
Establish automated testing to validate behavior.
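
One way to combine a stubbed business layer with automated validation is sketched below in Python, using only the standard `unittest` module. The function names, the stub, and the canned record are hypothetical; the response structure checked is the `data` list of `type`, `id`, and `attributes` objects used in the example OAS document.

```python
import unittest
from typing import Any, Callable, Dict, List

Fetcher = Callable[[str], List[Dict[str, Any]]]


def stub_fetch_prescriptions(icn: str) -> List[Dict[str, Any]]:
    """Stub for the business layer; returns canned, synthetic data."""
    return [{"id": "db8a52f0-b3d2-4cc9-bcab-7053d88737d5", "brandName": "FAMOTIDINE PRESERVATIVE FREE"}]


def get_prescriptions(icn: str, fetch: Fetcher = stub_fetch_prescriptions) -> Dict[str, Any]:
    """Endpoint logic: wrap records from the business layer in the designed response structure."""
    resources = []
    for record in fetch(icn):
        attributes = {key: value for key, value in record.items() if key != "id"}
        resources.append({"type": "Prescription", "id": record["id"], "attributes": attributes})
    return {"data": resources}


class PrescriptionListContractTest(unittest.TestCase):
    def test_response_structure(self) -> None:
        body = get_prescriptions("1012667145V762142")
        self.assertIsInstance(body["data"], list)
        for resource in body["data"]:
            self.assertEqual({"type", "id", "attributes"}, set(resource))

    def test_empty_list_when_nothing_found(self) -> None:
        body = get_prescriptions("1012667145V762142", fetch=lambda icn: [])
        self.assertEqual({"data": []}, body)
```

Tests like these can be run with `python -m unittest` in an automated workflow. Because the business layer is passed in as a parameter, the stub can later be swapped for the real implementation without changing the tests.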

Environments for developing APIs¶

VA recommends that API teams maintain a dedicated development environment along with a separate testing environment in order to support effective development and functional testing of their APIs.

Since APIs serve as interfaces intended for use by other development teams, VA strongly recommends that API teams provide a consumer integration non-production environment. This non-production environment allows API consumers to explore, evaluate, and integrate the API’s features in a controlled setting, supporting smoother onboarding and faster feedback cycles.

Details about these environments are described below.

Requirement

Create development and testing environments to support the API Team while developing, testing, and maintaining the API.
Data in non-production environments must not have personally identifiable information (PII) or protected health information (PHI).

Guidance

API consumers should use a consumer integration environment when exploring and evaluating an API.
Test data should be representative of actual data complexities found in the production data.

Development¶

The development environment is where software engineers merge their code as different pieces are built. Changes are introduced to this environment frequently, so it must not be a place where your API consumers access your API. It is a safe, private space for software engineers to write and deploy code to test out new ideas and new logic for the API.

Testing¶

How the testing environment is used is at the discretion of the API team. It is often used for testing more complex business cases that require a more stable environment. The API team defines the software update cadence for this environment so that it can accomplish its goals efficiently. The environment is typically updated once testing passes in the development environment.

Because API teams often need this environment to solidify the quality of the code, it should not be a place where your API consumers access your API. VA recommends a stable consumer integration environment for consumers to evaluate and explore the API.

Consumer integration¶

Guidance

API teams should provide a consumer integration environment to support the API consumers who want to explore, evaluate, and integrate the API’s features using a reliable environment.
API teams should provide consumers instructions on how to obtain credentials to access the API in the consumer integration environment.
Self-service sign up is preferred.

Requirement

Data in a consumer integration environment must not contain PII or PHI data.
API behavior in the consumer integration environment must match the API behavior in the production environment.

To enable ease of integration with VA partners, internal VA teams, and third party partners, VA recommends a consumer integration environment for consumers using VA APIs. To facilitate exploration of VA APIs, a prospective consumer should be able to sign up and obtain appropriate credentials to access the API. The API behavior in the consumer integration environment must match the production API behavior to avoid surprises when the consumer launches their application to production.

There are scenarios when the consumer integration environment may be slightly ahead of production functionality, but the time window of this should be short-lived to avoid a consumer building an application and expecting behavior in production that doesn't exist.

Production¶

The production environment is where the API interacts with real VA data that may contain PII and PHI data.

When releasing software updates to the API in the production environment, the software behavior in the consumer integration environment must be updated at the same time to maintain consistent behavior and not lag in functionality with production.

VA APIs must follow the security standards. To access any VA API the appropriate credentials must be issued to the prospective API consumer. This requires a signup process to be available.

Guidance

Sign up in the consumer integration environment should be immediate without team intervention.
Consumer credentials should be removed if the consumer becomes inactive.

In the consumer integration environment, the consumer signup process should have a low barrier to entry to encourage rapid prototyping by prospective consumers. To support this, sign up should be immediate and fully automated, requiring no manual intervention from the API team or other support teams.

VA recommends an automated signup form that the consumer completes with their name, organization, and contact information. An automated backend system then assigns credentials to the consumer, sends them securely, and tracks their usage. Easily acquiring the necessary credentials enables rapid prototyping and a shorter time-to-production for application developers consuming the API.

Although this environment is only utilizing test data, inactive consumers should be removed in accordance with your program's policy on defining inactive consumers.

Requirement

Sign up in the production environment must be monitored.
A consumer must demonstrate to the VA Stakeholders how the API will be used in production for APIs handling PII and PHI data.
Consumer credentials must be removed if the consumer becomes inactive.

Consumer signup for production access must be monitored to ensure appropriate use of VA data. Prior to issuing production credentials to the first consumer outside of VA, ensure all necessary VA privacy paperwork is current for the API. Examples include the Privacy Threshold Analysis (PTA) and Privacy Impact Assessment (PIA) under Directive 6508, and the System of Records Notice (SORN).

All non-VA consumers must be party to an agreement, such as a Terms of Service agreement, Memorandum of Understanding, or similar document, that the Office of General Counsel has approved. Please work with the Office of General Counsel and relevant privacy offices to determine requirements for your API.

To prevent unauthorized access through stale or unused credentials, access must be removed for inactive consumers. This is in accordance with VA Cyber-Security Program guidance under Directive 6500 for Security Continuous Monitoring and NIST guidance under NIST SP 800-53 Rev5:AC-02(03)-Disable Accounts. Inactivity is typically defined by your program policy. Industry best practices commonly use 90 days of inactivity to define an inactive consumer.
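As an illustration of the inactivity rule, the following minimal sketch flags consumers whose most recent request is older than a threshold. The 90-day value is the common industry figure mentioned above, not a VA mandate, and the function name, consumer IDs, and timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 90 days, the commonly used industry threshold. Your program
# policy may define inactivity differently.
INACTIVITY_THRESHOLD = timedelta(days=90)


def find_inactive_consumers(last_used_by_consumer, now, threshold=INACTIVITY_THRESHOLD):
    """Return the IDs of consumers whose last request is older than the threshold."""
    cutoff = now - threshold
    return sorted(
        consumer_id
        for consumer_id, last_used in last_used_by_consumer.items()
        if last_used < cutoff
    )


# Hypothetical usage data: consumer ID -> timestamp of the most recent request.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
last_used = {
    "consumer-a": datetime(2024, 5, 20, tzinfo=timezone.utc),
    "consumer-b": datetime(2024, 1, 15, tzinfo=timezone.utc),
}
inactive = find_inactive_consumers(last_used, now)
```

A scheduled job could feed the resulting list into whatever process revokes credentials in your environment.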

Test data¶

Requirement

Test data must not have personally identifiable information (PII) or protected health information (PHI) in any non-production environment.

Guidance

Test data should be representative of actual data complexities found in the production data sets.
Test accounts should be properly maintained to support consumer testing across various roles such as a Veteran, Veteran representative, clinician, or other authorized end-users when your API involves role-based access and authorization.

VA has strict policies prohibiting the use of personally identifiable information (PII) or protected health information (PHI) in any non-production environment. If an API handles PII or PHI, the API design process must include planning for the creation and management of test data. This test data should reflect the complexities of production data, but it must not consist of actual production data.

Visit test data best practices for VA guidance on creation and management of test data.

Test data best practices¶

Guidance

Consider the cost and maintenance overhead of developing your own tools versus leveraging industry or open-source tools for anonymizing and generating synthetic data.
Review the VA Technical Reference Model (TRM) for VA-approved products that can be utilized.
Review guidance from Health and Human Services (HHS) and National Institute of Standards and Technology (NIST) for anonymizing data.

Test Data Management (TDM) has evolved into a specialized industry to address key challenges:

Generating representative test data that includes important edge-cases
Creating test data with production-like scenarios
Creating and maintaining test accounts to mimic role-based access
Resetting test data when APIs add, update, or delete data

Best practices for managing test data include:

Generating synthetic data
Mocking API responses
Anonymizing production data to eliminate sensitive information
Maintaining user-role test accounts
Creating representative data to reflect production complexities
Implementing data reset and refresh processes

Generate synthetic test data¶

Synthetic test data is for teams who manage their database and the data within it or who need variety in mocked API responses.

Synthetic test data can be generated to mimic real production data. There are open-source tools and paid-for products that will generate different types of data, such as random names, addresses, and phone numbers. When you need thousands of test records, using an industry tool is better practice than developing custom tools. Whichever tool you use, aim to generate representative data.
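To make the idea concrete, here is a minimal sketch of synthetic record generation using only the Python standard library. It is not a substitute for a dedicated tool: the name pools, field names, and function names are hypothetical, and the pools are deliberately tiny.

```python
import random

# Small illustrative pools; a real generator would draw from much larger lists.
FIRST_NAMES = ["Alex", "Jordan", "Morgan", "Riley", "Taylor"]
LAST_NAMES = ["Rivera", "Nguyen", "Okafor", "Schmidt", "Tanaka"]


def make_synthetic_person(rng):
    """Build one fictional person record from random parts."""
    return {
        "firstName": rng.choice(FIRST_NAMES),
        "lastName": rng.choice(LAST_NAMES),
        # 555-0100 through 555-0199 are reserved for fictional use in North America.
        "phone": f"555-01{rng.randint(0, 99):02d}",
        # Vary list sizes so empty, typical, and extreme cases are all present.
        "prescriptionCount": rng.choice([0, 1, 7, 40]),
    }


def make_synthetic_people(count, seed=0):
    """Generate a repeatable list of records; the seed keeps test runs stable."""
    rng = random.Random(seed)
    return [make_synthetic_person(rng) for _ in range(count)]


people = make_synthetic_people(1000)
```

Seeding the generator is a design choice that keeps the dataset reproducible, which helps when a failing test needs to be rerun against identical data.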

Mock API responses¶

Mocking API responses is very similar and related to generating synthetic test data, but in this case you don't manage the database or data within it and therefore can't control the data. Instead, mocked results are returned to simulate the calling of the interface. This may still require generating synthetic test data for the mocked responses.

For example, suppose your API must call another VA API, the Rx API, to retrieve a Veteran's prescriptions. Instead of your API making the actual call to the Rx API, a mocked interface returns a result from a set of mocked results that you manage, simulating the actual call.

A good implementation would have the mocked response return different types of prescriptions and a variety of list sizes for each Veteran test account. Try to be as representative of the real experience as possible.

Mocked data responses can often be simplified to JSON blobs instead of maintaining a replica test database.
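The sketch below shows one way such a mocked interface might look: canned JSON blobs keyed by test account, returned without any network call. The class name, account IDs, and response shape are hypothetical and do not describe the real Rx API.

```python
import json

# Canned responses keyed by hypothetical test account ID. Each blob stands in
# for what the upstream API would return for that account.
MOCKED_RX_RESPONSES = {
    "test-veteran-001": '{"prescriptions": []}',
    "test-veteran-002": '{"prescriptions": [{"name": "Example Drug A", "refills": 2}]}',
}


class MockRxClient:
    """Stands in for a real upstream client; no network call is made."""

    def get_prescriptions(self, account_id):
        blob = MOCKED_RX_RESPONSES.get(account_id)
        if blob is None:
            raise KeyError(f"No mocked response for account {account_id!r}")
        return json.loads(blob)


client = MockRxClient()
empty_list = client.get_prescriptions("test-veteran-001")
one_item = client.get_prescriptions("test-veteran-002")
```

Giving each test account a different list size lets consumers exercise empty, typical, and large result handling against the same mocked interface.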

Caution

Mocked interfaces often respond faster than the actual interface call.
Mocked interfaces may not completely simulate the behavior of the actual interface, so they may give false confidence that production will behave the same way.

Mocking responses can have drawbacks that API teams must consider. Since the underlying code isn't executed in mocked interfaces, the API may appear to behave correctly in the non-production environments but fail in production. Considerations include:

Response times not reflecting actual interface calls.
Mocked responses not representing actual results from the called interface due to mistakes in generating test data or due to changes to production systems that drift from the mocked responses, such as a change in a datatype of a property.
Network and security factors with server certificates and credentials going untested.

To ensure API quality, perform comprehensive end-to-end testing in a fully integrated non-production environment that does not rely on simulated interfaces.

Anonymize production data¶

Anonymizing data is the process of transforming all personal and sensitive datasets irreversibly.

Requirement

Anonymized data must be irreversible.

Irreversible means the resulting data cannot be reverse-engineered to reconstruct the original data. For example, consider the name "John Doe" as a real person. Anonymizing it by shifting each letter one position forward in the alphabet produces "Kpio Epf". This anonymized name has two major drawbacks: it is easily reverse-engineered to determine the actual name, and it is unpronounceable. This method and others like it are discouraged and represent an anti-pattern.

Anonymization is best done by a software product whose job is to do just that. However, customization to such tooling could be necessary, such as reducing the total dataset to fit the smaller test database size. The best anonymization tools would take the above example of "John Doe" and replace "John" with a random choice from a large dataset of first names, then replace "Doe" with a random choice from a large dataset of last names. If truly random choices were made, it would be impossible to reverse-engineer back to the original name.
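The contrast between the two approaches can be sketched as follows. This is an illustration only, not a production anonymizer: the name pools are tiny and hypothetical, and a real tool would draw from large datasets and cover every sensitive field.

```python
import secrets
import string

FIRST_NAMES = ["Alex", "Jordan", "Morgan", "Riley", "Taylor"]
LAST_NAMES = ["Rivera", "Nguyen", "Okafor", "Schmidt", "Tanaka"]


def shift_letters(text, offset):
    """Anti-pattern: a letter shift that anyone can undo by shifting back."""
    shifted = []
    for ch in text:
        if ch in string.ascii_lowercase:
            shifted.append(chr((ord(ch) - ord("a") + offset) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            shifted.append(chr((ord(ch) - ord("A") + offset) % 26 + ord("A")))
        else:
            shifted.append(ch)
    return "".join(shifted)


def replace_name():
    """Preferred: pick a replacement that does not depend on the original name."""
    return f"{secrets.choice(FIRST_NAMES)} {secrets.choice(LAST_NAMES)}"


scrambled = shift_letters("John Doe", 1)   # "Kpio Epf"
recovered = shift_letters(scrambled, -1)   # the original name comes straight back
replacement = replace_name()
```

Because `replace_name` takes no input, nothing about the original value survives in the output, which is what makes the substitution irreversible.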

The anonymizing strategy is best suited for teams who manage the database and the data it contains, where data is relatively simple, and when the production data can be reduced to fit the smaller database footprint of a typical development environment.

Keep in mind that anonymizing sensitive data covers more than just names, as in the above example. It covers a wide range of data elements, including Social Security Numbers (SSNs), Integration Control Numbers (ICNs), sensitive dates (e.g., birth, death, and service dates), email addresses, physical addresses, phone numbers, and medical records. Remember to handle often-overlooked areas such as sensitive data in PDF documents. These items are meant as examples and are not an exhaustive list.

For further guidance on de-identification techniques and requirements visit HHS and NIST SP 800-188.

Maintain test accounts¶

Guidance

API teams should maintain and document a set of test accounts to support scenarios where API behavior depends on the identity or role of the requester.
Test accounts should be documented with the API documentation.

When API behavior depends on the identity of the requester, API teams should maintain and document a set of test accounts for use with the API. This documentation should describe the specific characteristics that make each test account relevant for a test case. For example, include test accounts for representing various account statuses, accounts that may have related data such as a list of several claims or prescriptions, and accounts with a mixture of data domains, such as both appeals and claims.

The list of test accounts a consumer can use should be documented with the API documentation along with information about how to reset the account to the baseline dataset.

Representative test data¶

Representative test data requires enough variety in the test data to reflect real production experiences.

Strive for test data to include:

empty and populated optional values
lists that include empty, typical size, and extreme size
numeric values that contain typical size, extremely low, and extremely high
booleans with both false and true example sets
variety of different enumerated values
string values with no value, typical size, and largest size it can hold

For example, if an API returns a list of prescriptions for a Veteran, the data will need several test accounts with different data scenarios in order to test pagination and performance, such as test accounts with:

no prescription (extreme low)
1 prescription (typical)
7 prescriptions (typical)
40 or more prescriptions (extreme high)

To get the variety of datasets needed, consider synthetic data generator tools and determine whether they can save time over manual test data creation.

Test data reset¶

If the API allows adding, updating, or deleting data, automated cleanup helps reset the test data to baseline values. API teams have had success scheduling ongoing data resets on a nightly or weekly basis. In addition, APIs that maintain test accounts benefit from on-demand resets for an individual test account.

Since APIs are unique in what data they manage, the strategies vary for cleaning. Typical strategies include:

Ability to reset data for a single test account.
Refresh the entire test database to a "golden" set of baseline data on a periodic basis.
Run an auto-cleanup after each automated test run if the test generated or changed data.
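The first two strategies above can be sketched with an in-memory store. This is a simplified illustration: the class, account IDs, and record shape are hypothetical, and a real implementation would reset rows in a test database rather than a dictionary.

```python
import copy

# Hypothetical "golden" baseline: the known-good state for each test account.
GOLDEN_BASELINE = {
    "test-veteran-001": {"prescriptions": []},
    "test-veteran-002": {"prescriptions": [{"name": "Example Drug A", "refills": 2}]},
}


class SandboxDataStore:
    """In-memory stand-in for a test database."""

    def __init__(self):
        self.accounts = copy.deepcopy(GOLDEN_BASELINE)

    def reset_account(self, account_id):
        """On-demand reset of a single test account."""
        self.accounts[account_id] = copy.deepcopy(GOLDEN_BASELINE[account_id])

    def reset_all(self):
        """Full refresh, suitable for a nightly or weekly scheduled job."""
        self.accounts = copy.deepcopy(GOLDEN_BASELINE)


store = SandboxDataStore()
# Simulate a consumer test that adds data to an account.
store.accounts["test-veteran-001"]["prescriptions"].append({"name": "Added In Test"})
store.reset_account("test-veteran-001")
```

The deep copies matter here: without them, a test that mutates an account would also mutate the baseline and every later reset would restore the wrong state.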

Performance & Availability¶

Government digital services have a history of challenges with performance and reliability. This issue often stems from the complexities of integrating multiple services, including legacy systems, and letting their limitations affect the end user experience. The primary goal of your API should be to simplify this complexity, handle any errors from upstream services, and enhance performance.

With appropriate strategies, your API can achieve high performance and reliability, comparable to newly developed projects. For APIs being built from scratch, it's crucial to prioritize performance and reliability from the outset, as APIs inherently serve as dependencies for other applications.

Performance¶

Requirement

APIs must respond to requests within 10 seconds, including any upstream calls, or return a 504 error, signaling a server-side timeout.

Guidance

APIs should aim for response times under 1 second.
If an API cannot respond within 10 seconds, it should use an asynchronous pattern that provides an immediate response and processes the request in the background.

While the maximum allowable response time is ten seconds, it's recommended that APIs strive for response times of under one second, and ideally, in just a few milliseconds.
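One way to enforce the 10-second limit is to run the upstream call under a deadline and return a 504 when the deadline passes. The sketch below uses a thread pool from the standard library; the function name and the error body are hypothetical, and a real service would more likely rely on its framework's or HTTP client's timeout settings.

```python
import concurrent.futures
import time

MAX_RESPONSE_SECONDS = 10  # the hard limit from the requirement above


def call_with_deadline(upstream_call, timeout_seconds=MAX_RESPONSE_SECONDS):
    """Run upstream_call and return (status_code, body), or 504 on timeout."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(upstream_call)
    try:
        # Exceptions raised by upstream_call propagate from result() unchanged.
        return 200, future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        # Hypothetical error body; use your API's standard error format.
        return 504, {"error": "The request timed out."}
    finally:
        # Do not block the response while a slow upstream call finishes.
        executor.shutdown(wait=False)


def fast_call():
    return {"prescriptions": []}


def slow_call():
    time.sleep(0.5)
    return {"prescriptions": []}


fast_status, fast_body = call_with_deadline(fast_call)
slow_status, slow_body = call_with_deadline(slow_call, timeout_seconds=0.05)
```

Note that the abandoned worker thread keeps running until the upstream call returns, so this pattern bounds the consumer's wait but not the server's resource use. Requests that routinely approach the limit are better served by the asynchronous pattern described above.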

Availability¶

Guidance

APIs should aim for 99.9% availability in production environments.
APIs should aim for 99.0% availability in consumer integration environments.

APIs are expected to maintain 99.9% availability in production and 99.0% in testing environments. Assistance is available for overcoming technical or policy challenges to achieve these goals.

See service level objectives under monitoring and blue-green deployments under production management for further information on how to achieve high availability.

Managing downtime in dependencies¶

Requirement

APIs must not pass upstream errors to consumers.

VA APIs often rely on various external services with differing reliability levels. While API teams may not always influence changes in these services, it's critical that APIs do not expose consumers to unexpected errors (500) or service unavailability issues (503, 504).

APIs should handle unexpected errors, downtimes, and timeouts from upstream services with clear and consistent messaging. This approach ensures that consumers are aware of the issue's source and can appropriately inform their users.

For advice on how to map upstream errors, refer to Choosing an error code.

Maintenance¶

APIs are expected to meet these service level objectives (SLOs) even during scheduled maintenance. Ensuring uninterrupted service during updates or other maintenance activities is a hallmark of a robust architecture and deployment strategy.

Security

Securing APIs¶

Veterans share their data with the VA and trust that it will be safe. As such, the commitment to safeguarding that data must extend beyond compliance with federal regulatory obligations. Due to the open nature of APIs, selecting the correct authentication method is essential to mitigating risks, preventing unauthorized access, and ensuring the integrity and reliability of VA's data systems, all while facilitating secure and seamless data exchange for authorized users.

API Key or OAuth 2.0?¶

APIs that involve user authentication, personally identifiable information (PII), protected health information (PHI), or scoped or time-limited access will use OAuth 2.0. Otherwise, the API will use an API key.

The flowchart below assists in determining the appropriate option to use.

Flowchart for determining authentication requirements.

API key¶

APIs that don't involve user authentication, PII, or PHI can use API keys for access control. Otherwise, your API will use OAuth 2.0.

API keys are passed via a request header and validated at an API's server or gateway.
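A minimal sketch of server-side key validation follows. The header name apiKey matches the documentation example in this section; the key store, consumer IDs, and the choice of a 401 response for a missing or unknown key are assumptions made for illustration.

```python
import hmac

API_KEY_HEADER = "apiKey"  # header name used in this section's documentation example

# Hypothetical key store: issued key -> consumer ID. Real keys belong in a
# secrets manager or gateway configuration, never in source code.
ISSUED_KEYS = {"example-key-123": "consumer-a"}


def authenticate(headers):
    """Return (status_code, consumer_id) for a request's headers."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    presented = normalized.get(API_KEY_HEADER.lower())
    if presented is None:
        return 401, None
    for issued_key, consumer_id in ISSUED_KEYS.items():
        # compare_digest avoids leaking key contents through timing differences.
        if hmac.compare_digest(presented.encode(), issued_key.encode()):
            return 200, consumer_id
    return 401, None
```

Returning the consumer ID alongside the status lets the caller attribute usage to a consumer, which supports the usage tracking and inactivity checks described earlier.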

Documenting API keys¶

The example below defines an API key named apiKey that is sent as a request header. The security scheme is named apiKeyAuth and is used in the security section to apply the apiKeyAuth security scheme to the API. The security section shown below will apply the API key globally to all endpoints. The numbered comments in the example correspond, in order, to the notes that follow it.

components: 
  securitySchemes:
    apiKeyAuth: # (1)
      type: apiKey
      name: apiKey # (2)
      in: header

security: # (3)
  - apiKeyAuth: []

apiKeyAuth is the name of the security scheme.
apiKey is the name of the request header.
Security is set globally so the security scheme apiKeyAuth will apply to all endpoints.

The apiKeyAuth security scheme can also be applied to the operation level. Below, the apiKeyAuth security scheme is used in the security section of the /pharmacies endpoint. This is useful if only some endpoints need the API key.

paths: 
  /pharmacies:
    get:
      tags:
        - pharmacy
      summary: Returns a list of facilities with pharmacies.
      description: Returns a paginated list of all VA facilities that provide pharmacological services.
      operationId: getPharmacies
      security: # (1)
        - apiKeyAuth: []
      responses:
        '200':
          description: The list of facilities with pharmacies was successfully found and returned as an array.
...

The apiKeyAuth security scheme is applied to this endpoint.

OAuth 2.0¶

OAuth 2.0 lets an API consumer get an access token on behalf of a user or system.

OAuth 2.0 Flows¶

OAuth 2.0 offers several flows suited for different types of applications. These flows are designed to ensure the security of user data, particularly Personally Identifiable Information (PII) and Protected Health Information (PHI). Some OAuth 2.0 flows, such as the Implicit Grant, are no longer recommended to secure applications.

Below are the recommended OAuth 2.0 flows for VA applications, along with examples of how these might be implemented in the context of VA services.

Authorization Code Flow¶

This is ideal for applications where the client (the application requesting access) can interact with the user's web browser and receive incoming requests from the web browser (like a web server). It's the recommended flow for most web applications.

Example: A web-based VA patient portal where Veterans can access their personal health records, make appointments, and communicate with healthcare providers. When Veterans log in, the application redirects them to the VA's authentication service. After login, the service redirects to the application with an authorization code, which the application then exchanges for an access token.

Authorization Code Flow with PKCE¶

Authorization Code Flow with PKCE (Proof Key for Code Exchange) adds a layer of security for mobile and public clients. It's especially useful for native applications and single-page apps.

Example: A mobile app Veterans use to manage their VA benefits and health services. The app starts the authorization process by generating a code verifier and a code challenge. This code challenge is sent to the VA's authentication service when the Veteran logs in. Upon successful login, the service returns an authorization code to the app, which exchanges it for an access token using the code verifier. This ensures the token exchange is secure, even if the initial authorization request was intercepted.
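The verifier and challenge step described above can be sketched as follows, using the S256 method defined in RFC 7636. The function names are hypothetical, and the sketch covers only value generation, not the redirects or the token request.

```python
import base64
import hashlib
import secrets


def make_code_verifier():
    """Create a high-entropy verifier (43 characters, within RFC 7636's 43-128 range)."""
    return base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")


def make_code_challenge(code_verifier):
    """Derive the S256 challenge: BASE64URL(SHA256(verifier)) without padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


# The app keeps the verifier private and sends only the challenge at login.
verifier = make_code_verifier()
challenge = make_code_challenge(verifier)
```

Because the challenge is a one-way hash, someone who observes it cannot produce the verifier that the token endpoint later requires.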

Client Credentials Grant¶

Best for machine-to-machine applications, where an application needs to access its own resources on the server. It’s less common in user-facing applications since the authentication is based on the client credentials rather than individual user credentials.

Example: An internal VA backend application that interacts with other backend services, like an application that accesses a VA database to retrieve non-personal data for reporting or analysis. The application authenticates itself to the VA API using its own credentials and receives an access token to access the required resources.

In each of these flows, the key aspect is that the access granted is limited to what is necessary for the application to function and no more, adhering to the principle of least privilege. This approach and robust implementation help protect the sensitive data VA applications handle.

Documenting OAuth 2.0¶

Documenting Authorization Code Flow¶

Below is an example of Authorization Code Flow for the fictitious ‘Rx’ API. The security scheme is called oAuth2AuthCode and the flow is authorizationCode. The authorizationUrl field contains the authorization endpoint that will be used to obtain the authorization code from the authorization server. The tokenUrl field contains the endpoint that is used by the client to obtain an access token. All supported scopes should be listed under scopes.

components: 
  securitySchemes:
    oAuth2AuthCode: # (1)
      type: oauth2
      description: This API uses OAuth 2 with the authorization code grant flow.
      flows:
        authorizationCode: # (2)
          authorizationUrl: https://api.va.gov/oauth2/authorization   
          tokenUrl: https://api.va.gov/oauth2/token
          scopes:
            prescription.read: Retrieve prescription data
...

oAuth2AuthCode is the name of the security scheme.
The type of flow set for this security scheme is Authorization Code Flow.

The oAuth2AuthCode security scheme is then used in the security section of the /prescriptions endpoint. This endpoint requires the scope prescription.read so it is listed below the security scheme.

paths:
  /prescriptions:
    get:
      tags:
        - prescription
      summary: Returns a list of a Veteran's prescriptions
      security: # (1)
        - oAuth2AuthCode:
          - prescription.read # (2)
...

The oAuth2AuthCode security scheme is applied to this endpoint.
This endpoint requires the prescription.read scope.

Documenting Client Credentials Grant¶

This example shows Client Credentials Grant being used in the ‘Rx’ API. The security scheme is called oAuth2ClientCredentials and the flow is clientCredentials. The tokenUrl field contains the endpoint that is used by the client to obtain an access token. All supported scopes should be listed under scopes.

components: 
  securitySchemes:
    oAuth2ClientCredentials: # (1)
      type: oauth2
      description: This API uses OAuth 2 with the client credentials grant flow.
      flows:
        clientCredentials: # (2)
          tokenUrl: https://api.va.gov/oauth2/token
          scopes:
            prescription.read: Retrieve prescription data 
...

oAuth2ClientCredentials is the name of the security scheme.
The type of flow set for this security scheme is Client Credentials Grant.

The oAuth2ClientCredentials security scheme is used in the security section of the /prescriptions endpoint. This endpoint requires the scope prescription.read so it is listed below the security scheme.

paths:
  /prescriptions:
    get:
      tags:
        - prescription
      summary: Returns a list of a Veteran's prescriptions
      security: # (1)
        - oAuth2ClientCredentials:
          - prescription.read # (2)
...

The oAuth2ClientCredentials security scheme is applied to this endpoint.
This endpoint requires the prescription.read scope.

Monitoring¶

API monitoring tracks the availability and performance of an API.

Health checks indicate the API's availability and also the availability of the systems it depends on, while Service Level Objectives monitor the performance standards.

Health Checks¶

Requirement

Each version of an API must have a unique health check endpoint.
APIs must not repurpose application endpoints as health checks.

To enable the configuration of status pages and other monitoring tools at VA, we require that each version of an API has a health check endpoint. This is because some APIs deploy versions on different infrastructure, and versions that share infrastructure now may not always do so in the future. However, if two versions of an API do share infrastructure, you may route each version's health check endpoint to the same controller.

Health checks vary in complexity depending on the API. At a minimum, the health check should confirm that requests are successfully being routed through the gateway and VA network to your application. More holistic health checks may validate that the application’s internal dependencies, such as data stores and upstream services, are up and operating without error.

Health checks are required to be standalone endpoints rather than repurposed endpoints that return application resources. This ensures that the health check is free of caching, auth, and business logic.

Example URI and Responses¶

Requirement

Health check endpoints must not require authentication.
Health check endpoints must return a 200 response code if the API is up.
Health check endpoints must return a 5xx level response code if the API is down.

Below is an example health check URI and response for a fictional 'Rx' API.

GET https://api.va.gov/services/rx/v1/healthcheck

There is no finalized RFC for health check responses. A draft proposal outlines a potential response requiring only a status with one of the following values:

pass with a 200 response code. The API is available and functioning as expected.
fail with a 5xx response code. The API is unavailable or throwing errors. If the health check endpoint itself is down, the server should return a 503 response.
warn with a response code in the 2xx-3xx range. The API is up but having intermittent issues.

200 OK

{
  "status": "pass",
  "version": "1",
  "releaseId": "1.2.2"
}

503 Service Unavailable

{
  "status": "fail",
  "version": "1",
  "releaseId": "1.2.2"
}
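A handler that produces the two responses above might look like the following sketch. It is framework-agnostic and illustrative only: the function name and the shape of the dependency checks are hypothetical, and the version values simply mirror the example responses.

```python
import json

API_VERSION = "1"      # values mirror the example responses above
RELEASE_ID = "1.2.2"


def health_check(dependency_checks):
    """Return (status_code, json_body) from a dict of name -> zero-argument check."""
    healthy = True
    for check in dependency_checks.values():
        try:
            if not check():
                healthy = False
        except Exception:
            # A check that raises counts as a failure rather than crashing the endpoint.
            healthy = False
    body = {
        "status": "pass" if healthy else "fail",
        "version": API_VERSION,
        "releaseId": RELEASE_ID,
    }
    return (200 if healthy else 503), json.dumps(body)


up_status, up_body = health_check({"database": lambda: True})
down_status, down_body = health_check({"database": lambda: False})
```

The response body deliberately omits dependency names and error details so that the endpoint does not reveal internal information to consumers.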

Accessibility¶

APIs must provide a health check endpoint that VA monitoring tools can access, so those tools can monitor the status of the API. API consumers can then take appropriate action if an API cannot handle requests or is unhealthy. The health check endpoint or endpoints made available to consumers of an API must not provide internal or sensitive information within the response. In addition, the health check must reflect the state of the interface, not the service behind that interface.

Service Level Objectives

Service Level Objectives (SLOs)¶

When delivering an API, there's an inherent tension between quickly and consistently shipping features and ensuring that the service is stable, secure, and meets consumers' expectations for performance.

Once an API is in production, service quality becomes one of the most important API features. Often there are Service Level Agreements (SLAs) that the team must meet. Teams can strive for 100% availability and lightning-fast performance, but that would require immense resources in time, people power, and financial investment.

API teams must strike a balance, continuing to ship new features while maintaining reliability objectives. The API must stay reliable and performant enough that consumers do not notice and are not affected.

This section of the VA API Standards defines how to set desired service quality goals via Service Level Objectives (SLOs) and how to measure these objectives via Service Level Indicators (SLIs) in order to meet SLAs.

What to Measure¶

SLIs, as defined in the introduction, should be indicators that provide insight into how our customers interpret the reliability of VA APIs. The SLIs define what is being measured to meet an SLO.

What makes a good SLI?¶

While all SLIs are metrics, not all metrics qualify as good SLIs. For example, CPU utilization is a metric but not a good SLI. DevOps may have alerts for CPU utilization, memory usage, etc., as they can be early warning signs that reliability is about to suffer. However, they do not make good SLIs because the customer does not directly experience them.

Availability, latency, and error rate are customer-facing metrics the customer does directly experience. Individual APIs may also have feature-level SLIs, such as “claims processed”. However, availability, latency, and error rate are general metrics that should be tracked for all VA APIs.

Guidance

Availability, latency, and 5xx error rate are recommended metrics to be tracked for all VA APIs.

To track SLIs, turn them into a ratio of events being measured over total events. Then, measure that ratio over time.

The first metric, availability, is the percentage of time a service is usable. It is usually expressed by a ratio of hours the service is available over total hours for the window that is being measured.

The second metric, latency, measures the actual time to respond to a request. It is tracked as the percentage of requests with response times below a target value within a given time interval, and that percentage is the measure of compliance. For example, an SLO might be, “latency for 90% of all requests must be 1000ms or below”. Some API endpoints may be slower than others due to system complexity, data transformations, and the nature of the operation (e.g., writes, complex reads, form processing, and file uploads). For more information on this, see Handling Latency Outliers.

Finally, error rate measures how often an error occurs and is represented as the ratio of requests resulting in errors to all requests received (Number of requests resulting in errors / Total number of requests). This is typically displayed as a percentage. VA recommends making two error rate calculations, one for server-side errors (5xx errors) and one for consumer-supplied data errors (4xx errors), to avoid skewed results from a single total error rate calculation.

Example Calculations¶

Metric	Ratio	Ratio Example	Percentage Example
Availability (30-day window)	Uptime / Total time	715/720 hours (720 = 30*24 hours; down for 5 hours for a maintenance upgrade)	99.3% available
Latency (1000ms target)	Requests with response times < 1000ms / Total requests	9500/10000	95% of requests are faster than 1000ms
4xx Error Rate	4xx Responses / Total Responses	40/10000	0.4% error rate
5xx Error Rate	5xx Responses / Total Responses	50/10000	0.5% error rate
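The calculations in the table reduce to one ratio applied to different event counts, as this small sketch shows. The helper name is hypothetical; the numbers are the ones from the table.

```python
def ratio_as_percent(numerator, denominator):
    """Express a ratio of events as a percentage."""
    return 100.0 * numerator / denominator


availability = ratio_as_percent(715, 720)        # uptime hours / total hours
latency_sli = ratio_as_percent(9500, 10000)      # requests under 1000 ms / total
error_rate_4xx = ratio_as_percent(40, 10000)     # 4xx responses / total
error_rate_5xx = ratio_as_percent(50, 10000)     # 5xx responses / total
```

Availability works out to about 99.3%, which would miss a 99.9% objective even though only five hours were lost in the month.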

Measurement Windows¶

Service Level Indicators (SLIs) are less interesting if looking at them individually. They are more interesting as trends and patterns over time. “Over time” means defining a measurement window. Using availability as the example SLI, it will be measured within a specific time window. For example, the baseline example SLO is:

Within a rolling 30-day period, APIs must maintain 99.9% availability in production and 99.0% in test environments, with no exclusions for planned downtime or holidays.

To achieve this SLO, it is recommended to set up two additional measurement windows for short and long-term tracking. Below are the three recommended windows.

7-day window: Set a shorter-length window, such as a 7-day window, for proactive monitoring with a stricter alert setting. For example, if the SLO targets 99.9% availability, set this window at 99.95% or greater.
30-day window: This window tracks compliance with the SLO and is the period used within the Service Level Objective.
90-day window: Set a longer-length window to track trends. Providing a wider view of the service's performance and reliability over time allows teams to assess the effectiveness of past improvements and lean into or ease off reliability engineering.

Guidance

Rolling time windows, rather than calendar windows, align more closely with the consumer's experience and allow API teams to identify and focus on problems sooner.

VA recommends the use of rolling time periods for the measurement window, such as 7-day, 30-day, and 90-day rolling windows. A rolling window is more closely aligned with the consumer’s experience and reports recent activity as it happens. Compare this to a calendar window, in which the results being measured are not available until the end of the calendar period. Rolling windows allow teams to see trends early and pivot to focus on problems sooner.
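A rolling window can be sketched as a fixed-length queue of daily measurements, where each new day pushes out the oldest. The class name and the downtime history below are hypothetical, and monitoring tools usually provide this calculation directly.

```python
from collections import deque


class RollingAvailability:
    """Track availability over the most recent N days of (up, total) minutes."""

    def __init__(self, window_days):
        # deque drops the oldest day automatically once the window is full.
        self.days = deque(maxlen=window_days)

    def record_day(self, minutes_up, minutes_total=1440):
        self.days.append((minutes_up, minutes_total))

    def percent(self):
        total = sum(day_total for _, day_total in self.days)
        if total == 0:
            return None
        up = sum(day_up for day_up, _ in self.days)
        return 100.0 * up / total


week = RollingAvailability(window_days=7)
month = RollingAvailability(window_days=30)
for day in range(30):
    # Hypothetical history: 60 minutes of downtime on the 29th day only.
    minutes_up = 1380 if day == 28 else 1440
    week.record_day(minutes_up)
    month.record_day(minutes_up)
```

With this history the 7-day figure drops to roughly 99.40% while the 30-day figure is roughly 99.86%, which shows why the shorter window is useful for early alerting.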

Weekend Variability¶

Using a common 7-day, 30-day, and 90-day rolling window measuring system can simplify reporting requirements within VA. However, it introduces weekend variability which could skew results for some APIs. If your API is subject to weekend variability, change the rolling window alignment to avoid the potential of extra weekends by using 7-day (1-week), 28-day (4-week), and 84-day (12-week) rolling windows. Using 7-day increments when creating the measurement window keeps extra weekends out of the calculation, thus eliminating potential variability.
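The variability comes from how many weekend days each window can contain, which the following sketch counts. The function names and the sampling start date are arbitrary choices made for illustration.

```python
from datetime import date, timedelta


def weekend_days_in_window(end_date, window_days):
    """Count Saturdays and Sundays in the window ending on end_date (inclusive)."""
    days = (end_date - timedelta(days=offset) for offset in range(window_days))
    return sum(1 for day in days if day.weekday() >= 5)


def weekend_counts(window_days, sample_days=60, start=date(2024, 1, 1)):
    """Collect the distinct weekend-day counts seen as the window rolls forward."""
    return sorted(
        {weekend_days_in_window(start + timedelta(days=i), window_days)
         for i in range(sample_days)}
    )


counts_30 = weekend_counts(30)   # varies between 8 and 10
counts_28 = weekend_counts(28)   # always 8
```

A 30-day window holds four full weeks plus two extra days, so its weekend share shifts as it rolls, while a 28-day window always holds exactly four weekends.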

Latency Variability¶

When measuring latency, it should be acknowledged that not all requests are equal, and their response times can vary widely depending on several factors, including load, the complexity of the request, and the number and performance of the sub-systems a request must traverse. Thus, using an average response time metric (adding up all response times and dividing by the number of responses) can be extremely misleading due to outliers.

Therefore, VA recommends tracking latency using percentiles. Percentiles break down the percentage of responses above or below a threshold, allowing for better data interpretation by handling variability and outliers within the data.

Guidance

Track latency using percentiles.
Use capabilities from the Application Performance Monitoring (APM) tools available to measure latency.
Be aware of timeouts within the infrastructure the API runs in.
Be cautious of alert fatigue.
Monitor each endpoint within the API separately, but have a single Service Level Objective (SLO) for the API.
Note latency behavior in the API's documentation.

Tracking P50 (50th percentile), P90, and P99 latency metrics can provide a comprehensive view of an API’s performance from median use to edge cases:

P50 Latency: Represents the median response time, providing a benchmark for the typical user experience. 50% of requests were completed within a given time target and 50% of requests took longer than the given time target. For example, an SLO might be, 50% of requests must be under 500 ms.

P90 Latency: Identifies the performance outliers that aren't as extreme as the worst-case scenarios but could still negatively impact a significant portion of users. In this case, 90% of the requests were completed within a given time target, but 10% took longer than the given time target. For example, an SLO might be, 90% of requests must be under 1000 ms.

P99 Latency: Reveals the upper bounds of latency experienced by users. In this measure, 99% of requests were completed within a given time target but 1% took longer than the given target. For example, an SLO might be, 99% of requests must be under 5000 ms. That means 1% could take longer and indicate an anomaly or unusual traffic that might be interesting to inspect.
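As a minimal sketch of why percentiles are more informative than an average, the example below computes the mean alongside P50, P90, and P99 for a small set of made-up response times. The sample data and the nearest-rank method are assumptions chosen for illustration; APM tools may use a different interpolation method.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in milliseconds: mostly fast, with two slow outliers.
latencies_ms = [120, 130, 140, 150, 160, 170, 180, 190, 4000, 9000]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50 = percentile(latencies_ms, 50)
p90 = percentile(latencies_ms, 90)
p99 = percentile(latencies_ms, 99)

# Compare against the example targets used in this section.
meets_slo = p50 < 500 and p90 < 1000 and p99 < 5000
print(mean_ms, p50, p90, p99, meets_slo)
```

Here the mean (1424 ms) describes no real request, while the percentiles show that the typical request is fast and that the slowest requests are the problem.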

Application Performance Monitoring (APM) tools often provide these metrics out of the box. Use these tools if possible and don’t reinvent the wheel.

When setting the SLO for your API for these metrics, be aware of timeouts set by the gateways, proxies, database connections, or even the application server the API runs within. Some have timeouts of 10 seconds, while others use a 15-second timeout. Additionally, timeouts within cloud services may be capped or not configurable at all. For example, AWS Lambda caps function execution time at 15 minutes, and AWS API Gateway applies a default integration timeout of 29 seconds. Check the current limits of the services your API depends on.

Knowledge of the timeouts is important when setting the P99 SLO. Take care not to let the SLO push past the boundaries of the API’s infrastructure and give a false positive. For example, if 99% of requests should take under 10 seconds, but a piece of infrastructure, such as the gateway, times out in 7 seconds, this metric isn’t as informative since no request could ever exceed 7 seconds.
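One way to guard against this is to compare the P99 target with the shortest timeout in the request path. The sketch below assumes a hypothetical inventory of timeouts; the component names and values are placeholders.

```python
def effective_ceiling_ms(timeouts_ms: dict[str, int]) -> int:
    """The shortest timeout in the request path caps every observable latency."""
    return min(timeouts_ms.values())


def p99_target_is_measurable(p99_target_ms: int, timeouts_ms: dict[str, int]) -> bool:
    """A target at or above the ceiling can never be exceeded, so it tells you nothing."""
    return p99_target_ms < effective_ceiling_ms(timeouts_ms)


# Hypothetical timeouts for the infrastructure an API runs in.
timeouts = {"gateway": 7_000, "app_server": 15_000, "database": 10_000}

print(p99_target_is_measurable(10_000, timeouts))  # target is above the 7-second gateway timeout
print(p99_target_is_measurable(5_000, timeouts))   # target is below every timeout
```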

To understand if an API is meeting an expected level of service, set the P99 SLO value so that the vast majority of requests complete within that time frame. For example, if a request can occasionally take over 5000 ms, but 99 percent of the time it is faster than 5000 ms, then set the P99 to 5000 ms. This isolates outliers, helps identify where improvements can be made, and provides timely feedback if the trend unexpectedly worsens.

When setting SLOs at the P50 end of the spectrum, choose a target that is loose enough to avoid alert fatigue yet tight enough that alerts trigger when abnormal behavior begins. For example, suppose the API responds to requests within 350 ms on average. An initial baseline SLO for that API could be:

Within a rolling 30-day period; 50% of requests must be under 500 ms; 90% must be under 1000 ms; and 99% must be under 5000 ms in production.

7-day window: This is for active monitoring with a stricter setting than the SLO defined above and should be aspirational. e.g. P50 @ 350 ms, P90 @ 750 ms, and P99 @ 2500 ms. This window helps identify problems early.

30-day window: This window tracks compliance with the SLO defined above and is the period used within the SLO.

90-day window: Tracks trends over time in the same manner as the availability 90-day window.

Of course, an API often has one or more outlier endpoints. For more information on handling slow-responding API endpoints, see Handling Latency Outliers.

VA recommends monitoring each endpoint separately when monitoring latency to troubleshoot issues with particular endpoints. However, the API should have a single SLO defined for all the endpoints within an API instead of each endpoint individually.

VA Recommended Service Level Objectives (SLO)¶

VA recommends that APIs strive to meet these Service Level Objectives (SLOs) as a basic standard and that teams publish the actual SLO of their API with the API’s documentation.

Guidance

Track and publish the actual SLO of your API with your API’s documentation.

Metric	VA Recommended Service Level Objective
Availability (Production)	Within a rolling 30-day period, maintain 99.9% availability with no exclusions for planned downtime or holidays.
Availability (Test Environments)	Within a rolling 30-day period, maintain 99.0% availability with no exclusions for planned downtime or holidays.
Latency	Within a rolling 30-day period, 50% of requests respond under 500 ms, 90% under 1000 ms, and 99% under 5000 ms in production. VA API Standards recommend response times to be 1 second or less.
5xx Error Rate	Within a rolling 30-day period, have a 0.5% Error Rate or less

Monitoring SLOs¶

Most Application Performance Monitoring (APM) tools offer SLO dashboards out of the box; if not, API teams can create custom dashboards to visualize SLO compliance.

Datadog, shown below, has a built-in SLO feature that allows teams to set 7, 30, and 90-day windows and track one or more SLIs within them. It also calculates an error budget.

Image showing Datadog Error Budget Left report for 7 day, 30 day, and 90 days, based on a target objective for availability of 99.95%, 99.99%, and 99.9% respectively for those time periods.

Error Budgets¶

An error budget can help a team balance feature development and service quality. Quality should always be a fundamental requirement for new features, and feature development should not come at the expense of quality.

The Error Budget enables an understanding of how much quality has been sacrificed in a given time window, and what budget is left. This data helps the team align priorities. If the API is close to exhausting the error budget, engineering efforts should focus on improving reliability and performance while naturally decreasing the emphasis on feature development.

Conversely, having plenty of overhead in an error budget does not mean the team should de-prioritize quality and rush to implement new features. Instead, it indicates that the team has successfully managed quality alongside feature development.

Calculating Error Budgets¶

If tooling such as Datadog is unavailable, teams can manually calculate their error budgets.

If the availability SLO is 99.9% over a 30-day window, the allowed unavailability is 1 − 0.999 = 0.001 of the window, and the error budget can be calculated as:

        Total time (minutes) = 30 days × 24 hours/day × 60 minutes/hour
                             = 43200 minutes

        Error Budget (minutes) = 43200 minutes × 0.001
                               = 43.2 minutes

In this case, the rounded error budget would be 43 minutes. Teams can then track the downtime that has already occurred in the measurement window to determine their burn rate.

Error Budget Left, as described in the example above, would be:

    Error_Budget_Left = Error_Budget - Spent_Budget
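The manual calculation above can be sketched in a few lines. The function names and the 30 minutes of recorded downtime are illustrative assumptions.

```python
def error_budget_minutes(slo_percent: float, window_days: int) -> float:
    """Minutes of allowed unavailability for an availability SLO over a rolling window."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (100 - slo_percent) / 100


def error_budget_left(budget_minutes: float, spent_minutes: float) -> float:
    """Error_Budget_Left = Error_Budget - Spent_Budget."""
    return budget_minutes - spent_minutes


budget = error_budget_minutes(99.9, 30)   # about 43.2 minutes
left = error_budget_left(budget, 30.0)    # hypothetical: 30 minutes of downtime so far
print(round(budget, 1), round(left, 1))
```

A negative result means the budget for the window is already exhausted.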

Handling Latency Outliers¶

Some endpoints within an API may be slower than others due to system complexity, data transformations, and the nature of the operation (e.g., writes, complex reads, form processing, and file uploads).

For example, the API might consist primarily of GET requests that retrieve resources quickly, with one POST that does complicated and expensive processing depending on the search criteria given. This one expensive search could cause the latency thresholds to be exceeded for the entire API. This section explains how to handle those situations.

What would a consumer of this API endpoint want to know?¶

In this situation, put yourself in the end user's position. How long would you honestly wait for a response?

Studies have shown that end users start to wonder what is happening after about two seconds and begin to abandon the request after about five seconds unless given guidance on a wait time. Based on this, here are some options to handle this situation.

Guidance

Note latency behavior in the API's documentation.
Explore caching options.
Explore asynchronous processing options.

Provide documentation for the service level objectives (SLOs) this API attempts to achieve but highlight the one exception so the consumer can communicate appropriate end-user expectations when developing their applications. However, there is a limit here. See requirements on performance.

Determine if this endpoint is a candidate for a pre-loaded cache to avoid retrieving data more than once, or for saving the results of the expensive retrieval so the next request can reuse them, provided the results do not go stale quickly.

Determine if this endpoint should be handled asynchronously to prevent consumers from abandoning long-running requests, similar to a “take a ticket, and we’ll call you when we're done” approach.
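The "take a ticket" approach can be sketched as below. This is a minimal in-memory illustration, not a recommended implementation: the `TicketDesk` class, its method names, and the status values are all hypothetical, and a real API would use durable storage and a background worker.

```python
import uuid


class TicketDesk:
    """In-memory stand-in for a job queue that hands out tickets for slow requests."""

    def __init__(self):
        self._jobs = {}

    def submit(self, criteria: dict) -> str:
        """Accept the request immediately and return a ticket the consumer can poll."""
        ticket = str(uuid.uuid4())
        self._jobs[ticket] = {"status": "PENDING", "criteria": criteria, "result": None}
        return ticket

    def run_pending(self, search) -> None:
        """Stand-in for a background worker that performs the expensive operation."""
        for job in self._jobs.values():
            if job["status"] == "PENDING":
                job["result"] = search(job["criteria"])
                job["status"] = "COMPLETE"

    def status(self, ticket: str) -> dict:
        """What a status endpoint could return for a ticket."""
        job = self._jobs[ticket]
        return {"status": job["status"], "result": job["result"]}


desk = TicketDesk()
ticket = desk.submit({"lastName": "Example"})
before = desk.status(ticket)
desk.run_pending(lambda criteria: [criteria["lastName"]])
after = desk.status(ticket)
```

The consumer receives the ticket right away and checks back later, so no single request has to stay open for the full processing time.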

API Lifecycle¶

As an API version progresses through its lifecycle, consumers test it, use it, and eventually migrate away from it.

The lifecycle stages, or states, are:

State	Description
ACTIVE	This version of the API is available in production and is fully supported.
DEPRECATED	This version of the API is available for a fixed period of time. It is fully supported for existing consumers. It is not available to new consumers. Deprecation and Sunset HTTP headers are set to inform consumers that it is deprecated and when the API will be deactivated.
DEACTIVATED	This version of the API is unpublished from production and no longer available to any consumer. The footprint of all deployed applications entering this state must be completely removed from production, sandbox, and lower environments.

This section describes the recommended lifecycle and versioning strategies for VA APIs:

API Evolution: A strategy for enhancing an API without introducing breaking changes for the consumer.
Versioning: Conventions for creating a new version of an API to deliver major or breaking changes.
Deprecation: Marking a complete API, an API version, an endpoint, or an element of the API for future removal.
Deactivation: Retiring an API version and making it unavailable for use.

There are special cases that may occur during the life of an API. One of those cases is breaking up a large API into smaller groupings.

Decomposition: Breaking up a large API into a set of smaller more focused APIs.

API Evolution¶

API Evolution is an enhancement strategy in which an API keeps its backward-compatible contract without a new major version, as long as changes add endpoints, fields, and query parameters rather than remove them.

Requirement

To be non-breaking, and not require versioning, additions must be optional, meaning that all the endpoints function as before and do not break if a consumer application ignores the recent changes.

Guidance

Providers should design APIs in a forward-compatible and extensible way to maintain compatibility and avoid duplication of resources, functionality, and excessive versioning.

Extensibility¶

An evolvable API is extensible, but designing for extensibility takes special care and forethought since it's possible for early design decisions to make later extensibility of an API more difficult.

The following practices are not extensible:

Ordered query or request body parameters
Returning one error rather than an array of errors in responses
Flat data in request and response bodies

For example, if a response to a payment history API returned a flat list of payment identifiers, like what is below, then adding additional data--without versioning--would be impossible.

{
  "data": [
    "37a6c0b9-6033-484f-a707-84649e5c7c35",
    "2c7a317a-40d0-4ec7-92ef-df954a6f2818"
  ]
}

If we later want to return a payment date, there is nowhere to attach it. A more extensible solution is to wrap data in an object with named fields. Using a resource in the response is even safer. For example:

{
  "data": [
    {
      "id": "37a6c0b9-6033-484f-a707-84649e5c7c35",
      "type": "PaymentHistory",
      "attributes": {
        "amount": "$3,444.70",
        "date": "2019-12-15T00:00:00Z"
      }
    },
    {
      "id": "2c7a317a-40d0-4ec7-92ef-df954a6f2818",
      "type": "PaymentHistory",
      "attributes": {
        "amount": "$3,444.70",
        "date": "2019-12-31T00:00:00Z"
      }
    }
  ]
}

Extensible names¶

Be specific and detailed when choosing names to avoid future limitations. While an endpoint in your API may initially only return one (mailing) address, like this:

{
  "address": {
    "street": "3700 North Capitol Street NW #558",
    "city": "Washington",
    "state": "DC",
    "postalCode": "20011"
  }
}

It may need to return both a residential and a mailing address in the future. Using extensible names at the start would allow an easier shift to a response like this:

{
  "residentialAddress": {
    "street": "3700 North Capitol Street NW #558",
    "city": "Washington",
    "state": "DC",
    "postalCode": "20011"
  },
  "mailingAddress": {
    "street": "511 10th St NW",
    "city": "Washington",
    "state": "DC",
    "postalCode": "20004"
  }
}

Adding new endpoints¶

Guidance

Adding a new endpoint or a method to an existing path is an evolutionary change. Incrementing the major version in these cases is not recommended.

Consumers or product owners may consider multiple new endpoints a major change. Still, it is up to the provider to decide if the major version needs to be bumped.

Evolving requests¶

Requirement

There must NOT be any change in the HTTP verbs (such as GET, POST, and so on) supported by an existing URI.
Query parameters must be unordered.
New query parameters appended to URIs must be optional.

Renaming endpoint paths¶

If an endpoint's path needs to be renamed, an API team should do so in an additive manner so that it is effectively a new endpoint with the same functionality. The API team should keep and mark the old endpoint as deprecated.

Adding query parameters¶

While providers should only do so in moderation, adding optional query parameters can avoid versioning an API due to relatively minor changes in functionality.

For example, if we wanted to add pagination to a resource collection endpoint that was previously not paged, such as:

../v0/claims

then we could append optional pageNumber and pageSize query parameters to the request, like this:

../v0/claims?pageNumber=2&pageSize=10

As long as the original call signature of the endpoint functions as before, meaning it continues not to page without the new query params, then no versioning is necessary.
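A handler that keeps the original call signature working could look like the sketch below. The `list_claims` function, the sample data, and the defaults applied when only one parameter is supplied are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical data set standing in for the claims collection.
CLAIMS = [{"id": n} for n in range(1, 26)]


def list_claims(page_number: Optional[int] = None, page_size: Optional[int] = None) -> list[dict]:
    """Return every claim when no paging parameters are given, otherwise one page."""
    if page_number is None and page_size is None:
        return CLAIMS  # original behavior: the unpaged call still returns everything
    size = page_size or 10      # assumed default when only pageNumber is sent
    number = page_number or 1   # assumed default when only pageSize is sent
    start = (number - 1) * size
    return CLAIMS[start:start + size]


print(len(list_claims()))           # ../v0/claims
print(list_claims(2, 10)[0]["id"])  # ../v0/claims?pageNumber=2&pageSize=10
```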

Evolving responses¶

Requirement

There must NOT be any change in the HTTP status codes returned by the URIs.
There must NOT be any change in the name and type of the request or response headers of a URI.
New headers must be optional.

Adding fields¶

As shown in the Extensibility section above, adding fields does not require versioning as long as they are added within an object, because existing consumer apps can ignore the additive change. For example:

Before the addition

{
    "id": "2c7a317a-40d0-4ec7-92ef-df954a6f2818",
    "type": "PaymentHistory",
    "attributes": {
      "amount": "$3,444.70",
      "date": "2019-12-31T00:00:00Z"
    }
}

After the addition

{
    "id": "2c7a317a-40d0-4ec7-92ef-df954a6f2818",
    "type": "PaymentHistory",
    "attributes": {
      "amount": "$3,444.70",
      "date": "2019-12-31T00:00:00Z",
      "paymentMethod": "DIRECT_DEPOSIT"
    }
}
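From the consumer's side, the change is non-breaking as long as the client reads only the fields it knows. The sketch below uses a hypothetical `summarize_payment` helper to show a consumer producing the same result from both payloads.

```python
import json


def summarize_payment(payload: str) -> str:
    """Read only the fields this consumer knows about and ignore anything else."""
    attributes = json.loads(payload)["attributes"]
    return f'{attributes["amount"]} on {attributes["date"][:10]}'


before = """{
  "id": "2c7a317a-40d0-4ec7-92ef-df954a6f2818",
  "type": "PaymentHistory",
  "attributes": {"amount": "$3,444.70", "date": "2019-12-31T00:00:00Z"}
}"""

after = """{
  "id": "2c7a317a-40d0-4ec7-92ef-df954a6f2818",
  "type": "PaymentHistory",
  "attributes": {
    "amount": "$3,444.70",
    "date": "2019-12-31T00:00:00Z",
    "paymentMethod": "DIRECT_DEPOSIT"
  }
}"""

print(summarize_payment(before))
print(summarize_payment(after))
```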

Renaming fields¶

If a field needs to be renamed, you should do so in an additive manner and not remove the old field. Instead, mark the old field as deprecated.

Here is an example of what an additive change should look like:

Adding ‘approvalDate’ and renaming ‘date’ to ‘paymentDate’

{
    "id": "2c7a317a-40d0-4ec7-92ef-df954a6f2818",
    "type": "PaymentHistory",
    "attributes": {
      "amount": "$3,444.70",
      "date": "2019-12-31T00:00:00Z", // marked as deprecated in the OAS
      "approvalDate": "2019-07-31T00:00:00Z",
      "paymentDate": "2019-12-31T00:00:00Z"
    }
}

As shown above, if an API team needs to add a new "approvalDate" field to an object that already has a "date" field, they can also introduce a "paymentDate" field that carries the same value as "date" under a clearer name. However, the original "date" field must behave as it did previously and must be marked as deprecated in the OAS documentation.
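On the provider side, an additive rename means serializing the same value under both names. A minimal sketch, assuming a hypothetical `payment_attributes` serializer:

```python
def payment_attributes(amount: str, payment_date: str, approval_date: str) -> dict:
    """Build the attributes object with both the deprecated and the new field names."""
    return {
        "amount": amount,
        "date": payment_date,         # deprecated name, kept so existing consumers do not break
        "approvalDate": approval_date,
        "paymentDate": payment_date,  # new, clearer name carrying the same value
    }


attributes = payment_attributes("$3,444.70", "2019-12-31T00:00:00Z", "2019-07-31T00:00:00Z")
print(attributes["date"] == attributes["paymentDate"])
```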

Versioning¶

Breaking changes to published API versions can disrupt consumers, so our policy is to ensure that breaking changes are only published with a major version change. Additionally, release notes must accompany each major and minor release.

See below for what constitutes a breaking change.

Requirement

External APIs must increment the major version within their URI if they introduce a breaking change.
Internal APIs with over one consumer must increment the major version within their URI if they introduce a breaking change.
Release notes must accompany each major and minor release.

Guidance

Incrementing the major version within the URI is optional when introducing a breaking change to an internal API with only one consumer, provided the consumer’s clients can be updated at the same time as the API. An example would be a web app, where an update reaches all clients at once, as opposed to a mobile app, where updates roll out more slowly.
It's optional but recommended to publish release notes for patch versions.

Breaking change definition¶

At a high level, a breaking change is any change to an API that would cause an error in a consuming application. The following are all examples of breaking changes:

Removing an endpoint
Renaming an endpoint’s path
Removing an HTTP verb (GET, POST, and so on) for an endpoint
Removing a property
Renaming a property
Changing a property’s level or tier within an object’s hierarchy
Making a previously optional request body property required
Changing a property’s type
Changing a property’s format
Adding or removing values from an enum
Adding a new required query parameter to an endpoint
Removing a query parameter from an endpoint
Adding a required scope to an existing endpoint
Removing a scope from an existing endpoint
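A few of these checks can be automated by comparing two versions of a schema. The sketch below is a simplified assumption of how such a comparison might work: it covers only removed properties, changed types, and newly required properties, and it reports a renamed property as a removal.

```python
def breaking_changes(old: dict, new: dict) -> list[str]:
    """Compare two object schemas and list a subset of the breaking changes above."""
    problems = []
    for name, spec in old["properties"].items():
        if name not in new["properties"]:
            problems.append(f"removed property: {name}")
        elif new["properties"][name].get("type") != spec.get("type"):
            problems.append(f"changed type: {name}")
    newly_required = set(new.get("required", [])) - set(old.get("required", []))
    problems.extend(f"newly required: {name}" for name in sorted(newly_required))
    return problems


# Hypothetical request body schemas for two releases.
old_schema = {"properties": {"amount": {"type": "string"}, "date": {"type": "string"}}}
new_schema = {
    "properties": {"amount": {"type": "number"}, "paymentDate": {"type": "string"}},
    "required": ["paymentDate"],
}

print(breaking_changes(old_schema, new_schema))
```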

API versioning scheme¶

Requirement

APIs must only use URI (non-header)-based major versioning.
APIs must provide a major version number in the URI path after the namespace and before any resources/operations.
The versioning scheme must start with the lowercase character v followed by an integer, the combination of which produces an ordinal number, e.g. v0, v1, v2.
APIs must NOT expose minor or patch version numbers in the URI path.
A minor API version must maintain backward compatibility with all previous minor versions within the same major version.
For non-major changes, API teams must still update the minor or patch versions in the OAS documentation.

Guidance

Versioning should start at v0.

The major version appears after the namespace:

https://api.va.gov/benefits/v0

Resources or operations specific to the endpoints within the API then show up after the version number.

https://api.va.gov/benefits/v0/claims
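The version segment rule can be expressed as a small validation sketch. The regular expression is one possible reading of the requirement; rejecting leading zeros such as `v01` is an assumption, not something the requirement states.

```python
import re

# Lowercase "v" followed by an integer; no minor or patch numbers.
VERSION_SEGMENT = re.compile(r"v(0|[1-9][0-9]*)")


def is_valid_version_segment(segment: str) -> bool:
    """Check a single URI path segment against the major-version naming rule."""
    return VERSION_SEGMENT.fullmatch(segment) is not None


for candidate in ("v0", "v12", "V1", "v1.2", "1"):
    print(candidate, is_valid_version_segment(candidate))
```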

Incrementing minor and patch versions¶

Requirement

Backward-compatible changes that introduce new endpoints or new fields within existing endpoints must be released within a new minor version.
Changes to the API's underlying service, or upstream services, that do not update its interface or cause any new side effects to the underlying system's data must be released as a patch version.
Minor version releases must not change the major version within the URI.

Guidance

Creating a new OAS doc for each minor version can result in many files that are virtually the same. Use 'reference definitions', $ref, to reduce duplication and share definitions across OAS docs.

As minor and patch version information is not in the URI path, minor changes must only be documented in the OAS doc for the API.

Before

info:
  title: Benefits API
  ...
  version: 1.0.0

After

info:
  title: Benefits API
  ...
  version: 1.1.0
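The increment rules can be sketched as a small helper that takes the current `info.version` value and the kind of change. The `bump` function and its `change` labels are hypothetical names used for illustration.

```python
def bump(version: str, change: str) -> str:
    """Return the next OAS info.version for a 'major', 'minor', or 'patch' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":   # breaking change: the URI version must change as well
        return f"{major + 1}.0.0"
    if change == "minor":   # new endpoints or fields, backward compatible
        return f"{major}.{minor + 1}.0"
    if change == "patch":   # internal change with no interface change
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change}")


print(bump("1.0.0", "minor"))
print(bump("1.1.0", "patch"))
```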

Number of active versions¶

Guidance

API teams should aim to have one active version of an API.
API teams should not have more than two active versions of an API.

The best practice is to have one active version of an API at a time. However, there might be situations where maintaining multiple active versions is necessary. An example would be a team modernizing a large legacy API and rolling out the new version's endpoints over time. In that case, keeping the older version active may be necessary until the new version has reached parity.

Deprecation¶

Info

Deprecation can take 2 forms: partial deprecation and version deprecation.

Partial deprecation is where an element such as a query param, a property, a header, or a complete endpoint is marked as deprecated. This warns consumers that this element may not be supported in future major releases of the API and guides them to use the better alternative.

Version deprecation is when the API team has replaced the API with a newer version, or the VA has decided to stop supporting the API.

Partial deprecation¶

To discourage the use of an endpoint or a specific element within an endpoint that will not be supported in future API releases, the API team must mark it as deprecated. Use the deprecated property in the OpenAPI Specification (OAS) to indicate this.

With partial deprecation, the item marked as deprecated must remain supported for the life of that major API version in order to not break existing consumers who may be using it.

Requirement

API teams must document deprecated endpoints and elements in the OAS.
Deprecated API endpoints and elements must remain supported for the life of the major version.
When deprecating an endpoint or element, API teams must increment the minor version and explain the deprecation in the release notes for that release.

Deprecated elements still need to be supported and often become expensive to maintain. Therefore, API teams will eventually want to create a new major version to remove the deprecated parts, and then deprecate the previous version.

Examples of Partial Deprecation¶

A deprecated endpoint, query parameter, header, or property on a resource would be considered a breaking change for a consumer when it is removed. Therefore, announce the deprecation, guide the consumer to the new alternative, if available, continue to support it, and wait until the next major version of the API to remove it.

To highlight that it is deprecated, add the word "Deprecated" as the first word in the description field for that element and also to the summary field if the element is an endpoint. Tell the consumer what to use instead.

Endpoint deprecated¶

paths:
  /legacy-appeals:
    get:
      summary:  Deprecated endpoint.  Use `GET /appeals` endpoint instead.
      description: Deprecated endpoint to retrieve legacy appeals.
        Use the new `GET /appeals` endpoint instead, which will return 
        this same information, plus handle various states of appeals.
      deprecated: true

Note: Both the summary and description fields mention the endpoint is deprecated. Move any necessary information that was in summary to the description. Guide the consumer to the alternative endpoint that is replacing the one being deprecated, if there is one.

Query parameter deprecated¶

paths:
  /appeals:
    get:
      description: Retrieve appeals status
      parameters:
        - name: fromDate
          in: query
          description: Deprecated `fromDate` query parameter, use the new `months` query parameter instead.
          deprecated: true

Header deprecated¶

paths:
  /appeals:
    get:
      description: Retrieve appeals status
      parameters:
        - name: ORG-Authorization-Token
          in: header
          description: Deprecated the `ORG-Authorization-Token` header. The value in this header is now ignored and the information is acquired using the value within the JSON Web Token (JWT).
          deprecated: true

Property deprecated within a resource¶

paths:
  /appeals:
    get:
      description: Retrieve appeals status
      responses:
        '200':
          description: Successful appeal
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Appeal"
components:
  schemas:
    Appeal:
      description: Appeal status schema
      properties:
        appealType:
          type: string
          description: Deprecated `appealType`, which is the decision review
            option chosen by the appellant. Use the new property
            `chosenAppealOption`. The deprecated `appealType` will remain
            populated and should match the value in the new property
            `chosenAppealOption` as best it can. The new property provides
            greater granularity of the option chosen.
          deprecated: true
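The convention of starting the description with "Deprecated" lends itself to an automated check. The sketch below walks an OAS document that has already been loaded into Python dictionaries and lists and reports deprecated items whose description does not follow the convention. The function name and the sample document are illustrative assumptions.

```python
def undocumented_deprecations(node, path=()):
    """List locations marked deprecated whose description does not start with 'Deprecated'."""
    found = []
    if isinstance(node, dict):
        description = str(node.get("description", ""))
        if node.get("deprecated") is True and not description.startswith("Deprecated"):
            found.append(" > ".join(str(key) for key in path))
        children = node.items()
    elif isinstance(node, list):
        children = enumerate(node)
    else:
        children = ()
    for key, value in children:
        found.extend(undocumented_deprecations(value, path + (key,)))
    return found


# Hypothetical fragment of a loaded OAS document.
spec = {
    "paths": {
        "/appeals": {
            "get": {
                "parameters": [
                    {"name": "fromDate", "in": "query", "deprecated": True,
                     "description": "Deprecated `fromDate` query parameter, use `months` instead."},
                    {"name": "legacyFlag", "in": "query", "deprecated": True,
                     "description": "Old flag."},
                ]
            }
        }
    }
}

print(undocumented_deprecations(spec))
```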

Version deprecation¶

APIs that teams have replaced with newer versions or APIs that are no longer meeting technical, business, or security objectives should be deprecated and eventually deactivated.

The goal of deprecation is to progress to a state where no consumers use the API version. To get to this state, the API team must provide clear, timely, and detailed documentation and communication to consumers. In addition, the API team must consistently monitor the API's usage to ensure consumers are migrating away.

Once it has been determined that an API version should be deprecated:

Establish a sunset period.
Update the API to return Deprecation and Sunset response headers.
Begin deprecation communication and guidance with all consumers to highlight the deactivation date so they can migrate and know the timeframe for completing it.
Stop onboarding new consumers to the deprecated API.
Monitor API usage.

When an API version is being replaced, direct consumers to the newest version.

For removals of an entire API, clarify that the API is going away and suggest alternate solutions if they are available.

Requirement

Once an API is deprecated, it must no longer onboard new consumers.
API teams must have a plan for communicating the API's deprecation state and deactivation date to all API consumers.

Guidance

The previous API version should enter a deprecated state once a new version of the API is released.
Once an API is deprecated, its documentation should suggest alternate solutions if they are available.

Sunset period¶

The sunset period is the time between marking an API version deprecated and deactivating it. When determining how long this period should be, it's important to remember that consumers often have different priorities, budgets, and release schedules and may need a long runway to migrate away from a deprecated API. In addition, some consumers may be supporting mobile applications and unable to control when users update a consuming application.

We recommend that API teams keep an API version in a deprecated state for at least 6 months, but not longer than 24 months before deactivation. This gives consumers adequate time to move away from it, keeps the API team from "permanently" maintaining a deprecated API, and motivates consumers to migrate.

Guidance

API teams must define a sunset period for the API version they are deprecating. At least 6 months is recommended.
API teams should deactivate the API version within 24 months from when marked as deprecated.
API teams may extend the sunset period but should not reduce it unless certain that no consumers depend on the API version in a production environment.
API teams should monitor the deprecated API version to ensure its consumers are migrating away during the sunset period.
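The recommended window can be checked with simple date arithmetic. The sketch below is one way to do it; the function names are hypothetical, and clamping to the last day of a shorter month is an assumption about how "months" should be counted.

```python
import calendar
from datetime import date


def add_months(day: date, months: int) -> date:
    """Add calendar months, clamping to the last day of shorter months."""
    index = day.year * 12 + (day.month - 1) + months
    year, month_index = divmod(index, 12)
    last_day = calendar.monthrange(year, month_index + 1)[1]
    return date(year, month_index + 1, min(day.day, last_day))


def sunset_within_guidance(deprecated_on: date, sunset_on: date) -> bool:
    """True when the sunset date is 6 to 24 months after the deprecation date."""
    return add_months(deprecated_on, 6) <= sunset_on <= add_months(deprecated_on, 24)


print(sunset_within_guidance(date(2024, 1, 15), date(2024, 9, 1)))  # about 7.5 months
print(sunset_within_guidance(date(2024, 1, 15), date(2024, 3, 1)))  # under 6 months
```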

HTTP headers used in deprecation¶

API teams must return Deprecation and Sunset response headers in the deprecated API. These headers inform API clients (many of which have built-in deprecation notifications) that the API version is deprecated and the date it will be deactivated.

Requirement

API teams must return Deprecation and Sunset headers in a deprecated API's responses.

Deprecation header¶

The Deprecation header marks an API version as deprecated and slated for deactivation at a future date. Use a boolean true value to set the API version as actively deprecated.

Deprecation: true

Sunset header¶

The Sunset header always takes an HTTP-date timestamp and represents the date an API team will deactivate the API version.

Deprecation: true
Sunset: Thu, 11 Nov 2049 23:59:59 GMT
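Building the two headers can be sketched with the Python standard library. The `deprecation_headers` helper is a hypothetical name; it assumes the boolean Deprecation value used in this guide and formats the Sunset value as an HTTP-date, which is always expressed in GMT.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def deprecation_headers(sunset_at: datetime) -> dict[str, str]:
    """Build the response headers for a deprecated API version."""
    if sunset_at.tzinfo is None:
        raise ValueError("sunset_at must be timezone-aware")
    http_date = format_datetime(sunset_at.astimezone(timezone.utc), usegmt=True)
    return {"Deprecation": "true", "Sunset": http_date}


headers = deprecation_headers(datetime(2049, 11, 11, 23, 59, 59, tzinfo=timezone.utc))
print(headers)
```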

Deactivation¶

Deactivation is the process of removing the API from being available, removing the source code, and removing the artifacts from being maintained and built. Once deactivation is complete, the API version will no longer be available.

The Sunset header set in the version deprecation phase is the date the API will no longer be available.

If there is still usage of the API version 3 weeks before the communicated deactivation date, the VA strongly urges API teams to understand the activity and communicate the risk of deactivation to stakeholders.

Sometimes the deactivation date will need to be moved to a later calendar date due to the availability of resources to implement the changes. It should rarely, if ever, move to an earlier date.

Guidance

Understand usage on an API version about to be deactivated. Communicate that activity with stakeholders at least 3 weeks prior to the Sunset date that was established in the version deprecation phase.

Decomposition¶

API decomposition, or bifurcation, refers to splitting a single API into two or more standalone APIs, each of which constitutes a new API. As such, they must have unique namespaces and start their versions from zero (v0).

Requirement

Each child API extracted from a parent API must have a unique namespace.
Child APIs must start their version at v0.

Guidance

API teams should deprecate parent APIs once they've released all the child APIs.
API teams should document the split and how to migrate to the new APIs in the parent API's documentation.

The API you are decomposing should follow the version deprecation guidance.

Paths & Operations

Paths¶

Requirement

Full URIs must follow RFC 3986.
Base (server) URLs must not have a trailing slash.
Base (server) URLs must be relative to their public facing URLs.
Paths must use dashes rather than underscores for spaces in words.

An example URI for a fictional 'Rx' API that follows the RFC 3986 standard is:

GET https://api.va.gov/rx/v1/prescriptions/42ceca25-e9d0-466f-84a8-8ce554d70953

How this guide, and the OpenAPI spec, refers to parts of a URI differs slightly from that RFC. Following OpenAPI naming conventions the URI above is split into the following parts:

'https://api.va.gov/rx/v1' is the base URL found in the "servers" section of the OAS.
'https' is the scheme.
'api.va.gov' is the authority.
'rx' is the namespace.
'v1' is the version.
'prescriptions' is the operation path.
'42ceca25-e9d0-466f-84a8-8ce554d70953' is the resource identifier.
'GET' is the operation or method.
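The breakdown above can be reproduced with a short parsing sketch. It assumes a single-level namespace and a path of exactly four segments (namespace, version, operation path, resource identifier); the `split_api_uri` function is a hypothetical helper, and the URI used is the one implied by the base URL in the list.

```python
from urllib.parse import urlsplit


def split_api_uri(uri: str) -> dict[str, str]:
    """Split a URI into the parts named in this guide."""
    parts = urlsplit(uri)
    segments = parts.path.strip("/").split("/")
    if len(segments) != 4:
        raise ValueError("expected namespace/version/operation-path/resource-id")
    namespace, version, operation_path, resource_id = segments
    return {
        "base_url": f"{parts.scheme}://{parts.netloc}/{namespace}/{version}",
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "namespace": namespace,
        "version": version,
        "operation_path": operation_path,
        "resource_id": resource_id,
    }


uri_parts = split_api_uri(
    "https://api.va.gov/rx/v1/prescriptions/42ceca25-e9d0-466f-84a8-8ce554d70953"
)
print(uri_parts["base_url"])
```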

Namespaces¶

Guidance

Namespaces should match the product name of the API as closely as possible.
Namespaces should remain consistent between versions.
Namespaces should not reflect internal VA organizational and communication structures.
Namespaces should not be more than 2 levels deep.
Namespaces should not include application names ('WebSphere') or environments ('PROD').

Take care in choosing a namespace. This is the name of your API and should not change version to version. The namespace of your API is part of your product branding and should be user, rather than provider, centered. Choose a namespace that matches as closely as possible the product name of the API in the OAS documentation.

Operation paths¶

While everything after the namespace is technically still part of the path in URI terms, for APIs--and specifically those that follow the OpenAPI spec--paths are pointers to resource operations within a version.

The OpenAPI spec says:

In OpenAPI terms, paths are endpoints (resources), such as /users or /reports/summary/, that your API exposes, and operations are the HTTP methods used to manipulate these paths, such as GET, POST or DELETE.

Resources will inform how you construct your paths. Most of your paths will revolve around returning a collection of resources, and performing Create, Read, Update, and Destroy (CRUD) operations on singular resources.

Note that REST and CRUD are not synonymous. An API can be RESTful without CRUD operations. If an operation does not fall under the CRUD resource operations, we consider it ‘Non-Resourceful’ and the naming of its path requires special care.

Collection resources¶

Guidance

Providers should use plural nouns for collections.

Collections represent a list of resources. We prefer plural nouns with no identifier in the URL path. The Read section of this guide has details on crafting requests and responses for collections.

GET ../rx/v1/prescriptions

Use query parameters to customize the returned collection, such as for pagination. An example is:

GET ../rx/v1/prescriptions?page=2&pageSize=10

Singular resources¶

Guidance

Resources should NOT be identified using easily guessable sequential numbers.
APIs should use a UUID as a resource identifier, as outlined in RFC 4122.
Identifiers should be in the form 8-4-4-4-12 with 36 total characters (32 hexadecimal characters and 4 hyphens).

These represent a single instance of a resource in a collection. They should be completely and uniquely identified on the URL path.

For example, the UUID ‘42ceca25-e9d0-466f-84a8-8ce554d70953’ uniquely identifies a prescription:

GET ../rx/v1/prescriptions/42ceca25-e9d0-466f-84a8-8ce554d70953
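
The identifier rules above can be checked mechanically. The following is a minimal sketch using Python's standard uuid module; the function names are hypothetical and not part of any VA library.

```python
import uuid


def is_valid_resource_id(value: str) -> bool:
    """Return True if value is a UUID in the canonical 8-4-4-4-12 form."""
    try:
        parsed = uuid.UUID(value)
    except ValueError:
        return False
    # uuid.UUID also accepts braces, a urn: prefix, and un-hyphenated hex,
    # so compare against the canonical 36-character representation.
    return str(parsed) == value.lower()


def new_resource_id() -> str:
    """Generate a random (version 4) UUID, which is not sequential or guessable."""
    return str(uuid.uuid4())
```

A random version 4 UUID is one reasonable choice here because it carries no ordering that a consumer could use to guess neighboring identifiers.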

Sub-resources¶

Guidance

Sub-resources should be appended after the parent's identifier.
When called with no sub-resource identifier, an API should return all the sub-resources.
Sub-resources in paths more than 2 levels deep are not recommended.

Below is an example of a one-to-many resource relationship, showing a prescription with many refill sub-resources. To return all the refills for the prescription with id ‘42ceca25-e9d0-466f-84a8-8ce554d70953’, use:

GET ../rx/v1/prescriptions/42ceca25-e9d0-466f-84a8-8ce554d70953/refills

Then to return a single sub-resource, in this case the refill with id ‘2b0f2b18-a476-4e13-a4e0-b2fb8f499ae4’, use:

GET ../rx/v1/prescriptions/42ceca25-e9d0-466f-84a8-8ce554d70953/refills/2b0f2b18-a476-4e13-a4e0-b2fb8f499ae4

Paths for custom operations¶

Guidance

Providers should show that a custom operation was intentional by ending the path with a verb.

There may be considerations that force us into a non-standard path.

For example, you may decide to support searching of resources using POST rather than GET. This may be because a lengthy query would exceed the server’s limit (often 1024 characters for query strings and 2048 characters for URLs), or for security reasons, to keep PII or PHI out of logged URLs.

In cases like these, use a verb rather than a noun for the custom operation.

POST .../rx/v1/prescriptions/search

Headers¶

Requirement

Header names must be case insensitive.
Headers must not include API or domain-specific values.
APIs must not use headers in a way that changes the behaviour of HTTP methods.
APIs must not use headers to communicate business logic or service logic (such as paging response info or PII query parameters).

Guidance

HTTP headers should only be used for the purpose of handling cross-cutting concerns such as authorization.
API implementations should not introduce or depend on headers.
If available, HTTP standard headers should be used instead of creating a custom header.

HTTP headers are integral to the efficient communication between web clients and servers, acting as carriers for metadata about the HTTP transaction. While they have many roles (from security enhancements and caching directives to connection management and content negotiation), the designers of the HTTP protocol did not intend headers to transport application-specific data or execute business logic. Beyond deviating from HTTP semantics, misusing headers can cause issues due to size and character limits, visibility, caching, performance, compatibility, and security.

Headers and PII/PHI¶

When implementing business logic or transmitting application data, we recommend using the body of the HTTP request for Personally Identifiable Information and Protected Health Information (PII/PHI) data and the query parameters or URI itself for non-PII/PHI data.

For several reasons, using the request body to transmit PII is preferable over headers. Headers are designed for metadata, not primary application data, and might face size limitations, potentially truncating transmitted PII. Security tools often focus on safeguarding the request body and might inadvertently log sensitive header data. Additionally, while both are encrypted under HTTPS, application-layer encryption predominantly targets the body. Standard practices favor transmitting data in the request body, offering more structured and safer data handling. Moreover, headers might be more exposed in debugging tools, risking unintended PII disclosure. Given the above, the request body is a more suitable transport mechanism for PII for security, semantics, and practicality.

Individual Header Guidance¶

Accept¶

This request header specifies the media types that the API client is capable of handling in the response. These API guidelines assume that APIs accept application/json.

Guidance

APIs handling the request should not assume Accept is available.

Accept-Charset¶

Requirement

APIs must not respond to this header. Browsers omit this header and servers should ignore it.

Content-Language¶

This request/response header is used to specify the language of the content.

Guidance

APIs should provide this header in the response.
The value of this header should correspond to the language of the data in the response.
APIs should default to using the locale en-US.

Example:

Content-Language: en-US

Content-Type¶

This request/response header indicates the media type of the request or response body.

Requirement

APIs must include it with the response if there is a response body (it is not used with 204 responses).
If the content is a text-based type, such as application/json, the Content-Type must include a character-set parameter.
The character-set must be UTF-8.

In a request:

Content-Type: application/json; charset=utf-8

In a response:

Content-Type: application/json; charset=utf-8
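
A test suite could verify the Content-Type requirement with a check like the one below. This is a sketch under the assumption that the API only serves application/json; the function name is hypothetical. It uses the standard library's email.message.Message to parse the header parameters.

```python
from email.message import Message


def has_utf8_json_content_type(header_value: str) -> bool:
    """Return True if a Content-Type value is application/json with charset UTF-8."""
    message = Message()
    message["Content-Type"] = header_value
    # get_content_charset() lowercases the charset and returns None when absent.
    return (
        message.get_content_type() == "application/json"
        and message.get_content_charset() == "utf-8"
    )
```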

Link¶

According to Web Linking RFC 5988, a link is a typed connection between two resources that are identified by Internationalised Resource Identifiers (IRIs). The Link entity-header field provides a means for serializing one or more links in HTTP headers.

Requirement

The Link header must not be used in responses with status codes 201 or 3xx.

Guidance

APIs should prefer returning links within the response body's meta object.

Location¶

The Location response header indicates the URL to redirect to or, for a 201 response, the URL of the newly created resource. It is only meaningful when served with a 3xx (redirection) or 201 (Created) status response.

Requirement

The Location header must only be used in responses with a 3xx (redirection) or 201 (Created) status code.

Prefer¶

The Prefer request header field indicates that particular server behaviors are preferred by the client but are not required for successful completion of the request. It is an end-to-end field and must be forwarded by a proxy if the request is forwarded, unless Prefer is explicitly identified as being hop-by-hop using the Connection header field.

The following token values are possible to use for APIs as long as an API's documentation explicitly indicates support for Prefer.

Prefer: respond-async: API client prefers that the API server processes its request asynchronously.

Server returns a 202 Accepted response and processes the request asynchronously. API server could use a webhook to inform the client subsequently, or the client may call GET to get the response at a later time.

Prefer: read-consistent: API client prefers that the API server returns responses from a durable store with consistent data. For APIs that are not offering any optimization preferences for their clients, this behavior would be the default and it would not require the client to set this token.

Prefer: read-eventual-consistent: API client prefers that the API server returns responses from cache or an eventually consistent datastore if applicable. If the data is not found in either of these two types of sources, the API server might return the response from a consistent, durable datastore.

Prefer: read-cache: API client prefers that the API server returns responses from cache if available. If there is a cache miss, the server could return the response from other sources such as an eventually consistent datastore or a consistent, durable datastore.

Prefer: return=representation: API client prefers that the API server includes an entity representing the current state of the resource in the response to a successful request. This preference is intended to provide a means of optimizing communication between the client and server. It eliminates the need for a subsequent GET request to retrieve the current representation of the resource after a creation (POST) or modification (PUT or PATCH) operation.

Prefer: return=minimal: API client prefers that the server returns only a minimal response to a successful request. The determination of what constitutes an appropriate “minimal” response is solely at the discretion of the server.
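
A server that supports Prefer needs to read the tokens listed above from the header value. The sketch below shows one simplified way to do that; the function name is hypothetical, preference parameters after a semicolon are ignored, and quoted values containing commas are not handled.

```python
from typing import Optional


def parse_prefer(header_value: str) -> dict[str, Optional[str]]:
    """Parse a Prefer header value into a mapping of preference name to value."""
    preferences: dict[str, Optional[str]] = {}
    for item in header_value.split(","):
        # Drop any parameters that follow the preference itself.
        token = item.split(";", 1)[0].strip()
        if not token:
            continue
        name, separator, value = token.partition("=")
        name = name.strip().lower()
        # When a preference is repeated, only the first occurrence is used.
        if name not in preferences:
            preferences[name] = value.strip().strip('"') if separator else None
    return preferences
```

For example, parse_prefer("respond-async, return=minimal") yields a mapping in which respond-async has no value and return maps to minimal.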

ETag¶

An entity tag (ETag) response header is a good approach to make update requests idempotent. ETags are generated by the server based on the current resource representation.

If-Match¶

Using the If-Match header with the current ETag value representing the current resource state allows the server to provide idempotent operations and avoid race conditions. The server would only execute the update if the If-Match value matches the current ETag for the resource.
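
The ETag and If-Match interaction can be sketched as follows. This is an illustration rather than a prescribed implementation: the function names are hypothetical, the ETag is assumed to be a SHA-256 hash of the serialized resource, and the 412 (Precondition Failed) status comes from the HTTP specification rather than from this guide.

```python
import hashlib
import json
from typing import Any


def compute_etag(resource: dict[str, Any]) -> str:
    """Derive a quoted ETag from a stable serialization of the resource."""
    canonical = json.dumps(resource, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f'"{digest}"'


def conditional_update(
    current: dict[str, Any], if_match: str, changes: dict[str, Any]
) -> tuple[int, dict[str, Any]]:
    """Apply changes only if the client's If-Match value matches the current ETag."""
    if if_match != compute_etag(current):
        # The resource changed since the client last read it.
        return 412, current
    return 200, {**current, **changes}
```

Replaying the same request with a stale ETag returns 412 and leaves the resource unchanged, which is what makes the retry safe.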

Resource Operations

Create (POST)¶

Requirement

Endpoints must use the POST method when creating resources without a consumer-supplied ID.
The operation should return a 201 with the created resource reflected in the response body.

Guidance

POST should not be idempotent, meaning that if the request is sent again, a second resource should be created.
The operation should return the created resource inside a data object.
The operation should return any errors inside an errors object.
The operation should return a 400 for syntax data errors (such as invalid JSON).
The operation should return a 422 for semantic data errors (such as failing application validation).

Creating a new resource¶

In most cases, the service generates an identifier for the resource. In cases where an identifier is supplied by the API consumer, follow the guidance for creating a resource with a consumer-supplied identifier.

Example create request¶

POST ../rx/v0/prescriptions

{
  "data": {
    "type": "Prescription",
    "attributes": {
      "prescriptionNumber": "1239876",
      "prescriptionName": "IBUPROFEN 400MG TAB",
      "facilityName": "DAYT29",
      "stationNumber": "989",
      "orderedDate": "2049-07-21T01:39:00Z",
      "expirationDate": "2050-07-21T01:39:00Z",
      "dispensedDate": "2049-07-22T10:07:00Z",
      "quantity": 30,
      "isRefillable": true
    }
  }
}

Example create response¶

When returning a 201 Created HTTP status code, the response body confirms that the resource was created, as shown below (in JSON:API format).

201 Created

{
  "data": {
    "type": "Prescription",
    "id": "1c2dfaaa-4eb9-482e-86a9-4e7274975967",
    "attributes": {
      "prescriptionNumber": "1239876",
      "prescriptionName": "IBUPROFEN 400MG TAB",
      "facilityName": "DAYT29",
      "stationNumber": "989",
      "orderedDate": "2049-07-21T01:39:00Z",
      "expirationDate": "2050-07-21T01:39:00Z",
      "dispensedDate": "2049-07-22T10:07:00Z",
      "quantity": 30,
      "isRefillable": true
    }
  }
}

Example create error¶

422 Unprocessable Entity

{
  "errors": [
    {
      "status": "422",
      "source": { "pointer": "/data/attributes/isRefillable" },
      "title":  "Invalid Attribute",
      "detail": "isRefillable must be a boolean value."
    }
  ]
}
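
The difference between a 400 (syntax) and a 422 (semantic) error can be sketched as below. The function names are hypothetical, and the only semantic rule checked is the isRefillable example used in this guide; a real API would validate every attribute.

```python
import json
from typing import Any, Optional


def error_document(
    status: int, title: str, detail: str, pointer: Optional[str] = None
) -> dict[str, Any]:
    """Build an errors object containing a single error."""
    error: dict[str, Any] = {"status": str(status), "title": title, "detail": detail}
    if pointer is not None:
        error["source"] = {"pointer": pointer}
    return {"errors": [error]}


def check_create_body(raw_body: str) -> Optional[dict[str, Any]]:
    """Return an errors object for a bad request body, or None if it passes."""
    try:
        body = json.loads(raw_body)
    except json.JSONDecodeError:
        # The body could not be parsed at all: a syntax error (400).
        return error_document(400, "Bad Request", "Request body is not valid JSON.")
    try:
        value = body["data"]["attributes"]["isRefillable"]
    except (KeyError, TypeError):
        value = None
    if not isinstance(value, bool):
        # The body parsed but failed application validation: a semantic error (422).
        return error_document(
            422,
            "Invalid Attribute",
            "isRefillable must be a boolean value.",
            "/data/attributes/isRefillable",
        )
    return None
```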

Create with a consumer-supplied identifier¶

Creating resources works differently when the consumer is supplying the identifier versus when the system is generating the identifier upon creation.

Guidance

A PUT method should be used, as the operation is idempotent even during creation.
On successful creation a 201 should be returned with the created resource in the body.
If the result is an update of an existing resource, a 204 should be returned with no response body.
The operation should return any errors inside an errors object.
The operation should return a 400 for syntax data errors (such as invalid JSON).
The operation should return a 422 for semantic data errors (such as failing application validation).
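
The 201 versus 204 behavior for a consumer-supplied identifier can be sketched as follows. The in-memory dictionary stands in for a real datastore, and the function name and the fixed "Prescription" type are assumptions made for illustration.

```python
from typing import Any, Optional


def put_resource(
    store: dict[str, dict[str, Any]], resource_id: str, attributes: dict[str, Any]
) -> tuple[int, Optional[dict[str, Any]]]:
    """Create or replace a resource at a consumer-supplied identifier."""
    already_existed = resource_id in store
    store[resource_id] = dict(attributes)
    if already_existed:
        # An update of an existing resource: 204 with no response body.
        return 204, None
    # A new resource: 201 with the created resource in the body.
    body = {
        "data": {
            "type": "Prescription",
            "id": resource_id,
            "attributes": store[resource_id],
        }
    }
    return 201, body
```

Sending the same PUT twice leaves the store in the same state, which is why the operation is idempotent even during creation.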

Read (GET)¶

Collections¶

Requirement

The operation must use an HTTP GET verb when fetching collections.
Successful responses must return a 200 status code.
An empty list is a successful response and must return a 200 status code.

Guidance

The operation should return resources inside a data object.
The operation should return any errors inside an errors object.
Supplemental information, such as pagination, should be in a meta object.

A collection resource should return a list of representations of all of the given resources (instances), including any related metadata. The array of resources should be in the data field. Consistent naming of collection resource fields allows API clients to create generic handling for using the provided data across various resource collections.

The GET verb should not affect the system, and should not change the response on subsequent requests unless the underlying data changes (as in, it should be idempotent). Exceptions to 'changing the system' are typically instrumentation/logging-related. The list of data should be filtered based on the privileges available to the API client, so that it lists only the resources the client is authorized to view and not all the resources in the domain. Providing a summarized or minimized version of the data representation can reduce the bandwidth footprint in cases where individual resources contain a large object.

Example read collection request¶

GET ../rx/v0/prescriptions

// No request body

Example read collection response¶

200 OK

{
  "data": [
    {
      "type": "Prescription",
      "id": "1c2dfaaa-4eb9-482e-86a9-4e7274975967",
      "attributes": {
        "prescriptionNumber": "1239876",
        "prescriptionName": "IBUPROFEN 400MG TAB",
        "facilityName": "DAYT29",
        "stationNumber": "989",
        "orderedDate": "2049-07-21T01:39:00Z",
        "expirationDate": "2050-07-21T01:39:00Z",
        "dispensedDate": "2049-07-22T10:07:00Z",
        "quantity": 30,
        "isRefillable": true
      }
    },
    {
      "type": "Prescription",
      "id": "ac9d4b3f-e4bd-49dd-b794-64ad05480729",
      "attributes": {
        "prescriptionNumber": "1239832",
        "prescriptionName": "ACETAMINOPHEN 200MG TAB",
        "facilityName": "DAYT29",
        "stationNumber": "989",
        "orderedDate": "2049-07-22T11:23:00Z",
        "expirationDate": "2050-07-22T11:30:00Z",
        "dispensedDate": "2049-07-23T12:35:00Z",
        "quantity": 30,
        "isRefillable": true
      }
    }
  ]
}
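
When a collection is paginated, the guidance above places the pagination details in a meta object alongside data. The sketch below shows one possible shape; the function name and the field names inside meta (pagination, page, pageSize, totalItems, totalPages) are assumptions, not names defined by this guide.

```python
import math
from typing import Any


def paginate(
    resources: list[dict[str, Any]], page: int, page_size: int
) -> dict[str, Any]:
    """Wrap one page of resources in a data array with pagination details in meta."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive integers")
    start = (page - 1) * page_size
    return {
        "data": resources[start:start + page_size],
        "meta": {
            "pagination": {
                "page": page,
                "pageSize": page_size,
                "totalItems": len(resources),
                "totalPages": math.ceil(len(resources) / page_size),
            }
        },
    }
```

An empty collection still produces a valid response with an empty data array, consistent with the requirement that an empty list returns a 200.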

Single resource¶

Requirement

The operation must use a GET verb when fetching single resources.
Successful responses must return a 200 status code.
The operation must return a 404 status code if the resource cannot be found.

Guidance

The operation should return the resource inside a data object.
The operation should return any errors inside an errors object.

A single resource is typically derived from the parent collection of resources and is often more detailed than an item in the representation of a collection resource. Executing GET should never affect the system and should not change the response on subsequent requests (as in, it should be idempotent).

All identifiers for sensitive data should be non-sequential and preferably non-numeric. In scenarios where this data might be used as a subordinate to other data, immutable string identifiers should be used for easier readability and debugging (such as NAME_OF_VALUE rather than 1421321).

Example single read request¶

GET ../rx/v0/prescriptions/1c2dfaaa-4eb9-482e-86a9-4e7274975967

// No request body

Example single read response¶

If the resource with that id is found, return a 200 'OK' status code. The response body should include the resource type and id, as shown below in JSON:API format.

200 OK

{
  "data": {
    "type": "Prescription",
    "id": "1c2dfaaa-4eb9-482e-86a9-4e7274975967",
    "attributes": {
      "prescriptionNumber": "1239876",
      "prescriptionName": "IBUPROFEN 400MG TAB",
      "facilityName": "DAYT29",
      "stationNumber": "989",
      "orderedDate": "2049-07-21T01:39:00Z",
      "expirationDate": "2050-07-21T01:39:00Z",
      "dispensedDate": "2049-07-22T10:07:00Z",
      "quantity": 30,
      "isRefillable": true
    }
  }
}

If the resource is not found, return a 404 'Not Found' status code.

404 Not Found

{
  "errors": [
    {
      "status": "404",
      "title":  "Not Found",
      "detail": "The requested resource could not be found."
    }
  ]
}

Update (PUT, PATCH)¶

Full resource updates¶

Requirement

The operation must use PUT for a full resource update.
Successful operations must return a 200 status code.
The operation must return a 404 status code if the resource cannot be found.

Guidance

PUT should be idempotent, meaning that calling it several times should return the same result.
The operation should return the resource inside a data object.
The operation should return errors inside an errors object.
The operation should return a 400 for syntax data errors (such as invalid JSON).
The operation should return a 422 for semantic data errors (such as failing application validation).

Example full update request¶

PUT ../rx/v0/prescriptions/1c2dfaaa-4eb9-482e-86a9-4e7274975967

{
  "data": {
    "type": "Prescription",
    "attributes": {
      "prescriptionNumber": "1239877",
      "prescriptionName": "IBUPROFEN 200MG TAB",
      "facilityName": "DAYT30",
      "stationNumber": "990",
      "orderedDate": "2049-07-22T02:29:00Z",
      "expirationDate": "2050-07-22T02:49:00Z",
      "dispensedDate": "2049-07-23T11:17:00Z",
      "quantity": 20,
      "isRefillable": false
    }
  }
}

Example full update response¶

200 OK

{
  "data": {
    "type": "Prescription",
    "attributes": {
      "prescriptionNumber": "1239877",
      "prescriptionName": "IBUPROFEN 200MG TAB",
      "facilityName": "DAYT30",
      "stationNumber": "990",
      "orderedDate": "2049-07-22T02:29:00Z",
      "expirationDate": "2050-07-22T02:49:00Z",
      "dispensedDate": "2049-07-23T11:17:00Z",
      "quantity": 20,
      "isRefillable": false
    }
  }
}

Example full update error¶

422 Unprocessable Entity

{
  "errors": [
    {
      "status": "422",
      "source": { "pointer": "/data/attributes/isRefillable" },
      "title":  "Invalid Attribute",
      "detail": "isRefillable must be a boolean value."
    }
  ]
}

Partial resource updates¶

Web frameworks differ on which HTTP verb to use for partial resource updates, with some defaulting to PUT for partial updates rather than the REST-prescribed PATCH. Providers should prefer PATCH, but this is not a strict requirement.

Requirement

On success, the operation must return a 200 status code.
The operation must return a 404 status code if the resource cannot be found.

Guidance

The operation should use PATCH for a partial resource update.
Whether it uses PUT or PATCH, the operation should be idempotent, meaning that calling it more than once does not change the result.
Only the fields being updated should be included in the request body.
The operation should return the resource inside a data object.
The operation should return errors inside an errors object.
The operation should return a 400 for syntax data errors (such as invalid JSON).
The operation should return a 422 for semantic data errors (such as failing application validation).

Example partial update request¶

PATCH ../rx/v0/prescriptions/1c2dfaaa-4eb9-482e-86a9-4e7274975967

{
  "data": {
    "type": "Prescription",
    "attributes": {
      "isRefillable": true
    }
  }
}

Example partial update response¶

200 OK

{
  "data": {
    "type": "Prescription",
    "attributes": {
      "prescriptionNumber": "1239877",
      "prescriptionName": "IBUPROFEN 200MG TAB",
      "facilityName": "DAYT30",
      "stationNumber": "990",
      "orderedDate": "2049-07-22T02:29:00Z",
      "expirationDate": "2050-07-22T02:49:00Z",
      "dispensedDate": "2049-07-23T11:17:00Z",
      "quantity": 20,
      "isRefillable": true
    }
  }
}

Example partial update error¶

422 Unprocessable Entity

{
  "errors": [
    {
      "status": "422",
      "source": { "pointer": "/data/attributes/isRefillable" },
      "title":  "Invalid Attribute",
      "detail": "isRefillable must be a boolean value."
    }
  ]
}
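
A partial update that sets fields to explicit values is naturally idempotent. The sketch below merges only the supplied attributes into the stored ones; the function name is hypothetical and nested attributes are not merged recursively.

```python
from typing import Any


def apply_partial_update(
    current: dict[str, Any], changes: dict[str, Any]
) -> dict[str, Any]:
    """Return a copy of the current attributes with only the changed fields replaced."""
    return {**current, **changes}
```

Applying the same changes a second time produces the same attributes as the first application, so a consumer can safely retry the request.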

Destroy (DELETE)¶

Requirement

The operation must use the DELETE method when destroying resources.
The operation must not include a request body.
The operation should return any errors inside an errors object.
The operation must return a 404 status code if the resource can't be found.

Guidance

Successful operations that include a status in the body should return a 200 status code.
Successful operations that do not include a status in the body should return a 204 status code.

Destroying a resource¶

Example destroy request¶

DELETE ../rx/v0/prescriptions/1c2dfaaa-4eb9-482e-86a9-4e7274975967

// No request body

Example destroy response¶

204 No Content

// No response body
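
The destroy behavior can be sketched as below, using the 204 variant with no response body. The in-memory dictionary stands in for a real datastore and the function name is hypothetical.

```python
from typing import Any, Optional


def delete_resource(
    store: dict[str, dict[str, Any]], resource_id: str
) -> tuple[int, Optional[dict[str, Any]]]:
    """Delete a resource, returning 204 on success or 404 if it does not exist."""
    if resource_id not in store:
        error = {
            "status": "404",
            "title": "Not Found",
            "detail": "The requested resource could not be found.",
        }
        return 404, {"errors": [error]}
    del store[resource_id]
    return 204, None
```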

Custom Operations¶

Requirement

If the HTTP method chosen for the custom operation does not accept a request body (GET or DELETE), the operation must not send one.

Guidance

API teams should show that a custom operation was intentional by ending the path with a verb.
Custom operations with body parameters should default to the POST method.
Custom operations should not use the PATCH method.

API teams should choose from the 5 standard HTTP methods whenever workable, using custom operations only for custom functionality that falls outside the uses of one of the standard methods.

Example using GET¶

In this example of getting tracking information for a prescription, the request does not need to send any parameters. Therefore, a GET method is preferred as it is the standard REST method for reading a resource.

Example request with no body¶

GET .../rx/v1/prescriptions/1c2dfaaa-4eb9-482e-86a9-4e7274975967/track

// No request body

Example response with body¶

200 OK

{
  "data": {
    "type": "PrescriptionTracking",
    "id": "add84f99-f9ce-48f8-a56c-625868d11efc",
    "attributes": {
      "prescriptionId": "1c2dfaaa-4eb9-482e-86a9-4e7274975967",
      "prescriptionName": "IBUPROFEN 400MG TAB",
      "trackingNumber": "add84f99-f9ce-48f8-a56c-625868d11efc",
      "shippedDate": "2049-07-22",
      "deliveryService": "USPS",
      "trackingURL": "https://tools.usps.com/go/TrackConfirmAction?tRef=fullpage&tLc=2&text28777=&tLabels=add84f99-f9ce-48f8-a56c-625868d11efc%2C&tABt=false"
    }
  }
}

Example using POST¶

In this example, which contains (mock) PII, we're sending a prescription search query that includes a mock SSN in a POST body. The request side of the operation uses a custom Query resource that is not persisted in the system and cannot be retrieved. However, the operation returns a response that looks almost identical to that of a Prescription resource collection's read operation.

Example query request¶

POST .../rx/v1/prescriptions/search

{
  "data": {
    "type": "Query",
    "attributes": {
      "ssn": "777-98-7654",
      "quantity": ">10",
      "dispensedDate": ">2049-01-01T00:00:00Z AND <2050-01-01T00:00:00Z"
    }
  }
}

Example query response¶

200 OK

{
  "data": [
    {
      "type": "Prescription",
      "id": "1c2dfaaa-4eb9-482e-86a9-4e7274975967",
      "attributes": {
        "prescriptionNumber": "1239876",
        "prescriptionName": "IBUPROFEN 400MG TAB",
        "facilityName": "DAYT29",
        "stationNumber": "989",
        "orderedDate": "2049-07-21T01:39:00Z",
        "expirationDate": "2050-07-21T01:39:00Z",
        "dispensedDate": "2049-07-22T10:07:00Z",
        "quantity": 30,
        "isRefillable": true
      }
    },
    {
      "type": "Prescription",
      "id": "ac9d4b3f-e4bd-49dd-b794-64ad05480729",
      "attributes": {
        "prescriptionNumber": "1239832",
        "prescriptionName": "ACETAMINOPHEN 200MG TAB",
        "facilityName": "DAYT29",
        "stationNumber": "989",
        "orderedDate": "2049-07-22T11:23:00Z",
        "expirationDate": "2050-07-22T11:30:00Z",
        "dispensedDate": "2049-07-23T12:35:00Z",
        "quantity": 30,
        "isRefillable": true
      }
    }
  ]
}

Errors¶

Guidance

Error fields should follow RFC 7807.
An operation should be able to return multiple errors in one response.
The status field should match the HTTP status code being returned.
The title field should be the generic class of the error and consistent across the API.
The detail field should be specific to the error at hand.
If an error is in the request, a source field should point to it.

Required Errors¶

Returning reliable, consistent, and descriptive errors from responses is crucial for maintaining an API's usability as it facilitates quicker diagnosis and mitigation of issues by both consumers and API teams. VA has two sets of error status codes for standardized error reporting. One of the sets contains standard error status codes, and API teams must include them in all endpoints. The other set contains error status codes related to errors that occur upstream. If the API depends on upstream services, teams must use both sets of error codes.

The table below contains the required error status codes that all APIs must use.

Status Code	Description
400	Bad Request
401	Unauthorized
403	Forbidden
404	Not Found
429	Too Many Requests
500	Internal Server Error

If your API has endpoints dependent on upstream services, you must also include the following error status codes.

Status Code	Description
502	Bad Gateway
503	Service Unavailable
504	Gateway Timeout

For 5xx level errors returned from upstream services, error mapping must be implemented so that your API returns the desired error responses to the consumer and does not just pass along the error responses from the upstream services. This must be done to ensure that the consumer does not receive system details or other sensitive information about the upstream services.
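
Error mapping for upstream failures can be sketched as follows. The specific mapping is an assumption made for illustration (a timeout becomes 504, an upstream 503 stays 503, and any other upstream failure becomes 502), and the function name is hypothetical. The point is that the response is built from fixed text and never copies the upstream body.

```python
from typing import Any, Optional

# Fixed titles for the upstream-related status codes required by this guide.
UPSTREAM_TITLES = {
    502: "Bad Gateway",
    503: "Service Unavailable",
    504: "Gateway Timeout",
}


def map_upstream_failure(
    upstream_status: Optional[int], timed_out: bool = False
) -> tuple[int, dict[str, Any]]:
    """Translate an upstream failure into a sanitized error response."""
    if timed_out:
        status = 504
    elif upstream_status == 503:
        status = 503
    else:
        status = 502
    error = {
        "status": str(status),
        "title": UPSTREAM_TITLES[status],
        # Generic wording only: no upstream hostnames, stack traces, or messages.
        "detail": "An upstream service did not return a valid response.",
    }
    return status, {"errors": [error]}
```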

The section Choosing an error code describes these error codes in more detail and can help with selecting the correct one.

Error Schemas¶

The error schema should match that of the information model you are using. This guide recommends using JSON:API unless an industry-specific format like FHIR is required. We recommend JSON:API because it has a well thought-out, extensible, and relatively simple error model.

An example of a JSON:API-formatted 422 'Unprocessable Entity' error is:

422 Unprocessable Entity

{
  "errors": [
    {
      "status": "422",
      "source": { "pointer": "/data/attributes/isRefillable" },
      "title": "Invalid Attribute",
      "detail": "isRefillable must be a boolean value."
    }
  ]
}

FHIR has a different error schema (OperationOutcome resource). An example is:

{
  "resourceType": "OperationOutcome",
  "id": "searchfail",
  "text": {
    "status": "generated",
    "div": "<div xmlns=\"http://www.w3.org/1999/xhtml\">\n<p>The &quot;name&quot; parameter has the modifier &quot;exact&quot; which is not supported by this server</p>\n</div>"
  },
  "issue": [
    {
      "severity": "fatal",
      "code": "code-invalid",
      "details": {
        "text": "The \"name\" parameter has the modifier \"exact\" which is not supported by this server"
      },
      "location": [
        "http.name:exact"
      ]
    }
  ]
}

Multiple errors¶

Operations should be able to return multiple errors, and providers should return errors for all the issues with the request at once. For example, if 2 fields are invalid, return both so the consumer is aware of all the issues that must be corrected, instead of raising an error as soon as the first invalid field is processed. If the error statuses differ but share the same class (the same hundreds digit), for example, both are 4xx, return the generic base value for that class (400). If the errors have mixed classes, return a 500.

422 Unprocessable Entity

{
  "errors": [
    {
      "status": "422",
      "source": { "pointer": "/data/attributes/isRefillable" },
      "title": "Invalid Attribute",
      "detail": "isRefillable must be a boolean value."
    },
    {
      "status": "422",
      "source": { "pointer": "/data/attributes/contactEmail" },
      "title": "Invalid Attribute",
      "detail": "contactEmail must be a valid email address."
    }
  ]
}
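
The rule for choosing the overall HTTP status when a response carries several errors can be sketched as below. The function name is hypothetical; the logic follows the rule described in this section.

```python
def combined_status(error_statuses: list[int]) -> int:
    """Choose the HTTP status for a response that contains one or more errors."""
    if not error_statuses:
        raise ValueError("at least one error status is required")
    unique = set(error_statuses)
    if len(unique) == 1:
        # Every error shares one status, so return it unchanged.
        return error_statuses[0]
    classes = {status // 100 for status in unique}
    if len(classes) == 1:
        # Different statuses in the same class: return the generic base value.
        return classes.pop() * 100
    # Mixed classes, such as 4xx together with 5xx.
    return 500
```

In the example above both errors are 422, so the response status is 422. A 404 combined with a 422 would produce a 400.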

Choosing an error code¶

Choosing the correct HTTP status code for an error can be confusing. With VA APIs, the error will originate from one of three sources: from the consumer’s request; an upstream service the API depends on; or the API itself, such as the web application and its components.

Request errors: Issues caused by requests are a flavor of a 400 error (4xx).
Upstream errors: Errors from upstream services are 502, 503, or 504.
Application errors: Always return a 500.

The flowchart below can help you decide which HTTP status code to return for an error.

Flowchart illustrating error handling for an API.

Naming & Formatting

Naming & Formatting¶

Guidance

Fields should use camelCase.
Acronyms should be camelCase rather than uppercase.
Abbreviations should be avoided.
Booleans should be prefixed with an auxiliary verb (such as is, has, or can).
Enums should be UPPER_CASE strings with underscores in place of spaces.

Case¶

The VA recommends using camelCase rather than snake_case or kebab-case for JSON field names. API teams can use any programming language for their applications, including languages (Ruby, Python, etc.) whose style guides and communities have settled on snake case for variable and field names. That said, being consistent across the VA set of APIs is necessary. Camel case was chosen because JavaScript is the dominant language for web clients and API consumers (and is the JS in JSON), and it uses camel case.
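
Teams working in a snake_case language can convert field names at the serialization boundary. The sketch below is one way to do that; the function names are hypothetical and the input keys are assumed to be snake_case already.

```python
from typing import Any


def to_camel_case(snake_name: str) -> str:
    """Convert a snake_case name such as 'birls_id' to camelCase ('birlsId')."""
    first, *rest = snake_name.split("_")
    return first.lower() + "".join(part.capitalize() for part in rest)


def camelize_keys(value: Any) -> Any:
    """Recursively convert the keys of dictionaries, including those nested in lists."""
    if isinstance(value, dict):
        return {to_camel_case(key): camelize_keys(item) for key, item in value.items()}
    if isinstance(value, list):
        return [camelize_keys(item) for item in value]
    return value
```

Because each part after the first is capitalized with its remaining letters lowercased, an acronym such as BIRLS_ID becomes birlsId rather than BIRLSId.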

Acronyms¶

Acronyms are usually written in uppercase within API documentation while most programming languages reserve uppercase for constants. Using uppercase for the acronym name within a field that follows the camel case naming convention can be confusing when read by consumers. For example, the field named BIRLSId contains an acronym but it's difficult to tell where the acronym name ends and where the next part of the variable name begins. Therefore, name your field birlsId instead of BIRLSId.

Abbreviations¶

For clarity, it's better to spell a word completely rather than use an abbreviation, especially a non-standard one. Modern editors and IDEs can autocomplete variable names, so abbreviations no longer save keystrokes. The VA has complicated terminology; future developers or your future self will appreciate the lack of ambiguity.

Booleans¶

JSON field names carry no type information, so prefixing a boolean field with an auxiliary verb (such as is, has, or can) marks it as a boolean. This is also closer to natural language; for example, 'run this code if the user is a veteran' becomes if (isVeteran) {}.

Addresses¶

APIs are free to format data, including address data, to best fit the needs of their consumers. However, if an API team is looking for a suggested schema for an address data type, it is strongly recommended to follow the address format the VA Profile team uses.

The schema has separate fields for international country subdivisions (province rather than state) and postal codes (internationalPostalCode rather than zipCode).

{
  "$schema": "http://json-schema.org/draft-04/schema",
  "type": "object",
  "required": [
    "addressLine1",
    "addressLine2",
    "addressLine3",
    "addressPou",
    "addressType",
    "city",
    "internationalPostalCode",
    "province",
    "stateCode",
    "zipCode",
    "zipCodeSuffix"
  ],
  "properties": {
    "id": {
      "type": "integer"
    },
    "addressLine1": {
      "type": "string"
    },
    "addressLine2": {
      "type": ["string", "null"]
    },
    "addressLine3": {
      "type": ["string", "null"]
    },
    "addressPou": {
      "type": "string"
    },
    "addressType": {
      "type": "string"
    },
    "city": {
      "type": "string"
    },
    "countryCodeIso3": {
      "type": "string"
    },
    "internationalPostalCode": {
      "type": ["string", "null"]
    },
    "province": {
      "type": ["string", "null"]
    },
    "stateCode": {
      "type": "string"
    },
    "zipCode": {
      "type": "string"
    },
    "zipCodeSuffix": {
      "type": ["string", "null"]
    }
  }
}
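The sketch below shows one way a team might sanity-check an address object against the required and nullable fields listed in the schema above. It is a simplified illustration using only the standard library, not a full JSON Schema validator. The sample values (including the addressPou and addressType strings) are hypothetical and do not come from VA Profile.

```python
# Required fields and nullable fields, copied from the schema above.
REQUIRED_FIELDS = [
    "addressLine1", "addressLine2", "addressLine3", "addressPou",
    "addressType", "city", "internationalPostalCode", "province",
    "stateCode", "zipCode", "zipCodeSuffix",
]
NULLABLE_FIELDS = {
    "addressLine2", "addressLine3", "internationalPostalCode",
    "province", "zipCodeSuffix",
}


def address_problems(address):
    """Return a list of problems found in the required address fields."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in address:
            problems.append(f"missing required field: {field}")
        elif address[field] is None:
            if field not in NULLABLE_FIELDS:
                problems.append(f"{field} must not be null")
        elif not isinstance(address[field], str):
            problems.append(f"{field} must be a string")
    return problems


# Hypothetical sample data for illustration only.
sample_address = {
    "addressLine1": "123 Main Street",
    "addressLine2": None,
    "addressLine3": None,
    "addressPou": "CORRESPONDENCE",  # assumed value
    "addressType": "DOMESTIC",       # assumed value
    "city": "Springfield",
    "countryCodeIso3": "USA",
    "internationalPostalCode": None,
    "province": None,
    "stateCode": "VA",
    "zipCode": "22150",
    "zipCodeSuffix": None,
}

print(address_problems(sample_address))  # []
```

Note that the nullable fields are still required: a domestic address sends province and internationalPostalCode as null rather than omitting them.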

Country codes¶

Guidance

Country codes should use ISO 3166-1 alpha-3.
All other country subdivisions (Canadian provinces, Mexican states, UK counties, etc.) should also use the second part of their ISO 3166-2 code.

US state and territory codes¶

Guidance

US state codes should use the USPS postal abbreviation, which is the second part of an ISO 3166-2:US code.

Use the two letter United States Postal Service abbreviation for US states, the District of Columbia, US territories, Air/Army Post Office (APO) and Fleet Post Office (FPO).

States¶

Name	Abbreviation
Alaska	AK
Alabama	AL
Arkansas	AR
Arizona	AZ
California	CA
Colorado	CO
Connecticut	CT
Delaware	DE
Florida	FL
Georgia	GA
Hawaii	HI
Iowa	IA
Idaho	ID
Illinois	IL
Indiana	IN
Kansas	KS
Kentucky	KY
Louisiana	LA
Massachusetts	MA
Maryland	MD
Maine	ME
Michigan	MI
Minnesota	MN
Missouri	MO
Mississippi	MS
Montana	MT
North Carolina	NC
North Dakota	ND
Nebraska	NE
New Hampshire	NH
New Jersey	NJ
New Mexico	NM
Nevada	NV
New York	NY
Ohio	OH
Oklahoma	OK
Oregon	OR
Pennsylvania	PA
Rhode Island	RI
South Carolina	SC
South Dakota	SD
Tennessee	TN
Texas	TX
Utah	UT
Virginia	VA
Vermont	VT
Washington	WA
Wisconsin	WI
West Virginia	WV
Wyoming	WY

Districts¶

Name	Abbreviation
District of Columbia	DC

Territories¶

Name	Abbreviation
American Samoa	AS
Guam	GU
Northern Mariana Islands	MP
Puerto Rico	PR
U.S. Virgin Islands	VI

Military APO/FPOs¶

Name	Abbreviation
Armed Forces Americas	AA
Armed Forces Europe, Canada, Middle East, and Africa	AE
Armed Forces Pacific	AP

Currency & Money¶

Guidance

All monetary amounts should be displayed in United States Dollars (USD).
Values should include the dollar symbol to the left of the amount, with no space.
Cents should be displayed using decimals and a fractional separator.
Amounts less than a dollar should start with a zero followed by a decimal.
Thousands should be separated by commas.

Return monetary values as strings prefixed by a US dollar symbol.

One thousand dollars and ten cents

{
  "paymentAmount": "$1,000.10"
}

Exact dollar values should still display cents.

One thousand dollars

{
  "paymentAmount": "$1,000.00"
}

Values less than one dollar are returned in relation to dollars.

Ten cents

{
  "paymentAmount": "$0.10"
}
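The formatting rules above can be sketched as a small helper. This is an illustration only: the function name `format_usd` is an assumption, as are the choices to round half up to the nearest cent and to reject negative amounts, which this guidance does not cover.

```python
from decimal import Decimal, ROUND_HALF_UP


def format_usd(amount):
    """Format a non-negative amount as a USD string such as "$1,000.10"."""
    # Convert through str() so floats like 0.1 do not carry binary noise.
    value = Decimal(str(amount))
    if value < 0:
        raise ValueError("negative amounts are not covered by this sketch")
    # Always keep two decimal places, even for exact dollar values.
    cents = value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    # "," adds thousands separators; ".2f" keeps the leading zero and cents.
    return f"${cents:,.2f}"


print(format_usd("1000.10"))  # $1,000.10
print(format_usd(1000))       # $1,000.00
print(format_usd("0.10"))     # $0.10
```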

Dates & Time¶

Requirement

Dates and timestamps must follow the ISO 8601 standard.

Guidance

Dates and timestamps should be in Coordinated Universal Time (UTC).
Timestamps should NOT use offsets.
Durations should follow the ISO 8601 standard.
Time Intervals should follow the ISO 8601 standard.

Dates¶

If your field only needs a date rather than a complete timestamp, use the ISO 8601 format YYYY-MM-DD.

This means the full year, month with leading zero, and day with leading zero, separated by hyphens.

December 24th, 2049

{
  "date": "2049-12-24"
}

January 2nd, 2049

{
  "date": "2049-01-02"
}

If only the month is required, you may omit the day. In this case, always use the full four-digit year and keep the hyphen so the value cannot be confused with the YYMMDD format.

February, 2049

{
  "date": "2049-02"
}
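As a brief sketch, both date formats above can be produced with Python's standard library. The helper names are illustrative assumptions.

```python
from datetime import date


def format_date(value):
    """Return the full date as YYYY-MM-DD."""
    return value.isoformat()


def format_year_month(value):
    """Return only the year and month as YYYY-MM."""
    return value.strftime("%Y-%m")


print(format_date(date(2049, 12, 24)))      # 2049-12-24
print(format_date(date(2049, 1, 2)))        # 2049-01-02
print(format_year_month(date(2049, 2, 1)))  # 2049-02
```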

Timestamps¶

Timestamps must be in ISO 8601 format in UTC, also known as Zulu time, which is denoted with a trailing 'Z'.

As with dates above, hours, minutes, and seconds must use leading zeros. All times must use a 24-hour clock (military time).

December 24th, 2049, at 3:09 PM

{
  "startTime": "2049-12-24T15:09:00Z"
}
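The sketch below shows one way to normalize a timestamp to UTC before returning it. It assumes the input is a time zone aware datetime; the function name `to_utc_timestamp` is hypothetical, and rejecting naive datetimes is a design choice of this example rather than a rule from the standard.

```python
from datetime import datetime, timedelta, timezone


def to_utc_timestamp(moment):
    """Convert an aware datetime to an ISO 8601 UTC string ending in 'Z'."""
    if moment.tzinfo is None or moment.utcoffset() is None:
        # A naive datetime has no known offset, so it cannot be converted safely.
        raise ValueError("datetime must be time zone aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# 3:09 PM UTC stays as is.
print(to_utc_timestamp(datetime(2049, 12, 24, 15, 9, tzinfo=timezone.utc)))
# 2049-12-24T15:09:00Z

# 7:09 AM at an offset of -08:00 is the same instant.
pacific_offset = timezone(timedelta(hours=-8))
print(to_utc_timestamp(datetime(2049, 12, 24, 7, 9, tzinfo=pacific_offset)))
# 2049-12-24T15:09:00Z
```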

Avoiding offset & time zone errors¶

Use UTC time.

It can be tempting to localize timestamps using offsets. For example, an API could return appointment times in the time zone of the facility's location. There are several issues with this example.

Offsets do not represent time zones. A time zone's offset can change with daylight saving time.
End users may live in another time zone from the facility they visit.
A VA system in another time zone may schedule a facility's appointments. Also, 13 states have more than one time zone.

UTC time has no offset and does not observe daylight saving time. Let your API be the source of truth for time and leave the local formatting up to the consuming application.

If you do need to capture or return the time zone for an event, use an additional field whose value is the fully qualified name in the tz database. An example would be America/Los_Angeles.

Duration¶

Durations represent an amount of time, in the format P[n]Y[n]M[n]DT[n]H[n]M[n]S (or P[n]W when the duration is expressed in weeks), where:

P is the duration designator (for period) placed at the start of the duration representation.
- Y is the year designator that follows the value for the number of calendar years.
- M is the month designator that follows the value for the number of calendar months.
- W is the week designator that follows the value for the number of weeks.
- D is the day designator that follows the value for the number of calendar days.
T is the time designator that precedes the time components of the representation.
- H is the hour designator that follows the value for the number of hours.
- M is the minute designator that follows the value for the number of minutes.
- S is the second designator that follows the value for the number of seconds.

You do not need to include all the duration and time designators.

Three years, six months, four days, twelve hours, thirty minutes, and five seconds

{
  "duration": "P3Y6M4DT12H30M5S"
}

Three years, six months

{
  "duration": "P3Y6M"
}

Three years

{
  "duration": "P3Y"
}
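The following sketch builds a duration string from its components and omits any designator whose value is zero. The function name `format_duration` is an assumption, and returning "PT0S" for an empty duration is this example's own choice.

```python
def format_duration(years=0, months=0, days=0, hours=0, minutes=0, seconds=0):
    """Build an ISO 8601 duration such as "P3Y6M4DT12H30M5S"."""
    date_part = "".join(
        f"{value}{designator}"
        for value, designator in ((years, "Y"), (months, "M"), (days, "D"))
        if value
    )
    time_part = "".join(
        f"{value}{designator}"
        for value, designator in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if value
    )
    if not date_part and not time_part:
        return "PT0S"  # assumed representation of a zero-length duration
    # The "T" designator appears only when at least one time component exists.
    return "P" + date_part + ("T" + time_part if time_part else "")


print(format_duration(3, 6, 4, 12, 30, 5))  # P3Y6M4DT12H30M5S
print(format_duration(years=3, months=6))   # P3Y6M
print(format_duration(years=3))             # P3Y
```

The "T" designator matters because M means months before it and minutes after it: `format_duration(months=6)` gives "P6M", while `format_duration(minutes=6)` gives "PT6M".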

Intervals¶

Intervals represent a range of time with specific start and end dates.

{
  "interval": "2049-03-01T13:00:00Z/2049-05-11T15:30:00Z"
}
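As a minimal sketch, an interval can be produced by joining two UTC timestamps with a forward slash. The function name `format_interval` is hypothetical, and the checks for aware datetimes and ordering are assumptions of this example.

```python
from datetime import datetime, timezone

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def format_interval(start, end):
    """Return an ISO 8601 interval "<start>/<end>" with both ends in UTC."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be time zone aware")
    if end < start:
        raise ValueError("end must not be earlier than start")
    start_utc = start.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT)
    end_utc = end.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT)
    return f"{start_utc}/{end_utc}"


print(format_interval(
    datetime(2049, 3, 1, 13, 0, tzinfo=timezone.utc),
    datetime(2049, 5, 11, 15, 30, tzinfo=timezone.utc),
))
# 2049-03-01T13:00:00Z/2049-05-11T15:30:00Z
```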

Standard Field Names¶

API teams are free to name fields as they see fit; however, to promote consistency across APIs, consider using names from the table below for fields with concepts similar to your own.

Name	Type	Description
icn	string	A Master Person Index (MPI) Integration Control Number
birlsId	string	Beneficiary Identification and Records Locator (Sub)System Identifier
edipi	string	A Department of Defense (DoD) Electronic Data Interchange Personal Identifier
mhvId	string	My HealtheVet Identifier
secId	string	EAuth Universally Unique Identifier
vet360Id	string	A VA Profile Vet360 Identifier
createDate	datetime	The datetime a resource was created
updateDate	datetime	The datetime a resource was updated
deleteDate	datetime	The datetime a resource was (soft) deleted
startDate	datetime	The beginning of a datetime range
endDate	datetime	The end of a datetime range
page	integer	The current paginated page
pageSize	integer	The number of items in a paginated page
totalSize	integer	The total number of items regardless of pagination
first	string	The first page of a paginated list
last	string	The last page of a paginated list
prev	string	The previous page of a paginated list
next	string	The next page of a paginated list
filter	string	How a list is filtered (for example, filter=createDate>2012-08-17T21:51:08Z)
sort	string	How a list is sorted (for example, sort=age,name)

Production Management

Production Management¶

By definition, an API is an interface that other software depends upon. Once your API is in production servicing consumers, how do you deliver updates to your API without breaking the other dependent applications?

Effectively managing how source code is built and released from a repository into a running production system is crucial to your API's performance and availability, and it minimizes disruptions for consumers.

This section covers the operations for updating and releasing your API. Please also review versioning guidelines to avoid breaking consumers using your API.

APIs are unique because they can be the backbone of multiple applications like websites, mobile applications, and even other APIs. Therefore, strategies to upgrade APIs need to be carefully considered.

How often should you release?
How do you inform consumers of changes to the behavior of your API?
How do you inform consumers about a new release or a new version of your API?
How do you have zero downtime when deploying a new release of your API?

Deployment Frequency¶

A basic engineering principle of software releases is to make small incremental changes often.
This is a win-win situation because it:

Encourages automation of the continuous delivery (CD) of your software, thus reducing manual mistakes in the delivery.
Encourages the automation of testing, thus reducing errors humans miss when doing routine, repetitive testing tasks.
Simplifies root-cause analysis, because less code changes in any given release.
Allows VA customers, like the Veterans we serve, to see improvements sooner.

Guidance

Automate the delivery of the software to make releasing a trivial task.
Automate the testing of the software to make testing of the release trivial to humans.
Release small incremental improvements.
Release often, at least once every 2 weeks, or when a new feature or bug fix is ready.
Release critical bug fixes within 4 hours after they pass testing procedures.

How often is often enough?¶

VA encourages continuous integration and continuous delivery (CI/CD) of your API. Aim for a weekly or every-other-week release schedule, or release as soon as a new feature or bug fix has passed testing.

Do not release code for the sake of releasing, but don't delay doing releases when a feature or fix is ready. The earlier a feature or fix is available, the earlier VA can reap the benefits.

This involves thinking about your API features differently and breaking down new features into small incremental changes versus large releases that generally require a more complex exit strategy and troubleshooting steps if things go wrong.

Bug fixes¶

Release a bug fix immediately if the bug causes severely abnormal data behavior or makes an application unusable. Critical bug fixes should NOT wait for the normal release cadence; release them within 4 hours after the patch passes testing.

Documentation Updates¶

Requirement

Updated documentation must be released when API changes are released.
Published API documentation that matches production behavior must be easily accessible.

Follow the guidance within the Documentation section while writing your API and its documentation. Documenting your API correctly is vital for VA to operate effectively and efficiently. Using a tool for automatically generating the documentation is highly recommended.

Providing an OpenAPI Specification (OAS) for an API is not a one and done activity. It is fundamental to the API lifecycle, and must be kept up-to-date as the API evolves. Release the documentation changes as part of the release when the API behavior changes. Follow the Release Notes guidance when the release occurs.

Release Notes¶

Requirement

Release notes must accompany each major and minor release, along with an updated OpenAPI specification (OAS).

Guidance

It's optional but recommended to publish release notes for patch versions.

By virtue of your code being an interface that other software relies upon, it is important to publish release notes when the API changes.

Release notes are required if an API changes in a way that could affect a consumer's capabilities. This is regardless of whether the change is within a major, minor, or patch release (Major.Minor.Patch). As discussed in the versioning section, examples of semantic versioning minor changes include an additional endpoint, an additional response property, or a new optional query parameter. These are examples of non-breaking changes but changes a consumer is typically interested in knowing about. Therefore, the OpenAPI specification (OAS) must be updated AND release notes must be published to highlight the new functionality.

API teams may optionally publish release notes for changes internal to the API which are abstracted away from the end user. This is almost always a patch change. For example, iPhone security updates do not affect the UI and are released, with a one-sentence release note, as a patch version, e.g., iOS 14.8.1.

A second example of when release notes are required is when a release changes the behavior of the underlying system. That is, an endpoint with the same contract as before now has a different effect on its own system or on upstream systems when called.

The final example of release notes being required is when a new major version of your API is released. This is typically a breaking change for consumers and they will eventually be required to migrate from the prior major version to the latest major version. Help your consumers plan ahead by telling them about this as early as possible.

In summary, API teams must publish release notes and an updated OAS when the API's interface changes or the system's behavior changes. API teams may optionally publish release notes when a change is internal and does not update the interface or cause new side effects to the underlying system's data.
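The decision summarized above can be sketched as a small helper, for example as part of a release checklist script. The function names and the `affects_consumers` flag are illustrative assumptions; the sketch also assumes plain Major.Minor.Patch version strings without pre-release suffixes.

```python
def release_type(previous, current):
    """Classify a release as "major", "minor", or "patch"."""
    old = tuple(int(part) for part in previous.split("."))
    new = tuple(int(part) for part in current.split("."))
    if new[0] != old[0]:
        return "major"
    if new[1] != old[1]:
        return "minor"
    return "patch"


def release_notes_required(previous, current, affects_consumers=False):
    """Return True when release notes must be published.

    Notes are required for major and minor releases, and for any release
    that changes the interface or the behavior consumers can observe.
    They stay optional (but recommended) for purely internal patches.
    """
    return affects_consumers or release_type(previous, current) in ("major", "minor")


print(release_notes_required("1.4.0", "1.5.0"))                          # True
print(release_notes_required("1.3.0", "1.3.1"))                          # False
print(release_notes_required("1.3.0", "1.3.1", affects_consumers=True))  # True
```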

What should the release note contain?¶

Release notes should contain the release date, a summary of what has changed, and optionally the release number.

It is best practice to always publish a release note with every change to your API, even if the changes should not affect the consumers of the API.

Consider the perspective of the consumers using your API, from both the developers' and product managers' point of view. The release note increases awareness of the changes being made and is an important first source of information when debugging and troubleshooting a consumer's application that starts to behave differently. It also shares behavior changes that could affect how consumers interact with your API.

This is not an all-inclusive list, but certainly mention:

a new endpoint
a new parameter or property that is allowed
a change in a value that is accepted
additional fields in a response
new headers that are returned
updates that shouldn't have any effect, but might
documentation changes, such as improving examples
deprecated fields in the response and alternatives that can be used
deprecated endpoints and alternatives that can be used

Example non-breaking endpoint change

September 26, 2024 ( Version 1.5.0 )

Added an optional POST parameter for middle name on the /registrations endpoint.

Example bugfix change

September 19, 2024 ( Version 1.4.0 )

Fixed the /registrations endpoint to accept the single quote character (') in the first name and last name fields so that names such as O'Malley will now be accepted.

Example patch release change not affecting consumers

September 12, 2024 ( Version 1.3.1 )

Updated several third-party libraries to enhance performance, security, and functionality. No behavior changes should be noticed.

Example minor release change not affecting consumers

September 5, 2024 ( Version 1.3.0 )

Improved the underlying data structures. No behavior changes should be noticed.

Example deprecation of an endpoint

August 15, 2024 ( Version 1.2.0 )

Deprecated the POST /register endpoint. Please move to using POST /registrations instead. The deprecated endpoint is scheduled for complete removal in approximately one year (August, 2025).

Deployment Strategy¶

Requirement

A zero downtime deployment strategy must be used.
Functionality must be verified before production traffic is served.

Guidance

Blue/Green deployment strategy should be considered.
Automated system regression testing should occur after green assets are deployed, but before transitioning production traffic to it.
Automated final system smoke test should occur after transitioning production traffic to the newly deployed code.

Zero downtime deployments¶

A zero-downtime deployment strategy must be used to avoid service interruptions. With such a strategy, consumers do not have to bear the burden of maintenance windows while a new version of an API is being deployed. It also helps avoid the following problems:

availability interruptions (service is down)
functional interruptions (service is degraded)
undesirable behavior (escaped defects)

Availability interruptions can occur when transitioning from a prior release to a new release, for example, when a running server is terminated before a new server with the updated software is started.

Functional interruptions occur when the deployment of the new software release negatively impacts service behavior. Examples include attempting to serve traffic during an initialization period, introducing a bug, or misconfiguring the service.

Blue/green deployment strategy¶

The blue/green deployment technique uses two identical production environments where one environment is called “blue” and the other is called “green”. Because both environments are up and running simultaneously, availability is not negatively impacted. One environment has the current API running (blue) while the other has the new release running (green). After the green environment is verified to be working as expected, incoming traffic is directed to the green environment, and the blue environment no longer serves traffic. If problems are encountered with the new release, traffic can quickly revert to the blue environment.

Benefits include:

decreased risk in delivering new releases
elimination of downtime
elimination of communicating maintenance windows

In summary, the steps are:

Deploy the new code to a private server so it can be verified before promotion.
Run pre-promotion analysis: a comprehensive regression test suite that verifies business logic, authentication behavior, and error responses.
Once verified, swap the private server in as the current production server; the old production server stops servicing requests.
Run post-promotion analysis: a smaller smoke test suite that verifies new production traffic is being handled correctly.
Perform an immediate hot rollback to the prior server if anything is amiss.

Visit the AWS Whitepaper about blue/green deployments for an example vendor implementation.
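The promotion steps above can be sketched as a small orchestration function. This is a simplified illustration, not a vendor implementation: the four callables (`deploy_green`, `run_regression`, `switch_traffic`, `run_smoke`) are hypothetical hooks that a team would wire to its own deployment tooling.

```python
def blue_green_release(deploy_green, run_regression, switch_traffic, run_smoke):
    """Run a blue/green promotion and return the environment serving traffic."""
    # Step 1: deploy the new release next to the running (blue) environment.
    deploy_green()

    # Step 2: pre-promotion analysis. Green has not received production
    # traffic yet, so a failure here needs no rollback.
    if not run_regression():
        return "blue"

    # Step 3: promote green by moving production traffic to it.
    switch_traffic("green")

    # Step 4: post-promotion smoke test against live traffic.
    if not run_smoke():
        # Step 5: hot rollback to the environment that was serving before.
        switch_traffic("blue")
        return "blue"

    return "green"


# Example run in which the smoke test fails and traffic is rolled back.
events = []
serving = blue_green_release(
    deploy_green=lambda: events.append("deploy green"),
    run_regression=lambda: True,
    switch_traffic=lambda env: events.append(f"traffic -> {env}"),
    run_smoke=lambda: False,
)
print(serving)  # blue
print(events)   # ['deploy green', 'traffic -> green', 'traffic -> blue']
```

Keeping the blue environment running until the smoke test passes is what makes the rollback in step 5 immediate.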

Consumer Support

Consumer Support¶

Properly supporting your API will foster trust with your consumers, enhance their experience, and retain the value that VA has invested in the API.

Consumer support includes:

Maintaining reliable consumer-facing environments as described in Performance and Availability
Handling consumer requests
Providing outgoing consumer communications
Planning for consumer growth

Consumer Requests¶

Consumers will have questions, new requests for enhancements, or experience problems. The API team must provide a way to be contacted, support the requests, and communicate support expectations.

Requirement

API teams must have a documented point of contact for your API.
API teams must actively monitor the point of contact location.
API teams must communicate the support hours (including timezone) for your API.

Guidance

API teams should have documented expectations for consumer requests.
API teams should track consumer requests and practice effective prioritization.

Point of contact¶

Your API must have a clear point of contact as specified in the General Guidelines documentation example OAS under info.contact and it must be actively monitored.

The API team should determine how to handle different types of consumer questions and requests—either automatically or manually—through this monitored location to meet support expectations.

Support expectations¶

API teams should have support that covers the needs of your consumers. This could mean 24x7 or VA business hours support, depending on criticality and business needs.

API teams should meet the following consumer support Service Level Objectives:

Guidance

Acknowledge receipt of consumer questions or requests for enhancements within 24 hours of the inquiry.
Acknowledge the consumer's report of an outage or performance degradation within 60 minutes of receipt.
Disable API keys or client credentials within 15 minutes of the consumer's report of a security incident.
Provide an answer or update to a consumer's inquiry within 3 business days.

Issue resolution timeframe¶

Critical issues causing outages for the API consumer must be solved, tested, and released as quickly as possible. Releasing a critical fix promptly, as described in the Production Management bug fixes section, is important to build confidence with your consumers. Visit the Production Management section for more details on best practices for managing the API in production.

Team resources¶

API teams should have the necessary expertise and resources to continue developing business features, maintaining the API, and answering consumer inquiries. They should track consumer requests for changes and plan resources to ensure business continuity.

Consumer Communications¶

In addition to providing a way for consumers to contact you, API teams must provide a way to communicate announcements to consumers.

Requirement

API teams must inform consumers where and how to receive API announcements.
API teams must notify consumers of important events, such as outages, maintenance windows, enhancements, new versions, and deprecated versions.

Release Notes¶

All VA APIs must publish their release notes on CODE VA.

For public-facing APIs, release notes must be available on a public-facing location, in addition to CODE VA.

Notifications¶

Notifying consumers that a new release occurred or important events are occurring helps your consumers plan and support their applications.

API teams must provide consumers a place to sign up to receive outgoing announcements.

API teams must provide a location for their consumers to view the latest announcements. For public-facing APIs, this must be a public-facing location.

Events to communicate¶

API teams must communicate important events occurring in consumer-facing API environments, such as:

a widespread issue affecting API availability or service throughput
maintenance windows during which a service will be degraded or unavailable
a new major, minor, or patch version release along with release notes
a major version deprecation and guidance on what to use instead
deprecated functionality and guidance on what to use instead
resolution of a major outage or service degradation

Support expectations¶

Set expectations for the timeliness of consumer communications when you have important information for consumers. API teams should meet the following consumer Service Level Objectives:

Guidance

Announce a detected outage to all consumers within 15 minutes of detection.
Provide status updates on an outage or active incident every 60 minutes.
Report the API health status in a location accessible to your consumers.

Consumer Growth¶

Guidance

API teams should reach out to the consumer for their projected future usage.
API teams should proactively monitor consumer usage and detect unusual activity.
API teams should proactively do resource planning.

API teams should be aware of a consumer’s expected growth.

Conduct regular outreach to your consumers and ask what their projected growth is for the next 12 months. In addition, monitor their usage to detect growth or unusual behaviors and malfunctioning software on their side.

Resource planning¶

API teams should do resource planning for:

handling the growth of existing consumers
handling additional new consumers
necessary maintenance and upgrades to the API

Resources to include in planning are:

staffing necessary to support the projected usage
infrastructure to handle the projected future number of incoming requests to the API
data storage growth
security considerations

API Design Patterns

Pagination¶

Guidance

All endpoints that return collections should support pagination, even if they initially only return a few items.
Metadata about how the resource is paginated should be returned in a separate object (e.g. meta in JSON:API).

Pagination allows consumers to retrieve large amounts of data in smaller, more manageable chunks, reducing response times for requests and improving the overall performance of the API. Implementing pagination will help keep your API scalable, as pagination has proven to be an effective strategy to manage increasingly large data sets.

If your endpoint returns a collection, consider including pagination from the start. Data tends to grow rather than shrink, and small data sets are unlikely to remain small. Adding pagination after the fact can break the API's behavior (making it no longer backward compatible) and confuse consumers who may assume they have received a complete result when they have only received the first page.

When implementing pagination in your API, separating navigation links and metadata from data is important. That way, consuming applications can, in turn, keep data parsing and rendering separate from pagination logic.

Data specifications and pagination¶

One of the main reasons we recommend using a data convention such as JSON:API or FHIR is that they include pagination as part of their specifications. Following the data specification's pagination schema saves time and reduces bike-shedding amongst teams.

JSON:API specification¶

According to the JSON:API specification, pagination can be implemented using query parameters, typically page[number] and page[size]. Below is an overview of how this can be done for a collection of Prescription resources.

The API consumer requests a specific page of Prescription resources by providing the page[number] and page[size] query parameters in the URL. For example:

GET /rx/v0/prescriptions?page[number]=1&page[size]=10

The API server processes the request, retrieves the requested data, and returns a response containing the Prescription resources for the specified page. The response includes pagination details as part of the links object to help the consumer navigate between pages. Metadata about the total number of records and pages is included in the meta object:

{
  "data": [
    {
      "type": "Prescription",
      "id": "1c2dfaaa-4eb9-482e-86a9-4e7274975967",
      "attributes": {
        "prescriptionNumber": "1239876",
        "prescriptionName": "IBUPROFEN 400MG TAB",
        "facilityName": "DAYT29",
        "stationNumber": "989",
        "orderedDate": "2049-07-21T01:39:00Z",
        "expirationDate": "2050-07-21T01:39:00Z",
        "dispensedDate": "2049-07-22T10:07:00Z",
        "quantity": 30,
        "isRefillable": true
      }
    },
    {
      "type": "Prescription",
      "id": "ac9d4b3f-e4bd-49dd-b794-64ad05480729",
      "attributes": {
        "prescriptionNumber": "1239832",
        "prescriptionName": "ACETAMINOPHEN 200MG TAB",
        "facilityName": "DAYT29",
        "stationNumber": "989",
        "orderedDate": "2049-07-22T11:23:00Z",
        "expirationDate": "2050-07-22T11:30:00Z",
        "dispensedDate": "2049-07-23T12:35:00Z",
        "quantity": 30,
        "isRefillable": true
      }
    },
    // ... more prescription resources
  ],
  "links": {
    "self": "https://api.va.gov/rx/v0/prescriptions?page[number]=1&page[size]=10",
    "first": "https://api.va.gov/rx/v0/prescriptions?page[number]=1&page[size]=10",
    "prev": null,
    "next": "https://api.va.gov/rx/v0/prescriptions?page[number]=2&page[size]=10",
    "last": "https://api.va.gov/rx/v0/prescriptions?page[number]=5&page[size]=10"
  },
  "meta": {
    "pagination": {
      "pageNumber": 1,
      "pageSize": 10,
      "pages": 5,
      "records": 50
    }
  }
}
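As a rough sketch of the server side, the helper below builds the links and meta objects shown in the response above from a page number, a page size, and a total record count. The function name `build_pagination` is an assumption. The URLs keep the square brackets unencoded to match the example, and treating an empty collection as having a single last page link is this example's own choice.

```python
from math import ceil


def build_pagination(base_url, page_number, page_size, total_records):
    """Return (links, meta) for one page of a paginated collection."""
    total_pages = ceil(total_records / page_size)
    # Point "last" at page 1 even when the collection is empty (assumption).
    last_page = max(total_pages, 1)

    def page_url(number):
        return f"{base_url}?page[number]={number}&page[size]={page_size}"

    links = {
        "self": page_url(page_number),
        "first": page_url(1),
        # No previous page on the first page, no next page on the last page.
        "prev": page_url(page_number - 1) if page_number > 1 else None,
        "next": page_url(page_number + 1) if page_number < last_page else None,
        "last": page_url(last_page),
    }
    meta = {
        "pagination": {
            "pageNumber": page_number,
            "pageSize": page_size,
            "pages": total_pages,
            "records": total_records,
        }
    }
    return links, meta


links, meta = build_pagination("https://api.va.gov/rx/v0/prescriptions", 1, 10, 50)
print(links["next"])
# https://api.va.gov/rx/v0/prescriptions?page[number]=2&page[size]=10
print(meta["pagination"]["pages"])  # 5
```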

FHIR specification¶

FHIR provides its own approach to pagination, which is different from the JSON:API specification. The example below shows how pagination can be implemented in FHIR for a collection of MedicationRequest resources.

The consumer requests a collection of MedicationRequest resources by providing the _count query parameter in the URL:

GET /MedicationRequest?_count=10

The FHIR server processes the request, retrieves the requested data, and returns a Bundle resource containing the MedicationRequest resources for the first page. The response also includes pagination links as part of the link array to provide navigation capabilities between pages:

{
  "resourceType": "Bundle",
  "type": "searchset",
  "total": 50,
  "link": [
    {
      "relation": "self",
      "url": "https://api.va.gov/fhir/v0/MedicationRequest?_count=10&page=1"
    },
    {
      "relation": "first",
      "url": "https://api.va.gov/fhir/v0/MedicationRequest?_count=10&page=1"
    },
    {
      "relation": "previous",
      "url": null
    },
    {
      "relation": "next",
      "url": "https://api.va.gov/fhir/v0/MedicationRequest?_count=10&page=2"
    },
    {
      "relation": "last",
      "url": "https://api.va.gov/fhir/v0/MedicationRequest?_count=10&page=5"
    }
  ],
  "entry": [
    {
      "fullUrl": "https://example.com/fhir/MedicationRequest/67890",
      "resource": {
        "resourceType": "MedicationRequest",
        "id": "67890"
        // ... more MedicationRequest resource properties ...
      }
    }
  ]
}
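From the consumer's side, paging through a searchset Bundle amounts to following the "next" link until none is returned. The sketch below illustrates that loop under stated assumptions: `fetch_bundle` is a hypothetical callable that takes a URL and returns the parsed Bundle as a dictionary, so the example stays independent of any particular HTTP client.

```python
def next_page_url(bundle):
    """Return the URL of the Bundle's "next" link, or None if there is none."""
    for link in bundle.get("link", []):
        if link.get("relation") == "next" and link.get("url"):
            return link["url"]
    return None


def collect_resources(fetch_bundle, first_url):
    """Follow "next" links from first_url and gather every entry's resource."""
    resources = []
    url = first_url
    while url:
        bundle = fetch_bundle(url)
        resources.extend(entry["resource"] for entry in bundle.get("entry", []))
        url = next_page_url(bundle)
    return resources


# Two fake pages stand in for real server responses.
fake_pages = {
    "page-1": {
        "resourceType": "Bundle",
        "link": [{"relation": "next", "url": "page-2"}],
        "entry": [{"resource": {"resourceType": "MedicationRequest", "id": "67890"}}],
    },
    "page-2": {
        "resourceType": "Bundle",
        "link": [{"relation": "self", "url": "page-2"}],
        "entry": [{"resource": {"resourceType": "MedicationRequest", "id": "67891"}}],
    },
}
found = collect_resources(fake_pages.__getitem__, "page-1")
print([resource["id"] for resource in found])  # ['67890', '67891']
```

Because the consumer only follows the links it is given, the server remains free to change how its page URLs are constructed.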

Best practices¶

Outlined below are a few best practices to follow when designing pagination for your API.

Guidance

Remember to set default and maximum values for the page size.
If your API does not currently support pagination, introduce it in a way that maintains backward compatibility; otherwise, you must version your API.
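A minimal sketch of applying a default and a maximum page size is shown below. The constants (10 and 100) and the function name are hypothetical; each API should choose values that fit its data. Clamping an oversized request to the maximum, rather than rejecting it, is an assumption of this example.

```python
DEFAULT_PAGE_SIZE = 10   # assumed default, used when the consumer sends none
MAX_PAGE_SIZE = 100      # assumed upper bound that protects the server


def resolve_page_size(requested=None):
    """Return the page size to use for a request."""
    if requested is None:
        return DEFAULT_PAGE_SIZE
    size = int(requested)  # query parameters usually arrive as strings
    if size < 1:
        raise ValueError("page size must be a positive integer")
    # Cap oversized requests instead of failing them (assumption).
    return min(size, MAX_PAGE_SIZE)


print(resolve_page_size())        # 10
print(resolve_page_size("25"))    # 25
print(resolve_page_size("5000"))  # 100
```

Whichever values are chosen, documenting them in the OAS lets consumers know how many items to expect when they omit the parameter.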

Changelog¶

2.7.1 - 2025-09-11¶

Updated partial deprecation guidance.

2.7.0 - 2025-08-19¶

Single page version is now available.

2.6.0 - 2025-08-13¶

Added guidance under General Guidelines for Development Process, Environments and Test Data which includes Test Data Best Practices.

2.5.0 - 2025-03-04¶

Upgraded MkDocs version to improve screen reading experience for Section 508 compliance.

2.4.4 - 2025-02-28¶

Modified image descriptions to be Section 508 compliant.

2.4.3 - 2025-01-17¶

Modified tables and headings to be Section 508 compliant.

2.4.2 - 2025-01-14¶

Fixed code block examples to be Section 508 compliant.

2.4.1 - 2025-01-09¶

Added underline on hyperlinks for Section 508 compliance.

2.4.0 - 2024-12-05¶

A new section called Consumer Support was added. This section contains guidance for supporting consumers.

2.3.1 - 2024-10-02¶

Updated formatting and improved Partial Deprecation examples.

2.3.0 - 2024-09-24¶

A new section called Production Management was added. This section contains guidance on managing and deploying releases of APIs to minimize impact to dependent systems.

2.2.0 - 2024-08-26¶

A new section called API Design Patterns was added. This section will contain API design best practices and standards for reusable solutions to common problems encountered while designing and developing APIs.
- A Pagination subsection was added to the API Design Patterns section.

2.1.2 - 2024-08-07¶

Updated documentation page
- Guidance for OAS documentation was updated to use OAS version 3.0.x
- The example OAS was updated to include more details and guidance.

2.1.1 - 2024-07-30¶

The monitoring section was expanded to include information and guidance on Service Level Objectives (SLOs). This content explains what SLOs are, how to create and monitor them, and why they are important.

The health checks content was moved to its own page.

2.0.1 - 2024-07-29¶

Clarified the Lifecycle content for deprecation, deactivation, and decomposition.

2.0.0 - 2024-02-09¶

Rebranded the Lighthouse API Standards as the VA API Standards.
Lighthouse specific content has been moved to a new public-facing API guide on CODE VA.

1.3.0 - 2023-09-01

Updated the deprecation page with expanded governance details on the process, such as when to deprecate, headers marking an API as deprecated, and suggested deprecation/deactivation timelines.
Updated the versioning page to recommend that there should be no more than one active version of an API at a time.
Added a decomposition page with guidance on splitting a single API into multiple standalone APIs.

1.2.0 - 2023-07-10

Added a Defaults section which lists the default rate limit, timeout, and request size limit.
Renamed 'Availability' to 'Performance and Availability' and added details about expected performance and timeouts.
Updated the security section to reference the rate limit.

1.1.0 - 2023-06-23

Moved the errors page to the top-level navigation and added a required errors section.
Added API expectations that were on developer.va.gov and not covered elsewhere in the standards:
- SLAs for sandbox and production are listed on a new availability page within the general guidelines.
- Updated the architecture requirements to include that APIs must be stateless, cache compatible, and able to work as part of a layered system.

1.0.0 - 2023-04-01

Added a monitoring section.
Added an example OAS doc to general-guidelines/documentation.

Changed style to match VA Design System via material-va-lighthouse plugin.

0.1.0 - 2022-09-28

Initial release.
