Transifex API v3
Heads up! Version 3 of the Transifex API is just around the corner! This is a blog post about the history of Transifex APIs, the reasoning behind the decision to proceed with a new version, and its design philosophy.
A short history
Version 1
As you may have noticed, we call this API version 3, which means that it succeeds versions 1 and 2. Version 1 dates back to when Transifex was an open-source project; it was mainly used to help localize the Fedora Linux distribution and other open-source projects. Back then, it was a tool to help synchronize files between version control systems. We did not parse different file formats, we did not extract source strings and translations into our database, and we did not offer anything like a web translation editor. Our data and business model have evolved drastically since then, so API v1 was abandoned.
Version 2
API v2 has been our most adopted API so far. It is what our transifex-client interacts with. It is based on a library called django-piston, which was developed by Bitbucket and, back in the day, was the go-to library for building REST APIs with Django.
Despite its technical shortcomings, which we discuss later in this blog post, it is a very powerful API. This is evident from the fact that it has been around for so long, even as our business model keeps evolving, and has managed to support countless integrations.
For reference, this is how you can upload a JSON file to Transifex using API v2, with a tool like curl:
curl -i -L --user <username>:<password> -X POST \
-H "Content-type: multipart/form-data" \
-F slug=<resource_slug> \
-F name=<resource_name> \
-F i18n_type=KEYVALUEJSON \
-F content=@/path/to/the/file.json \
https://www.transifex.com/api/2/project/<project>/resources/
Version 2.5
When we started developing this API, we thought it would be THE API. It was built with more modern technologies, namely Python 3 and Django REST framework; it tackled some of the incompatibilities between the previous API and our business model; and it addressed some pressing needs of important users. Unfortunately, after its initial release and while it was in use, we realized that some of its design decisions would not allow us to scale to the level we wanted.
So, we decided to discontinue its development and give it the name “version 2.5” internally. It is still available under https://api.transifex.com/ and offers some functionality not in version 2, like screenshots.
Version 3
While the lessons learned from version 2.5 were fresh, we decided to be extra thoughtful with the design of version 3: we wanted the next version to be even more stable, robust, and scalable, and to offer a really good user experience. We formed a cross-team committee to define the problems, research, orchestrate, and write down the decisions, long before we started writing code for it. We were determined to get this right.
Shortcomings of the previous versions
The increments to our previous implementations were always in response to the drastic evolution of our business model and users’ requests. This meant that they lacked consistency. Here are a few examples:
- In order to create a new project, you needed to use the /projects endpoint and pass the organization as an argument while, in order to create a new resource, you needed to use the /project/<project>/resources endpoint (i.e., sometimes the required information was part of the URL and sometimes it was part of the parameters).
- Sometimes related objects would be nested in the response while other times only an identifier of the related object would be included and you would be prompted to make an extra call to get its information.
- The format of the responses, especially error responses, was not consistent; sometimes they would be in plaintext and sometimes represented as JSON objects.
These inconsistencies meant that it was hard for us to make decisions about how to evolve these APIs. Fixing the issues was also challenging since changes to an API always risk breaking backwards-compatibility.
Another important problem is that the technologies that version 2 is based on are aging: django-piston has been discontinued and was designed to work with older versions of Python and Django. We already had to fork it in order to introduce compatibility with the more recent versions that we use.
Design philosophy of API version 3
The most important decision was using {json:api} as a base for our specification. Here is a quote from {json:api}’s homepage:
If you’ve ever argued with your team about the way your JSON responses should be formatted, JSON:API can be your anti-bikeshedding tool.
- Discoverability: Even if you know very little about the specifics of an object returned in an API response, the response itself includes URLs that will help you traverse relationships, pagination, filters, etc.
Furthermore, {json:api} is very specific about how to handle relationships. We will never have the dilemma of whether to embed or link to a related object.
- Everything is a REST object: We don’t support RPC-style actions on objects. All actions are represented either as attribute edits or as creations of REST objects that represent actions.
For example, to mark a translation as reviewed, you will not be making a request to a URL like /translations/<id>/review; instead, you will make a PATCH request to a URL like /translations/<id> and set the reviewed attribute to true.
Similarly, to upload a file, you will not be making a request to a URL like /project/<id>/resource/<id>/content, like you had to with version 2; instead, you will make a POST request to a URL like /uploads and create an “upload job”.
- Flat URLs: Although flat URLs are not part of the {json:api} specification, they are part of its official recommendations. The old-style URLs like /project/<project>/resource/<resource> are gone. Now:
- All collection URLs have the /<type> format (e.g. /projects)
- All object URLs have the /<type>/<id> format (e.g. /projects/o:transifex:p:transifex)
- All related object URLs have the /<type>/<id>/<relationship-name> format (e.g. /projects/o:transifex:p:transifex/languages)
- All relationship URLs (for editing, not fetching, relationships) have the /<type>/<id>/relationships/<relationship-name> format (e.g. /projects/o:transifex:p:transifex/relationships/languages)
Apart from making the URLs extremely predictable, this spares us the dilemma of how to design our URLs. It also makes the specification more adaptable to changes to our business model. For example, you may have noticed that the root of most of the URLs of API version 2 is /project/…. This is because when version 2 was released, Transifex did not support organizations. With flat URLs, new object types can be introduced without reshaping the existing URLs.
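The flat URL formats and the "everything is a REST object" rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names and the https://api.example.com host are placeholders, the translation id is made up, and the request body follows the general {json:api} resource-object shape (data with type, id, and attributes) rather than anything confirmed by the Transifex specification. Nothing is sent over the network.

```python
import json
from urllib.parse import quote

BASE_URL = "https://api.example.com"  # placeholder host, not the real API root


def _segment(value):
    # Keep ":" readable, since ids such as "o:transifex:p:transifex" contain it.
    return quote(value, safe=":")


def collection_url(type_):
    """/<type>"""
    return f"{BASE_URL}/{_segment(type_)}"


def object_url(type_, id_):
    """/<type>/<id>"""
    return f"{collection_url(type_)}/{_segment(id_)}"


def related_url(type_, id_, relationship):
    """/<type>/<id>/<relationship-name>"""
    return f"{object_url(type_, id_)}/{_segment(relationship)}"


def relationship_url(type_, id_, relationship):
    """/<type>/<id>/relationships/<relationship-name>"""
    return f"{object_url(type_, id_)}/relationships/{_segment(relationship)}"


def attribute_patch(type_, id_, **attributes):
    """Describe a PATCH request that edits attributes of a single object.

    Returns (method, url, body) without performing any HTTP call.
    """
    body = {"data": {"type": type_, "id": id_, "attributes": attributes}}
    return "PATCH", object_url(type_, id_), json.dumps(body)


project = "o:transifex:p:transifex"
print(object_url("projects", project))
print(related_url("projects", project, "languages"))
print(relationship_url("projects", project, "languages"))

# Marking a translation as reviewed is an attribute edit, not an RPC-style call.
# "example-translation-id" is a made-up id used only for illustration.
method, url, body = attribute_patch(
    "translations", "example-translation-id", reviewed=True
)
print(method, url, body)
```

Because every URL is derived from a type, an id, and optionally a relationship name, a client needs no per-endpoint knowledge to build it.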
Further to {json:api}, we decided and documented several design guidelines that the new Transifex API should always follow:
- Backwards compatibility: Because deprecating old APIs is always hard, we decided that no iteration will ever break backwards compatibility. Instead, we will extend the API. For example:
- A required filter may become optional in the future but not the other way around.
- New fields and (non-required) filters may be introduced, but never removed.
- Pagination: Serving large volumes of data can lead to performance problems which are often solved by introducing pagination or changing its implementation. To that end, we decided that all returned lists should be paginated.
Furthermore, users are discouraged from working out how the pagination parameters work and basing their integrations on them; instead, they should follow the pagination links embedded in our responses. This means that, while it is possible for something like {"next": "/projects?page=2"} to be part of a response, users should not try to send a request to a /projects?page=7 URL on their own. This allows us to change the implementation of pagination for a specific type, and the formatting of the related URL parameters, without causing regressions.
Our preferred method of pagination is cursor-based, with the required parameters being base64-encoded and included in a page[cursor]=XXX parameter.
For similar reasons, the total number of items in a collection will not be shown by default, since it can be computationally expensive to calculate it. We may add it in some endpoints on a case-by-case basis, after having considered the performance implications and implemented any optimizations.
- Standard query language: {json:api} specifies filters as GET variables structured as ?filter[field]=XXX. It also suggests structuring nested filters as ?filter[a][b]=XXX. On top of this, we have come up with standardized ways of performing queries against various types of fields. For example:
- ?filter[created][gt]=2020-07-01 for numeric and timestamp comparisons
- ?filter[status][any]=translated,reviewed for keyword queries
- ?filter[text][contains]=foo for text queries
- ?filter[tags][any]=a,b for list queries
- etc
- Impersonation: In some of our endpoints, it is possible for someone (e.g. a localization manager) to perform some actions on behalf of another user (e.g. a translator). An example of this is migrating some translation work that has been done offline. This is too much of an edge case for {json:api} to have a specific proposal. So, having evaluated various options, we came up with an Impersonated-User HTTP header for this purpose.
- Bulk operations: {json:api} doesn’t support bulk operations, in order to reduce ambiguity. It does, however, provide a recommended way to document and add extensions to the specification. To that end, we wrote a Bulk profile/extension for {json:api} which we used on some of our endpoints. The extension was written to respect {json:api}’s philosophy:
- All bulk requests must concern the same object type and the same operation. You can create several translations with the same request, or you can edit several translations with the same request, but you cannot create 2 translations and edit another 3 with the same request.
- All requests must be atomic: Either every object in the request will be created/edited/deleted or none will.
- In case of failure, the response is a list of error objects, each pointing to the item in the request that caused the error via the source.pointer field.
- Supporting UIs is not a requirement: Having such a robust API in place, it was tempting to consider reimplementing our frontend so that it would consume this API to render everything. Had we agreed to go in that direction, we would have had to take this requirement into consideration when designing the API. However, while examining potential use-cases, we concluded that in order to make the API able to serve the needs of a frontend efficiently, we would have to introduce many impurities in its design. In the end, we decided that sacrificing the design of the API was not worth it. For the needs of the frontend, we can always create a GraphQL backend that will aggregate data from all relevant sources, including the API.
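The query-language and pagination guidelines from the list above can be sketched together in Python. The sketch makes several assumptions: fetch is a hypothetical callable that takes a URL and returns the decoded JSON body, the data and links.next keys follow the general {json:api} document layout, and the /strings collection, its items, and the cursor value are invented for the demonstration. The point is that the client builds the filter once and then treats every "next" link as opaque.

```python
from urllib.parse import urlencode


def filter_query(filters):
    """Flatten nested filters into the filter[a][b]=XXX query-string form."""
    pairs = []

    def walk(prefix, value):
        if isinstance(value, dict):
            for key, inner in value.items():
                walk(f"{prefix}[{key}]", inner)
        elif isinstance(value, (list, tuple)):
            # List values become comma-separated, e.g. "translated,reviewed".
            pairs.append((prefix, ",".join(str(v) for v in value)))
        else:
            pairs.append((prefix, str(value)))

    walk("filter", filters)
    # Keep brackets and commas readable; other characters are percent-encoded.
    return urlencode(pairs, safe="[],")


def iter_items(first_url, fetch):
    """Yield every item of a collection by following embedded "next" links."""
    url = first_url
    while url:
        page = fetch(url)
        yield from page.get("data", [])
        # Treat the link as opaque: never build page or cursor parameters by hand.
        url = (page.get("links") or {}).get("next")


# In-memory stand-in for an HTTP client; URLs, ids, and the cursor are made up.
FAKE_PAGES = {
    "/strings?filter[text][contains]=foo": {
        "data": [{"id": "s1"}, {"id": "s2"}],
        "links": {"next": "/strings?page[cursor]=abc"},
    },
    "/strings?page[cursor]=abc": {
        "data": [{"id": "s3"}],
        "links": {"next": None},
    },
}

first = "/strings?" + filter_query({"text": {"contains": "foo"}})
ids = [item["id"] for item in iter_items(first, FAKE_PAGES.__getitem__)]
print(first)
print(ids)
```

Since the loop never inspects the cursor, a change in how pagination is implemented on the server would not require any change to this client.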
Implementation of API version 3
One of the most important decisions we took was to use OpenAPI (formerly Swagger) to write the specification of our API. Here is a quote from OpenAPI’s homepage:
The OpenAPI Specification (OAS) defines a standard, language-agnostic interface to RESTful APIs which allows both humans and computers to discover and understand the capabilities of the service without access to source code, documentation, or through network traffic inspection.
An OpenAPI document is a JSON/YAML file that describes, in a structured manner, all the endpoints, authentication methods, content-types, URL parameters, request and response payloads, and other details that an API expects and supports. This had numerous advantages for us:
- We were able to see our API “in action” before we started implementation. This highlighted a lot of the design dilemmas we needed to solve. It would have been very painful to discover these dilemmas during the actual implementation.
- There are tools that can import an OpenAPI specification and generate a high-quality static documentation site. The most prominent ones are the Swagger editor and ReDoc, the latter being what our public API documentation is based on.
- The public documentation is so detailed that users will be able to build integrations without any support from our tech team.
- Within an OpenAPI document, request and response payloads are defined using JSONSchema. This allows us to import the actual OpenAPI specification document into our implementation and use it in order to validate request payloads and in our test-suite to validate response payloads. This not only removes a lot of effort from the implementation but also ensures that our documentation is consistent with our implementation and vice-versa.
- Work on an iteration of the API always starts in the specification. This allows a product manager to review the new functionality before any coding has taken place. Having the specification be part of the implementation and the test-suite means that, when the code is delivered, we are certain that it delivers exactly what our product manager wants.
- There are tools for helping with the creation of SDKs for APIs that have an OpenAPI specification. This is helped further by the fact that our API is {json:api}-compliant, for which there are also tools.
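The idea of reusing the specification's schemas to validate payloads, mentioned in the list above, can be illustrated with a deliberately tiny Python sketch. It is not how the Transifex implementation works: it handles only the type, required, and properties keywords, a small subset of JSON Schema, where a real service would more likely rely on a full validator library. The schema fragment and the payloads are invented for the example.

```python
# Invented schema fragment, shaped like one an OpenAPI document might embed.
PATCH_SCHEMA = {
    "type": "object",
    "required": ["data"],
    "properties": {
        "data": {
            "type": "object",
            "required": ["type", "id"],
            "properties": {
                "type": {"type": "string"},
                "id": {"type": "string"},
                "attributes": {
                    "type": "object",
                    "properties": {"reviewed": {"type": "boolean"}},
                },
            },
        }
    },
}

_PYTHON_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "boolean": bool,
    "integer": int,
    "number": (int, float),
}


def validate(value, schema, path=""):
    """Return a list of error strings; an empty list means the value passed."""
    expected = schema.get("type")
    if expected:
        ok = isinstance(value, _PYTHON_TYPES[expected])
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected in ("integer", "number") and isinstance(value, bool):
            ok = False
        if not ok:
            return [f"{path or '/'}: expected {expected}"]
    errors = []
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}/{key}: required property is missing")
        for key, subschema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], subschema, f"{path}/{key}"))
    return errors


good = {"data": {"type": "translations", "id": "t1", "attributes": {"reviewed": True}}}
bad = {"data": {"type": "translations", "attributes": {"reviewed": "yes"}}}

print(validate(good, PATCH_SCHEMA))
for error in validate(bad, PATCH_SCHEMA):
    print(error)
```

Running the same schema against request payloads in the service and against response payloads in the test suite is what keeps the documentation and the implementation from drifting apart.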
Conclusion
The API v3 journey was long but very rewarding! The benefits that our current users can enjoy are numerous. Furthermore, because of the way it is designed, we are certain that it will appeal to engineering teams that have yet to solve the mystery of automating localization in their apps. We are here to make the lives of everybody (devs, localization managers) much, much easier!
Want to see how awesome API v3 is? Check it out!