Describe Kafka message payload using Avro Schema


Introduction

The previous tutorial on creating an AsyncAPI document for applications consuming from Kafka taught you about writing an AsyncAPI document for Kafka messages using the default schema. This tutorial will teach you to write the same document using Avro Schema.

While AsyncAPI schema is the default choice for describing payloads, many teams prefer Avro Schemas to define messages in Kafka. This tutorial teaches you to modify your existing AsyncAPI document so that it describes the message payload with an Avro Schema, either defined inline in YAML or referenced from an external JSON file.

Background context

AsyncAPI is a specification for describing Event-Driven Architectures (EDAs) in a machine-readable format. AsyncAPI schema outlines the format and content specifications that enable a consistent representation of agreements for communication between services in an Event-Driven Architecture.

Avro is a data serialization system from the Apache Software Foundation that is widely used with Apache Kafka. An Avro Schema defines the structure and data types of a message, which gives producers and consumers a standardized way to organize and transmit data. It works like a shared rulebook that helps different parts of the system understand each other and exchange information reliably.

Define message payload with Avro Schema directly in AsyncAPI document

Defining message schema with the default schema is already covered in the previous Kafka tutorial. The default choice was the AsyncAPI schema, a JSON Schema superset. Here's an example of what the AsyncAPI schema looks like:

messages:
  userSignedUp:
    payload:
      type: object
      properties:
        userId:
          type: integer
          description: This property describes the ID of the user
        userEmail:
          type: string
          description: This property describes the email of the user

Now it's time to shift your focus to defining messages using Avro Schemas directly within your document.

messages:
  userSignedUp:
    payload:
      schemaFormat: 'application/vnd.apache.avro;version=1.9.0'
      schema:
        type: record
        name: UserSignedUp
        namespace: com.company
        doc: User sign-up information
        fields:
          - name: userId
            type: int
          - name: userEmail
            type: string

In the above code snippet:

  • The userSignedUp message is defined with Avro Schema, using the specified schemaFormat and the schema.
  • Use the schemaFormat to indicate that you're using Avro and specify the version of Avro Schema by using a MIME type.
  • The schema includes a record named UserSignedUp within the com.company namespace. It also describes two fields, userId and userEmail, defining their data types as int and string, respectively.
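To make the meaning of the two field types concrete, here is a minimal sketch in Python that checks a sample payload against the record above. It uses only the standard library and covers only the int and string types that appear in this tutorial. The names AVRO_SCHEMA, matches_primitive, and validation_errors are hypothetical helpers invented for this illustration; they are not part of AsyncAPI or of any Avro library, and a real project would normally rely on an Avro library instead.

```python
# Hypothetical illustration: not part of AsyncAPI or any Avro library.
# The schema is the same record as in the snippet above, written as a Python dict.
AVRO_SCHEMA = {
    "type": "record",
    "name": "UserSignedUp",
    "namespace": "com.company",
    "doc": "User sign-up information",
    "fields": [
        {"name": "userId", "type": "int"},
        {"name": "userEmail", "type": "string"},
    ],
}

# Avro "int" is a 32-bit signed integer.
INT_MIN, INT_MAX = -(2**31), 2**31 - 1


def matches_primitive(value, avro_type):
    """Check a value against the two primitive types used in this tutorial."""
    if avro_type == "int":
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_plain_int = isinstance(value, int) and not isinstance(value, bool)
        return is_plain_int and INT_MIN <= value <= INT_MAX
    if avro_type == "string":
        return isinstance(value, str)
    raise ValueError(f"type not covered by this sketch: {avro_type!r}")


def validation_errors(payload, schema):
    """Return a list of problems; an empty list means the payload fits the record."""
    errors = []
    for field in schema["fields"]:
        name, avro_type = field["name"], field["type"]
        if name not in payload:
            errors.append(f"missing field: {name}")
        elif not matches_primitive(payload[name], avro_type):
            errors.append(f"field {name} is not a valid {avro_type}")
    return errors


if __name__ == "__main__":
    print(validation_errors({"userId": 42, "userEmail": "jane@example.com"}, AVRO_SCHEMA))
    print(validation_errors({"userId": "42"}, AVRO_SCHEMA))
```

The sketch ignores unions, defaults, nested records, and logical types, so treat it as a reading aid for the schema rather than as a validator.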

Now let's combine the above Avro Schema with the AsyncAPI document that you created in the previous tutorial. Check out below what an AsyncAPI document fully equipped with Avro Schema looks like!

asyncapi: 3.0.0
info:
  title: User Signup API
  version: 1.0.0
  description: The API notifies you whenever a new user signs up in the application.
servers:
  kafkaServer:
    host: test.mykafkacluster.org:8092
    description: Kafka Server
    protocol: kafka
operations:
  onUserSignedUp:
    action: receive
    channel:
      $ref: '#/channels/userSignedUp'
channels:
  userSignedUp:
    description: This channel contains a message per each user who signs up in our application.
    address: user_signedup
    messages:
      userSignedUp:
        $ref: '#/components/messages/userSignedUp'
components:
  messages:
    userSignedUp:
      payload:
        schemaFormat: 'application/vnd.apache.avro;version=1.9.0'
        schema:
          type: record
          name: UserSignedUp
          namespace: com.company
          doc: User sign-up information
          fields:
            - name: userId
              type: int
            - name: userEmail
              type: string
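The document above connects the operation to its Avro payload through two local $ref pointers: the operation points to the channel, and the channel's message points to the entry under components. The following Python sketch traces that chain, assuming the document has already been loaded into a plain dictionary (a trimmed copy is embedded here so the example runs without a YAML parser). The function names resolve_pointer and payloads_for_operation are hypothetical and are not part of any AsyncAPI tooling.

```python
# Hypothetical illustration of how the local $ref pointers in the document chain together.
# DOCUMENT is a trimmed copy of the AsyncAPI document above, written as a Python dict.
DOCUMENT = {
    "asyncapi": "3.0.0",
    "operations": {
        "onUserSignedUp": {
            "action": "receive",
            "channel": {"$ref": "#/channels/userSignedUp"},
        }
    },
    "channels": {
        "userSignedUp": {
            "address": "user_signedup",
            "messages": {
                "userSignedUp": {"$ref": "#/components/messages/userSignedUp"}
            },
        }
    },
    "components": {
        "messages": {
            "userSignedUp": {
                "payload": {
                    "schemaFormat": "application/vnd.apache.avro;version=1.9.0",
                    "schema": {
                        "type": "record",
                        "name": "UserSignedUp",
                        "namespace": "com.company",
                        "doc": "User sign-up information",
                        "fields": [
                            {"name": "userId", "type": "int"},
                            {"name": "userEmail", "type": "string"},
                        ],
                    },
                }
            }
        }
    },
}


def resolve_pointer(document, ref):
    """Follow a local reference such as '#/channels/userSignedUp'."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are handled here: {ref!r}")
    node = document
    for key in ref[2:].split("/"):
        # JSON Pointer escapes: '~1' stands for '/', '~0' stands for '~'.
        key = key.replace("~1", "/").replace("~0", "~")
        node = node[key]
    return node


def payloads_for_operation(document, operation_name):
    """Return the payload object of every message on the operation's channel."""
    operation = document["operations"][operation_name]
    channel = resolve_pointer(document, operation["channel"]["$ref"])
    payloads = {}
    for message_name, message in channel["messages"].items():
        if "$ref" in message:
            message = resolve_pointer(document, message["$ref"])
        payloads[message_name] = message["payload"]
    return payloads


if __name__ == "__main__":
    for name, payload in payloads_for_operation(DOCUMENT, "onUserSignedUp").items():
        print(name, payload["schemaFormat"], payload["schema"]["name"])
```

Following the pointers by hand like this is only meant to show how the pieces relate; an AsyncAPI parser would normally do this resolution for you.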

Reusing existing Avro Schemas

Occasionally, you might already have a set of Avro Schemas. In such cases, instead of defining these schemas again in your AsyncAPI document, you can reference the existing files.

Assume you have a file named userSchema.json that contains an Avro Schema like the following:

{
  "type": "record",
  "name": "UserSignedUp",
  "namespace": "com.company",
  "doc": "User sign-up information",
  "fields": [
    { "name": "userId", "type": "int" },
    { "name": "userEmail", "type": "string" }
  ]
}

To incorporate this existing Avro Schema into your AsyncAPI document, use the $ref property to reference the path to the JSON file. Your AsyncAPI document then takes the Avro Schema from the external file instead of duplicating it, which keeps the schema consistent across your Kafka ecosystem.

asyncapi: 3.0.0
info:
  title: User Signup API
  version: 1.0.0
  description: The API notifies you whenever a new user signs up in the application.
servers:
  kafkaServer:
    host: test.mykafkacluster.org:8092
    description: Kafka Server
    protocol: kafka
operations:
  onUserSignedUp:
    action: receive
    channel:
      $ref: '#/channels/userSignedUp'
channels:
  userSignedUp:
    description: This channel contains a message per each user who signs up in our application.
    address: user_signedup
    messages:
      userSignedUp:
        $ref: '#/components/messages/userSignedUp'
components:
  messages:
    userSignedUp:
      payload:
        schemaFormat: 'application/vnd.apache.avro;version=1.9.0'
        schema:
          $ref: './userSchema.json'
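Conceptually, a tool that reads this document replaces the $ref with the content of userSchema.json. The Python sketch below shows that idea with the standard library only. It assumes the referenced file is plain JSON, sits at a path relative to the AsyncAPI document, and is referenced without a fragment such as '#/definitions/...'. The function name inline_file_refs is hypothetical and does not come from any AsyncAPI tooling, which would normally perform this step for you.

```python
import json
import tempfile
from pathlib import Path


def inline_file_refs(node, base_dir):
    """Return a copy of node where each relative-file $ref is replaced by that file's JSON content."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        # Only handle relative file paths; leave local ('#/...') and URL references untouched.
        if isinstance(ref, str) and not ref.startswith("#") and "://" not in ref:
            return json.loads((Path(base_dir) / ref).read_text(encoding="utf-8"))
        return {key: inline_file_refs(value, base_dir) for key, value in node.items()}
    if isinstance(node, list):
        return [inline_file_refs(item, base_dir) for item in node]
    return node


if __name__ == "__main__":
    # Write a throwaway copy of userSchema.json so the example is self-contained.
    with tempfile.TemporaryDirectory() as tmp:
        schema = {
            "type": "record",
            "name": "UserSignedUp",
            "namespace": "com.company",
            "doc": "User sign-up information",
            "fields": [
                {"name": "userId", "type": "int"},
                {"name": "userEmail", "type": "string"},
            ],
        }
        (Path(tmp) / "userSchema.json").write_text(json.dumps(schema), encoding="utf-8")

        payload = {
            "schemaFormat": "application/vnd.apache.avro;version=1.9.0",
            "schema": {"$ref": "./userSchema.json"},
        }
        print(json.dumps(inline_file_refs(payload, tmp), indent=2))
```

After inlining, the payload has the same shape as the version with the Avro Schema written directly in the document, which is why both forms describe the same message.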

Summary

In this tutorial, you moved from the default AsyncAPI schema to Avro Schema for describing a Kafka message payload, first by defining the schema directly in the document and then by referencing an existing schema file. Describing payloads with Avro Schema in AsyncAPI lets you reuse the schemas your Kafka applications already share. You can now experiment further by describing your own messages and exploring more advanced capabilities.

Next Steps

Now that you know how to write an AsyncAPI document using Avro Schemas, let's learn how to use Schema Registry with AsyncAPI.
