
What is Protobuf?


Protobuf, or “Protocol Buffers,” is a cross-platform data format used to serialize structured data. It is an alternative to JSON that produces smaller payloads and is faster to encode and decode. It is useful for storing data and for programs that communicate with each other over a network.

What is Protobuf?

Protocol buffers are an extensible, language-independent, and platform-independent mechanism for serializing structured data. They are predominantly used in communication protocols and for data storage. Once you define how your data should be structured in a .proto file, you can use specially generated source code to read and write that data to and from a variety of data streams, in a variety of languages.

Example

Let’s see what a .proto file looks like:

// polyline.proto
syntax = "proto2";

message Point {
  required int32 x = 1;
  required int32 y = 2;
  optional string label = 3;
}

message Line {
  required Point start = 1;
  required Point end = 2;
  optional string label = 3;
}

message Polyline {
  repeated Point point = 1;
  optional string label = 2;
}
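To make the format more concrete, here is a minimal Python sketch of how a Point message from the file above could be laid out on the wire. It is hand-written for illustration only and is not the code that protoc generates; the helper names (encode_varint, encode_key, encode_point) are hypothetical. It assumes the standard Protobuf wire format, where each field is prefixed by a key made of the field number and a wire type.

```python
from typing import Optional


def encode_varint(value: int) -> bytes:
    """Encode an integer as a base-128 varint (least significant group first)."""
    value &= (1 << 64) - 1  # negative int32 values are sign-extended to 64 bits
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_key(field_number: int, wire_type: int) -> bytes:
    """A field key is (field_number << 3) | wire_type, itself stored as a varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_point(x: int, y: int, label: Optional[str] = None) -> bytes:
    """Hand-encode a Point message (hypothetical stand-in for generated code)."""
    out = encode_key(1, 0) + encode_varint(x)   # x = 1, wire type 0 (varint)
    out += encode_key(2, 0) + encode_varint(y)  # y = 2, wire type 0 (varint)
    if label is not None:                       # optional fields may be absent
        data = label.encode("utf-8")
        # label = 3, wire type 2 (length-delimited): key, byte length, bytes
        out += encode_key(3, 2) + encode_varint(len(data)) + data
    return out


print(encode_point(1, 2).hex())  # 08011002
```

Note that the field names never appear in the output: only the field numbers from the .proto file are written. In a real project you would rely on the classes generated by protoc rather than encoding by hand.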

Why use Protobuf?

Protobuf has become a popular alternative to JSON for serialization. Google created it as an internal tool, intended to be lighter and faster than XML. Development began in 2001, and in 2008 Google released it publicly under an open-source license, with support for several languages.

Advantages of Protobuf

  • Compact data storage
  • Fast parsing
  • Available in many programming languages
  • Optimized functionality through automatically generated classes
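As a rough illustration of the first point, the sketch below compares the size of one small record in compact JSON against the same record in the Protobuf wire format. The wire bytes are written out by hand (an assumption for illustration, corresponding to a Point with x = 150, y = 2, label = "a"), so no Protobuf library is needed to run it. Actual savings depend heavily on the data.

```python
import json

# The same record in both formats: x = 150, y = 2, label = "a".
record = {"x": 150, "y": 2, "label": "a"}

# Compact JSON: field names are repeated as text in every message.
json_bytes = json.dumps(record, separators=(",", ":")).encode("utf-8")

# Hand-written Protobuf wire bytes for the same Point (illustrative assumption):
#   08 96 01  -> field 1 (x), varint 150
#   10 02     -> field 2 (y), varint 2
#   1a 01 61  -> field 3 (label), length 1, "a"
wire_bytes = bytes.fromhex("08960110021a0161")

print(len(json_bytes), len(wire_bytes))  # 27 8
```

The difference comes mostly from replacing field names with small numeric keys and from storing integers as variable-length binary values instead of text.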

Who uses Protocol Buffers?

Many projects use protocol buffers, including the following:

When are Protocol Buffers not a good choice?

Protocol buffers are not suitable for all types of data. In particular:

  • Protocol buffers tend to assume that an entire message can be loaded into memory at once and is not larger than an object graph. For data exceeding a few megabytes, consider a different solution.

  • The same data can have many different binary serializations, so you cannot compare two messages for equality without fully parsing them.

  • Messages are not compressed. Although messages can be zipped or gzipped like any other file, special-purpose compression algorithms, such as those used by JPEG and PNG, will produce much smaller files for data of the appropriate type.

  • Protocol buffers are not well supported in non-object-oriented languages such as Fortran and IDL.

  • Protocol buffer messages are not self-describing. You need the corresponding .proto file to fully interpret a message.

  • Protocol buffers are not a formal standard of any organization. This makes them unsuitable for use in environments with specific legal requirements.
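Two of the points above, that the same data can serialize to different bytes and that messages do not describe themselves, can be seen with a small schema-less parser. This is a minimal sketch that handles only varint and length-delimited fields; the function names (read_varint, parse_fields) are hypothetical and the input bytes are written by hand for illustration.

```python
from typing import Dict, Tuple, Union


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Read a base-128 varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # continuation bit clear: last byte of the varint
            return result, pos
        shift += 7


def parse_fields(buf: bytes) -> Dict[int, Union[int, bytes]]:
    """Parse a message without its .proto file: field number -> raw value."""
    fields: Dict[int, Union[int, bytes]] = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        value: Union[int, bytes]
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited (string, bytes, nested message)
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        fields[field_number] = value
    return fields


# The same two fields written in a different order (parsers accept any order).
first = bytes.fromhex("08011002")   # field 1 = 1, then field 2 = 2
second = bytes.fromhex("10020801")  # field 2 = 2, then field 1 = 1

print(first == second)                              # False
print(parse_fields(first) == parse_fields(second))  # True
print(parse_fields(first))                          # {1: 1, 2: 2}
```

The parsed result contains only field numbers and raw values. Without the .proto file there is no way to tell that field 1 is called x, or whether a length-delimited value is a string, raw bytes, or a nested message.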

Conclusion

It is recommended to use Protobuf when:

  • Fast serialization/deserialization is required
  • Type safety is important
  • Adherence to the schema is required
  • Interoperability between languages is required
  • You want to use modern tooling

Consider JSON or other formats when:

  • You want data to be human-readable
  • Data is consumed by a browser
  • Adherence to the schema is not a concern

I hope this article has answered some of your questions about Protocol Buffers. Do you already use them in your projects? Do you think they will surpass JSON? Thank you for reading! 👋