purescript-protobuf

PureScript library and code generator for Google Protocol Buffers Version 3.

This library operates on ArrayBuffer, so it will run both in Node and in browser environments.

Code Generation

The shell.nix environment provides:

- The PureScript toolchain
- The protoc compiler
- The protoc-gen-purescript executable plugin for protoc, placed on the PATH so that protoc can find it

$ nix-shell

Purescript Protobuf development environment.
To build purescript-protobuf, run:

    npm install
    spago build

To test purescript-protobuf, run:

    protoc --purescript_out=./test/generated test/*.proto
    spago -x test.dhall build
    spago -x test.dhall test

To generate Purescript .purs files from .proto files, run:

    protoc --purescript_out=path_to_output *.proto

[nix-shell]$

Writing programs with the generated code

None of the modules in this package should be imported directly in our program. Rather, we'll import the message modules from the generated .purs files, as well as modules for reading and writing ArrayBuffers.

For example, a message in a .proto file declared as

message MyMessage {
  sint32 my_field = 1;
}

will export these four names in the generated .purs modules.

A message record type
- type MyMessageR = { my_field :: Maybe Int }
A message data type
- newtype MyMessage = MyMessage MyMessageR
A message encoder which works with purescript-arraybuffer-builder
- putMyMessage :: forall m. MonadEffect m => MyMessage -> PutM m Unit
A message decoder which works with purescript-parsing-dataview
- parseMyMessage :: forall m. MonadEffect m => Int -> ParserT DataView m MyMessage

Then, in our program, our imports will look something like this.

import Generated.Module (MyMessage(..), putMyMessage, parseMyMessage)
import Text.Parsing.Parser (runParserT)
import Data.ArrayBuffer.Builder (execPutM)
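
With those imports, a round trip through an ArrayBuffer might look like the following. This is a minimal sketch, not a definitive program: Generated.Module is the placeholder module name used above, and the Int argument to parseMyMessage is assumed here to be the number of bytes to parse, so the sketch passes the full length of the buffer.

```purescript
module Main where

import Prelude

import Data.ArrayBuffer.ArrayBuffer as AB
import Data.ArrayBuffer.Builder (execPutM)
import Data.ArrayBuffer.DataView as DV
import Data.Either (Either(..))
import Data.Maybe (Maybe(..))
import Effect (Effect)
import Effect.Console (log)
import Generated.Module (MyMessage(..), putMyMessage, parseMyMessage)
import Text.Parsing.Parser (parseErrorMessage, runParserT)

main :: Effect Unit
main = do
  -- Encode the message into a fresh ArrayBuffer.
  ab <- execPutM $ putMyMessage (MyMessage { my_field: Just 1 })
  -- Decode it again from a DataView over the whole buffer.
  -- Assumption: the Int argument is the byte length of the message.
  result <- runParserT (DV.whole ab) (parseMyMessage (AB.byteLength ab))
  case result of
    Left err -> log ("parse failed: " <> parseErrorMessage err)
    Right (MyMessage r) -> log ("my_field = " <> show r.my_field)
```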

The generated code modules will import modules from this package.

The generated code depends on packages

  , "parsing"
  , "parsing-dataview"
  , "arraybuffer-types"
  , "arraybuffer"
  , "arraybuffer-builder"
  , "uint"
  , "longs"
  , "text-encoding"

which are in package-sets, except for purescript-longs (see spago.dhall in this package for the particulars).

It also depends on the JavaScript package long.

Generated message instances

We cannot easily derive common instances like Eq for the generated message types because:

- The types might be recursive.
- The types might contain fields of type ArrayBuffer, which doesn't have those instances.

All of the generated message types have an instance of Generic. This sometimes allows us to use genericEq and genericShow on the generated messages, provided that neither of the two conditions above applies to our particular message types.
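
For a message like MyMessage from the earlier example, which is not recursive and has no ArrayBuffer fields, the Generic instance can be used as sketched below. The helper names sameMessage and describe are made up for illustration. The import paths assume the purescript-generics-rep package; on PureScript 0.14 and later the same functions live in Data.Eq.Generic and Data.Show.Generic.

```purescript
import Data.Generic.Rep.Eq (genericEq)
import Data.Generic.Rep.Show (genericShow)
import Generated.Module (MyMessage)

-- Structural equality via the Generic instance (hypothetical helper).
sameMessage :: MyMessage -> MyMessage -> Boolean
sameMessage = genericEq

-- A printable representation via the Generic instance (hypothetical helper).
describe :: MyMessage -> String
describe = genericShow
```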

All of the generated message types have an instance of Newtype.

Interpreting invalid encoding parse errors

When the decode parser encounters an invalid encoding in the protobuf input stream, it will fail to parse.

When Text.Parsing.Parser.ParserT fails, it will return a ParseError String (Position {line::Int,column::Int}). The byte offset at which the invalid encoding occurred is given by the formula column - 1.
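
The formula can be wrapped in a small helper. This is a sketch under the assumption that the Position type and parseErrorPosition are the ones exported by purescript-parsing; the name byteOffset is hypothetical.

```purescript
import Prelude

import Text.Parsing.Parser (ParseError, parseErrorPosition)
import Text.Parsing.Parser.Pos (Position(..))

-- Byte offset of the invalid encoding, computed as column - 1.
byteOffset :: ParseError -> Int
byteOffset err = case parseErrorPosition err of
  Position { column } -> column - 1
```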

Features

We aim to support proto3. Many proto2-syntax descriptor files will also work, as long as they don't have proto2 features.

We don't support extensions.

The generated optional record fields will use Nothing instead of the default values.
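
Because an unset field arrives as Nothing, the caller decides which default to apply. A minimal sketch, assuming the MyMessage type from the earlier example and using 0, the proto3 default for sint32; the name myFieldOrDefault is hypothetical.

```purescript
import Data.Maybe (fromMaybe)
import Generated.Module (MyMessage(..))

-- Read my_field, falling back to 0 when the field was not set.
myFieldOrDefault :: MyMessage -> Int
myFieldOrDefault (MyMessage r) = fromMaybe 0 r.my_field
```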

We do not preserve unknown fields.

We do not support services.

Imports

The code generator will use the package statement in the .proto file and the base file name as the PureScript module name for that file.

The Protobuf import statement allows Protobuf messages to have fields consisting of Protobuf messages imported from another file, and qualified by the package name in that file. In order to generate the correct PureScript module name qualifier on the types of imported message fields, the code generator must be able to look up the package statement in the imported file.

For that reason, we can only use top-level (not nested) message and enum types from an import.

The generated PureScript code will usually contain module imports which cause the purs compiler to emit warnings. Sorry.

Performance

The implementation is simple and straightforward. We haven't done any special optimizations. For example, when encoding a protobuf varint, we allocate a list of new one-byte ArrayBuffers and then copy them all into position in the final ArrayBuffer. For another example, when decoding a packed field of numbers, we build a list of the numbers and then copy them all into the final Array. Also, this whole library is very stack-unsafe. This may all be improved in later versions.

Contributing

Pull requests welcome.