Purescript library and code generator for Google Protocol Buffers version 3.
This library operates on ArrayBuffer, so it will run both in Node.js and in browser environments.
We aim to support binary-encoded (not JSON-encoded) proto3. Many proto2-syntax descriptor files will also work, as long as they don't use proto2 features, like groups.
The generated optional record fields will use Nothing instead of the default values.
We do not support extensions.
We do not support services.
At the time of this writing, we pass all 194 of the Google conformance tests for binary-wire-format proto3.
See the conformance/README.md in this repository for details.
We also have our own unit tests; see test/README.md in this repository.
The shell.nix environment provides
- The Purescript toolchain: purs, spago, and nodejs.
- The protoc compiler.
- The protoc-gen-purescript executable plugin for protoc on the PATH so that protoc can find it.
$ nix-shell
Purescript Protobuf development environment.
To build purescript-protobuf, run:
npm install
spago build
To generate Purescript .purs files from .proto files, run:
protoc --purescript_out=path_to_output file.proto
[nix-shell]$
If you don't want to use Nix, then install the Purescript toolchain and protoc, and add the executable script bin/protoc-gen-purescript to your PATH.
The code generator will use the package statement in the .proto file and the base .proto file name as the Purescript module name for that file.
A message in a shapes.proto file declared as
package interproc;
message Rectangle {
double width = 1;
double height = 2;
}
will export these four names from module Interproc.Shapes in a generated shapes.Interproc.purs file.
- A message data type
  newtype Rectangle = Rectangle { width :: Maybe Number, height :: Maybe Number }
  The message data type will also include an __unknown_fields field which is required for conformance.
- A message maker which constructs a message from a Record with some message fields
  mkRectangle :: forall r. Record r -> Rectangle
  All message fields are optional, and can be elided when making a message. There are some extra type constraints, not shown here, which will cause a compiler error if we try to add a field which is not in the message data type.
  If we want the compiler to check that we've explicitly supplied all the fields, then we can use the ordinary message data type constructor.
- A message encoder which works with purescript-arraybuffer-builder
  putRectangle :: forall m. MonadEffect m => Rectangle -> PutM m Unit
- A message decoder which works with purescript-parsing-dataview
  parseRectangle :: forall m. MonadEffect m => Int -> ParserT DataView m Rectangle
  The message decoder needs an argument which tells it the length of the message which it’s about to decode, because “the Protocol Buffer wire format is not self-delimiting.”
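To make the difference between the message maker and the plain data constructor concrete, here is a minimal sketch. The module name Example.Make and the value names are hypothetical. It also assumes that __unknown_fields is an Array, so that an empty array literal is a valid value for it; check the generated module for the actual type.

```purescript
module Example.Make where

import Data.Maybe (Maybe(..))
import Interproc.Shapes (Rectangle(..), mkRectangle)

-- Message maker: any field we leave out is filled in for us.
widthOnly :: Rectangle
widthOnly = mkRectangle { width: Just 3.0 }

-- Data constructor: the compiler insists that every field is listed,
-- including __unknown_fields (assumed here to be an Array).
explicit :: Rectangle
explicit = Rectangle
  { width: Just 3.0
  , height: Nothing
  , __unknown_fields: []
  }
```

The maker is convenient when only a few fields matter. The constructor is the safer choice when a newly added field in the .proto file should cause a compile error at every construction site.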
In our program, our imports will look something like this.
The only module from this package which we will import into our program will be the Protobuf.Library module.
We'll import the message modules from the generated .purs files.
We'll also import modules for reading and writing ArrayBuffers.
import Prelude
import Data.Maybe (Maybe(..))
import Data.Tuple (Tuple(..))
import Protobuf.Library (Bytes(..), parseMaybe)
import Interproc.Shapes (Rectangle, mkRectangle, putRectangle, parseRectangle)
import Text.Parsing.Parser (runParserT)
import Data.ArrayBuffer.Builder (execPutM)
import Data.ArrayBuffer.DataView (whole)
import Data.ArrayBuffer.ArrayBuffer (byteLength)
import Data.Newtype (unwrap)
Serialize a Rectangle to an ArrayBuffer.
do
  arraybuffer <- execPutM $ putRectangle $ mkRectangle
    { width: Just 3.0
    , height: Just 4.0
    }
Now we'll deserialize Rectangle from the ArrayBuffer.
  result <- runParserT (whole arraybuffer) $ do
    rectangle <- parseRectangle (byteLength arraybuffer)
Now at this point, we've consumed all of the parser input, but we're not finished parsing.
In proto3, all fields are optional.
We want to “validate” the Rectangle message to make sure it has all of the fields that we require. Fortunately, we are already in the ParserT monad, so we can do better than “validation.”
Parse, don't validate.
We will construct and return a tuple with the width and height of the Rectangle. For this step, pattern matching on the Rectangle message type works well, or we might want to use some of the convenience parsing functions supplied by Protobuf.Library, like parseMaybe.
    width <- parseMaybe "Missing required width" (unwrap rectangle).width
    height <- parseMaybe "Missing required height" (unwrap rectangle).height
    pure $ Tuple width height
The result will now be :: Either ParseError (Tuple Number Number).
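The pattern-matching alternative mentioned above could be sketched roughly as follows. The module name Example.Dimensions and the function parseDimensions are hypothetical. The sketch assumes the Rectangle data constructor is exported by the generated module, and it uses fail from Text.Parsing.Parser to reject a message with a missing field.

```purescript
module Example.Dimensions where

import Prelude
import Data.ArrayBuffer.Types (DataView)
import Data.Maybe (Maybe(..))
import Data.Tuple (Tuple(..))
import Effect.Class (class MonadEffect)
import Interproc.Shapes (Rectangle(..), parseRectangle)
import Text.Parsing.Parser (ParserT, fail)

-- Decode a Rectangle of the given byte length, then require both fields.
parseDimensions
  :: forall m
   . MonadEffect m
  => Int
  -> ParserT DataView m (Tuple Number Number)
parseDimensions len = do
  rectangle <- parseRectangle len
  case rectangle of
    -- The record pattern only names the fields we need.
    Rectangle { width: Just width, height: Just height } ->
      pure $ Tuple width height
    _ ->
      fail "Missing required width or height"
```

Compared with parseMaybe, a single pattern match reports one combined error message instead of one per field, which may or may not be what we want.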
The generated code modules will import modules from this package.
The generated code depends on packages
, "protobuf"
, "arraybuffer"
, "arraybuffer-types"
, "arraybuffer-builder"
, "parsing"
, "parsing-dataview"
, "uint"
, "longs"
, "text-encoding"
which are all in package-sets.
It also depends on the Javascript package long.
All of the generated message types have instances of Eq, Show, Generic, Newtype.
The purescript-protobuf repository contains three executable Node.js programs which use code generated by purescript-protobuf. Refer to these for further examples of how to use the generated code.
- The protoc compiler plugin. The code generator imports generated code. Trippy, right? This program literally writes itself.
- The unit test suite
- The Google conformance test program
When the decode parser encounters an invalid encoding in the protobuf input stream, it will fail to parse.
When Text.Parsing.Parser.ParserT fails it will return a ParseError String (Position {line::Int,column::Int}).
The byte offset at which the parse failure occurred is given by the formula column - 1.
The path to the protobuf definition which failed to parse will be included in the ParseError String and delimited by '/', something like "Message1 / string_field_1 / Invalid UTF8 encoding.".
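As an illustration of reading these two pieces of information out of a failure, here is a small sketch. The module name Example.Errors and the helper names byteOffset and errorPath are hypothetical. It assumes the ParseError and Position constructors have the shape shown above, and that path segments are separated by " / " exactly as in the example string.

```purescript
module Example.Errors where

import Prelude
import Data.String (Pattern(..), split)
import Text.Parsing.Parser (ParseError(..))
import Text.Parsing.Parser.Pos (Position(..))

-- Byte offset of the failure, using the formula column - 1.
byteOffset :: ParseError -> Int
byteOffset (ParseError _ (Position { column })) = column - 1

-- Split the error string into its path segments.
-- The " / " separator is inferred from the example message.
errorPath :: ParseError -> Array String
errorPath (ParseError message _) = split (Pattern " / ") message
```

Under these assumptions, the last element of errorPath would be the description of the failure and the preceding elements would name the enclosing message and field.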
The Protobuf import statement allows Protobuf messages to have fields
consisting of Protobuf messages imported from another file, and qualified
by the package name in that file. In order to generate
the correct Purescript module name qualifier on the types of imported message
fields, the code generator must be able to look up the package name
statement in the imported file.
For that reason, we can only use top-level (not nested) message and enum types from a Protobuf import.
The generated Purescript code will usually have module imports which cause
the purs compiler to emit warnings. We beg your pardon.
If we want to run the .proto → .purs generation step as part of a pure Nix
derivation, then import the top-level default.nix from this repository as a nativeBuildInput.
Then protoc --purescript_out=path_to_output file.proto will be runnable in our derivation phases.
See the nix/demo.nix file for an example.
Pull requests welcome.