Purescript library and code generator for Google Protocol Buffers version 3.
This library operates on ArrayBuffer, so it will run both in Node.js and in browser environments.
The shell.nix environment provides
- The Purescript toolchain
- The protoc compiler
- The protoc-gen-purescript executable plugin for protoc on the PATH so that protoc can find it.
$ nix-shell
Purescript Protobuf development environment.
To build purescript-protobuf, run:
npm install
spago build
To test purescript-protobuf, run:
protoc --purescript_out=./test/generated test/*.proto
spago -x test.dhall build
spago -x test.dhall test
To generate Purescript .purs files from .proto files, run:
protoc --purescript_out=path_to_output file.proto
[nix-shell]$
None of the modules in this package should be imported directly in our program.
Rather, we'll import the message modules from the generated .purs files, as well as modules for reading and writing ArrayBuffers.
For example, a message in a .proto file declared as
message MyMessage {
sint32 my_field = 1;
}
will export these four names in the generated .purs modules.
- A message record type
    type MyMessageR = { my_field :: Maybe Int }
- A message data type
    newtype MyMessage = MyMessage MyMessageR
- A message encoder which works with purescript-arraybuffer-builder
    putMyMessage :: forall m. MonadEffect m => MyMessage -> PutM m Unit
- A message decoder which works with purescript-parsing-dataview
    parseMyMessage :: forall m. MonadEffect m => Int -> ParserT DataView m MyMessage
Then, in our program, our imports will look something like this.
import Generated.Module (MyMessage(..), putMyMessage, parseMyMessage)
import Text.Parsing.Parser (runParserT)
import Data.ArrayBuffer.Builder (execPutM)
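With those imports, a round trip through the encoder and decoder might look like the following sketch. It is not taken from the repository. It assumes that the `Int` argument of `parseMyMessage` is the byte length of the encoded message, and that `Generated.Module` is whatever module name the code generator actually produced for the .proto file.

```purescript
module Main where

import Prelude

import Data.ArrayBuffer.ArrayBuffer as AB
import Data.ArrayBuffer.Builder (execPutM)
import Data.ArrayBuffer.DataView as DV
import Data.Either (Either(..))
import Data.Maybe (Maybe(..))
import Effect (Effect)
import Effect.Console (log)
-- Placeholder module name, as in the imports above.
import Generated.Module (MyMessage(..), putMyMessage, parseMyMessage)
import Text.Parsing.Parser (parseErrorMessage, runParserT)

main :: Effect Unit
main = do
  -- Serialize the message into a new ArrayBuffer.
  buf <- execPutM $ putMyMessage $ MyMessage { my_field: Just 7 }
  -- Parse it back from a DataView over the whole buffer.
  -- Assumption: the Int argument is the byte length of the message.
  result <- runParserT (DV.whole buf) $ parseMyMessage (AB.byteLength buf)
  case result of
    Left err -> log $ "decode failed: " <> parseErrorMessage err
    Right (MyMessage r) -> log $ "my_field = " <> show r.my_field
```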
The generated code modules will import modules from this package.
The generated code depends on packages
, "protobuf"
, "arraybuffer"
, "arraybuffer-types"
, "arraybuffer-builder"
, "parsing"
, "parsing-dataview"
, "uint"
, "long"
, "text-encoding"
which are in package-sets, except for purescript-longs (see spago.dhall in this package for the particulars).
It also depends on the Javascript package long.
We cannot easily derive common instances like Eq for the generated message types because
- The types might be recursive.
- The types might contain fields of type ArrayBuffer, which doesn't have those instances.
All of the generated message types have an instance of Generic.
This allows us to sometimes use genericEq and genericShow on a generated message, if the generated message has those instances for all of its fields.
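As a rough illustration, the generic functions can be applied directly to the `MyMessage` type from the earlier example, because its only field is a `Maybe Int`, which has both `Eq` and `Show` instances. The sketch assumes the module names from purescript-generics-rep; in newer compiler versions the same functions live in `Data.Eq.Generic` and `Data.Show.Generic`. `Generated.Module` is a placeholder module name.

```purescript
module Example.Generic where

import Data.Generic.Rep.Eq (genericEq)
import Data.Generic.Rep.Show (genericShow)
import Data.Maybe (Maybe(..))
-- Placeholder module name for the generated code.
import Generated.Module (MyMessage(..))

-- Structural equality through the Generic instance.
sameMessage :: Boolean
sameMessage =
  genericEq (MyMessage { my_field: Just 1 }) (MyMessage { my_field: Just 1 })

-- A printable rendering through the Generic instance.
shownMessage :: String
shownMessage = genericShow (MyMessage { my_field: Just 1 })
```

A message with an ArrayBuffer field would not compile with these functions, since that field has no `Eq` or `Show` instance.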
All of the generated message types have an instance of Newtype.
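One way this instance can be put to use is with `unwrap` and `over` from `Data.Newtype`, to read or update the underlying record without pattern matching. The helper names `getMyField` and `setMyField` below are hypothetical, and `Generated.Module` is a placeholder module name.

```purescript
module Example.Newtype where

import Data.Maybe (Maybe(..))
import Data.Newtype (over, unwrap)
-- Placeholder module name for the generated code.
import Generated.Module (MyMessage(..))

-- Read a field of the underlying record.
getMyField :: MyMessage -> Maybe Int
getMyField msg = (unwrap msg).my_field

-- Update a field of the underlying record, keeping the newtype wrapper.
setMyField :: Int -> MyMessage -> MyMessage
setMyField n = over MyMessage (_ { my_field = Just n })
```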
The purescript-protobuf repository contains three executable Node.js programs which use code generated by purescript-protobuf. Refer to these for further examples of how to use the generated code.
- The protoc compiler plugin. The code generator imports generated code. Trippy, right? This program literally writes itself.
- The unit test suite
- The Google conformance test program
When the decode parser encounters an invalid encoding in the protobuf input stream, it will fail to parse.
When Text.Parsing.Parser.ParserT fails it will return a ParseError String (Position {line::Int,column::Int}).
The byte offset at which the parse failure occurred is given by the formula column - 1.
The path to the protobuf definition which failed to parse will be included in the ParseError String and delimited by '/', something like "Message1 / string_field_1 / Invalid UTF8 encoding.".
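A small helper along these lines could turn a decode failure into a readable message with the byte offset computed as `column - 1`. The function name `describeFailure` is hypothetical, not part of this package.

```purescript
module Example.Errors where

import Prelude

import Text.Parsing.Parser (ParseError(..))
import Text.Parsing.Parser.Pos (Position(..))

-- Render a decode failure as the '/'-delimited path and reason,
-- followed by the byte offset derived from the column.
describeFailure :: ParseError -> String
describeFailure (ParseError message (Position { line: _, column })) =
  message <> " (at byte offset " <> show (column - 1) <> ")"
```

The result of `runParserT` is an `Either ParseError a`, so this function can be applied to the `Left` case.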
We aim to support binary-encoded (not JSON-encoded) proto3. Many proto2-syntax descriptor files will also work, as long as they don't use proto2 features.
We don't support extensions.
The generated optional record fields will use Nothing instead of the default values.
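If a program wants the proto3 default instead, it can substitute it at the point of use. A minimal sketch for the `my_field` example, where the proto3 default for `sint32` is 0; the helper name `myFieldOrDefault` is hypothetical and `Generated.Module` is a placeholder module name.

```purescript
module Example.Defaults where

import Data.Maybe (fromMaybe)
-- Placeholder module name for the generated code.
import Generated.Module (MyMessage(..))

-- Fall back to the proto3 default (0 for sint32) when the field is Nothing.
myFieldOrDefault :: MyMessage -> Int
myFieldOrDefault (MyMessage r) = fromMaybe 0 r.my_field
```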
We do not preserve unknown fields.
We do not support services.
At the time of this writing, we pass 193 out of 194 of the Google conformance tests for binary-encoded proto3. The one test we fail is the Required.Proto3.ProtobufInput.UnknownVarint.ProtobufOutput test, which is the test for preserving unknown fields, which we do not support, see above.
See the conformance/README.md in this repository for details.
The code generator will use the package statement in the .proto file and the base file name as the Purescript module name for that file.
The Protobuf import statement allows Protobuf messages to have fields consisting of Protobuf messages imported from another file, and qualified by the package name in that file. In order to generate the correct Purescript module name qualifier on the types of imported message fields, the code generator must be able to look up the package name statement in the imported file.
For that reason, we can only use top-level (not nested) message and enum types from an import.
The generated Purescript code will usually have module imports which cause the purs compiler to emit warnings. Sorry.
The implementation is simple and straightforward. We haven't done any special optimizations. For example, when encoding a protobuf varint, we allocate a list of new one-byte ArrayBuffers and then copy them all into position in the final ArrayBuffer. For another example, when decoding a packed field of numbers, we build a list of the numbers, and then copy them all into the final Array. Also, this whole library is very stack-unsafe.
This may all be improved in later versions.
Pull requests welcome.