Protobuf Persistence for Fun and Profit

Impossible Object

Note: This post is about the recently open sourced proto store project as part of a SiteMorph Open Source initiative.

Since my days writing code at Google I have always been a huge fan of using protobuf to generate all of my simple POJO classes. There are a few good reasons for this:

  • Code that is generated by protoc doesn't have to be tested.

  • You avoid writing repetitive boilerplate code like get* and set*. Yes, I realise that your IDE can generate a lot of these for you...

  • The coding style of protobuf makes them easy to use.

  • Protobuf supports data interchange between programming environments.

  • Protobuf messages in Java are inherently thread safe as they are immutable.

Let's just say that writing protobuf code rocks. An example could be:

message Picture {
  required string urn = 1;
  required string url = 2;
  required string profileUrn = 3;
  required Moderated moderated = 4 [default = OK];
  optional int32 width = 5;
  optional int32 height = 6;
}

This proto message describes a picture. Note that I am using meta references here (profileUrn refers to a profile rather than embedding it), a lot like you would use a foreign key in a relational database. Many of the REST APIs I am working on also use meta references rather than fully materialising related objects. Proto does have some disadvantages, though, which stem from 'what it is'. I typically use protobuf for the internal representation of data which is persisted and a different set of objects for external representations; for web services, for example, I tend to use Jackson.

In previous versions of my projects (SiteMorph, Shomei, Connect, Click Date Love) I used to write the data access code by hand or try to use libraries for object persistence. A few things struck me about these libraries:

  • They were very large, necessarily because they were general purpose.

  • Some were not very performance oriented; XML is usually going to be slow.

  • They are often tricky to configure to work with legacy databases.

  • Re-factoring isn't as clear a process as it could be.

Given all of that, one Saturday a few months ago I decided to try to create a very lightweight library that solved 80% of my database coding problems. The answer was to write a CRUD driver that stores protobuf messages in tables. The assumption is that you then use your database's SQL to re-factor your data across version changes. This might sound like an unnecessary consideration, right? Your database only changes every few months, right? Wrong: for some projects I am making more than 10 structural database changes per month. Simply mapping the proto to the table was a clear choice as it lets you read messages through a really simple interface. The library offers:

  • Very small library footprint. Version 2.6.1 packed as a jar is only 26KB.

  • Create a message by passing a semi constructed builder and the storage system sets the identifier.

  • Read with support for primary key (unique identifier) and secondary indexes. Also read all.

  • Update a message given a builder created from MyMessage.toBuilder().

  • Delete.

  • Basic ordering of records returned.

  • Support for auto ID primary key column.

  • Support for urn keyed primary key column.

  • Support for reading basic message types as well as enumeration values but not nested messages.

  • A very simple iterator interface for accessing data.

  • Performance comparable to writing prepared statements manually.

  • No more writing SQL for basic operations.

  • Eliminate the need for testing SQL on version migrations.
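
To make the list above a little more concrete, here is a minimal sketch of what such a CRUD surface could look like. It is not the proto store API: PictureRow, PictureCrud and InMemoryPictureCrud are names invented for illustration, PictureRow is a hand written stand-in for a generated message, and a TreeMap stands in for the database table.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Stand-in for a generated protobuf message: immutable, copied on change.
final class PictureRow {
    final long id;            // 0 means "not yet stored"
    final String url;
    final String profileUrn;  // meta reference to a profile

    PictureRow(long id, String url, String profileUrn) {
        this.id = id;
        this.url = url;
        this.profileUrn = profileUrn;
    }

    PictureRow withId(long newId) { return new PictureRow(newId, url, profileUrn); }
    PictureRow withUrl(String newUrl) { return new PictureRow(id, newUrl, profileUrn); }
}

// Hypothetical CRUD surface; the method names are assumptions.
interface PictureCrud {
    PictureRow create(PictureRow unsaved);                 // store assigns the id
    PictureRow read(long id);                              // null when absent
    Iterator<PictureRow> readByProfile(String profileUrn); // secondary index style read
    Iterator<PictureRow> readAll();
    boolean update(PictureRow changed);
    boolean delete(long id);
}

// In-memory implementation; the TreeMap keeps rows ordered by id.
final class InMemoryPictureCrud implements PictureCrud {
    private final Map<Long, PictureRow> rows = new TreeMap<>();
    private long nextId = 1;

    @Override
    public PictureRow create(PictureRow unsaved) {
        PictureRow stored = unsaved.withId(nextId++);
        rows.put(stored.id, stored);
        return stored;
    }

    @Override
    public PictureRow read(long id) { return rows.get(id); }

    @Override
    public Iterator<PictureRow> readByProfile(String profileUrn) {
        List<PictureRow> matches = new ArrayList<>();
        for (PictureRow row : rows.values()) {
            if (row.profileUrn.equals(profileUrn)) matches.add(row);
        }
        return matches.iterator();
    }

    @Override
    public Iterator<PictureRow> readAll() {
        return new ArrayList<>(rows.values()).iterator();
    }

    @Override
    public boolean update(PictureRow changed) {
        if (!rows.containsKey(changed.id)) return false; // nothing to update
        rows.put(changed.id, changed);
        return true;
    }

    @Override
    public boolean delete(long id) { return rows.remove(id) != null; }
}
```

Calling create with an unsaved row returns a copy carrying the assigned identifier, which mirrors the "semi constructed builder" idea: the caller never picks the primary key.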

Don't get me wrong. Writing tests is a great thing. Avoiding the need for them is even better. Take this example re-factoring process, where you rename and modify a field:

  1. Do your SQL re-factoring, renaming the field and setting its new value.

  2. Use your IDE to rename the methods for the proto field in your code to the new field name: get[Field], has[Field] and the builder methods set[Field] and clear[Field]. This updates all of your code to use the new naming convention.

  3. Update your proto field names and rebuild. Everything should just work.

  4. Deploy the updated code which now uses the new field.

  5. Delete your old field in the database table.

Pretty simple, right? You may still have to change the semantic interpretation of the underlying data, and that is worth testing, but we have avoided a few very repetitive tasks and the need to test the code. Combined with the power of using protobuf to generate your object representations you can be a much more effective coder. To give an example, in most projects between 20% and 50% of the code is just object representations. Admittedly proto generated code is somewhat verbose but the point stands. Using the protostore means that 80% of your data access layer doesn't need to be written either. All you need to do is create a factory somewhere which returns a protostore for your type of database table.
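
One way to see why the rename steps stay cheap: if the SQL is derived from the message's field names, renaming a field changes the statement without anyone editing a query by hand. The sketch below illustrates that idea only; SqlSketch and its methods are hypothetical and are not taken from the proto store source. In a real implementation the names could presumably be read from the message descriptor (Descriptor.getFields() and FieldDescriptor.getName() in the protobuf Java library); here they are passed in as a plain list.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper: builds prepared statement SQL from field names,
// assuming one table per message and one column per field.
final class SqlSketch {
    private SqlSketch() {}

    // e.g. INSERT INTO Picture (urn, url) VALUES (?, ?)
    static String insert(String table, List<String> fields) {
        String columns = String.join(", ", fields);
        String marks = String.join(", ", Collections.nCopies(fields.size(), "?"));
        return "INSERT INTO " + table + " (" + columns + ") VALUES (" + marks + ")";
    }

    // e.g. SELECT urn, url FROM Picture WHERE urn = ?
    static String selectBy(String table, List<String> fields, String keyField) {
        return "SELECT " + String.join(", ", fields)
                + " FROM " + table + " WHERE " + keyField + " = ?";
    }
}
```

With the field list ("urn", "url", "profileUrn"), renaming profileUrn in the proto and the table changes the generated INSERT and SELECT text in one place, so only the semantic change is left to test.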


Falling Down

The last week has presented a lot of challenges, not least being knocked off my motorbike, financial issues and a heavy crash on my BMX. After all of that I spent the whole of Sunday lying around, hardly able to move without whimpering like a sick old dog! My injury list is now: right knee ligaments, right wrist strain, strained neck, left wrist strain, bruised ribs, bruised knees and sprained left foot.

This all started at the beginning of the week when I found myself lying face down in traffic. A pedestrian had run out into the road and appeared from behind a bus - the lights were on green for me but hey...

After the motorbike crash I noticed a few things in the days after where I was being more cautious. On the BMX too, I found myself being more hesitant. The answer is simple: I need a way to regain my confidence. I spent a couple of hours at my local track, basically cruising around and trying to regain the strength in my right knee. I tore the ligaments behind my knee over a month ago. Unfortunately they don't seem to have recovered :-( After only an hour of light riding they were hurting again so I called it a day.

Later on, riding home and buoyed by my session, I was feeling good. A windsurfing coach years ago had two pieces of advice for dealing with the fear of crashing:

  • You gain confidence when you feel like you are in control and comfortable.

  • If you aren't falling off, you aren't trying hard enough.

You may be able to spot the inherent contradiction in these two different pieces of advice! As is often the case, after riding for a while I was probably more tired than I realised and stacked a landing on a little jump on the ride home :-(

The psychological effect of falling off is quite strange because you are more likely to hesitate and crash.

At a recent event by Stewart Bewley, part of the discussion was how you call upon all of your power and confidence even if you are having the worst week of your life. In that context, using a routine to put yourself into a ready mental and emotional state made a lot of sense. I think this is what you can see gladiator-like MotoGP riders doing before the race. These guys fall off at high speeds every week. Some of them even get back on with broken bones, such is their will to win. They all have different routines, be it crouching beside the bike like Rossi or listening to Phil Collins (seriously JL?).

The main problem is most likely just that I am getting older, so it takes much longer to recover from injuries. The trick looks like it will be to train and get a lot stronger physically so that I can roll out of the crashes more comfortably. Time to hit YouTube to find some good training tips. Unfortunately my doc said I have to stay off interval training for at least another week because of the neck strain I picked up at the beginning.

What do you do to train after a crash? Tweet Me.


Time To Train: Coaching For Saxophone

As I mentioned in a previous post, I have started to learn the saxophone as it was a long-time ambition. I have made some progress with the Hite Premier mouthpiece and with getting a nice sound. I have been using the Tune a Day book and trying to supplement that with inspirational lessons from YouTube like this one from Nigel McGill.

From my time learning the clarinet I can see I have made progress adapting to the sax. It needs quite a different embouchure and breathing technique, and the very low notes present a particular problem as they require very fine control of the reed.

To try and learn faster I have decided to take some lessons and have found a teacher in East London.

Before you suggest I play something here I should say that piece above is 'inspiring' and currently completely unattainable :-)