The Wild West of Working On NodeJS In Production (and why it’s fun).


NodeJS is a lightweight technology that no one has yet conquered and shackled in the prison of “Best Practices”. When coding a new app, there’s no list of frameworks to use, no service containers to worry about, no definitions, no injections or ejections or what have you. When you start a new app, there’s app.js and there’s your code.

It’s interesting that Node has become such a fun tool to work with. Everyone I’ve met, or whose articles I’ve read, has had a wide smile on their face, eager to tell me how amazing their last project was. My explanation for it is this Wild West of NodeJS.

The Story of Our Own Paradigm

Without any default ORMs to turn to as I normally would have when working with PHP (Doctrine, Eloquent), and no (very opinionated) frameworks to install (Laravel, Symfony), I had the freedom to choose not only the structure of my application but its process.

About two years ago, my boss and a co-worker decided that their structure and process would not fall into the trap of models, views, controllers, tests, routes, and public. Their application skeleton became Genesis Skeleton, which crudely split the working paradigm into client (front-end) and server (back-end) for single-page applications. The beauty of this setup is that it separated things perfectly.

On the server, we’d build all of the logic to bring the data forth. We would create a robust API to drive the client. And on the client side, we would use AngularJS to build a single page app. With its routing capabilities, we were free to use the back-end router for semantic API calls. It was a perfect balance.

All of this was running on ExpressJS which is pretty much the only “framework” for Node today. And it’s not really even a framework, it’s a toolkit. It doesn’t determine how you run your application, what your structure is, it just helps you get the server working.

Working on a single page app this way was a lot of fun, and because everything was JavaScript, there was no “shock” when entering the front-end world. However, this is not as easy in PHP, especially when using frameworks.

Sure, you can just set up loose PHP scripts to work this way but is that really the best idea? You’d be mixing PHP and JS together in an ugly fashion and ignoring PHP’s class system and full range of capabilities. It’s funny but the reason why PHP can’t be so lightweight and fun anymore is because it’s built to run enterprise-sized apps.

With Node, it’s called “lightweight”. With PHP, it’s called “cringe-worthy”.

The Story of 5 Query Builder…Builders

I said there’s no “default” ORM, right? The only one really out there is Bookshelf and I’m not a big fan. Plus, most features that I’d expect an ORM to have (or a framework to have) are not there. There are no migrations, no built-in seeding. But then again, it’s not a framework, it’s just an ORM.

At my work, we often deal with huge data sources, both our own and some government supplied (read: overly complicated and messy). We decided to build an application that would hold various APIs and manage them. Basically, we created a “module manager”, and each API would be a “module”. My boss quickly set up an Express app with no controllers or views and we got to work, each of us taking a bite of a service or data source.

While my boss built the first service provider (a GeoIP API), I was the first to take a stab at the data source API. The first thing that was clear was that Bookshelf would not cut it. It was too opinionated, and would require a lot of hand-holding in order to create the models for the database structure. I was dealing with 20+ tables, all intermingled, and most not necessary for individual API calls.

I opted to use Knex, the query builder which Bookshelf relies on. I started coding and realized that Knex was lacking in several different ways. Not by accident or thoughtlessness but by design:

  1. Knex didn’t do “model hydration”, which is why Bookshelf is useful. This basically means that you get a flat SQL result back, not an object with parents and children associated. None of the join data was glued together in a simple deep graph.
  2. Knex did not internally check if a join was already made. Meaning that if you accidentally (or on purpose) joined the same table with the same join clause twice, Knex wouldn’t throw an error or deal with the situation. MySQL would.
  3. Last but not least, Knex did not build objects that let you add on to the select clause dynamically, fire off the object numerous times, or change your mind about various features after setting them.

To solve Problem #1, my ex-co-worker Kevin looked for a path finding solution. An npm module that could look at flat MySQL data and say “Yep, these belong together, let’s make a neat deep graph out of them”. But there wasn’t one. As I said, this is the “Wild West”. So he built Treeize, an NPM module that used simple syntax that would be turned into a deep graph. All of a sudden, my select clauses were Treeize-bound:

table1.id AS id, table2.id AS table2:id

And with very pretty results:

[ {
  id: 1,
  table2: {
    id: 3
  }
} ]
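
To make the gluing step concrete, here is a minimal sketch of the idea. It is not Treeize’s actual implementation, and nestRows is a made-up helper name. It only splits colon-delimited aliases into nested objects, one output object per row, and does not merge repeated parent rows the way a real hydration library would.

```javascript
// Minimal sketch of turning flat SQL rows into a nested graph.
// Assumption: aliases use ":" as the nesting separator, as in the
// select clause above. nestRows is a hypothetical helper, not Treeize.
function nestRows(rows) {
  return rows.map((row) => {
    const result = {};
    Object.keys(row).forEach((key) => {
      // "table2:id" -> ["table2", "id"]
      const parts = key.split(':');
      let target = result;
      // Walk (and create) an object for every segment except the last.
      parts.slice(0, -1).forEach((part) => {
        if (!target[part]) target[part] = {};
        target = target[part];
      });
      // The last segment is the actual property name.
      target[parts[parts.length - 1]] = row[key];
    });
    return result;
  });
}

const flat = [{ id: 1, 'table2:id': 3 }];
console.log(JSON.stringify(nestRows(flat)));
// prints [{"id":1,"table2":{"id":3}}]
```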

Problem #2 and Problem #3 were a tad different. I originally wanted to directly interact with Knex’s collection of joins but found it difficult, so I had to build a Query Builder Builder. My first one. The builder would create an instance of Knex bound to a table, and would use its own internal collections to keep track of all the information:

  • there was a joins array that kept track of all the tables that were joined.
  • there was a view manager that would concatenate an array of selects.

The joins array worked well, and the view manager was great too, since Treeize required “as” clauses per data point in order to work. My “views” became arrays of data points.
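
A rough sketch of those two collections might look like the following. This is an illustration rather than the original code: the class name, method names, and the table1_id column are all invented, and it assembles a SQL string by hand instead of wrapping Knex so that it can run on its own.

```javascript
// Hypothetical sketch: TrackedBuilder, addJoin, addView and the
// table1_id column are illustrative names, not the original builder.
class TrackedBuilder {
  constructor(table) {
    this.table = table;
    this.joins = [];   // every distinct join requested so far
    this.selects = []; // "column AS alias" strings gathered from views
  }

  // Record a join only once, keyed by table plus join condition.
  addJoin(table, left, right) {
    const exists = this.joins.some(
      (j) => j.table === table && j.left === left && j.right === right
    );
    if (!exists) this.joins.push({ table, left, right });
    return this;
  }

  // A "view" is an array of data points, each already carrying its alias.
  addView(dataPoints) {
    this.selects = this.selects.concat(dataPoints);
    return this;
  }

  // Stand-in for handing the collections over to a real query builder.
  toSQL() {
    const joins = this.joins
      .map((j) => `JOIN ${j.table} ON ${j.left} = ${j.right}`)
      .join(' ');
    return `SELECT ${this.selects.join(', ')} FROM ${this.table} ${joins}`.trim();
  }
}

const tracked = new TrackedBuilder('table1')
  .addView(['table1.id AS id', 'table2.id AS table2:id'])
  .addJoin('table2', 'table2.table1_id', 'table1.id')
  .addJoin('table2', 'table2.table1_id', 'table1.id'); // duplicate, ignored

console.log(tracked.joins.length); // prints 1
console.log(tracked.toSQL());
```

The point of the design is that the duplicate join never reaches the database, so MySQL has nothing to complain about.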

But I wasn’t really done.

My next API (API #2) got a few improvements:

  1. The builder did not use Knex’s “select” or other clauses until it was prompted to. So using builder.select('something.id as id') would not actually touch Knex. The builder would create a collection of selects that would get fired on builder.compile() after which you could use builder.then(function(results){}).
  2. Filtering was handled outside of the builder but using builder’s “join” clause which, again, would get fired on builder.compile().
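
The deferred behaviour can be sketched roughly like this. The class and the unselect method are hypothetical, and a small stub stands in for the Knex query object; the only assumption about the real thing is that it exposes chainable select() and join() methods.

```javascript
// Hypothetical sketch of a builder that records calls and only
// replays them onto the underlying query when compile() is called.
class DeferredBuilder {
  constructor(query) {
    this.query = query;        // anything with chainable select()/join()
    this.pendingSelects = [];
    this.pendingJoins = [];
  }

  select(column) {
    this.pendingSelects.push(column);
    return this;
  }

  // Changing your mind is cheap because nothing has been applied yet.
  unselect(column) {
    this.pendingSelects = this.pendingSelects.filter((c) => c !== column);
    return this;
  }

  join(table, left, right) {
    const key = [table, left, right].join('|');
    if (!this.pendingJoins.some((j) => j.key === key)) {
      this.pendingJoins.push({ key, table, left, right });
    }
    return this;
  }

  // Only here does the underlying query get touched.
  compile() {
    this.pendingJoins.forEach((j) => this.query.join(j.table, j.left, j.right));
    this.query.select(this.pendingSelects);
    return this.query;
  }
}

// Stub query object that just logs what it receives.
const calls = [];
const stubQuery = {
  select(columns) { calls.push(['select', columns]); return this; },
  join(...args) { calls.push(['join', ...args]); return this; },
};

const deferred = new DeferredBuilder(stubQuery)
  .select('something.id as id')
  .select('something.name as name')
  .unselect('something.name as name')
  .join('other', 'other.something_id', 'something.id');

console.log(calls.length); // prints 0, the query is still untouched
deferred.compile();
console.log(JSON.stringify(calls));
```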

Due to the views being tricky when it came to government data, I employed a schema format for API #3. The Schema had a list of all relevant data points with the following information:

  • data point “alias” (something:something)
  • data point “Path” (something.something)

The builder would crawl the schema, and collect all of the paths as “required tables”. Then the builder would crawl the table schema and pick up all the joins and pivots necessary to get the data points.
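
As a rough illustration of that crawl, here is a sketch with an invented schema. The table names, the shape of the table schema (an "on" pair plus a "requires" list), and both function names are assumptions made for the example, not the format the real API used.

```javascript
// Hypothetical data point schema: alias (something:something)
// and path (something.something).
const schema = [
  { alias: 'id', path: 'records.id' },
  { alias: 'agency:name', path: 'agencies.name' },
  { alias: 'agency:region:name', path: 'regions.name' },
];

// Hypothetical table schema: how each table joins back toward the
// base table, and which other tables must be joined first.
const tableSchema = {
  agencies: { on: ['agencies.id', 'records.agency_id'], requires: [] },
  regions: { on: ['regions.id', 'agencies.region_id'], requires: ['agencies'] },
};

// Step 1: collect the tables behind the requested data points.
function requiredTables(points, wantedAliases) {
  const tables = new Set();
  points
    .filter((point) => wantedAliases.includes(point.alias))
    .forEach((point) => tables.add(point.path.split('.')[0]));
  return tables;
}

// Step 2: expand those tables into an ordered list of joins,
// pulling in intermediate tables before the ones that depend on them.
function resolveJoins(tables, tableDefs, baseTable) {
  const ordered = [];
  const visit = (table) => {
    if (table === baseTable || ordered.includes(table)) return;
    const def = tableDefs[table];
    if (!def) throw new Error('No join definition for ' + table);
    def.requires.forEach((required) => visit(required));
    ordered.push(table);
  };
  tables.forEach((table) => visit(table));
  return ordered.map((table) => ({ table, on: tableDefs[table].on }));
}

const wanted = ['id', 'agency:region:name'];
const joins = resolveJoins(requiredTables(schema, wanted), tableSchema, 'records');
console.log(joins.map((j) => j.table)); // agencies is joined before regions
```

Asking only for the region name still pulls in the agencies table, because that is the only route from records to regions.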

In the end, this API really only required passing a query parameter, and it would build everything out of it. This design also solved a new problem: getting a count of results. When building my API, I needed a count of the current results (accomplished by results.length after Treeize had run) and an “out of how many” total, which meant that I had to use a second query. The builder solved this for me because I could create a new builder object, pass it the req.query, and then override with a custom builder.count() method, which removed limits, offsets, and most of the data points in order to get a clean count.
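
The count trick boils down to building the same query description twice and stripping the second copy. A minimal sketch, assuming the builder’s state can be represented as a plain object; the field names, the default page size of 25, and both function names are invented for illustration.

```javascript
// Hypothetical: turn request parameters into a plain query description.
function buildFromQuery(query) {
  return {
    selects: ['records.id AS id', 'agencies.name AS agency:name'],
    filters: query.agency ? [['agencies.name', query.agency]] : [],
    limit: Number(query.limit) || 25, // assumed default page size
    offset: Number(query.offset) || 0,
  };
}

// Hypothetical count(): keep the filters, drop limit, offset and the
// data points, and select a single count column instead.
function toCount(state) {
  return {
    selects: ['COUNT(DISTINCT records.id) AS total'],
    filters: state.filters.slice(),
  };
}

const reqQuery = { agency: 'Parks', limit: '10', offset: '20' };
const pageQuery = buildFromQuery(reqQuery);
const countQuery = toCount(buildFromQuery(reqQuery));

console.log(pageQuery.limit, pageQuery.offset); // prints 10 20
console.log('limit' in countQuery);             // prints false
```

Because both descriptions come from the same req.query, the filters of the page query and the count query cannot drift apart.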

API #4 dug into a new problem. Each builder had its own issues and could not be used everywhere. Building a schema or a data point list was useless here, so instead I built a new query builder. This builder did not possess any of the features above; it simply joined on all required tables, processed req.query, and left you to create a select.

This last API required tons of table joins but very little data and little filtering so I could get away with a simpler builder.

My co-worker ended up tackling a monstrous project with cross-database joins in API #5. His builder resembles my last one in some ways in that the builder already contained the joins and everything else required. This meant that he could call builder.someInformation() and the builder would join on the right tables, and select the right information for him to use.

Conclusion: What ended up happening is that all these modules used a custom-made builder for themselves which was totally fine. It broke the rules of DRY but then, each API presented its own issues, and its own DB structure and they all needed their own solution.

However, my point is that there was no ORM to solve this for us. No pre-made solution that we could plug in. But because Node doesn’t care about our app structure, we were able to build a lightweight, quick, modularized application that we use in production today and that gets an average of 30-40K uncached hits.

On top of that, we did not deal with complex class systems, service providers, listeners or anything of that nature. We simply dealt with getting the data out and the pure and simple logic to do it.

So Why Is It So Fun?

A few weeks ago, after working on my personal Laravel project, I switched to work on a Node project and it’s just refreshing.

The refreshment comes from the fact that not only were most problems not yet solved for the platform (meaning that there is space for your own NPM packages and your own improvements) but there was also a freedom from conventions and structures.

Could you do this in PHP? Yeah, but you’d hate yourself. With Node, it’s called “lightweight”. With PHP, it’s called “cringe-worthy”. The closest thing to Node freedom is Laravel with its routing structure. In L3, the docs were okay with encouraging you to write your logic into your routes. Can you imagine doing that today? No, you’d get slapped by someone.

Node is fun and easy precisely because of its lack of convention yet considerable speed (though the loose PHP scripts would probably be faster). On top of that, it already handles dependencies for you (npm + require), meaning that you don’t need to set up a dependency system, class system, call system, or anything like that. There are tons of fun problems to resolve but a huge number of packages as well. Since the market isn’t as saturated as Ruby’s or PHP’s, there’s space for “yet another CMS” for you to work on (I’m not a big fan of Ghost), meaning that you’re not “reinventing the wheel”, you’re creating the wheel for the platform.

For me, the Node project meant that I got to focus on “how to solve this problem” rather than “how to boilerplate this app”.

Comments

  1. antonio gioia says:

    “Could you do this in PHP? Yeah, but you’d hate yourself”, love this, i agree.. with node the fun is back!
