Photo by Brujo. Relevant: binary database extraction?

For those of you who may not know, DataTables.Net is a fantastic jQuery plug-in that creates a data grid presentation and offers support for filtering and server-side paging solutions. Yep, define an endpoint web service, pump down your data and you are good to go. But the cool kids these days are into the No-SQL thing, and one of the great entries into the document based database arena is RavenDB. With Raven, you define your domain objects and just store them in the database, as Raven will do it’s magic and persist your objects as documents. Have a List<Customer> to store, give it to Raven and it will create one document of the customers and store it in JSON format.

This post will show you how to combine the front end goodness of DataTables with the back end magic of RavenDB. The goal is to provide:

  • The ability to define a single class for that indexes data.
  • Control how what data is selected by define the columns, sort order and paging size in Javascript. In other words, DataTables will tell the server what it wants to pull back
  • Provide support for filtering properties with a single search field, ala Google style.
  • Above all, save you time, make you a hero in front of your fans. :)

For the Impatient, Here’s the End Product

There’s a lot ground that we’ll cover but for those who want to see the light at the end of the tunnel here what the end solution will look like. You may want to download the solution first so you can follow along in the code. First, your web service or controller will have the following:

[HttpPost]
public JsonResult GetTenants(string jsonAOData)
{
var tenantPager = new DataTablesPager(DocumentStore);
var results = tenantPager.PrepAOData(jsonAOData)
.FilterFormattedList();

return Json(results);
}

// The core method that is used to get the data is in DataTablesPager.cs
public List Filter(int pageSize, int pageIndex, string[] terms)
{
var targetList = new List();
RavenQueryStatistics stats;

using(var session = docStore.OpenSession())
{
if (terms[0].Length > 0)
{
targetList = session.Query()
.Customize(x => x.WaitForNonStaleResults())
.SearchMultiple(x => x.QueryFields, string.Join(" ", terms), options: SearchOptions.And)
.Statistics(out stats)
.Skip(pageIndex)
.Take(pageSize)
.As()
.ToList();

this.totalDisplayResults = stats.TotalResults;

session.Query()
.Statistics(out stats)
.As()
.ToList();
this.totalResults = stats.TotalResults;
}
// Code reduced for reading purposes.

Take note of the Generics on the constructor – Tenant is your domain object, Tenant_Search is the class that Raven will use to create the index for retrieving the data, as well as defining what properties you can filter on the object. We’ll cover indexing shortly along with some RavenDB background.

Your Javascript will be the following:

var otable;

$(document).ready(function(){
otable = $("#tenantTable").dataTable({
"bProcessing": true,
"bSort": true,
"sPaginationType": "full_numbers",
"aoColumnDefs": [
{ "sName": "Name", "aTargets": [0], "bSortable": true, "bSearchable": true },
{ "sName": "Agent", "aTargets": [1], "bSortable": true, "bSearchable": true },
{ "sName": "Center", "aTargets": [2], "bSortable": true, "bSearchable": true },
{ "sName": "StartDate", "aTargets": [3], "bSortable": true, "bSearchable": true },
{ "sName": "EndDate", "aTargets": [4], "bSortable": true, "bSearchable": true },
{ "sName": "DealAmount", "aTargets": [4], "bSortable": true, "bSearchable": true }
],
"oLanguage": {
"sSearch": "Search all columns:"
},
"aaSorting": [[1, "asc"]],
"iDisplayLength": 7,
"bServerSide": true,
"sAjaxSource": "GetTenants",
"fnServerData": function (sSource, aoData, fnCallback) {

var jsonAOData = JSON.stringify(aoData);

$.ajax({
//dataType: 'json',
contentType: "application/json; charset=utf-8",
type: "POST",
url: sSource,
data: "{jsonAOData : '" + jsonAOData + "'}",
success: function (msg) {
fnCallback(msg);
},
error: function (XMLHttpRequest, textStatus, errorThrown) {
alert(XMLHttpRequest.status);
alert(XMLHttpRequest.responseText);

}
});
}
});

otable.fnSetFilteringDelay(1000);
});

It’s actually longer than the .Net stuff!!! We’ll cover what this means as well.

Getting Data and Providing Search with RavenDB

This post assumes that you have installed RavenDB, can connect to it, know how to store your objects, and that you can perform queries with LINQ. There are some good tutorials such as Rob Ashton’s video introduction to Raven, as well as a brief overview by Sensei (me). We’re going to focus on Raven’s inherent full-text search capability and rely on Raven’s built in paging mechanism to help us achieve our goals. While there is great capability that Raven provides, it is not SQL, and much of what you know about LINQ and LINQ to SQL will help you as well as paint you into a corner at the same. We’ll cover those aspects too.

First off, RavenDB is built on top of the search engine Lucene.Net. It is a schema-less database so up front we will need to identify what how we want to fetch data, as they indexes provide super fast data retrieval. Raven Indexes reduce the need to devote huge cpu cycles to processing a query, as the index is built from the documents and processed as a background operation. This operation is asynchronous and is performed by Lucene. Without this approach, any query force a complete scan of all documents. A miserably slow scan. With indexes define up front, Raven will work quitely to keep the indexes up to date when new documents are created. So why is this important? Well, what you think are doing in LINQ:

var search = "Bonus";
var steps = session.Query()
.Customize(x => x.WaitForNonStaleResults())
.Where(x => x.State.StartsWith(search) || x.WorkflowName.StartsWith(search))
.ToList();

is really translated to Lucene syntax. So what we get in the end is State:Bonus OR WorkflowName:Bonus. While it is simple to write a query that includes all the properties of an object, if you had an object with 15 properties would you really want to create a monster statement with a ton of ||’s? Hell no! If you look in the TestSuite project of the source code there is a few example of using pure LINQ queries. Check out the method “CanFilterAccrossAllStringProperties” and you will see where things were headed.

We want to be like Fonzie, and what was Fonzie? Correctomundo – he’s cool. A good solution would be to know what properties a domain object had, and perform a filter against those properties. In other words, it would really helpful if we could write some code that would look like this:

var propertyFilterSteps = session.Query()
.Customize(x => x.WaitForNonStaleResults())
.Where(AllStringPropertiesFilter(search, "Answer,AnsweredBy,Id,Participants,WorkflowName,WorkflowType,"))
.ToList();

Here we are using a Expression<Func<T>> to pass in a delimited list of property names and with a little LINQ query against the class Step, we can generate a Lambda to process of filter. This is in the test method “CanFilterAccrossAllStringProperties”. It worked great until we needed to include DateTime properties. The code is included in the project, so you look at it there.

So how do we achieve the goal of have a single text box driven search that will query across all type of properties, and when you type “Spock 2010″ will query the properties that you specify for both values of “Spock” and “2010”? Let’s go back to Raven as you can specify an index query by mapping what properties you want to be included in the index and how you want Raven / Lucene to parse the text for matching values in that index. Raven provides a class called “AbstractIndexCreationTask” where you define a Map / Reduce function to create your index. In other words you are picking which properties are included in the index. You can define the output of the Map to anything that you wish. This is held in a class that we’ll name ReduceResult and we will query against the properties of that class. We want to tell Raven to take the significant properties and index them in a format that we can match our terms against. So we will create the following index that will allows us to filter for any term. This is found in Step_Search.cs in the Index folder

public class Step_Search : AbstractIndexCreationTask
{
public class ReduceResult
{
public string[] QueryFields { get; set; }

// ... code eliminated for reading purpose
}

public Step_Search()
{
Map = steps =>
from step in steps
select new
{
QueryFields = new [] { step.State, step.Answer, step.AnsweredBy, step.WorkflowName,
step.Created.ToShortDateString(), step.Created.Year.ToString(),
step.Created.Month.ToString() + "/" + step.Created.Year.ToString()},
DateCreated = step.Created,
WorkflowName = step.WorkflowName,
State = step.State
};
Indexes.Add(x => x.QueryFields, FieldIndexing.Analyzed);

// ... more code eliminated for reading purposes

So what we have done is create an index that has an array of strings. This array holds the text of our properties that we will match against. Raven has a method called Search that will perform a “StartsWith” style match against each object in the array. The call is .Search(x => x.QueryFields, “string to be searched”). If you take a look at the index we have done some additional things with dates: for one, we create a string representation in ShortDate format. So when the user knows the exact date they can enter it and the Pager will match it. But we want to make things as easy possible, so we have created string representations in mm/yyyy format so it’s easy for the users to filter if they only know the month and year of the item they are looking for. “I think it was April last year …”. This is a big for those users who don’t recall exact details, as it allows them to quickly discover what they are looking for.

One last thing before we move on to making this work with DataTables. Raven provides the search method that works with the IRavenQueryable collection. Take a look at the DataTablesPager.Filter method and you will see a SearchMultiple method. This is put in place to perform a search for multiple terms. In other words it will search for “Spock” and then chain a search for “2010” against the IRavenQueryable. Phillip Haydon came up with this approach that works with partial matches, as it will give Lucene the right syntax. Otherwise you end up with weird results because you feed Lucene “spoc 201″ and because of the tokens Lucene creates with the text analyzer it will not pick up what you need. Phillip’s brilliant approach bridges this gap by using an extension method to perform the chaining of search terms. This is found in the class RavenExtensionMethods.cs, and it basically tokenizes the search string, creates an array and an individual call to the Search() method for member of the array. It allows us to perform advanced filtering such as partial matches like “spoc 201″. Try this out in the Tenant.aspx page of the WebDemo solution are you’ll see how it works.

Outta Breath Yet? Let’s Talk DataTables.Net!!!

Breathing hard yet? Good! There’s more to do – how does this work with DataTables.Net? DataTables uses the following parameters when processing server-side data:

Sent to the server:

Type Name Info
int iDisplayStart Display start point
int iDisplayLength Number of records to display
int iColumns Number of columns being displayed (useful for getting individual column search info)
string sSearch Global search field
boolean bEscapeRegex Global search is regex or not
boolean bSortable_(int) Indicator for if a column is flagged as sortable or not on the client-side
boolean bSearchable_(int) Indicator for if a column is flagged as searchable or not on the client-side
string sSearch_(int) Individual column filter
boolean bEscapeRegex_(int) Individual column filter is regex or not
int iSortingCols Number of columns to sort on
int iSortCol_(int) Column being sorted on (you will need to decode this number for your database)
string sSortDir_(int) Direction to be sorted – “desc” or “asc”. Note that the prefix for this variable is wrong in 1.5.x where iSortDir_(int) was used)
string sEcho Information for DataTables to use for rendering

Reply from the server

In reply to each request for information that DataTables makes to the server, it expects to get a well formed JSON object with the following parameters.

Type Name Info
int iTotalRecords Total records, before filtering (i.e. the total number of records in the database)
int iTotalDisplayRecords Total records, after filtering (i.e. the total number of records after filtering has been applied – not just the number of records being returned in this result set)
string sEcho An unaltered copy of sEcho sent from the client side. This parameter will change with each draw (it is basically a draw count) – so it is important that this is implemented. Note that it strongly recommended for security reasons that you ‘cast’ this parameter to an integer in order to prevent Cross Site Scripting (XSS) attacks.
string sColumns Optional – this is a string of column names, comma separated (used in combination with sName) which will allow DataTables to reorder data on the client-side if required for display
array array mixed aaData The data in a 2D array

DataTables will POST an AOData object. The class DataTablesPager.cs will handle parsing this object with the method PrepAOData. It’s responsible for determining what properties we are querying, how the data will be sorted, paging size, as well as including any terms for filtering. Because we have used generics, PrepAOData doesn’t care what object in your domain you are using as it is designed to read the properties and match those properties against the list of data items that DataTables has sent up to our application on the server. Again, our goal is to let DataTables dictate what to look for, and as long as we did our work when we created the index we should have great flexibility.

Let’s look at the Javascript again:

"aoColumnDefs": [
{ "sName": "Name", "aTargets": [0], "bSortable": true, "bSearchable": true },
{ "sName": "Agent", "aTargets": [1], "bSortable": true, "bSearchable": true },
{ "sName": "Center", "aTargets": [2], "bSortable": true, "bSearchable": true },
{ "sName": "StartDate", "aTargets": [3], "bSortable": true, "bSearchable": true },
{ "sName": "EndDate", "aTargets": [4], "bSortable": true, "bSearchable": true },
{ "sName": "DealAmount", "aTargets": [4], "bSortable": true, "bSearchable": true }
],

In the server side application we have Tenant_Search.cs that has created an index with the properties Name, Agent, Center, StartDate, EndDate and DealAmount. The Javascript above is DataTables way of saying “Hey, server, give me information back in the form of an array of value pairs and by they way, here are the data columns to use when you get me my stuff.” On the server side, we don’t care what the order of columns in the grid will be as the server assumes that DataTables will take care of that. And indeed, DataTables is supplying the request in the sName value pair. The server fetches it, spits it back to the browser and DataTables munches it. You can change the Javascript and leave your server application alone as long as you stick to using the fields you included in your index. Just like Fonzie, be cool.

But even cooler is the fact that Raven will handle paging for us: it has a built in limit of return up to 128 documents at a slice. Given that it’s retrieval speed is very fast this will work very well for us. If you look at the Raven console for each page that you retrieve you will see a very low time for the fetches. Remember, there is very little processing for the query as it is the index that has already performed the heavy lifting. For an example of this the page Tenants.aspx in WebDemo solution will page and filter 13,000 + documents. It is lightning fast.

Has Your Head Exploded Yet?

This is a lot to digest. Source code is here, along with the means to create 13,000 documents that you can use for testing. Please note that you will be required to pull down the Raven assemblies/packages via NuGet. Otherwise the download would be about 36 MB. Work for responding to sorting request has been started and hopefully you’ll want to see how that’s is solved in future post. But what we have accomplished is a very robust and easy way to display data from our document database, with low effort on the back end application.

Play with the code. The only way we make this better is to critique constructively, adapt to better ideas and grow! Some of the failed experiments have been included in the test so you can see how things have progressed. They marked as failures so you can focus on testing the DataTablesPager class. These failures are interesting though, and can provide insight to how the solution was arrived at. Also, the first time you fire up the web site that Global.ascx page will look for the test records and create them. This takes some time so if you want the wait those sections are marked for you so you comment them out and do what you need to. Enjoy.


David Robbins is the ActiveEngine Sensei and publishes his ravings about the web, software development in .Net, Elvis and the Zen of productivity at the blog ActiveEngine.Net. He lurks in the outer realm of open source software and loves to blend the best of breed solutions with jQuery, KnockoutJs and C#.


Editor’s note: I was approached about this articles after it came out and gave permission to have it translated into Crotian. You can check it out Here