3.16.2012

Hootie and the Blowfish vs. Blank Tape





11.15.2010

Clojure Report Generation

The entire code from this blog post is here. I will chunk it up to explain:
https://gist.github.com/4fcd64dc020a9c749b76

Working with collections in Clojure makes programming feel natural, especially when you want to report on your data. Many data stores have libraries that make getting information into Clojure trivial. They often return a collection of maps where the keys are the "column" names and the values are the data. The following generate-report function should work on any collection of maps.



Above I am taking a set of query results (i.e. a collection of maps) and any number of report items. A small sample of one of the maps in the collections with which I'm dealing is here.

The first thing to do is grab the headers. These are just the first element of each report item. Pretty simple. The process function takes the keys from each report-item and applies manipulation-function to the result of (get-in % ks). process returns a new function that takes a query-result, so I can map it over every report-item to get columns as a seq of functions. Finally, to get rows, I use apply juxt to call every function in columns on each query-result, which gives a vector of the results of those calls.
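The steps described above can be sketched roughly as follows. This is not the original gist: the report-item shape `[header ks manipulation-fn]`, the `identity` default, and the sample data are all assumptions made for illustration.

```clojure
(require '[clojure.string :as str])

;; Assumed report-item shape: [header ks manipulation-fn],
;; where manipulation-fn is optional.
(defn process
  "Returns a function of one query-result that looks up ks with get-in
  and passes the value to manipulation-fn (identity when omitted)."
  [[_header ks manipulation-fn]]
  (let [f (or manipulation-fn identity)]
    (fn [query-result]
      (f (get-in query-result ks)))))

(defn generate-report
  "Takes a collection of maps and at least one report-item.
  Returns a map with :headers and :rows."
  [query-results & report-items]
  (let [columns (map process report-items)]
    {:headers (map first report-items)
     ;; juxt builds one function that calls every column function
     ;; on a query-result and collects the results in a vector.
     :rows    (map (apply juxt columns) query-results)}))

;; Hypothetical sample data.
(def sample-results
  [{:score 92 :client {:account-number 17 :account-name "Acme"}}
   {:score 78 :client {:account-number 4  :account-name "Zenith"}}])

(def report
  (generate-report sample-results
                   ["Score"   [:score]]
                   ["Account" [:client :account-name] str/upper-case]))

(:headers report) ;; => ("Score" "Account")
(:rows report)    ;; => ([92 "ACME"] [78 "ZENITH"])
```

Note that juxt needs at least one function, so this sketch expects at least one report-item.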

You can tell from the doc string what report-items are supposed to look like, so I will show you two of my reports. The first one uses the generate-report function to put its :headers and :rows into an html table.



The code above is all that is required to create this:

Granted, the manipulation functions in each report-item are doing most of the display work, but the generate-report function makes it easier to focus on what matters in each map in the collection of query-results.
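To give a rough idea of how a map of :headers and :rows could become a table, here is a minimal sketch. The `html-table` function is hypothetical and is not the report code from the gist; it builds a plain string, does no HTML escaping, and adds no title attributes.

```clojure
(require '[clojure.string :as str])

(defn html-table
  "Renders a {:headers [...] :rows [[...] ...]} map as an HTML table string.
  Hypothetical helper; cell values are not escaped."
  [{:keys [headers rows]}]
  (let [cells (fn [tag xs]
                (str/join (map #(str "<" tag ">" % "</" tag ">") xs)))
        tr    (fn [tag xs]
                (str "<tr>" (cells tag xs) "</tr>"))]
    (str "<table>"
         (tr "th" headers)
         (str/join (map #(tr "td" %) rows))
         "</table>")))

(html-table {:headers ["Score" "Account"]
             :rows    [[92 "Acme"]]})
;; => "<table><tr><th>Score</th><th>Account</th></tr><tr><td>92</td><td>Acme</td></tr></table>"
```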

The reason I wanted to break out the generate-report function is that reports on the same data might not always look the same. The next example looks at a report that needs to export to a spreadsheet. For the HTML table, I wanted to shorten some of the fields by displaying "Lastname, First initial." The title attribute on the td displays the entire name if the user needs to know. However, once exported, there are no title attributes, so the user would have no idea if he was looking at John Smith's or Jessica Smith's score. In xls-report I use a function to display all of the parts of the operators and evaluators which together constitute a unique identifier (see employee-display).
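The difference between the two displays might look something like this. The field names (:first-name, :last-name, :employee-id) and both functions are assumptions; `full-display` is only a stand-in for employee-display, whose real definition lives in the gist.

```clojure
;; Hypothetical employee map shape: {:first-name .. :last-name .. :employee-id ..}

(defn short-name
  "Compact form for the HTML table: \"Lastname, F.\""
  [{:keys [first-name last-name]}]
  (str last-name ", " (first first-name) "."))

(defn full-display
  "Unambiguous form for the spreadsheet export."
  [{:keys [first-name last-name employee-id]}]
  (str last-name ", " first-name " (" employee-id ")"))

(def john    {:first-name "John"    :last-name "Smith" :employee-id 1042})
(def jessica {:first-name "Jessica" :last-name "Smith" :employee-id 2210})

(short-name john)      ;; => "Smith, J."
(short-name jessica)   ;; => "Smith, J."  (ambiguous without a title attribute)
(full-display john)    ;; => "Smith, John (1042)"
(full-display jessica) ;; => "Smith, Jessica (2210)"
```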



This report demonstrates the use of nested maps by using the key vectors [:client :account-number] and [:client :account-name]. These are passed to a get-in call which gets me the :client of each score, and then the :account-number and :account-name of that result. Now the *.xls that gets saved can be sorted by either account-number or account-name.
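For anyone who hasn't used get-in, here is a small example of walking into a nested map with a vector of keys. The `scores` data is made up for illustration.

```clojure
;; Hypothetical query results with a nested :client map.
(def scores
  [{:score 92 :client {:account-number 17 :account-name "Zenith"}}
   {:score 78 :client {:account-number 4  :account-name "Acme"}}])

;; get-in follows each key in turn: first :client, then :account-number.
(get-in (first scores) [:client :account-number]) ;; => 17

;; The same key vectors work as sort keys.
(def by-account-number
  (sort-by #(get-in % [:client :account-number]) scores))

(map #(get-in % [:client :account-name]) by-account-number)
;; => ("Acme" "Zenith")
```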

Definitely handy stuff. I'm really starting to love this whole idea that code is data and vice versa.

11.04.2010

clojure map-keys

Someone in #clojure was asking about how to case-insensitively check the keys of a map. The best answer anyone present at the time came up with was pre-processing the map by naming, lowering, and keywordifying the keys. We then found out the asker was looking for keys in nested maps, so I wrote a function for that.

After writing the function I realized I could extract the lowering function and let it take a function to apply to all keys of m and all keys in m's vals.

Here is the map-keys function with the lower-key function extracted. I run it on some goofy data at the end. Try it in a REPL.
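The original gist isn't reproduced here, so what follows is only a sketch of the idea as described: a map-keys that applies a function to every key of m and recurses into any vals that are themselves maps. The argument order (function first) and the sample data are assumptions.

```clojure
(require '[clojure.string :as str])

(defn lower-key
  "Names, lowers, and keywordifies a key.
  Works for keywords, strings, and symbols; other key types will throw."
  [k]
  (-> k name str/lower-case keyword))

(defn map-keys
  "Applies f to every key of m, recursing into vals that are maps."
  [f m]
  (into {}
        (for [[k v] m]
          [(f k) (if (map? v) (map-keys f v) v)])))

(map-keys lower-key {:Name  "Ann"
                     "CITY" {:ZIP 12345 :Street "Main"}})
;; => {:name "Ann", :city {:zip 12345, :street "Main"}}
```

Keys that differ only by case collapse into one entry after lowering, so the last one seen wins.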

10.08.2010

Clojure Dates with Java Interop

I really just want to test how an embedded gist looks in Blogger. I thought I would do it with a simple function I wrote to test if a date object was between two other date objects in Clojure. I found how to do it in Java on Stack Overflow, so I took advantage of Clojure's Java interop, and this is what we get:
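A minimal sketch of that kind of check, assuming java.util.Date and an exclusive range (the name `between?` and the choice to exclude the endpoints are assumptions, not necessarily what the gist does):

```clojure
(import 'java.util.Date)

(defn between?
  "True when d falls strictly after start and strictly before end.
  Uses java.util.Date's .after and .before via interop."
  [^Date d ^Date start ^Date end]
  (and (.after d start)
       (.before d end)))

;; Dates built from millisecond timestamps for a quick REPL check.
(def start  (Date. 1000))
(def middle (Date. 2000))
(def end    (Date. 3000))

(between? middle start end) ;; => true
(between? start start end)  ;; => false (endpoints are excluded)
(between? end start middle) ;; => false
```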


10.03.2010

The Idea

I'm standing in my low basement. I'm on one of the few marked spots in which I can stand up straight--between the floor rafters. Standing up straight aims my sight directly at a pair of old computers. They aren't incredibly old, but given the nature of the technology world and the fact that they were going to be thrown away, we can call them old.

However, with the addition of some spare parts from other "old" computers, I have on my hands two 2.6 GHz computers, each with 2 GB of RAM and a 40 GB HDD. Not so bad for "old" computers. Then it hits me: why don't I try out MongoDB's replica sets? Then I start brainstorming all of the possibilities, and I think, why don't I document all of this on the web?

I have been learning Clojure this year. My boss wrote a great Clojure wrapper around the Java MongoDB driver, called karras. Since April I have been putting together a web application in Clojure which uses karras and MongoDB. At MongoBoston this year, I watched a lot of great talks on interesting features of MongoDB. Also, I'd like to learn more about deploying redundant applications.

Anyway, here's what I want to do: run two physical machines with Ubuntu Server, and host two virtual machines on each of them. With the four virtual machines, I'd create a MongoDB replica set and a fallback to keep my application live if either physical machine goes down. The point of documenting my plans here is to get (hopefully constructive) feedback which will help me build a fairly robust system on two very cheap machines.

I will set aside some time for installing Ubuntu Server on both computers the next time I go in the office. Between now and the time this actually happens, I have some questions for anyone who feels knowledgeable on the subject.

  1. Can I even do what is described above?
  2. Would it be better to oust the VMs and simply use 2 instances of mongod on each physical machine?
  3. What would you do differently given the hardware mentioned above?
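One thing worth checking against the first question: a MongoDB replica set can only elect a primary while a strict majority of its voting members can reach each other. The arithmetic can be sketched as below. The function names are made up, and the third host in the second call is a hypothetical addition (for example, a small machine running an arbiter), not part of the plan above.

```clojure
(defn majority
  "Smallest number of voting members that is more than half."
  [voting-members]
  (inc (quot voting-members 2)))

(defn survives-host-loss?
  "Given the number of voting members on each physical host, returns true
  if a majority is still up after losing the most heavily loaded host."
  [members-per-host]
  (let [total (reduce + members-per-host)
        worst (apply max members-per-host)]
    (>= (- total worst) (majority total))))

;; Two hosts with two members each: 4 total, majority is 3,
;; and losing either host leaves only 2.
(survives-host-loss? [2 2])   ;; => false

;; Hypothetical third host with one extra voting member:
;; 5 total, majority is 3, and losing a two-member host leaves 3.
(survives-host-loss? [2 2 1]) ;; => true
```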

9.26.2010

Optimize your Optimization.

I am constantly looking for a better way to write my functions. I want them to be fast, flexible, and readable. However, in the fast department, I also want to write them fast. Tradeoffs are always made in programming. It hit me recently that I should optimize my optimizations. That is, focus my optimization on the things used the most.

I was working out a function in a web app controller which gets accessed every time there is a write operation on the database. I was trying to figure out how to perform fewer reads to get the data into the write function. This is when I realized that the write operations would be done, on average, less than once a day. I think I can handle two or three reads per write at that frequency.

Now I can move on, but while I'm working on optimizing things that happen every second, what do you think of this approach? Am I caring too little? Should I be concerned about anything? How would you approach it?

33,000 cases of beer on the wall. . .
