28 October, 2009

The hardest thing about programming.

When asked what the hardest thing about programming is, Bill Gates answered:

The hardest part is deciding what the algorithms are, and then simplifying them as much as you can. It’s difficult to get things down to their simplest forms. You have to simulate in your mind how the program’s going to work, and you have to have a complete grasp of how the various pieces of the program work together. The finest pieces of software are those where one individual has a complete sense of exactly how the program works. To have that, you have to really love the program and concentrate on keeping it simple, to an incredible degree.
What's interesting, however, is that the hardest thing about programming is also the best thing about programming.

I love programming because learning it is relatively easy, but mastering the art is extremely difficult. Like Gates says, reducing things to their simplest form is difficult - but fortunately, it's also beautiful. It's also true that the finest software is written by those who love the program and make a deliberate effort to keep it simple - and again, although difficult, being able to model an entire system in your head is pretty awesome!

06 October, 2009

Simple Events In ASP.NET MVC

We've been busy improving our customer facing applications. Our new marketing VP noticed some significant improvements that could be made in the workflow of our customer acquisition funnel.

Making the changes that our marketing VP recommended required some major code changes so we took the opportunity to move our web apps into ASP.NET MVC.

One of the early decisions we made was to put as little business and application logic in our controllers as possible. Finding a place for business logic is straightforward most of the time; it usually belongs in your (domain) Models.

Placing application logic in the right place gets a little more complicated. We don't really have services for stuff like sending out emails, etc. so we have to write classes to do all that work.

Invoking email code, authentication code, etc. right in the controller is an ugly solution. To solve our problem, I came up with a simple eventing mechanism that allows us to move all application code out of our controllers.

Let's first start with what the code in one of the controllers would look like:


[AcceptVerbs(HttpVerbs.Post), ValidateAntiForgeryToken]
public ActionResult Index(IndexModel model)
{
    //code to interact with your model goes here.

    //And now the event code:
    AcmeEvents.Raise(new IndexRanEvent { IndexModel = model });

    //return whatever result fits your action; a view is just a placeholder here.
    return View(model);
}

The great thing about the code above is that all sorts of things could now happen once your index runs; there may be 1 or N subscribers to the event above. Furthermore, you could (at run time) change the code that runs when the event triggers... but I'm getting ahead of myself; we'll look at how that could happen in just a second.

Let's now look at the "raise event code". To do this, I decided to just use Unity's IoC container to resolve the code that should run when the event gets raised:


public static void Raise<T>(T args) where T : IAcmeEvent
{
    foreach(var handler in unityContainer.ResolveAll<Handles<T>>())
    {
        handler.Handle(args);
    }
}

Some assumptions the code above makes: (1) Handles<T> has a Handle method that accepts the event; (2) by the time the Raise method runs, the IoC container must have been initialized to map the different "event types" to the specific implementations that will execute.

How you initialize the container is really not that important: you could do it in code, through a config, or any other method of your choice. What is important, however, is that when you initialize your container, you do it in such a way that code from other assemblies can execute. If you do this, you could literally just drop assemblies in your application and add to or change the behavior of your app.
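Here is a minimal sketch of the pieces the Raise method relies on. It makes some assumptions: a plain list stands in for the Unity container so the example runs on its own, and SignupCompletedEvent, SendWelcomeEmail, and Register are hypothetical names made up for illustration. Only IAcmeEvent, Handles<T>, AcmeEvents, and Raise come from the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Marker interface for events (name taken from the post).
public interface IAcmeEvent { }

// A handler for one event type (name taken from the post).
public interface Handles<T> where T : IAcmeEvent
{
    void Handle(T args);
}

// Hypothetical event, for illustration only.
public class SignupCompletedEvent : IAcmeEvent
{
    public string UserEmail { get; set; }
}

// Hypothetical handler: application logic lives here, not in the controller.
public class SendWelcomeEmail : Handles<SignupCompletedEvent>
{
    public readonly List<string> Sent = new List<string>();

    public void Handle(SignupCompletedEvent args)
    {
        // Real code would send an email; this sketch only records the address.
        Sent.Add(args.UserEmail);
    }
}

// Stand-in for the IoC container: a plain list of handler instances.
// It is not thread-safe; a real container takes care of that for you.
public static class AcmeEvents
{
    private static readonly List<object> handlers = new List<object>();

    public static void Register<T>(Handles<T> handler) where T : IAcmeEvent
    {
        handlers.Add(handler);
    }

    public static void Raise<T>(T args) where T : IAcmeEvent
    {
        // OfType keeps only the handlers registered for this event type.
        foreach (var handler in handlers.OfType<Handles<T>>().ToList())
        {
            handler.Handle(args);
        }
    }
}
```

With this in place, a controller only needs to call AcmeEvents.Raise(new SignupCompletedEvent { UserEmail = "..." }); adding a second handler means registering another class, without touching the controller.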

So there you have it; a simple eventing mechanism to keep your MVC controllers unpolluted from application logic. I'm really happy about coming up with this solution; I hope you find it useful too.

05 August, 2009

Barricading Your Code

You've heard it a million times: "Never trust your inputs". Or maybe you've heard its corollary: "Always sanitize your inputs". Easier said than done, right?

I'd like to show you a way to barricade your code so that you can always sanitize your inputs without littering all of your code with sanitation code.

Wall Street Barricade
The key to writing good sanitation code is controlling where and how you accept external input. Just like the barricade in the image above, traffic can only flow on the sidewalks and not on the street.

Here's how to accomplish the same with code:


public class ClassThatUsesBarricade
{
    public void DoWork(string inputOne, int inputTwo /*, ... */)
    {
        //sanitize inputs
        RealDoWork(inputOne, inputTwo /*, ... */);
    }

    private void RealDoWork(string inputOne, int inputTwo /*, ... */)
    {
        //don't worry about inputs because no external class could have called this.
        //do work
    }
}

By writing code like the code above, you've barricaded your class: data can only flow into your class through the public method. The private method can absolutely, 100% or your money back, trust its inputs without having to worry about sanitizing them. This way, your real work method can be short, and concise - after all, a method should only do one thing, and do it well.
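To make the pattern above concrete, here is a small sketch with actual checks filled in. The class name, the method names, and the validation rules (non-empty name, age between 0 and 150) are all hypothetical; they only show where the sanitation code goes.

```csharp
using System;

public class GreetingBuilder
{
    // Public barricade: every external call enters here and gets validated.
    public string BuildGreeting(string name, int age)
    {
        if (string.IsNullOrWhiteSpace(name))
            throw new ArgumentException("name is required", nameof(name));
        if (age < 0 || age > 150)
            throw new ArgumentOutOfRangeException(nameof(age), "age must be between 0 and 150");

        // Hand only clean, normalized values to the method that does the work.
        return RealBuildGreeting(name.Trim(), age);
    }

    // Behind the barricade: inputs are already clean, so there are no checks here.
    private string RealBuildGreeting(string name, int age)
    {
        return "Hello " + name + ", you are " + age + ".";
    }
}
```

Calling new GreetingBuilder().BuildGreeting("  Ann ", 30) returns "Hello Ann, you are 30.", while an empty name never makes it past the public method.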

If your class has a lot of methods with very similar signatures that do similar things, you might want to reverse the code I showed: you could have a private sanitation method that all public methods that actually do work call. However, that's not really barricading your important methods... at some point you could forget to call the private sanitation method (i.e. set up your barricade), and you'd be screwed.

And there you have it, a code barricade.

07 July, 2009

When to use Class Inheritance in Programming

Last week I listened to a friend of mine complain about some straight-out-of-college programmers he had recently hired: "All of their code has 10 layers of inheritance. It's crazy trying to maintain their code!".

Unfortunately my friend is right: inheritance, although powerful when used correctly, naturally tends to increase complexity.

Since your primary purpose as a programmer is to manage and reduce complexity, you should favor solutions that don't use inheritance.

At this point you're probably asking yourself: When am I supposed to use inheritance? How do I know if I'm using it correctly? Fortunately, the great Steve McConnell has written 4 clear-cut rules on when to use inheritance and when to use containment:
1. If multiple classes share common data but not behavior, create a common object that those classes contain.
2. If multiple classes share common behavior but not data, derive them from a common base class that defines the common routines.
3. If multiple classes share common data and behavior, inherit from a common base class that defines the common data and routines.
4. Inherit when you want the base class to control your interface; contain when you want to control the interface.

In summary, only inherit if the new class truly is-a more specialized version of the base class.

Oh, and one more piece of advice on inheritance: avoid multiple inheritance like the plague.
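A short sketch of rules 1 and 3 side by side may help. All of the class names here (Address, Customer, Supplier, Account, and so on) and the fee rule are hypothetical examples, not something from McConnell's book.

```csharp
using System;

// Rule 1: shared data but no shared behavior, so contain a common object.
public class Address
{
    public string City { get; set; }
    public string PostalCode { get; set; }
}

public class Customer
{
    public Address Address { get; } = new Address();
}

public class Supplier
{
    public Address Address { get; } = new Address();
}

// Rule 3: shared data AND behavior, so inherit from a common base class.
// A savings account truly is-a more specialized account.
public abstract class Account
{
    public decimal Balance { get; private set; }

    public void Deposit(decimal amount)
    {
        if (amount <= 0) throw new ArgumentOutOfRangeException(nameof(amount));
        Balance += amount;
    }

    public abstract decimal MonthlyFee();
}

public class SavingsAccount : Account
{
    public override decimal MonthlyFee() { return 0m; }
}

public class CheckingAccount : Account
{
    // Hypothetical rule: small balances pay a fee.
    public override decimal MonthlyFee() { return Balance < 1000m ? 5m : 0m; }
}
```

Notice that Customer and Supplier never needed a common base class just to share an address, which keeps the hierarchy one level deep.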

15 June, 2009

How To Write Reusable Software: A Lesson From LEGO Blocks

I don't know about the places where you have worked, but where I've worked we haven't had a very good track record for writing code that actually gets re-used. In fact, at one place we even gave up: "Software reuse is overrated" a boss muttered once.

I think the reason we fail so miserably is that we focus too much on the end goal (reusable code), and too little on the qualities that make software reusable. What, then, are the properties of reusable code? Well, that's where the LEGO blocks part comes in.


The LEGO block analogy comes from David West's Object Thinking, but I'd like to highlight a few ideas that make this metaphor so applicable to software development:

  • The interface is simple and intuitive. I've seen my 3 year old daughter build ships with LEGOs. If you want your code to be reused, it should be so simple and so intuitive that others could use it without having to refer to external documentation.
  • There's a finite and small number of LEGO block types. If you find that your library has a high number of classes, think about what you are doing. After all, everything on this planet is made from the same 92 (naturally occurring) elements.
  • LEGOs are cool because you can build stuff. The reason everyone loves LEGO blocks is because they can build whatever they want with them, not just what the box tells them to build. Your software should be the same way; you should be surprised by what end users build with your code. If your code only gets used for what you envisioned, you've failed to write reusable code: people are not able to compose software with your code.
Finally, don't forget that people don't really care about the technical details behind LEGO blocks; no one cares how the plastic is molded, dyed, etc. LEGO blocks aren't reused because of how well they're manufactured; they're reused because they let end users accomplish their goals.

Once people are able to accomplish their goals with your code, they'll be using it over and over again. Guaranteed!

09 June, 2009

Custom Thousand Separator In C#

I guess the NFA requires us to use a space (" ") instead of a comma (",") as our thousands separator. I dunno why they have to regulate something like that, but I had to figure out an elegant way to format our numbers the way the powers that be require them to be formatted.

After struggling with different inelegant solutions for a little while, I finally realized that different countries use different symbols as thousands separators; I knew then that my problem was really a localization problem. A quick search in MSDN led me to the NumberFormatInfo class, which "defines how numeric values are formatted and displayed, depending on the culture" (MSDN).

The NumberFormatInfo class has a NumberGroupSeparator property that defines what string to use to separate thousands.

So, here's how I ended up solving my problem:


public void FormatDouble(double doubleToFormat)
{
    //start from a fresh en-US format so shared culture settings stay untouched
    NumberFormatInfo nfi = new CultureInfo("en-US", false).NumberFormat;
    nfi.NumberGroupSeparator = " ";
    Write(doubleToFormat.ToString("n", nfi));
}

Neat, huh?
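If you want to try it outside of a page or a console app, here is a self-contained variation that returns the string instead of writing it. It differs from the code above in two ways that are my own choices: it clones the invariant culture's format rather than en-US, and it uses "N2" to pin the number of decimals instead of relying on the culture's default. The class and method names are made up.

```csharp
using System.Globalization;

public static class SpaceGroupedNumbers
{
    public static string Format(double value)
    {
        // Clone so the shared, read-only invariant format is left untouched.
        var nfi = (NumberFormatInfo)CultureInfo.InvariantCulture.NumberFormat.Clone();
        nfi.NumberGroupSeparator = " ";

        // "N2" means: group the digits and keep exactly two decimals.
        return value.ToString("N2", nfi);
    }
}
```

For example, SpaceGroupedNumbers.Format(1234567.891) returns "1 234 567.89".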

24 March, 2009

Highly Parallel LINQ To Lucene Indexer

I've been wanting to post about this for a while, but haven't been able because I was away vacationing in Hawaii. :)

I was recently given the assignment to index a couple of our tables that hold large amounts of data. Each table has well over 3 million records, and each record is fairly large as it represents "emails" with our clients.

I knew from the get-go I'd have to write a highly parallel process in order for the task to finish promptly - just bringing the data over the wire from the database takes about 45 minutes.

I have to show a lot of code now, so please bear with me for just a second. The key here is to use several RAMDirectory instances for the small data chunks and merge them into an FSDirectory index that will hold all the data:


using System.Linq;
using System.Threading;

namespace Indexer
{
    public class BatchIndexer<T> : ParallelIndexer<T>
    {
        IQueryable<T> items;
        //volatile because Run() sets it on one thread and Stopped reads it on another
        volatile bool isMergingFinished = false;
        private static object lockObject = new object();

        public override bool Stopped
        {
            get
            {
                //we're only done once the children have stopped AND the merge has finished
                return base.Stopped && isMergingFinished;
            }
        }

        //stop setting the temp directory here and force the client to pass it in
        public BatchIndexer(string path, IQueryable<T> items)
            : base(new Index<T>(path))
        {
            this.items = items;
            QueueChildren();
        }

        private void QueueChildren()
        {
            if (null == items) return;

            //count once; every Count() call is a round trip to the data source
            int itemCount = items.Count();
            if (itemCount == 0) return;

            const int pageSize = 100;
            //round up so a partial last page is included without queuing an empty extra page
            int pageCount = (itemCount + pageSize - 1) / pageSize;

            for (int pageIndex = 0; pageIndex < pageCount; pageIndex++)
            {
                var currentPage = items.Skip(pageIndex * pageSize).Take(pageSize);
                ThreadPool.QueueUserWorkItem(new WaitCallback(QueueChild), currentPage);
            }

            //wait for all children to be added to parent
            while (Children.Count() < pageCount)
            {
                Thread.Sleep(2000);
            }
        }

        private void QueueChild(object stateInfo)
        {
            var currentPage = stateInfo as IQueryable<T>;
            lock (lockObject)
            {
                //each child gets its own in-memory index for its page of data
                AddChild(new BatchIndexerChild<T>(new Index<T>(), currentPage.ToList()));
            }
        }

        public override void Run()
        {
            //start indexing the children
            foreach (var item in this.Children)
            {
                var child = item as BatchIndexerChild<T>;
                ThreadPool.QueueUserWorkItem(new WaitCallback(child.AsyncRun));
            }

            //wait for all of the children to finish indexing
            while (!IsChildrenStopped)
            {
                Thread.Sleep(1000);
            }

            //merge all indices into main index
            this.Merge((from c in this.Children select c.Index.Directory).ToArray());
            isMergingFinished = true;
        }

        protected override void Dispose(bool disposing)
        {
            base.Dispose(disposing);
            if (disposing)
            {
                //release managed resource
                if (this.Index != null)
                {
                    this.Index.Dispose();
                }
            }
        }
    }
}

I particularly like the approach I used because it threads the chunking of the data into smaller pieces AND it also threads the actual indexing of each chunk.
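If you don't have the indexing library at hand, the same shape can be sketched with nothing but the base class library. This is an analogy under stated assumptions and not the indexer above: plain dictionaries stand in for the in-memory and on-disk indexes, Tasks replace the raw ThreadPool calls, and the ChunkedIndexer name and its word-splitting rule are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ChunkedIndexer
{
    // Maps each word to the ids (positions) of the documents that contain it.
    public static Dictionary<string, SortedSet<int>> BuildIndex(
        IReadOnlyList<string> documents, int pageSize)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));

        // Round up so a partial last page is included.
        int pageCount = (documents.Count + pageSize - 1) / pageSize;

        // Each task builds a private index for its own page, so no locking is needed.
        var tasks = Enumerable.Range(0, pageCount)
            .Select(page => Task.Run(() => IndexPage(documents, page * pageSize, pageSize)))
            .ToArray();
        Task.WaitAll(tasks);

        // Single-threaded merge of the small indexes into the main one.
        var merged = new Dictionary<string, SortedSet<int>>();
        foreach (var task in tasks)
        {
            foreach (var entry in task.Result)
            {
                if (!merged.TryGetValue(entry.Key, out var ids))
                {
                    ids = new SortedSet<int>();
                    merged[entry.Key] = ids;
                }
                ids.UnionWith(entry.Value);
            }
        }
        return merged;
    }

    private static Dictionary<string, SortedSet<int>> IndexPage(
        IReadOnlyList<string> documents, int start, int pageSize)
    {
        var index = new Dictionary<string, SortedSet<int>>();
        int end = Math.Min(start + pageSize, documents.Count);
        for (int id = start; id < end; id++)
        {
            var words = documents[id].ToLowerInvariant()
                .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (var word in words)
            {
                if (!index.TryGetValue(word, out var ids))
                {
                    ids = new SortedSet<int>();
                    index[word] = ids;
                }
                ids.Add(id);
            }
        }
        return index;
    }
}
```

The design point is the same as in the real indexer: the parallel part never shares mutable state, and the only step that touches the final index runs on one thread after all the chunks are done.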

If you're going to index as much data as I had to index, however, you still have to pre-chunk the data and feed it into several instances of BatchIndexer; I found that 100,000 records was optimal. You'll also want to make sure you release all database resources ASAP: you don't want to be hogging a whole bunch of memory!

Anyhow, I hope this code is useful... it should at least give you a start in the right direction. Let me know if you have any questions.

22 February, 2009

Factory Pattern

After reading this article, a reader recently requested that I show some sample code for the "factory" mentioned in the article.

Before I give you some sample code, however, let me just write a few words about the Factory Pattern: As the pattern's name suggests, the Factory Pattern creates stuff. Specifically, "factories" create objects without coupling you to the concrete classes of the objects they create.

Here is some sample code:

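public abstract class CreditCardBase
{
    //this is the base class that all credit cards implement:
    //it could be an abstract class, or it could have some functionality;
    //it could also be an interface... whatever floats your boat.
    public abstract void Run();
}

public class Visa : CreditCardBase
{
    public override void Run() { /* Visa-specific processing goes here */ }
}

public class Amex : CreditCardBase
{
    public override void Run() { /* Amex-specific processing goes here */ }
}

public static class CreditCardFactory
{
    public static CreditCardBase GetCreditCard(object someObject)
    {
        //business logic that returns the correct credit card type goes in here.
        //Equals compares values; == on an object would only compare references.
        if ("something".Equals(someObject))
            return new Visa(); //visa of course implements CreditCardBase
        else
            return new Amex(); //also implements CreditCardBase
    }
}

public class CreditClassClient
{
    //this class consumes a credit card
    public void DoWork()
    {
        //stand-in for whatever business object drives the decision
        object bizObject = "something";
        CreditCardBase myCreditCard = CreditCardFactory.GetCreditCard(bizObject);
        //do stuff with myCreditCard
        myCreditCard.Run();
    }
}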


And there you have it. That's pretty much all there is to it, and the code is pretty self-explanatory.

There's one more thing that's worth mentioning: usually, classes returned by factories have private constructors - to avoid people instantiating the class without calling the factory. This, however, is problematic for several reasons:
  1. You may have clients that already instantiate your class.
  2. People usually turn classes with private constructors into singletons (please don't).
  3. How do you extend a class that has a private constructor? :)
Finally, to the anonymous reader that asked for the factory code: let me know if this was useful.

10 February, 2009

How Not To Lose Money In The Stock Market

One of the most important rules to remember when investing in the financial markets is to Preserve Your Capital! Not losing money is the first step before you can make any money.

And although I haven't figured out a technique that guarantees 10% yearly gains, I have figured out a pretty good technique for not losing money in major market drawdowns. Allow me to show you a chart of one of the ETFs I like to use to trade:


The chart above shows the daily price for QLD, along with the 50-day exponential moving average (in red) and the 10-day exponential moving average (in blue). At the bottom of the chart, you can also see the daily trading volume.

What's interesting in the QLD chart is that during the past year every major market drop has been preceded by a clear signal: the 10-day EMA crosses the 50-day EMA (heading down, of course). And this, even when the market has faced major volatility and huge daily drops.

You can go back as far as you like, and you'll find that the signal above always holds true. And this is not because I'm so awesome and I've figured out some magic formula - no, it holds true because of the very definition of EMA. The EMA lines just show you weighted averages of the stock price over roughly the last 50 and 10 days respectively, with recent days counting the most; when prices start falling, the 10-day line reacts faster and drops below the 50-day line.

Of course this technique isn't perfect: you will lose some money before the 10-day EMA crosses the 50-day EMA; but at least you won't lose more than 50% of your money, like several people have in the past year.
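For the programmers in the audience, here is a minimal sketch of how the crossover could be detected in code. It assumes the common EMA definition with a smoothing factor of 2 / (period + 1), seeded with the first price; charting packages may seed the average differently, so the numbers may not match a given chart exactly. The EmaSignals class and its method names are made up.

```csharp
using System;
using System.Collections.Generic;

public static class EmaSignals
{
    // EMA with smoothing factor 2 / (period + 1), seeded with the first price.
    public static double[] Ema(IReadOnlyList<double> prices, int period)
    {
        if (period <= 0) throw new ArgumentOutOfRangeException(nameof(period));

        var ema = new double[prices.Count];
        if (prices.Count == 0) return ema;

        double alpha = 2.0 / (period + 1);
        ema[0] = prices[0];
        for (int i = 1; i < prices.Count; i++)
        {
            // Today's price gets weight alpha; everything before it decays.
            ema[i] = alpha * prices[i] + (1 - alpha) * ema[i - 1];
        }
        return ema;
    }

    // Indices of the days on which the fast EMA drops below the slow EMA.
    public static List<int> BearishCrossovers(
        IReadOnlyList<double> prices, int fastPeriod, int slowPeriod)
    {
        var fast = Ema(prices, fastPeriod);
        var slow = Ema(prices, slowPeriod);

        var crossings = new List<int>();
        for (int i = 1; i < prices.Count; i++)
        {
            if (fast[i - 1] >= slow[i - 1] && fast[i] < slow[i])
            {
                crossings.Add(i);
            }
        }
        return crossings;
    }
}
```

With daily closing prices in a list, EmaSignals.BearishCrossovers(prices, 10, 50) gives you the days on which the 10-day line crossed below the 50-day line.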

Disclaimer: Don't take financial advice from a guy that writes software for a living (i.e. me). Also, don't take advice from anyone that makes a living selling financial advice (i.e. your financial advisor). Those guys don't make money in the markets; they make money making you feel good about losing 50% of your capital.

05 February, 2009

ASP.NET MVC Release Candidate Now Available!

The release candidate for Microsoft's ASP.NET MVC is finally here.

Sure, this is just another "me-too" Microsoft product: we've had Rails, Cake, and Grails - but this is great news nonetheless! The ASP.NET web forms model is one that I never really understood: Why would you try to imitate a Windows form on the web? Anyhow, web forms have been long overdue for a replacement, and now MVC provides a great architectural pattern for building web apps with the .NET framework.

Scott Gu has written an in-depth article describing some of the most important improvements on this RC. And, from what Phil Haack said on his blog about this release, I get the feeling there's just a few minor bugs that need to be addressed before the RTM. You could probably start using the RC now without any concerns. In fact, I have a co-worker that has been writing a site on MVC since beta with no problems at all. Another co-worker mentioned to me today how much more productive he was under the MVC model than the WebForms model.

If you're still unsure about ASP.NET MVC, go check out stackoverflow.com. StackOverflow was written on the beta release of the MVC stack, and it's an awesome site.

Now, if you're still unsure, just trust me on this one: put your WebForms on the ground, back away slowly! I don't want anyone getting injured here. :)

26 January, 2009

Software Engineering: There's nothing like it!

Intro


Seriously, there's no other type of engineering like software engineering. And that may or may not be a good thing: it just depends on what you decide to focus on.

I went to school in the same building where the mechanical engineers, chemical engineers, and civil engineers went to school - yes, I spent almost every waking hour of my college life in the infamous Clyde Building:


Yes, for those of you that are wondering, that building has no windows. I'm pretty sure they designed it that way so that your soul would be crushed within the first semester of your career. But that's beside the point: You want to know what I learned in 4 years as an EE at BYU? Well, for the most part, I learned a whole bunch of math... oh yea, and I also learned that writing software isn't like making bridges, roads or "stuff".

And here's why software engineering is so different: software is cheap to make (Yeah, that's right; even though you think you're making big bucks by writing software, you're still cheap compared to what goes into creating a new car design). Oh, and there's something else: writing software is much harder than building bridges.

Software is hard


See, we can't really prove software will "work". Proving that an arbitrary program works according to its specs runs straight into the halting problem: in general, given a program and an input, there's no algorithm that can even tell us whether the program will terminate!

Civil engineers, on the other hand, have it easy: given a bridge and environment conditions (load, winds, etc.), they can tell you whether the bridge will stand or not. In fact, it's so easy to "model" bridges that we software engineers have written programs to tell you whether you've built a good bridge or not.

Software is also harder because as long as the program "works", it doesn't matter what the source code may look like. Sure, some code may be harder to maintain than other code, but how do you tell? There's no metrics for code design; we only have smells.

Software is cheap


Let's now move on to why software is cheap. If you write a program that doesn't work according to specs, it's not a big deal: you just go back to Notepad, change a little bit of text and (perhaps) recompile. Even software distribution is cheap: when was the last time you got software on a CD? We could spend a lot of time and effort finding bugs and/or proving our program works correctly, but why would we do that? After some point, it really doesn't make sense to spend money finding and solving bugs that may never surface in normal use.

On the other hand, if you screw the pooch on a bridge you're toast: it's hard to fix, and it was really expensive to build (not to mention how much time it actually takes to build). This is why civil engineers plan, model, test and then finally, after a really long time, build. We just build because it's fast to build. How awesome is that?!

Conclusion


Now you know why you're much better than all those other engineers out there. So go rub it in their faces whenever you feel like it. :p

23 January, 2009

What do programmers do?

Have you ever been asked what you do for work? I have, and it's always hard for me to answer: if I say I'm a computer programmer people immediately think I can fix their printer - or worse, they want me to build them a dating website for five hundred bucks.

The truth is that what we "computer scientists" do is more along the lines of what sorcerers do. Hal Abelson describes it really well:

Computational processes are abstract beings that inhabit computers. As they evolve, processes manipulate other abstract things called data. The evolution of a process is directed by a pattern of rules called a program. People create programs to direct processes. In effect, we conjure the spirits of the computer with our spells.

A computational process is indeed much like a sorcerer's idea of a spirit. It cannot be seen or touched. It is not composed of matter at all. However, it is very real. It can perform intellectual work. It can answer questions. It can affect the world by disbursing money at a bank or by controlling a robot arm in a factory. The programs we use to conjure processes are like a sorcerer's spells. They are carefully composed from symbolic expressions in arcane and esoteric programming languages that prescribe the tasks we want our processes to perform.

A computational process, in a correctly working computer, executes programs precisely and accurately. Thus, like the sorcerer's apprentice, novice programmers must learn to understand and to anticipate the consequences of their conjuring. Even small errors (usually called bugs or glitches) in programs can have complex and unanticipated consequences.

Fortunately, learning to program is considerably less dangerous than learning sorcery, because the spirits we deal with are conveniently contained in a secure way. Real-world programming, however, requires care, expertise, and wisdom. A small bug in a computer-aided design program, for example, can lead to the catastrophic collapse of an airplane or a dam or the self-destruction of an industrial robot.

Master software engineers have the ability to organize programs so that they can be reasonably sure that the resulting processes will perform the tasks intended. They can visualize the behavior of their systems in advance. They know how to structure programs so that unanticipated problems do not lead to catastrophic consequences, and when problems do arise, they can debug their programs. Well-designed computational systems, like well-designed automobiles or nuclear reactors, are designed in a modular manner, so that the parts can be constructed, replaced, and debugged separately.


Although I'm attributing the quote to Abelson, the text above comes from the book Structure and Interpretation of Computer Programs, written by Abelson, Sussman, and Sussman. The reason I'm attributing the quote to Abelson is because I first came across this by watching one of his MIT lectures on the same topic.

If you're interested in watching MIT's introductory "Structure and Interpretation of Computer Programs" class, you can find the videos here.

And now, go conjure some spirits!

15 January, 2009

The Perils of Sushi Bars

This is the story of how my credit card was almost stolen by a Samurai in a sushi bar in Salt Lake City. If you're interested, read on.

About 2 months ago I went to a sushi restaurant here in Salt Lake City. After paying our bills and while we were walking out, Kevin (one of my co-workers) noticed that his receipt had his full credit card number printed on it. Rightly concerned, he immediately talked with the restaurant's manager and asked to see the restaurant's copy of his receipt - sure enough, the restaurant was storing hundreds of credit card numbers in their little dinky cash register.

The whole incident got me thinking about credit card transactions. Every time I buy something with a credit card it's like I'm handing my wallet over to the clerk and saying: "Here's my wallet. Take whatever you need and just hand it back when you're done." Who knows what the POS machine is doing with my credit card number! And that's the least of my concerns: What about the lame payment gateway that stores millions of credit card numbers on servers that will likely be hacked in the next 5 years?

Interestingly enough, however, the whole credit card thing seems to work remarkably well. I mean, there's little risk in the sushi bar overcharging; after all, I know where they're at, and I also have the credit card company on my side: one call to them and the restaurant will have to pay the bill. As far as the payment gateway is concerned, even though there's the PCI standard, when they get hacked we all pay the cost (either through higher interest rates, more taxes, etc.)... that's why I'm really concerned about them - I worked at a place where we were PCI compliant, but there were still gaping holes in our security model.

This particular sushi story, however, has a happy ending. We recently went back to the restaurant and were pleasantly surprised to see that our full numbers were no longer being printed on receipts. It's good to see that non-IT folks are finally starting to get the importance of securing information.

08 January, 2009

Steve Yegge Is My Hero

I regularly read Joel Spolsky's blog, Jeff Atwood's blog, and Steve Yegge's blog. I really like them all, but Steve's is probably the most thought-provoking blog out of the three.

Recently, for example, he had a post where he discussed (at length) a very interesting topic:

Imagine Mario, and his brother Luigi, having a conversation in which Mario asks Luigi what's on the other side of the boundary to their Mario Kart universe. Interesting, huh?

It's interesting because to Mario & Luigi, stuff on the "other" side of their universe boundary is "undefined"; it's undefined in the sense that anything could be there. In fact, to some extent, it doesn't even make sense for them to ask what's on the other side; they simply cannot define what's on the other side as long as they reside inside the Mario Kart universe.

On the other hand, programmers living outside the Mario Kart universe can easily define what's outside the Mario Kart universe: all they have to do is inspect the memory contents of the system running Mario Kart, and take a peek at what's in there, on the "other" side.

I enjoyed Steve's post on embedded systems because it helped me make sense of the embedded system I live in. I now understand that anything could be out there; it's truly undefined.

Lucene.net UnauthorizedAccessException

Unless you specifically tell Lucene.net where to put its LOCK files (for the index readers and writers), it will put them in your OS's default temp directory (normally C:\Windows\TEMP\). This choice of location may be problematic.

As you can imagine, there are plenty of scenarios under which Lucene.net may be running under an identity that does not have enough access to your temp directory. Under such scenarios, you'll get an UnauthorizedAccessException thrown at you; and your code may not even be around to handle it.

The solution for this problem is easy: add a setting to your application's config file that tells Lucene.net which folder to use for its lock files.
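A sketch of what that setting might look like. Two assumptions here: the appSettings key name Lucene.Net.lockdir is the one I believe the 2.x releases of Lucene.net read, and the folder path is purely a placeholder. Double-check the key against the version you're running.

```xml
<configuration>
  <appSettings>
    <!-- Assumed key name; the value is a placeholder for a folder your process can write to. -->
    <add key="Lucene.Net.lockdir" value="C:\MyApp\LuceneLocks" />
  </appSettings>
</configuration>
```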





After doing this, make sure "Everyone" has full control of the folder you've chosen for Lucene.net to use as its LOCK file folder.

You should also consider properly disposing your indices when your process finishes so that all locks are released in a timely manner. Unfortunately, I can't tell exactly how to do that - you'll just have to figure it out.

Finally, if you're reading this for some other reason than having the exception thrown at you, you should really consider using Lucene. It's great!

06 January, 2009

2008 Reflections

Man, it feels good to be back to blogging! It was an awesome Christmas break, but I must admit I kept coming up with ideas I wanted to blog about.

Anyhow, '08 was a great year for me: I moved to a great company and started blogging regularly. I also learned a lot of LINQ, some Ruby, and had some great personal moments.

As a recap of 2008, I wanted to share my favorite posts so far:
The Secret Behind SQL
Easy Background Tasks in ASP.NET
Why Partial Classes Are Wrong!

I also wanted to thank all of you that regularly read my blog. There's a few of you out there that are loyal readers, checking often whether I have new content or not; knowing that you're reading makes me much more critical of what I write. Thanks for your time!

Finally, here's an interesting graph of traffic on this blog:



You see that spike right around Nov. 17th, 2008? That, my friends, comes from one small comment buried among hundreds of comments in Jeff Atwood's blog. Isn't that crazy? That blog is the black hole of developer effort; I swear every developer on planet earth goes there to waste their time... I can't find any other explanation for getting so much traffic from just one comment. Thanks Jeff! And thanks to everyone that followed the link and took time to read what I had to say.

May 2009 be a great year for you.