Project Darkstar is touted as enabling online game developers to deal successfully with client failures, machine-room disasters, and race conditions. Tell us about those.
I'll address race conditions, which produce dupe bugs, first. The online game community has its own language for bugs, describing them from the players' perspective rather than in terms of what causes them. For instance, almost every online game has what's called a dupe bug, which refers to the ability to put a game object in two places at once. So, for instance, an object could be both on the ground and in my pocket simultaneously, or on the table and on the floor. That's why they are called duplication bugs.
From the point of view of a database programmer, that's a break in referential integrity. We know exactly what causes it: It's because the systems aren't transactional on the back end, so they can temporarily end up in states in which the referential integrity is not preserved. Darkstar is built on a transactional event-processing system that prevents this from happening.
Give us some details on how dupe bugs occur.
This is an oversimplified example, but imagine I engage in the act of giving you a dollar.
You might code it in a way similar to this:
1. Add $1 to your wallet. Increase your money variable by one.
2. Remove $1 from my wallet. Decrease my money variable by one.
What happens if the machine does step one and then it crashes? It never finished. So it added a dollar to you but didn't take a dollar from me. We've just duplicated a dollar.
You might think you could reverse the order, but that doesn't help. If we do it this way:
1. Remove $1 from my wallet.
2. Add $1 to your wallet.
If we crash between steps one and two, a dollar vanishes to the twilight zone! That's not any better.
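To make the two failure cases concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `wallets` dictionary, `give_dollar`, and the `SimulatedCrash` exception that stands in for the machine dying are invented for illustration. It is not code from Darkstar.

```python
class SimulatedCrash(Exception):
    """Hypothetical stand-in for the machine dying partway through the trade."""


def give_dollar(wallets, giver, receiver, add_first, crash_between_steps):
    """Move $1 in two separate, non-atomic steps, optionally crashing in between."""
    if add_first:
        wallets[receiver] += 1   # step one: add to the receiver
    else:
        wallets[giver] -= 1      # step one: remove from the giver

    if crash_between_steps:
        raise SimulatedCrash("died between step one and step two")

    if add_first:
        wallets[giver] -= 1      # step two: remove from the giver
    else:
        wallets[receiver] += 1   # step two: add to the receiver


def total_after_crash(add_first):
    """Run the trade with a crash in the middle and count the money left in the world."""
    wallets = {"me": 1, "you": 0}
    try:
        give_dollar(wallets, "me", "you", add_first, crash_between_steps=True)
    except SimulatedCrash:
        pass  # the half-finished state is what survives the crash
    return sum(wallets.values())


duplicated_total = total_after_crash(add_first=True)   # add first: 2 dollars exist
vanished_total = total_after_crash(add_first=False)    # remove first: 0 dollars exist
print(duplicated_total, vanished_total)
```

The world starts with one dollar in it. Neither ordering leaves one dollar behind after the crash, which is the point: reordering the steps only changes which way the books are wrong.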
Such a trade is what computer scientists call an atomic action. We can't do just half of it and preserve a sensible situation -- that's what we mean when we talk about referential integrity, that the data is in a sensible state.
Transactional processing allows you to group multiple actions together and call them atomic, so that either they all happen or none of them do.
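The all-or-nothing idea can be sketched in a few lines of Python. This is an in-memory illustration only, and it rests on assumptions: the `transaction` helper, the `wallets` dictionary, and `SimulatedCrash` are hypothetical names, and a failure is modeled as an exception. A real transactional back end has to survive actual machine crashes, which requires durable storage; this sketch does not attempt that.

```python
from contextlib import contextmanager


class SimulatedCrash(Exception):
    """Hypothetical stand-in for a failure partway through the trade."""


@contextmanager
def transaction(wallets):
    """Apply every change made inside the block, or none of them."""
    snapshot = dict(wallets)          # remember the state before the trade
    try:
        yield wallets
    except Exception:
        wallets.clear()               # roll back: discard the partial changes
        wallets.update(snapshot)
        raise


def give_dollar_atomically(wallets, giver, receiver, crash_between_steps=False):
    with transaction(wallets):
        wallets[receiver] += 1
        if crash_between_steps:
            raise SimulatedCrash("failed between the two steps")
        wallets[giver] -= 1


wallets = {"me": 1, "you": 0}

try:
    give_dollar_atomically(wallets, "me", "you", crash_between_steps=True)
except SimulatedCrash:
    pass
after_failed_trade = dict(wallets)       # unchanged: {'me': 1, 'you': 0}

give_dollar_atomically(wallets, "me", "you")
after_successful_trade = dict(wallets)   # fully applied: {'me': 0, 'you': 1}

print(after_failed_trade, after_successful_trade)
```

In both outcomes exactly one dollar exists. The failed trade looks as if it never started, and the successful one looks as if both steps happened at once.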
A sudden crash such as I described earlier is rare, but I used it as a simple illustration. Multithreaded code sometimes has what are called race conditions, which are bugs that result from doing two incompatible things at once. I can demonstrate a race condition by breaking the action down and adding a thief in the crowd while I'm giving you a dollar.
The action looks something like this:
1. Get how much money I have as X.
2. Get how much money you have as Y.
3. If (X < 1), stop. I don't have any money to give.
4. Set your money to Y + 1.
5. Set my money to X - 1.
Let's say I only have one dollar in my wallet. The thief's pickpocket algorithm looks like this:
A. Get how much money I have as TX.
B. Get how much money the thief has as TY.
C. If (TX < 1), then stop. I don't have any money to steal.
D. Set my money to TX - 1.
E. Set the thief's money to TY + 1.
Each of these actions works well alone, but if they occur simultaneously, problems can arise. Let me walk through what happens to X, Y, TX, and TY when the two routines race.
Imagine I have one dollar in my wallet, everyone else has zero, and the following things happen:
1. Get how much money I have as X. X now equals 1.
A. Get how much money I have as TX. TX now equals 1.
2. Get how much money you have as Y. Y now equals 0.
B. Get how much money the thief has as TY. TY now equals 0.
3. If (X < 1), stop. X is 1, so I go on.
C. If (TX < 1), then stop. TX is 1, so the thief continues.
4. Set your money to Y + 1. You now have one dollar.
D. Set my money to TX - 1. I now have zero dollars.
5. Set my money to X - 1. X is still 1, so I get set to zero dollars again.
E. Set the thief's money to TY + 1. The thief now has one dollar.
At the end, I have zero dollars, but you and the thief each have one dollar, so there are now two dollars in the world. We just duplicated a dollar!
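The interleaving walked through above can be replayed deterministically. This Python sketch uses no threads. It simply executes the giver's and the thief's steps in the order described, with a hypothetical `money` dictionary standing in for the game state.

```python
def replay_race():
    """Replay the gift and the theft step by step in the interleaved order."""
    money = {"me": 1, "you": 0, "thief": 0}

    x = money["me"]            # giver reads my balance:    X = 1
    tx = money["me"]           # thief reads my balance:    TX = 1 (same dollar)
    y = money["you"]           # giver reads your balance:  Y = 0
    ty = money["thief"]        # thief reads own balance:   TY = 0

    if x < 1:                  # giver's check passes, X is 1
        return money
    if tx < 1:                 # thief's check passes too, TX is a stale copy
        return money

    money["you"] = y + 1       # you now have 1
    money["me"] = tx - 1       # thief's write: I now have 0
    money["me"] = x - 1        # giver's write: X is still 1, so I am set to 0 again
    money["thief"] = ty + 1    # thief now has 1
    return money


final_money = replay_race()
total_dollars = sum(final_money.values())
print(final_money, total_dollars)
```

Both routines based their writes on a balance that was read before the other routine changed it. Each check passed against the same single dollar, so that dollar was spent twice.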
Transactions enforce isolation. They have rules to make sure races such as these can't happen. Race conditions are the number-one cause of dupe bugs.
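One simple way to get isolation in a small program is to make each routine's read-check-write sequence run as a unit under a lock. The sketch below assumes that approach purely for illustration. The source does not describe the mechanism Darkstar itself uses, and `money`, `money_lock`, and `move_dollar` are hypothetical names.

```python
import threading

money = {"me": 1, "you": 0, "thief": 0}
money_lock = threading.Lock()   # guards every read-check-write sequence on money


def move_dollar(source, destination):
    """Move $1 if the source has it; the whole sequence runs without interleaving."""
    with money_lock:
        if money[source] < 1:
            return False         # nothing to give or steal
        money[destination] += 1
        money[source] -= 1
        return True


results = {}


def run(name, source, destination):
    results[name] = move_dollar(source, destination)


threads = [
    threading.Thread(target=run, args=("gift", "me", "you")),
    threading.Thread(target=run, args=("theft", "me", "thief")),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

total_dollars = sum(money.values())
print(money, results, total_dollars)
```

Whichever thread gets the lock first moves the dollar, and the other sees an empty wallet and stops. Exactly one of the gift and the theft succeeds, and the world still contains one dollar.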