Saturday, February 20, 2016

I tried to set up remote backups, now I'm running a scientific experiment.

You know how programmers trot out this video to explain what dealing with computers is actually like?


This is one of those stories.

For a while now, I've had in mind to set up remote backups. I have duplicity set up, which successfully creates backups, but they're stored on an external hard drive. For better coverage, I wanted to also send them to S3 (duplicity makes encrypted archives with gpg, mind you). Well, I feel a bit uneasy about running the S3 uploader, which is the boto library, as root (I couldn't find any indication one way or the other as to whether duplicity drops privileges during the upload phase, but I'm assuming not). The reason I don't just run all of duplicity as another user is that this other user would then need read access to everything on my system. Even if I were to only back up my main user's home directory, I don't like the idea of hoping I never accidentally give an important file permissions that make it unreadable to the backup user.

So I decided that, rather than run the built-in duplicity-to-S3, I would just write the backups to the external drive, and then sync that to S3. That requires that I change the group ownership of all of these archives so that my special backup_uploads user can read them. I had run through many iterations of which folders to back up, how often to do incremental backups, how often to do full backups, and how often to make the backups remote. As it turns out, even with my reasonably fast Internet connection (Sonic, fiber to the node, albeit in a remote area in Berkeley), I can't dream of backing up anywhere close to all my files, all the time. So I came up with an alternate idea for the big, rarely changing stuff (photos and videos) - a scheme involving two external drives that I would swap out with a friend or family member. I would then shrink down what I was uploading to S3 to something manageable. So my job just got more complicated, and this is why backups are awful.

That's when I made a mistake in writing my shell scripts, which I had restructured to make it easy to have different parameters for each folder. Suffice it to say, a variable that one of my bash scripts was expecting never got passed in, and I ended up running chgrp -R backup_uploads on the root of the file system, for a second or two before I figured it out. For those unfamiliar, this means I screwed up the group ownership of a lot of my system files. So, it was time to reinstall my system. Fortunately, I had been making backups! (I think this properly qualifies as ironic, right?) I had made backups of most of it, anyway. The backups were a few days old. I did a lot of rsyncing and diffing and chgrp'ing to get it all back in sync, and I have the data that I think I want to restore, ready to go.

This was a great opportunity to confirm that my backups do indeed work. It was also a great way to learn about safe shell scripting. The "don't" option in that link, at this point, is the advice that makes the most sense to me. I will redo all this in Python now. But if I must, now I know how to have bash safely fail if there's a missing variable (as was the case here), among other potential problems. But there was a third reason this was a fortunate mistake, though not in any way pleasant.

All data squared away, I was ready to install Ubuntu server again on my mini server. I downloaded the Ubuntu Server iso, the SHA256SUM file, the SHA256SUM signature file, and the relevant PGP keys. Everything checked out, and I found the fingerprints of said keys on Ubuntu's site, and on Stack Overflow. I'm as sure as I can be without meeting somebody in person.

I flash the installer disk image onto a spare USB stick. I put it into my mini server. It boots up. I run the integrity check on the installer disk, because why not. FAIL. efi.img was corrupted. That's interesting. I went back to my laptop with my USB stick, mounted my iso locally, and diffed the hex dumps. Sure enough, a single bit had flipped. That could be a fluke, or a cosmic ray, but I'm not taking chances on defective hardware. I labeled the USB stick as "to zero out and throw away", and put it aside.

The next day I went to Walgreens and picked up a brand new USB stick. I come home, flash the installer disk onto it. Plug it into my mini-server and.... integrity check passes. Okay, cool. It was late, so I went to sleep. The day after, before commencing with the install, I did a mem test, because why not. After a half hour, I realized it would be very slow, so I decided to cancel it and just do it overnight after installing. I ran the integrity check again, because why not. FAIL. Same file as before. I bring it to my laptop, the same bit has changed! Again, this was now on a separate, and brand new USB stick.

I pulled out the original USB stick, and did a hex dump to confirm that it was indeed the same bit that had changed on both disks. However, while it was just sitting there, as best I can tell, more bits were now broken on the old USB stick. So, it seems as though the disks have somehow been made defective by this process.
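The hex dump comparison can also be done mechanically. What follows is only a minimal Haskell sketch of the idea, not what was actually run: the names diffBits and flippedBitCount are made up for illustration, and it works on bytes that are already in memory, so reading the ISO and the USB image is left out.

```haskell
import Data.Bits (popCount, testBit, xor)
import Data.Word (Word8)

-- | One differing bit: (byte offset, bit index), where bit 0 is the
-- least significant bit of that byte.
type BitDiff = (Int, Int)

-- | List every bit position at which two byte sequences differ.
-- Only the common prefix is compared; a length mismatch is not reported.
diffBits :: [Word8] -> [Word8] -> [BitDiff]
diffBits xs ys =
  [ (offset, i)
  | (offset, x, y) <- zip3 [0 ..] xs ys
  , let delta = x `xor` y   -- set bits mark the positions that differ
  , delta /= 0
  , i <- [0 .. 7]
  , testBit delta i
  ]

-- | Total number of flipped bits between the two sequences.
flippedBitCount :: [Word8] -> [Word8] -> Int
flippedBitCount xs ys = sum (zipWith (\x y -> popCount (x `xor` y)) xs ys)
```

Running diffBits over the good image and each stick's image would show directly whether the same offset and bit are affected on both sticks.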

At this point I'm sort of at a loss. I re-flashed the iso onto the new USB, and ran a mem test on the mini-server that night as originally planned, since its memory is a likely culprit at this point, but it came up fine with three passes. What I have remaining is to run a mem test on my laptop. I gave my laptop the "spill test" a couple months back, maybe that screwed something up, but it is a Lenovo and it's built to withstand such stupidity.

Beyond that, I've decided to buy six identical brand new USB sticks, to run an experiment. I will burn the ISO to 3 USB sticks, right from my mini box, and likewise 3 on my laptop. I will not plug them into anything else. Then I wait, and maybe restart my computers a few times. Hopefully one of them will break again, and I know which computer to blame. Otherwise I have no idea how to check what hardware of mine is busted.

So here I am, I started with wanting to run backups, now I'm conducting a scientific experiment. I don't understand how people handle computers.

Saturday, July 16, 2011

Haskell - Can your language's type system do this?

Haskell is known for its very strict, and yet very versatile type system. There's so much to learn about it, and I'm barely scratching the surface myself. But I've thought of a simple example of something impressive that I think people who know nothing about the language could still appreciate, and hopefully it will convince you to check it out.

Note: There is something in Haskell called "Type level programming". I'll leave you to look it up and decide whether what I'm doing here is an example of this or not. I never figured that out. But what I'm about to show you is at the very least something you probably can't do in your language, unless your language is even cooler than Haskell.

Let's start with this simple program:

data Purse a b c = Purse a b c


data Keys = Keys

data Phone = BlackBerry | IPhone | Android

data Wallet = LeatherWallet Float | MoneyClip Float


addKeys (Purse ( ) b c) newkeys = Purse newkeys b c

addPhone (Purse a ( ) c ) newphone = Purse a newphone c

addWallet (Purse a b ( )) newwallet = Purse a b newwallet


emptyPurse = Purse ( ) ( ) ( )


leaveHouse :: (Purse Keys Phone Wallet) -> ( )

leaveHouse purse = ( )


In short, this program makes sure that you add your keys, a phone, and a wallet to your purse before you leave the house. However, unlike most languages, your compiler will make you do this, rather than the runtime.

So first, let me decipher this thing for you.

data Purse a b c = Purse a b c


This creates a data type called Purse. It is a parametric data type, meaning that it takes other types as parameters, in this case 3 different ones. This is very similar to Java or C++, with something like Vector< int >. a, b, and c can be anything, at least for now. On the right side of the equal sign, we describe what it requires to construct a value of type Purse. This also takes three parameters, however these parameters are values, not types, and the values must be of types a, b and c respectively. (Note that in Haskell, it's fairly common for a type and the constructor of its values to have the same name. It might seem confusing, but if you understand the context they're each used in, it's unambiguous which is meant. So here, on the left, Purse is the type, or more properly the type constructor, but on the right, Purse a b c describes the value, and there Purse is the data constructor.)

This is basically like a struct in C. So in all, we have something like a templated struct in C++, with one data member of each parametric type. For example, based on this declaration, you could create a variable of type Purse Int Float Int that had the value Purse 1 2.3 4 or Purse 2 5.7 8. You could also (in the same program) create a variable of type Purse Int Char Char, and it could hold the value Purse 5 'a' '%' .

data Keys = Keys

data Phone = BlackBerry | IPhone | Android

data Wallet = LeatherWallet Float | MoneyClip Float


Phone and Wallet are also datatypes. They are not parametric, however they have a different feature. The variables of these types can each take on multiple forms. A Phone variable can either have the value BlackBerry, IPhone, or Android. Very similar to an enum in C, they basically are constants, and variables of type Phone are restricted to them. Wallet however, is a bit different. In addition to having multiple forms, it also has a Float parameter. If you remember Purse variables had parameters of unspecified types, however in this case we specify that it must be a Float. Keys is simplest of all, it can only have one value, Keys, with no parameters.

addKeys (Purse ( ) b c) newkeys = Purse newkeys b c


This is a function definition. It takes two parameters. Similar to def addKeys(purse, newkeys): in Python. However, Haskell uses something called pattern matching. You can find this in Erlang, and I'm sure a good handful of other languages as well. What this generally does is allow you to define a function multiple times, for different input, as long as every definition has the same type signature. However in this case we're only doing one, because we just want to use it to (partially) define the type signature. The first parameter of the Purse value has to be ( ), which is of type ( ). This is in some ways like None or NULL in other languages, however it's its own type, with only one value. Variables b and c can be any value of any type. What it returns is a new Purse, with newkeys as its first parameter, rather than ( ). The other parameters depend on what's passed in. This also means that the new Purse has a different type than the first one: Purse Keys b c (when newkeys is of type Keys), rather than Purse ( ) b c

Similar in Python would be:

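from copy import copy

def addKeys(purse, newkeys):
    # Only an empty keys slot may be filled; this is checked at runtime.
    if purse.a is None:
        newpurse = copy(purse)
        newpurse.a = newkeys
        return newpurse
    raise ValueError("purse already has keys")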


The difference being that the Haskell version restricts the type of purse, the consequences of which we'll see in a bit.

addPhone (Purse a ( ) c ) newphone = Purse a newphone c

addWallet (Purse a b ( )) newwallet = Purse a b newwallet


Exactly the same as addKeys, except that the ( ) is in a different slot, appropriate to where the new phone or wallet should go. Again, note that the type of purse returned is different from the one entered, and that the type entered is partially parametric. For instance, for addPhone, c can either be of type Wallet or of type ( ) . Actually, again it could really be of any type, but if the purse was created with the functions defined in this program, it would be one of those.
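To make the "any type" remark concrete, here is a sketch of the same program with the signatures written out by hand. The original relies on inference, which also leaves the new item unconstrained, so addKeys would accept any value at all. The signatures below pin each function to the intended item type; that tightening is an assumption about the intent, not part of the original program.

```haskell
data Purse a b c = Purse a b c

data Keys = Keys
data Phone = BlackBerry | IPhone | Android
data Wallet = LeatherWallet Float | MoneyClip Float

-- Without a signature GHC infers the most general type,
--   addKeys :: Purse () b c -> k -> Purse k b c
-- so anything could go into the keys slot. The signatures below are an
-- added restriction, not something the original program contains.
addKeys :: Purse () b c -> Keys -> Purse Keys b c
addKeys (Purse () b c) newkeys = Purse newkeys b c

addPhone :: Purse a () c -> Phone -> Purse a Phone c
addPhone (Purse a () c) newphone = Purse a newphone c

addWallet :: Purse a b () -> Wallet -> Purse a b Wallet
addWallet (Purse a b ()) newwallet = Purse a b newwallet
```

The type variables that remain (b and c in addKeys, for example) are what let the same function accept a purse regardless of what the other slots hold.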

emptyPurse = Purse ( ) ( ) ( )


This is defining a value called emptyPurse. It is a Purse of type Purse ( ) ( ) ( ) and of value Purse ( ) ( ) ( ) .

leaveHouse :: (Purse Keys Phone Wallet) -> ( )

leaveHouse purse = ( )


Finally, we have a function with an explicit type signature. Until now we've been taking advantage of type inference. However, here we want to force leaveHouse to have a specific type signature. The one argument to this function is of type Purse Keys Phone Wallet. Basically, any purse with everything in it. No ( ). The return value of this function happens to be ( ) , but this is only because this is for demonstration. For our purposes, we don't care what it returns.

Ok cool, so what's the point? Well let's fire up the interpreter, ghci:

Prelude> :l purse.hs

[1 of 1] Compiling Main ( purse.hs, interpreted )

Ok, modules loaded: Main.

*Main> let p1 = emptyPurse

*Main> let p2 = addKeys p1 Keys

*Main> let p3 = addWallet p2 (LeatherWallet 50.24)

*Main> let p4 = addPhone p3 BlackBerry

*Main> leaveHouse p4

( )


Cool, we've left the house successfully. Let's inspect the types of each purse:

*Main> :t p1

p1 :: Purse ( ) ( ) ( )

*Main> :t p2

p2 :: Purse Keys ( ) ( )

*Main> :t p3

p3 :: Purse Keys ( ) Wallet

*Main> :t p4

p4 :: Purse Keys Phone Wallet


Remember, these aren't the values, these are the types.

Now let's restart our interpreter, and try to skip a step:

Prelude> :l purse.hs

[1 of 1] Compiling Main ( purse.hs, interpreted )

Ok, modules loaded: Main.

*Main> let p1 = emptyPurse

*Main> let p2 = addWallet p1 (MoneyClip 5.20)

*Main> let p3 = addKeys p2 Keys

*Main> leaveHouse p3


<interactive>:1:11:

Couldn't match expected type `Phone' against inferred type `()'

Expected type: Purse Keys Phone Wallet

Inferred type: Purse Keys () Wallet

In the first argument of `leaveHouse', namely `p3'

In the expression: leaveHouse p3



We now have an error, because it was expecting a purse without ( ) anywhere in its type parameters. Remember, it's not complaining about a bad value, this pass doesn't know about the value, because this is a compiler error. It's complaining about the type. This doesn't look too different from something you might set up in Python, but remember, with Haskell, if you put these lines into a program, you would get these errors without having to run it (you wouldn't even be able to).

Now for fun, with the same interpreter session, let's try making a new purse, adding the keys to p3 a second time:

*Main> let p4 = addKeys p3 Keys


<interactive>:1:17:

Couldn't match expected type `()' against inferred type `Keys'

Expected type: Purse () t t1

Inferred type: Purse Keys () Wallet

In the first argument of `addKeys', namely `p3'

In the expression: addKeys p3 Keys


It's expecting a Purse with the first parameter of type ( ). Any Purse that has addKeys in its "history" will not have this in its type, it will be of type Keys, so the compiler will not let you call addKeys again.

Now, remember one particularly interesting thing here: when designing the addKeys, addWallet, and addPhone functions, we did not know exactly what type of Purse would be going into them. We just set one restriction, that one particular slot had to be empty. The actual types of purse going into the functions were determined by whatever called the functions.

Why would you want to do something like this? Suppose you were creating a library function to call on some sort of API. You want to allow for multiple function calls to initialize some sort of object before kicking it off to the library. In a language like C or Python, failure to set some sort of initialization variable would lead to a runtime error you have to decipher. Here, the compiler won't let you run the program until you set all of the necessary parameters, and you can set them in any order*. There's a benefit to having things fail before you even have a chance to run them. It's a pain to get working, but once it does, your program will have far fewer errors.
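As a sketch of that library scenario, here is the purse trick applied to a made-up connection setup. Everything in it (Config, Host, Token, setHost, setToken, connect) is hypothetical and stands in for whatever a real library would need; connect just returns a string rather than doing any I/O.

```haskell
-- Hypothetical settings for an imaginary library. Each slot starts as ()
-- and changes type once it has been filled in.
data Config host token = Config host token

newtype Host = Host String
newtype Token = Token String

emptyConfig :: Config () ()
emptyConfig = Config () ()

-- Fill the host slot; the token slot may be in any state.
setHost :: String -> Config () t -> Config Host t
setHost h (Config () t) = Config (Host h) t

-- Fill the token slot; the host slot may be in any state.
setToken :: String -> Config h () -> Config h Token
setToken t (Config h ()) = Config h (Token t)

-- Only a fully initialised Config is accepted.
connect :: Config Host Token -> String
connect (Config (Host h) (Token _)) = "connecting to " ++ h

-- Both initialisation orders appear in this one module and type-check.
viaHostFirst, viaTokenFirst :: String
viaHostFirst  = connect (setToken "secret" (setHost "example.org" emptyConfig))
viaTokenFirst = connect (setHost "example.org" (setToken "secret" emptyConfig))

-- Rejected at compile time, because the token slot is still ():
-- broken = connect (setHost "example.org" emptyConfig)
```

Uncommenting the last line gives the same kind of "couldn't match type" error as leaving the house without a phone.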

This trick I've described to you, however confusing I made it sound, is based on the very basics of the Haskell type system. There's a whole crazy world out there. It's complicated, but once you get to know it, you can force yourself into some pretty stable programs.

* Actually, the way I have it set up, while you can run these functions in any order, it (for better or for worse) forces you to call the functions in the same order every time you use them, if you ever use it more than once in the same program. However, using more advanced features (typeclasses) I think it would be possible to overcome this, though I've not tried it.

Friday, April 29, 2011

Lower the Barrier for Scratching Open Source Itches!

Ok, this is an idea I've had for some time now, I think it's time to put it out there and see what a large audience thinks. I'm a bit ignorant about how distributions are set up, maybe it's just impractical for some reason I'm unaware of.

I think there should be an easy way for users of a distro to jump right into development. This stuff is done for free, we need all the help we can get. It would help to eliminate any barriers to entry we can. My proposal is that it should be integrated into the package manager.

Here is the process I envision: I, the user with coding skills, have an itch with Program X. I issue one command, let's say "sudo apt-get --collab programx". Here's what it does:
  • It automatically creates a fork of the project on my account on Github (or equivalent)
  • It pulls, via Git (or equivalent), the very version of Program X that I am running currently. Now, this is important because I don't want to worry about different behavior in the program, I don't want to deal with a newer version of the program requiring different versions of libraries. I want the same thing I just ran, with the same bug, and I want the source code that generates it.
  • The build environment is all set up. I don't want to hunt for build dependencies, compiler options, etc. Enough said. "apt-get -b" seems to do most of what I described thus far, minus the crucial Git part.
  • I fix my problem. I make my changes, commit, and push. It shows up on upstream's Github fork queue (or equivalent). They decide whether to accept it.
Github has already done a great part of this, compared to a few years ago, by lowering the barrier to entry with the fork queue. Install via source already exists in apt. Would it be a huge task to coordinate the two?

I think that I would probably have scratched a few itches by now if it were this straightforward. Instead, I have to look up the specific build setup for the project (on the project's site, not Ubuntu's site), figure out build dependencies, etc. Or, I can do apt-get -b, but then it's not ready to commit my changes back (afaik).

The limit of my patience, and free time, is reached much earlier in the process as things currently stand. Remember, this is for people who are perhaps a few levels less involved than the sort of user who would run the bleeding edge Ubuntu Beta. This is a regular user with some coding skills, who might be able to fix a problem or two if the setup were handed to them. They have a different mentality. This is about getting a new class of developers involved.

Again, I'm ignorant about the details of package management and open source project management, so I'm probably leaving holes in this idea that I don't know about. I'm just a developer with an idea. The question is, can these holes be ironed out, or does this have a fundamental problem because of package management, as it stands today?

Or does this already exist and I just never heard of it? (in which case, it should just be promoted more!)

Saturday, January 29, 2011

I found the perfect project to dive into Haskell

I've had a recurring project in my life, a modular software synth. See here to get an idea of what I'm doing. The difference being that with software synths, you're not limited in how many components you have and how they're configured. I somehow thought I came to this revelation on my own years ago, but there's plenty of this sort of thing out there, such as SuperCollider, which sounds like it's fairly popular.

I've been trying to get into Haskell, but have been struggling to get myself out of my Python comfort zone. A friend of mine actually told me about SuperCollider recently, and I realized that my own synth would be the perfect project to get me started on Haskell, and one day last week I got inspired to get started. I found a simple example to start with of a sine wave being played through PulseAudio and I went from there. Here's my repo.

Unfortunately, you need PulseAudio to run this. I'm working on either getting ALSA output or file generation working soon.

When I made this a few times before, I made it in C++, the latest instance being several years ago. Amazingly enough, despite still being a novice in the language, I found that doing this in Haskell is easier. The infinite lazy lists work perfectly as signals. Before, I considered each component to be an object that had a value, and input signals. And I had to have a global "tick" that conveyed info between items. (And I thought that was really neat at the time.) Now I just have components be functions that "output" (return) infinite lists, and take infinite lists as inputs. It all sort of just sorts itself out.
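Here is a minimal sketch of what "components as functions over infinite lists" can look like. It is not the code from the repo: the names, the Double sample type, and the 44100 Hz sample rate are all assumptions made for illustration, and there is no audio output.

```haskell
-- A signal is an infinite lazy list of samples.
type Signal = [Double]

-- Assumed sample rate in samples per second (not taken from the repo).
sampleRate :: Double
sampleRate = 44100

-- | Sine oscillator at a fixed frequency in Hz.
sine :: Double -> Signal
sine freq = [sin (2 * pi * freq * n / sampleRate) | n <- [0 ..]]

-- | Square wave derived from the sine: +1 while the sine is
-- non-negative, otherwise -1.
square :: Double -> Signal
square freq = map (\s -> if s >= 0 then 1 else -1) (sine freq)

-- | Scale a signal by a constant gain.
gain :: Double -> Signal -> Signal
gain g = map (g *)

-- | Mix two signals sample by sample.
mix :: Signal -> Signal -> Signal
mix = zipWith (+)

-- | A hard-coded patch: two components wired together.
patch :: Signal
patch = mix (gain 0.5 (sine 440)) (gain 0.25 (square 110))
```

No global tick is needed: evaluating take 5 patch in ghci forces only the first five samples of every component, because each list is computed on demand.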

Speed is sacrificed to be sure, at least so far, but real-time synths have been made for Haskell, so I bet I can profile it and optimize it significantly.

Also you will notice that everything is hard coded! I sort of like it that way, it's amusing, particularly when it starts making beat sequences (which is a point I got to in my old version), but I'll probably make an interface at some point. Or maybe not, Haskell is a nice interface.

Here's why I'm writing now of all times though. On top of being functional with the lazy infinite lists and such, Haskell also has a type system from Nazi Germany. This is actually an advantage, though. I have some trouble remembering all the unit conversions involved in the oscillators, when I'm dealing with cycles, seconds, and samples. So today, I made a type framework that provided functions that did the conversions properly. When I was writing out an improved version of my oscillator function, I used these types.
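The general shape of such a unit framework can be sketched with newtypes. This is not the framework from the repo, only an illustration of the technique; all of the type and function names below are assumptions.

```haskell
-- Each unit gets its own type, so the compiler rejects mixing them.
newtype Seconds = Seconds Double deriving (Show, Eq)
newtype Samples = Samples Double deriving (Show, Eq)
newtype Cycles  = Cycles  Double deriving (Show, Eq)

-- Rates connect the units; they are distinct types too.
newtype SampleRate = SampleRate Double  -- samples per second
newtype Frequency  = Frequency Double   -- cycles per second (Hz)

samplesToSeconds :: SampleRate -> Samples -> Seconds
samplesToSeconds (SampleRate r) (Samples n) = Seconds (n / r)

secondsToCycles :: Frequency -> Seconds -> Cycles
secondsToCycles (Frequency f) (Seconds t) = Cycles (f * t)

-- | Phase, in cycles, of an oscillator at a given sample index.
phaseAt :: SampleRate -> Frequency -> Samples -> Cycles
phaseAt rate freq = secondsToCycles freq . samplesToSeconds rate

-- Does not compile: Seconds and Samples are different types.
-- bad = Seconds 1 == Samples 44100
```

The conversions are the only way to get from one unit to another, so an expression that adds or compares mismatched units is rejected before the program ever runs.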

It took me a long time to figure out exactly how I wanted it all to work, and how to make it work. I would start on an expression, and then realize that I was adding different units, and Haskell wouldn't let me do it. Or sometimes the compiler told me so. But eventually I got through it. This is the monstrosity that resulted.

And the kicker: I used this to make a new version of the square wave oscillator, and it sounded exactly the same as the old one, the first time I ran it.

Sunday, January 2, 2011

Testing Multiple Login Sessions Simultaneously

One annoyance in developing websites is that you sometimes have to log in and out all the time to test interaction between multiple users.

Have you ever visited or administered a website (say, www.example.com) which lets you visit "www.example.com" or "www2.example.com", etc, and doesn't forward to "example.com"? Did you ever try logging in at one subdomain, and then switch to another? You'll be logged out, it's a different login session. If you needed to test something remotely with multiple users logging in at once, that's a nice trick to use.

Now let's do the same thing locally (*nix systems only afaik, sorry):

In /etc/hosts you should see:

127.0.0.1 localhost

Add the following:

127.0.0.1 localhost2
127.0.0.1 localhost3
127.0.0.1 localhost4

And so on for however many you need. Now each one will access your site with a different session, so you can log in as a different user for each.

Saturday, December 18, 2010

Cryptonomicon: A Lesson for my Hyper-Logical Friends

I'm currently reading Cryptonomicon by Neal Stephenson. I'm not very acquainted with literature at large, so forgive me if I'm being ignorant here, but it seems that this book is unique or among very few that are in wide release and yet somewhat esoteric. That is to say, anybody can appreciate it, but I think it speaks specifically to computer programmers and mathematicians, and may not be 100% understood by those who are unfamiliar with certain mathematical and engineering concepts, and who don't share that mentality. Then again, the purpose could be to provide some insight to outsiders who may want to understand the hyper-logical nerd mentality. Tom Wolfe seems to do a similar thing, for instance, with the investment bankers in The Bonfire of the Vanities.

Though I think Neal Stephenson must have a closer personal connection with this mentality. It's a great book for a nerd because it's literature we can really relate to. It's told from the perspective of those of us who try to make logical sense of everything, see patterns all around us, and are confused by strange things like social niceties.

All in all I think it teaches an important lesson to nerds and non-nerds alike. I only just now crossed the 1/3 way mark (it's like 1100 pages), but I just came across some dialog which I think is particularly insightful. In this scene, Randy Waterhouse pulls Eberhard Föhr aside during a business meeting, and explains to him why, for their own legal protection, information has been withheld from them by one of their business partners, Avi. Eberhard, being of this nerd mindset, is frustrated that his business partners are not behaving logically. Randy, being of the same mindset but somewhat more enlightened, explains to Eberhard the realities of dealing with illogical people, but he does so in logical terms that Eberhard can relate to. This conversation is amusing like a lot of things in this book, because it demonstrates how we analytical types like to deconstruct everything.

Rather than risk inviting Neal Stephenson's lawyers (I have no idea how likely a scenario this is but I don't care to do the research right now) I'll just invite you to read this page via Google Books.


I appreciate a couple things about this passage. Firstly, I appreciate that Randy's character is sort of an enlightened techie, who we should aspire to, who respects the qualities of other sorts of people, even if he doesn't understand their mentality. Business people clueless about technology, idealistic designers with a vision, techies who can't design a usable interface to save their life, we should all accept our own limitations of understanding, respect the others, and occasionally yield our own ideals for the sake of other ones. (ex: if "doing it right" means taking twice as long, and failing in the market, what use is your ideally laid out code if nobody's going to use it?)

The other thing I like about this passage is, as I mentioned above, the logical way that it approaches illogical people. Some nerds have a tendency to refuse to approach the world in anything other than a logical manner. Normal People may try to explain to them that the world, particularly other individuals, aren't rational at all, and we should stop seeing things so logically. I include myself in this group of nerds, so honestly, this line of argument is ridiculous to me. The universe is logical. But, I think that sometimes we as nerds are just Doing It Wrong, and we can take a cue from Randy here.

What we need to do is to appreciate that the fact that people act irrationally, out of emotion, is just a condition of the world. Just as we accept that animals are irrational, or that the sun is hot. It's a datum. Further, accept that you yourself, the nerd, are also emotional, particularly when people don't act logically. This frustration with others' illogical behavior is based on an expectation for people to act contrary to their nature. You're ignoring a data point. You're mad at the sun for being hot. You're a non-techie who's mad at your computer for doing something other than exactly what you told it to. Now look who is being irrational? I'm going to agitate a little and propose that we are in fact being hypocritical here.

The main problem I think we sometimes have is the distinction between Logic and Logical Faculties. The expectation of perfection in Logic is not the same as expecting a human to have perfect Logical Faculties. The universe works by rational laws. People are part of the universe, so their workings are rationally explainable. But this is entirely distinct from their Logical Faculties being able to perfectly model the world around them. Furthermore, people's Logical Faculties being able to model the world around them is distinct from their ability to defend it from any of their Emotional Faculties getting in the way. We humans are but animals who happen to possess a limited amount of logical faculties.

Expecting people to act in a rational straightforward manner is like expecting a computer to compute beyond its capacity. A problem may be Logically solvable. There is a perfect Logical progression toward the answer. If we treated computers the same way we sometimes treat other humans, we would demand that we should be able to stick the problem into a computer and get an instant output. But again, Logical Faculties are in limited supply. Somehow we don't seem to have a problem accepting this in computers. In fact, we have entire sub-fields of computer science, taking RAM, HD, and time limitations as data, and creating a whole new set of Logical problems. Why not accept the same limitations and challenges in humans?

Perhaps it's that there is one fundamental difference between computers and humans, which is that our departure from being perfect logic solvers is not just in our processing capabilities, but also, as Randy pointed out in the passage linked above, in our interfaces. Human interfaces are more like neural networks than serial connections. To gain access to the Logical Faculties, one must enter a pattern that is accepted by the neural network. The patterns include such things as social niceties and innuendo. Some of us have simpler interfaces than others. (And as Randy described, some may even require other humans to act as intermediate interfaces. When I worked at Oracle, there was a guy who was fluent in both Engineer and Customer, and intermediated all conversation. I understand this is a common thing to have in a company.)

And you, the nerd, are a neural network, at your core, not a Turing machine. You operate in that domain. That means you have the natural ability, however impaired by years sitting in front of the computer, to interface with other neural networks, if you would just accept your nature. This is in fact the only way you can communicate with other humans, so you might as well accept it for what it is. You may try to approximate a Turing machine, but your neural network nature will still show on occasion. For instance, as I pointed out above, when you are frustrated about others not behaving like Turing machines.