Hacking Tutor, Part 1

May 26th, 2009

Computers are amazing things. Society has pretty much accepted them these days, as much as it has washing machines and microwaves. And much like those more primitive devices, to the vast majority of people, computers are magic boxes.

This is how it should be! And as a programmer, it is my job to keep it this way. If we had to know exactly how cars worked to drive them from Point A to Point B, it would take years before we’d make it to nice restaurants in faraway big cities.

Like cars and many other things in life, however, if one wants to gain more knowledge, control, power, and–dare I say–joy out of this magic, digging deeper is essential. I won’t go into all the practical reasons why this is the case; if you are reading this, I will assume that there is more you want to accomplish with computers. Maybe you want to be able to troubleshoot problems yourself when they come up. Maybe you want to make certain tasks automated, or make something easier for yourself or others. Whatever the reason, learning how to manipulate your computer on a deeper level (henceforth referred to as “hacking,” which is what this word really means, and not what Hollywood thinks it means) is a rewarding experience.

So, where to start?

That is the question. Obviously to some extent, this depends on what you want to accomplish. But as that is impossible for me to ascertain for each individual, I will endeavor to introduce some things that should be useful to anyone.

These ramblings will assume you are using a Unix-based system, preferably Linux. This is partially because such systems give you the tools to hack out of the box, and partially because their communities are not only friendly to hackers, but are made up of them. Linux is built on the principle that anyone who wants to explore software deeper (and change it if they want) can (and should!) do so. Mac OS X is a Unix under the hood, so it’s acceptable.

The Shell

Any aspiring hacker should become acquainted with the shell. What is this shell of which I speak? You may know it as “the command line.” It’s called a shell because it provides a nice way to interface with your operating system, much in the same way an eggshell provides a nice way to interface with an egg. The shell is accessed through a program called the terminal.

What is a terminal?

Very simply, a terminal receives input from you, and sometimes tells you something back. In the early days of computers, there would only be one real “computer” at a given establishment. Multiple people interacted with this computer simultaneously through dumb terminals, which were not individual computers themselves, but (essentially) monitors and keyboards–ways to tell something to the computer, and get something back. Likewise, you can open as many terminals into your computer as you want: they don’t affect each other, they just affect your machine.

Nowadays, when you open a terminal program, it automatically starts a shell for you. But let’s talk about how to use both.

Getting friendly with your terminal

Terminals are quite basic programs from the perspective of the user (you). But if you want to use one, you should know where to find it! In Ubuntu, it’s in Applications -> Accessories -> Terminal. On OS X, it’s called Terminal.app, in Applications -> Utilities. If you can’t find it on your system, and Google isn’t helping, let me know.

Once you’ve got it open, you will want to spend a bit of time configuring it to your liking. Very important to me are the colors: if you’re going to be doing powerful hackery things, you should look the part! I prefer amber text on black background, but use whatever you like. Again, exactly how to do this is specific to your system, but your terminal program should have an easily-findable preferences menu somewhere. Also make sure the size and whatnot is to your liking. Note that you will want to make sure to use a monospace (fixed-width) font (which certainly will be the default for your terminal). Since you are only dealing with text and not graphics, it’s important that all the letters be the same width so everything lines up correctly.

Getting friendly with the bash shell

There are several different popular shells for a given system, with varying capabilities that you don’t need to worry about now. Just stick with bash, which at the time of writing is the default shell in OS X and almost all Linux distributions. That is, when you open your terminal, you are now looking at a bash shell.

In the world of the shell, the first thing that’s important to understand is that you are always working in a specific place in your system. That is, you are in your Documents directory, or the root directory, or the directory called /usr/local, or whatever. (Note that we call them “directories,” not “folders,” though they both do the same job of being a place where files are.) This is different from your normal computer-using experience, where you don’t often deal directly with files (except to save or open them), really.

Your current location should be displayed at the prompt, or the actual line that is waiting for your input.

The other important thing to keep in mind is not only where you are, but who you are. In Unix-like systems, users have varying levels of privileges–that is to say, not everybody can do everything. This is important for security and privacy: you can’t delete another user’s files, or destroy something important to the operation of the whole system, when you are just a standard user. This is a Good Thing, and you’ll want to shun the ability to do dangerous things until you are sure what you’re doing.

Who you are is also displayed at the prompt, generally by means of the symbol (usually $ or #) that immediately precedes your cursor.

[Figure: Anatomy of a Prompt]

So, what do you do now? Well, it’s a command line, so you enter commands!

What are these commands?

Generally, giving a command is as simple as telling the shell which program you want to run. For example, a very common command is one to list which files are in your current directory. To do this, you just type the name of the program that lists files: ls. Then hit enter. The results of running the program will be displayed. Try it on your terminal!
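As a rough sketch of what that looks like in practice, here are three one-word commands. Only ls comes from the text above; pwd and whoami are two other standard programs I'm adding because they answer the "where am I" and "who am I" questions from earlier. Anything after a # is a comment that the shell ignores, so you don't need to type it.

```shell
pwd      # print the directory you are currently working in
whoami   # print the name of the user you are logged in as
ls       # list the files in the current directory
```

What you see will depend on your own system, so don't worry if it doesn't match anyone else's.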

Of course, often you will want to give the program some information about what you want. With the ls program, for example, we often would prefer the files to be arranged in a nice table, instead of just haphazardly listed like in the example above. To tell a program something, you give it parameters. The parameter that does what we want for ls is -l. You put parameters after the program’s name, separating each one with a space. So typing ls -l gives us:

[Figure: Output of ls -l]

By default, the ls program doesn’t list hidden files, because they are… well, hidden. But of course, sometimes you want to be able to see hidden files! So you pass the parameter -a (which stands for “all,” meaning we want to show ALL files). So we type ls -a and hit Enter.

Ruh-roh! It showed all the files, but it’s back to that ugly output again. Looks like we need to combine this new parameter with the previous one. This turns out to be a very common thing to do. We could type ls -l -a to get what we want. Try it.

But hackers are always looking for speedier/more efficient ways to do things, and so there are often shortcuts. When you have multiple parameters starting with a dash (-), you can scrunch em all together, like so: ls -la. The result is the same.
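If you'd like to try all of these variations somewhere harmless, here is a small sketch. The scratch directory and the file names (notes.txt, todo.txt, .secret) are made up for the demo, and mktemp, cd, and touch are standard programs that the text above hasn't covered. On Unix-like systems, a file counts as hidden when its name starts with a dot.

```shell
# Make a scratch directory so the demo does not depend on your own files.
demo=$(mktemp -d)
cd "$demo"

# Create three empty files; the one starting with a dot is hidden.
touch notes.txt todo.txt .secret

ls         # plain listing: the hidden file is left out
ls -a      # all files, including .secret (plus the special entries . and ..)
ls -l -a   # long format with hidden files included
ls -la     # the same thing with the parameters scrunched together
```

The last two commands should print identical listings, which is the whole point of the shortcut.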

In the future, we’ll do more complex stuff with the shell, but these basic techniques of using the program name followed by optional parameters are ubiquitous.

What is scripting?

You may ask what the advantage is to typing commands and getting their results as text instead of using graphics. Obviously both have their strong points, but the command method has a major plus: you can script it.

Scripting is the process of doing the work once, and enjoying the benefits multiple times. It is making a list of commands that you can later execute just by typing the name of the script (with optional parameters… sound familiar?)

Scripts are commonly used to automate tasks. For example, you might write a script to back up your important documents to another location. If you’ve ever tried to consistently (daily, for example) do such backups just using your graphical file browser, you know how tedious this can be–eventually, you start backing up less frequently (or not at all) because it’s such a pain in the butt.

Let’s say that to you, “backing up” just means copying your Documents directory into a directory called Backups that’s on a different hard drive. That way, if your main hard drive dies, at least you have copies in another spot. That’s painless to do via the graphical interface, right?

1. Open file browser.
2. Click through folders until you get to Documents.
3. Right click on the folder, select “Copy.”
4. Click through folders on your other hard drive until you get to Backups.
5. Right click in the folder, hit Paste.

How would we do the same thing in the shell?

cp -R /path/to/Documents /path/to/Backups

Alright, you say. That might be faster, but it’s a lot of typing! Well, for one thing, the shell helps you out a lot: you can just start typing a little bit of the word, then hit Tab, and it will complete the word for you. So you can probably just type “D” and then hit Tab, and it will fill in the full word “Documents” automatically. Or, you can just write a script that has the above command in it, name it something like backup, and all you have to do is type “backup” at the command line, and you’re done!
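Here is one way such a script might be put together, as a sketch rather than the definitive recipe. To keep it safe to run, it works in a throwaway directory: the work variable, the pretend Documents and Backups directories, and notes.txt are all stand-ins for your real ones. The first line of the script (#!/bin/bash) tells the system to run the file with bash, and chmod +x marks the file as something you are allowed to run.

```shell
# Set up a throwaway playground so nothing real is touched.
work=$(mktemp -d)
mkdir -p "$work/Documents" "$work/Backups"
echo "hello" > "$work/Documents/notes.txt"

# Write a two-line script named "backup" containing the cp command.
# ($work is filled in now, so the script ends up holding full paths.)
cat > "$work/backup" <<EOF
#!/bin/bash
cp -R "$work/Documents" "$work/Backups"
EOF

# Mark the script as executable, then run it.
chmod +x "$work/backup"
"$work/backup"

# Check that the copy arrived.
ls "$work/Backups/Documents"
```

Note that the sketch runs the script by giving its full location. To run it by typing just backup, the script has to live in one of the directories listed in your PATH.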

But now let’s say you notice you’re having space issues. Keeping a full second copy of all your stuff means using twice the hard drive space. It’d be nice if you could compress the backups so that they took up less space.

In this scenario, most likely your workflow in the graphical interface has completely changed. It depends on how integrated your archiving program is with your file explorer. For the sake of hyperbole, let’s pretend it’s not integrated at all. Here are the steps for doing it graphically:

1. Open archive program.
2. Go to File -> Add to Archive.
3. Click through folders until you get to Documents.
4. Click OK.
5. Go to File -> Save As.
6. Click through folders until you get to Backups.
7. Think up a name, hit Save.

That last step is important: how are you going to name your backups? Are you going to put today’s date in the name, so that you have easy access to a given day’s backup? I’ve done exactly this before, in the way listed above (or pretty close), and it’s tedious.

Let’s see how we’d change our previous little one-line script (backup) to do the same thing, including the date in the name of the backup.

tar czf /path/to/Backups/`date +%F`.tgz /path/to/Documents

Now when you run backup, your documents are automatically archived and stored in your Backups directory under today’s date. Neat, huh? Once you’ve changed the script, you are done–you still just run backup like you did before. It figures out the date and everything for you.
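The trick that makes the date appear is the pair of backticks: the shell runs whatever is between them (here date +%F, which prints today's date as year-month-day) and pastes the result into the command line. Below is a sketch of the same idea that you can run safely. As before, the throwaway directory, the work and archive variables, and notes.txt are hypothetical stand-ins for your own setup.

```shell
# Set up a throwaway playground so nothing real is touched.
work=$(mktemp -d)
mkdir -p "$work/Documents" "$work/Backups"
echo "hello" > "$work/Documents/notes.txt"

# Build the archive name; the backticks are replaced by today's date.
archive="$work/Backups/`date +%F`.tgz"
echo "$archive"

# c = create an archive, z = compress it with gzip, f = write it to this file.
# (Some versions of tar print a harmless note about stripping the leading "/".)
tar czf "$archive" "$work/Documents"

# t = list what is inside the archive, without unpacking it.
tar tzf "$archive"
```

Run it again tomorrow and you would get a second archive with tomorrow's date in its name, sitting next to today's.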

But now the truly cool thing that really puts you ahead of the graphical way: you can tell your system to simply run backup for you. That’s right, now you don’t have to open the shell, don’t have to type anything–and you never worry about forgetting to do a backup. Telling the system to run your script is quite easy, but I won’t mention it here, as I’ve made my point. Obviously this doesn’t just apply to backups, but to any task that needs doing more than once–and you’d be surprised how many of those there are.

Anyway, hopefully that helps to give you a taste. I’ll gladly answer any emails with questions, and will give a more “hands-on” tutorial next time.
