Introduction to Hacking
The practice of ROM hacking has and continues to interest thousands as a hobby. Comprising both the analysis and manipulation of data, hacking can appeal to the spirit of exploration, creative problem solving, engineering, and creativity. Thanks to the subject's breadth and its propagation through the Internet, ROM hacking has become an art form of sorts. The Wikipedia article on ROM hacking offers an excellent overview of the practice, but it is not a prerequisite for this article, which intends to introduce any interested person to the subject and to set aspiring hackers on their way.
Aspects of ROM Hacking
It is possible to view ROM hacking as two somewhat different practices: analysis and manipulation, the differences between the two being somewhat comparable to the differences between pure and applied sciences (respectively). Many casual hackers and gaming enthusiasts are familiar with the end result of data manipulation - a completed hack, usually distributed in a patch. Hackers alter the data in a game for whatever purposes they desire. A hacker can make a game easier or more difficult, change graphics, alter or translate text, or even use and/or abuse the game's engine to design a game of his or her own. To make these modifications, of course, requires either specialized tools or knowledge of the workings of the game. As such, the other aspect of ROM hacking is the analysis of ROMs to discover their inner workings and contents. Some hackers prefer the challenge of reverse-engineering and probing the deepest recesses of ROMs; some hackers prefer to use hacking to express themselves creatively. Some hackers enjoy both aspects.
Binary and Hexadecimal
To become a hacker, no matter what the intent, it will most likely be necessary to have at least a casual understanding of the binary and hexadecimal number systems and the applications of these systems to computer science. Such an explanation follows; readers familiar with these systems and their use may skip the following section.
Most readers will be familiar with the everyday number system, in which thirteen is written 13, a hundred and two is written 102, and so on. This system is known as the decimal or base 10 (or synonymously radix 10) number system. Within the base 10 system, we use ten digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. Each digit stands for a quantity: none, one, two, and so on. Rather than having to write a number such as eleven as, for example, 9 + 2 or 8 + 3, we have a system of columns. A one-digit number has numbers only in the "ones column", where each digit is worth just what it says: 6 means six and so on. The next column is the tens column, in which every digit is worth ten times its normal value. 16, for example, means one ten and six ones, better known as sixteen. The next column after the tens is the hundreds, followed by the thousands and so on. A pattern should be evident: in base 10, the columns have values of 100, 101, 102, 103, and so on. Let us extend this concept to a system of a different radix.
It is suggested that the readability of this section could be improved with tables or diagrams.
Of fundamental importance to computer science is the binary system, which is base 2. Attempt to comprehend such a system with the same approach that was taken to comprehend base 10. In the base 2 system, there are two digits: the familiar 0 and 1, respectively meaning none and one. With only these quantities, we cannot represent two, as no digit on its own represents two. Therefore we use a system of columns, where the first column has value of 20, the second 21, then 22, 23, 24, and so on. Evaluating these exponentials, we see that we have a ones column, then a twos, then 4, 8, 16, 32, 64, 128, and so on. So, to write two in binary, we write 10. Observe why this works: 10 means one two and no ones, thus two. As another example, observe that one can represent twenty-one in binary as 10101 (one sixteen, one four, and one one). Though binary numbers can become long somewhat quickly (notice that the three-digit decimal number 255 represented in binary, 11111111, is an eight-digit number), binary allows us to represent any value with nothing more than the simple values one or none.
Assuming comprehension of the previous section, it should not be difficult to understand one further number system: hexadecimal, or base 16. One potentially alarming fact is that such a system would require sixteen digits. This is handled simply: after the familiar digits 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9, we introduce the digits A, B, C, D, E, and F. Just as 9 represents nine, A represents ten, B represents eleven, and so on. Note that in hexadecimal, a quantity as large as fifteen can be represented with a single digit (F). For larger quantities, again, we use columns. The columns' values are 160, the second 161, then 162, 163, 164, and so on. Note that these values quickly become alarmingly large (1, 16, 256, 4096...); fortunately, there is rarely need to use more than a sixteens column. As an example, the value thirty-five is written in hexadecimal as 23, which indicates two sixteens and three ones. To demonstrate the unfamiliar digits, note that the value sixty is written in hexadecimal as 3C; that is, three sixteens and twelve ones, where we recall that the digit C represents twelve. Notice that in hexadecimal, a two-digit number can represent a value as large as two hundred and fifty five (FF in hex).
With multiple number systems in common use amongst hackers, it is not always immediately clear what number system is being used. The presence of letters is a pretty good indication that the number is in hex, but the number 14 could very well mean fourteen (decimal) or twenty (hex). To clear up confusion, hexadecimal numbers are commonly preceeded with the symbol 0x. The 0x symbol does not have a mathematical value or meaning; it simply indicates that the number proceeding it is written in hex. The number 0x21, for example, is thirty-three, not twenty-one.
Use of the Systems
It may not be immediately obvious why it is important to know the binary and hexidecimal systems, but a bit of knowledge of the workings of computers should clear things up. All information in the memory of a computer is stored by electrical switches which are either on or off. There are no in-between values; the only states are on and off. The binary digits 0 and 1 can represent off and on, respectively. Therefore each on-or-off switch in memory is represented by one binary digit, or bit. Of course, a one-digit binary number does not allow for the storage of large quantities, so bits are divided into groups of eight known as bytes. Note that this means that a byte can store any value from 00000000 to 11111111 (decimal 0 and 255 respectively). To represent values 256 and larger, computers simply group bytes together. For instance, the number 600 requires two bytes, as such: 00000010 01011000. To verify this more easily, rather than taking the 1 in the first byte to mean one five-hundred-and-twelve, simply determine the value of each byte independently, multiply the value of the higher byte by 256, and add - note that this works as since 10 * 100000000 = 1000000000. Using our example (and writing in decimal for convenience), we have that the combination of the bytes 00000010 01011000 has a value of 256(2) + (64 + 16 + 8) = 600. It should be noted that the byte of the higher value (known as the high byte) does not always come before the lower byte: this is the concept of endianness, and it varies from platform to platform.
The use of hexidecimal in computer science is not so much a necessity as a great convenience. There are two important observations to be made about hexadecimal. Note first that any byte can be written as a two-digit hexadecimal number, from the smallest (0x00, 0 in decimal) to the largest (0xFF, 255 in decimal). The previous two-byte value 600 is written in hex as 02 58. For the second convenience of hexadecimal, pay close attention to the following observation: 0x58 = 01011000, 0101 = 0x5, and 1000 = 0x8. This applies to all bytes: the high hex digit represents the four highest bits; the low hex digit represents the four lowest bits. A group of four bits is known as a nybble. With this in mind, it is easy to mentally convert from binary to hex and vice versa. For example, observing 11010111, 1101 = 0xD and 0111 = 0x7, so 11010111 = 0xD7. Calculations between these systems and decimal, when not easily computed mentally, can be performed by many calculators, including the calculator calc.exe included with most versions of Microsoft Windows.
At its simplest, a ROM consists of millions of bits, usually viewed as the bytes which they comprise. It is evident how numerical values can be stored in binary form; however, it may not be immediately obvious how other content, such as text, graphics, and programming code can be represented. In fact, all data in a ROM can (and must) be made of numbers. Examples of representing text and graphics follow; having studied these examples, it should be rational to assume that all data can be reduced to ones and zeros.
How would it be possible to represent the string "mario" using numbers? Possibly the simplest solution would be to use one byte for each letter, assigning a letter to each numerical value. For instance, if we say that 1 = a, 2 = b, 3 = c, and so on, then "mario" could appear in a ROM as 0D 01 12 09 0F. This is, though somewhat simplified, the method used by most ROMs to store text. In an actual mapping of numbers to text (known as a character map or table), it would be necessary to have seperate values for capital and lowercase letters, as well as spaces, punctuation, and symbols. Additionally, small numerical values are usually used for formatting purposes; 0x0A, for example, may indicate a new line of text.
Representing graphics with numbers can be a bit more of an undertaking than text, but it is by all means possible. It is important to note that different platforms use different methods for storing grahpics and even colors, but a generalization can be presented. First, we define the concept of the picture element, or pixel. Staring closely at a computer monitor will reveal that what appears to be an image is composed of many tiny dots, each having one color. Each of these square dots is a pixel. Graphics in ROMs are defined by the colors of each of their pixels. To proceed, it is necessary to be able to represent different colors. Possibly the simplest example is the three-byte RGB model, in which three bytes represent a color's red, green, and blue content, respectively. The color FF 00 00, for example, is pure red. The color 7F 00 FF contains pure blue and a half-strength red, resulting in a shade of violet. In a ROM, a group of colors can be stored together as a palette, in which there would be a first color, a second color, and so on. Now it is possible to refer to a color by a single number. For example, if we have a palette of white, red, and green, we can say that zero represents white, one represents red, and two represents green. Now, we construct, for example, a 6x4 grid, where each square on the grid is one pixel. Such a grid can represent a graphic that is 6 by 4 pixels. Going from top to bottom, left to right, we can assign one byte to each square on the grid. Observe that, using the palette described above, the string of bytes 02 02 00 00 01 01 02 02 00 00 01 01 02 02 00 00 01 01 02 02 00 00 01 01 would result in an image resembling the Italian flag. Complex images can be constructed in the same manner. It should be noted that, with the example given, the bytes used to describe the graphic was inefficient in more ways than one. What if, for instance, each nybble in a byte could represent one pixel, or if there was some way to indicate that the string 02 02 00 00 01 01 was repeated three times? The former such issue is resolved by various, more efficient graphical formats; the latter is addressed by compression.
It is suggested that the readability of this section could be improved with tables or diagrams.
Pointers and Programming
As a ROM comprises nothing more than numbers, it is necessary to have a system in which numbers can represent instructions of what to do with other numbers. Everything that happens in a computer program is nothing more than a staggering amount of numbers being adjusted and moved from place to place. The fundamental hardware (or software, in the case of emulators) that performs these calculations and relocations of numbers is the processor. For this to be possible, it is first necessary to have a way to 'talk about' a specific byte in memory. How, for example, could the processor add the 316th byte in a ROM to the 400th without knowing what exactly these values are? The processor is given a pointer, a number which points to a byte at a specific distance from a reference point, often the beginning of the ROM. A byte's distance from this reference point is its address.In the above calculation, for example, the processor would be told to load and add the values at 013C and 0190. It should be noted that ROM, which stands for read-only memory, cannot be altered by the processor. The processor must use RAM, or random-access memory, in order to store and remember information. Pointers become all the more important with RAM, as during the course of a program, the value of the byte at 2C6E in RAM may change any number of times.
It should come as no surprise that instructions to the processor are also represented by numbers. A processor must be told where to start reading numbers as instructions, after which the processor will continue following instructions as long as the program is running. The numbers representing instructions for a processor comprise its machine code. Machine code varies significantly from processor to processor. In general, however, there will be numbers which tell the processor to load and store values to and from memory, perform calculations with these numbers, start reading instructions from a different address, and perform different instructions based on certain conditions.
With the knowledge that ROMs store all content numerically, it is possible to alter a ROM's content by changing these numbers. The principal tool for viewing and changing the bytes that comprise a ROM is a hex editor, in which every byte in the ROM is displayed as one after another in hexadecimal. For simple data such as pure numerical data, a hacker need only find the address of the value in question and alter it. Changing an enemy's hit points in an RPG, for example, is usually a matter of going to the address in question with a hex editor and changing the value to the desired value. As data formats get more complex, such as those for graphics and music, manually editing the bytes for such data can become very difficult (but never impossible).
Thanks to the work of hacking enthusiast programmers, there exist a number of utilities specifically designed to edit ROMs. Most of these utilities are designed for specific games, but there are a number of universal utilities which can edit the data of any of a certain type of game, such as all those for one system. Some hacking utilities, such as Lunar Magic, become incredibly complex, effectively creating an interaction layer above the actual game data. With these tools, a hacker may not need to know the addresses of the value he or she is editing, nor the specifics of a format: the daunting task of altering graphics, with the appropriate tool, becomes as simple as using a painting program. The creation of a new utility opens up the potential of a game to be modified by those who do not know the game mechanics as intimately as the programmer. A hacker using utilities is free to unleash his or her creativity without having to make tedious calculations or decipher formats. It is still important, of course, to understand the basic operation of the system to best exploit it (and more importantly, to fix it when it breaks).
With a mastery of the layout and content of a game, it is possible to alter the machine code that dictates the game's very behavior (the game's engine). It is generally safest and easiest to resort to engine modification only when all other methods would fail, but it is not unreasonable that hacker may want a game to perform in ways that it had never done so before. One of the simplest engine modifications is the bugfix, in which a hacker fixes a bug present in a game's engine - for example, a hacker may want to put an upper limit on the amount of money a character in an RPG can hold if such a limit did not exist, as collecting more than this much money would likely cause the amount of money to roll over to zero. Engine modifications can also be incredibly complex - adding scorekeeping systems, coding minigames, or causing the main character to periodically split in two; with enough skill and patience, engine modification theoretically makes anything within the bounds of hardware capabilities possible.
Main article: Reverse engineering
Earlier, it was explained how numbers could be used to represent any content and instructions within a game. Before hackers can alter a game's contents or write a utilitiy to do so, it is necessary to do the opposite: analyze the game's raw numerical data and determine how it relates to the content. There is no hard and fast method for doing so, and this reverse engineering can require staggering ingenuity, deduction, persistence and luck. Techniques for reverse engineering are discussed in the main article. The simplest (and often, the first) targets for hackers approaching a fresh ROM are the data tables - organized, structured data; essentially, databases. The contents of such data tables are usually numeric; more complex data is referred to in such tables by pointers. Reverse engineering data such as graphics and music is a more monumental undertaking, usually (by tradition) requiring, if not a group effort, multiple caffeine-laden all-night hack sessions.
As a hobby largely propagated by the Internet and catching the interest of artists, engineers, and all those sharing an interest for the subject, all ROM hackers are united as members as a certain culture. As any culture with highs and lows, significant personalities, and defining events, the history of ROM hacking is a complex story.
The Lone Wolf Mentality
Despite the spread of information being at the very heart of hacking, to many hackers, there seems to be a romanticism to the image of the lone hacker, mouse in one hand and chocolate-covered coffee beans in another, flying through data in a trancelike state by the glow of a monitor. While hackers are eager to spread knowledge, there is often a deserved sense of accomplishment in announcing one's findings or releasing a utility as the manifestation of one's hard work. While this is inevitable, it is important to keep in mind the nature of hacking as a collaboration. The most poignant representations of this collaboration are the hacking communities which have formed around the Internet. In addition to often being more productive than individual hackers, members of hacking communities often form tight friendships. As well, some communities have held friendly (and more hostile) rivalries with other communities.
As previously mentioned, many hackers seem to be creatures of the night; most work best with caffeine. Hackers generally share traits with nerds, arguably being a subset of them. Hackers typically commune at message boards and IRC channels. Intelligence and/or creativity is necessary. This considered, one must want to be a hacker for the thrill of discovery, the satisfaction of modification, or both; not for the desire to be called a hacker. While no generalizations are to be made as to the kindness of hackers, ignorance is almost universally detested. This said, the best way to learn is by doing; an aspiring hacker would be recommended to acquire the utilities necessary for what he or she wants to do and begin to play around with them. As skills become second nature, it is easy to become quickly engrossed in the hobby of ROM hacking.