Update docs

This commit is contained in:
Aetias
2024-10-13 11:18:14 +02:00
parent 96e8e4386a
commit ce2c78a308
6 changed files with 64 additions and 256 deletions
+14 -94
@@ -5,111 +5,31 @@
- [Creating new `.c`/`.cpp` files](#creating-new-ccpp-files)
## Project structure
- `asm/`: Non-decompiled assembly code
- `ovXX/`: Code for overlay `XX`
- `*.s`: Source file in assembly
- `*.inc`: External symbols imported by respective source file
- `build/`: Build output
- `arm9_linker_script.lcf`: Linker command file for ARM9 program, specifies the order to put code and data into the ROM
- `arm9_objects.txt`: List of object files to pass to the linker
- `eur/`: Compiled/linked files
- `asm/`: Built assembly code
- `src/`: Built C/C++ code
- `overlays/`: Contains `.bin` and `.lz` files for each overlay
- `*.bin`: Linked code/data to compress or put in the ROM
- `*.lz`: Compressed code to put in the ROM
- `main.bin.xMAP`: Map file listing RAM addresses for all symbols
- `eur|usa/`: Target version
- `build/`: Linked ROM objects
- `delinks/`: Objects delinked from the base ROM
- `libs|src/`: Built C/C++ code
- `arm9.o`: Linked ELF object
- `arm9.o.xMAP`: Map file listing memory addresses for all symbols
- `config/`: [`dsd`](https://github.com/AetiasHax/ds-decomp) configuration files
- `docs/`: Documentation about the game
- `extract/`: Game assets, extracted from your own supplied ROM
- `eur|usa/`: [`ds-rom`](https://github.com/AetiasHax/ds-rom) extract directories
- `include/`: Include files
- `ph_eur/`: Game assets, extracted from your own supplied ROM
- `assets/`: Unmodified assets
- `banner/`: Banner logo and text that shows on the DS home menu
- `arm7.bin`: Extracted ARM7 program
- `arm9_ovdata.bin`: Data about ARM9 overlays
- `src/`: Source C/C++ files
- `tools/`: Tools for this project
- `compress/`: Compresses code before it is put in the ROM
- `include/`: Common C code for multiple tools
- `mwccarm/`: Compiler toolchain
- `rom/`: Extracts and builds ROMs
- `gen_externs.py`: Generates `.inc` files, use `make gen_externs` to run it
- `lcf.py`: Generates `arm9_linker_script.lcf`
- `m2ctx.py`: Generates context for decomp.me
- `patch_mwcc.py`: Patches bugs in the toolchain
- `progress.py`: Computes decompilation progress
- `configure.py`: Generates `build.ninja`
- `m2ctx.py`: Generates context for [decomp.me](https://decomp.me/)
- `mangle.py`: Shows mangled symbol names in a given C/C++ file
- `requirements.txt`: Python libraries
- `setup.py`: Sets up the project
- `assets.txt`: The order of asset directories to put in the ROM
- `*.sha1`: SHA-1 digests of different versions of the game
## Decompiling
See [/docs/decompiling.md](/docs/decompiling.md).
## Creating new `.c`/`.cpp` files
New source files must be added to the LCF (Linker Command File). This is done via `lcf.py`, which generates the LCF when
building.
In `lcf.py`, you will see a list of overlays near the top. Each overlay then has a list of source files ending in `.s`, `.c` or
`.cpp`. Those source files, when compiled, are appended to the ROM in the order that they appear in the list.
So, to create a new source file, you put the path to the source file in the correct overlay so that it appears in the correct
order in relation to other source files.
## Code style
The code style is not strict, but please try to mimic the existing style as much as possible.
If it's impossible to match a function while following the code style, then it's OK to not follow it. But do let us know when
this happens so we may amend the code style.
Below is an example of the code style in this project. If something is unclear, look at existing code. If the existing code is
insufficient, then you may decide the code style in that situation.
```cpp
// Space before pointer asterisk * and reference ampersand &
s32 MyClass::MyMethod(MyStruct *myStruct, s32 &anInteger) {
// Opening brace { on the same line
// Space after `if`, `while`, `for` and `switch`
if (myStruct->isCool) {
// Class member fields are prefixed with "m"
mInteger = anInteger;
}
// No space before asterisk * in pointer casts
// Space after cast operator
mPointer = (u32*) &anInteger;
// Prefer pre-increment ++i
// Use s32, s16, s8, etc. instead of int, short, char
for (s32 i = 0; i < 10; ++i) {
// Use `char` instead of s8 to indicate actual characters
char ch = 'A' + i * 2;
mString[i] = ch;
}
// Put long conditions on new line
if (
// Add clarifying parentheses for bool operators
(mInteger > 10 && mPointer != NULL) ||
(mInteger < 5)
) {
// Add clarifying parentheses for bitwise operators
mBool = ((mInteger >> 5) & 1) != 0;
}
do {
// Call member functions using `this`
this->DoStuff();
// In do-while loops, `while` on same line as closing brace }
} while (this->CanDoStuff());
switch (mInteger) {
// Indent `case`
// If possible, put braces after `case`
case 8: {
return *mPointer;
// If possible, put `break` after closing brace }
} break;
}
// No parentheses around return value
return mInteger;
}
```
This project has a `.clang-format` file and all C/C++ files in this project should follow it. We recommend using an editor
compatible with `clang-format` to format the code as you save.
+13 -30
@@ -10,14 +10,13 @@ Contents:
## Prerequisites
1. Use one of these platforms:
- Windows (MSYS)
- Linux via WSL
- Windows (recommended)
- Linux
2. Install the following:
- Python 3.11+ and pip
- GCC 9+
- Make
- **On Linux/WSL**: Wine/Wibo
- Ninja
- **On Linux**: Wine/Wibo
3. Install the Python dependencies:
```shell
python -m pip install -r tools/requirements.txt
@@ -26,38 +25,22 @@ python -m pip install -r tools/requirements.txt
```shell
python tools/setup.py
```
5. Run the Ninja configure script:
```shell
python tools/configure.py
```
> [!IMPORTANT]
> Rerun `configure.py` often to ensure that all C/C++ code gets compiled.
> [!NOTE]
> For Linux users: If you plan to use Wibo instead of Wine, run make with `make WINE=<path/to/wibo> ...`.
## Build the ROM
This repository does not include any of the game's assets, and you will need an original decrypted base ROM.
Put the base ROM in the root directory of this repository. Please verify that your dumped ROM matches one of the versions
below:
| Version | File name | SHA1 |
| ------- | ----------------- | ------------------------------------------ |
| EUR | `baserom_eur.nds` | `02be55db55cf254bd064d2b3eb368b92a5b4156d` |
| USA | `baserom_usa.nds` | `4c8f52dd719918bbcd46e73a8bae8628139c1b85` |
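
The check against the table above can be scripted. Below is a minimal Python sketch, assuming the ROM files use the names from the table; the helper names `sha1_of` and `matches_known_rom` are hypothetical and are not part of this repository's tools.

```python
import hashlib
from pathlib import Path

# Digests copied from the table above.
KNOWN_SHA1 = {
    "baserom_eur.nds": "02be55db55cf254bd064d2b3eb368b92a5b4156d",
    "baserom_usa.nds": "4c8f52dd719918bbcd46e73a8bae8628139c1b85",
}


def sha1_of(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_known_rom(path):
    """True if the file name is in the table and its digest matches."""
    expected = KNOWN_SHA1.get(Path(path).name)
    return expected is not None and sha1_of(path) == expected
```

For example, `matches_known_rom("baserom_eur.nds")` should return `True` only for a dump whose digest equals the EUR entry.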
Run `make extract` to extract from all the base ROMs you've provided. You only need to do this once.
Once you have extracted the base ROM, simply run `make eur` or `make usa` to rebuild it.
> For Linux users: If you plan to use Wibo instead of Wine, run `configure.py` with `-w <path/to/wibo>`.
6. Put one or more base ROMs in the [`/extract/`](/extract/README.md) directory of this repository.
### Matching the base ROM
**This is optional!** You only need to follow these steps if you want a matching ROM.
> [!NOTE]
> For interested readers:
> Retail games are usually "encrypted," which means that the first 0x800 bytes of the secure area are encrypted using a
4168-byte key found in the ARM7 BIOS. The secure area is 0x4000 bytes long and lives at the start of the ARM9 program at
address 0x2000000.
> This encryption is optional, and games will run just fine without it. In fact, this project doesn't even produce an
encrypted ROM. However, the ROM header includes a checksum of the secure area **after** encryption, so we must calculate it
somehow.
First, [extract the ARM7 BIOS from your DS device](https://wiki.ds-homebrew.com/ds-index/ds-bios-firmware-dump). Put the
ARM7 BIOS in the root directory of this repository, and verify that your dumped BIOS matches the one below:
@@ -65,4 +48,4 @@ ARM7 BIOS in the root directory of this repository, and verify that your dumped
| --------------- | ------------------------------------------ |
| `arm7_bios.bin` | `6ee830c7f552c5bf194c20a2c13d5bb44bdb5c03` |
Now, `make` should automatically detect the ARM7 BIOS and will build a matching ROM.
Now, rerun `configure.py` so it can update `build.ninja` to build a matching ROM.
+2 -2
@@ -1,8 +1,8 @@
# The Legend of Zelda: Phantom Hourglass
**Work in progress!** This project aims to recreate source code for ***The Legend of Zelda: Phantom Hourglass*** by decompiling
assembly code by hand. **The repository only contains code.** To build the ROM, you must own an existing copy of the game to
extract assets from.
assembly code by hand. **The repository does not contain assets or assembly code.** To build the ROM, you must own an existing
copy of the game to extract assets from.
**Note:** The project targets the European and American versions, and other versions might be supported later.
+28 -124
@@ -2,30 +2,34 @@
This document describes the build system used for this decompilation project, for those interested in learning how we build
the ROM.
- [Extracting assets](#extracting-assets)
- [Assembling code](#assembling-code)
- [Delinking code](#delinking-code)
- [Compiling code](#compiling-code)
- [Postprocessing ELF files](#postprocessing-elf-files)
- [Generating a linker command file](#generating-a-linker-command-file)
- [Linking modules](#linking-modules)
- [Compressing modules](#compressing-modules)
- [Building the ROM](#building-the-rom)
## Extracting assets
We implemented a tool called [`extractrom`](/tools/rom/extract.c) that extracts assets from a base ROM that you
provide yourself. It extracts the following data:
We use [`ds-rom`](https://github.com/AetiasHax/ds-rom) to extract code and assets from a base ROM that you provide yourself. It
extracts the following data:
- ARM7 program
- Code for the DS coprocessor CPU, aka ARM7
- Code for the DS coprocessor CPU, the ARM7TDMI aka ARM7
- The program is likely similar to other retail games, so it is not decompiled in this project
- ARM9 program
- The main program that runs on game launch
- Also contains the Instruction TCM (ITCM) and Data TCM (DTCM) modules
- ARM9 overlays
- Dynamically loaded modules that overlap each other in memory
- Banner
- Logo and text that is displayed on the DS home menu
- Assets
- Files/assets
- Models, textures, maps, etc.
- Overlay data
- We need the file ID for each overlay, since there is currently no other way to determine the file IDs correctly
## Assembling code
Files in the `/asm/` directory with the `.s` extension are assembly code. These files are grouped into modules, which consist
of overlays, a main module, an Instruction TCM (ITCM) module and a Data TCM (DTCM) module.
## Delinking code
We use [`dsd`](https://github.com/AetiasHax/ds-decomp) as a toolkit for DS decompilation. This includes taking the extracted
code and splitting (delinking) it into smaller files. By editing a `delinks.txt` file, we can tell `dsd` to add more delinked
files to the project.
Each `delinks.txt` file belongs to one module, such as the ARM9 program, the ITCM, the DTCM or an overlay.
> [!NOTE]
> For interested readers:
@@ -43,7 +47,7 @@ of overlays, a main module, an Instruction TCM (ITCM) module and a Data TCM (DTC
> memory and has predictable access time unlike typical RAM. However, they are fully static, which means no heap or stack will
> live there. So, they are mostly reserved for hot code and data.
The assembly files themselves consist of multiple sections:
Each module and delinked file consists of multiple sections:
- `.text`: Functions
- `.init`: Static initializers
- `.ctor`: List of static initializers
@@ -51,7 +55,8 @@ The assembly files themselves consist of multiple sections:
- `.data`: Global variables
- `.bss`/`.sbss`: Global uninitialized variables
When the code is linked, all code of the same section will be written adjacent to each other. More on this in [Linking modules](#linking-modules) below.
When the code is linked, all code of the same section will be written adjacent to each other. More on this in
[Linking modules](#linking-modules) below.
## Compiling code
This game was written in C++, so most of the code we decompile will be in this programming language. In C++, we typically don't
@@ -70,8 +75,7 @@ void MyClass::MemberFunction() {}
- To our knowledge, there is at most one static initializer per source file. This means that multiple variables can be
initialized in one static initializer, if they are in the same source file.
- See the example below. Since `foo` is initialized by a constructor and not as plain data, this constructor has to be
called at some point before `foo` can be used. In the case of an overlay, this happens as soon as the overlay has been
loaded.
called at some point before `foo` can be used. Overlays do this as soon as the overlay has been loaded.
```cpp
class Foo {
int myValue;
@@ -83,7 +87,7 @@ Foo foo = Foo(42);
```
- `.ctor`
- List of static initializers
- Generated automatically as soon as you make a static initializer
- Generated automatically when you create a static initializer
- `.rodata`
- Global or static constants
- Example:
@@ -134,22 +138,10 @@ int thisWillBeSbss;
#pragma section sbss end
```
## Postprocessing ELF files
The result of compiling and assembling is an ELF (Executable and Linkable Format) file. We do some postprocessing on these
files to ensure that we can get a matching ROM:
- Killing implicit functions
- Writing a constructor/destructor often generates multiple functions used for different purposes. The game does not always
use each type of ctor/dtor, so some functions must be killed before building the ROM. This is done by writing
`KILL(FunctionToKill)` in any C/C++ file, which is postprocessed by [`elfkill`](/tools/elf/elfkill.cpp) which puts such
functions in a section called `.dead`, instead of `.text`.
## Generating a linker command file
The linker command file (LCF), also known as linker script, tells the linker in which order it should link the compiled or
assembled files. It is generated by [`lcf.py`](/tools/lcf.py), which is also the file where we define our source files.
In `lcf.py` we can see how the source/assembly files are grouped into modules. These groups are then used to generate the LCF.
You can see the generated LCF in `/build/arm9_linker_script.lcf` after you've built the ROM.
assembled files. It is generated by `dsd`, which calculates a correct file order according to the `delinks.txt` files.
The LCF also decides in what order the sections are linked in each module. In the main module, the order is:
@@ -163,106 +155,18 @@ For overlays, `.init` comes after `.rodata`:
---------|-----------|---------|---------|--------|--------|---------
<br>
The ITCM module contains mostly `.text`, but has an unused `.bss` section at the end to pad out the ITCM to exactly 32 kB,
which is exactly the size of the ITCM.
The ITCM only contains `.text` and the DTCM only contains `.data` and `.bss`.
The DTCM module contains only `.data` and `.bss` and is exactly 16 kB, i.e. the size of the DTCM.
The LCF also decides the file names where each module is written to. Overlays have one file each (`ov00.bin`, `ov01.bin`, etc),
while the main module, ITCM and DTCM are linked to the same file (`arm9.bin`).
Lastly, the LCF creates extra files that do not come from code:
- `arm9_footer.bin`
- To be appended to the ROM after `arm9.bin`.
- This file contains an offset to some build information in the main module. This information then points to the ITCM and
DTCM modules inside `arm9.bin`. Technically, the TCMs are placed in the main module's `.bss` section, and will be moved
over to the actual ITCM and DTCM when the game boots up.
- `arm9_metadata.bin`
- Contains some data which will be inserted into the main module build information mentioned above. Some of this data is
also needed during the [ROM building step](#building-the-rom), which is why they are placed in this metadata file.
- `arm9_ovt.bin`
- ARM9 overlay table
- This is a segment in the ROM which declares the address space for each overlay. Some data is missing in this table, and
will be completed during the [ROM building step](#building-the-rom).
The LCF generates ROM images for each module into the `/build/<version>/build/` directories. These are then passed back into
`ds-rom` to rebuild the ROM.
## Linking modules
The LCF and list of compiled/assembled files will be passed to the linker, which generates the files mentioned in the previous
section.
## Compressing modules
All ARM9 code is compressed, to save space on the ROM. The compression algorithm is a variant of [LZ77](https://en.wikipedia.org/wiki/LZ77_and_LZ78#LZ77)
but compressed backwards, starting from the end of the file and working its way to the start.
In short, this variant works as follows. The file is read back to front, byte by byte. Anytime a new byte is read, the
algorithm searches forward through the file for any sequence of bytes that matches the bytes being read.
If such a sequence exists, and is 3 bytes or longer, the algorithm emits a **length-distance pair**. A length-distance pair
encodes this sequence as 4 bits of length, and 12 bits of distance. The length ranges between 3 and 18, and the distance can be
up to 4095 bytes ahead.
If no such sequence exists within this 4095 byte window, the algorithm instead emits a **literal**, which is simply one
uncompressed byte.
Length-distance pairs and literals are collectively called **tokens**. For every 8 tokens, the algorithm emits a flag byte.
In this byte, each of the 8 bits determines if an upcoming token is a literal or a length-distance pair.
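
The token layout described above can be sketched in Python. This is only an illustration under stated assumptions: the text gives the field widths (4 bits of length, 12 bits of distance) and the ranges, but the exact bit positions, the stored length bias of 3, and the bit order inside the flag byte are assumptions here, and the function names are hypothetical.

```python
MIN_LENGTH = 3       # shortest sequence worth encoding as a pair
MAX_LENGTH = 18      # 4 bits give 16 values: 3..18
MAX_DISTANCE = 4095  # largest value that fits in 12 bits


def pack_pair(length, distance):
    """Pack a length-distance pair into 16 bits.

    Assumed layout: (length - 3) in the high 4 bits, distance in the low 12.
    """
    if not MIN_LENGTH <= length <= MAX_LENGTH:
        raise ValueError("length must be between 3 and 18")
    if not 0 <= distance <= MAX_DISTANCE:
        raise ValueError("distance must fit in 12 bits")
    return ((length - MIN_LENGTH) << 12) | distance


def unpack_pair(value):
    """Inverse of pack_pair: return (length, distance)."""
    return (value >> 12) + MIN_LENGTH, value & 0xFFF


def flag_byte(is_pair):
    """Build a flag byte from 8 booleans, one per upcoming token.

    Assumed bit order: the first token uses the most significant bit.
    A set bit marks a length-distance pair, a clear bit marks a literal.
    """
    if len(is_pair) != 8:
        raise ValueError("a flag byte describes exactly 8 tokens")
    byte = 0
    for flag in is_pair:
        byte = (byte << 1) | int(bool(flag))
    return byte
```

Storing the length minus 3 is what lets 4 bits cover the stated range of 3 to 18, since sequences shorter than 3 bytes are never encoded as pairs.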
This project implements [`compress`](/tools/compress/main.c), which manages to match this algorithm, including several edge
case improvements to the compressed file.
For instance, as you approach the start of the file, you may lose a few bytes due to lack of length-distance pairs. In that
case, it's actually better not to compress the start of the file, as it would waste both ROM space and CPU time when
decompressing.
The code that decompresses the modules is located in the main module. This means that the first 16 kB of the main module is not
compressed. This segment is called the secure area, and includes the entrypoint function and decompression algorithm, among
others.
The linker eliminates some dead code such as unused constructor and destructor variants.
## Building the ROM
At this stage, we have obtained the following resources to put in the final ROM:
- Extracted:
- ARM7 program
- Banner
- Assets
- Overlay data (file IDs)
- Built:
- ARM9 main module (compressed), including ITCM and DTCM
- ARM9 main footer
- ARM9 metadata
- ARM9 overlay modules (compressed)
- ARM9 overlay table
- Other:
- Assets listing [`assets.txt`](/assets.txt)
- ARM7 BIOS (dumped from your own DS device)
We implement the [`buildrom`](/tools/rom/build.c) tool which combines these files in order to build a ROM, in such a way that
it can match the original base ROM.
The procedure is quite long, but here's a summary of the content in the ROM, listed in order of appearance:
Section | Description
----------------------|-------------
Header | Game ID, region, offsets to other sections, CRC checksums, ARM9/ARM7 entrypoint addresses
ARM9 main module | The full contents of `arm9.lz`
ARM9 main footer | The full contents of `arm9_footer.bin`
ARM9 overlay table | The full contents of `arm9_ovt.bin`, plus file IDs from `extractrom` and overlay file sizes after compression
ARM9 overlay modules | The full contents of `ov00.lz`, `ov01.lz`, etc
ARM7 program | Taken directly from `extractrom`
File name table | Assets file hierarchy, directory/file names, file IDs for each asset file
File allocation table | Maps file ID to an offset within the ROM where the asset file is located
Assets | Taken directly from `extractrom`, prioritized by `assets.txt`
> [!NOTE]
> For interested readers:
> The ROM file format has been documented online for a very long time, but there are some details that are necessary for
> building a matching ROM that there was no documentation for, until now:
>
> The file name table (FNT) is sorted with special priority rules:
> 1. Directories before files
> 2. Alphabetic, case-insensitive ordering
> 3. Shortest name first
>
> The order that assets are written to the ROM is sorted in a different way:
> 1. Traverse directories listed in `assets.txt` from top to bottom
> 2. ASCII ordering, i.e. case-sensitive
> 3. Shortest name first
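
The two orderings in the note above can be compared with a short Python sketch. It is one plausible reading of the listed rules, not the project's actual implementation: the function names and sample entries are hypothetical, and the traversal of directories from `assets.txt` is left out.

```python
def fnt_key(entry):
    """Sort key for the file name table ordering.

    `entry` is a (name, is_dir) tuple. The key mirrors the three rules:
    directories first, case-insensitive name, then shortest name first.
    With plain string comparison a prefix already sorts before a longer
    name, so the length only mirrors the third rule.
    """
    name, is_dir = entry
    return (0 if is_dir else 1, name.lower(), len(name))


def asset_order(names):
    """Order names within one directory by ASCII value (case-sensitive)."""
    return sorted(names)


entries = [("b.bin", False), ("Data", True), ("A.bin", False), ("a", False)]

fnt_sorted = [name for name, _ in sorted(entries, key=fnt_key)]
rom_sorted = asset_order([name for name, _ in entries])
```

Here `fnt_sorted` places the directory `Data` first and ignores case among the files, while `rom_sorted` puts every uppercase name before every lowercase one, which is why the two tables cannot share a single sort.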
At this stage, we should have all the resources needed to rebuild the ROM. We use `ds-rom` to build everything according to the
specifications of the base ROM, but using the ROM images that the linker created instead of the extracted ones.
+5 -6
@@ -10,21 +10,21 @@ stuck or need assistance.
## Pick a source file
See the `decomp` tag in the [issue tracker](https://github.com/AetiasHax/ph/issues?q=is%3Aopen+is%3Aissue+label%3Adecomp) for
a list of delinked source files that are ready to be decompiled. This list grows as more source files are delinked from the
rest of the Assembly code.
rest of the base ROM.
You can claim a source file by leaving a comment on its issue, so that GitHub allows us to assign you to it. This indicates
that you are currently decompiling that source file.
If you want to unclaim the file, leave another comment so we can be certain that the source file is available to be claimed
again. Remember to make a pull request of any notable progress you made on the source file, which can include
[non-matching functions](/CONTRIBUTING.md#non-matching-functions).
again. Remember to open a pull request with any progress you made on the source file, whether it is just header files or
partially decompiled code.
## Decompiling a source file
We use the object diffing tool [`objdiff`](https://github.com/encounter/objdiff) to track differences between C++ and assembly
code.
1. [Download the latest release.](https://github.com/encounter/objdiff/releases/latest)
1. Run `python tools/objdiff.py <EUR|USA>` to generate `objdiff.json` in the project root.
1. In `objdiff`, set the project directory to the root of this project. This will load `objdiff.json`.
1. Run `configure.py` and `ninja` to generate `objdiff.json` in the `/config/<version>/arm9/` directories.
1. In `objdiff`, set the project directory to one of the mentioned `arm9/` directories.
1. Select your source file in the left sidebar:
![List of objects in objdiff](images/objdiff_objects.png)
5. See the list of functions and data to decompile:
@@ -68,7 +68,6 @@ following:
1. Once you're sent to `decomp.me`, go to "Options" and change the preset to "Phantom Hourglass".
1. Paste your code into the "Source code" tab.
1. Share the link with us!
- In the worst case, add the function as a [non-matching function](/CONTRIBUTING.md#non-matching-functions).
## Decompiling `.init` functions
> [!NOTE]
+2
@@ -1,3 +1,5 @@
This repository does not include any of the game's assets, and you will need an original decrypted base ROM.
Put the base ROM(s) in this directory. Please verify that your dumped ROM matches one of the versions below:
| Version | File name | SHA1 |