Resolves#3075
TODO before merge:
- [x] Properly draw non-korean strings while in korean mode (language
selection)
- [x] Check jak 3
- [x] Translation scaffolding (allow korean characters, add to Crowdin,
fix japanese locale, etc)
- [x] Check translation of text lines
- [x] Check translation of subtitle lines
- [x] Cleanup PR / some performance optimization (it's take a bit too
long to build the text and it shouldn't since the information is in a
giant lookup table)
- [x] Wait until release is cut
I confirmed the font textures are identical between Jak 2 and Jak 3, so
thank god for that.
Some examples of converting the korean encoding to utf-8. These show off
all scenarios, pure korean / korean with ascii and japanese / korean
with replacements (flags):
<img width="316" height="611" alt="Screenshot 2025-07-26 191511"
src="https://github.com/user-attachments/assets/614383ba-8049-4bf4-937e-24ad3e605d41"
/>
<img width="254" height="220" alt="Screenshot 2025-07-26 191529"
src="https://github.com/user-attachments/assets/1f6e5a6c-8527-4f98-a988-925ec66e437d"
/>
And it working in game. `Input Options` is a custom not-yet-translated
string. It now shows up properly instead of a disgusting block of
glyphs, and all the original strings are hopefully the same
semantically!:
<img width="550" height="493" alt="Screenshot 2025-07-26 202838"
src="https://github.com/user-attachments/assets/9ebdf6c0-f5a3-4a30-84a1-e5840809a1a2"
/>
Quite the challenge. The crux of the problem is -- Naughty Dog came up
with their own encoding for representing korean syllable blocks, and
that source information is lost so it has to be reverse engineered.
Instead of trying to figure out their encoding from the text -- I went
at it from the angle of just "how do i draw every single korean
character using their glyph set".
One might think this is way too time consuming but it's important to
remember:
- Korean letters are designed to be composable from a relatively small
number of glyphs (more on this later)
- Someone at naughty dog did basically this exact process
- There is no other way! While there are loose patterns, there isn't an
overarching rhyme or reason, they just picked the right glyph for the
writing context (more on this later). And there are even situations
where there IS NO good looking glyph, or the one ND chose looks awful
and unreadable (we could technically fix this by adjusting the
positioning of the glyphs but....no more)!
Information on their encoding that gets passed to `convert-korean-text`:
- It's a raw stream of bytes
- It can contain normal font letters
- Every syllable block begins with: `0x04 <num_glyphs> <...the glyph
bytes...>`
- DO NOT confuse `num_glyphs` with num jamo, because some glyphs can
have multiple jamo!
- Every section of normal text starts with `0x03`. For example a space
would be `0x03 0x20`
- There are a very select few number of jamo glyphs on a secondary
texture page, these glyph bytes are preceeded with a `0x05`. These jamo
are a variant of some of the final vowels, moving them as low down as
possible.
Crash course on korean writing:
- Nice resource as this is basically what we are doing -
https://glyphsapp.com/learn/creating-a-hangeul-font
- Korean syllable blocks have either 2 or 3 jamo. Jamo are basically
letters and are the individual pieces that make up the syllable blocks.
- The jamo are split up into "initial", "medial" and "final" categories.
Within the "medial" category there are obvious visual variants:
- Horizontal
- Vertical
- Combination (horizontal + a vertical)
- These jamo are laid out in 6 main pre-defined "orientations":
- initial + vertical medial
- initial + horizontal medial
- initial + combination
- initial + vertical medial + final
- initial + horizontal medial + final
- initial + combination + final
- Sometimes, for stylistic reasons, jamo will be written in different
ways (ie. if there is nothing below a vertical vowel will be extended).
- Annoying, and ND's glyph set supports this stylistic choice!
- There are some combination of jamo that are never used, and some that
are only used for a single word in the entire language!
With all that in mind, my basic process was:
- Scan the game's entire corpus of korean text, that includes subtitles.
It's very easy to look at the font texture's glyphs and assign them to
their respective jamo
- This let me construct a mapping and see which glyphs were used under
which context
- I then shoved this information into a 2-D matrix in excel, and created
an in-game tool to check every single jamo permutation to fill in the
gaps / change them if naughty dogs was bad. Most of the time, ND's
encoding was fine.
-
https://docs.google.com/spreadsheets/d/e/2PACX-1vTtyMeb5-mL5rXseS9YllVj32BGCISOGZFic6nkRV5Er5aLZ9CLq1Hj_rTY7pRCn-wrQDH1rvTqUHwB/pubhtml?gid=886895534&single=true
anything in red is an addition / modification on my part.
- This was the most lengthy part but not as long as you may think, you
can do a lot of pruning. For example if you are checking a 3-jamo
variant (the ones with the most permutations) and you've verified that
the medial jamo is as far up vertically as it can be, and you are using
the lowest final jamo that are available -- there is nothing to check or
improve -- for better or worse! So those end up being the permutations
between the initial and medial instead of a three-way permutation
nightmare.
- Also, while it is a 2d matrix, there's a lot of pruning even within
that. For example, for the first 3 orientations, you dont have to care
about final vowels at all.
- At the end, I'm left with a lookup table that I can use the encode the
best looking korean syllable blocks possible given the context of the
jamo combination.
Adds a quick perf report feature to `goalc` that lets you compare how
much faster / slower it takes to compile the projects, with some simple
features like filtering the files, adjusting for how large of a margin
of error in the speeds you care about, and which test iteration you want
to compare against.
This is something I plan to use as I work more in `goalc` as an easy way
to track / show the results.

Relates to #1353
This adds no new functionality or overhead to the compiler, yet. This is
the preliminary work that has:
- added code to the compiler in several spots to flag when something is
used without being properly required/imported/whatever (disabled by
default)
- that was used to generate project wide file dependencies (some
circulars were manually fixed)
- then that graph underwent a transitive reduction and the result was
written to all `jak1` source files.
The next step will be making this actually produce and use a dependency
graph. Some of the reasons why I'm working on this:
- eliminates more `game.gp` boilerplate. This includes the `.gd` files
to some extent (`*-ag` files and `tpage` files will still need to be
handled) this is the point of the new `bundles` form. This should make
it even easier to add a new file into the source tree.
- a build order that is actually informed from something real and
compiler warnings that tell you when you are using something that won't
be available at build time.
- narrows the search space for doing LSP actions -- like searching for
references. Since it would be way too much work to store in the compiler
every location where every symbol/function/etc is used, I have to do
ad-hoc searches. By having a dependency graph i can significantly reduce
that search space.
- opens the doors for common shared code with a legitimate pattern.
Right now jak 2 shares code from the jak 1 folder. This is basically a
hack -- but by having an explicit require syntax, it would be possible
to reference arbitrary file paths, such as a `common` folder.
Some stats:
- Jak 1 has about 2500 edges between files, including transitives
- With transitives reduced at the source code level, each file seems to
have a modest amount of explicit requirements.
Known issues:
- Tracking the location for where `defmacro`s and virtual state
definitions were defined (and therefore the file) is still problematic.
Because those forms are in a macro environment, the reader does not
track them. I'm wondering if a workaround could be to search the
reader's text_db by not just the `goos::Object` but by the text
position. But for the purposes of finishing this work, I just statically
analyzed and searched the code with throwaway python code.
I noticed that jak 3's compilation was spending a lot of time accessing
the `unordered_map`s we use to store constants and symbol types.
I repurposed the `EnvironmentMap` originally made for GOOS for this. It
turns out that we were copying the entire constant map whenever we
encountered a `deftype`, and fixed that too.
This speeds up jak3 compiles from ~16 to 11 seconds for me.
Started at 349,880,038 allocations and 42s
- Switched to making `Symbol` in GOOS be a "fixed type", just a wrapper
around a `const char*` pointing to the string in the symbol table. This
is a step toward making a lot of things better, but by itself not a huge
improvement. Some things may be worse due to more temp `std::string`
allocations, but one day all these can be removed. On linux it saved
allocations (347,685,429), and saved a second or two (41 s).
- cache `#t` and `#f` in interpreter, better lookup for special
forms/builtins (hashtable of pointers instead of strings, vector for the
small special form list). Dropped time to 38s.
- special-case in quasiquote when splicing is the last thing in a list.
Allocation dropped to 340,603,082
- custom hash table for environment lookups (lexical vars). Dropped to
36s and 314,637,194
- less allocation in `read_list` 311,613,616. Time about the same.
- `let` and `let*` in Interpreter.cpp 191,988,083, time down to 28s.
The main thing that was done here was to slightly modify the new
subtitle-v2 JSON schema to be more similar to the existing one so that
it can properly be used in Crowdin.
Draft while I double-check the diff myself
Along the way the following was also done (among other things):
- got rid of as much duplication as was feasible in the serialization
and editor code
- separated the text serialization code from the subtitle code for
better organization
- simplified "base language" in the editor. The new subtitle format has
built-in support for defining a base language so the editor doesn't have
to be used as a crutch. Also, cutscenes only defined in the base come
first in the list now as that is generally the order you'd work from
(what you havn't done first)
- got rid of the GOAL subtitle format code completely
- switched jak 2 text translations to the JSON format as well
- found a few mistakes in the jak 1 subtitle metadata files
- added a couple minor features to the editor
- consolidate and removed complexity, ie. recently all jak 1 hints were
forced to the `named` type, so I got rid of the two types as there isn't
a need anymore.
- removed subtitle editor groups for jak 1, the only reason they existed
was so when the GOAL file was manually written out they were somewhat
organized, the editor has a decent filter control, there's no need for
them.
- removed the GOAL -> JSON python script helper, it's been a month or so
and no one has come forward with existing translations that they need
help with migrating. If they do need it, the script will be in the git
history.
I did some reasonably through testing in Jak1/Jak 2 and everything
seemed to work. But more testing is always a good idea.
---------
Co-authored-by: ManDude <7569514+ManDude@users.noreply.github.com>
Adds support for adding custom subtitles to Jak 2 audio. Comes with a
new editor for the new system and format. Compared to the Jak 1 system,
this is much simpler to make an editor for.
Comes with a few subtitles already made as an example.
Cutscenes are not officially supported but you can technically subtitle
those with editor, so please don't right now.
This new system supports multiple subtitles playing at once (even from a
single source!) and will smartly push the subtitles up if there's a
message already playing:


Unlike in Jak 1, it will not hide the bottom HUD when subtitles are
active:

Sadly this leaves us with not much space for the subtitle region (and
the subtitles are shrunk when the minimap is enabled) but when you have
guards and citizens talking all the time, hiding the HUD every time
anyone spoke would get really frustrating.
The subtitle speaker is also color-coded now, because I thought that
would be fun to do.
TODO:
- [x] proper cutscene support.
- [x] merge mode for cutscenes so we don't have to rewrite the script?
---------
Co-authored-by: Hat Kid <6624576+Hat-Kid@users.noreply.github.com>
Increase level heaps and borrow heaps. The level heap increase was
likely not needed, but better safe than sorry. We allocate the 128 MB
main heap anyway so there's no harm.
Also fix the crash when using `-boot`. As I thought it was just a
one-line typo in the kernel.
This automatically generates documentation from goal_src docstrings,
think doxygen/java-docs/rust docs/etc. It mostly supports everything
already, but here are the following things that aren't yet complete:
- file descriptions
- high-level documentation to go along with this (think pure markdown
docs describing overall systems that would be co-located in goal_src for
organizational purposes)
- enums
- states
- std-lib functions (all have empty strings right now for docs anyway)
The job of the new `gen-docs` function is solely to generate a bunch of
JSON data which should give you everything you need to generate some
decent documentation (outputting markdown/html/pdf/etc). It is not it's
responsibility to do that nice formatting -- this is by design to
intentionally delegate that responsibility elsewhere. Side-note, this is
about 12-15MB of minified json for jak 2 so far :)
In our normal "goal_src has changed" action -- we will generate this
data, and the website can download it -- use the information to generate
the documentation at build time -- and it will be included in the site.
Likewise, if we wanted to include docs along with releases for offline
viewing, we could do so in a similar fashion (just write a formatting
script to generate said documentation).
Lastly this work somewhat paves the way for doing more interesting
things in the LSP like:
- whats the docstring for this symbol?
- autocompleting function arguments
- type checking function arguments
- where is this symbol defined?
- etc
Fixes#2215
- lets you split up your `startup.gc` file into two sections
- one that runs on initial startup / reloads
- the other that runs when you listen to a target
- allows for customization of the keybinds added a month or so ago
- removes a useless flag (--startup-cmd) and marks others for
deprecation.
- added another help prompt that lists all the keybinds and what they do
Co-authored-by: water <awaterford111445@gmail.com>
This allows you to not have to define the entire file path to a source
file to re-compile and load it. Technically a stop-gap until editor
tools are developed around writing OpenGOAL.

As opposed to `(ml "goal_src/jak2/engine/game/main.gc")` (which still
works)
This is accomplished via the following config (connection attempts is
irrelevant):
```json
{
"numConnectToTargetAttempts": 1,
"jak2": {
"asmFileSearchDirs": [
"goal_src/jak2"
]
}
}
```
This also provides a way to make game-specific configurations for the
REPL fairly easily.
- You can define a `startup.gc` in your user folder, each line will be
executed on startup (deprecates the usefulness of some cli flags)
- You can define a `repl-config.json` file to override REPL settings.
Long-term this is a better approach than a bunch of CLI flags as well
- Via this, you can override the amount of time the repl will attempt to
listen for the target
- At the same time, I think i may have found why on Windows it can
sometimes take forever to timeout when the game dies, will dig into this
later
- Added some keybinds for common operations, shown here
https://user-images.githubusercontent.com/13153231/202890278-1ff2bb06-dddf-4bde-9178-aa0883799167.mp4
> builds the game, connects to it, attaches a debugger and continues,
launches it, gets the backtrace, stops the target -- all with only
keybinds.
If you want these keybinds to work inside VSCode's integrated terminal,
you need to add the following to your settings file
```json
"terminal.integrated.commandsToSkipShell": [
"-workbench.action.quickOpen",
"-workbench.action.quickOpenView"
]
```
Favors the `lg` namespace over `fmt` directly, as this will output the
logs to a file / has log levels.
I also made assertion errors go to a file, this unfortunately means
importing `lg` and hence `fmt` which was attempted to be avoided before.
But I'm not sure how else to do this aspect without re-inventing the
file logging.
We have a lot of commented out prints as well that we should probably
cleanup at some point / switch them to trace level and default to `info`
level.
I noticed the pattern of disabling debug logs behind some boolean,
something to consider cleaning up in the future -- if our logs were more
structured (knowing where they are coming from) then a lot this
boilerplate could be eliminated.
Closes#1358
* extractor: refactor and cleanup for multi-game support
* deps: switch to `ghc::filesystem` as it is utf-8 everywhere by default
* extractor: finally working with unicode
* unicode: fix unicode cli args on windows in all `main` functions
* put some duplicated code in a func
* make jak 2 text "work"
* group up all subtitles c++ code into one folder
* compact single-line subtitles
* fix a couple compiler crashes
* Update game_subtitle_en.gd
* `rolling` and `sunken`
* `swamp`
* `ogre`
* `village3`
* `maincave`
* `snow`
* `lavatube`
* `citadel`
* Update .gitignore
* clang
* fix encoding and decoding for quote
* properly fix quotes
* subtitle deserialize: sort by kind, ID and name
* sub editor: fix line speaker not being converted
* cleanup game text ids 1
* update text ids 2
* update source
* update refs
* [extractor] validate files when extracted as folder
* jp text fixes
* move game text version to the text file and fix subtitle editor escape chars
* make bad subtitles not crash the game
* fix texscroll in lag
* fix mood, fix decomp of other versions, fix text decomp
* clang
* fix tests
* oops dammit
* new fixes
* shut up codacy
* fix nonexistant subtitles crashing the game
* fix text hacks and extractor re-use on folders
* update refs
* [decompiler] read and process art groups
* finish decompiler art group selection & detect in `ja-group?`
* make art stuff work on offline tests!
* [decompiler] detect `ja-group!` (primitive)
* corrections.
* more
* use new feature on skel groups!
* find `loop!` as well
* fully fledged `ja` macro & decomp + `loop` detect
* fancy fixed point printing!
* update source
* `:num! max` (i knew i should've done this)
* Update jak1_ntsc_black_label.jsonc
* hi imports
* make compiling the game work
* fix `defskelgroup`
* clang
* update refs
* fix chan
* fix seek and finalboss
* fix tests
* delete unused function
* track let rewrite stats
* reorder `rewrite_let`
* Update .gitattributes
* fix bug with `:num! max`
* Update robotboss-part.gc
* Update goal-lib.gc
* document `ja`
* get rid of pc fixes thing
* use std::abs
* fix typo
* more typo
* shorten discord rpc text
* allow expanding enums after the fact (untested)
* make `game_text` work similar to subtitles
* update progress decomp
* update some types + `do-not-decompile` in bitfield
* fixes and fall back to original progress code
* update `progress` decomp with new enums
* update config files
* fix enums and debug menu
* always allocate (but not use) a lot of particles
* small rework to display mode options
* revert resolution/aspect-ratio symbol mess
* begin the override stuff
* make `progress-draw` more readable
* more fixes
* codacy good boy points
* first step overriding code
* finish progress overrides, game options menu fully functional!
* minor fixes
* Update game.gp
* Update sparticle-launcher.gc
* clang
* change camera controls text
* oops
* some cleanup
* derp
* nice job
* implement menu scrolling lol
* make scrollable menus less cramped, fix arrows
* make some carousell things i guess
* add msaa carousell to test
* oops
* Update progress-pc.gc
* make `pc-get-screen-size` (untested)
* resolution menu
* input fixes
* return when selecting resolution
* scroll fixes
* Update progress-pc.gc
* add "fit to screen" button
* bug
* complete resolutions menu
* aspect ratio menu
* subtitles language
* subtitle speaker
* final adjustments
* ref test
* fix tests
* fix ref!
* reduce redundancy a bit
* fix mem leaks?
* save settings on progress exit
* fix init reorder
* remove unused code
* rename goal project-like files to the project extension
* sha display toggle
* aspect ratio settings fixes
* dont store text db's in compiler
* properly save+load native aspect stuff
* cmake: reduce warning spam especially from libs
* runtime: add FS helper functions
* game: save/restore pc-settings | add original aspect option
* game: overwrite unloadable settings with defaults
* temp: unable to set the games aspect-ratio in the boot else crash?
* runtime: save memcard files to user directory as well
* runtime: fix `pckernel` load order which resolves setting the orig aspect ratio
* lint: format
* cmake: revert warning suppression, it's just causing problems it seems
* fix the order of the rest of `pckernel` and creation of obj file paths
* lint: formatting
* game: don't save settings on startup even if they are corrupted
* add subtitles support (tools + goal + text file).
* add to build system proper
* better handling of line speakers
* billy test subtitles
* adjust timings
* better handling of subtitle timing + citadel subs
* press square to toggle cutscene subtitles
* improve DirectRenderer performance
* clang
* dont error out if there's no user files
* make system supports multiple inputs for subtitles
* clang
* oh no typo!!
* avoid future issues
* fix warp gate crash
* remove no longer necessary code in DirectRenderer
* remove temp prints
* delete triplicate code
* i found a better way
* fix make issues with subtitles
* force avx compilation
* clean up allocator interface to be simpler
* working on functions without spills
* working for all
* fix missing includes for windows
* more windows includes
* initialize regs to zero so printing value unintiailized by game code is repeatable
* [decomp] even more `res`
* [decompiler] make `logand` with pointers and constants return pointer
* [decomp] more work
* update offline tests
* fix tests(?)
* `*res-static-buf*`
* fixes
* fix reference
* [opengoal] make `logand` work directly with pointers
* [decomp] `inspect res-lump`
* use the inline methods
* don't use a math mode for pointers
* [compiler] allow optionally setting disassembly output file
* [x86 disasm] Keep casing consistent between instructions and offsets
* fix warnings in repl lib and add macros to autocomplete
* fix crash on ctrl-c, build runtime as static lib and make goos prompt look fancier
* some tweaks for linux build
* swap to replxx from linenoise
* repl: Implement form auto-tab-completion
* repl: color coordinate the prompts
* repl: Add some basic syntax highlighting, bracket pairs and forms (all one color)
* repl: A more consistent starting screen for the repl
* repl: bug fix for auto-complete
* debug linux
* linting