Commit Graph

30 Commits

Author SHA1 Message Date
Tyler Wilding 006d24b29a
game: Support korean in Jak 2 and Jak 3 (#3988)
Resolves #3075 

TODO before merge:
- [x] Properly draw non-korean strings while in korean mode (language
selection)
- [x] Check jak 3
- [x] Translation scaffolding (allow korean characters, add to Crowdin,
fix japanese locale, etc)
- [x] Check translation of text lines
- [x] Check translation of subtitle lines
- [x] Cleanup PR / some performance optimization (it's take a bit too
long to build the text and it shouldn't since the information is in a
giant lookup table)
- [x] Wait until release is cut

I confirmed the font textures are identical between Jak 2 and Jak 3, so
thank god for that.

Some examples of converting the korean encoding to utf-8. These show off
all scenarios, pure korean / korean with ascii and japanese / korean
with replacements (flags):
<img width="316" height="611" alt="Screenshot 2025-07-26 191511"
src="https://github.com/user-attachments/assets/614383ba-8049-4bf4-937e-24ad3e605d41"
/>
<img width="254" height="220" alt="Screenshot 2025-07-26 191529"
src="https://github.com/user-attachments/assets/1f6e5a6c-8527-4f98-a988-925ec66e437d"
/>

And it working in game. `Input Options` is a custom not-yet-translated
string. It now shows up properly instead of a disgusting block of
glyphs, and all the original strings are hopefully the same
semantically!:
<img width="550" height="493" alt="Screenshot 2025-07-26 202838"
src="https://github.com/user-attachments/assets/9ebdf6c0-f5a3-4a30-84a1-e5840809a1a2"
/>

Quite the challenge. The crux of the problem is -- Naughty Dog came up
with their own encoding for representing korean syllable blocks, and
that source information is lost so it has to be reverse engineered.
Instead of trying to figure out their encoding from the text -- I went
at it from the angle of just "how do i draw every single korean
character using their glyph set".

One might think this is way too time consuming but it's important to
remember:
- Korean letters are designed to be composable from a relatively small
number of glyphs (more on this later)
- Someone at naughty dog did basically this exact process
- There is no other way! While there are loose patterns, there isn't an
overarching rhyme or reason, they just picked the right glyph for the
writing context (more on this later). And there are even situations
where there IS NO good looking glyph, or the one ND chose looks awful
and unreadable (we could technically fix this by adjusting the
positioning of the glyphs but....no more)!

Information on their encoding that gets passed to `convert-korean-text`:
- It's a raw stream of bytes
- It can contain normal font letters
- Every syllable block begins with: `0x04 <num_glyphs> <...the glyph
bytes...>`
- DO NOT confuse `num_glyphs` with num jamo, because some glyphs can
have multiple jamo!
- Every section of normal text starts with `0x03`. For example a space
would be `0x03 0x20`
- There are a very select few number of jamo glyphs on a secondary
texture page, these glyph bytes are preceeded with a `0x05`. These jamo
are a variant of some of the final vowels, moving them as low down as
possible.

Crash course on korean writing:
- Nice resource as this is basically what we are doing -
https://glyphsapp.com/learn/creating-a-hangeul-font
- Korean syllable blocks have either 2 or 3 jamo. Jamo are basically
letters and are the individual pieces that make up the syllable blocks.
- The jamo are split up into "initial", "medial" and "final" categories.
Within the "medial" category there are obvious visual variants:
  - Horizontal
  - Vertical
  - Combination (horizontal + a vertical)
- These jamo are laid out in 6 main pre-defined "orientations":
  - initial + vertical medial
  - initial + horizontal medial
  - initial + combination
  - initial + vertical medial + final
  - initial + horizontal medial + final
  - initial + combination + final
- Sometimes, for stylistic reasons, jamo will be written in different
ways (ie. if there is nothing below a vertical vowel will be extended).
  - Annoying, and ND's glyph set supports this stylistic choice!
- There are some combination of jamo that are never used, and some that
are only used for a single word in the entire language!

With all that in mind, my basic process was:
- Scan the game's entire corpus of korean text, that includes subtitles.
It's very easy to look at the font texture's glyphs and assign them to
their respective jamo
- This let me construct a mapping and see which glyphs were used under
which context
- I then shoved this information into a 2-D matrix in excel, and created
an in-game tool to check every single jamo permutation to fill in the
gaps / change them if naughty dogs was bad. Most of the time, ND's
encoding was fine.
-
https://docs.google.com/spreadsheets/d/e/2PACX-1vTtyMeb5-mL5rXseS9YllVj32BGCISOGZFic6nkRV5Er5aLZ9CLq1Hj_rTY7pRCn-wrQDH1rvTqUHwB/pubhtml?gid=886895534&single=true
anything in red is an addition / modification on my part.
- This was the most lengthy part but not as long as you may think, you
can do a lot of pruning. For example if you are checking a 3-jamo
variant (the ones with the most permutations) and you've verified that
the medial jamo is as far up vertically as it can be, and you are using
the lowest final jamo that are available -- there is nothing to check or
improve -- for better or worse! So those end up being the permutations
between the initial and medial instead of a three-way permutation
nightmare.
- Also, while it is a 2d matrix, there's a lot of pruning even within
that. For example, for the first 3 orientations, you dont have to care
about final vowels at all.
- At the end, I'm left with a lookup table that I can use the encode the
best looking korean syllable blocks possible given the context of the
jamo combination.
2025-08-16 19:35:47 -04:00
Tyler Wilding 647282d896
deps: update `fmt` to `11.1.4` (#3880)
Fixes #3886
2025-04-12 15:59:13 -04:00
Tyler Wilding eb703ee96e
REPL related improvements and fixes (#3545)
Motivated by - https://github.com/open-goal/opengoal-vscode/pull/358

This addresses the following:
- Fixes #2939 spam edge-case
- Stop picking a different nREPL port based on the game mode by default,
this causes friction for tools in the average usecase (having a REPL
open for a single game, and wanting to connect to it). `goalc` spins up
fine even if the port is already bound to.
- For people that need/want this behaviour, adding per-game
configuration to the `repl-config.json` is on my todo list.
- Allows `goalc` to permit redefining symbols, including functions. This
is defaulted to off via the `repl-config.json` but it allows you to for
example, change the definition of a function without having to restart
and rebuild the entire game.
![Screenshot 2024-06-02
124558](https://github.com/open-goal/jak-project/assets/13153231/28f81f6e-b7b8-4172-9787-f96e4ab1305b)
- Updates the welcome message to include a bunch of useful metadata
up-front. Cleaned up all the startup logs that appear when starting
goalc, many of whom's information is now included in the welcome
message.
  - Before:

![image](https://github.com/open-goal/jak-project/assets/13153231/814c2374-4808-408e-9ed6-67114902a1d9)

  - After:
![Screenshot 2024-06-01
235954](https://github.com/open-goal/jak-project/assets/13153231/f3f459fb-2cbb-46ba-a90f-318243d4b3b3)
2024-06-03 00:14:52 -04:00
Tyler Wilding 60db0e5ef9
deps: update `fmt` to latest version (#3403)
This updates `fmt` to the latest version and moves to just being a copy
of their repo to make updating easier (no editing their cmake / figuring
out which files to minimally include).

The motivation for this is now that we switched to C++ 20, there were a
ton of deprecated function usages that is going away in future compiler
versions. This gets rid of all those warnings.
2024-03-05 22:11:52 -05:00
Tyler Wilding 4101d5d80e
tests: add jak3 typeconsistency test and ensure offline tests are working (#3310) 2024-01-16 00:15:33 -05:00
water111 395c98db19
[goalc] Cleaned up speedups (#3066)
Started at 349,880,038 allocations and 42s

- Switched to making `Symbol` in GOOS be a "fixed type", just a wrapper
around a `const char*` pointing to the string in the symbol table. This
is a step toward making a lot of things better, but by itself not a huge
improvement. Some things may be worse due to more temp `std::string`
allocations, but one day all these can be removed. On linux it saved
allocations (347,685,429), and saved a second or two (41 s).
- cache `#t` and `#f` in interpreter, better lookup for special
forms/builtins (hashtable of pointers instead of strings, vector for the
small special form list). Dropped time to 38s.
- special-case in quasiquote when splicing is the last thing in a list.
Allocation dropped to 340,603,082
- custom hash table for environment lookups (lexical vars). Dropped to
36s and 314,637,194
- less allocation in `read_list` 311,613,616. Time about the same.
- `let` and `let*` in Interpreter.cpp 191,988,083, time down to 28s.
2023-10-07 10:48:17 -04:00
Tyler Wilding 00ac12094e
goalc/repl: cleanup of goalc/REPL code and some QoL improvements (#2104)
- lets you split up your `startup.gc` file into two sections
  - one that runs on initial startup / reloads
  - the other that runs when you listen to a target
- allows for customization of the keybinds added a month or so ago
- removes a useless flag (--startup-cmd) and marks others for
deprecation.
- added another help prompt that lists all the keybinds and what they do

Co-authored-by: water <awaterford111445@gmail.com>
2023-01-07 11:24:02 -05:00
Tyler Wilding ac3c4e59b0
goalc/repl: Allow hot-loading files via `ml` with just the object name (#2036)
This allows you to not have to define the entire file path to a source
file to re-compile and load it. Technically a stop-gap until editor
tools are developed around writing OpenGOAL.


![image](https://user-images.githubusercontent.com/13153231/203196148-de61cf4b-42c8-43dc-a7fd-80e6ba6f5ac2.png)

As opposed to `(ml "goal_src/jak2/engine/game/main.gc")` (which still
works)

This is accomplished via the following config (connection attempts is
irrelevant):
```json
{
  "numConnectToTargetAttempts": 1,
  "jak2": {
    "asmFileSearchDirs": [
      "goal_src/jak2"
    ]
  }
}
```

This also provides a way to make game-specific configurations for the
REPL fairly easily.
2022-11-29 19:22:22 -05:00
water111 46f11b1b47
add level ref test (#1973)
There's a few functions that have to be excluded, but I think it's still
useful to have this.
2022-10-15 18:36:32 -04:00
Tyler Wilding 4d751af38e
logs: replace every `fmt::print` with a `lg` call instead (#1368)
Favors the `lg` namespace over `fmt` directly, as this will output the
logs to a file / has log levels.

I also made assertion errors go to a file, this unfortunately means
importing `lg` and hence `fmt` which was attempted to be avoided before.
But I'm not sure how else to do this aspect without re-inventing the
file logging.

We have a lot of commented out prints as well that we should probably
cleanup at some point / switch them to trace level and default to `info`
level.

I noticed the pattern of disabling debug logs behind some boolean,
something to consider cleaning up in the future -- if our logs were more
structured (knowing where they are coming from) then a lot this
boilerplate could be eliminated.

Closes #1358
2022-10-01 11:58:36 -04:00
water111 97637d7902
fix for windows newlines (#1834) 2022-09-01 22:44:43 -04:00
Tyler Wilding 6446389263
extractor: cleanup, support unicode properly, and add multi-game support (#1609)
* extractor: refactor and cleanup for multi-game support

* deps: switch to `ghc::filesystem` as it is utf-8 everywhere by default

* extractor: finally working with unicode

* unicode: fix unicode cli args on windows in all `main` functions
2022-07-05 20:38:13 -04:00
Tyler Wilding 2d595c1ac0
lint: add include sorting config to clang-format (#1517) 2022-06-22 23:37:46 -04:00
water111 5bd0b735a5
Add extractor tool (#1276)
* first attempt

* fix

* zip to tar

* windows

* try again, std::filesystem sucks

* std::filesystem is still garbage

* std::filesystem is terrible

* std::filesystem continues to waste my time

* again

* neadsflaldksal;df
2022-04-03 19:17:03 -04:00
water111 78cde74d5a
update readme and fix always playing str (#1139)
* update readme deps

* replace assert

* bump timeout

* fix memory corruption in kernel

* use unknown if level name is invalid
2022-02-08 19:02:47 -05:00
ManDude 5b44aece75
random fixes + support `clang-cl` on visual studio (#1129)
* delete unused shaders

* hide some options in debug menu

* change fullscreen logic a bit

* add "all actors" toggle

* borderless fix and fix alpha in direct renderer untextured (do we need a separate shader for that?)

* fix fuel cell orbit icons in widescreen

* fix `curve` types

* refs

* fix levitator task...

* fix some task stuff

* update font code a bit (temp)

* cmake, third-party and visual studio overhaul

* Update .gitmodules

* update modules

* clone repos

* fix encoding in zydis

* where did these come from

* try again

* add submodule

* Update 11zip

* Update 11zip

* Update 11zip

* delete

* try again

* clang

* update compiler flags

* delete 11zip. go away.

* Create memory-dump-p2s.py

* properly

* fix minimum architecture c++ compiler flags

* fix zydis

* oops

* Update all-types.gc

* fix clang-cl tests

* make "all actors" work better, entity debug qol

* update game-text conversion code to be more modularized

* Create vendor.txt

* fix typos and minor things

* update refs

* clang

* Attempt to add clang-cl support to vs2019 and CI

* vs2022 + clang-cl

* srsly? fix clang build

* Update launch.vs.json

* extend windows CI timer
2022-02-07 19:15:37 -05:00
water111 54c6ddbab9
pass filename through (#1080) 2022-01-15 23:31:07 -05:00
ManDude 59d071eccb
[goos/goal] user profiles (#977)
* implement user profiles

* example usage!

* typo

* use a cond

* fix errors

* fixes

* fix potential commandline args disaster
2021-11-24 00:44:04 -05:00
ManDude 0e96cdb6c5
[de/compiler] New text tool (#918)
* fix `citb-drop-plat` a bit and PAL `fisher`

* Clean up `pc-pad-utils`. Looks so clean!

* Increase process stacks by ~2x

* Convert `game_text` custom encoding to and from a readable one (UTF-8)

* clang

* add missing characters

* support all diacritic variants

* fix a character

* remaining cases

* fix tests

* fix memory leak?

* clang

* add custom characters w/ diacritics

* Update all-types.gc

* robustness

* minor bug

* move custom font decoding function to `FontUtils.cpp`

* Move valid source chars patching to Reader constructor
2021-10-19 00:02:18 -04:00
water111 188373a3f6
decompile some drawable stuff and fix a few small bugs (#859) 2021-09-28 19:24:09 -04:00
water111 7a5562106e
Compiler performance improvements and error clean-up (#782)
* compiler cleanup and error improvement

* fix test
2021-08-24 22:15:26 -04:00
water111 322a4ed9b2
Fix compiler crashes and improve return statements. (#652)
* fix a few small bugs

* fi

* fix merge conflict
2021-06-30 00:11:46 -04:00
water111 542edfb164
[compiler/decompiler] Take the address of a variable (#554)
* support taking the address of variables

* partially working stack variables

* implement type cast stuff

* remove final
2021-06-04 13:43:19 -04:00
water111 0f0902eabf
add config option for changing cond splitting behavior (#522) 2021-05-24 19:52:19 -04:00
water111 fe336b7b5f
[compiler] fix warnings in repl lib and add macros to autocomplete (#317)
* fix warnings in repl lib and add macros to autocomplete

* fix crash on ctrl-c, build runtime as static lib and make goos prompt look fancier

* some tweaks for linux build
2021-03-11 12:54:16 -05:00
Tyler Wilding 8bba3d7fd7
REPL: Add clear-screen / auto-complete / basic hints and syntax highlighting (#316)
* swap to replxx from linenoise

* repl: Implement form auto-tab-completion

* repl: color coordinate the prompts

* repl: Add some basic syntax highlighting, bracket pairs and forms (all one color)

* repl: A more consistent starting screen for the repl

* repl: bug fix for auto-complete

* debug linux

* linting
2021-03-07 23:41:21 -05:00
Tyler Wilding 2eca9ab801
repl: Support cross-session history (#301) 2021-03-03 00:05:13 -05:00
water111 40d328f4eb
[Decompiler] Test framework for decompiler regression tests and gcommon tests (#200)
* test framework for pre-expression compact stuff

* check ordering

* more tests

* final tests gcommon
2021-01-18 13:33:32 -05:00
water111 ae053870c3
support escaping any character (#125) 2020-11-17 15:25:48 -05:00
water111 abcd444a3b
Add deftype (#48)
* initial deftype implementation

* fix library setup for windows

* implement deftype

* fix memory bug

* fix formatting
2020-09-17 21:47:52 -04:00