
Re: Restoration of a few games' EXEs versions

Posted: August 30th, 2015, 1:05 pm
by MrFlibble
This is very interesting! Amazing work on restoring Super 3D Noah's Ark!

Re: Restoration of a few games' EXEs versions

Posted: August 31st, 2015, 1:03 pm
by MrFlibble
BTW, would you be interested in restoring executables for the different versions of Rise of the Triad and In Pursuit of Greed? I think In Pursuit of Greed could be interesting because the game changed quite a lot from the early demo versions to the final release.

On another note, do you plan to make an enhanced DOS port of Wolf3D based on your findings generally and the restored S3DNA code in particular?

Re: Restoration of a few games' EXEs versions

Posted: September 1st, 2015, 3:28 pm
by NY00123
MrFlibble wrote:This is very interesting! Amazing work on restoring Super 3D Noah's Ark!
Thanks for showing interest in this work!
MrFlibble wrote:BTW, would you be interested in restoring executables for the different versions of Rise of the Triad and In Pursuit of Greed? I think In Pursuit of Greed could be interesting because the game changed quite a lot from the early demo versions to the final release.
This is a nice idea and could make for a good challenge. One possible difficulty is that, so far, I've dealt only with 16-bit real mode applications for DOS built using Borland C++ 2.0/3.0/3.1. ROTT and Greed surely don't seem to fit in this category. There may also be a little bit of difficulty in the case of ROTT, as the DOS/4GW extender appears to be built into the EXE. It is at least separate for Greed, though.

Still, in the cases of Blake Stone or Super 3-D Noah's Ark, I preferred not to publicly reveal anything about the work (except to maybe one or a few individuals) until I thought it was more or less possible to recreate the original EXEs' behaviors, as I can never be entirely sure what will happen while trying to recreate some EXE. I think this won't change for a while, unless it's maybe a minor thing.

It may be a good chance to show just a few small examples of unexpected challenges that one may encounter. Consider the sources as available from https://github.com/FlatRockSoft/:
- With Borland C++ 2.00 and LZEXE 0.91, original Hovertank and Catacomb 3-D (v1.22 for the latter) can be recreated from the sources in the above repository, byte-by-byte.
- A somewhat odd change is required for rebuilding the ABYSGAME.EXE file from The Catacomb Abyss v1.24. As with the games above, Borland C++ 2.00 and LZEXE 0.91 should be used. The definition of STARTTILE8 in GFXE_ABS.EQU (but not GFXE_ABS.H) should be changed from 315 to 314 (https://github.com/FlatRockSoft/Catacom ... S.EQU#L652). The reason is that, for whatever cause, the value of STARTTILE8 as referenced in ID_VW_AE.ASM:VW_DrawTile8 (https://github.com/FlatRockSoft/Catacom ... E.ASM#L191) differs by 1 from what's found in the original EXE. I think this function is never called anyway, though, so this was never observed in the game.
- As a side-note, with a few more changes, the file CATABYSS.EXE from The Catacomb Abyss v1.13 (Shareware release) can be recreated as-is.
- I've gotten stuck with the later two Catacomb games, though. The main difficulty with The Catacomb Armageddon v1.02 (ARMGAME.EXE) is this: there are two lines of code in C5_DRAW.C:DrawVWall referring to walllight1 and walldark1 (see e.g., https://github.com/FlatRockSoft/Catacom ... RAW.C#L748). Although Borland C++ 2.0 generally appears to recreate the code exactly as in the original EXE, these two code lines translate to slightly different machine code. By swapping the places of wallptr->color and wall_anim_pos[wallptr->color], I can at least make sure that the rest of the code isn't affected (as the same storage is used for the generated code). More precisely, I think there may still be a problem in another generated OBJ file, but recreating the corresponding OBJ file after removing it may do the job.
- This same difficulty, along with a few more which I don't recall, can also be reproduced with the code for The Catacomb Apocalypse v1.01 (APOCGAME.EXE).
- Also, if you check out the instructions for building SODFOR14.EXE from the modified Wolfenstein 3D / Spear of Destiny sources (using Borland C++ 3.1 and LZEXE91 for this specific version), you can see that after the EXE is built, WL_GAME.OBJ should be removed and then recreated, along with the EXE (but other OBJs should remain untouched). I have no idea why this is the case, although it looks like some compiler quirk. It basically comes down to the compiler's choice between division by 2 and right shift by 1 in the generated machine code.
MrFlibble wrote:On another note, do you plan to make an enhanced DOS port of Wolf3D based on your findings generally and the restored S3DNA code in particular?
This is also an interesting idea, although I don't see myself doing this any time soon.
Nevertheless, this repository can still be used in order to verify a lot of things, like certain behaviors of a source port being different from the (any) original.

Re: Restoration of a few games' EXEs versions

Posted: September 23rd, 2015, 2:45 am
by NY00123
ok, this is more like a minor side note, but did anybody notice that some of the Noah tunes play in the DOS version with different tempos, compared to other versions (including the ECWolf-based re-release)?


diff -r 789c61553acd w3d_plus/ID_SD.C
--- a/w3d_plus/ID_SD.C   Fri Aug 28 16:36:20 2015 +0300
+++ b/w3d_plus/ID_SD.C   Wed Sep 23 00:40:22 2015 +0300
@@ -2126,9 +2126,9 @@
       {
       case 0x51:
          length = MIDI_VarLength();
-         tempo = ((long)(*midiData)<<16) + (long)((*(midiData+1))<<8) + (*(midiData+2));
+         tempo = ((long)(*midiData)<<16) + ((long)(*(midiData+1))<<8) + (*(midiData+2));
          midiTimeScale = (double)tempo/2.74176e5;
-         midiTimeScale *= 1.1;
+         //midiTimeScale *= 1.1;
          midiData += length;
          break;
       case 0x2F:
A directory with the same patch, an EXE built after applying the patch and the following description may currently be found in the w3d_plus/MISC subdirectory of the same repository: https://bitbucket.org/NY00123/gamesrc-v ... at=default

What I think happened is this: Originally, the cast to long for the middle byte was done in the wrong place, leading to an integer overflow in case *(midiData+1), an unsigned 8-bit byte, had the value of 128 or greater. While *(midiData+1) is internally promoted to int (a 16-bit signed integer in this case) for the shift by 8 bits to the left, this isn't sufficient for the later cast to long, since a sign extension is done there.

Example: if *(midiData+1) is, in binary, the value of 11000100b, then after shifting we get the 16-bit value of 1100010000000000b. By casting from a 16-bit SIGNED int to a 32-bit signed long int (i.e. doing a sign extension to 32 bits), we get the bit pattern of 11111111111111111100010000000000b. This is not the expected result, though, as it should've really been just 00000000000000001100010000000000b stored in a long int.
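To make the difference easier to try on a modern compiler, where int is usually 32 bits wide, here is a minimal sketch that emulates the 16-bit int of the original build with int16_t. The function names tempo_buggy and tempo_fixed are made up for this illustration and do not appear in ID_SD.C; only the two expressions mirror the patch above.

```c
#include <stdint.h>

/* Emulates the ORIGINAL expression as built with a 16-bit int:
   the middle byte is shifted as a 16-bit signed int, and only the
   already-overflowed result is widened to long (sign extension).
   Note: converting an out-of-range value to int16_t is
   implementation-defined in C; common compilers wrap, as assumed here. */
long tempo_buggy(const uint8_t *midiData)
{
    int16_t middle = (int16_t)(uint16_t)((unsigned)midiData[1] << 8);
    return ((long)midiData[0] << 16) + (long)middle + midiData[2];
}

/* Mirrors the PATCHED expression: the byte is widened to long first,
   so the shift cannot overflow into the sign bit. */
long tempo_fixed(const uint8_t *midiData)
{
    return ((long)midiData[0] << 16) + ((long)midiData[1] << 8) + midiData[2];
}
```

For the bytes 07 C4 00, tempo_fixed gives 508928 while tempo_buggy gives 443392, i.e. exactly 65536 less. For a middle byte below 128 (e.g. 07 53 00) both agree, which would fit the observation that only some tunes were affected.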

The multiplication of midiTimeScale by 1.1 is probably an attempt to get the correct tempo, at least for certain musical tracks, as the actual cause of the bug was apparently not found back in the 90s.

Obviously this isn't exactly within the goal of recreating the original EXEs' behaviors, including all quirks, but it may still be a bit relevant since this is a minor modification of the recreated S3DNA sources.

Re: Restoration of a few games' EXEs versions

Posted: September 23rd, 2015, 12:04 pm
by developertn
Thank you God for the rest of my life because you have given me the best for the remainder of my life!

Concerning the sound, that is very interesting. According to Borland C++ 2.0 manual the sound frequency of 7 Hz continuous exposure stops an egg from hatching and growing. So be careful not to use that frequency; otherwise it looks like you are doing good work.

Bless Jesus Christ, then my real mom Huong Thi Vu.

Honours to my real dad Nguyen Binh Thuy.
Honours to my real mom Huong Thi Vu.

Loving greetings to my two dear sisters Nguyen Khoa Thi, and Nguyen Khoa Thuyen.

Thank you for being safe and happy my real friends and real families#!! :)

Re: Restoration of a few games' EXEs versions

Posted: December 25th, 2015, 1:04 pm
by NY00123
Hi again,

I think Dec 25 may be a good time for an update!

This time, it's about Blake Stone: Aliens of Gold again. With the possibility of a few really minor differences, you can now recreate the EXEs from the 1-episode and 6-episode releases, versions 1.0 and 2.0.

To begin with, it's probably not a surprise that version 2.1 has very few differences from 2.0 in terms of code. The chunk definitions as given in GFXV_BS1.H are exactly the same for these. If I haven't missed anything, I think these are all of the differences in the code:
- The addition of destPath/tempPath handling.
- The new GS_TAB_ROTATED_MAP flag (you can decide if Tab shows a rotated map in-game, or Shift+Tab).
- The "flags" field of the gametype struct was changed from an unsigned to a long. By the way, this field's type is still unsigned in Planet Strike.
- Misc. changes in GetNewActor and PlayLoop (3D_PLAY.C).

Now, it's version 1.0 where things changed more significantly. I don't see myself bothering to list all of the differences, but I'll give a few examples, one of them possibly being the most significant. Let's begin with simple cases:
- Some ID_SD.C code was moved to JM_FREE.C, before being relocated back to ID_SD.C for v2.0. I really have no explanation for this.
- A few bits of ID_SD.C are closer to code from the Wolfenstein 3D sources in v1.0. This also applies to at least one 3D_DRAW.C function (ScalePost).
- If you check out ID_PM.C from the released Wolfenstein 3D sources, or JM_FREE.C from the Planet Strike sources as originally released, you should find two mentions of the comment "AJR: bugfix 10/8/92" in PML_StartupXMS. While recreating different versions of Wolfenstein 3D, I found out this was fixed before making any release of Wolfenstein 3D or Spear of Destiny with version number 1.4. It looks like this was similarly applied in-between versions 1.0 and 2.0 of Blake Stone: Aliens of Gold.

One known difference between versions 1.0 and 2.0, is that 2.0 requires less memory (required amount changed from 605k to 580k and then to 540k): http://litude.webege.com/apogee/version ... ersion_2.0

It probably sounds a bit too silly to think that modifying the value of MIN_MEM_NEEDED is all that was needed: https://bitbucket.org/NY00123/gamesrc-v ... D_DEF.H-63

In this case, I think that 3D_SCALE.C may have the answer. Between versions 1.0 and 2.0, 3D_SCALE.C was mostly rewritten (possibly along with some related code pieces). In fact, for the recreation of v1.0, I simply copy-and-pasted WL_SCALE.C from the Wolfenstein 3D sources and modified it as required.

It should be known that version 1.0 lacked some features, like the way lighting is done in later versions. This is also a good explanation for differences in performance: v1.0 may feel much smoother than v2.0 and later on a given machine, unless you disable the lighting. That may be one reason for the changes to 3D_SCALE.C; I don't know.

But there's also the following point. Anybody sufficiently familiar with the Wolfenstein 3D codebase, or at least WL_SCALE.C, should be familiar with the function BuildCompScale, which is responsible for generating scaling code at runtime. It's called as a part of a chain of function calls NewViewSize -> SetViewSize -> SetupScaling -> BuildCompScale on certain occasions, like changing the view size during gameplay. BuildCompScale is used in Wolfenstein 3D and AOG v1.0, but is not used in later releases of Blake Stone games. I can't say I get all the details, but it clearly feels like a rewrite to me.

To the few who may wonder: According to http://litude.webege.com/apogee/version/blake.html, versions 1.0 and 2.0 were also released in 3-episode form. In the case of Wolfenstein 3D, the 6-episode EXEs were known to be shared. For each original 3-episode build of Blake Stone: Aliens of Gold, I *guess* that MAPTEMP_CHECKSUM is all that's different from the value for the corresponding 6-episode build. I can't say this for sure at the moment, though.

One final note: By looking at GAMEVER.H (or even just the raw EXEs, at least after unpacking), you can find out the 1-episode v1.0 EXE was built one month before the corresponding 6-episode EXE. As expected, there are a few changes in the code between these versions (currently you may find these by looking for mentions of GAMEVER_RESTORATION_BS1_100), but these are really just a few.

Re: Restoration of a few games' EXEs versions

Posted: December 26th, 2015, 5:20 am
by MrFlibble
Thanks for the update! This is some really cool research.

BTW, do you know if any similar detailed descriptions of code "evolution" exist for other games? The Cutting Room Floor Wiki aims to catalogue differences in assets/media, but frankly I don't know anything related to code except listing differences that can be observed in game behaviour.

Re: Restoration of a few games' EXEs versions

Posted: December 26th, 2015, 12:10 pm
by NY00123
MrFlibble wrote:Thanks for the update! This is some really cool research.
You're welcome!
MrFlibble wrote:BTW, do you know if any similar detailed descriptions of code "evolution" exist for other games? The Cutting Room Floor Wiki aims to catalogue differences in assets/media, but frankly I don't know anything related to code except listing differences that can be observed in game behaviour.
As hinted earlier in this topic, I think it usually is not very interesting for the audience (and the programmer) to have a list of all differences between versions, smaller or larger, in a change log. I do guess, though, that all changes can be found for games that have been open source from the very beginning, especially with sources being updated in a central public repository. Open source ports of games originally available without source code are probably covered as well.

I've just found the following case of the ReDMCSB project, though. This covers games for which no original source was released, yet someone set out to reverse engineer and, almost precisely, recreate the original executables of more than one version: http://www.dungeon-master.com/forum/vie ... 25&t=29805

Re: Restoration of a few games' EXEs versions

Posted: March 6th, 2016, 2:05 pm
by MrFlibble
I think I forgot to ask, do you have plans to process the executables of Corridor7 and Operation Body Count in a similar fashion?

If I understood your method correctly, you have combined reverse-engineering of game executables with references to the source code. I wonder how much more difficult it would be to restore the code from scratch, without any source code? For example, Bethesda Softworks' The Elder Scrolls: Arena and The Terminator: Rampage use a 2.5D engine with a feature set largely similar to the Wolf3D engine. How much work do you estimate it would require to recreate the source code of these games by decompiling/reverse-engineering the respective executables (not necessarily with the aim of getting the code to compile back into byte-by-byte identical EXEs)?

Re: Restoration of a few games' EXEs versions

Posted: January 6th, 2017, 3:43 pm
by NY00123
Hey there,

Sorry for not responding earlier! Guess I can still add some comments of mine.
MrFlibble wrote:I think I forgot to ask, do you have plans to process the executables of Corridor7 and Operation Body Count in a similar fashion?
While this can surely be tempting (although it depends on the mood), and these games are also based on Wolf3D, I'm afraid that I probably won't take care of any of these soon.

To compare, when I did the work on S3DNA, this was a title whose original DOS sources are assumed to be lost for good. The latter was stated by Blzut3, the developer of ECWolf who also worked with Wisdom Tree to make the last official S3DNA port.

When it comes to Corridor 7 and Operation Body Count, though, I think it isn't that clear to me that the sources are lost for good. So, even though the chances any of these is available are slim, *and* the chances of a release would also be slim if anything is available, I prefer to not publicly come up with recreation efforts if there's still a (slim) chance.

Of course, recreating different-but-similar versions based on available sources (as done e.g., with Wolf3D) is acceptable to me.
MrFlibble wrote:If I understood your method correctly, you have combined reverse-engineering of game executables with references to the source code. I wonder how much more difficult it would be to restore the code from scratch, without any source code?
This greatly depends on multiple factors, including:
- Familiarity with the kind of EXE you're working with (e.g., 16-bit real mode or 32-bit protected mode EXE, the compiler originally used).
- The complexity and/or size of the EXE's contents.
- The availability of embedded debugging symbols, and/or earlier research work done by anybody else.

In general, the more you know, the better. As expected, familiarity with the codebase of a common engine (or engine components) is a great example of this, but earlier experience may greatly assist in general.
MrFlibble wrote:For example, Bethesda Softworks' The Elder Scrolls: Arena and The Terminator: Rampage use a 2.5D engine with a feature set largely similar to the Wolf3D engine. How much work do you estimate it would require to recreate the source code of these games by decompiling/reverse-engineering the respective executables (not necessarily with the aim of getting the code to compile back into byte-by-byte identical EXEs)?
I feel like the time needed for the work depends on more than one factor, so it can be a bit difficult to estimate. Maybe about half a year for an early game from ~1990, *if* there is no use of features like encryption, and it doesn't take a lot of time to figure out the file formats in use (already knowing these is of course the simplest case). Oh, and that's under the assumption that, on average, at least a few hours are spent on this per day, of course.

Re: Restoration of a few games' EXEs versions

Posted: January 7th, 2017, 1:12 pm
by MrFlibble
NY00123, thanks for a detailed reply!

With my renewed interest in source ports and engine recreations, I discovered a project called KeeperFX which uses an interesting approach:
The problem with remaking games is that usually the remakes are abandoned at some stage and never finished. This means that all the work put into such remake is lost, as it usually isn't finished enoughly to be used. Making games requires lots of time, and often volunteers do not have enough will to finish the project.

Bearing this in mind, I've decided I won't try to remake the game from start. Instead, I've learned binary formats of EXE and DLL files, and modified the Dungeon Keeper executable file to become a DLL.

With my new DLL, I was able to create very simple executable file which may be used as complete code to run the game. Now I'm incrementally rewriting DK; functions which are not yet rewritten are called from the DLL, so the project functions like whole game, even though it wasn't completely rewritten yet.


Many functions are already rewritten and fixed. Structure of the code allows to take advantages of every rewritten part, by fixing bugs and making new functions. The project is open-source, and its code is downloadable here. [emphasis added]
Apparently, this is a viable alternative to decompilation of the executable. Hendricks266 suggested that he plans on producing a version of Blood (and possibly other Build engine games) that runs natively on modern systems in a similar fashion.

Re: Restoration of a few games' EXEs versions

Posted: January 9th, 2017, 5:53 pm
by K1n9_Duk3
I doubt that turning DOS executables into DLLs is going to work, at least as far as executing low-level hardware stuff is concerned. On WinXP and above, you can't even read/write ports from a Win32 program. Only device drivers are allowed to execute such instructions. I don't know a thing about the structure of DLLs and EXEs, so I can't say if it would be possible to actually convert a DOS executable into a Windows DLL in the first place.

As for recreating source code from scratch, I can confirm that it would take roughly half a year to get something decent. My first attempt at disassembling a game and porting the assembly code back to C code took about two months. The game's file formats were already well documented, but I had never written any x86 assembly code before, nor had I programmed any DOS games in C. I used a manual to look up what each assembly instruction does, and I wrote little test programs and disassembled them to figure out how the compiler translates certain code patterns into assembly code. After those two months, I had code that compiled properly, but it had tons of bugs.

The biggest issue is that when you are manually converting assembly code into C code, you are bound to make mistakes here and there. Unless you manage to reproduce the original EXE byte-by-byte, you can't say for sure that the C code you created is equivalent to the C code that was used to build the original EXE. That's actually one of the big problems in computer science. If you're lucky, the game has a demo record/playback feature that makes it easier to detect where the new EXE differs from the old one. When NY00123 (and others) were working on the Commander Keen 1-3 recreation, they actually made a patch that unlocked the demo record/playback routines in the original executable, so that the reimplementation could be tested against the original game. But that will only prove that the new EXE differs. It can't actually prove that the new EXE is equivalent to the old one.
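As a rough illustration of the demo-based comparison described above, the following sketch assumes that both the original and the recreated EXE can somehow dump one checksum of the game state per frame while playing back the same demo. That dumping mechanism, and the name first_divergent_frame, are hypothetical and not taken from any of the projects mentioned here.

```c
#include <stddef.h>
#include <stdint.h>

/* Compares two per-frame state checksum traces recorded from the same demo.
   Returns the index of the first frame where they differ, or -1 if no
   difference was seen within the given number of frames. */
long first_divergent_frame(const uint32_t *original,
                           const uint32_t *recreated,
                           size_t frames)
{
    size_t i;
    for (i = 0; i < frames; ++i)
        if (original[i] != recreated[i])
            return (long)i;
    return -1; /* no difference found in THIS recording only */
}
```

A result of -1 only says that this particular recording did not expose a difference; as noted above, it is not a proof that the two programs are equivalent.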

In some cases, I had to disassemble my new executable to track down the bugs. Stupid stuff like having to use "myfunction((x >> 8))" instead of "myfunction(x >> 8)" because the stupid compiler decided to convert x to a byte value before doing the shift operation... :evil:

Re: Restoration of a few games' EXEs versions

Posted: January 20th, 2017, 7:49 am
by NY00123
K1n9_Duk3 wrote:I doubt that turning DOS executables into DLLs is going to work, at least as far as executing low-level hardware stuff is concerned. On WinXP and above, you can't even read/write ports from a Win32 program. Only device drivers are allowed to execute such instructions. I don't know a thing about the structure of DLLs and EXEs, so I can't say if it would be possible to actually convert a DOS executable into a Windows DLL in the first place.
I think that in Hendricks266's example, the idea is running the engine code (which is open source) as-is, while most of the Blood-specific code (from BLOOD.EXE) will be run under an x86 emulator. Some JIT compilation may be used for the latter. Basically, it's a mix of emulated/translated x86 and native code, while hardware accesses (e.g., port reads/writes) are appropriately translated.
K1n9_Duk3 wrote:As for recreating source code from scratch, I can confirm that it would take roughly half a year to get something decent. My first attempt at disassembling a game and porting the assembly code back to C code took about two months. The game's file formats were already well documented, but I had never written any x86 assembly code before, nor had I programmed any DOS games in C. I used a manual to look up what each assembly instruction does, and I wrote little test programs and disassembled them to figure out how the compiler translates certain code patterns into assembly code. After those two months, I had code that compiled properly, but it had tons of bugs.
It is great to read about this from you! This includes the point about learning x86 assembly and DOS-related stuff on the way. I do share the sentiment, that getting something like this to compile is less difficult than getting it to compile *and* run well.
K1n9_Duk3 wrote:The biggest issue is that when you are manually converting assembly code into C code, you are bound to make mistakes here and there. Unless you manage to reproduce the original EXE byte-by-byte, you can't say for sure that the C code you created is equivalent to the C code that was used to build the original EXE. That's actually one of the big problems in computer science. If you're lucky, the game has a demo record/playback feature that makes it easier to detect where the new EXE differs from the old one.
Oh yeah, these are all true. There are cases in which a sufficiently similar EXE (say, differing by just 1000 bytes) may be ok, but getting even to that is quite a challenge, especially without having the original sources. As should be clear to some, in order to even try to reproduce the original machine code as-is, there are various factors that one should take into account apart from the source code itself: at the very least, the project settings and the versions of the tools in use (e.g., compiler and linker).
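To put a number like "differing by just 1000 bytes" on a rebuilt EXE, a simple position-by-position comparison of the two unpacked images can be used. This is just a sketch; count_differing_bytes is a name made up here, and loading the files into memory (as well as unpacking LZEXE-compressed EXEs first) is left out.

```c
#include <stddef.h>

/* Counts the positions at which two byte buffers differ.
   Bytes beyond the end of the shorter buffer all count as differences. */
size_t count_differing_bytes(const unsigned char *a, size_t a_len,
                             const unsigned char *b, size_t b_len)
{
    size_t common = a_len < b_len ? a_len : b_len;
    size_t longer = a_len < b_len ? b_len : a_len;
    size_t diffs = longer - common; /* the tail of the longer buffer */
    size_t i;

    for (i = 0; i < common; ++i)
        if (a[i] != b[i])
            ++diffs;
    return diffs;
}
```

Keep in mind that such a count is positional: a single inserted or removed instruction shifts everything after it, so the number can look far worse than the actual difference in the code.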
K1n9_Duk3 wrote:When NY00123 (and others) were working on the Commander Keen 1-3 recreation, they actually made a patch that unlocked the demo record/playback routines in the original executable, so that the reimplementation could be tested against the original game. But that will only prove that the new EXE differs. It can't actually prove that the new EXE is equivalent to the old one.
Heh, I forgot that there's a topic about the Keen 1-3 recreation in these forums! Re-using the demo routines surely made me discover a few differences from vanilla Keen 1-3, including some jump height miscalculations, as well as a minor episode-specific thing in the Vorticon's behavior (the Vorticon's state changes to "jumping" upon falling from a floor, but only in Keen 2-3).

So yeah, I should probably give a short summary of the process of getting that reimplementation up:
1. First, QuantumG, who has great knowledge about reverse engineering, did some RE work for Keen 1 by modifying the DOSBox emulator. Currently his work can be found here: http://www.quantumg.net/keen1.c.txt
2. Afterwards, lemm, who's known for his earlier patching work, as well as his later source modding efforts, took QuantumG's work, including a lot of readable function and variable names, and applied most of it (if not all) to a disassembly of KEEN1.EXE (in x86 ASM).
3. Near the beginning of September of 2012 (possibly in August), I found this disassembly. A bit later, I started doing some work on the reimplementation of Keen 1. I also used a few chunks of Commander Genius code for decompressing data files (this might be the most difficult to understand code piece in the original DOS EXE, from what I recall).
4. I uploaded an initial release near the end of October 2012. So, maybe a bit less than 2 months, *with* all the earlier work done by other individuals beforehand. Also, I did have earlier experience with 16-bit x86 ASM (from patching Bio Menace 1), which must have helped.
5. About a month later (November), I got a feature-complete reimplementation of Keen 1.
6. Later, lemm assisted with adding support for Keen 2 and 3. As expected, most of the coding was already done while reimplementing Keen 1.
K1n9_Duk3 wrote:In some cases, I had to disassemble my new executable to track down the bugs. Stupid stuff like having to use "myfunction((x >> 8))" instead of "myfunction(x >> 8)" because the stupid compiler decided to convert x to a byte value before doing the shift operation... :evil:
This reminds me of a case where I had to force some right shift being signed, or maybe unsigned; I don't recall now. This was required in order to make a bridge in some Keen 2 level operate as expected (when using a switch).
There are also the MIDI-related differences in Super 3-D Noah's Ark posted by me earlier, including something to do with integer casting: viewtopic.php?p=7475#p7475
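For anyone wondering why the signedness of a right shift can matter at all, here is a small sketch using 16-bit values, matching the int size of the old compilers. It is not the actual Keen 2 code (which I don't recall), and both function names are made up for the example. The arithmetic variant is written so that it never shifts a negative value, since the result of that is implementation-defined in C.

```c
#include <stdint.h>

/* Logical (unsigned) shift: zero bits are shifted in from the left.
   n is assumed to be in the range 0..15. */
uint16_t shift_right_logical(int16_t x, unsigned n)
{
    return (uint16_t)((uint16_t)x >> n);
}

/* Arithmetic (signed) shift: copies of the sign bit are shifted in.
   For negative x, ~x is non-negative, so the shift itself is well defined.
   n is assumed to be in the range 0..15. */
int16_t shift_right_arithmetic(int16_t x, unsigned n)
{
    return (int16_t)(x < 0 ? ~(~x >> n) : (x >> n));
}
```

With x = -256 (bit pattern FF00h) and n = 8, the logical shift gives 255, while the arithmetic one gives -1. For non-negative inputs the two agree, which is why such a mismatch may stay unnoticed until some specific value shows up in a level.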

Re: Restoration of a few games' EXEs versions

Posted: January 25th, 2017, 6:30 pm
by K1n9_Duk3
Speaking of Keen 1-3, did you ever intend to recreate the original EXE versions for that or at least port Chocolate Keen back to DOS?

Re: Restoration of a few games' EXEs versions

Posted: February 17th, 2017, 5:45 am
by NY00123
K1n9_Duk3 wrote:Speaking of Keen 1-3, did you ever intend to recreate the original EXE versions for that or at least port Chocolate Keen back to DOS?
Heh, these ideas have surely been tempting, especially given the fact that DOS support was "skipped" while doing Chocolate Keen.
As previously stated, though, I prefer to not try to recreate any of the EXEs (or possibly even just port back to 16-bit real mode DOS) if we lack access to original sources.