
The 5 byte EXE file

Submitted on: 3/10/2002 2:47:46 PM
By: Jonathan Smith  
Level: Advanced
User Rating: By 13 Users
Compatibility:C, C++ (general), Microsoft Visual C++, Borland C++, UNIX C++

Users have accessed this article  14074 times.
 
 
     This article is based on Vbmew's "make 7 byte .exes" (http://www.1cplusplusstreet.com/vb/scripts/ShowCode.asp?txtCodeId=2221&lngWId=3). His article piqued my interest in assembly language, so I went out and did some research. This article is a very brief primer on assembler and machine code.

 
 
Terms of Agreement:   
By using this article, you agree to the following terms...   
  1. You may use this article in your own programs (and may compile it into a program and distribute it in compiled format for languages that allow it) freely and with no charge.
  2. You MAY NOT redistribute this article (for example to a web site) without written permission from the original author. Failure to do so is a violation of copyright laws.   
  3. You may link to this article from another website, but ONLY if it is not wrapped in a frame. 
  4. You will abide by any additional copyright restrictions which the author may have placed in the article or article's description.
				

Note: Before I begin, I realize this program isn't really C or C++. If it's anything, it's assembler. I posted it to the C++ section because it's closer to what I want to accomplish. Personally, I think an ASM section on PSC is long, long overdue. So please, no complaining about this not being C or C++.

Having said that, let us begin.

THE 5-BYTE EXECUTABLE - A PRIMER ON ASSEMBLY LANGUAGE AND MACHINE CODE

For starters, I'd like to create a program which is only 5 bytes in size and does more than the 7 byte EXE that Vbmew shows how to create.

At a DOS prompt, type "copy con echochar.com"

The listing of the program is as follows (with an explanation of the assembler code):

  1. Press Alt-180
    Code B4, "mov ah, ??". 'mov' is an instruction that tells the processor to copy a value from one place to another. 'ah' is a CPU register (the high byte of the 16-bit ax register); DOS input and output routines use it to select which function to run. '??' is the value we want to put into the ah register. We fill in the value of '??' in the next step.
     
  2. Press Alt-1
    ASCII character 1. This makes the first line look like "mov ah,1".
     
  3. Press Alt-205
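    Code CD, "int ??". 'int' calls a software interrupt: the CPU looks up the handler registered for the given interrupt number and jumps to it. The handlers themselves are routines supplied by the BIOS or by DOS, not instructions built into the CPU.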
     
  4. Press Alt-33
    ASCII character 33. In hex, 33 is 21, so the previous line becomes "int 21h". Interrupt 21h is the DOS services interrupt, commonly used for I/O. By setting the ah register to 1 and calling interrupt 33 (21h), we're telling DOS to stop and wait for input from the keyboard. Since ah is set to 1, once a key is pressed, it is echoed to the screen. If, however, ah were set to 8, the character pressed would *not* be echoed.
     
  5. Press Alt-195
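    Code C3, "ret". 'ret' pops a return address off the stack and jumps to it. When DOS starts a .COM program it pushes a zero word onto the stack, so 'ret' jumps to the start of the program's PSP (the header block DOS builds in memory), where an "int 20h" instruction terminates the program and returns control to DOS.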
     
  6. Press Ctrl-Z to mark the end of the file and then press Enter to write the file.
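
The keystrokes above can be cross-checked away from a DOS prompt. The following is a minimal sketch in Python, not part of the original article; the names ALT_CODES and program are made up for illustration. It converts the decimal Alt codes from steps 1-5 into the bytes that end up in echochar.com.

```python
# Decimal Alt codes typed in steps 1-5, in order.
ALT_CODES = [180, 1, 205, 33, 195]

# Each Alt code becomes exactly one byte of the .COM file.
program = bytes(ALT_CODES)

# Show each code in decimal and hex, e.g. 180 is B4h and 33 is 21h.
for code in ALT_CODES:
    print(f"Alt-{code:<3} = {code:02X}h")

# Print the total size and the bytes as a hex string.
print(len(program), "bytes:", program.hex(" ").upper())
```

Writing program to a file opened in binary mode should give the same file as the copy con session.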

Like Vbmew's program, this one displays a character to the screen.  Unlike his program, however, this one lets you choose which character is displayed. =)

While this program basically does nothing, it's a great primer (for me, at least).  It introduces what a few basic assembler instructions do and what their machine code representation is.

Here's the program again, this time, in all assembly.

mov ah,1
int 21h
ret
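
To make the link between the assembly listing and the machine code explicit, here is a hedged sketch of a three-instruction "assembler" in Python. The helper names (mov_ah, int_, ret) are hypothetical and cover only the encodings used in this article: B4 plus a byte for "mov ah", CD plus a byte for "int", and C3 for "ret".

```python
def mov_ah(imm8: int) -> bytes:
    """mov ah, imm8: opcode B4 followed by the immediate byte."""
    return bytes([0xB4, imm8 & 0xFF])


def int_(vector: int) -> bytes:
    """int imm8: opcode CD followed by the interrupt number."""
    return bytes([0xCD, vector & 0xFF])


def ret() -> bytes:
    """ret (near return): the single opcode byte C3."""
    return bytes([0xC3])


# Assemble the three-line listing into its five bytes.
echochar = mov_ah(1) + int_(0x21) + ret()
print(echochar.hex(" ").upper())
```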

To make this even more low-level, we could eliminate the automatic display of the character and display it with code instead. As stated earlier, to eliminate the character echo, the ah register needs to be set to 8 instead of 1. After a key is pressed, its ASCII value is put into the 'al' register, which is simply another CPU register (the low byte of ax). We need to copy that value into the 'dl' register, which is where the DOS character-output function expects to find the character to display. Copying from one register to another uses a different opcode from the B4 used above to load a constant. Here it is the two bytes 88 and C2: 88 is the opcode for storing an 8-bit register into another register or a memory location, and C2 is the operand byte that selects al as the source and dl as the destination. Together, the 88-C2 command copies the value in the al register to the dl register. After performing this operation, we need to tell the computer that we want to display the output. This is done by setting the ah register to 2 and once again calling interrupt 21h.
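
The 88-C2 pair can be taken apart bit by bit. The sketch below is an illustration in Python rather than a full disassembler: it handles only opcode 88 with two register operands, and the names REG8 and decode_mov_88 are made up for this example. The second byte packs three fields: the top two bits say whether both operands are registers, the middle three bits pick the source register, and the low three bits pick the destination.

```python
# 8-bit register names in the order the CPU numbers them (0-7).
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]


def decode_mov_88(code: bytes) -> str:
    """Decode opcode 88 (mov r/m8, r8) when both operands are registers."""
    opcode, operand = code[0], code[1]
    if opcode != 0x88:
        raise ValueError("only opcode 88h is handled in this sketch")
    mod = operand >> 6            # top two bits: 3 means register-to-register
    reg = (operand >> 3) & 0b111  # middle three bits: source register
    rm = operand & 0b111          # low three bits: destination register
    if mod != 0b11:
        raise ValueError("memory operands are not handled in this sketch")
    return f"mov {REG8[rm]},{REG8[reg]}"


print(decode_mov_88(bytes([0x88, 0xC2])))  # C2 = 11 000 010 in binary
```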

I also recommend a hex editor for this, as the DOS prompt will not suffice (Alt-8 translates into backspace).

Keyboard command                                ASM Code                          Machine Code
Alt-180, Alt-8 (can't be done at DOS prompt)    mov ah,8                          ┤◘
Alt-205, Alt-33                                 int 21h                           ═!
Alt-136, Alt-194                                mov dl,al (the 88-C2 command)     ê┬
Alt-180, Alt-2                                  mov ah,2                          ┤☻
Alt-205, Alt-33                                 int 21h                           ═!
Alt-195                                         ret                               ├

There you have it. An even more low-level program that does practically the same thing as the first. And it's still only 11 bytes in size. =)
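
Since Alt-8 cannot be typed at the DOS prompt, one alternative to a hex editor is to generate the file with a short script. This is a sketch under assumptions, not the author's method: the helper write_com and the file name echo2.com are hypothetical. It also confirms the 11-byte total.

```python
from pathlib import Path

# (assembly, machine code) pairs for the second listing, in order.
LISTING = [
    ("mov ah,8",  bytes([0xB4, 0x08])),  # read a key without echo
    ("int 21h",   bytes([0xCD, 0x21])),
    ("mov dl,al", bytes([0x88, 0xC2])),  # copy the key into dl
    ("mov ah,2",  bytes([0xB4, 0x02])),  # select character output
    ("int 21h",   bytes([0xCD, 0x21])),
    ("ret",       bytes([0xC3])),
]

program = b"".join(code for _, code in LISTING)


def write_com(path: str, data: bytes) -> int:
    """Write raw machine code to a file; returns the number of bytes written."""
    return Path(path).write_bytes(data)


for asm, code in LISTING:
    print(f"{asm:<10} {code.hex(' ').upper()}")
print("total:", len(program), "bytes")
```

Calling write_com("echo2.com", program) would then create the file without going through the keyboard at all.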

I know this article might not be the hardcore "we-don't-need-no-stinkin'-programming-language-we've-got-pure-machine-code" tutorial, but at least it's a start. I would recommend taking a look at http://www.theteacher.freeserve.co.uk/alevel/assem/assemix.htm for more information on assembly language and CPU architecture. I'd also recommend NASM for writing programs in assembly and W32Dasm for disassembling them to find out what the opcodes are.

Also, don't forget to vote for an ASM section for PSC!


Other User Comments
3/10/2002 4:54:20 PMRaX

This is good. It explains how everything works. Anyway, onto the real reason for posting. Someone needs to contact Ian about an ASM section. ASM isn't used a lot commercially, but who doesn't love asm? jesus, you'd think that'd be the first section on the list

 
3/10/2002 5:50:13 PMUltimatum

RaX, thanks for your feedback. Anyway, after doing some more research, I found out that the 'ah' register is used to select the interrupt sub-function. 21h is the DOS interrupt. There are several sub-functions in each interrupt. The most commonly used ones are 1 (single character keyboard input), 2 (single character screen output), 8 (single character no-echo keyboard input), 9 (string output), 0A (buffered keyboard input), and 4C (exit program). For more information, take a look at http://www.csn.ul.ie/~darkstar/assembler. And yes, everyone contact Ian about adding an ASM or at least a mixed-language programming section.

 
3/10/2002 8:20:45 PMRaheel

Ultimatum u really took a lot of time and effort to write this up- helped a lot - keep it up...
coding like this will get people interested in assembly

 
3/10/2002 8:50:19 PMvbmew

Pretty kewl also check out my other article that shows you how to print strings in pure machine code

 
3/10/2002 8:52:12 PMUltimatum

Raheel, thank you for your vote of confidence. =) If an ASM section shows up sometime soon, I may post my Assembly tutorial, written for beginners from the eyes of a beginner. The biggest problem I think is that most people are afraid of Assembly because it appears to be too complex. I was like that myself. But the truth is it's very simple. This article is a poor representation of what Assembly really is, but I thought it may show everyone what's actually going on behind the scenes and why it's important to learn ASM. By the way, if anyone has any questions pertaining to assembly language, feel free to ask me.

 
3/10/2002 9:12:10 PMUltimatum

Vbmew, I looked =) I've seen all your code. Thank you for the inspiration. =) I was wondering, however, if you had any information on how to create references to more than one variable offset with machine code?

 
3/11/2002 2:02:29 AMChris

Sweet! This is exactly the kind of article I have been looking for, for about 2 years! I have always wanted to find out more about assembly, and after reading this, I want to know more! Thanks vbmew and Ultimatim! Please post more of these, and I'll be sure to contact Ian about an ASM section!

 
3/11/2002 7:01:18 AMRaX

hmmm maybe i should write up a tutorial on how to code a full app in hex ;)

 
3/11/2002 7:04:46 AMRaX

and... assembly is easy as hell. whats not to understand. people just overlook it for vb etc. I have a lot of sources mainly in asm and i need somewhere to stash them hehe. I also made a hell of a lot of reverse engineering tuts which i suppose would be deleted. But maybe not reverseme targets...

 
3/11/2002 11:11:27 AMDean

This is a great start, im about to embark on learning ASM, problem is i dont have time now, i will, this has given me even more impetus into my desire to goto the bare bones!

Good stuff man!

 
3/11/2002 12:15:34 PMUltimatum

Dean and Chris, I'm glad to help, and thank you for the votes. Maybe the more that roll in, the easier it will be for PSC to get an ASM section. =)

 
3/11/2002 12:16:23 PMUltimatum

My original idea was to write a disassembler (because not all disassembly programs are used for bad purposes), but I don't know all the opcodes for the instructions yet. =) I'm actually in the process of writing a C compiler from scratch. Right now, it doesn't support variables, strings, or comparison functions (if, switch, etc.) because I don't know how to dynamically offset the jumps to the correct addresses. My only theory is to count the number of bytes used during compile time and fill in the offsets with a linker. (continued below)

 
3/11/2002 12:16:53 PMUltimatum

And I also don't know how to compare values to see if they're greater than or less than. Only equal or not equal. Maybe I'll take a look at the GCC code or something and see how they do it. =) RaX, you said you know how to write programs in all hex. Any ideas?

 
3/11/2002 3:03:37 PMUltimatum

I also want to say that I have more tutorials like this on the way, hopefully more in-depth than this one.

 
3/11/2002 7:22:05 PMvbmew

One thing we are missing, which I covered in one of my machine code tutorials, is the machine word. It is reversed :'( That's why it's almost impossible to reference variables: 2000h would be 0020h. Pretty tricky.

 
3/11/2002 8:58:34 PMUltimatum

Vbmew, this is true, but if my theory is correct, this only has to be done on the first two bits of the address.

 
3/12/2002 7:16:39 AMpear49

yeah i really think an asm section is gd.
asm still has a lot of followers on the net nowadays.

and oh yeah, there's no 5 bytes exe file one, an exe file without a header is considered a .com file also. So it is a com file

 
3/12/2002 7:18:42 AMpear49

ultimatum, not on the first 2 bit only but every 2 bytes are reversed
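
The byte reversal discussed in these comments is little-endian byte order: a 16-bit value is stored low byte first. A minimal sketch in Python using the standard struct module follows; the variable names are illustrative, and the "mov dx" encoding (opcode BA followed by a 16-bit value) is included only as an example of where such an address might appear.

```python
import struct

address = 0x2000

# "<H" means little-endian, unsigned 16-bit: the low byte comes first.
encoded = struct.pack("<H", address)
print(encoded.hex(" ").upper())

# Example: mov dx, 2000h is opcode BA followed by the value, low byte first.
mov_dx = bytes([0xBA]) + encoded
print(mov_dx.hex(" ").upper())

# Reading the two bytes back gives the original value.
decoded = struct.unpack("<H", encoded)[0]
```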

 
3/12/2002 7:58:42 AMUltimatum

Pear, you're correct. This program is smaller than the smallest amount of given memory, which technically makes it a .COM file, but it is still executable. =)

 
3/12/2002 3:27:24 PMRaX

nice. a disassembler is good but waaay to much work for a single person to create something as complex and great as IDA. comparing value's? you means jumps or? gimme an example and i'll try and help :). btw, if you are looking for co-coders for your dasm'r then id be interested

 
3/12/2002 5:00:27 PMUltimatum

RaX, it wouldn't be something really complex like W32Dasm, just something simple like NDISASM. And never mind about the comparing values thing. I just figured it out. =) The biggest thing holding me back from finishing this C compiler now is predetermining the offsets of variables within the executable, especially strings. Any clue as to how to go about that?

 
3/12/2002 5:48:10 PMRaX

complex? w32dasm? pfft ;P what langauge you coding it in?
hmm, as for offsets for variables...i'm not sure exactly how compilers determine the offsets :( i should know this through just general play around with PE's so much >:(

 
3/13/2002 3:38:21 AMSilverAngel

wow and i thought ForTran was hard...

 
3/13/2002 3:39:35 AMSilverAngel

settle down boys.. dont make it so complicated.. i use Visual Basic more than anything else and i cant be happier.. avoid what your learning.. its like living a new life.

 
3/13/2002 4:55:34 AMRaX

hmm. making things complicated is what makes some of us tick ;)

 
3/13/2002 5:57:25 AMBinSurf

Hey, guys. I was reading some of your comments and thought I could help. A few years back I wrote a Turing interpreter in C. I don't know if you know Turing. It's Canadian. I've been wanting to write a compiler for it, but lost the code. (poetic justice?) One of my thoughts for a compiler was to make output ascii assembly code and call an assembler. However, that gets sticky sometimes. It's easier to write an assembly compiler, so integrate that into it. It may be easier. Depends on how ambitious you are... :-)

 
3/13/2002 5:58:59 AMJames W. Manning

Incidentally, the approach to write EXE's in a binary format rather than ascii certainly is ambitious! :-) Happy coding!

 
3/13/2002 6:26:55 AMpear49

it's better to compile to asm code instead of machine code directly, it's easier and more modular. Furthermore in hope of future support, such as u have to provide support for another new type of executable format, u dun have to recompile the compiler. if u have a system of:

parser->assembly language generator->assembler->linker

u just have to update the linker to support the latest executable file format

 
3/13/2002 6:29:58 AMpear49

oh yeah, if u could phrase ur problem in simpler terms (such that i could understand), i may be able to help u.



 
3/13/2002 7:56:50 AMUltimatum

Actually, that's the point I'm coming to with the C compiler. I have a ton of classes and modules written in VB to parse the source and then output NASM-compatible assembler code (the OBJ file). NASM would then serve as somewhat of a linker. Eventually, however, I plan on writing an assembler of my own. I might have to take a look at the NASM source for that, however. Anyway, thanks for all the comments!

 
3/13/2002 7:59:31 AMUltimatum

SilverAngel, this is far from complicated. After learning this stuff, I now understand how practically all other forms of programming work. I even know how to call the Windows API from within an ASM program. =) And I'm using Visual Basic, too, mainly to write the C compiler and output the ASM code. It's not really that difficult. =)

 
3/13/2002 8:01:37 AMUltimatum

Hey, BinSurf, perhaps you could send me a file on the Turing language grammar in BNF format? I might be able to write a compiler for you. =)

 
3/14/2002 4:10:11 AMRaX

wow...turing. i have a turing program. i'm from england though. i got it from a canadian friend so that may explain why.

 
3/14/2002 5:45:20 PMg-zoom

This is GREAT! i've just started looking at asm but none of the books i found got right to what i needed and all depended on assemblers. Now that i have a couple examples in raw machine code i can finally write some of my own ASM programs. If i come up with anything useful i'll be sure to post it.
Thanks again for the awesome tutorial.

 
3/31/2002 1:12:58 PMUltimatum

Isn't it strange how the top three codes this month (so far) are about assembler in some way or another? HINT HINT, PSC! =)

 
6/9/2002 10:25:31 PMEric M

Ultimatum, how do I use the jump command to make your program echo over and over and over and over... you get the idea? By the way, 5 stars!
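
One possible answer to this question, offered as a hedged sketch rather than anything from the article: replace the final "ret" with a short jump (opcode EB) back to the first byte. The second byte of a short jump is a signed distance counted from the instruction that follows the jump. The helper name jmp_short is made up for this example, and the resulting program loops forever.

```python
def jmp_short(from_offset: int, to_offset: int) -> bytes:
    """Encode a short jump (opcode EB) placed at from_offset, targeting to_offset."""
    # The distance is measured from the end of the 2-byte jump instruction.
    displacement = to_offset - (from_offset + 2)
    if not -128 <= displacement <= 127:
        raise ValueError("target out of range for a short jump")
    return bytes([0xEB, displacement & 0xFF])


body = bytes([0xB4, 0x01, 0xCD, 0x21])  # mov ah,1 / int 21h
loop = body + jmp_short(len(body), 0)   # jump back to the first byte
print(loop.hex(" ").upper())
```

Because the distance is relative, it does not depend on where DOS loads the program in memory.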

 
6/19/2002 1:44:12 PMDaNo

Where does the asm section stay¿¿¿

 
7/15/2002 10:34:16 AMLinkin Park

hey my friend i have a question for u
why do exe programs need the dll files such as msvb60.dll , which is the visual basic library if the programs are in machine code and processor can understand it?

 
7/20/2002 12:14:54 AMMicah Lansing

There is an ASM Section at www.forevercode.d2g.com if anyone cared...

 