Using Windows Code Pages

Created 4/15/2007

The Windows Console API functions, such as WriteConsole, use code pages to translate numeric character codes into displayable characters. The default code page, named OEM, is based on the hardware character set that IBM created in the 1980s. You can find the OEM character set on the inside back cover of Assembly Language for Intel-Based Computers, for example. This character set is very limited, so you may prefer to use a wider set of international characters. Microsoft uses the extended ANSI character set to represent international characters for programs running in console mode. Unicode, the more recent international standard, is not implemented in Windows console mode.

Download the sample program (CodePageDemo.asm)

By calling the SetConsoleOutputCP function from the Windows API, you can change the code page used for character translation. You pass it a single argument, the code page ID. The change is visible only when the console window is not displaying a raster font. You can see a list of the code pages available on your computer in the system registry at the following key:

 HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\control\Nls\CodePage
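
As a minimal sketch of the call described above, the following program passes a code page ID to SetConsoleOutputCP and checks the result. It assumes the Irvine32 library used elsewhere on this page. According to the Windows API documentation, the function returns zero in EAX when it fails, for example when the requested code page is not installed. The label and message text are illustrative only.

```asm
INCLUDE Irvine32.inc
SetConsoleOutputCP PROTO, pageNum:DWORD

.data
   failMsg BYTE "Could not set the code page",0   ; illustrative message

.code
main PROC
   invoke SetConsoleOutputCP, 1252   ; request code page 1252
   cmp  eax,0                        ; zero return value means failure
   jne  finished
   mov  edx,OFFSET failMsg
   call WriteString
   call Crlf
finished:
   exit
main ENDP
END main
```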

If your computer's locale is set to the United States, the default OEM code page is 437. Code page 858, on the other hand, is the OEM multilingual Latin 1 code page with the euro symbol added. The code page for the extended Latin-1 character set is 1252. It has a wider selection of international characters than the OEM code page. Microsoft lists the code pages supported by Windows at microsoft.com. In general, the non-OEM single-byte code pages are the following:

1250 - Central Europe
1251 - Cyrillic
1252 - Latin I
1253 - Greek
1254 - Turkish
1255 - Hebrew
1256 - Arabic
1257 - Baltic
1258 - Vietnam
874 - Thai
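
To see why the choice of code page matters, the following sketch displays the same byte under two of the code pages mentioned above. It assumes the Irvine32 library and a console window that uses a TrueType font. The variable name testByte and the value 0E9h are chosen only for illustration. In the code page 437 table that byte is the Greek capital theta, and in the code page 1252 table it is a lowercase e with an acute accent.

```asm
INCLUDE Irvine32.inc
SetConsoleOutputCP PROTO, pageNum:DWORD

.data
   testByte BYTE 0E9h                ; illustrative character code

.code
main PROC
   invoke SetConsoleOutputCP, 437    ; OEM United States
   mov  al,testByte
   call WriteChar                    ; code 0E9h looked up in the 437 table
   call Crlf

   invoke SetConsoleOutputCP, 1252   ; Latin I
   mov  al,testByte
   call WriteChar                    ; same code looked up in the 1252 table
   call Crlf
   exit
main ENDP
END main
```

What actually appears on screen also depends on whether the selected console font contains both glyphs.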

Sample Program

The following program switches the code page to ANSI and displays the characters with codes 1 to 255, one per line, each preceded by its decimal code. The first time you run it, change the font in the console window to a TrueType font (rather than a raster font). To set the font, click the System box in the console window's title bar, select Properties, select the Font tab, and choose a font name from the list.

INCLUDE Irvine32.inc
SetConsoleOutputCP PROTO, pageNum:DWORD

.data
   divider  BYTE " - ",0
   codePage DWORD 1252          ; ANSI Latin I code page

.code
   main PROC
   invoke SetConsoleOutputCP, codePage
   mov  ecx,255                 ; loop counter
   mov  eax,1                   ; first character code
   mov  edx,OFFSET divider

L1:
   call WriteDec                ; display the character code in EAX
   call WriteString             ; EDX points to the divider string
   call WriteChar               ; AL is the character
   call Crlf
   inc  al                      ; next character
   loop L1
   exit
   main ENDP
   END main
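
The sample program leaves the console set to the new code page when it ends. A hedged sketch of one way to restore the previous setting follows. It assumes the Windows API function GetConsoleOutputCP, which takes no arguments and returns the current output code page in EAX. The variable name savedPage is illustrative.

```asm
INCLUDE Irvine32.inc
GetConsoleOutputCP PROTO
SetConsoleOutputCP PROTO, pageNum:DWORD

.data
   savedPage DWORD ?                      ; code page in effect at startup

.code
main PROC
   invoke GetConsoleOutputCP              ; EAX = current output code page
   mov  savedPage,eax
   invoke SetConsoleOutputCP, 1252        ; switch to Latin I

   mov  al,0E9h                           ; display one character as a test
   call WriteChar
   call Crlf

   invoke SetConsoleOutputCP, savedPage   ; restore the original code page
   exit
main ENDP
END main
```

Saving the value in memory rather than in a register avoids losing it across the API calls, which are free to modify EAX, ECX, and EDX.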

References