

PostScript Language Reference, Third Edition


PostScript®
LANGUAGE REFERENCE
third edition
Adobe Systems Incorporated
Addison-Wesley Publishing Company
Reading, Massachusetts • Menlo Park, California • New York • Don Mills, Ontario
Harlow, England • Amsterdam • Bonn • Sydney • Singapore • Tokyo
Madrid • San Juan • Paris • Seoul • Milan • Mexico City • Taipei


Library of Congress Cataloging-in-Publication Data
PostScript language reference manual / Adobe Systems Incorporated. — 3rd ed.
p. cm.
Includes bibliographical references and index.
ISBN 0-201-37922-8
1. PostScript (Computer program language) I. Adobe Systems.
QA76.73.P67 P67 1999
005.13'3—dc21
98-55489
CIP
© 1985–1999 Adobe Systems Incorporated. All rights reserved.
NOTICE: All information contained herein is the property of Adobe Systems Incorporated.
No part of this publication (whether in hardcopy or electronic form) may be reproduced
or transmitted, in any form or by any means, electronic, mechanical, photocopying,
recording, or otherwise, without the prior written consent of the publisher.
PostScript is a registered trademark of Adobe Systems Incorporated. All instances of the
name PostScript in the text are references to the PostScript language as defined by Adobe
Systems Incorporated unless otherwise stated. The name PostScript also is used as a
product trademark for Adobe Systems’ implementation of the PostScript language interpreter.
Except as otherwise stated, any mention of a “PostScript printer,” “PostScript software,” or
similar item refers to a product that contains PostScript technology created or licensed by
Adobe Systems Incorporated, not to one that purports to be merely compatible.
Adobe, Adobe Illustrator, Adobe Type Manager, Chameleon, Display PostScript,
FrameMaker, Minion, Myriad, Photoshop, PostScript, PostScript 3, and the PostScript logo are
trademarks of Adobe Systems Incorporated. LocalTalk, QuickDraw, and TrueType are
trademarks and Mac OS is a registered trademark of Apple Computer, Inc. Helvetica and
Times are registered trademarks of Linotype-Hell AG and/or its subsidiaries. Times New
Roman is a trademark of The Monotype Corporation registered in the U.S. Patent and
Trademark Office and may be registered in certain other jurisdictions. Unicode is a
registered trademark of Unicode, Inc. PANTONE is a registered trademark and Hexachrome is
a trademark of Pantone, Inc. Windows is a registered trademark of Microsoft Corporation.
All other trademarks are the property of their respective owners.
This publication and the information herein are furnished AS IS, are subject to change
without notice, and should not be construed as a commitment by Adobe Systems
Incorporated. Adobe Systems Incorporated assumes no responsibility or liability for any errors or
inaccuracies, makes no warranty of any kind (express, implied, or statutory) with respect
to this publication, and expressly disclaims any and all warranties of merchantability,
fitness for particular purposes, and noninfringement of third-party rights.
ISBN 0-201-37922-8
1 2 3 4 5 6 7 8 9 CRS 03 02 01 00 99
First printing February 1999


Contents

Preface

Chapter 1: Introduction
  1.1 About This Book
  1.2 Evolution of the PostScript Language
  1.3 LanguageLevel 3 Overview
  1.4 Related Publications
  1.5 Copyrights and Trademarks

Chapter 2: Basic Ideas
  2.1 Raster Output Devices
  2.2 Scan Conversion
  2.3 Page Description Languages
  2.4 Using the PostScript Language

Chapter 3: Language
  3.1 Interpreter
  3.2 Syntax
  3.3 Data Types and Objects
  3.4 Stacks
  3.5 Execution
  3.6 Overview of Basic Operators
  3.7 Memory Management
  3.8 File Input and Output
  3.9 Named Resources
  3.10 Functions
  3.11 Errors
  3.12 Early Name Binding
  3.13 Filtered Files Details
  3.14 Binary Encoding Details

Chapter 4: Graphics
  4.1 Imaging Model
  4.2 Graphics State
  4.3 Coordinate Systems and Transformations
  4.4 Path Construction
  4.5 Painting
  4.6 User Paths
  4.7 Forms
  4.8 Color Spaces
  4.9 Patterns
  4.10 Images

Chapter 5: Fonts
  5.1 Organization and Use of Fonts
  5.2 Font Dictionaries
  5.3 Character Encoding
  5.4 Glyph Metric Information
  5.5 Font Cache
  5.6 Unique ID Generation
  5.7 Type 3 Fonts
  5.8 Additional Base Font Types
  5.9 Font Derivation and Modification
  5.10 Composite Fonts
  5.11 CID-Keyed Fonts

Chapter 6: Device Control
  6.1 Using Page Devices
  6.2 Page Device Parameters
  6.3 In-RIP Trapping
  6.4 Output Device Dictionary

Chapter 7: Rendering
  7.1 CIE-Based Color to Device Color
  7.2 Conversions among Device Color Spaces
  7.3 Transfer Functions
  7.4 Halftones
  7.5 Scan Conversion Details

Chapter 8: Operators
  8.1 Operator Summary
  8.2 Operator Details

Appendix A: LanguageLevel Feature Summary
  A.1 LanguageLevel 3 Features
  A.2 LanguageLevel 2 Features
  A.3 Incompatibilities

Appendix B: Implementation Limits
  B.1 Typical Limits
  B.2 Virtual Memory Use

Appendix C: Interpreter Parameters
  C.1 Properties of User and System Parameters
  C.2 Defined User and System Parameters
  C.3 Details of User and System Parameters
  C.4 Device Parameters

Appendix D: Compatibility Strategies
  D.1 The LanguageLevel Approach
  D.2 When to Provide Compatibility
  D.3 Compatibility Techniques
  D.4 Installing Emulations

Appendix E: Character Sets and Encoding Vectors
  E.1 Times Family
  E.2 Helvetica Family
  E.3 Courier Family
  E.4 Symbol
  E.5 Standard Latin Character Set
  E.6 StandardEncoding Encoding Vector
  E.7 ISOLatin1Encoding Encoding Vector
  E.8 CE Encoding Vector
  E.9 Expert Character Set
  E.10 Expert Encoding Vector
  E.11 ExpertSubset Encoding Vector
  E.12 Symbol Character Set
  E.13 Symbol Encoding Vector

Appendix F: System Name Encodings

Appendix G: Operator Usage Guidelines

Bibliography

Index



Figures

  2.1 How the PostScript interpreter and an application interact
  3.1 Mapping with the Decode array
  3.2 Homogeneous number array
  3.3 Binary object sequence
  4.1 The two squares produced by Example 4.1
  4.2 Effects of coordinate transformations
  4.3 Nonzero winding number rule
  4.4 Even-odd rule
  4.5 Color specification
  4.6 Color rendering
  4.7 Component transformations in the CIEBasedABC color space
  4.8 Component transformations in the CIEBasedA color space
  4.9 CIEBasedDEFG pre-extension to the CIEBasedABC color space
  4.10 Output from Example 4.21
  4.11 Output from Example 4.23
  4.12 Starting a new triangle in a free-form Gouraud-shaded triangle mesh
  4.13 Connecting triangles in a free-form Gouraud-shaded triangle mesh
  4.14 Varying the value of the edge flag to create different shapes
  4.15 Lattice-form triangular meshes
  4.16 Coordinate mapping from a unit square to a four-sided Coons patch
  4.17 Painted area and boundary of a Coons patch
  4.18 Color values and edge flags in Coons patch meshes
  4.19 Edge connections in a Coons patch mesh
  4.20 Control points in a tensor-product mesh
  4.21 Typical sampled image
  4.22 Image data organization and processing
  4.23 Source image coordinate system
  4.24 Mapping the source image
  5.1 Results of Example 5.2
  5.2 Glyphs painted in 50% gray
  5.3 Glyph outlines treated as a path
  5.4 Graphics clipped by a glyph path
  5.5 Encoding scheme for Type 1 fonts
  5.6 Glyph metrics
  5.7 Relationship between two sets of metrics
  5.8 Output from Example 5.6
  5.9 Composite font mapping example
  5.10 CID-keyed font basics
  5.11 Type 0 CIDFont character processing
  6.1 Trapping example
  6.2 Sliding trap
  7.1 Various halftoning effects
  7.2 Halftone cell with a nonzero angle
  7.3 Angled halftone cell divided into two squares
  7.4 Halftone cell and two squares tiled across device space
  7.5 Tiling of device space in a type 16 halftone dictionary
  7.6 Rasterization without stroke adjustment
  8.1 arc operator
  8.2 arc operator example
  8.3 arcn operator example
  8.4 arct operator
  8.5 arct operator example
  8.6 curveto operator
  8.7 imagemask example
  8.8 setflat operator
  8.9 Line cap parameter shapes
  8.10 Line join parameter shapes
  8.11 Miter length


Tables

  2.1 Control characters for the interactive executive
  3.1 White-space characters
  3.2 Types of objects
  3.3 Standard local dictionaries
  3.4 Standard global dictionaries
  3.5 Access strings
  3.6 Standard filters
  3.7 Regular resources
  3.8 Resources whose instances are implicit
  3.9 Resources used in defining new resource categories
  3.10 Standard procedure sets in LanguageLevel 3
  3.11 Entries in a category implementation dictionary
  3.12 Entries common to all function dictionaries
  3.13 Additional entries specific to a type 0 function dictionary
  3.14 Additional entries specific to a type 2 function dictionary
  3.15 Additional entries specific to a type 3 function dictionary
  3.16 Entries in the $error dictionary
  3.17 Entries in an LZWEncode or LZWDecode parameter dictionary
  3.18 Typical LZW encoding sequence
  3.19 Entries in a FlateEncode or FlateDecode parameter dictionary
  3.20 Predictor-related entries in an LZW or Flate filter parameter dictionary
  3.21 Entries in a CCITTFaxEncode or CCITTFaxDecode parameter dictionary
  3.22 Entries in a DCTEncode parameter dictionary
  3.23 Entries in a SubFileDecode parameter dictionary (LanguageLevel 3)
  3.24 Entries in a ReusableStreamDecode parameter dictionary
  3.25 Binary token interpretation
  3.26 Number representation in header for a homogeneous number array
  3.27 Object type, length, and value fields
  4.1 Device-independent parameters of the graphics state
  4.2 Device-dependent parameters of the graphics state
  4.3 Operation codes for encoded user paths
  4.4 Entries in a type 1 form dictionary
  4.5 Entries in a CIEBasedABC color space dictionary
  4.6 Entries in a CIEBasedA color space dictionary
  4.7 Additional entries specific to a CIEBasedDEF color space dictionary
  4.8 Additional entries specific to a CIEBasedDEFG color space dictionary
  4.9 Entries in a type 1 pattern dictionary
  4.10 Entries in a type 2 pattern dictionary
  4.11 Entries common to all shading dictionaries
  4.12 Additional entries specific to a type 1 shading dictionary
  4.13 Additional entries specific to a type 2 shading dictionary
  4.14 Additional entries specific to a type 3 shading dictionary
  4.15 Additional entries specific to a type 4 shading dictionary
  4.16 Additional entries specific to a type 5 shading dictionary
  4.17 Additional entries specific to a type 6 shading dictionary
  4.18 Data values in a Coons patch mesh
  4.19 Data values in a tensor-product patch mesh
  4.20 Entries in a type 1 image dictionary
  4.21 Typical Decode arrays
  4.22 Entries in a type 3 image dictionary
  4.23 Entries in an image data dictionary
  4.24 Entries in a mask dictionary
  4.25 Entries in a type 4 image dictionary
  5.1 Font types
  5.2 Entries common to all font dictionaries
  5.3 Additional entries common to all base fonts
  5.4 Additional entries specific to Type 1 fonts
  5.5 Entries in a FontInfo dictionary
  5.6 Additional entries specific to Type 3 fonts
  5.7 Additional entries specific to Type 42 fonts
  5.8 Additional entries specific to Type 0 fonts
  5.9 FMapType mapping algorithms
  5.10 Entries in a CIDSystemInfo dictionary
  5.11 CIDFontType and FontType values
  5.12 Entries common to all CIDFont dictionaries
  5.13 Additional entries specific to Type 0 CIDFont dictionaries
  5.14 Entries in a dictionary in FDArray
  5.15 Entries replacing Subrs in the Private dictionary of an FDArray dictionary
  5.16 Additional entry specific to Type 1 CIDFont dictionaries
  5.17 Additional entries specific to Type 2 CIDFont dictionaries
  5.18 Entries in a CMap dictionary
  6.1 Categories of page device parameters
  6.2 Page device parameters related to media selection
  6.3 Page device parameters related to roll-fed media
  6.4 Page device parameters related to page image placement
  6.5 Page device parameters related to page delivery
  6.6 Page device parameters related to color support
  6.7 Page device parameters related to device initialization and page setup
  6.8 Page device parameter related to recovery policies
  6.9 Entries in the Policies dictionary
  6.10 Entries in a Type 1001 trapping details dictionary
  6.11 Entries in a colorant details dictionary
  6.12 Entries in a colorant subdictionary
  6.13 Entries in a trapping parameter dictionary
  6.14 Example of normal trapping rule
  6.15 Entries in a ColorantZoneDetails dictionary
  6.16 Entries in an output device dictionary
  7.1 Entries in a type 1 CIE-based color rendering dictionary
  7.2 Rendering intents
  7.3 Types of halftone dictionaries
  7.4 Entries in a type 1 halftone dictionary
  7.5 Entries in a type 3 halftone dictionary
  7.6 Entries in a type 6 halftone dictionary
  7.7 Entries in a type 10 halftone dictionary
  7.8 Entries in a type 16 halftone dictionary
  7.9 Entries in a proprietary halftone dictionary
  8.1 Operand and result types
  A.1 LanguageLevel 3 operators defined in procedure sets
  A.2 New resource categories
  A.3 New resource instances
  A.4 New page device and interpreter parameters
  B.1 Architectural limits
  B.2 Typical memory limits in LanguageLevel 1
  C.1 User parameters
  C.2 System parameters
  E.1 Encoding vectors
  G.1 Guidelines summary


Preface

In the 1980s, Adobe devised a powerful graphics imaging model that over time has
formed the basis for the Adobe PostScript technologies. These technologies (a
combination of the PostScript language and PostScript language–based graphics and
text-formatting applications, drivers, and imaging systems) have forever changed the
printing and publishing world by sparking the desktop and digital publishing
revolutions. Since their inception, PostScript technologies have enabled unprecedented
control of the look and feel of printed documents and have changed the overall process
for designing and printing them as well. The capabilities PostScript makes possible have
established it as the industry page description language standard.

Today, as never before, application developers and imaging systems vendors support the
PostScript language as the industry standard. We at Adobe accept our responsibility as
stewards of this standard to continually advance the standard in response to the
creative needs of the industry.

With this third advance of the language, which we call LanguageLevel 3, Adobe has
greatly expanded the boundaries of imaging capabilities made possible through the
PostScript language. This most recent advance has yielded significant improvements in
the efficiency and performance of the language as well as in the quality of final output.

To complement the strengths of LanguageLevel 3, Adobe PostScript 3 imaging system
technologies have been engineered to exploit the new LanguageLevel 3 constructs to the
fullest extent, fulfilling the Adobe commitment to provide printing solutions for the
broad spectrum of users.

No significant change comes without the concerted effort of many individuals. The work
to advance the PostScript language and to create Adobe PostScript 3 imaging system
technologies is no exception. Our goal since the introduction of the first Adobe imaging
model has been nothing less than to provide the most innovative, meaningful imaging
solutions in the industry. Dedicated Adobe employees and many industry partners have
striven to make that goal a reality. We take this opportunity to thank all those who
contributed to this effort.

John Warnock and Chuck Geschke
February 1999


CHAPTER 1
Introduction
The PostScript® language is a simple interpretive programming language with powerful
graphics capabilities. Its primary application is to describe the appearance of text,
graphical shapes, and sampled images on printed or displayed pages, according to the
Adobe imaging model. A program in this language can communicate a description of a
document from a composition system to a printing system or control the appearance of
text and graphics on a display. The description is high-level and device-independent.

The page description and interactive graphics capabilities of the PostScript language
include the following features, which can be used in any combination:
• Arbitrary shapes made of straight lines, arcs, rectangles, and cubic curves. Such
  shapes may self-intersect and have disconnected sections and holes.
• Painting operators that permit a shape to be outlined with lines of any thickness,
  filled with any color, or used as a clipping path to crop any other graphic. Colors
  can be specified in a variety of ways: grayscale, RGB, CMYK, and CIE-based. Certain
  other features are also modeled as special kinds of colors: repeating patterns,
  smooth shading, color mapping, and spot colors.
• Text fully integrated with graphics. In the Adobe imaging model, text characters in
  both built-in and user-defined fonts are treated as graphical shapes that may be
  operated on by any of the normal graphics operators.
• Sampled images derived from natural sources (such as scanned photographs) or
  generated synthetically. The PostScript language can describe images sampled at any
  resolution and according to a variety of color models. It provides a number of ways
  to reproduce images on an output device.
• A general coordinate system that supports all combinations of linear
  transformations, including translation, scaling, rotation, reflection, and skewing.
  These transformations apply uniformly to all elements of a page, including text,
  graphical shapes, and sampled images.
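
Several of the features listed above can be combined in one short page description. The
following is a minimal sketch, not an example from this book: the coordinates, line
width, gray level, font choice, and string are all arbitrary assumptions chosen for
illustration, and only LanguageLevel 1 operators are used.

```postscript
%!PS
% Illustrative page description; every coordinate and size is arbitrary.

% A closed shape built from straight lines, filled with 50% gray.
newpath
100 100 moveto
300 100 lineto
200 250 lineto
closepath
0.5 setgray
fill

% A second path, outlined with a black line 4 units wide.
newpath
100 300 moveto
300 300 lineto
300 400 lineto
closepath
0 setgray
4 setlinewidth
stroke

% Text is painted like any other graphic, so the translation and
% rotation below apply to the glyphs as well.
gsave
  100 450 translate
  30 rotate
  /Helvetica findfont 24 scalefont setfont
  0 0 moveto
  (Rotated text) show
grestore

showpage
```

The gsave and grestore pair confines the translation and rotation to the text, so
anything painted afterward would use the original coordinate system again.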

A PostScript page description can be rendered on a printer, display, or other output
device by presenting it to a PostScript interpreter controlling that device. As the
interpreter executes commands to paint characters, graphical shapes, and sampled images,
it converts the high-level PostScript description into the low-level raster data format
for that particular device.

Normally, application programs such as document composition systems, illustrators, and
computer-aided design systems generate PostScript page descriptions automatically.
Programmers generally write PostScript programs only when creating new applications.
However, in special situations a programmer can write PostScript programs to take
advantage of capabilities of the PostScript language that are not accessible through an
application program.

The extensive graphics capabilities of the PostScript language are embedded in the
framework of a general-purpose programming language. The language includes a
conventional set of data types, such as numbers, arrays, and strings; control
primitives, such as conditionals, loops, and procedures; and some unusual features, such
as dictionaries. These features enable application programmers to define higher-level
operations that closely match the needs of the application and then to generate commands
that invoke those higher-level operations. Such a description is more compact and easier
to generate than one written entirely in terms of a fixed set of basic operations.
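
The idea of defining higher-level operations and then invoking them can be sketched as
follows. This is an illustration under stated assumptions: the names PrologDict, inch,
and box are hypothetical, invented here, and the rectangle sizes are arbitrary.

```postscript
%!PS
% Hypothetical prolog: PrologDict, inch, and box are names invented
% for this illustration.
/PrologDict 4 dict def
PrologDict begin
  % inches inch units : convert inches to default user space units
  /inch { 72 mul } def

  % x y width height box - : fill a rectangle
  /box {
    4 dict begin                   % temporary dictionary for the operands
      /h exch def /w exch def /y exch def /x exch def
      newpath
      x y moveto
      w 0 rlineto
      0 h rlineto
      w neg 0 rlineto
      closepath
      fill
    end
  } def
end

% Script: compact commands that invoke the higher-level operation.
PrologDict begin
  1 inch 1 inch 2 inch 1 inch box
  1 inch 3 inch 2 inch 1 inch box
end
showpage
```

Each rectangle in the script costs one short line; without the box procedure the same
page would repeat the seven path and painting operators for every rectangle.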

PostScript programs can be created, transmitted, and interpreted in the form of ASCII
source text as defined in this book. The entire language can be described in terms of
printable characters and white space. This representation is convenient for programmers
to create, manipulate, and understand. It also facilitates storage and transmission of
files among diverse computers and operating systems, enhancing machine independence.

There are also binary encoded forms of the language for use in suitably controlled
environments (for example, when the program is assured of a fully transparent
communications path to the PostScript interpreter). Adobe recommends strict adherence to
the ASCII representation of PostScript programs for document interchange or archival
storage.

1.1 About This Book

This is the programmer’s reference for the PostScript language. It is the definitive
documentation for the syntax and semantics of the language, the imaging model, and the
effects of the graphics operators.

• Chapter 2, “Basic Ideas,” is an informal presentation of some basic ideas underlying
  the more formal descriptions and definitions to come in later chapters. These include
  the properties and capabilities of raster output devices, requirements for a language
  that effectively uses those capabilities, and some pragmatic information about the
  environments in which the PostScript interpreter operates and the kinds of PostScript
  programs it typically executes.
• Chapter 3, “Language,” introduces the fundamentals of the PostScript language: its
  syntax, semantics, data types, execution model, and interactions with application
  programs. This chapter concentrates on the conventional programming aspects of the
  language, ignoring its graphical capabilities and use as a page description language.
• Chapter 4, “Graphics,” introduces the Adobe imaging model at a device-independent
  level. It describes how to define and manipulate graphical entities (lines, curves,
  filled areas, sampled images, and higher-level structures such as patterns and forms).
  It includes complete information on the color models that the PostScript language
  supports.
• Chapter 5, “Fonts,” describes how the PostScript language deals with text. Characters
  are defined as graphical shapes, whose behavior conforms to the imaging model
  presented in Chapter 4. Because of the importance of text in most applications, the
  PostScript language provides special capabilities for organizing sets of characters
  as fonts and for painting characters efficiently.
• Chapter 6, “Device Control,” describes how a page description communicates its
  document processing requirements to the output device. These include page size, media
  selection, finishing options, and in-RIP trapping.
• Chapter 7, “Rendering,” details the device-dependent aspects of rendering page
  descriptions on raster output devices (printers and displays). These include color
  rendering, transfer functions, halftoning, and scan conversion, each of which is
  device-dependent in some way.
• Chapter 8, “Operators,” describes all PostScript operators and procedures. The
  chapter begins by categorizing operators into functional groups. Then the operators
  appear in alphabetical order, with complete descriptions of their operands, results,
  side effects, and possible errors.

The appendices contain useful tables and other auxiliary information.

• Appendix A, “LanguageLevel Feature Summary,” summarizes the ways the PostScript
  language has been extended with new operators and other features over time.
• Appendix B, “Implementation Limits,” describes typical limits imposed by
  implementations of the PostScript interpreter (for example, maximum integer value and
  maximum stack depth).
• Appendix C, “Interpreter Parameters,” specifies various parameters to control the
  operation and behavior of the PostScript interpreter. Most of these parameters have
  to do with allocation of memory and other resources for specific purposes.
• Appendix D, “Compatibility Strategies,” helps PostScript programmers take advantage
  of newer PostScript language features while maintaining compatibility with the
  installed base of older PostScript interpreter products.
• Appendix E, “Character Sets and Encoding Vectors,” describes the organization of
  common fonts that are built into interpreters or are available as separate software
  products.
• Appendix F, “System Name Encodings,” assigns numeric codes to standard names, for use
  in binary-encoded PostScript programs.
• Appendix G, “Operator Usage Guidelines,” provides guidelines for PostScript operators
  whose use can cause unintended side effects, make a document device-dependent, or
  inhibit postprocessing of a document by other programs.

The book concludes with a Bibliography and an Index.

The enclosed CD-ROM contains the entire text of this book in Portable Document Format
(PDF).

1.2 Evolution of the PostScript Language

Since its introduction in 1985, the PostScript language has been considerably extended
for greater programming power, efficiency, and flexibility. Typically, these language
extensions have been designed to adapt the PostScript language to new imaging
technologies or system environments. While these extensions have introduced significant
new functionality and flexibility to the language, the basic imaging model remains
unchanged.

Extensions are organized into major groups, called LanguageLevels. Three LanguageLevels
have been defined, numbered 1, 2, and 3. Each LanguageLevel encompasses all features of
previous LanguageLevels as well as a significant number of new features. A PostScript
interpreter claiming to support a given LanguageLevel must implement all features
defined in that LanguageLevel and lower. Thus, for example, a feature identified in this
book as “LanguageLevel 2” is understood to be available in all LanguageLevel 3
implementations as well.

This book documents the entire PostScript language, which consists of three distinct
groups of features. Features that are part of the LanguageLevel 2 or LanguageLevel 3
additions are clearly identified as such. Features that are not otherwise identified are
LanguageLevel 1.

A PostScript interpreter can also support extensions that are not part of its base
LanguageLevel. Some such extensions are specialized to particular applications, while
others are of general utility and are candidates for inclusion in a future
LanguageLevel.

The most significant special-purpose extension is the set of features for the Display
PostScript® system. Those features enable workstation applications to use the PostScript
language and the Adobe imaging model for managing the appearance of the display and for
interacting with the workstation’s windowing system. The Display PostScript extensions
were documented in the second edition of this book but have been removed for this
edition. Further information is available in the Display PostScript System manuals.

Appendix D describes strategies for writing PostScript programs that can run compatibly
on interpreters supporting different LanguageLevels. With some care, a program can take
advantage of features in a higher LanguageLevel when available but will still run
acceptably when those features are not available.
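
One way such a program might adapt to the interpreter it runs on is sketched below. The
names level and fillbox are hypothetical, invented for this illustration; the sketch
relies on the languagelevel operator being absent from LanguageLevel 1 interpreters and
on rectfill being a LanguageLevel 2 operator.

```postscript
%!PS
% level: the interpreter's LanguageLevel, or 1 if the languagelevel
% operator is not defined (hypothetical name, for illustration).
/level
  /languagelevel where { pop languagelevel } { 1 } ifelse
def

% x y width height fillbox - : fill a rectangle (hypothetical name)
level 2 ge {
  /fillbox { rectfill } bind def   % use the LanguageLevel 2 operator
} {
  /fillbox {                       % LanguageLevel 1 emulation
    gsave newpath
    4 2 roll moveto                % stack: width height
    1 index 0 rlineto              % stack: width height
    0 exch rlineto                 % stack: width
    neg 0 rlineto                  % stack: (empty)
    closepath fill
    grestore
  } bind def
} ifelse

72 72 144 36 fillbox
showpage
```

Only the branch that matches the interpreter is executed, so the name rectfill is never
looked up on an interpreter that lacks it.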

1.3 LanguageLevel 3 Overview

In addition to unifying many previous PostScript language extensions, LanguageLevel 3
introduces a number of new features. This section summarizes those features, for the
benefit of readers who are already familiar with LanguageLevel 2.

• Functions. A PostScript function is a self-contained, static description of a
  mathematical function having one or more arguments and one or more results.
• Filters. Three filters have been added, named FlateDecode, FlateEncode, and
  ReusableStreamDecode. Some existing filters accept additional optional parameters.
• Idiom recognition. The bind operator can find and replace certain commonly occurring
  procedures, called idioms, typically appearing in application prologs. The
  substituted procedure achieves equivalent results with significantly improved
  performance or quality. This enables LanguageLevel 3 features to work in applications
  that have not yet been modified to use those features directly.
• Clipping path stack. The clipsave and cliprestore operators save and restore just the
  clipping path without affecting the rest of the graphics state.
• Color spaces. Three color spaces have been added: CIEBasedDEF and CIEBasedDEFG
  provide increased flexibility for specifying device-independent colors; DeviceN
  provides a means of specifying high-fidelity and multitone colors.
• Color space substitution. Colors that have been specified in DeviceGray, DeviceRGB,
  or DeviceCMYK color spaces can be remapped into CIE-based color spaces. This
  capability can be useful in a variety of circumstances, such as for redirecting
  output intended for one device to a different one or for producing CIE-based colors
  from an application that generates LanguageLevel 1 output only (and thus is unable to
  specify them directly).
• Smooth shading. It is now possible to paint with a color that varies smoothly over
  the object or region being painted.
• Masked images. A sampled image can be clipped by a mask as it is painted. The mask
  can be represented explicitly or encoded with a color key in the image data. This
  enables the background to show through parts of the image.
• CID-keyed fonts. This font organization provides a convenient and efficient means for
  defining multiple-byte character encodings and for creating base fonts containing a
  very large number of character descriptions.
• Font formats. Support has been added for additional types of base fonts, including
  CFF (Compact Font Format), Chameleon®, TrueType™, and bitmap fonts.
• Device setup. There are many additional page device parameters to control colorant
  selection, finishing options, and other features. Any device can now produce
  arbitrary separations, even in a monochrome printing system (which can mark only one
  colorant at a time).
• In-RIP trapping. Certain products support trapping, which is the automatic generation
  of overlaps to correct for colorant misregistration during the printing process.
• Color rendering intent. A PostScript program can specify a rendering intent for color
  reproduction, causing automatic selection of an appropriate CIE-based color rendering
  dictionary.
• Halftones. Several standard halftone types have been added. They include 16-bit
  threshold arrays and more flexible tiling organizations for improved color accuracy
  on high-resolution devices. Halftone supercells increase the number of gray levels
  achievable on low-resolution devices.
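
Three of the features summarized above (functions, the clipping path stack, and smooth
shading) can be seen together in a short fragment. This is a sketch only: it assumes a
LanguageLevel 3 interpreter, and the coordinates and colors are arbitrary values chosen
for illustration.

```postscript
%!PS
% Requires LanguageLevel 3. Coordinates and colors are arbitrary.
clipsave                           % save only the clipping path
  100 100 200 100 rectclip         % restrict painting to a rectangle
  <<
    /ShadingType 2                 % axial shading
    /ColorSpace /DeviceRGB
    /Coords [100 0 300 0]          % axis from x = 100 to x = 300
    /Function <<
      /FunctionType 2              % exponential interpolation function
      /Domain [0 1]
      /C0 [1 0 0]                  % red at the start of the axis
      /C1 [0 0 1]                  % blue at the end of the axis
      /N 1                         % exponent 1 gives a linear blend
    >>
  >> shfill
cliprestore                        % restore the saved clipping path
showpage
```

Because clipsave and cliprestore touch only the clipping path, other graphics state
parameters set between them would remain in effect afterward, unlike with gsave and
grestore.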

1.4 Related Publications

A number of publications related to this book are listed in the Bibliography; some
notable ones are mentioned here. For more details, see the Bibliography.

1.4.1 The Supplement

The PostScript Language Reference Supplement documents PostScript language extensions
that are available in certain releases of Adobe PostScript® software. A new edition of
the Supplement is published along with each major release of Adobe PostScript software.

The Supplement documents three major classes of extensions:

• New PostScript language features that have been introduced since the most recent
  LanguageLevel and that are candidates for inclusion in the next LanguageLevel.
• Extensions for controlling unique features of products, such as communication
  parameters, print engine options, and so on. Certain PostScript language features,
  such as setdevparams, setpagedevice, and the named resource facility, are designed to
  be extended in this way. Although the framework for this is a standard part of the
  PostScript language, the specific extensions are product-dependent.
• LanguageLevel 1 compatibility operators, principally in the statusdict dictionary.
  Those features were the LanguageLevel 1 means for controlling unique features of
  products, but they have been superseded. They are not formally a part of the
  PostScript language, but many of them are still supported in Adobe PostScript
  interpreters as a concession to existing applications that depend on them.
1.4.2 Font Formats
PostScript interpreters support several standard formats for font programs, in-
cluding Adobe Type 1, CFF (Compact Font Format), TrueType, and CID-keyed
fonts. The PostScript language manifestations of those fonts are documented in
this book. However, the specifications for the font files themselves are published
separately, because they are highly specialized and are of interest to a different
user community. A variety of Adobe publications are available on the subject of
font formats, most notably the following:
Adobe Type 1 Font Format and Adobe Technical Note #5015, Type 1 Font Format
Supplement
Adobe Technical Note #5176, The Compact Font Format Specification
Adobe Technical Note #5012, The Type 42 Font Format Specification
Adobe Technical Note #5014, Adobe CMap and CID Font Files Specification
1.4.3 Document Structure
Some conventions have been established for the structure of PostScript programs
that are to be treated as documents. Those conventions, while not formally part
of the PostScript language, are highly recommended, since they enable interoper-
ability with applications that pay attention to them.
Adobe Technical Note #5001, PostScript Language Document Structuring Con-
ventions Specification, describes a convention for structuring PostScript page
descriptions to facilitate their handling and processing by other programs.
Adobe Technical Note #5002, Encapsulated PostScript File Format Specification,
describes a format that enables applications to treat each other’s output as in-
cluded illustrations.
1.4.4 Portable Document Format (PDF)
Adobe has specified another format, PDF, for portable representation of electron-
ic documents. PDF is documented in the Portable Document Format Reference
Manual.

PDF and the PostScript language share the same underlying Adobe imaging
model. A document can be converted straightforwardly between PDF and the
PostScript language; the two representations produce the same output when
printed. However, PDF lacks the general-purpose programming language frame-
work of the PostScript language. A PDF document is a static data structure that is
designed for efficient random access and includes navigational information suit-
able for interactive viewing.
1.5 Copyrights and Trademarks
The general idea of using a page description language is in the public domain.
Anyone is free to devise his or her own set of unique commands that constitute a
page description language. However, Adobe Systems Incorporated owns the
copyright for the list of operators and the written specification for Adobe’s Post-
Script language. Thus, these elements of the PostScript language may not be cop-
ied without Adobe’s permission. Additionally, Adobe owns the trademark
“PostScript,” which is used to identify both the PostScript language and Adobe’s
PostScript software.
Adobe will enforce its copyright and trademark rights. Adobe’s intentions are to:
Maintain the integrity of the PostScript language standard. This enables the
public to distinguish between the PostScript language and other page descrip-
tion languages.
Maintain the integrity of “PostScript” as a trademark. This enables the public
to distinguish between Adobe’s PostScript interpreter and other interpreters
that can execute PostScript language programs.

However, Adobe desires to promote the use of the PostScript language for in-
formation interchange among diverse products and applications. Accordingly,
Adobe gives permission to anyone to:
Write programs in the PostScript language.
Write drivers to generate output consisting of PostScript language commands.
Write software to interpret programs written in the PostScript language.
Copy Adobe’s copyrighted list of commands to the extent necessary to use the
PostScript language for the above purposes.
The only condition of such permission is that anyone who uses the copyrighted
list of commands in this way must include an appropriate copyright notice. This
limited right to use the copyrighted list of commands does not include a right to
copy this book, other copyrighted publications from Adobe, or the software in
Adobe’s PostScript interpreter, in whole or in part.
The trademark PostScript® (or a derivative trademark, such as PostScript® 3™)
may not be used to identify any product not originating from or licensed by
Adobe. However, it is acceptable for a non-Adobe product to be described as be-
ing PostScript-compatible and supporting a specific LanguageLevel, assuming
that the claim is true.

CHAPTER 2
Basic Ideas
Obtaining a complete understanding of the PostScript language
requires considering it from several points of view:
As a general-purpose programming language with powerful built-in graphics
primitives
As a page description language that includes programming features
As an interactive system for controlling raster output devices (printers and
displays)
As an application- and device-independent interchange format for page de-
scriptions
This chapter presents some basic ideas that are essential to understanding the
problems the PostScript language is designed to solve and the environments in
which it is designed to operate. Terminology introduced here appears throughout
the manual.
2.1 Raster Output Devices
Much of the power of the PostScript language derives from its ability to deal with
the general class of raster output devices. This class encompasses such technology
as laser, dot-matrix, and ink-jet printers, digital imagesetters, and raster scan
displays.
The defining property of a raster output device is that a printed or displayed im-
age consists of a rectangular array of dots, called pixels (picture elements), that
can be addressed individually. On a typical black-and-white output device, each
pixel can be made either black or white. On certain devices, each pixel can be set
to an intermediate shade of gray or to some color. The ability to individually set
the colors of pixels means that printed or displayed output can include text, arbi-
trary graphical shapes, and reproductions of sampled images.
The resolution of a raster output device is a measure of the number of pixels per
unit of distance along the two linear dimensions. Resolution is typically—but not
necessarily—the same horizontally and vertically.
Manufacturers’ decisions on device technology and price/performance tradeoffs
create characteristic ranges of resolution:
Computer displays have relatively low resolution, typically 75 to 110 pixels per
inch.
Dot-matrix printers generally range from 100 to 250 pixels per inch.
Ink-jet and laser-scanned xerographic printing technologies are capable of
medium-resolution output of 300 to 1400 pixels per inch.
Photographic technology permits high resolutions of 2400 pixels per inch or
more.
Higher resolution yields better quality and fidelity of the resulting output, but is
achieved at greater cost. As the technology improves and computing costs de-
crease, products evolve to higher resolutions.
2.2 Scan Conversion
An abstract graphical element (for example, a line, a circle, a text character, or a
sampled image) is rendered on a raster output device by a process known as scan
conversion. Given a mathematical description of the graphical element, this process
determines which pixels to adjust and what values to assign those pixels to
achieve the most faithful rendition possible at the device resolution.
The pixels on the page can be represented by a two-dimensional array of pixel
values in computer memory. For an output device whose pixels can be only black
or white, a single bit suffices to represent each pixel. For a device whose pixels can
reproduce gray shades or colors, multiple bits per pixel are required.
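As a rough illustration of the storage this implies, the following sketch computes the size of a full-page pixel array. The page size (8.5 by 11 inches), the resolution of 300 pixels per inch, the padding of each row to a whole number of bytes, and the procedure name rasterbytes are all assumptions chosen for the example; they are not defined by the language.

```postscript
% rasterbytes: widthInches heightInches resolution bitsPerPixel -> bytes
% Hypothetical helper; assumes each row is padded to a whole number of bytes.
/rasterbytes {
  4 dict begin
    /bpp exch def  /res exch def  /h exch def  /w exch def
    w res mul bpp mul 8 div ceiling   % bytes per row
    h res mul ceiling mul             % times the number of rows
    cvi
  end
} bind def

/bilevel 8.5 11 300 1 rasterbytes def   % 1 bit per pixel
/gray8   8.5 11 300 8 rasterbytes def   % 8 bits per pixel
```

Under these assumptions, bilevel is 1052700 bytes and gray8 is 8415000 bytes.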
Note: Although the ultimate representation of a printed or displayed page is logically
a complete array of pixels, its actual representation in computer memory need not
consist of one memory cell per pixel. Some implementations use other representa-
tions, such as display lists. The Adobe imaging model has been carefully designed not
to depend on any particular representation of raster memory.

For each graphical element that is to appear on the page, the scan converter sets
the values of the corresponding pixels. When the interpretation of the page de-
scription is complete, the pixel values in memory represent the appearance of the
page. At this point, a raster output process can make this representation visible
on a printed page or a display.
Scan-converting a graphical shape, such as a rectangle or a circle, involves deter-
mining which device pixels lie “inside” the shape and setting their values appro-
priately (for example, setting them to black). Because the edges of a shape do not
always fall precisely on the boundaries between pixels, some policy is required for
deciding which pixels along the edges are considered to be “inside.” Scan-
converting a text character is conceptually the same as scan-converting an arbi-
trary graphical shape; however, characters are much more sensitive to legibility
requirements and must meet more rigid objective and subjective measures of
quality.
Rendering grayscale elements on a bilevel device—one whose pixels can only be
black or white—is accomplished by a technique known as halftoning. The array
of pixels is divided into small clusters according to some pattern (called the
halftone screen). Within each cluster, some pixels are set to black and some to
white in proportion to the level of gray desired at that place on the page. When
viewed from a sufficient distance, the individual dots become unnoticeable and
the result is a shade of gray. This enables a bilevel raster output device to repro-
duce shades of gray and to approximate natural images, such as photographs.
Some color devices use a similar technique.
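The proportion described above can be sketched as simple arithmetic. The example assumes a cluster of 16 device pixels and uses a hypothetical procedure, blackpixels; an actual halftone is governed by the screen or threshold array in effect, not by this calculation.

```postscript
% blackpixels: gray clustersize -> count
% gray follows the setgray convention: 0 is black, 1 is white.
/blackpixels {
  exch 1 exch sub      % clustersize (1 - gray)
  mul round cvi        % number of pixels to set to black
} bind def

/quarter 0.75 16 blackpixels def   % light gray: 4 of 16 pixels black
/solid   0    16 blackpixels def   % black: all 16 pixels black
/levels  16 1 add def              % 0 through 16 black pixels: 17 shades
```

A cluster of 16 pixels can therefore represent at most 17 distinct shades, which is why larger clusters yield more gray levels at the cost of spatial detail.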
2.3 Page Description Languages
Theoretically, an application program could describe any page as a full-page pixel
array. But this would be unsatisfactory, because the description would be bulky,
the pixel array would be device-dependent, and memory requirements would be
beyond the capacity of many personal computers.
A page description language should enable applications to produce files that are
relatively compact for storage and transmission, and independent of any particu-
lar output device.

2.3.1 Imaging Model
In today’s computer printing industry, raster output devices with different prop-
erties are proliferating, as are the applications that generate output for those de-
vices. Meanwhile, expectations are also rising; typewriter emulation (text-only
output in a single typeface) is no longer adequate. Users want to create, display,
and print documents that combine sophisticated typography and graphics.
A high-level imaging model enables an application to describe the appearance of
pages containing text, graphical shapes, and sampled images in terms of abstract
graphical elements rather than in terms of device pixels. Such a description is
economical and device-independent. It can be used to produce high-quality out-
put on many different printers and displays.
A page description language is a language for expressing an imaging model. An
application program produces printed output through a two-stage process:
1. The application generates a device-independent description of the desired
output in the page description language.
2. A program controlling a specific raster output device interprets the descrip-
tion and renders it on that device.
The two stages may be executed in different places and at different times; the page
description language serves as an interchange standard for transmission and stor-
age of printable or displayable documents.
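A minimal sketch of what the first stage might produce is given here. It describes a gray rectangle and a line of text in default user space units (1/72 inch) and mentions no device pixels. The coordinates, the gray value, and the availability of a font named Helvetica are assumptions made for the example.

```postscript
%!PS
% Stage 1 output: a device-independent description of one page.
newpath
72 72 moveto          % start one inch from the left and bottom edges
216 0 rlineto         % three inches wide
0 144 rlineto         % two inches high
-216 0 rlineto
closepath
0.8 setgray fill      % fill the rectangle with light gray

0 setgray
/Helvetica findfont 18 scalefont setfont
90 130 moveto
(Device-independent text) show
showpage              % stage 2 happens in the interpreter, which renders the page
```

The same file can be sent unchanged to devices of different resolutions; only the interpreter controlling each device deals with pixels.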
2.3.2 Static versus Dynamic Formats
A page description language may have either a static or a dynamic format.
A static format provides some fixed set of operations and a syntax for specifying
the operations and their arguments. Static formats have been in existence since
computers first used printers; classic examples are format control codes for line
printers and “format effector” codes in standard character sets. Historically,
static formats have been designed to capture the capabilities of a specific class
of printing device and have evolved to include new features as needed.
A dynamic format allows more flexibility than a static format. The operator set
may be extensible and the exact meaning of an operator may not be known until
it is actually encountered. A page described in a dynamic format is a program
to be executed, rather than data to be consumed. Dynamic page
description languages contain elements of programming languages, such as
procedures, variables, and control constructs.
The PostScript language design is dynamic. The language includes a set of primi-
tive graphics operators that can be combined to describe the appearance of any
printed or displayed page. It has variables and allows arbitrary computations
while interpreting the page description. It has a rich set of programming-
language control structures for combining its elements.
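The following sketch illustrates these dynamic elements in one short page description: a variable, a procedure, and a loop that computes the position of each line while the page is being interpreted. The names radius, spoke, and nspokes, and the choice of twelve spokes on a Letter-size page, are assumptions made for the example.

```postscript
/radius 100 def            % a variable
/nspokes 0 def             % counts how many spokes have been drawn

/spoke {                   % angle spoke -
  gsave
    rotate
    0 0 moveto radius 0 lineto stroke
  grestore
  /nspokes nspokes 1 add def
} def

306 396 translate          % move the origin to the center of a Letter page
0 30 330 { spoke } for     % control construct: angles 0, 30, ..., 330
showpage
```

Changing the increment in the for loop changes the picture without any change to the procedure, which a static format could not express.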
2.4 Using the PostScript Language
It is important to understand the PostScript interpreter and how it interacts with
the applications that use it.
2.4.1 The Interpreter
The PostScript interpreter controls the actions of the output device according to
the instructions provided in a PostScript program generated by an application.
The interpreter executes the program and produces output on a printer, display,
or other raster device.
There are three ways the PostScript interpreter and the application interact
(Figure 2.1 illustrates these scenarios):
In the conventional output-only printing model, the application creates a page
description—a self-contained PostScript language description of a document.
The page description can be sent to the PostScript interpreter immediately or
stored for transmission at some other time (via an intermediate print manager
or spooler, for example). The interpreter consumes a sequence of page descrip-
tions as “print jobs” and produces the requested output. The output device is
typically a printer, but it can be a preview window on a workstation’s display.
The PostScript interpreter is often implemented on a dedicated processor that
has direct control over the raster output device.
In the integrated display model, an application interacts with the PostScript
interpreter controlling a display or windowing system. Instead of a one-way
transmission of a page description, a two-way interactive session takes place
between the application and the interpreter. In response to user actions, the
application issues commands to the PostScript interpreter and sometimes reads
information back from it.
In the interactive programming language model, an interactive session takes
place directly between a programmer and the PostScript interpreter; the pro-
grammer issues PostScript commands for immediate execution. Many Post-
Script interpreters (for both printers and displays) have a rudimentary
interactive executive to support this mode of use; see Section 2.4.4, “Using the
Interpreter Interactively.”
FIGURE 2.1 How the PostScript interpreter and an application interact
Conventional output-only printing model: Application -> Page description -> PostScript interpreter -> Printer or preview device
Integrated display model: Application <-> PostScript interpreter -> Interactive display (the application and the interpreter hold an interactive session)
Interactive programming language model: Human programmer <-> PostScript interpreter -> Any device (the programmer and the interpreter hold an interactive session)
Even when a PostScript interpreter is being used noninteractively to execute page
descriptions prepared previously, there may be some dynamic interactions be-
tween the print manager or spooler and the PostScript interpreter. For example,
the sender may ask the PostScript interpreter whether certain fonts referenced by
a document are available. This is accomplished by sending the interpreter a short
program to read and return the information. The PostScript interpreter makes no
distinction between a page description and a program that makes environmental
queries or performs other arbitrary computations.
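A query program of the kind described above might look like the following sketch. It uses the LanguageLevel 2 resourcestatus operator; the procedure name fontavailable, the font name, and the format of the reply are assumptions made for the example, and an actual print manager may use a different convention.

```postscript
% fontavailable: fontname -> bool
% Hypothetical helper built on resourcestatus, which returns
% status size true if the resource is known, or false otherwise.
/fontavailable {
  /Font resourcestatus
  { pop pop true } { false } ifelse
} bind def

/Helvetica fontavailable
{ (Helvetica: yes\n) } { (Helvetica: no\n) } ifelse
print flush                % send the reply back over standard output
```

The interpreter executes this program exactly as it would a page description; the only difference is that the program writes to standard output instead of painting a page.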

To facilitate document interchange and document management, a page descrip-
tion should conform to the structuring conventions discussed below. The struc-
turing conventions do not apply in an interactive session, since there is no notion
that the information being communicated represents a document to be preserved
for later execution; a session has no obvious overall structure.
2.4.2 Program Structure
A well-structured PostScript page description generally consists of two parts: a
prolog followed by a script. There is nothing in the PostScript language that for-
mally distinguishes the prolog from the script or imposes any overall document
structure. Such structuring is merely a convention, but one that is quite useful
and is recommended for most applications.
The prolog is a set of application-specific procedure definitions that an applica-
tion may use in the execution of its script. It is included as the first part of every
PostScript file generated by the application. It contains definitions that match
the output functions of the application with the capabilities supported by the
PostScript language.
The script is generated automatically by the application program to describe
the specific elements of the pages being produced. It consists of references to
PostScript operators and to procedure definitions in the prolog, together with
operands and data. The script, unlike the prolog, is usually very stylized, repet-
itive, and simple.
Dividing a PostScript program into a prolog and a script reduces the size of each
page description and minimizes data communication and disk storage. An exam-
ple may help explain the purpose of a separate prolog and script. One of the most
common tasks in a PostScript program is placing text at a particular location on
the current page. This is really two operations: “moving” the current point to a
specific location and “showing” the text. A program is likely to do this often, so it
is useful for the prolog to define a procedure that combines the operations:
/ms {moveto show} bind def
Later, the script can call the ms procedure instead of restating the individual op-
erations:
(some text) 100 200 ms

The script portion of a printable document ordinarily consists of a sequence of
separate pages. The description of an individual page should stand by itself, de-
pending only on the definitions in the prolog and not on anything in previous
pages of the script. The language includes facilities (described in Section 3.7,
“Memory Management”) that can be used to guarantee page independence.
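One common way to use those facilities is to bracket each page with save and restore, as in the sketch below. The procedure F and the names pagesave and scratch are hypothetical, and the font Helvetica is assumed to be available.

```postscript
% Prolog: definitions shared by every page.
/ms { moveto show } bind def
/F { findfont exch scalefont setfont } bind def   % size fontname F -

% Script: each page is bracketed by save and restore.
/pagesave save def
  12 /Helvetica F
  (Page one) 72 720 ms
  /scratch 42 def          % a definition made only for this page
pagesave restore           % discards scratch and pagesave itself
showpage

/pagesave save def
  12 /Helvetica F
  (Page two) 72 720 ms
pagesave restore
showpage
```

Because restore discards everything defined since the matching save, the second page cannot depend on scratch or on the font selected by the first page; it sees only the prolog.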
Adobe has established conventions to make document structure explicit. These
document structuring conventions appear in Adobe Technical Note #5001, PostScript
Language Document Structuring Conventions Specification. Document
structure is expressed in PostScript comments; the interpreter pays no attention
to them. However, there are good reasons to adhere to the conventions:
Utility programs can operate on structured documents in various ways: change
the order of pages, extract subsets of pages, embed individual pages within oth-
er pages, and so on. This is possible only if the original document maintains
page independence.
Print managers and spoolers can obtain useful information from a properly
structured document to determine how the document should be handled.
The structuring conventions serve as a good basis for organizing printing from
an application.
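A skeleton of a structured two-page document is sketched below. The comment keywords are those of the document structuring conventions; the title, the procedure F12, and the page contents are assumptions made for the example, and Technical Note #5001 remains the authority on which comments are required.

```postscript
%!PS-Adobe-3.0
%%Title: (example.ps)
%%Pages: 2
%%EndComments
%%BeginProlog
/ms { moveto show } bind def
/F12 { /Helvetica findfont 12 scalefont setfont } bind def
%%EndProlog
%%Page: 1 1
F12
(First page) 72 720 ms
showpage
%%Page: 2 2
F12
(Second page) 72 720 ms
showpage
%%Trailer
%%EOF
```

A utility can locate the %%Page: comments, reorder or extract the pages, and prepend the prolog to each subset without interpreting any PostScript code.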
An application has its own model of the appearance of printable output that it
generates. Some parts of this model are fixed for an entire document or for all
documents; the application should incorporate their descriptions into the prolog.
Other parts vary from one page to another; the application should produce the
necessary descriptions of these as they appear. At page boundaries, the applica-
tion should generate commands to restore the standard environment defined by
the prolog and then explicitly reestablish nonstandard portions of the environ-
ment for the next page. This technique ensures that each page is independent of
any other.
The structuring conventions also include standard methods for performing envi-
ronmental queries. These conventions ensure consistent and reliable behavior in
a variety of system environments, including those with print spoolers.
2.4.3 Translating from Other Print Formats
Many existing applications generate printable documents in some other print file
format or in some intermediate representation. It is possible to print such documents
by translating them into PostScript page descriptions. There are two scenarios
in which this need arises:
An application describes its printable output by making calls to an application
programming interface, such as GDI in Microsoft Windows® or QuickDraw™
in the Apple Mac® OS. A software component called a printer driver interprets
these calls and produces a PostScript page description.
An application produces printable output directly in some other file format,
such as PCL, HPGL, or DVI. A separate program must then translate this file
into a PostScript page description.
Implementing a driver or translator is often the least expensive way to interface
an existing application to a PostScript printer. Unfortunately, while such transla-
tion is usually straightforward, a translator may not be able to generate page
descriptions that make the best use of the high-level Adobe imaging model. This
is because the information being translated often describes the desired results at a
level that is too low; any higher-level information maintained by the original ap-
plication has been lost and is not available to the translator.
While direct PostScript output from applications is most desirable, translation
from another print format may be the only choice available for some applica-
tions. A translator should do the best it can to produce output that conforms to
the document structuring conventions (see Technical Note #5001). This ensures
that such output is compatible with the tools for manipulating PostScript page
descriptions.
2.4.4 Using the Interpreter Interactively
Normally, the interpreter executes PostScript programs generated by application
programs; a user does not interact with the PostScript interpreter directly. How-
ever, many PostScript interpreters provide an interactive executive that enables a
user to control the interpreter directly. That is, from a terminal or terminal emu-
lator connected directly to the PostScript interpreter, you can issue commands
for immediate execution and control the operation of the interpreter in limited
ways. This is useful for experimentation and debugging.
To use the interpreter this way, you must first connect your keyboard and display
directly to the standard input and output channels of the PostScript interpreter,
so that characters you type are sent directly to the interpreter and characters the
interpreter sends appear on the screen. How to accomplish this depends on the
product. A typical method is to connect a personal computer running terminal
emulation software to a PostScript printer, either by direct physical connection or
by establishing communication over a network.
Once the input and output connections are established, you can invoke the inter-
active executive by typing
executive
(all lowercase) and pressing the Return key. The interpreter responds with a
herald, such as
PostScript(r) Version 3010.106
Copyright (c) 1984-1998 Adobe Systems Incorporated.
All Rights Reserved.
PS>
The PS> prompt is an indication that the PostScript interpreter is waiting for a
command.
Each time you type a complete PostScript statement followed by the Return key,
the interpreter executes that statement and then sends another PS> prompt. If the
statement causes the interpreter to send back any output (produced by execution
of the print or = operator, for example), that output appears before the PS>
prompt. If the statement causes an error to occur, an error message appears be-
fore the PS> prompt; control remains in the interactive executive, whereas errors
normally cause a job to terminate. The interactive executive remains in operation
until you invoke the quit operator or enter a channel-dependent end-of-file indi-
cation (for example, Control-D for a serial connection).
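The statements below are examples of what might be typed at the prompt, one line at a time. They are illustrative only; the name x is an assumption, and the prompts and echoed characters produced by a particular product are not shown.

```postscript
3 4 add =                      % prints 7
(Hello, world\n) print flush   % prints Hello, world
/x 10 def
x x mul ==                     % prints 100
```

Each line is a complete statement, so the interpreter executes it as soon as the Return key is pressed and sends any output before the next prompt.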
The interactive executive provides a few simple amenities. While you are typing,
the interpreter ordinarily “echoes” the typed characters (sends them back to your
terminal so that you can see them). You can use the control characters in
Table 2.1 to make corrections while entering a statement.

TABLE 2.1 Control characters for the interactive executive
CHARACTER         FUNCTION
Backspace (BS)    Backs up and erases one character.
Delete (DEL)      Same as backspace.
Control-U         Erases the current line.
Control-R         Redisplays the current line.
Control-C         Aborts the entire statement and starts over. Control-C can also
                  abort a statement that is executing and force the executive to
                  revert to a PS> prompt.
There are several important things to understand about the interactive executive:
It is intended solely for direct interaction with the user; an application that is
generating PostScript programs should never invoke executive. In general, a
PostScript program will behave differently when sent through the interactive
executive than when executed directly by the PostScript interpreter. For exam-
ple, the executive produces extraneous output such as echoes of the input char-
acters and PS> prompts. Furthermore, a program that explicitly reads data
embedded in the program file will malfunction if invoked via the executive,
since the executive itself is interpreting the file.
The user amenities are intentionally minimal. The executive is not a full-scale
programming environment; it lacks a text editor and other tools required for
program development and it does not keep a record of your interactive session.
The executive is useful mainly for experimentation and debugging.
The executive operator is not necessarily available in all PostScript interpreters.
Its behavior may vary among different products.
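The first point above mentions programs that read data embedded in the program file. A minimal sketch of such a program follows; it is meant to be executed directly from a file, not typed into the executive. The names buf and line and the buffer length of 80 are assumptions made for the example.

```postscript
/buf 80 string def
currentfile buf readline
This line is data, not PostScript
pop /line exch def         % discard the boolean, keep the substring
line =                     % prints the data line
```

When the file is executed directly, readline consumes the second line as data before the scanner can treat it as program text. Under the executive, the input is gathered one statement at a time, so the data would not be where the program expects to find it.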


CHAPTER 3
Language
Syntax, data types, and execution semantics are essential aspects
of any PostScript program. Later chapters document the graphics and font capa-
bilities that specialize PostScript programs to the task of controlling the appear-
ance of a printed page. This chapter explains the PostScript language as a
programming language.
Like all programming languages, the PostScript language builds on elements and
ideas from several of the great programming languages. The syntax most closely
resembles that of the programming language FORTH. It incorporates a postfix
notation in which operators are preceded by their operands. The number of spe-
cial characters is small and there are no reserved words.
Note: Although the number of built-in operators is large, the names that represent
operators are not reserved by the language. A PostScript program may change the
meanings of operator names.
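Both points can be seen in a short sketch. The name result is an assumption made for the example; redefining showpage as an empty procedure is a common practice when one document is included inside another.

```postscript
3 4 add 5 mul              % postfix form of (3 + 4) * 5; leaves 35
/result exch def

% Operator names are ordinary names looked up in dictionaries, so a
% program can give one a new meaning in the current dictionary.
/showpage { } def
showpage                   % now executes the empty procedure
```

The built-in showpage operator is not destroyed; the new definition merely takes precedence for as long as it remains on the dictionary stack.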

The data model includes elements, such as numbers, strings, and arrays, that are
found in many modern programming languages. It also includes the ability to
treat programs as data and to monitor and control many aspects of the language’s
execution state; these notions are derived from programming languages such as
LISP.
The PostScript language is relatively simple. It derives its power from the ability
to combine these features in unlimited ways without arbitrary restrictions.
Though you may seldom fully exploit this power, you can design sophisticated
graphical applications that would otherwise be difficult or impossible.
Because this is a reference book and not a tutorial, this chapter describes each as-
pect of the language systematically and thoroughly before moving on to the next.
It begins with a brief overview of the PostScript interpreter. The following sec-
tions detail the syntax, data types, execution semantics, memory organization,
and general-purpose operators of the PostScript language (excluding those that
deal with graphics and fonts). The final sections cover file input and output,
named resources, function dictionaries, errors, how the interpreter evaluates
name objects, and details on filtered files and binary encoding.
3.1 Interpreter
The PostScript interpreter executes the PostScript language according to the rules
in this chapter. These rules determine the order in which operations are carried
out and how the pieces of a PostScript program fit together to produce the de-
sired results.
The interpreter manipulates entities called PostScript objects. Some objects are
data, such as numbers, boolean values, strings, and arrays. Other objects are ele-
ments of programs to be executed, such as names, operators, and procedures.
However, there is not a distinction between data and programs; any PostScript
object may be treated as data or be executed as part of a program.
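The following sketch treats one procedure first as data and then as a program. The names proc, n, op, and sum are assumptions made for the example. The load operator is used to fetch the procedure without executing it.

```postscript
/proc { 1 2 add } def          % a procedure is an executable array
/n   /proc load length def     % as data: it has 3 elements
/op  /proc load 2 get def      % as data: its last element is the name add
/sum proc def                  % as a program: executing it yields 3
```

The same object serves both purposes; only the way it is accessed differs.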
The interpreter operates by executing a sequence of objects. The effect of exe-
cuting a particular object depends on that object’s type, attributes, and value. For
example, executing a number object causes the interpreter to push a copy of that
object on the operand stack (to be described shortly). Executing a name object
causes the interpreter to look up the name in a dictionary, fetch the associated
value, and execute it. Executing an operator object causes the interpreter to
perform a built-in action, such as adding two numbers or painting characters in
raster memory.
The objects to be executed by the interpreter come from two principal sources:
A character stream may be scanned according to the syntax rules of the Post-
Script language, producing a sequence of new objects. As each object is
scanned, it is immediately executed. The character stream may come from an
external source, such as a file or a communication channel, or it may come
from a string object previously stored in the PostScript interpreter’s memory.
Objects previously stored in an array in memory may be executed in sequence.
Such an array is known as a procedure.
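The two sources can be compared directly, as in this sketch. The names fromstring and fromproc are assumptions made for the example.

```postscript
% Source 1: a character stream held in a string is scanned and executed.
/fromstring (3 4 add) cvx exec def     % 7

% Source 2: objects already stored in an array are executed in sequence.
/fromproc { 3 4 add } exec def         % 7
```

In the first case the scanner creates the three objects each time the string is executed; in the second they were created once, when the procedure was scanned.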

The interpreter can switch back and forth between executing a procedure and
scanning a character stream. For example, if the interpreter encounters a name in
a character stream, it executes that name by looking it up in a dictionary and re-
trieving the associated value. If that value is a procedure object, the interpreter
suspends scanning the character stream and begins executing the objects in the
procedure. When it reaches the end of the procedure, it resumes scanning the
character stream where it left off. The interpreter maintains an execution stack for
remembering all of its suspended execution contexts.
3.2 Syntax
As the interpreter scans the text of a PostScript program, it creates various types
of PostScript objects, such as numbers, strings, and procedures. This section dis-
cusses only the syntactic representation of such objects. Their internal representa-
tion and behavior are covered in Section 3.3, “Data Types and Objects.”
There are three encodings for the PostScript language: ASCII, binary token, and
binary object sequence. The ASCII encoding is preferred for expository purposes
(such as this book), for archiving documents, and for transmission via communi-
cations facilities, because it is easy to read and does not rely on any special charac-
ters that might be reserved for communications use. The two binary encodings
are usable in controlled environments to improve the efficiency of representation
or execution; they are intended exclusively for machine generation. Detailed in-
formation on the binary encodings is provided in Section 3.14, “Binary Encoding
Details.”
3.2.1 Scanner
The PostScript language differs from most other programming languages in that
it does not have any syntactic entity for a “program,” nor is it necessary for an en-
tire “program” to exist in one place at one time. There is no notion of “reading in”
a program before executing it. Instead, the PostScript interpreter consumes a pro-
gram by reading and executing one syntactic entity at a time. From the interpret-
er’s point of view, the program has no permanent existence. Execution of the
program may have side effects in the interpreter’s memory or elsewhere. These
side effects may include the creation of procedure objects in memory that are in-
tended to be invoked later in the program; their execution is deferred.

It is not correct to think that the PostScript interpreter “executes” the character
stream directly. Rather, a scanner groups characters into tokens according to the
PostScript language syntax rules. It then assembles one or more tokens to create a
PostScript object—in other words, a data value in the interpreter’s memory.
Finally, the interpreter executes the object.
For example, when the scanner encounters a group of consecutive digits sur-
rounded by spaces or other separators, it assembles the digits into a token and
then converts the token into a number object represented internally as a binary
integer. The interpreter then executes this number object; in this case, it pushes a
copy of the object on the operand stack.
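As a rough illustration, the token operator runs the scanner on a string without executing the result, which makes the scanning step visible on its own. The string contents here are arbitrary.

```postscript
(123 abc) token    % stack: remainder-of-string 123 true
pop                % drop the boolean that reports success
dup type =         % prints integertype
=                  % prints 123
pop                % drop the unscanned remainder of the string
```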
3.2.2 ASCII Encoding
The standard character set for ASCII-encoded PostScript programs is the visible
printable subset of the ASCII character set, plus characters that appear as “white
space,” such as space, tab, and newline characters. ASCII is the American Stan-
dard Code for Information Interchange, a widely used convention for encoding
characters as binary numbers. ASCII encoding does not prohibit the use of char-
acters outside this set, but such use is not recommended, because it impairs port-
ability and may make transmission and storage of PostScript programs more
difficult.
Note: Control characters are often usurped by communications functions. Control
codes are device-dependent—not part of the PostScript language. For example, the
serial communication protocol supported by many products uses the Control-D
character as an end-of-file indication. In such cases, Control-D is a communications
function and should not be part of a PostScript program.

White-space characters (Table 3.1) separate syntactic constructs such as names
and numbers from each other. The interpreter treats any number of consecutive
white-space characters as if there were just one. All white-space characters are
equivalent, except in comments and strings.
The characters carriage return (CR) and line feed (LF) are also called newline
characters. The combination of a carriage return followed immediately by a line
feed is treated as one newline.

TABLE 3.1 White-space characters

OCTAL   HEXADECIMAL   DECIMAL   NAME
000     00            0         Null (nul)
011     09            9         Tab (tab)
012     0A            10        Line feed (LF)
014     0C            12        Form feed (FF)
015     0D            13        Carriage return (CR)
040     20            32        Space (SP)

The characters (, ), <, >, [, ], {, }, /, and % are special. They delimit syntactic entities
such as strings, procedure bodies, name literals, and comments. Any of these
characters terminates the entity preceding it and is not included in the entity.
All characters besides the white-space characters and delimiters are referred to as
regular characters. These include nonprinting characters that are outside the rec-
ommended PostScript ASCII character set.
Comments
Any occurrence of the character % outside a string introduces a comment. The
comment consists of all characters between the % and the next newline or form
feed, including regular, delimiter, space, and tab characters.
The scanner ignores comments, treating each one as if it were a single white-
space character. That is, a comment separates the token preceding it from the one
following. Thus the ASCII-encoded program fragment
abc% comment {/%) blah blah blah
123
is treated by the scanner as just two tokens: abc and 123.
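A runnable variant of the same idea, given as a sketch: the comment terminates the token before it, and a % inside a string does not start a comment.

```postscript
1 2% the comment ends the token "2" and runs to the end of the line
add =        % prints 3
(50% off) =  % prints 50% off; inside a string, % is an ordinary character
```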

Numbers
Numbers in the PostScript language include:
• Signed integers, such as
  123 -98 43445 0 +17
• Real numbers, such as
  -.002 34.5 -3.62 123.6e10 1.0E-5 1E6 -1. 0.0
• Radix numbers, such as
  8#1777 16#FFFE 2#1000
An integer consists of an optional sign followed by one or more decimal digits.
The number is interpreted as a signed decimal integer and is converted to an inte-
ger object. If it exceeds the implementation limit for integers, it is converted to a
real object. (See Appendix B for implementation limits.)
A real number consists of an optional sign and one or more decimal digits, with
an embedded period (decimal point), a trailing exponent, or both. The exponent,
if present, consists of the letter E or e followed by an optional sign and one or
more decimal digits. The number is interpreted as a real number and is converted
to a real (floating-point) object. If it exceeds the implementation limit for real
numbers, a limitcheck error occurs.
A radix number takes the form base#number, where base is a decimal integer in
the range 2 through 36. number is interpreted in this base; it must consist of digits
ranging from 0 to base − 1. Digits greater than 9 are represented by the letters A
through Z (or a through z). The number is treated as an unsigned integer and is
converted to an integer object having the same twos-complement binary repre-
sentation. This notation is intended for specifying integers in a nondecimal radix,
such as binary, octal, or hexadecimal. If the number exceeds the implementation
limit for integers, a limitcheck error occurs.
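The three number forms can be checked with the type operator. This is a small sketch using values from the examples above.

```postscript
123 type =       % prints integertype
1E6 type =       % prints realtype  (an exponent alone makes it real)
-1. type =       % prints realtype  (so does a decimal point)
8#1777 =         % prints 1023
16#FFFE =        % prints 65534
2#1000 =         % prints 8
```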

Strings
There are three conventions for quoting a literal string object:
• As literal text, enclosed in ( and )
• As hexadecimal data, enclosed in < and >
• As ASCII base-85 data, enclosed in <~ and ~> (LanguageLevel 2)
Literal Text Strings
A literal text string consists of an arbitrary number of characters enclosed in
( and ). Any characters may appear in the string other than (, ), and \, which must
be treated specially. Balanced pairs of parentheses in the string require no special
treatment.
The following lines show several valid strings:
(This is a string)
(Strings may contain newlines
and such.)
(Strings may contain special characters *!&}^% and
balanced parentheses ( ) (and so on).)
(The following is an empty string.)
()
(It has 0 (zero) length.)
Within a text string, the \ (backslash) character is treated as an “escape” for vari-
ous purposes, such as including newline characters, unbalanced parentheses, and
the \ character itself in the string. The character immediately following the \ de-
termines its precise interpretation.
\n      line feed (LF)
\r      carriage return (CR)
\t      horizontal tab
\b      backspace
\f      form feed
\\      backslash
\(      left parenthesis
\)      right parenthesis
\ddd    character code ddd (octal)

If the character following the \ is not in the preceding list, the scanner ignores the
\. If the \ is followed immediately by a newline (CR, LF, or CR-LF pair), the scan-
ner ignores both the initial \ and the newline; this breaks a string into multiple
lines without including the newline character as part of the string, as in the fol-
lowing example:
(These \
two strings \
are the same.)
(These two strings are the same.)
But if a newline appears without a preceding \, the result is equivalent to \n. For
example:
(This string has a newline at the end of it.
)
(So does this one.\n)
For more information about end-of-line conventions, see Section 3.8, “File Input
and Output.”
The \ddd form may be used to include any 8-bit character constant in a string.
One, two, or three octal digits may be specified, with high-order overflow ig-
nored. This notation is preferred for specifying a character outside the recom-
mended ASCII character set for the PostScript language, since the notation itself
stays within the standard set and thereby avoids possible difficulties in transmit-
ting or storing the text of the program. It is recommended that three octal digits
always be used, with leading zeros as needed, to prevent ambiguity. The string
(\0053), for example, contains two characters—an ASCII 5 (Control-E) followed
by the digit 3—whereas the strings (\53) and (\053) contain one character, the
ASCII character whose code is octal 53 (plus sign).
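The escape rules can be confirmed by measuring string lengths. The following sketch uses the strings discussed above plus two short ones made up for illustration.

```postscript
(\0053) length =   % prints 2  (Control-E, then the digit 3)
(\053) length =    % prints 1  (plus sign)
(\053) =           % prints +
(a\
b) length =        % prints 2  (backslash and newline are both dropped)
(a
b) length =        % prints 3  (a bare newline is kept, equivalent to \n)
```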
Hexadecimal Strings
A hexadecimal string consists of a sequence of hexadecimal digits (0–9 and either
A–F or a–f) enclosed within < and >. Each pair of hexadecimal digits defines one
character of the string. White-space characters are ignored. If a hexadecimal
string contains characters outside the allowed character set, a syntaxerror occurs.
Hexadecimal strings are useful for including arbitrary binary data as literal text.

If the final digit of a given hexadecimal string is missing—in other words, if there
is an odd number of digits—the final digit is assumed to be 0. For example,
<901fa3> is a 3-character string containing the characters whose hexadecimal
codes are 90, 1f, and a3, but <901fa> is a 3-character string containing the charac-
ters whose hexadecimal codes are 90, 1f, and a0.
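A brief sketch of these rules in use; the third string is an added illustration, not from the text above.

```postscript
<901fa3> length =     % prints 3
<901fa> 2 get =       % prints 160  (16#a0: the missing final digit is taken as 0)
<48 65 6C 6C 6F> =    % prints Hello  (white space inside < > is ignored)
```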
ASCII Base-85 Strings
An ASCII base-85 string (LanguageLevel 2) consists of a sequence of printable
ASCII characters enclosed in <~ and ~>. This notation represents arbitrary bi-
nary data using an encoding technique that produces a 4:5 expansion as opposed
to the 1:2 expansion for hexadecimal. The ASCII base-85 encoding algorithm is
described under “ASCII85Encode Filter” on page 131. If an ASCII base-85 string
is malformed, a syntaxerror occurs.
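As a minimal illustration, assuming a LanguageLevel 2 or later interpreter, the five characters 9jqo^ encode the four bytes of the text "Man " (with a trailing space), which shows the 4:5 expansion directly.

```postscript
<~9jqo^~> length =       % prints 4
<~9jqo^~> (Man ) eq =    % prints true
```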
Names
Any token that consists entirely of regular characters and cannot be interpreted as
a number is treated as a name object (more precisely, an executable name). All
characters except delimiters and white-space characters can appear in names, in-
cluding characters ordinarily considered to be punctuation.
The following are examples of valid names:
abc Offset $$ 23A 13-456 a.b $MyDict @pattern
Use care when choosing names that begin with digits. For example, while 23A is a
valid name, 23E1 is a real number, and 23#1 is a radix number token that repre-
sents an integer.
A / (slash—not backslash) introduces a literal name. The slash is not part of the
name itself, but is a prefix indicating that the following sequence of zero or more
regular characters constitutes a literal name object. There can be no white-space
characters between the / and the name. The characters // (two slashes) introduce
an immediately evaluated name. The important properties and uses of names and
the distinction between executable and literal names are described in Section 3.3,
“Data Types and Objects”; immediately evaluated names are discussed in
Section 3.12.2, “Immediately Evaluated Names.”
Note: The token / (a slash followed by no regular characters) is a valid literal name.
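The difference between look-alike tokens can be checked with type and xcheck. In this sketch the names are written in literal form (with /) so that they are pushed rather than looked up.

```postscript
/23A type =         % prints nametype
23E1 type =         % prints realtype
23#1 type =         % prints integertype
/abc xcheck =       % prints false  (literal name)
/abc cvx xcheck =   % prints true   (executable name with the same characters)
```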

Arrays
The characters [ and ] are self-delimiting tokens that specify the construction of
an array. For example, the program fragment
[ 123 /abc (xyz) ]
results in the construction of an array object containing the integer object 123,
the literal name object abc, and the string object xyz. Each token within the
brackets is executed in turn.
The [ and ] characters are special syntax for names that, when executed, invoke
PostScript operators that collect objects and construct an array containing them.
Thus the example
[ 123 /abc (xyz) ]
contains these five tokens:
• The name object [
• The integer object 123
• The literal name object abc
• The string object xyz
• The name object ]
When the example is executed, a sixth object (the array) results from executing
the [ and ] name objects.
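Because the tokens between the brackets are executed, not merely collected, operators may appear inside them. A short sketch:

```postscript
[ 123 /abc (xyz) ] length =      % prints 3
[ 1 2 add ] 0 get =              % prints 3  (add ran while the array was being built)
mark 123 /abc (xyz) ] length =   % prints 3  ([ pushes a mark, as the mark operator does)
```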
Procedures
The special characters { and } delimit an executable array, otherwise known as a
procedure. The syntax is superficially similar to that for the array construction op-
erators [ and ]; however, the semantics are entirely different and arise as a result of
scanning the procedure rather than executing it.
Scanning the program fragment
{add 2 div}

produces a single procedure object that contains the name object add, the integer
object 2, and the name object div. When the scanner encounters the initial {, it
continues scanning and creating objects, but the interpreter does not execute
them. When the scanner encounters the matching }, it puts all the objects created
since the initial { into a new executable array (procedure) object.
The interpreter does not execute a procedure immediately, but treats it as data; it
pushes the procedure on the operand stack. Only when the procedure is explicitly
invoked (by means yet to be described) will it be executed. Execution of the pro-
cedure—and of all objects within the procedure, including any embedded proce-
dures—has been deferred. The matter of immediate versus deferred execution is
discussed in Section 3.5, “Execution.”
The procedure object created by { and } is either an array or a packed array,
according to the current setting of a mode switch. The distinction between these
array types is discussed in Section 3.3, “Data Types and Objects.”
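The deferred execution described here can be observed directly. In this sketch the exec operator stands in for the means of invocation that are described later.

```postscript
{ add 2 div } length =      % prints 3; nothing inside the braces was executed
{ add 2 div } xcheck =      % prints true  (the array is executable)
4 6 { add 2 div } exec =    % prints 5.0   (explicit invocation)
```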
Dictionaries
The special character sequences << and >> (LanguageLevel 2) are self-delimiting
tokens that denote the construction of a dictionary, much the same as [ and ] de-
note the construction of an array. They are intended to be used as follows:
<< key1 value1 key2 value2 … keyn valuen >>
This creates a dictionary containing the bracketed key-value pairs and pushes it
on the operand stack. Dictionaries are introduced in Section 3.3, “Data Types and
Objects.”
<< and >> are merely special names for operators that, when executed, cause a
dictionary to be constructed. They are like the [ and ] array construction opera-
tors, but unlike the { and } delimiters for procedure literals.
The << and >> tokens are self-delimiting, so they need not be surrounded by
white-space characters or other delimiters. Do not confuse these tokens with
< and >, which delimit a hexadecimal string literal, or <~ and ~>, which delimit
an ASCII base-85 string literal. The << and >> tokens are objects in their own
right (specifically, name objects), whereas in < … > and <~ … ~> the delimiting
characters are merely punctuation for the enclosed literal string objects.
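A minimal sketch, assuming a LanguageLevel 2 or later interpreter; the key names are arbitrary. As with [ and ], the enclosed tokens are executed before the dictionary is built.

```postscript
<< /width 612 /height 792 >> /height get =   % prints 792
<</a 1>> length =                            % prints 1  (no white space needed around << and >>)
<< /sum 1 2 add >> /sum get =                % prints 3
```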

3.3 Data Types and Objects
All data accessible to PostScript programs, including procedures that are part of
the programs themselves, exists in the form of objects. Objects are produced, ma-
nipulated, and consumed by the PostScript operators. They are also created by
the scanner and executed by the interpreter.
Each object has a type, some attributes, and a value. Objects contain their own dy-
namic types; that is, an object’s type is a property of the object itself, not of where
it is stored or what it is called. Table 3.2 lists all the object types supported by the
PostScript language. Extensions to the language may introduce additional object
types. The distinction between simple and composite objects is explained below.
TABLE 3.2 Types of objects

SIMPLE OBJECTS    COMPOSITE OBJECTS
boolean           array
fontID            dictionary
integer           file
mark              gstate (LanguageLevel 2)
name              packedarray (LanguageLevel 2)
null              save
operator          string
real

3.3.1 Simple and Composite Objects
Objects of most types are simple, atomic entities. An atomic object is always con-
stant—a 2 is always a 2. There is no visible substructure in the object; the type, at-
tributes, and value are irrevocably bound together and cannot be changed.
However, objects of certain types indicated in Table 3.2 are composite. Their
values are separate from the objects themselves; for some types of composite object,
the values have internal substructure that is visible and can sometimes be
modified selectively. The details of the substructures are presented later in the
descriptions of these individual types.
An important distinction between simple and composite objects is the behavior
of operations that copy objects. Copy refers to any operation that transfers the
contents of an object from one place to another in the memory of the PostScript
interpreter. “Fetching” and “storing” objects are copying operations. It is possible
to derive a new object by copying an existing one, perhaps with modifications.
When a simple object is copied, all of its parts (type, attributes, and value) are
copied together. When a composite object is copied, the value is not copied; in-
stead, the original and copy objects share the same value. Consequently, any
changes made to the substructure of one object’s value also appear as part of the
other object’s value.
The sharing of composite objects’ values in the PostScript language corresponds
to the use of pointers in system programming languages such as C and Pascal. In-
deed, the PostScript interpreter uses pointers to implement shared values: a com-
posite object contains a pointer to its value. However, the PostScript language
does not have any explicit notion of a pointer. It is better to think in terms of the
copying and sharing notions presented here.
The values of simple objects are contained in the objects themselves. The values
of composite objects reside in a special region of memory called virtual memory
or VM. Section 3.7, “Memory Management,” describes the behavior of VM.
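The sharing behavior can be sketched as follows. The names a1, a2, and a3 are hypothetical; the copy operator is used to obtain an array with a separate value.

```postscript
/a1 [ 1 2 3 ] def
/a2 a1 def              % a2 is a copy of the array object; both share one value
a2 0 99 put
a1 0 get =              % prints 99  (the change is visible through a1)

/a3 a1 length array def % a new array with its own value
a1 a3 copy pop          % copy the elements across, discard the returned array
a3 0 7 put
a1 0 get =              % prints 99  (a1 is unaffected)
```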
3.3.2 Attributes of Objects
In addition to type and value, each object has one or more attributes. These
attributes affect the behavior of the object when it is executed or when certain op-
erations are performed on it. They do not affect its behavior when it is treated
strictly as data; so, for example, two integers with the same value are considered
“equal” even if their attributes differ.

Literal and Executable
Every object is either literal or executable. This distinction comes into play when
the interpreter attempts to execute the object.
• If the object is literal, the interpreter treats it strictly as data and pushes it on
  the operand stack for use as an operand of some subsequent operator.
• If the object is executable, the interpreter executes it.
What it means to execute an object depends on the object’s type; this is described
in Section 3.5, “Execution.” For some object types, such as integers, execution
consists of pushing the object on the operand stack; the distinction between lit-
eral and executable integers is meaningless. But for other types, such as names,
operators, and arrays, execution consists of performing a different action.
• Executing an executable name causes it to be looked up in the current dictionary
  context and the associated value to be executed.
• Executing an executable operator causes some built-in action to be performed.
• Executing an executable array (otherwise known as a procedure) causes the elements
  of the array to be executed in turn.
As described in Section 3.2, “Syntax,” some tokens produce literal objects and
some produce executable ones.
• Integer, real, and string constants are always literal objects.
• Names are literal if they are preceded by / and executable if they are not.
• The [ and ] operators, when executed, produce a literal array object with the enclosed
  objects as elements. Likewise, << and >> (LanguageLevel 2) produce a
  literal dictionary object.
• { and } enclose an executable array or procedure.
Note: As mentioned above, it does not matter whether an object is literal or execut-
able when it is accessed as data, only when it is executed. However, referring to an
executable object by name often causes that object to be executed automatically; see
Section 3.5.5, “Execution of Specific Types.” To avoid unintended behavior, it is best
to use the executable attribute only for objects that are meant to be executed, such as
procedures.
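The following sketch contrasts a literal array with an executable one and shows that referring to each by name behaves differently. The names lit and proc are hypothetical.

```postscript
/lit  [ 1 2 add ] def     % literal array; add ran during construction
/proc { 1 2 add } def     % executable array; nothing has run yet
lit length =              % prints 1  (the name yields the array as data)
proc =                    % prints 3  (the name causes the procedure to execute)
/proc load xcheck =       % prints true   (load fetches without executing)
/lit load xcheck =        % prints false
(1 2 add) cvx exec =      % prints 3  (a string made executable is scanned and run)
```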


Access
The other attribute of an object is its access. Only composite objects have access
attributes, which restrict the set of operations that can be performed on the ob-
ject’s value.
There are four types of access. In increasing order of restriction, they are:
1. Unlimited. Normally, objects have unlimited access: all operations defined for
that object are allowed. However, packed array objects always have read-only
(or even more restricted) access.
2. Read-only. An object with read-only access may not have its value written, but
may still be read or executed.
3. Execute-only. An object with execute-only access may not have its value either
read or written, but may still be executed by the PostScript interpreter.
4. None. An object with no access may not be operated on in any way by a Post-
Script program. Such objects are not of any direct use to PostScript programs,
but serve internal purposes that are not documented in this book.
The literal/executable distinction and the access attribute are entirely indepen-
dent, although there are combinations that are not of any practical use (for exam-
ple, a literal array that is execute-only).
With one exception, attributes are properties of an object itself and not of its
value. Two composite objects can share the same value but have different
literal/executable or access attributes. The exception is the dictionary type: a dic-
tionary’s access attribute is a property of the value, so multiple dictionary objects
sharing the same value have the same access attribute.
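Access attributes can be inspected with rcheck and wcheck. This is a minimal sketch; the names s and p are hypothetical.

```postscript
/s (abc) readonly def
s rcheck =          % prints true
s wcheck =          % prints false

/p { 1 2 add } executeonly def
/p load rcheck =    % prints false  (the elements can no longer be read)
p =                 % prints 3      (but the procedure can still be executed)
```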
3.3.3 Integer and Real Objects
The PostScript language provides two types of numeric object: integer and real.
Integer objects represent mathematical integers within a certain interval centered
at 0. Real objects approximate mathematical real numbers within a much larger
interval, but with limited precision; they are implemented as floating-point num-
bers.

Most PostScript arithmetic and mathematical operators can be applied to num-
bers of both types. The interpreter performs automatic type conversion when
necessary. Some operators expect only integers or a subrange of the integers as
operands. There are operators to convert from one data type to another explicitly.
Throughout this book, number means an object whose type is either integer or
real.
The range and precision of numbers is limited by the internal representations
used in the machine on which the PostScript interpreter is running. Appendix B
gives these limits for typical implementations of the PostScript interpreter.
Note: The machine representation of integers is accessible to a PostScript program
through the bitwise operators. However, the representation of integers may depend
on the CPU architecture of the implementation. The machine representation of real
numbers is not accessible to PostScript programs.
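A few one-line checks illustrate automatic and explicit conversion between the two numeric types:

```postscript
1 2 add type =     % prints integertype
1 2.0 add type =   % prints realtype  (the integer operand is converted)
7 2 div =          % prints 3.5       (div always produces a real)
7 2 idiv =         % prints 3         (idiv accepts only integers)
3.7 cvi =          % prints 3         (explicit conversion truncates toward 0)
3 cvr =            % prints 3.0
```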

3.3.4 Boolean Objects
The PostScript language provides boolean objects with values true and false for
use in conditional and logical expressions. The names true and false are associ-
ated with values of this type. Boolean objects are the results of the relational
(comparison) and logical operators. Various other operators return them as sta-
tus information. Boolean objects are mainly used as operands for the control op-
erators if and ifelse.
3.3.5 Array Objects
An array is a one-dimensional collection of objects accessed by a numeric index.
Unlike arrays in many other computer languages, PostScript arrays may be heter-
ogeneous; that is, an array’s elements may be any combination of numbers,
strings, dictionaries, other arrays, or any other objects. A procedure is an array
that can be executed by the PostScript interpreter.
All arrays are indexed from 0, so an array of n elements has indices from 0
through n − 1. All accesses to arrays are bounds-checked, and a reference with an
out-of-bounds index results in a rangecheck error. The length of an array is sub-
ject to an implementation limit; see Appendix B.

The PostScript language directly supports only one-dimensional arrays. Arrays of
higher dimension can be constructed by using arrays as elements of arrays, nested
to any depth.
As discussed earlier, an array is a composite object. When an array object is cop-
ied, the value is not copied. Instead, the old and new objects share the same value.
Additionally, there is an operator (getinterval) that creates a new array object
whose value is a subinterval of an existing array; the old and new objects share
the array elements in that subinterval.
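Subinterval sharing and nested arrays can be sketched as follows; the names a, sub, and grid are hypothetical.

```postscript
/a [ 10 20 30 40 ] def
/sub a 1 2 getinterval def   % elements 1 and 2 of a
sub length =                 % prints 2
sub 0 99 put                 % writes through to the shared elements
a 1 get =                    % prints 99

/grid [ [ 1 2 ] [ 3 4 ] ] def   % a two-dimensional array built from arrays
grid 1 get 0 get =              % prints 3
```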
3.3.6 Packed Array Objects
A packed array is a more compact representation of an ordinary array, intended
primarily for use as a procedure. A packed array object is distinct from an ordi-
nary array object (it has type packedarray instead of array), but in most respects it
behaves the same as an ordinary array. Its principal distinguishing feature is that
it usually occupies much less space in memory (see Section B.2, “Virtual Memory
Use”).
Throughout this book, any mention of a procedure may refer to either an execut-
able array or an executable packed array. The two types of array are not distin-
guishable when they are executed, only when they are treated as data. See the
introduction to the array operators in Section 3.6, “Overview of Basic Operators.”
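As a small illustration, the packedarray operator builds a packed array from operands on the stack. The result can be read like an ordinary array but not written.

```postscript
1 2 3 3 packedarray   % three elements, then the count
dup type =            % prints packedarraytype
dup 1 get =           % prints 2
wcheck =              % prints false  (packed arrays are at most read-only)
```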
3.3.7 String Objects
A string is similar to an array, but its elements must be integers in the range 0 to
255. The string elements are not integer objects, but are stored in a more compact
format. However, the operators that access string elements accept or return ordi-
nary integer objects with values in the range 0 to 255. The length of a string is
subject to an implementation limit; see Appendix B.
String objects are conventionally used to hold text, one character per string
element. However, the PostScript language does not have a distinct “character”
syntax or data type and does not require that the integer elements of a string en-
code any particular character set. String objects may also be used to hold arbi-
trary binary data.

To enhance program portability, strings appearing literally as part of a PostScript
program should be limited to characters from the printable ASCII character set,
with other characters inserted by means of the \ddd escape convention (see
Section 3.2.2, “ASCII Encoding”). ASCII text strings are fully portable; ASCII
base-85 text strings are fully portable among LanguageLevel 2 and
LanguageLevel 3 PostScript interpreters.
Like an array, a string is a composite object. Copying a string object or creating a
subinterval (substring) results in sharing the string’s value.
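String elements are read and written as integers, and a substring shares storage with its parent. A short sketch with hypothetical names s and t:

```postscript
(abc) 0 get =              % prints 97  (the character code of "a")
/s (hello) def
/t s 0 2 getinterval def   % t shares the first two elements of s
t 0 72 put                 % 72 is the code for "H"
s =                        % prints Hello
```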
3.3.8 Name Objects
A name is an atomic symbol uniquely defined by a sequence of characters. Names
serve the same purpose as “identifiers” in other programming languages: as tags
for variables, procedures, and so on. However, PostScript names are not just lan-
guage artifacts, but are first-class data objects, similar to “atoms” in LISP.
A name object is ordinarily created when the scanner encounters a PostScript to-
ken consisting entirely of regular characters, perhaps preceded by /, as described
in Section 3.2, “Syntax.” However, a name may also be created by explicit conver-
sion from a string, so there is no restriction on the set of characters that can be
included in names. The length of a name, however, is subject to an implementa-
tion limit; see Appendix B.
Unlike a string, a name is a simple object not made up of other objects. Although
a name is defined by a sequence of characters, those characters are not “elements”
of the name. A name object, although logically simple, does have an invisible
“value” that occupies space in VM.
A name is unique. Any two name objects defined by the same sequence of charac-
ters are identical copies of each other. Name equality is based on an exact match
between the corresponding characters defining each name. The case of letters
must match, so the names A and a are different. A literal name and an executable
name defined by the same characters are equal, however.
The interpreter can efficiently determine whether two existing name objects are
equal without comparing the characters that define the names. This makes names
useful as keys in dictionaries.
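Name equality can be checked with the eq operator, and cvn creates a name from any string. A minimal sketch:

```postscript
/abc (abc) cvn eq =      % prints true   (same characters, same name)
/A /a eq =               % prints false  (case matters)
/abc /abc cvx eq =       % prints true   (literal and executable forms are equal)
(two words) cvn type =   % prints nametype  (a name the scanner could not produce)
```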

Names do not have values, unlike variable or procedure names in other program-
ming languages. However, names can be associated with values in dictionaries.
3.3.9 Dictionary Objects
A dictionary is an associative table whose entries are pairs of PostScript objects.
The first element of an entry is the key and the second element is the value. The
PostScript language includes operators that insert an entry into a dictionary, look
up a key and fetch the associated value, and perform various other operations.
Keys are normally name objects. The PostScript syntax and the interpreter are
optimized for this most common case. However, a key may be any PostScript object
except null (defined later). If you attempt to use a string as a key, the PostScript
interpreter will first convert the string to a name object; thus, strings and
names are interchangeable when used as keys in dictionaries. Consequently, a string
used as a dictionary key is subject to the implementation limit on the length of a
name.
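The interchangeability of string and name keys, and the use of a non-name key, can be sketched as follows. The dictionary name d and the keys are hypothetical.

```postscript
/d 5 dict def
d (size) 12 put      % the string key is converted to the name size
d /size get =        % prints 12
d (size) known =     % prints true
d 42 (answer) put    % an integer also works as a key
d 42 get =           % prints answer
```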
A dictionary has the capacity to hold a certain maximum number of entries; the
capacity is specified when the dictionary is created. PostScript interpreters of dif-
ferent LanguageLevels differ in their behavior when a program attempts to insert
an entry into a dictionary that is full: in LanguageLevel 1, a dictfull error occurs;
in LanguageLevels 2 and 3, the interpreter enlarges the dictionary automatically.
The length of a dictionary is also subject to an implementation limit; see
Appendix B.
Dictionaries ordinarily associate the names and values of a program’s compo-
nents, such as variables and procedures. This association corresponds to the con-
ventional use of identifiers in other programming languages. But there are many
other uses for dictionaries. For example, a PostScript font program contains a
dictionary that associates the names of characters with the procedures for draw-
ing those characters’ shapes (see Chapter 5).
There are three primary methods for accessing dictionaries:
• Operators exist to access a specific dictionary supplied as an operand.
• There is a current dictionary and a set of operators to access it implicitly.
• The interpreter automatically looks up executable names it encounters in the
  program being executed.

The interpreter maintains a dictionary stack defining the current dynamic name
space. Dictionaries may be pushed on and popped off the dictionary stack at will.
The topmost dictionary on the stack is the current dictionary.
When the interpreter looks up a key implicitly—for example, when it executes a
name object—it searches for the key in the dictionaries on the dictionary stack. It
searches first in the topmost dictionary, then in successively lower dictionaries on
the dictionary stack, until it either finds the key or exhausts the dictionary stack.
In LanguageLevel 1, there are two built-in dictionaries permanently on the dic-
tionary stack; they are called systemdict and userdict. In LanguageLevels 2 and 3,
there are three dictionaries: systemdict, globaldict, and userdict.
• systemdict is a read-only dictionary that associates the names of all the PostScript
  operators (those defined in this book) with their values (the built-in actions
  that implement them). It also contains other definitions, including the
  standard local and global dictionaries listed in Section 3.7.5, “Standard and
  User-Defined Dictionaries,” as well as various named constants such as true
  and false.
• globaldict (LanguageLevel 2) is a writeable dictionary in global VM. This is explained
  in Section 3.7.2, “Local and Global VM.”
• userdict is a writeable dictionary in local VM. It is the default modifiable naming
  environment normally used by PostScript programs.
userdict is the topmost of the permanent dictionaries on the dictionary stack.
The def operator puts definitions there unless the program has pushed some oth-
er dictionary on the dictionary stack. Applications can and should create their
own dictionaries rather than put things in userdict.
A dictionary is a composite object. Copying a dictionary object does not copy the
dictionary’s contents. Instead, the contents are shared.
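As an illustration of the dictionary stack and of shared dictionary contents, the following minimal sketch creates a private dictionary, defines a value in it, and then modifies it through a second dictionary object. The names mydict, alias, and x are hypothetical and chosen only for this example.

```postscript
/mydict 10 dict def        % create a private dictionary
mydict begin               % push it; it is now the current dictionary
  /x 42 def                % def stores x in mydict, not in userdict
  x                        % implicit lookup finds x in mydict and pushes 42
end                        % pop mydict off the dictionary stack
pop                        % discard the 42

/alias mydict def          % a second dictionary object with the same contents
alias /x 7 put             % modify the contents through one object ...
mydict /x get              % ... and the change is visible through the other: 7
pop
```

After end has been executed, the name x is no longer found by implicit lookup unless some other dictionary on the dictionary stack also defines it.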
3.3.10 Operator Objects
An operator object represents one of the PostScript language’s built-in actions.
When the object is executed, its built-in action is invoked. Much of this book is
devoted to describing the semantics of the various operators.

Operators have names. Most operators are associated with names in systemdict:
the names are the keys and the operators are the associated values. When the in-
terpreter executes one of these names, it looks up the name in the context of the
dictionary stack. Unless the name has been defined in some dictionary higher on
the dictionary stack, the interpreter finds its definition in systemdict, fetches the
associated value (the operator object itself), and executes it.
All standard operators are defined in systemdict. However, an application that
tests whether an operator is defined should not use the known operator to deter-
mine whether the operator is in systemdict; it should instead use the where oper-
ator to check all dictionaries on the dictionary stack. Using where enables proper
handling of operator emulations (see Appendix D).
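A test of this kind might be sketched as follows. The operator setpagedevice is used only as a sample LanguageLevel 2 name, and the messages are arbitrary.

```postscript
% where searches every dictionary on the dictionary stack. It returns the
% dictionary and true if the key is found, or just false otherwise.
/setpagedevice where
  { pop (setpagedevice is available\n) print }   % discard the dictionary
  { (setpagedevice is not available\n) print }
ifelse
```

By contrast, systemdict /setpagedevice known would report false for an emulation that has been defined in some other dictionary.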
Note: There are some special internal PostScript operators whose names begin with
an at sign (@). These operators are not officially part of the PostScript language and
are not defined in systemdict. They may appear as an “offending command” in error
messages.

There is nothing special about an operator name, such as add, that distinguishes
it as an operator. Rather, the name add is associated in systemdict with the oper-
ator for performing addition, and execution of the operator causes the addition
to occur. Thus the name add is not a “reserved word,” as it might be in other pro-
gramming languages. Its meaning can be changed by a PostScript program.
Throughout this book, the notation add means “the operator object associated
with the name add in systemdict” or, occasionally, in some other dictionary.
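The following sketch shows one way a program could change the meaning of add. It assumes that the current dictionary is writeable (userdict, for instance); the message text is arbitrary.

```postscript
% Shadow add with a procedure defined in the current dictionary.
% The original operator object is fetched from systemdict by key, so the
% procedure does not invoke itself.
/add { (add called\n) print systemdict /add get exec } def
3 4 add                  % prints "add called" and leaves 7 on the operand stack
pop
currentdict /add undef   % LanguageLevel 2: remove the redefinition again
```

Once the redefinition has been removed, lookup of add again reaches the operator in systemdict.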
3.3.11 File Objects
A file is a readable or writeable stream of characters transferred between the Post-
Script interpreter and its environment. The characters in a file may be stored per-
manently—in a disk file, for instance—or may be generated dynamically and
transferred via a communication channel.
A file object represents a file. There are operators to open a file and create a file ob-
ject for it. Other operators access an open file to read, write, and process charac-
ters in various ways—as strings, as PostScript tokens, as binary data represented
in hexadecimal, and so on.

Standard input and output files are always available to a PostScript program. The
standard input file is the usual source of programs to be interpreted; the standard
output file is the usual destination of such things as error and status messages.
Although a file object does not have components visible at the PostScript lan-
guage level, it is composite in the sense that all copies of a file object share the
same underlying file as their value. If a file operator has a side effect on the under-
lying file, such as closing it or changing the current position in the stream, all file
objects sharing the file are affected.
The properties of files and the operations on them are described in more detail in
Section 3.8, “File Input and Output.”
3.3.12 Mark Objects
A mark is a special object used to denote a position on the operand stack. This
use is described in the presentation of stack and array operators in Section 3.6,
“Overview of Basic Operators.” There is only one value of type mark, created by
invoking the operator mark, [, or <<. Mark objects are not legal operands for
most operators. They are legal operands for ], >>, counttomark, cleartomark, and
a few generic operators such as pop and type.
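A short sketch of a mark in use, with arbitrary values:

```postscript
mark 10 20 30      % stack: mark 10 20 30
counttomark        % stack: mark 10 20 30 3
pop                % discard the count
cleartomark        % removes 30, 20, 10, and the mark itself
mark type          % consumes a new mark and pushes the name marktype
pop
```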
3.3.13 Null Objects
The PostScript interpreter uses null objects to fill empty or uninitialized positions
in composite objects when they are created. There is only one value of type null;
the name null is associated with a null object in systemdict. Null objects are not
legal operands for most operators.
3.3.14 Save Objects
Save objects represent snapshots of the state of the PostScript interpreter’s memo-
ry. They are created and manipulated by the save and restore operators, intro-
duced in Section 3.7.3, “Save and Restore.”

3.3.15 Other Object Types
FontID objects are special objects used in the construction of fonts; see
Section 5.2, “Font Dictionaries.”
A gstate object (LanguageLevel 2) represents an entire graphics state; see
Section 4.2, “Graphics State.”
3.4 Stacks
The PostScript interpreter manages five stacks representing the execution state of
a PostScript program. Three of them—the operand, dictionary, and execution
stacks—are described here; the other two—the graphics state stack and clipping
path stack—are presented in Chapter 4. Stacks are “last in, first out” (LIFO) data
structures. In this book, “the stack” with no qualifier always means the operand
stack.
The operand stack holds arbitrary PostScript objects that are the operands and
results of PostScript operators being executed. The interpreter pushes objects
on the operand stack when it encounters them as literal data in a program be-
ing executed. When an operator requires one or more operands, it obtains
them by popping them off the top of the operand stack. When an operator re-
turns one or more results, it does so by pushing them on the operand stack.
The dictionary stack holds only dictionary objects. The current set of dictionar-
ies on the dictionary stack defines the environment for all implicit name
searches, such as those that occur when the interpreter encounters an execut-
able name. The role of the dictionary stack is introduced in Section 3.3, “Data
Types and Objects,” and is further explained in Section 3.5, “Execution.”
The execution stack holds executable objects (mainly procedures and files) that
are in intermediate stages of execution. At any point in the execution of a Post-
Script program, this stack represents the program’s call stack. Whenever the in-
terpreter suspends execution of an object to execute some other object, it
pushes the new object on the execution stack. When the interpreter finishes ex-
ecuting an object, it pops that object off the execution stack and resumes exe-
cuting the suspended object beneath it.

The three stacks are independent and there are different ways to access each of
them:
The operand stack is directly under the control of the PostScript program being
executed. Objects may be pushed and popped arbitrarily by various operators.
The dictionary stack is also under PostScript program control, but it can hold
only dictionaries. The bottom three dictionaries on the stack—systemdict,
globaldict, and userdict—(or the bottom two, in LanguageLevel 1) cannot be
popped off. The only operators that can alter the dictionary stack are begin,
end, and cleardictstack.
The execution stack is under the control of the PostScript interpreter. It can be
read but not directly modified by a PostScript program.
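The access rules just listed can be observed with a few operators. This is a minimal sketch; the absolute stack depths depend on the LanguageLevel and on what the program has already done, so only the difference is shown.

```postscript
countdictstack      % depth of the dictionary stack before
5 dict begin        % push a new dictionary
countdictstack      % depth after begin
exch sub            % after minus before: 1
pop
end                 % pop the new dictionary again
countexecstack      % the execution stack can be inspected, here only its depth
pop
```

Executing end when only the permanent dictionaries remain causes a dictstackunderflow error.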
When an object is pushed on a stack, the object is copied onto the stack from
wherever it was obtained; however, in the case of a composite object (such as an
array, a string, or a dictionary), the object’s value is not copied onto the stack, but
rather is shared with the original object. Similarly, when a composite object is
popped off a stack and put somewhere, only the object itself is moved, not its
value. See Section 3.3, “Data Types and Objects,” for more details.
The maximum capacity of stacks may be limited; see Appendices B and C.
3.5 Execution
Execution semantics are different for each of the various object types. Also, exe-
cution can be either immediate, occurring as soon as the object is created by the
scanner, or deferred to some later time.
3.5.1 Immediate Execution
Some example PostScript program fragments will help clarify the concept of exe-
cution. Example 3.1 illustrates the immediate execution of a few operators and
operands to perform some simple arithmetic.
Example 3.1
40 60 add 2 div

The interpreter first encounters the literal integer object 40 and pushes it on the
operand stack. Then it pushes the integer object 60 on the operand stack.
Next, it encounters the executable name object add, which it looks up in the envi-
ronment of the current dictionary stack. Unless add has been redefined else-
where, the interpreter finds it associated with an operator object, which it
executes. This invokes a built-in function that pops the two integer objects off the
operand stack, adds them together, and pushes the result (a new integer object
whose value is 100) back on the operand stack.
The rest of the program fragment is executed similarly. The interpreter pushes
the integer 2 on the operand stack and then executes the name div. The div oper-
ator pops two operands off the stack (the integers whose values are 2 and 100),
divides the second-to-top one by the top one (100 divided by 2, in this case), and
pushes the real result 50.0 on the stack.
The source of the objects being executed by the PostScript interpreter does not
matter. They may have been contained within an array or scanned in from a char-
acter stream. Executing a sequence of objects produces the same result regardless
of where the objects come from.
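For instance, the objects of Example 3.1 can be supplied in three ways, each of which leaves the same result on the operand stack. The second and third lines use the exec and cvx operators, described later in this chapter.

```postscript
40 60 add 2 div                 % scanned directly from the program text: 50.0
{40 60 add 2 div} exec          % the same objects held in an executable array: 50.0
(40 60 add 2 div) cvx exec      % the same tokens scanned from a string: 50.0
```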
3.5.2 Operand Order
In Example 3.1, 40 is the first and 60 is the second operand of the add operator.
That is, objects are referred to according to the order in which they are pushed on
the operand stack. This is the reverse of the order in which they are popped off by
the add operator. Similarly, the result pushed by the add operator is the first
operand of the div operator, and 2 is its second operand.
The same terminology applies to the results of an operator. If an operator pushes
more than one object on the operand stack, the first object pushed is the first
result. This order corresponds to the usual left-to-right order of appearance of
operands in a PostScript program.
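Operand order matters for operators that are not commutative, as this short sketch with arbitrary values shows:

```postscript
10 3 sub        % first operand 10, second operand 3: 7
10 3 exch sub   % exch swaps the operands, so this computes 3 minus 10: -7
17 5 idiv       % 3
17 5 mod        % 2
```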
3.5.3 Deferred Execution
The first line of Example 3.2 defines a procedure named average that computes
the average of two numbers. The second line applies that procedure to the inte-
gers 40 and 60, producing the same result as Example 3.1.

Example 3.2
/average {add 2 div} def
40 60 average
The interpreter first encounters the literal name average. Recall from Section 3.2,
“Syntax,” that / introduces a literal name. The interpreter pushes this object on
the operand stack, as it would any object having the literal attribute.
Next, the interpreter encounters the executable array {add 2 div}. Recall that
{ and } enclose a procedure (an executable array or executable packed array object)
that is produced by the scanner. This procedure contains three elements: the exe-
cutable name add, the literal integer 2, and the executable name div. The inter-
preter has not encountered these elements yet.
Here is what the interpreter does:
1. Upon encountering this procedure object, the interpreter pushes it on the
operand stack, even though the object has the executable attribute. This is ex-
plained shortly.
2. The interpreter then encounters the executable name def. Looking up this
name in the current dictionary stack, it finds def to be associated in
systemdict with an operator object, which it invokes.
3. The def operator pops two objects off the operand stack (the procedure
{add 2 div} and the name average). It enters this pair into the current diction-
ary (most likely userdict), creating a new association having the name average
as its key and the procedure {add 2 div} as its value.
4. The interpreter pushes the integer objects 40 and 60 on the operand stack,
then encounters the executable name average.
5. It looks up average in the current dictionary stack, finds it to be associated
with the procedure {add 2 div}, and executes that procedure. In this case, exe-
cution of the array object consists of executing the elements of the array—the
objects add, 2, and div—in sequence. This has the same effect as executing
those objects directly. It produces the same result: the real object 50.0.
Why did the interpreter treat the procedure as data in the first line of the example
but execute it in the second, despite the procedure having the executable attribute
in both cases? There is a special rule that determines this behavior: An executable
array or packed array encountered directly by the interpreter is treated as data
(pushed on the operand stack), but an executable array or packed array encoun-
tered indirectly—as a result of executing some other object, such as a name or an
operator—is invoked as a procedure.
This rule reflects how procedures are ordinarily used. Procedures appearing di-
rectly (either as part of a program being read from a file or as part of some larger
procedure in memory) are usually part of a definition or of a construct, such as a
conditional, that operates on the procedure explicitly. But procedures obtained
indirectly—for example, as a result of looking up a name—are usually intended
to be executed. A PostScript program can override these semantics when
necessary.
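One way to override the default treatment is sketched here using load and exec, both described later in this chapter. The procedure average is the one from Example 3.2.

```postscript
/average {add 2 div} def
40 60 average              % name lookup reaches the procedure indirectly, so it runs: 50.0
pop
/average load              % load fetches the value without executing it
xcheck                     % true: the procedure still has the executable attribute
pop
40 60 /average load exec   % exec invokes the fetched procedure explicitly: 50.0
pop
```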
3.5.4 Control Constructs
In the PostScript language, control constructs such as conditionals and iterations
are specified by means of operators that take procedures as operands. Example
3.3 computes the maximum of the values associated with the names a and b, as in
the steps that follow.
Example 3.3
a b gt {a} {b} ifelse
1. The interpreter encounters the executable names a and b in turn and looks
them up. Assume both names are associated with numbers. Executing the
numbers causes them to be pushed on the operand stack.
2. The gt (greater than) operator removes two operands from the stack and com-
pares them. If the first operand is greater than the second, it pushes the bool-
ean value true. Otherwise, it pushes false.
3. The interpreter now encounters the procedure objects {a} and {b}, which it
pushes on the operand stack.
4. The ifelse operator takes three operands: a boolean object and two procedures.
If the boolean object’s value is true, ifelse causes the first procedure to be exe-
cuted; otherwise, it causes the second procedure to be executed. All three oper-
ands are removed from the operand stack before the selected procedure is
executed.
In this example, each procedure consists of a single element that is an executable
name (either a or b). The interpreter looks up this name and, since it is associated
with a number, pushes that number on the operand stack. So the result of execut-
ing the entire program fragment is to push on the operand stack the greater of the
values associated with a and b.
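To try Example 3.3, a and b must first be defined. The sketch below does so with arbitrary values, and then shows a stack-based variant that needs no named variables. The name maxof is hypothetical; it is not a standard operator.

```postscript
/a 3 def
/b 7 def
a b gt {a} {b} ifelse     % pushes 7
pop

% x y maxof leaves the greater of x and y on the stack.
/maxof { 2 copy gt {pop} {exch pop} ifelse } def
3 7 maxof                 % pushes 7
pop
7 3 maxof                 % pushes 7
pop
```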
3.5.5 Execution of Specific Types
An object with the literal attribute is always treated as data—pushed on the oper-
and stack by the interpreter—regardless of its type. Even operator objects are
treated this way if they have the literal attribute.
For many objects, executing them has the same effect as treating them as data.
This is true of integer, real, boolean, dictionary, mark, save, gstate, and fontID
objects. So the distinction between literal and executable objects of these types is
meaningless. The following descriptions apply only to objects having the execut-
able attribute.
An executable array or executable packed array (procedure) object is pushed on
the operand stack if it is encountered directly by the interpreter. If it is invoked
indirectly as a result of executing some other object (a name or an operator), it
is called instead. The interpreter calls a procedure by pushing it on the execu-
tion stack and then executing the array elements in turn. When the interpreter
reaches the end of the procedure, it pops the procedure object off the execution
stack. (Actually, it pops the procedure object when there is one element
remaining and then pushes that element; this permits unlimited depth of “tail
recursion” without overflowing the execution stack.)
An executable string object is pushed on the execution stack. The interpreter
then uses the string as a source of characters to be converted to tokens and
interpreted according to the PostScript syntax rules. This continues until the
interpreter reaches the end of the string. Then it pops the string object from the
execution stack.
An executable file object is treated much the same as a string: The interpreter
pushes it on the execution stack. It reads the characters of the file and interprets
them as PostScript tokens until it encounters end-of-file. Then it closes the file
and pops the file object from the execution stack. See Section 3.8, “File Input
and Output.”
An executable name object is looked up in the environment of the current dic-
tionary stack and its associated value is executed. The interpreter looks first in
the top dictionary on the dictionary stack and then in other dictionaries
successively lower on the stack. If it finds the name as a key in some dictionary, it
executes the associated value. To do that, it examines the value’s type and exe-
cutable attribute and performs the appropriate action described in this section.
Note that if the value is a procedure, the interpreter executes it. If the interpret-
er fails to find the name in any dictionary on the dictionary stack, an undefined
error occurs.
An executable operator object causes the interpreter to perform one of the built-
in operations described in this book.
An executable null object causes the interpreter to perform no action. In partic-
ular, it does not push the object on the operand stack.
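Several of these cases can be exercised directly with cvx and exec. This is a minimal sketch with arbitrary operands.

```postscript
1 2 /add cvx exec     % executable name: looked up, then the operator runs: 3
pop
1 2 /add load exec    % executable operator object fetched with load: 3
pop
(5 6 mul) cvx exec    % executable string: scanned and interpreted: 30
pop
null cvx exec         % executable null: no action, nothing is pushed
/add load cvlit exec  % literal operator object: treated as data and pushed
pop
```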
3.6 Overview of Basic Operators
This is an overview of the general-purpose PostScript operators, excluding all op-
erators that deal with graphics and fonts, which are described in later chapters.
The information here is insufficient for actual programming; it is intended only
to acquaint you with the available facilities. For complete information about any
particular operator, you should refer to the operator’s detailed description in
Chapter 8.
3.6.1 Stack Operators
The operand stack is the PostScript interpreter’s mechanism for passing argu-
ments to operators and for gathering results from operators. It is introduced in
Section 3.4, “Stacks.”
There are various operators that rearrange or manipulate the objects on the oper-
and stack. Such rearrangement is often required when the results of some opera-
tors are to be used as arguments to other operators that require their operands in
a different order. These operators manipulate only the objects themselves; they
do not copy the values of composite objects.
dup duplicates an object.
exch exchanges the top two elements of the stack.
pop removes the top element from the stack.
copy duplicates portions of the operand stack.

roll treats a portion of the stack as a circular queue.
index accesses the stack as if it were an indexable array.
mark marks a position on the stack.
clear clears the stack.
count counts the number of elements on the stack.
counttomark counts the elements above the highest mark. This is used prima-
rily for array construction (described later), but has other applications as well.
cleartomark removes all elements above the highest mark and then removes
the mark itself.
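The effect of the most common stack operators can be traced step by step. In this sketch each comment shows the operand stack after the line has been executed, with the top of the stack at the right.

```postscript
clear            % start with an empty stack
1 2 3            % 1 2 3
dup              % 1 2 3 3
pop              % 1 2 3
exch             % 1 3 2
3 1 roll         % 2 1 3
2 index          % 2 1 3 2
2 copy           % 2 1 3 2 3 2
count            % 2 1 3 2 3 2 6
clear            % empty again
```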
3.6.2 Arithmetic and Mathematical Operators
The PostScript language includes a conventional complement of arithmetic and
mathematical operators. In general, these operators accept either integer or real
number objects as operands. They produce either integers or real numbers as
results, depending on the types of the operands and the magnitude of the results.
If the result of an operation is mathematically meaningless or cannot be repre-
sented as a real number, an undefinedresult error occurs.
add, sub, mul, div, idiv, and mod are arithmetic operators that take two argu-
ments.
abs, neg, ceiling, floor, round, and truncate are arithmetic operators that take
one argument.
sqrt, exp, ln, log, sin, cos, and atan are mathematical and trigonometric func-
tions.
rand, srand, and rrand access a pseudo-random number generator.
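A few sample computations illustrate how the result type depends on the operator and on the operand types:

```postscript
3 4 add        % 7    (both operands are integers, so the result is an integer)
3 4.0 add      % 7.0  (one real operand, so the result is a real)
7 2 div        % 3.5  (div always produces a real)
7 2 idiv       % 3
7 2 mod        % 1
-3.7 abs       % 3.7
3.7 round      % 4.0
3.7 truncate   % 3.0
16 sqrt        % 4.0
2 10 exp       % 1024.0
% 1 0 div would cause an undefinedresult error.
```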
3.6.3 Array, Packed Array, Dictionary, and String Operators
A number of operators are polymorphic: they may be applied to operands of sev-
eral different types and their precise functions depend on the types of the oper-
ands. Except where indicated otherwise, the operators listed below apply to any of
the following types of composite objects: arrays, packed arrays, dictionaries, and
strings.

get takes a composite object and an index (or a key, in the case of a dictionary)
and returns a single element of the object.
put stores a single element in an array, dictionary, or string. This operator does
not apply to packed array objects, because they always have read-only (or even
more restrictive) access.
copy copies the value of a composite object to another composite object of the
same type, replacing the second object’s former value. This is different from
merely copying the object. See Section 3.3.1, “Simple and Composite Objects”
for a discussion of copying objects.
length returns the number of elements in a composite object.
forall accesses all of the elements of a composite object in sequence, calling a
procedure for each one.
getinterval creates a new object that shares a subinterval of an array, a packed
array, or a string. This operator does not apply to dictionary objects.
putinterval overwrites a subinterval of one array or string with the contents of
another. This operator does not apply to dictionary or packed array objects, al-
though it can overwrite a subinterval of an array with the contents of a packed
array.
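The following sketch applies some of the polymorphic operators to operands of different types. The names s and v are hypothetical.

```postscript
[10 20 30] 1 get             % 20
(abc) 1 get                  % 98, the character code of b
<< /k 5 >> /k get            % 5 (LanguageLevel 2 dictionary syntax)
[10 20 30] length            % 3
(abcdef) 2 3 getinterval     % (cde), sharing storage with the original string
0 [1 2 3] {add} forall       % 6
clear
/s (abcdef) def
s 1 (XY) putinterval         % overwrite two characters starting at index 1
s                            % (aXYdef)
/v 3 array def
v 0 (first) put              % store an element; v is now [(first) null null]
```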
In addition to the polymorphic operators, there are operators that apply to only
one of the array, packed array, dictionary, and string types. For each type, there is
an operator (array, packedarray, dict, string) that creates a new object of that
type and a specified length. These four operators explicitly create new composite
object values, consuming virtual memory (VM) resources (see Section 3.7.1,
“Virtual Memory”). Most other operators read and write the values of composite
objects but do not create new ones. Operators that return composite results usu-
ally require an operand that is the composite object into which the result values
are to be stored. The operators are organized this way to give programmers maxi-
mum control over consumption of VM.
Array, packed array, and string objects have a fixed length that is specified when
the object is created. In LanguageLevel 1, dictionary objects also have this proper-
ty. In LanguageLevels 2 and 3, a dictionary’s capacity can grow beyond its initial
allocation.
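As a sketch of this organization, the fragment below allocates a string once and passes it to cvs, which stores its result there instead of creating a new string. The name buf is hypothetical.

```postscript
3 array                 % a new array of 3 null objects
0 get type              % nulltype
pop
/buf 20 string def      % allocate a 20-character string
123 buf cvs             % cvs writes into buf and returns the substring (123)
length                  % 3
pop
456789 buf cvs          % the same storage is reused: (456789)
pop
buf length              % still 20; the length of buf itself is fixed
pop
```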

The following operators apply only to arrays and (sometimes) packed arrays:
aload and astore transfer all the elements of an array to or from the operand
stack in a single operation. aload may also be applied to a packed array.
The array construction operators [ and ] combine to produce a new array object
whose elements are the objects appearing between the brackets. The [ operator,
which is a synonym for mark, pushes a mark object on the operand stack. Exe-
cution of the program fragment between the [ and the ] causes zero or more ob-
jects to be pushed on the operand stack. Finally, the ] operator counts the
number of objects above the mark on the stack, creates an array of that length,
stores the elements from the stack in the array, removes the mark from the
stack, and pushes the array on the stack.
setpacking and currentpacking (both LanguageLevel 2) control a mode setting
that determines the type of procedure objects the scanner generates when it en-
counters a sequence of tokens enclosed in { and }. If the array packing mode is
true, the scanner produces packed arrays; if the mode is false, it produces ordi-
nary arrays. The default value is false.
Packed array objects always have read-only (or even more restricted) access, so
the put, putinterval, and astore operations are not allowed on them. Accessing
arbitrary elements of a packed array object can be quite slow; however, access-
ing the elements sequentially, as the PostScript interpreter and the forall opera-
tor do, is efficient.
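The array operators just described can be sketched as follows. The last part assumes a LanguageLevel 2 interpreter and must be executed at the top level of a program, since the packing mode affects only procedures scanned after it is set.

```postscript
[1 2 3 add]                % the elements are computed first: [1 5]
length                     % 2
pop
mark 1 2 3                 % the steps that ] performs, spelled out:
counttomark array astore   % create a 3-element array and fill it from the stack
exch pop                   % remove the mark, leaving [1 2 3]
aload pop                  % push the elements back: 1 2 3
clear

currentpacking             % remember the current packing mode
true setpacking
{1 2 add} type             % packedarraytype
pop
setpacking                 % restore the remembered mode
```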
The following operators apply only to dictionaries:
begin and end push new dictionaries on the dictionary stack and pop them off.
def and store associate keys with values in dictionaries on the dictionary stack;
load and where search for keys there.
countdictstack, cleardictstack, and dictstack operate on the dictionary stack.
known queries whether a key is present in a specific dictionary.
maxlength obtains a dictionary’s maximum capacity.
undef (LanguageLevel 2) removes an individual key from a dictionary.
<< and >> (LanguageLevel 2) construct a dictionary consisting of the bracketed
objects interpreted as key-value pairs.
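A brief sketch of the dictionary operators, assuming a LanguageLevel 2 interpreter; the name d and its keys are hypothetical.

```postscript
/d << /a 1 /b 2 >> def     % construct a dictionary with two entries
d /a known                 % true
d /c known                 % false
d /a undef                 % remove the key a
d length                   % 1
d maxlength                % capacity; the value is implementation dependent
clear
d begin                    % make d the current dictionary
  /c 3 def                 % def writes into d
  /c where                 % pushes d and true
  pop pop
end
d /c get                   % 3
pop
```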

55
3 . 6
Overview of Basic Operators
The following operators apply only to strings:
search and anchorsearch perform textual string searching and matching.
token scans the characters of a string according to the PostScript language syn-
tax rules, without executing the resulting objects.
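The results of these operators are easiest to see in a small sketch. Each comment shows the objects left on the operand stack, with the top of the stack at the right.

```postscript
(abcdef) (cd) search          % (ef) (cd) (ab) true
clear
(abcdef) (xy) search          % (abcdef) false
clear
(abcdef) (ab) anchorsearch    % (cdef) (ab) true
clear
(15(St1) {1 2 add}) token     % ((St1) {1 2 add}) 15 true
clear
```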
There are many additional operators that use array, dictionary, or string operands
for special purposes—for instance, as transformation matrices, font dictionaries,
or text.
3.6.4 Relational, Boolean, and Bitwise Operators
The relational operators compare two operands and produce a boolean result in-
dicating whether the relation holds. Any two objects may be compared for equal-
ity (eq and ne—equal and not equal); numbers and strings may be compared by
the inequality operators (gt, ge, lt, and le—greater than, greater than or equal to,
less than, and less than or equal to).
The boolean and bitwise operators (and, or, xor, true, false, and not) compute
logical combinations of boolean operands or bitwise combinations of integer op-
erands. The bitwise shift operator bitshift applies only to integers.
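Some sample comparisons and combinations, with arbitrary operands:

```postscript
3 4 lt              % true
(abc) (abd) lt      % true, strings compare character by character
1 1.0 eq            % true, integers and reals compare by numeric value
(abc) (abc) eq      % true, strings compare by content
[1 2] [1 2] eq      % false, these are distinct arrays even though the elements match
true false or       % true
true false xor      % true
true not            % false
12 10 and           % 8, bitwise on integers (binary 1100 and 1010)
7 3 bitshift        % 56, shift left by 3
142 -3 bitshift     % 17, shift right by 3
```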
3.6.5 Control Operators
The control operators modify the interpreter’s usual sequential execution of ob-
jects. Most of them take a procedure operand that they execute conditionally or
repeatedly.
if and ifelse execute a procedure conditionally depending on the value of a
boolean operand. (ifelse is introduced in Section 3.5, “Execution.”)
exec executes an arbitrary object unconditionally.
for, repeat, loop, and forall execute a procedure repeatedly. Several specialized
graphics and font operators, such as pathforall and kshow, behave similarly.
exit transfers control out of the scope of any of these looping operators.
countexecstack and execstack are used to read the execution stack.
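The following sketch shows the most common control operators in use. The printed messages are arbitrary.

```postscript
3 4 lt { (3 is less than 4\n) print } if
0 1 1 10 {add} for                       % sums 1 through 10: 55
pop
3 { (again\n) print } repeat             % prints the line three times
0 { 1 add dup 5 ge {exit} if } loop      % counts up until exit leaves the loop: 5
pop
{ 6 7 mul } exec                         % 42
pop
```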

A PostScript program may terminate prematurely by executing the stop operator.
This occurs most commonly as a result of an error; the default error handlers (in
errordict) all execute stop.
The stopped operator establishes an execution environment that encapsulates
the effect of a stop. That is, stopped executes a procedure given as an operand,
just the same as exec. If the interpreter executes stop during that procedure, it
terminates the procedure and resumes execution at the object immediately after
the stopped operator.
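The boolean result of stopped tells the program whether the procedure ran to completion. This is a minimal sketch; the name risky is hypothetical.

```postscript
{ 1 2 add } stopped            % the procedure ends normally: 3 false
pop pop
{ 1 2 add stop 99 } stopped    % stop ends the procedure early: 3 true; 99 is never pushed
pop pop
% A common pattern: run a procedure and react if it was stopped.
/risky { stop } def
{ risky } stopped { (risky stopped\n) print } if
```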
3.6.6 Type, Attribute, and Conversion Operators
These operators deal with the details of PostScript types, attributes, and values,
introduced in Section 3.3, “Data Types and Objects.”
type returns the type of any operand as a name object (integertype, realtype,
and so on).
xcheck, rcheck, and wcheck query the literal/executable and access attributes of
an object.
cvlit and cvx change the literal/executable attribute of an object.
readonly, executeonly, and noaccess reduce an object’s access attribute. Access
can only be reduced, never increased.
cvi and cvr convert between integer and real types, and interpret a numeric
string as an integer or real number.
cvn converts a string to a name object defined by the characters of the string.
cvs and cvrs convert objects of several types to a printable string representa-
tion.
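A few sample queries and conversions, with arbitrary operands:

```postscript
42 type                  % integertype
(abc) type               % stringtype
{1 2} xcheck             % true, procedures are executable
[1 2] xcheck             % false, ordinary arrays are literal
(abc) readonly wcheck    % false, write access has been removed
3.7 cvi                  % 3
(12) cvi                 % 12
5 cvr                    % 5.0
(abc) cvn                % /abc
/abc 10 string cvs       % (abc)
255 16 10 string cvrs    % (FF)
```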
3.7 Memory Management
A PostScript program executes in an environment with these major components:
stacks, virtual memory, standard input and output files, and the graphics state.
The operand stack is working storage for objects that are the operands and
results of operators. The dictionary stack contains dictionary objects that define
the current name space. The execution stack contains objects that are in partial
stages of execution by the PostScript interpreter. See Section 3.4, “Stacks.”
Virtual memory (VM) is a storage pool for the values of all composite objects.
The adjective “virtual” emphasizes the behavior of this memory visible at the
PostScript language level, not its implementation in computer storage.
The standard input file is the normal source of program text to be executed by
the PostScript interpreter. The standard output file is the normal destination of
output from the print operator and of error messages. Other files can exist as
well. See Section 3.8, “File Input and Output.”
The graphics state is a collection of parameters that control the production of
text and graphics on a raster output device. See Section 4.2, “Graphics State.”
This section describes the behavior of VM and its interactions with other compo-
nents of the PostScript execution environment. It describes facilities for control-
ling the environment as a whole. The PostScript interpreter can execute a
sequence of self-contained PostScript programs as independent “jobs”; similarly,
each job can have internal structure whose components are independent of each
other.
Some PostScript interpreters can support multiple execution contexts—the execu-
tion of multiple independent PostScript programs at the same time. Each context
has an environment consisting of stacks, VM, graphics state, and certain other
data. Under suitable conditions, objects in VM can be shared among contexts;
there are means to regulate concurrent access to the shared objects.
This edition of this book does not document the multiple contexts extension,
although it does indicate which components of a PostScript program’s environ-
ment are maintained on a per-context basis. Further information about multiple
contexts can be found in the second edition of this book and in the Display
PostScript System manuals.
3.7.1 Virtual Memory
As described in Section 3.3, “Data Types and Objects,” objects may be either sim-
ple or composite. A simple object’s value is contained in the object itself. A com-
posite object’s value is stored separately; the object contains a reference to it.
Virtual memory (VM) is the storage in which the values of composite objects
reside.

For example, the program fragment
234 (string1)
pushes two objects, an integer and a string, on the operand stack. The integer,
which is a simple object, contains the value 234 as part of the object itself. The
string, which is a composite object, contains a reference to the value string1,
which is a text string that resides in VM. The elements of the text string are char-
acters (actually, integers in the range 0 to 255) that can be individually selected or
replaced.
Here is another example:
{234 (string1)}
This pushes a single object, a two-element executable array, on the operand stack.
The array is a composite object whose value resides in VM. The value in turn
consists of two objects, an integer and a string. Those objects are elements of the
array; they can be individually selected or replaced.
Several composite objects can share the same value. For example, in
{234 (string1)} dup
the dup operator pushes a second copy of the array object on the operand stack.
The two objects share the same value—that is, the same storage in VM. So replac-
ing an element of one array will affect the other. Other types of composite ob-
jects, including strings and dictionaries, behave similarly.
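The sharing described above can be observed directly. The following is a minimal sketch; the names a, b, and c are illustrative only and are assumed not to be defined already.

```postscript
/a [1 2 3] def
/b a def                         % b refers to the same value in VM as a
b 0 99 put                       % modify the shared value through b
a 0 get ==                       % prints 99: the change is visible through a

/c a dup length array copy def   % copy fills newly allocated storage for c
c 1 0 put
a 1 get ==                       % prints 2: a is unaffected by changes to c
```

When two objects must not influence each other, one of them needs its own storage, which here is obtained with array followed by copy.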
Creating a new composite object consumes VM storage for its value. This occurs
in two principal ways:
• The scanner allocates storage for each composite literal object that it encounters.
  Composite literals are delimited by ( … ), < … >, <~ … ~>, and { … }. The
  first three produce strings; the fourth produces an executable array or packed
  array. There also are binary encodings for composite objects.
• Some operators explicitly create new composite objects and allocate storage for
  them. The array, packedarray, dict, string, and gstate operators create new
  array, packed array, dictionary, string, and gstate objects, respectively. Also, the
  bracketing constructs [ … ] and << … >> create new array and dictionary objects,
  respectively. The brackets are just special names for operators; the closing
  bracket operators allocate the storage.
For the most part, consumption and management of VM storage is under the
control of the PostScript program. Aside from the operators mentioned above
and a few others that are clearly documented, most operators do not create new
composite objects or allocate storage in VM. Some operators place their results in
existing objects supplied by the caller. For example, the cvs (convert to string) op-
erator overwrites the value of a supplied string operand and returns a string ob-
ject that shares a substring of the supplied string’s storage.
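As an illustration of an operator that reuses caller-supplied storage, the sketch below allocates one string and passes it to cvs on every call. The names numbuf and shownum are hypothetical.

```postscript
/numbuf 12 string def           % allocated once and reused by every call
/shownum {                      % int shownum -  (writes the number and a newline)
  numbuf cvs print              % cvs returns a substring of numbuf's storage
  (\n) print
} bind def

1234 shownum
-56 shownum
```

Writing 12 string cvs inside the procedure instead would allocate a new string in VM each time the procedure runs.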
3.7.2 Local and Global VM
There are two divisions of VM containing the values of composite objects: local
and global. Only composite objects occupy VM. An “object in VM” means a
“composite object whose value occupies VM”; the location of the object (for ex-
ample, on a stack or stored as an element of some other object) is immaterial.
Global VM exists only in LanguageLevel 2 and LanguageLevel 3 interpreters. In
LanguageLevel 1 interpreters, all of VM is local.
Local VM is a storage pool that obeys a stacklike discipline. Allocations in local
VM and modifications to existing objects in local VM are subject to the save and
restore operators. These operators bracket a section of a PostScript program
whose local VM activity is to be encapsulated. restore deallocates new objects and
undoes modifications to existing objects that were made since the matching save
operation. save and restore are described in Section 3.7.3, “Save and Restore.”
Global VM is a storage pool for objects that do not obey a fixed discipline. Ob-
jects in global VM can come into existence and disappear in an arbitrary order
during execution of a program. Modifications to existing objects in global VM
are not affected by occurrences of save and restore within the program. However,
an entire job’s VM activity can be encapsulated, enabling separate jobs to be exe-
cuted independently. This is described in Section 3.7.7, “Job Execution Environ-
ment.”
In a hierarchically structured program such as a page description, local VM is
used to hold information whose lifetime conforms to the structure; that is, it
persists to the end of a structural division, such as a single page. Global VM may be
used to hold information whose lifetime is independent of the structure, such as
definitions of fonts and other resources that are loaded dynamically during the
execution of a program.
Control over allocation of objects in local versus global VM is provided by the
setglobal operator (LanguageLevel 2). This operator establishes a VM allocation
mode, a boolean value that determines where subsequent allocations are to occur
(false means local, true means global). It affects objects created implicitly by the
scanner and objects created explicitly by operators. The default VM allocation
mode is local; a program can switch to global allocation mode when it needs to.
The following example illustrates the creation of objects in local and global VM:
/lstr (string1) def
/ldict 10 dict def
true setglobal
/gstr (string2) def
/gdict 5 dict def
false setglobal
In the first line, when the scanner encounters (string1), it allocates the string ob-
ject in local VM. In the second line, the dict operator allocates a new dictionary in
local VM. The third line switches to global VM allocation mode. The fourth and
fifth lines allocate a string object and a dictionary object in global VM. The sixth
line switches back to local VM allocation mode. The program associates the four
newly created objects with the names lstr, ldict, gstr, and gdict in the current dic-
tionary (presumably userdict).
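A program that switches allocation modes often wants to return to whatever mode was in effect before, rather than assume it was local. One possible idiom, sketched here with the LanguageLevel 2 currentglobal operator and a hypothetical name gbuf, keeps the previous mode on the operand stack:

```postscript
currentglobal                   % push the current allocation mode (a boolean)
true setglobal                  % subsequent allocations go to global VM
/gbuf 16 string def
setglobal                       % reinstate the mode saved on the stack
gbuf gcheck ==                  % prints true: the string's value is in global VM
```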
An object in global VM is not allowed to contain a reference to an object in local
VM. An attempt to store a local object as an element of a global object will result
in an invalidaccess error. The reason for this restriction is that subsequent execu-
tion of the restore operator might deallocate the local object, leaving the global
object with a “dangling” reference to a nonexistent object.
This restriction applies only to storing a composite object in local VM as an ele-
ment of a composite object in global VM. All other combinations are allowed. The
following example illustrates this, using the objects that were created in the
preceding example.
ldict /a lstr put    % Allowed: a local object into a local dict
gdict /b gstr put    % Allowed: a global object into a global dict
ldict /c gstr put    % Allowed: a global object into a local dict
gdict /d lstr put    % Not allowed (invalidaccess error): a local object into a global dict
gdict /e 7 put       % Allowed: a simple object into any dict
There are no restrictions on storing simple objects, such as integers and names, as
elements of either local or global composite objects. The gcheck operator in-
quires whether an object can be stored as an element of a global composite
object. It returns true for a simple object or for a composite object in global VM,
or false for a composite object in local VM.
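gcheck can be used to test a store before attempting it. The procedure below is only a sketch; putifallowed and the other names are hypothetical, and the setup lines assume that local allocation mode is in effect at the start.

```postscript
% dict key value  putifallowed  bool
/putifallowed {
  2 index gcheck not            % true if the container is in local VM
  1 index gcheck or             % or if the value may be stored in a global object
  { put true } { pop pop pop false } ifelse
} bind def

/ld 4 dict def
/ls (local) def
true setglobal  /gd 4 dict def  false setglobal

gd /x ls putifallowed ==        % prints false: local string, global dictionary
gd /y 7  putifallowed ==        % prints true: simple object
ld /z ls putifallowed ==        % prints true: local dictionary
```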
3.7.3 Save and Restore
The save operator takes a snapshot of the state of local VM and returns a save object
that represents the snapshot. The restore operator causes local VM to revert
to a snapshot generated by a preceding save operation. Specifically, restore does
the following:
• Discards all objects in local VM that were created since the corresponding save,
  and reclaims the memory they occupied
• Resets the values of all composite objects in local VM, except strings, to their
  state at the time of the save
• Performs an implicit grestoreall operation, which resets the graphics state to its
  value at the time of the save (see Section 4.2, “Graphics State”)
• Closes files that were opened since the corresponding save, so long as those
  files were opened while local VM allocation mode was in effect (see Section 3.8,
  “File Input and Output”)
The effects of restore are limited to the ones described above. In particular,
restore does not:
• Affect the contents of the operand, dictionary, and execution stacks. If a stack
  contains a reference to a composite object in local VM that would be discarded
  by the restore operation, the restore is not allowed; an invalidrestore error
  occurs.
• Affect any objects that reside in global VM, except as described in Section 3.7.7,
  “Job Execution Environment.”
• Undo side effects outside VM, such as writing data to files or rendering graphics
  on the raster output device. (However, the implicit grestoreall may deactivate
  the current device, thereby erasing the current page; see Section 6.2.6,
  “Device Initialization and Page Setup,” for details.)
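The following sketch illustrates two of the rules above: a dictionary in local VM reverts, while the contents of a string do not. The names d and s are illustrative, and both objects are assumed to be allocated in local VM.

```postscript
/d 2 dict def
/s (abc) def
d /k 1 put
save                            % leaves a save object on the operand stack
  d /k 2 put                    % modify a dictionary in local VM
  s 0 65 put                    % overwrite the first character with "A"
restore                         % consumes the save object
d /k get ==                     % prints 1: the dictionary change was undone
s ==                            % prints (Abc): string contents are not reset
```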
The save and restore operators can be nested to a limited depth (see Appendix B
for implementation limits). A PostScript program can use save and restore to en-
capsulate the execution of an embedded program that also uses save and restore.
save and restore are intended for use in structured programs such as page de-
scriptions. The conventions for structuring programs are introduced in
Section 2.4.2, “Program Structure,” and described in detail in Adobe Technical
Note #5001, PostScript Language Document Structuring Conventions Specification.
In such programs, save and restore serve the following functions:
• A document consists of a prolog and a script. The prolog contains definitions
  that are used throughout the document. The script consists of a sequence of
  independent pages. Each page has a save at the beginning and a restore at the
  end, immediately before the showpage operator. Each page begins execution
  with the initial conditions established in local VM by the prolog. There are no
  unwanted legacies from previous pages.
• A page sometimes contains additional substructure, such as embedded
  illustrations, whose execution needs to be encapsulated. The encapsulated program
  can make wholesale changes to the contents of local VM to serve its own
  purposes. By bracketing the program with save and restore, the enclosing program
  can isolate the effects of the embedded program.
• As a PostScript program executes, new composite objects accumulate in local
  VM. These include objects created by the scanner, such as literal string tokens,
  and objects allocated explicitly by operators. The restore operator reclaims all
  local VM storage allocated since the corresponding save; executing save and
  restore periodically ensures that unreclaimed objects will not exhaust available
  VM resources. In LanguageLevel 1, save and restore are the only way to reclaim
  VM storage. Even in higher LanguageLevels, explicit reclamation by save and
  restore is much more efficient than automatic reclamation (described in
  Section 3.7.4, “Garbage Collection”).
• The PostScript interpreter uses save and restore to encapsulate the execution of
  individual jobs, as described in Section 3.7.7, “Job Execution Environment.”
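The page-level use of save and restore might be sketched as follows. The procedure names bp, ep, and F12 and the variable pgsave are hypothetical; actual prologs differ widely.

```postscript
%!PS
% Prolog: definitions used by every page
/bp { /pgsave save def } bind def             % begin page: take a snapshot
/ep { pgsave restore showpage } bind def      % end page: restore, then showpage
/F12 { /Helvetica findfont 12 scalefont setfont } bind def

% Script: a sequence of independent pages
bp
  F12 72 720 moveto (Page 1) show
ep
bp
  F12 72 720 moveto (Page 2) show             % the font must be selected again
ep
```

Because restore performs an implicit grestoreall, the font chosen on the first page is no longer current when the second page begins, so each page establishes what it needs.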

3.7.4 Garbage Collection
In addition to the save and restore operators for explicit VM reclamation,
LanguageLevels 2 and 3 include a facility for automatic reclamation, popularly
known as a garbage collector. The garbage collector reclaims the memory occu-
pied by composite objects that are no longer accessible to the PostScript program.
For example, after the program
/a (string1) def
/a (string2) def
(string3) show
is executed, the string object string1 is no longer accessible, since the dictionary
entry that referred to it has been replaced by a different object, string2. Similarly,
the string object string3 is no longer accessible, since the show operator con-
sumes its operand but does not store it anywhere. These inaccessible strings are
candidates for garbage collection.
Garbage collection normally takes place without explicit action by the PostScript
program. It has no effects that are visible to the program. However, the presence
of a garbage collector strongly influences the style of programming that is per-
missible. If no garbage collector is present, a program that consumes VM endless-
ly and never executes save and restore will eventually exhaust available memory
and cause a VMerror.
There is a cost associated with creating and destroying composite objects in VM.
The most common case is that literal objects—particularly strings, user paths,
and binary object sequences—are immediately consumed by operators such as
show and ufill, and never used again. The garbage collector is engineered to deal
with this case inexpensively, so application programs should not hesitate to take
advantage of it. However, the cost of garbage collection is greater for objects that
have longer lifetimes or are allocated explicitly. Programs that frequently require
temporary objects are encouraged to create them once and reuse them instead of
creating new ones—for example, allocate a string object before an image data ac-
quisition procedure, rather than within it (see Section 4.10.7, “Using Images”).
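A small sketch of the recommended pattern for image data follows. The buffer rowbuf is a hypothetical name; it is allocated once, outside the data acquisition procedure, and the image data here is an arbitrary 8-by-8 checker pattern at 1 bit per sample.

```postscript
/rowbuf 1 string def            % 8 samples at 1 bit each = 1 byte per row
72 72 translate
64 64 scale
8 8 1 [8 0 0 -8 0 8]
{ currentfile rowbuf readhexstring pop }   % reuses rowbuf on every call
image
AA 55 AA 55 AA 55 AA 55
showpage
```

Had the procedure been written as { currentfile 1 string readhexstring pop }, it would allocate a new string each time it is called.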
Even with garbage collection, the save and restore operators still have their stan-
dard behavior. That is, restore resets all accessible objects in local VM to their
state at the time of the matching save. It reclaims all composite objects created in
local VM since the save operation, and does so very cheaply. On the other hand,
garbage collection is the only way to reclaim storage in global VM, since save and
restore normally do not affect global VM.
With garbage collection comes the ability to explicitly discard composite objects
that are no longer needed. This can be done in an order unrelated to the time of
creation of those objects, as opposed to the stacklike order imposed by save and
restore. This technique is particularly desirable for very large objects, such as font
definitions.
If the only reference to a particular composite object is an element of some array
or dictionary, replacing that element with something else (using put, for in-
stance) renders the object inaccessible. Alternatively, the undef operator removes
a dictionary entry entirely; that is, it removes both the key and the value of a key-
value pair, as opposed to replacing the value with some other value. In either case,
the removed object becomes a candidate for garbage collection.
Regardless of the means used to remove a reference to a composite object, if the
object containing the reference is in local VM, the action can be undone by a sub-
sequent restore. This is true even for undef. Consider the following example:
/a (string1) def
save
currentdict /a undef
restore
Execution of undef removes the key a and its value from the current dictionary,
seemingly causing the object string1 to become inaccessible. However, assuming
that the current dictionary is userdict (or some other dictionary in local VM),
restore reinstates the deleted entry, since it existed at the time of the correspond-
ing save. The value is still accessible and cannot be garbage-collected.
As a practical matter, this means that the technique of discarding objects explicit-
ly (in expectation of their being garbage-collected) is useful mainly for objects in
global VM, where save and restore have no effect, and for objects in local VM
that were created at the current level of save nesting.
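For an object in global VM, explicit discarding might look like the following sketch. BigTable is a hypothetical name standing in for some large object.

```postscript
currentglobal true setglobal
globaldict /BigTable 1000 array put     % a large object in global VM
setglobal                               % reinstate the previous allocation mode

% ... the program uses BigTable here ...

globaldict /BigTable undef              % remove the only reference
globaldict /BigTable known ==           % prints false
```

Once the entry is removed, the array is a candidate for garbage collection; when its storage is actually reclaimed is up to the interpreter.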

3.7.5 Standard and User-Defined Dictionaries
A job begins execution with three standard dictionaries on the dictionary stack
(in order from bottom to top):
• systemdict, a global dictionary that is permanently read-only and contains
  mainly operators
• globaldict (LanguageLevel 2), a global dictionary that is writeable
• userdict, a local dictionary that is writeable
There are other standard dictionaries that are the values of permanent named en-
tries in systemdict. Some of these are in local VM and some in global VM, as
shown in Tables 3.3 and 3.4.
A PostScript program can also create new dictionaries in either local or global
VM, then push them on the dictionary stack or store them as entries in userdict
or globaldict.
TABLE 3.3 Standard local dictionaries
DICTIONARY           DESCRIPTION
userdict             Standard writeable local dictionary. Initially, it is the top dictionary
                     on the dictionary stack, making it the current dictionary.
errordict            Error dictionary. See Section 3.11, “Errors.”
$error               Dictionary accessed by the built-in error-handling procedures to
                     store stack snapshots and other information. See Section 3.11,
                     “Errors.”
statusdict           Dictionary for product-specific operators and other definitions. See
                     Chapter 8.
FontDirectory        Dictionary for font definitions. It is normally read-only, but is
                     updated by definefont and consulted by findfont. See Sections 3.9,
                     “Named Resources,” and 5.2, “Font Dictionaries.”

TABLE 3.4 Standard global dictionaries
DICTIONARY           DESCRIPTION
systemdict           Read-only system dictionary containing all operators and other
                     definitions that are standard parts of the PostScript language. It is
                     the bottom dictionary on the dictionary stack.
globaldict           (LanguageLevel 2) Standard writeable global dictionary. It is on the
                     dictionary stack between systemdict and userdict.
GlobalFontDirectory  (LanguageLevel 2) Dictionary for font definitions in global VM. It is
                     normally read-only, but is updated by definefont and consulted by
                     findfont. See Sections 3.9, “Named Resources,” and 5.2, “Font
                     Dictionaries.”
The dictionaries userdict and globaldict are intended to be the principal reposi-
tories for application-defined dictionaries and other objects. When a PostScript
program creates a dictionary in local VM, it then typically associates that diction-
ary with a name in userdict. Similarly, when the program creates a dictionary in
global VM, it typically associates the dictionary with a name in globaldict. Note
that the latter step requires explicit action on the part of the program. Entering
global VM allocation does not alter the dictionary stack (say, to put globaldict on
top).
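The point about explicit action can be seen in the sketch below, which assumes that userdict is the current dictionary and that the hypothetical names shared1 and shared2 are not already defined.

```postscript
currentglobal true setglobal
/shared1 4 dict def                 % value in global VM, but def uses the current dictionary
globaldict /shared2 4 dict put      % explicitly entered in globaldict
setglobal                           % reinstate the previous allocation mode

userdict   /shared1 known ==        % prints true
globaldict /shared1 known ==        % prints false
globaldict /shared2 known ==        % prints true
```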
Note: systemdict, a global dictionary, contains several entries whose values are local
dictionaries, such as userdict and $error. This is an exception to the normal rule,
described in Section 3.7.2, “Local and Global VM,” that prohibits objects in global VM
from referring to objects in local VM.

The principal intended use of global VM is to hold font definitions and other re-
sources that are loaded dynamically during execution of a PostScript program.
The findresource operator loads resources into global VM automatically when
appropriate. However, any program can take advantage of global VM when its
properties are useful. The following guidelines are suggested:
• Objects that are created during the prolog can be in either local or global VM;
  in either case, they will exist throughout the job, since they are defined outside
  the save and restore that enclose individual pages of the script. A dictionary in
  local VM reverts to the initial state defined by the prolog at the end of each
  page. This is usually the desirable behavior. A dictionary in global VM
  accumulates changes indefinitely and never reverts to an earlier state; this is useful
  when there is a need to communicate information from one page to another
  (strongly discouraged in a page description).
• When using a writeable dictionary in global VM, you must be careful about
  what objects you store in it. Attempting to store a local composite object in a
  global dictionary will cause an invalidaccess error. For this reason, it is
  advisable to segregate local and global data and to use global VM only for those
  objects that must persist through executions of save and restore.
• In general, the prologs for most existing PostScript programs do not work
  correctly if they are simply loaded into global VM. The same is true of some fonts,
  particularly Type 3 fonts. These programs must be altered to define global and
  local information separately. Typically, global VM should be used to hold
  procedure definitions and constant data; local VM should be used to hold
  temporary data needed during execution of the procedures.
• Creating gstate (graphics state) objects in global VM is particularly risky. This
  is because the graphics state almost always contains one or more local objects,
  which cannot be stored in a global gstate object (see the currentgstate operator
  in Chapter 8).
3.7.6 User Objects
Some applications require a convenient and efficient way to refer to PostScript
objects previously constructed in VM. The conventional way to accomplish this is
to store such objects as named entries in dictionaries and later refer to them by
name. In a PostScript program written by a programmer, this approach is natural
and straightforward. When the program is generated mechanically by another
program, however, it is more convenient to number the objects with small inte-
gers and later refer to them by number. This technique simplifies the bookkeep-
ing the application program must do.
LanguageLevel 2 provides built-in support for a single space of numbered
objects, called user objects. There are three operators, defineuserobject,
undefineuserobject, and execuserobject, that manipulate an array named
UserObjects. These operators do not introduce any fundamental capability, but
merely provide convenient and efficient notation for accessing the elements of a
special array.

Example 3.4 illustrates the intended use of user objects.
Example 3.4
17 {ucache 132 402 316 554 setbbox … } cvlit defineuserobject
17 execuserobject ufill
The first line of the example constructs an interesting object that is to be used re-
peatedly (in this case, a user path; see Section 4.6, “User Paths”) and associates
the index 17 with this object.
The second line pushes the user object on the operand stack, from which ufill
takes it. execuserobject executes the user object associated with index 17. How-
ever, because the object in this example is not executable, the result of the execu-
tion is to push the object on the operand stack.
defineuserobject manages the UserObjects array automatically; there is no rea-
son for a PostScript program to refer to UserObjects explicitly. The array is allo-
cated in local VM and defined in userdict. This means that the effect of
defineuserobject is subject to save and restore. The values of user objects given
to defineuserobject can be in either local or global VM.
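A further sketch, using trivial objects in place of a user path, shows the difference between a literal and an executable user object. The indices 0 and 1 are arbitrary.

```postscript
0 (hello) defineuserobject          % index 0: a literal string
1 { 2 3 add } defineuserobject      % index 1: an executable procedure
0 execuserobject ==                 % prints (hello): a literal object is simply pushed
1 execuserobject ==                 % prints 5: the procedure is executed
0 undefineuserobject                % break the association for index 0
```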
3.7.7 Job Execution Environment
As indicated in Section 2.4, “Using the PostScript Language,” the conventional
model of a PostScript interpreter is a “print server”—a single-threaded process
that consumes and executes a sequence of “print jobs,” each of which is a com-
plete, independent PostScript program. This model is also appropriate for certain
other environments, such as a document previewer running on a host computer.
The notion of a print job is not formally a part of the PostScript language, be-
cause it involves not only the PostScript interpreter but also some description of
the environment in which the interpreter operates. Still, it is useful to describe a
general job (and job server) model that is accurate for most PostScript printers,
though perhaps lacking in some details. Information about communication pro-
tocols, job control, system management, and so on, does not appear here, but
rather in documentation for specific products.

A job begins execution in an initial environment that consists of the following:
• An empty operand stack
• A dictionary stack containing the standard dictionaries: systemdict,
  globaldict (LanguageLevel 2), and userdict
• Execution and graphics state stacks reset to their standard initial state, with no
  vestiges of previous jobs
• The contents of VM (local and global)
• Miscellaneous interpreter parameters
During execution, the job may alter its environment. Ordinarily, when a job fin-
ishes, the environment reverts to its initial state to prepare for the next job. That
is, the job is encapsulated. The server accomplishes this encapsulation by execut-
ing save and restore and by explicitly resetting stacks and parameters between
jobs.
With suitable authorization, a job can make persistent alterations to objects in
VM. That is, the job is not encapsulated. Instead, its alterations appear as part of
the initial state of the next and all subsequent jobs. This is accomplished by
means of the startjob and exitserver facilities, described below.
Server Operation
A job server is presented with a sequence of files via one or more communication
channels. For each file, the server performs the following sequence of steps:
1. Establish standard input and output file objects for the channel from which
the file is to be obtained. The means by which this is done is implementation-
dependent.
2. Execute save. This is the outermost save, which unlike a normal save obtains a
snapshot of the initial state of objects in both local and global VM.
3. Establish the default initial state for the interpreter: empty operand stack, local
VM allocation mode, default user space for the raster output device, and so
on.
4. Execute the standard input file until it reaches end-of-file or an error occurs. If
an error occurs, report it and flush input to end-of-file.
5. Clear the operand stack and reset the dictionary stack to its initial state.
6. Execute restore, causing objects in VM (both local and global) to revert to the
state saved in step 2.
7. Close the standard input and output files, transmitting an end-of-file indica-
tion over the communication channel.
Ordinarily, the server executes all of the above steps once for each file that it re-
ceives. Each file is treated as a separate job, and each job is encapsulated.
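Steps 2 through 6 can be loosely imitated by an ordinary PostScript program, which may help in understanding the sequence. The sketch below is not how any actual server is implemented: runjob, jobsave, and the file name job.ps are hypothetical, the save is a normal one that covers only local VM, and channel setup, input flushing after an error, and the remainder of step 3 are omitted.

```postscript
% file  runjob  -
/runjob {
  /jobsave save def                 % step 2 (a normal save: local VM only)
  false setglobal                   % step 3: local VM allocation mode
  cvx stopped                       % step 4: execute the file to end-of-file or error
  { (%%[ Error: job aborted ]%%\n) print flush } if
  clear cleardictstack              % step 5: reset operand and dictionary stacks
  jobsave restore                   % step 6: revert VM to the snapshot
} bind def

(job.ps) (r) file runjob            % run one hypothetical job file
```

The sketch assumes that userdict is the current dictionary when runjob is called, so that jobsave can still be found after cleardictstack.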
Altering Initial VM
A program can circumvent job encapsulation and alter the initial VM for subse-
quent jobs. To do so, it can use either startjob (LanguageLevel 2) or exitserver
(available in all implementations that include a job server). This capability is
controlled by a password. The system administrator can choose not to make the
capability available to ordinary users. Applications and drivers must be prepared
to deal with the possibility that altering the initial VM is not allowed.
Note: startjob and exitserver should be invoked only by a print manager, spooler, or
system administration program. They should never be used by an application pro-
gram composing a page description. Appendix G gives more guidelines for using
startjob and exitserver.
startjob is invoked as follows:
true password startjob
where password is a string or an integer (see Section C.3.1, “Passwords”). If the
password is correct, startjob causes the server to execute steps 5, 6, 3, and 4 in the
sequence above. In other words, it logically ends the current job, undoing all
modifications it has made so far, and starts a new job. However, it does not
precede the new job with a save operation, so its execution is not encapsulated.
Furthermore, it does not disturb the standard input and output files; the inter-
preter resumes consuming the remainder of the same input file.
Having started an unencapsulated job, the PostScript program can alter VM in
arbitrary ways. Such alterations are persistent. If the job simply runs to completion,
ending step 4 in the sequence above, the server skips step 6 (since there is no
saved VM snapshot to restore), continues with step 7, and processes the next job
normally starting at step 1.
Alternatively, a program can explicitly terminate its alterations to initial VM:
false password startjob
This operation has the effect of executing steps 2, 3, and 4, logically starting yet
another job that is encapsulated in the normal way, but still continuing to read
from the same file.
If startjob executes successfully, it always starts a new job in the sense described
above. It resets the stacks to their initial state and then pushes the result true on
the operand stack. But if startjob is unsuccessful, it has no effect other than to
push false on the operand stack; the effect is as if the program text before and af-
ter the occurrence of startjob were a single combined job.
The example sequence
true password startjob pop
… Application prolog here …
false password startjob pop
… Application script here …
installs the application prolog in initial VM if it is allowed to do so. However, the
script executes successfully regardless of whether the attempt to alter initial VM
was successful. The program can determine the outcome by testing the result re-
turned by startjob.
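A variant that tests the result instead of discarding it might look like the sketch below. The password 0 is an assumption for illustration only; the actual password is site-specific.

```postscript
true 0 startjob                     % 0 is an assumed password
{ (%%[ prolog will be installed in initial VM ]%%\n) print }
{ (%%[ initial VM not altered; job remains encapsulated ]%%\n) print }
ifelse
flush
% ... application prolog here ...
false 0 startjob pop
% ... application script here ...
```

The password is written literally at each use because a successful startjob undoes the modifications the job has made so far, including any definition in which the password might have been stored.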
The above sequence is an example; there is no restriction on the sequence of en-
capsulated and unencapsulated jobs. If the password is correct and the boolean
operand to startjob is true, the job that follows it is unencapsulated; if false, the
job is encapsulated. But if the password is incorrect, startjob does not start a new
job; the current job simply continues.
startjob also fails to start a new job if, at the time it is executed, the current save
nesting is more than one level deep. In other words, startjob works only when the
current save level is equal to the level at which the current job started. This per-
mits a file that executes startjob to be encapsulated as part of another job simply
by bracketing it with save and restore.

72
C H A P T E R 3
Language
Note: If an unencapsulated job uses save and restore, the save and restore operations
affect global as well as local VM, since they are at the outermost save level. Also, if the
job ends with one or more save operations pending, a restore to the outermost saved
VM is performed automatically.

exitserver
exitserver is an unofficial LanguageLevel 1 feature that is retained in higher
LanguageLevels for compatibility. Although exitserver has never been a formal
part of the PostScript language, it exists in nearly every Adobe PostScript prod-
uct, and some applications have come to depend on it. The startjob feature, de-
scribed above, is more flexible and is preferred for new applications in
LanguageLevels 2 and 3.
The canonical method of invoking exitserver is
serverdict begin password exitserver
This has the same effect as
true password startjob not
{/exitserver errordict /invalidaccess get exec}
if
In other words, if successful, exitserver initiates an unencapsulated job that can
alter initial VM; if unsuccessful, it generates an invalidaccess error. Like startjob,
a successful exitserver operation resets the stacks to their initial state: it removes
serverdict from the dictionary stack. The program that follows (terminated by
end-of-file) is executed as an unencapsulated job.
In many implementations, successful execution of exitserver sends the message
%%[exitserver: permanent state may be changed]%%
to the standard output file. This message is not generated by startjob. It is sup-
pressed if binary is true in the $error dictionary; see Section 3.11.2, “Error Han-
dling.”
Note: Aside from exitserver, the other contents of serverdict are not specified as part
of the language. In LanguageLevels 2 and 3, the effect of executing exitserver more
than once in the same file is the same as that of executing the equivalent startjob
sequence multiple times. In LanguageLevel 1, the effect of executing the exitserver
operator multiple times is undefined and unpredictable.

3.8 File Input and Output
A file is a finite sequence of characters bounded by an end-of-file indication.
These characters may be stored permanently in some place (for instance, a disk
file) or they may be generated on the fly and transmitted over some communica-
tion channel. Files are the means by which the PostScript interpreter receives exe-
cutable programs and exchanges data with the external environment.
There are two kinds of file: input and output. An input file is a source from which
a PostScript program can read a sequence of characters; an output file is a destina-
tion to which a PostScript program can write characters. Some files can be both
read and written.
The contents of a file are treated as a sequence of 8-bit bytes. In some cases, those
bytes can be interpreted as text characters, such as the ASCII text representing a
PostScript program. In other cases, they can be interpreted as arbitrary binary
data. In the descriptions of files and file operators, the terms character and byte
are synonymous.
3.8.1 Basic File Operators
A PostScript file object represents a file. The file operators take a file object as an
operand to read or write characters. Ignoring for the moment how a file object
comes into existence, the file operators include the following:
read reads the next character from an input file.
write appends a character to an output file.
readstring, readline, and writestring transfer the contents of strings to and
from files.
readhexstring and writehexstring read and write binary data represented in the
file by hexadecimal notation.
token scans characters from an input file according to the PostScript language
syntax rules.
exec, applied to an input file, causes the PostScript interpreter to execute a
PostScript program from that file.
The operators that write to a file do not necessarily deliver the characters to their
destination immediately. They may leave some characters in buffers for reasons
of implementation or efficiency. The flush and flushfile operators deliver these
buffered characters immediately. These operators are useful in certain situations,
such as during two-way interactions with another computer or with a human
user, when such data must be transmitted immediately.
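The buffering behavior described above can be sketched as follows. This is an illustrative fragment, not a required idiom: the name out is chosen here for the example, and the file object is obtained with the %stdout special file name described in Section 3.8.3.

```postscript
% Obtain a file object for the standard output file (see Section 3.8.3).
/out (%stdout) (w) file def

out (ready) writestring   % write the characters of a string
out 10 write              % write a single character: LF (character code 10)
out flushfile             % deliver any characters still buffered for this file

(done\n) print            % print writes to the standard output file implicitly
flush                     % flush applies to the standard output file
```

Without the flushfile and flush calls, the characters would still be delivered eventually, but not necessarily before the program goes on to wait for a reply from the other end of the channel.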
Standard Input and Output Files
All PostScript interpreters provide a standard input file and a standard output file,
which usually represent a real-time communication channel to and from another
computer. The standard input and output files are always present; it is not neces-
sary for a program to create or close them.
The PostScript interpreter reads and interprets the standard input file as Post-
Script program text. It sends error and status messages to the standard output
file. Also, a PostScript program may execute the print operator to send arbitrary
data to the standard output file. Note that print is a file operator; it has nothing to
do with placing text on a page or causing pages to emerge from a printer.
It is seldom necessary for a PostScript program to deal explicitly with file objects
for the standard files, because the PostScript interpreter reads the standard input
file by default and the print operator references the standard output file implicit-
ly. Additionally, the file currently being read by the PostScript interpreter is avail-
able via the currentfile operator; this file need not be the standard input file.
However, when necessary, a program may apply the file operator to the identify-
ing strings %stdin or %stdout to obtain file objects for the standard input and
output files; see Section 3.8.3, “Special Files.”
End-of-Line Conventions
The PostScript language scanner and the readline operator recognize all three ex-
ternal forms of end-of-line (EOL)—CR alone, LF alone, and the CR-LF pair—
and treat them uniformly, translating them as described below. The PostScript
interpreter does not perform any such translation when reading data by other
means or when writing data by any means.

End-of-line sequences are recognized and treated specially in the following situa-
tions:
Any of the three forms of EOL appearing in a literal string is converted to a sin-
gle LF character in the resulting string object. These three examples produce
identical string objects, each with an LF character following the second word in
the string:
(any text〈CR〉some more text)
(any text〈LF〉some more text)
(any text〈CR〉〈LF〉some more text)
Any of the three forms of EOL appearing immediately after \ in a string is
treated as a line continuation; both the \ and the EOL are discarded. These four
examples produce identical string objects:
(any text \〈CR〉some more text)
(any text \〈LF〉some more text)
(any text \〈CR〉〈LF〉some more text)
(any text some more text)
Any of the three forms of EOL appearing outside a string is treated as a single
white-space character. Since the language treats multiple white-space charac-
ters as a single white-space character, the treatment of EOL is interesting only
when a PostScript token is followed by data to be read explicitly by one of the
file operators. The following three examples produce identical results: the oper-
ator reads the character x from the current input file and leaves its character
code (the integer 120) on the stack.
currentfile read〈CR〉x
currentfile read〈LF〉x
currentfile read〈CR〉〈LF〉x
The readline operator treats any of the three forms of EOL as the termination
condition.
Data read by read and readstring does not undergo EOL translation: the Post-
Script interpreter reads whatever characters were received from the channel.
The same is true of data written by write and writestring: whatever characters
the interpreter provides are sent to the channel. However, in either case the
channel itself may perform some EOL translation, as discussed below.
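The difference between readline and readstring can be sketched with data embedded in the program itself. The names buf and four are chosen for this example only. Note that nothing may follow readline or readstring on its own line, since anything after the single white-space character that ends the token is read as data.

```postscript
/buf 80 string def

% The scanner consumes the single EOL that follows the readline token, so
% readline starts at the next line and stops at (and discards) its EOL.
currentfile buf readline
first line of data
pop =                          % discard the boolean; prints: first line of data

% readstring performs no EOL translation: it takes exactly four characters.
/four 4 string def
currentfile four readstring
abcd
pop ==                         % discard the boolean; prints: (abcd)
```

The fragment behaves the same whether the file uses CR, LF, or CR-LF as its end-of-line form, because the scanner treats each of them as one white-space character after the operator name.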

Communication Channel Behavior
Communications functions often usurp control characters. Control codes are
device-dependent and not part of the PostScript language. For example, the serial
communication protocol supported by many products uses the Control-D char-
acter as an end-of-file indication. In this case, Control-D is a communications
function and not logically part of a PostScript program. This applies specifically
to the serial channel; other channels, such as LocalTalk™ and Ethernet, have dif-
ferent conventions for end-of-file and other control functions. In all cases, com-
munication channel behavior is independent of the actions of the PostScript
interpreter.

There are two levels of PostScript EOL translation: one in the PostScript inter-
preter and one in the serial communication channel. The previous description
applies only to the EOL conventions at the level of the PostScript interpreter. The
purpose of the seemingly redundant communication-level EOL translation is to
maintain compatibility with diverse host operating systems and communications
environments.
As discussed in Section 3.2, “Syntax,” the ASCII encoding of the language is de-
signed for maximum portability. It avoids using control characters that might be
preempted by operating systems or communication channels. However, there are
situations in which transmission of arbitrary binary data is desirable. For exam-
ple, sampled images are represented by large quantities of binary data. The Post-
Script language has an alternative binary encoding that is advantageous in certain
situations. There are two main ways to deal with PostScript programs that con-
tain binary information:
Communicate with the interpreter via binary channels exclusively. Some chan-
nels, such as LocalTalk and Ethernet, are binary by nature. They do not pre-
empt any character codes, but instead communicate control information
separately from the data. Other channels, such as serial channels, may support
a binary communication protocol that allows control characters to be quoted.
This approach presupposes a well-controlled environment. PostScript pro-
grams produced in that environment may not be portable to other environ-
ments.
Take advantage of filters for encoding binary data as ASCII text. Filters are a
LanguageLevel 2 feature, described in Section 3.8.4, “Filters.” Programs repre-
sented in this way do not include any control codes and are therefore portable
to any LanguageLevel 2 or 3 interpreter in any environment.
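As a minimal sketch of the second approach, the following fragment carries three bytes of binary data through the channel as hexadecimal text. The name buf is chosen for this example; the buffer is deliberately larger than the data so that the read ends at the filter's end-of-data marker (>) rather than when the string fills.

```postscript
/buf 16 string def                     % larger than the data, so the read ends at EOD
currentfile /ASCIIHexDecode filter     % decoding filter layered on the program file
buf readstring
01FF7F>
pop                                    % discard the boolean (false: EOD was reached)
{ = } forall                           % prints 1, 255, and 127, one per line
```

Only printable ASCII characters cross the channel, yet the program receives the byte values 1, 255, and 127, two of which could not be sent safely over a channel that preempts control characters or strips the high-order bit.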

3.8.2 Named Files
The PostScript language provides access to named files in secondary storage. The
file access capabilities are part of the integration of the language with an underly-
ing operating system; there are variations from one such integration to another.
Not all the file system capabilities of the underlying operating system are neces-
sarily made available at the PostScript language level.
The PostScript language provides a standard set of operators for accessing named
files. These operators are supported in LanguageLevels 2 and 3, as well as in cer-
tain LanguageLevel 1 implementations that have access to file systems. The oper-
ators are file, deletefile, renamefile, status, filenameforall, setfileposition, and
fileposition. Even in LanguageLevel 1 implementations that do not support
named files, the file operator is supported, because the special file names %stdin,
%stdout, and %stderr are always allowed (see Section 3.8.3, “Special Files”).
Although the language defines a standard framework for dealing with files, the
detailed semantics of the file system operators, particularly file naming conven-
tions, are operating system–dependent.
Files are stored in one or more “secondary storage devices,” hereafter referred to
simply as devices. (These are not to be confused with the “current device,” which
is a raster output device identified in the graphics state.) The PostScript language
defines a uniform convention for naming devices, but it says nothing about how
files in a given device are named. Different devices have different properties, and
not all devices support all operations.
A complete file name has the form %device%file, where device identifies the sec-
ondary storage device and file is the name of the file within the device. When a
complete file name is presented to a file system operator, the device portion se-
lects the device; the file portion is in turn presented to the implementation of that
device, which is operating system–dependent and environment-dependent.
Note: Typically, file names cannot contain null characters (ASCII code 0); if a file
name is specified by a string object containing a null character, the null character will
effectively terminate the file name.

When a file name is presented without a %device% prefix, a search rule determines
which device is selected. The available storage devices are consulted in order; the
requested operation is attempted on each device until the operation succeeds. The
number of available devices, their names, and the order in which they are searched
are environment-dependent. Not all devices necessarily participate in such searches;
some devices can be accessed only by explicitly naming them.
In an interpreter that runs on top of an operating system, there may be a device
that represents the complete file system provided by the operating system. If so,
by convention that device’s name is os; thus, complete file names are in the form
%os%file, where file conforms to underlying file system conventions. This device
always participates in searches, as described above; a program can access ordinary
files without specifying the %os% prefix. There may be more than one device that
behaves in this way; the names of such devices are product-dependent.
Note: The os device may impose some restrictions on the set of files that can be ac-
cessed. Restrictions are necessary when the PostScript interpreter executes with a user
identity different from that of the user running the application program.

In an interpreter that controls a dedicated product, such as a typical printer prod-
uct, there can be one or more devices that represent file systems on disks and car-
tridges. Files on these devices have names such as %disk0%file, %disk1%file, and
%cartridge0%file. Again, these devices participate in searches when the device
name is not specified.
Each of the operators file, deletefile, renamefile, status, and filenameforall takes
a filename operand—a string object that identifies a file. The name of the file can
be in one of three forms:
%device%file identifies a named file on a specific device, as described above.
file (first character not %) identifies a named file on an unspecified device,
which is selected by an environment-specific search rule, as described above.
%device or %device% identifies an unnamed file on the device. Certain devices,
such as cartridges, support a single unnamed file as opposed to a collection of
named files. Other devices represent communication channels rather than per-
manent storage media. There are also special files named %stdin, %stdout,
%stderr, %statementedit, and %lineedit, described in Section 3.8.3, “Special
Files.” The deletefile, renamefile, and filenameforall operators do not apply to
file names of this form.
“Wildcard” file names are recognized by the filenameforall operator; see
filenameforall in Chapter 8 for more information.
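The file name forms can be sketched as follows. The file name example.dat is hypothetical, and the fragment assumes an interpreter that provides the os device; on other products the device names and the results will differ.

```postscript
% Complete name: device and file are both given.
(%os%example.dat) status
{ pop pop pop pop (found) = }      % discard pages, bytes, referenced, created
{ (not found) = }
ifelse

% No device prefix: the search rule selects the device.
(example.dat) status
{ pop pop pop pop (found) = } { (not found) = } ifelse

% Wildcards: print every file name that matches the template.
(*) { = } 256 string filenameforall
```

The scratch string passed to filenameforall must be long enough to hold the longest name enumerated; 256 is an arbitrary choice for this sketch.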

Creating and Closing a File Object
File objects are created by the file operator. This operator takes two strings: the
first identifies the file and the second specifies access. file returns a new file object
associated with that file.
An access string is a string object that specifies how a file is to be accessed. File
access conventions are similar to the ones defined by the ANSI C standard, al-
though some file systems may not support all access methods. The access string
always begins with r, w, or a, possibly followed by +; any additional characters
supply operating system–specific information. Table 3.5 lists access strings and
their meanings.
TABLE 3.5 Access strings
ACCESS STRING   MEANING
r               Open for reading only. Generate an error if the file does not already exist.
w               Open for writing only. Create the file if it does not already exist. Truncate and overwrite it if it does exist.
a               Open for writing only. Create the file if it does not already exist. Append to it if it does exist.
r+              Open for reading and writing. Generate an error if the file does not already exist.
w+              Open for reading and writing. Create the file if it does not already exist. Truncate and overwrite it if it does exist.
a+              Open for reading and writing. Create the file if it does not already exist. Append to it if it does exist.
Note: The special files %stdin, %lineedit, and %statementedit allow only r access;
%stdout and %stderr allow only w access (see Section 3.8.3, “Special Files”).
Like other composite objects, such as strings and arrays, file objects have access
attributes. The access attribute of a file object is based on the access string used to
create it. Attempting to access a file object in a way that would violate its access
attribute causes an invalidaccess error.
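The access strings can be sketched in use as follows. The file name scratch.txt is hypothetical, and the fragment assumes an environment in which the program is permitted to create files; many products restrict or forbid this.

```postscript
/f (scratch.txt) (w) file def     % create or truncate; open for writing only
f (first line\n) writestring
f closefile

/f (scratch.txt) (a) file def     % reopen; subsequent writes are appended
f (second line\n) writestring
f closefile

/f (scratch.txt) (r) file def     % open for reading only
f 80 string readline pop =        % prints: first line
f closefile
```

While the file is open with the access string r, an attempt such as f (x) writestring would violate the file object's access attribute and cause an invalidaccess error.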

Certain files—in particular, named files on disk—are positionable, meaning that
the data in the file can be accessed in an arbitrary order rather than only sequen-
tially from the beginning. The setfileposition operator adjusts a file object so that
it refers to a specified position in the underlying file; subsequent reads or writes
access the file at that new position. Specifying a plus sign (+) in the access string
opens a positionable file for reading and writing, as shown in Table 3.5. To ensure
predictable results, it is necessary to execute setfileposition when switching be-
tween reading and writing.
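A minimal sketch of repositioning, assuming that scratch.txt is a hypothetical name for a positionable file and that the underlying file system supports the w+ access method:

```postscript
/f (scratch.txt) (w+) file def    % open for reading and writing; create or truncate
f (0123456789) writestring

f 4 setfileposition               % reposition before switching from writing to reading
f read                            % -> 52 true  (52 is the character code of the digit 4)
pop =                             % discard the boolean; prints 52

f fileposition =                  % prints the position following the character just read
f closefile
```

The call to setfileposition serves two purposes here: it selects the character to be read, and it satisfies the requirement to reposition the file when switching between writing and reading.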
At the end of reading or writing a file, a program should close the file to break the
association between the PostScript file object and the actual file. The file opera-
tors close a file automatically if end-of-file is encountered during reading (see be-
low). The closefile operator closes a file explicitly. restore closes a file if the file
object was created since the corresponding save operation while in local VM allo-
cation mode. Garbage collection closes a file if the file object is no longer accessi-
ble.
All operators that access files treat end-of-file and exception conditions in the
same way. During reading, if an end-of-file indication is encountered before the
requested item can be read, the file is closed and the operation returns an explicit
end-of-file result. This also occurs if the file has already been closed when the
operator is executed. All other exceptions during reading and any exceptions during
writing result in execution of one of the errors ioerror, invalidfileaccess, or
invalidaccess.
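The explicit end-of-file result makes a read loop straightforward. The following sketch counts the characters in a file; scratch.txt is a hypothetical file name, and f and n are names chosen for the example.

```postscript
/f (scratch.txt) (r) file def
/n 0 def
{
  f read                       % -> int true, or false at end-of-file
  { pop /n n 1 add def }       % a character was read: discard it and count it
  { exit }                     % end-of-file: read has already closed the file
  ifelse
} loop
n =                            % prints the number of characters read
```

No closefile is needed after the loop, because the file is closed automatically when read encounters end-of-file.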
3.8.3 Special Files
The file operator can also return special files that are identified as follows:
%stdin, the standard input file.
%stdout, the standard output file.
%stderr, the standard error file. This file is for reporting low-level errors. In
many configurations, it is the same as the standard output file.
%statementedit, the statement editor filter file, described below.
%lineedit, the line editor filter file, described below.

For example, the statements
(%stdin) (r) file
(%stdout) (w) file
push copies of the standard input and output file objects on the operand stack.
These are duplicates of existing file objects, not new objects. Each execution of
the file operator for %stdin, %stdout, or %stderr within a given job returns the
same file object. A PostScript program should not close these files. In an inter-
preter that supports multiple execution contexts, the standard input and output
files are private to each context; the standard error file is shared among all con-
texts.
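That repeated executions of file return the same object for a standard file can be checked directly. This is a small sketch; it should print true in an interpreter that follows the rule stated above.

```postscript
(%stdout) (w) file
(%stdout) (w) file
eq =                                        % both operands are the same file object

(%stdout) (w) file (hello\n) writestring    % same destination that print uses
flush
```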
Some PostScript interpreters support an interactive executive, invoked by the
executive operator; this is described in Section 2.4.4, “Using the Interpreter Inter-
actively.” executive obtains commands from the user by means of a special file
named %statementedit. Applying the file operator to the file name string
%statementedit causes the following to happen:
The file operator begins reading characters from the standard input file and
storing them in a temporary buffer. While doing so, it echoes the characters to
the standard output file. It also interprets certain control characters as editing
functions for making corrections, as described in Section 2.4.4.
When a complete statement has been entered, the file operator returns. A state-
ment consists of one or more lines terminated by a newline that together form
one or more complete PostScript tokens, with no opening brackets
({, (, <, or <~) left unmatched. A statement is also considered complete if it con-
tains a syntax error.
The returned file object represents a temporary file containing the statement
that was entered, including the terminating end-of-line character. Reading
from this file obtains the characters of the statement in turn; end-of-file is re-
ported when the end of the statement is reached. Normally, this file is used as
an operand to the exec operator, causing the statement to be executed as a
PostScript program.
The %lineedit special file is similar to %statementedit, except that when reading
from %lineedit, the file operator returns after a single line has been entered,
whether or not it constitutes a complete statement. For both the special files
%statementedit and %lineedit, if the standard input file reaches end-of-file before
any characters have been entered, the file operator issues an undefinedfilename
error.
It is important to understand that the file object returned by file for the
%statementedit and %lineedit special files is not the same as the standard input
file. It represents a temporary file containing a single buffered statement. When
the end of that statement is reached, the file is closed and the file object is no
longer of any use. Successive executions of file for %statementedit and %lineedit
return different file objects.
The %statementedit and %lineedit special files are not available in PostScript in-
terpreters that do not support an interactive executive. PostScript programs that
are page descriptions should never refer to these files.
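For interactive use only, the way an executive might drive %statementedit can be sketched as below. This is an assumption-laden illustration, not the definition of the executive operator: the prompt string is invented, and errors raised by the executed statement are deliberately not handled.

```postscript
{
  (PS> ) print flush
  % file raises undefinedfilename if standard input reaches end-of-file
  % before any characters are entered; stopped turns that into a boolean.
  { (%statementedit) (r) file } stopped
  { clear exit } if            % no more input: discard leftovers and leave the loop
  cvx exec                     % make the temporary file executable and run it
} loop
```

Each pass through the loop obtains a new file object, consistent with the rule that successive executions of file for %statementedit return different file objects.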
3.8.4 Filters
A filter (LanguageLevel 2) is a special kind of file object that can be layered on top
of some other file to transform data being read from or written to that file. When
a PostScript program reads characters from an input filter, the filter reads charac-
ters from its underlying file and transforms the data in some way, depending on
the filter. Similarly, when a program writes characters to an output filter, the filter
transforms the data and writes the results to its underlying file.
An encoding filter is an output file that takes the data written to it, converts it to
some encoded representation depending on the filter, and writes the encoded
data to the underlying file. For example, the ASCIIHexEncode filter transforms bi-
nary data to an ASCII hexadecimal-encoded representation, which it writes to its
underlying file. All encoding filters have Encode as part of their names.
A decoding filter is an input file that reads encoded data from its underlying file
and decodes it. The program reading from the filter receives the decoded data.
For example, the ASCIIHexDecode filter reads ASCII hexadecimal-encoded data
from its underlying file and transforms it to binary. All decoding filters have
Decode as part of their names.
Decoding filters are most likely to be used in page descriptions. An application
program generating a page description can encode certain information (for ex-
ample, data for sampled images) to compress it or to convert it to a portable
ASCII representation. Then, within the page description itself, it invokes the cor-
responding decoding filter to convert the information back to its original form.

Encoding filters are unlikely to be used in most page descriptions. However, a
PostScript program can use them to encode data to be sent back to the applica-
tion or written to a disk file. In the interest of symmetry, the PostScript language
defines both encoding and decoding filters for all of its standard data transforma-
tion algorithms. However, encoding filters are optional; not all PostScript inter-
preters support them.
Creating Filters
Filter files are created by the filter operator (LanguageLevel 2). The filter operator
expects the following operands in the order given:
1. A data source or data target. This is ordinarily a file object that represents the
underlying file the filter is to read or write. However, it can also be a string or a
procedure. Details are provided in Section 3.13.1, “Data Sources and Targets.”
2. Filter parameters. All filters may take additional parameters, and some require
additional parameters, to control how they operate. These parameters may be
specified in a dictionary given as an operand following the data source or tar-
get; in some cases, required parameters must be given as operands following
the data source or target or following the dictionary operand, if any. The dic-
tionary operand may be omitted whenever all the dictionary-supplied param-
eters have the corresponding default values for that filter. Exactly which
parameters and operands are required for the various filters is described in
Section 3.13, “Filtered Files Details.”
3. Filter name. This is a name object, such as ASCIIHexDecode, that specifies the
data transformation the filter is to perform. It also determines how many pa-
rameters there are and how they are to be interpreted.
The filter operator returns a new file object that represents the filtered file. For an
encoding filter, this is an output file, and for a decoding filter, an input file. The
direction of the underlying file—that is, its read/write attribute—must match
that of the filter. Filtered files can be used just the same as other files; they are val-
id as operands to file operators such as read, write, readstring, and writestring.
Input filters are also valid as data sources for operators such as exec or image.
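A minimal sketch of the operand order, using a string as the data source so that the fragment is self-contained. The name src is chosen for the example, and ASCIIHexDecode requires no parameters, so only the source and the filter name are supplied.

```postscript
% A string serves as the data source; > is the ASCIIHexDecode end-of-data marker.
/src (48656C6C6F>) /ASCIIHexDecode filter def   % a decoding filter is an input file

src 16 string readstring    % -> substring false (EOD reached before the string filled)
pop =                       % discard the boolean; prints Hello
```

The filtered file is read with readstring exactly as an ordinary input file would be; the program never sees the hexadecimal digits, only the five decoded bytes.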
Since a filter is itself a file, it can be used as the underlying file for yet another
filter. Filters can be cascaded to form a pipeline that passes the data stream through
two or more encoding or decoding transformations in sequence. Example 3.5 illustrates
the construction of an input pipeline for decoding sampled image data that is embedded
in the program. The application has encoded the image data twice: once using the
RunLengthEncode method to compress the data, and then using the ASCII85Encode method
to represent the binary compressed data as ASCII text.
Example 3.5
256 256 8 [256 0 0 -256 0 256]    % Other operands of the image operator
currentfile
/ASCII85Decode filter
/RunLengthDecode filter
image
… Encoded image data …
~>                                % ASCII85 end-of-data marker
The currentfile operator returns the file object from which the PostScript inter-
preter is currently executing. The first execution of filter creates an ASCII85-
Decode filter whose underlying file is the one returned by currentfile. It pushes
the filter file object on the stack. The second execution of filter creates a
RunLengthDecode filter whose underlying file is the first filter file; it pushes the
new filter file object on the stack. Finally, the image operator uses the second fil-
ter file as its data source. As image reads from its data source, the data is drawn
from the underlying file and transformed by the two filters in sequence.
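The same kind of pipeline can be sketched on a small scale with a string as the data source. The data here was prepared by hand for this illustration: a run of five A characters and the two literal characters BC were run-length encoded (bytes FC 41, 01 42 43, and the end-of-data byte 80) and then written in hexadecimal.

```postscript
(FC4101424380>) /ASCIIHexDecode filter   % first filter: hexadecimal text to bytes
/RunLengthDecode filter                  % second filter: expands the run-length data
16 string readstring                     % -> substring false (EOD reached)
pop =                                    % discard the boolean; prints AAAAABC
```

As in Example 3.5, the filters are created in the reverse of the order in which the data was encoded: the encoding applied last is decoded first.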
Standard Filters
The PostScript language supports a standard set of filters that fall into three main
categories:
ASCII encoding and decoding filters enable arbitrary 8-bit binary data to be rep-
resented in the printable subset of the ASCII character set. This improves the
portability of the resulting data, since it avoids the problem of interference by
operating systems or communication channels that preempt the use of control
characters, represent text as 7-bit bytes, or impose line-length restrictions.
Compression and decompression filters enable data to be represented in a compressed
form. Compression is particularly valuable for large sampled images, since it reduces
storage requirements and transmission time. There are several compression filters,
each of which is best suited for particular kinds of data. Note that the compressed
data is in 8-bit binary format, even if the original data happens to be ASCII text.
For maximum portability of the encoded data, these filters should be used with ASCII
encoding filters, as illustrated above in Example 3.5.
Subfile filters pass data through without modification. These filters permit the
creation of file objects that access arbitrary user-defined data sources or data
targets. Input filters also can read data from an underlying file up to a specified
end-of-data marker.
Table 3.6 summarizes the available filters. A program can determine the complete
set of filters that the PostScript interpreter supports by applying the
resourceforall operator to the Filter resource category; see Section 3.9, “Named
Resources.”
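The query described above can be sketched as follows. The scratch string length of 64 is an arbitrary choice for the example.

```postscript
% Print the name of every filter the interpreter supports.
(*) { = } 64 string /Filter resourceforall

% Test for one particular filter without enumerating the whole category.
/FlateDecode /Filter resourcestatus
{ pop pop (FlateDecode is available) = }     % discard status and size
{ (FlateDecode is not available) = }
ifelse
```

A program that must run on interpreters of differing LanguageLevels can use a test of this kind to choose between alternative encodings of its data.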
TABLE 3.6 Standard filters
FILTER NAME           REQUIRED PARAMETERS  DESCRIPTION
ASCIIHexEncode        (none)               Encodes binary data in an ASCII hexadecimal representation. Each binary data byte is converted to two hexadecimal digits, resulting in an expansion factor of 1:2 in the size of the encoded data.
ASCIIHexDecode        (none)               Decodes ASCII hexadecimal-encoded data, producing the original binary data.
ASCII85Encode         (none)               Encodes binary data in an ASCII base-85 representation. This encoding uses nearly all of the printable ASCII character set. The resulting expansion factor is 4:5, making this encoding much more efficient than hexadecimal.
ASCII85Decode         (none)               Decodes ASCII base-85 data, producing the original binary data.
LZWEncode             (none)               Compresses data using the LZW (Lempel-Ziv-Welch) adaptive compression method, optionally after pretransformation by a predictor function. This is a good general-purpose encoding that is especially well suited for natural-language and PostScript-language text, but it is also useful for image data.
LZWDecode             (none)               Decompresses LZW-encoded data, producing the original data.
FlateEncode           (none)               (LanguageLevel 3) Compresses data using the public-domain zlib/deflate compression method, optionally after pretransformation by a predictor function. This is a variable-length Lempel-Ziv adaptive compression method cascaded with adaptive Huffman coding. It is a good general-purpose encoding that is especially well suited for natural-language and PostScript-language text, but it is also useful for image data.
FlateDecode           (none)               (LanguageLevel 3) Decompresses data encoded in zlib/deflate compressed format, producing the original data.
RunLengthEncode       record size          Compresses data using a simple byte-oriented run-length encoding algorithm. This encoding is best suited to monochrome image data, or any data that contains frequent long runs of a single byte value.
RunLengthDecode       (none)               Decompresses data encoded in the run-length encoding format, producing the original data.
CCITTFaxEncode        (none)               Compresses data using a bit-oriented encoding algorithm (the CCITT facsimile standard). This encoding is specialized to monochrome image data at 1 bit per pixel.
CCITTFaxDecode        (none)               Decompresses facsimile-encoded data, producing the original data.
DCTEncode             dictionary           Compresses continuous-tone (grayscale or color) sampled image data using a DCT (discrete cosine transform) technique based on the JPEG standard. This encoding is specialized to image data. It is “lossy,” meaning that the encoding algorithm can lose some information.
DCTDecode             (none)               Decompresses DCT-encoded data, producing image sample data that approximate the original data.
ReusableStreamDecode  (none)               (LanguageLevel 3) From any data source, creates an input stream that can be treated as a random-access, repositionable file.
NullEncode            (none)               Passes all data through, without any modification. This permits an arbitrary data target (procedure or string) to be treated as an output file.
SubFileDecode         count, string        Passes all data through, without any modification. This permits an arbitrary data source (procedure or string) to be treated as an input file. Optionally, this filter detects an end-of-data marker in the source data stream, treating the preceding data as a subfile.
Note: In LanguageLevel 3, all encoding filters, with the exception of the NullEncode
filter, are optional—that is, they may or may not be present in a PostScript interpret-
er product. Additional nonstandard filters may be available in some products. To en-
sure portability, PostScript programs that are page descriptions should not depend on
optional or nonstandard filters.
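As a sketch of a filter that takes required parameters, the following fragment applies SubFileDecode to a string data source. The marker END and the sample text are invented for the example. The count operand of 0 requests the data up to, but not including, the first occurrence of the marker; see Section 3.13 for the full semantics of count.

```postscript
(alpha beta END gamma) 0 (END) /SubFileDecode filter
32 string readstring     % -> substring false (the marker ended the subfile)
pop ==                   % discard the boolean; prints (alpha beta )
```

The required parameters follow the data source and precede the filter name, in the operand order given under Creating Filters.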

Section 3.13, “Filtered Files Details,” provides complete information about individual
filters, including specifications of the encoding algorithms for some of them. The
section also describes the semantics of data sources and data targets in more detail.
3.8.5 Additional File Operators
There are other miscellaneous file operators:
status and bytesavailable return status information about a file.
currentfile returns the file object from which the interpreter is currently read-
ing.
run is a convenience operator that combines the functions of file and exec.
Several built-in procedures print the values of objects on the operand stack, send-
ing a readable representation of those values to the standard output file:
= pops one object from the operand stack and writes a text representation of its
value to the standard output file, followed by a newline.
== is similar to =, but produces results closer to full PostScript language syntax
and expands the values of arrays.
stack prints the entire contents of the operand stack with =, but leaves the stack
unchanged.
pstack performs a similar operation to stack, but uses ==.
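The difference between the two forms of output can be sketched as follows; the objects pushed are arbitrary sample values.

```postscript
(text) [1 2 (three)] /name 4.5

pstack               % prints each object with ==, topmost first; stack unchanged
stack                % prints the same objects with =
clear

(text) =             % prints text
(text) ==            % prints (text)
[1 2 (three)] ==     % prints [1 2 (three)]
[1 2 (three)] =      % prints --nostringval--
```

Because == reproduces string delimiters and expands arrays, it is usually the more useful of the two when debugging; = is convenient for producing plain messages.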
Input/output and storage devices can be manipulated individually by
LanguageLevel 2 operators. In particular:
setdevparams and currentdevparams access device-dependent parameters (see
Appendix C).
resourceforall, applied to the IODevice resource category, enumerates all avail-
able device parameter sets (see the next section).
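A minimal sketch combining the two operators, assuming a LanguageLevel 2 or 3 interpreter; the set of devices and the parameters reported are product-dependent, and the scratch string length of 64 is arbitrary.

```postscript
% For each available device parameter set, print its name and then
% each parameter key followed by its value.
(*)
{
  dup =                               % the device name, for example %os%
  currentdevparams                    % -> dictionary of device parameters
  { exch == == } forall               % print key, then value
}
64 string /IODevice resourceforall
```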
3.9 Named Resources
Some features of the PostScript language involve the use of open-ended collections
of objects to control their operation. For example, the font machinery uses font
dictionaries that describe the appearance of characters. The number of possible font
dictionaries is unlimited. In LanguageLevels 2 and 3, this same idea applies to forms,
patterns, color rendering dictionaries, and many other categories of objects.
It is often convenient to associate these objects with names in some central regis-
try. This is particularly true for fonts, which are assigned standard names (such as
Times-Roman or Palatino-BoldItalic) when they are created. Other categories of
objects also can benefit from a central naming convention.
If all available objects in a particular category (for example, all possible fonts)
were permanently resident in VM, they could simply be stored in some dictionary.
Accessing a named object would be a matter of performing get from the diction-
ary; checking whether a named object is available would be accomplished by per-
forming a known operation on the dictionary.
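This simple scheme can be sketched with an ordinary dictionary. The names registry, Frob1, and Frob2 are hypothetical.

```postscript
/registry 10 dict def                 % hypothetical in-VM registry
registry /Frob1 (first widget) put    % register an object under a name
registry /Frob1 known =               % writes true
registry /Frob2 known =               % writes false
registry /Frob1 get =                 % writes first widget
```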
There are many more fonts and objects of other categories than can possibly re-
side in VM at any given time. These objects originate from a source external to
the PostScript interpreter. They are introduced into VM in two ways:
• The application or print spooler embeds the objects’ definitions directly in the
  job stream.
• During execution, the PostScript program requests the objects by name. The
  interpreter loads them into VM automatically from an external source, such as
  a disk file, a ROM cartridge, or a network file server.
The notion of named resources (LanguageLevel 2) supports the second method. A
resource is a collection of named objects that either reside in VM or can be located
and brought into VM on demand. There are separate categories of resources with
independent name spaces; for example, fonts and forms are distinct resource cat-
egories. Within each category, there is a collection of named resource instances.
Each category can have its own policy for locating instances that are not in VM
and for managing the instances that are in VM.
3.9.1 Resource Operators
There are five LanguageLevel 2 operators that apply to resources: findresource,
resourcestatus, resourceforall, defineresource, and undefineresource. A more
limited pair of operators applicable only to fonts, findfont and definefont, are
available in LanguageLevel 1.
The findresource operator is the key feature of the resource facility. Given a re-
source category name and an instance name, findresource returns an object. If
the requested resource instance does not already exist as an object in VM,
findresource gets it from an external source and loads it into VM. A PostScript
program can access named resources without knowing whether they are already
in VM or how they are obtained from external storage.
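A typical use is sketched below. It assumes that a Font resource instance named Helvetica is available in the product; the program does not need to know whether the font is already in VM.

```postscript
/Helvetica /Font findresource    % font dictionary, loaded on demand if necessary
12 scalefont setfont
72 720 moveto
(Set in Helvetica) show
```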
Other important features include resourcestatus, which returns information
about a resource instance, and resourceforall, which enumerates all available
resource instances in a particular category. These operators apply to all resource
instances, whether or not they reside in VM; the operators do not cause the re-
source instances to be brought into VM. resourceforall should be used with care
and only when absolutely necessary, since the set of available resource instances is
potentially extremely large.
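The following sketch queries a single instance with resourcestatus instead of enumerating the category. It relies on the status values documented for resourcestatus in Chapter 8 (0, 1, or 2); the instance name Helvetica is only an example.

```postscript
/Helvetica /Font resourcestatus {
  pop                                      % discard the size estimate
  [ (defined in VM by defineresource)      % status 0
    (loaded into VM by findresource)       % status 1
    (available externally, not in VM)      % status 2
  ] exch get =
} {
  (not available) =
} ifelse
```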
A program can explicitly define a named resource instance in VM. That is, it can
create an object in VM, then execute defineresource to associate the object with a
name in a particular resource category. This resource instance will be visible in
subsequent executions of findresource, resourcestatus, and resourceforall. A
program can also execute undefineresource to reverse the effect of a prior
defineresource. The findresource operator automatically executes define-
resource and undefineresource to manage VM for resource instances that it ob-
tains from external storage.
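As an illustration, the sketch below defines, uses, and then removes an instance of the ColorSpace category. The instance name MyGray is hypothetical.

```postscript
/MyGray [ /DeviceGray ] /ColorSpace defineresource pop   % define the instance
/MyGray /ColorSpace findresource setcolorspace           % look it up by name
0.5 setcolor                                             % 50 percent gray
/MyGray /ColorSpace undefineresource                     % remove the definition
```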
Resource instances can be defined in either local or global VM. The lifetime of the
definition depends on the VM allocation mode in effect at the time the definition
is made (see Section 3.7.2, “Local and Global VM”). Normally, both local and
global resource instances are visible and available to a program. However, when
the current VM allocation mode is global, only global instances are visible; this
ensures correct behavior of resource instances that are defined in terms of other
resource instances.
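The visibility rule can be explored with a sketch such as the following, which defines an instance in local VM and then queries it in each allocation mode. The name LocalProcs is hypothetical, and the comments state what the rule above implies, not observed output.

```postscript
currentglobal                                     % remember the allocation mode
false setglobal
/LocalProcs 1 dict /ProcSet defineresource pop    % instance defined in local VM
true setglobal                                    % global mode: local instance should be hidden
/LocalProcs /ProcSet resourcestatus
{ pop pop (visible) } { (not visible) } ifelse =
false setglobal                                   % local mode: local and global instances visible
/LocalProcs /ProcSet resourcestatus
{ pop pop (visible) } { (not visible) } ifelse =
/LocalProcs /ProcSet undefineresource
setglobal                                         % restore the saved mode
```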
When a program executes defineresource to define a resource instance explicitly,
the program has complete control over whether to use local or global VM. How-
ever, when execution of findresource causes a resource instance to be brought
into VM automatically, the decision whether to use local or global VM is inde-
pendent of the VM allocation mode at the time findresource is executed. Usually,
resource instances are loaded into global VM; this enables them to be managed
independently of the save and restore activity of the executing program. How-
ever, certain resource instances do not function correctly when they reside in glo-
bal VM; they are loaded into local VM instead. In general, PostScript programs
using resources should not depend on knowing anything about the policies used
by the resource machinery, since those policies can vary among different resource
implementations.
The language does not specify a standard method for installing resources in ex-
ternal storage. Installation typically consists of writing to a named file in a file
system. However, details of how resource names are mapped to file names and
how the files are managed are environment-dependent. In some environments,
resources may be installed using facilities entirely separate from the PostScript in-
terpreter.
Resource instances are identified by keys that ordinarily are name or string ob-
jects; the resource operators treat names and strings equivalently. Use of other
types of keys is permitted but not recommended. The defineresource operator
can define a resource instance with a key that is not a name or a string, and the
other resource operators can access the instance using that key. However, such a
key can never match any resource instance in external storage.
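Because names and strings are treated equivalently, an instance defined under a name can be queried with a string. A minimal sketch, using the hypothetical instance name Frob:

```postscript
/Frob 1 dict /ProcSet defineresource pop     % defined with a name key
(Frob) /ProcSet resourcestatus               % queried with a string key
{ pop pop (found) } { (not found) } ifelse =
/Frob /ProcSet undefineresource
```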
3.9.2 Resource Categories
Resource categories are identified by name. Tables 3.7, 3.8, and 3.9 list the stan-
dard resource categories. Within a given category, every resource instance that re-
sides in VM is of a particular type and has a particular intended interpretation or
use.
Regular resources are those whose instances are ordinary useful objects, such as
font or halftone dictionaries. For example, a program typically uses the result re-
turned by findresource as an operand of some other operator, such as scalefont
or sethalftone.
Implicit resources represent some built-in capability of the PostScript interpreter.
For example, the instances of the Filter category are filter names, such as
ASCII85Decode and CCITTFaxDecode, that are passed directly to the filter opera-
tor. For such resources, the findresource operator returns only its name operand.
However, resourceforall and resourcestatus are useful for inquiring about the
availability of capabilities such as specific filter algorithms.
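A program might test for a filter before relying on it, as in this sketch. FlateDecode is used only as an example of a filter that may or may not be present in a given product.

```postscript
/FlateDecode /Filter resourcestatus
{ pop pop (FlateDecode supported) } { (FlateDecode not supported) } ifelse =

% For an implicit resource, findresource returns only its name operand.
/ASCII85Decode /Filter findresource ==       % writes /ASCII85Decode
```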
TABLE 3.7 Regular resources
CATEGORY NAME      OBJECT TYPE    DESCRIPTION
Font               dictionary     Font definition
CIDFont            dictionary     CIDFont definition (LanguageLevel 3)
CMap               dictionary     Character code mapping (LanguageLevel 3)
FontSet            dictionary     Bundle of font definitions (LanguageLevel 3)
Encoding           array          Encoding vector
Form               dictionary     Form definition
Pattern            dictionary     Pattern definition (prototype)
ProcSet            dictionary     Procedure set
ColorSpace         array          Parameterized color space
Halftone           dictionary     Halftone dictionary
ColorRendering     dictionary     Color rendering dictionary
IdiomSet           dictionary     Procedure substitution dictionary (LanguageLevel 3)
InkParams          dictionary     Colorant details dictionary (LanguageLevel 3)
TrapParams         dictionary     Trapping parameter set (LanguageLevel 3)
OutputDevice       dictionary     Page device capabilities (LanguageLevel 3)
ControlLanguage    dictionary     Control language support (LanguageLevel 3)
Localization       dictionary     Natural language support (LanguageLevel 3)
PDL                dictionary     PDL interpreter support (LanguageLevel 3)
HWOptions          dictionary     Hardware options (LanguageLevel 3)
TABLE 3.8 Resources whose instances are implicit
CATEGORY NAME         OBJECT TYPE    DESCRIPTION
Filter                name           Filter algorithm
ColorSpaceFamily      name           Color space family
Emulator              name           Language interpreter
IODevice              string         Device parameter set
ColorRenderingType    integer        Color rendering dictionary type
FMapType              integer        Composite font mapping algorithm
FontType              integer        Font dictionary type
FormType              integer        Form dictionary type
HalftoneType          integer        Halftone dictionary type
ImageType             integer        Image dictionary type
PatternType           integer        Pattern dictionary type
FunctionType          integer        Function dictionary type (LanguageLevel 3)
ShadingType           integer        Shading dictionary type (LanguageLevel 3)
TrappingType          integer        Trapping method (LanguageLevel 3)
TABLE 3.9 Resources used in defining new resource categories
CATEGORY NAME    OBJECT TYPE    DESCRIPTION
Category         dictionary     Resource category (recursive)
Generic          any            Prototype for new categories
The Category and Generic resources are used in defining new categories of
resources. This capability is described in Section 3.9.3, “Creating Resource Cate-
gories.”
The resource operators—findresource, resourcestatus, resourceforall, define-
resource, and undefineresource—have standard behavior that is uniform across
all resource categories. This behavior is specified in the operator descriptions in
Chapter 8. For some categories, the operators have additional semantics that are
category-specific. The following sections describe the semantics of each resource
category.
Note: Except as indicated below, the PostScript language does not prescribe that a re-
source category must contain any standard instances. Some categories may be popu-
lated with predefined instances, but the set of instances is product-dependent.
Font
Instance names of the Font resource category are font names, such as Times-
Roman. The instances are font dictionaries that are suitable for use as operands to
scalefont or makefont, which produce a transformed font dictionary that can be
used to paint characters on the page.
The following special-purpose operators apply only to fonts but are otherwise
equivalent to the resource operators:
• findfont, equivalent to /Font findresource
• definefont, equivalent to /Font defineresource
• undefinefont, equivalent to /Font undefineresource
The definefont and undefinefont operators have additional font-specific seman-
tics, which are described under those operators in Chapter 8. Those semantics
also apply to defineresource and undefineresource when applied to the Font cat-
egory. findfont and definefont are available in LanguageLevel 1, even though the
general facility for named resources is a LanguageLevel 2 feature.
The font operators also maintain dictionaries of font names and Font resource
instances that are defined in VM. Those dictionaries are FontDirectory (all Font
resources in VM) and GlobalFontDirectory (only Font resources in global VM).
They are obsolete, but are provided for compatibility with existing applications.
The preferred method of enumerating all available Font resources is
(*) proc scratch /Font resourceforall
where proc is a procedure and scratch is a string used repeatedly to hold font
names. This method works for all available Font resources, whether or not they
are in VM. Normally, it is preferable to use resourcestatus to determine the avail-
ability of specific resources rather than enumerate all resources and check wheth-
er those of interest are in the list.
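As a concrete instance of the enumeration shown above, the following sketch counts the available Font resources. The name fontcount and the 128-character scratch string are assumptions made for this example.

```postscript
/fontcount 0 def
(*) {
  pop                                   % discard the font name
  /fontcount fontcount 1 add def        % count this instance
} 128 string /Font resourceforall
fontcount =                             % writes the number of fonts found
```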
When findresource or findfont loads a font from an external source into VM, it
may choose to use global VM rather than the current VM allocation mode. This
choice depends on memory management algorithms used by the interpreter. It
also depends on the font type, since certain Type 3 fonts do not work correctly
when loaded into global VM. The details of this policy are implementation-
dependent; a PostScript program should not depend on knowing what they are.
CIDFont
Instances of the CIDFont resource category (LanguageLevel 3) are dictionaries
that are suitable for use with the composefont operator to construct CID-keyed
fonts, as described in Section 5.11, “CID-Keyed Fonts.” The defineresource oper-
ator has certain category-specific semantics when applied to the CIDFont catego-
ry; furthermore, the definefont and undefinefont operators can be applied to
CIDFonts as well as fonts. For more information on the behavior of these opera-
tors, see Section 5.11.3, “CIDFont Dictionaries.”
CMap
Instances of the CMap resource category (LanguageLevel 3) are character code
mapping dictionaries that are suitable for use with the composefont operator to
construct CID-keyed fonts, as described in Section 5.11, “CID-Keyed Fonts.”
FontSet
Instances of the FontSet resource category (LanguageLevel 3) are bundles of font
definitions that are represented in the Compact Font Format (CFF) or other
multiple-font representations, as described in Section 5.8.1, “Type 2 and Type 14
Fonts (CFF and Chameleon).” Each FontSet instance contains the material from
which one or more Font instances can be constructed.
Encoding
Instances of the Encoding resource category are array objects, suitable for use as
the Encoding entry of font dictionaries (see Section 5.3, “Character Encoding”).
An encoding array usually contains 256 names, permitting it to be indexed by any
8-bit character code. An encoding array for use with composite fonts (described
in Section 5.10, “Composite Fonts”) contains integers instead of names, and can
be of any length.
There are two standard encodings that are permanently defined in VM and avail-
able by name in systemdict:
• StandardEncoding, whose value is the same as the array returned by
  /StandardEncoding /Encoding findresource
• ISOLatin1Encoding, whose value is the same as the array returned by
  /ISOLatin1Encoding /Encoding findresource
If any other encodings exist, they are available only through findresource. The
convenience operator findencoding is equivalent to /Encoding findresource.
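One common use of an Encoding resource is to re-encode an existing font. The sketch below copies a font dictionary (omitting its FID entry), installs the ISOLatin1Encoding array, and defines the result under a new name. It assumes that Helvetica is available; the name Helvetica-Latin1 is hypothetical.

```postscript
/Helvetica findfont
dup length dict begin
  { 1 index /FID ne { def } { pop pop } ifelse } forall    % copy all entries except FID
  /Encoding /ISOLatin1Encoding /Encoding findresource def  % replace the encoding vector
  currentdict
end
/Helvetica-Latin1 exch definefont pop
```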
Form
Instances of the Form resource category are form dictionaries, described in
Section 4.7, “Forms.” A form dictionary is suitable as the operand to the
execform operator to render the form on the page.
Pattern
Instances of the Pattern resource category are prototype pattern dictionaries, de-
scribed in Section 4.9, “Patterns.” A prototype pattern dictionary is suitable as the
operand to the makepattern operator, which produces a transformed pattern
dictionary; a PostScript program can then use the resulting dictionary in painting
operations by establishing a Pattern color space or by invoking the setpattern op-
erator.
ProcSet
Instances of the ProcSet resource category are procedure sets. A procedure set is a
dictionary containing named procedures or operators. Application prologs can
be organized as one or more procedure sets that are available from a library
instead of being included in-line in every document that uses them. The ProcSet
resource category provides a way to organize such a library.
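A small procedure set might be defined and used as sketched below. The names MyProcs, inch, and center are hypothetical, and the second fragment assumes that the Helvetica font is available.

```postscript
/MyProcs
2 dict dup begin
  /inch   { 72 mul } bind def            % inches to default user space units
  /center {                              % string center: show string centered on current point
    dup stringwidth pop 2 div neg 0 rmoveto show
  } bind def
end
/ProcSet defineresource pop

% A document that uses the procedure set:
/MyProcs /ProcSet findresource begin
  /Helvetica findfont 12 scalefont setfont
  4.25 inch 10 inch moveto
  (Centered title) center
end
```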
In LanguageLevel 3, there are several standard instances of the ProcSet category
that are associated with specific features of the PostScript language. These proce-
dure sets, listed in Table 3.10, contain procedures, operators, and other objects
that a PostScript program can access as part of using those features.
TABLE 3.10 Standard procedure sets in LanguageLevel 3
PROCEDURE SET     ASSOCIATED LANGUAGE FEATURE
BitmapFontInit    Incremental downloading and management of glyph bitmaps in a
                  Type 4 CIDFont (see “Type 4 CIDFonts” on page 379)
CIDInit           Building a Type 0 CIDFont (“Type 0 CIDFonts” on page 371) or a
                  CMap dictionary (Section 5.11.4, “CMap Dictionaries”)
ColorRendering    Selecting a color rendering dictionary (Section 7.1.3, “Rendering
                  Intents”)
FontSetInit       Building a FontSet resource (“FontSet Resources” on page 344)
Trapping          In-RIP trapping (Section 6.3, “In-RIP Trapping”)
ColorSpace
Instances of the ColorSpace resource category are array objects that represent ful-
ly parameterized color spaces. The first element of a color space array is a color
space family name; the remaining elements are parameters to the color space (see
Section 4.8, “Color Spaces”).
Note: The ColorSpace resource category is distinct from the ColorSpaceFamily cate-
gory, described below.

Halftone
Instances of the Halftone resource category are halftone dictionaries, suitable as
operands to the sethalftone operator (see Section 7.4, “Halftones”).
ColorRendering
Instances of the ColorRendering resource category are color rendering diction-
aries, suitable as operands to the setcolorrendering operator (see Section 7.1,
“CIE-Based Color to Device Color”).
IdiomSet
Instances of the IdiomSet resource category (LanguageLevel 3) are procedure sub-
stitution dictionaries, for use with the bind operator (see Section 3.12.1, “bind
Operator”).
InkParams and TrapParams
The LanguageLevel 3 resource categories InkParams and TrapParams are present
only in products that support in-RIP trapping (see Section 6.3, “In-RIP Trap-
ping”). Instances of InkParams are dictionaries that define trapping-related prop-
erties of device colorants; instances of TrapParams are dictionaries that define sets
of trapping parameters suitable as operands to the settrapparams operator.
OutputDevice
Instances of the OutputDevice resource category (LanguageLevel 3) are diction-
aries that describe certain capabilities of a particular page device, such as the pos-
sible page sizes or resolutions (see Section 6.4, “Output Device Dictionary”).
ControlLanguage, PDL, Localization, and HWOptions
Instances of the LanguageLevel 3 resource categories ControlLanguage, PDL,
Localization, and HWOptions provide information that is product-dependent, as
summarized below. For further details, see the PostScript Language Reference
Supplement.
• Instances of ControlLanguage are dictionaries that describe the control
  languages available in a product. A control language is a means for controlling
  product features, such as default configuration and status reporting.
• Instances of PDL are dictionaries that describe the page description language
  interpreters available in a product. This category supersedes the Emulator
  implicit resource category, because its instances provide a more complete
  description of each interpreter (or emulator).
• Instances of Localization are dictionaries that describe the natural languages
  (for example, English, Japanese, or German) supported by a product.
• Instances of HWOptions are strings that indicate the special hardware options
  that are present in this product.
Implicit Resources
For all implicit resources, the findresource operator returns the instance’s key if
the instance is defined. The resourcestatus and resourceforall operators have
their normal behavior, although the status and size values returned by
resourcestatus are meaningless. The defineresource and undefineresource
operators are ordinarily not allowed, but the ability to define new instances of
implicit resources may exist in some implementations. The mechanisms are
implementation-dependent.
The instances of the Filter category are filter names, such as ASCII85Decode and
RunLengthEncode, which are used as an operand of the filter operator to deter-
mine its behavior. Filters are described in Section 3.8.4, “Filters.”
The instances of the ColorSpaceFamily category are color space family names,
which appear as the first element of a color space array object. Some color spaces,
such as DeviceRGB, are completely determined by their family name; others, such
as CIEBasedABC, require additional parameters to describe them. Color spaces
are described in Section 4.8, “Color Spaces.”
The instances of the Emulator category are names of emulators for languages
other than PostScript that may be built into a particular implementation. Those
emulators are not a standard part of the PostScript language, but one or more of
them may be present in some products. This category has been superseded by the
PDL resource category in LanguageLevel 3.
The instances of the IODevice category are names of device parameter sets. Some
parameter sets are associated with input/output devices, from which the category
name IODevice originates. However, there are also some parameter sets that do
not correspond to physical devices. The keys for all instances of this category are
expressed as strings of the form %device%. See Section C.4, “Device Parameters.”
The instances of the ColorRenderingType, FMapType, FontType, FormType,
HalftoneType, ImageType, PatternType, FunctionType, ShadingType, and
TrappingType categories are integers that are the acceptable values for the
correspondingly named entries in various classes of special dictionaries. For
example, in LanguageLevel 3 the FMapType category includes the integers 2
through 9 as keys; if an interpreter supports additional FMapType values, the
FMapType category will also include those values as instances.
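For example, a program can ask whether a particular dictionary type is supported, or list all supported types, as in the following sketch. The results are product-dependent. Integer keys match only the template (*) and are passed to the procedure directly instead of being copied into the scratch string.

```postscript
42 /FontType resourcestatus
{ pop pop (Type 42 fonts supported) } { (Type 42 fonts not supported) } ifelse =

(*) { = } 10 string /FontType resourceforall    % writes each supported FontType value
```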
3.9.3 Creating Resource Categories
The language support for named resources is quite general. Most of it is indepen-
dent of the semantics of specific resource categories. It is occasionally useful to
create new resource categories, each containing an independent collection of
named instances. This is accomplished through a level of recursion in the re-
source machinery itself.
The resource category named Category contains all of the resource categories as
instances. The instance names are resource category names, such as Font, Form,
and Halftone. The instance values are dictionary objects containing information
about how the corresponding resource category is implemented.
A new resource category is created by defining a new instance of the Category
category. Example 3.6 creates a category named Widget.
Example 3.6
true setglobal
/Widget catdict /Category defineresource pop
false setglobal
In this example, catdict is a dictionary describing the implementation of the
Widget category. Once it is defined, instances of the Widget category can be ma-
nipulated like other categories:
/Frob1 w /Widget defineresource           % Returns w
/Frob1 /Widget findresource               % Returns w
/Frob1 /Widget resourcestatus             % Returns status size true
(*) proc scratch /Widget resourceforall   % Pushes (Frob1) on the stack, then calls proc
Here w is an instance of the Widget category whose type is whatever is appropri-
ate for widgets, and /Frob1 is the name of that instance.
It is possible to redefine existing resource categories in this way. Programs that do
this must ensure that the new definition correctly implements any special seman-
tics of the category.
Category Implementation Dictionary
The behavior of all the resource operators, such as defineresource, is determined
by entries in the resource category’s implementation dictionary. This dictionary
was supplied as an operand to defineresource when the category was created. In
the example
/Frob1 w /Widget defineresource
the defineresource operator does the following:
1. Obtains catdict, the implementation dictionary for the Widget category.
2. Executes begin on the implementation dictionary.
3. Executes the dictionary’s DefineResource entry, which is ordinarily a proce-
dure but might be an operator. When the procedure corresponding to the
DefineResource entry is called, the operand stack contains the operands that
were passed to defineresource, except that the category name (Widget in this
example) has been removed. DefineResource is expected to consume the re-
maining operands, perform whatever action is appropriate for this resource
category, and return the appropriate result.
4. Executes the end operator. If an error occurred during step 3, it also restores
the operand and dictionary stacks to their initial state.
The other resource operators—undefineresource, findresource, resourcestatus,
and resourceforall—behave the same way, with the exception that resourceforall
does not restore the stacks upon error. Aside from the steps described above, all
of the behavior of the resource operators is implemented by the corresponding
procedures in the dictionary.
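The dispatch performed by defineresource can be approximated in the language itself. The procedure below is a simplified sketch under a hypothetical name, mydefineresource. It omits the restoration of the stacks on error described in step 4, and it is not how any particular interpreter implements the operator.

```postscript
/mydefineresource {          % key instance category  mydefineresource  instance
  /Category findresource     % step 1: obtain the implementation dictionary
  begin                      % step 2: push it on the dictionary stack
    DefineResource           % step 3: execute the category's own procedure
  end                        % step 4: pop the dictionary (error recovery omitted)
} def
```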
A category implementation dictionary contains the entries listed in Table 3.11.
The dictionary may also contain other information useful to the procedures in
the dictionary. Since the dictionary is on the dictionary stack at the time those
procedures are called, the procedures can access the information conveniently.
TABLE 3.11 Entries in a category implementation dictionary
KEY                 TYPE         VALUE
DefineResource      procedure    (Required) A procedure that implements defineresource behavior.
UndefineResource    procedure    (Required) A procedure that implements undefineresource behavior.
FindResource        procedure    (Required) A procedure that implements findresource behavior. This
                                 procedure determines the policy for using global versus current VM
                                 when loading a resource from an external source.
ResourceStatus      procedure    (Required) A procedure that implements resourcestatus behavior.
ResourceForAll      procedure    (Required) A procedure that implements resourceforall behavior. This
                                 procedure should remove the category implementation dictionary from
                                 the dictionary stack before executing the procedure operand of
                                 resourceforall, and should put that dictionary back on the dictionary
                                 stack before returning. This ensures that the procedure operand is
                                 executed in the dictionary context in effect at the time
                                 resourceforall was invoked.
Category            name         (Required) The category name. This entry is inserted by
                                 defineresource when the category is defined.
InstanceType        name         (Optional) The expected type of instances of this category. If this
                                 entry is present, defineresource checks that the instance’s type, as
                                 returned by the type operator, matches it.
ResourceFileName    procedure    (Optional) A procedure that translates a resource instance name to a
                                 file name (see Section 3.9.4, “Resources as Files”).
A single dictionary provides the implementation for both local and global in-
stances of a category. The implementation must maintain the local and global
instances separately and must respect the VM allocation mode in effect at the
time each resource operator is executed. The category implementation dictionary
must be in global VM; the defineresource operator that installs it in the Category
category must be executed while in global VM allocation mode.
The interpreter assumes that the category implementation procedures will be
reasonably well behaved and will generate errors only due to circumstances not
under their control. In this respect, they are similar to the BuildChar procedure in
a Type 3 font or to the PaintProc procedure in a form or pattern, but are unlike
the arbitrary procedures invoked by operators such as forall or resourceforall.
If an error occurs in a category implementation procedure, the resource operator
makes a token attempt to restore the stacks and to provide the illusion that the
error arose from the operator itself. The intent is that the resource operators
should have the usual error behavior as viewed by a program executing them.
The purpose is not to compensate for bugs in the resource implementation pro-
cedures.
Generic Category
The preceding section describes a way to define a new resource category, but it
does not provide guidance about how the individual procedures in the category’s
dictionary should be implemented. In principle, every resource category has
complete freedom over how to organize and manage resource instances, both in
VM and in external storage.
Since different implementations have different conventions for organizing re-
source instances, especially in external storage, a program that seeks to create a
new resource category might need implementation-dependent information. To
overcome this problem, it is useful to have a generic resource implementation
that can be copied and used to define new resource categories. The Category cat-
egory contains an instance named Generic, whose value is a dictionary contain-
ing a generic resource implementation.
Example 3.7 defines the Widget resource category and is similar to Example
3.6 on page 99; however, it generates the category implementation dictionary by
copying the one belonging to the Generic category. This avoids the need to know
anything about how resource categories actually work.
Example 3.7
currentglobal                                 % Save the current VM status on the stack.
true setglobal
/Generic /Category findresource
dup length 1 add dict copy
dup /InstanceType /dicttype put
/Widget exch /Category defineresource pop
setglobal                                     % Restore the saved VM status.
The Generic resource category’s implementation dictionary does not have an
InstanceType entry; instances need not be of any particular type. The example
above makes a copy of the dictionary with space for one additional entry and in-
serts an InstanceType entry with the value dicttype. As a result, defineresource
requires that instances of the Widget category be dictionaries.
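The effect of the InstanceType entry can be observed with a sketch such as the following, which assumes that Example 3.7 has already been executed. The instance names Frob1 and Frob2 and the Size entry are hypothetical; the second definition is expected to fail with a typecheck error, which stopped intercepts.

```postscript
/Frob1 << /Size 3 >> /Widget defineresource pop     % a dictionary: accepted

mark
{ /Frob2 (not a dictionary) /Widget defineresource } stopped
{ (rejected) = } { (accepted) = } ifelse
cleartomark                                         % discard whatever was left on the stack
```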
3.9.4 Resources as Files
The PostScript language does not specify how external resources are installed,
how they are loaded, or what correspondence, if any, exists between resource
names and file names. In general, all knowledge of such things is in the category
implementation dictionary and in environment-dependent installation software.
Typically, resource instances are installed as named files, which can also be access-
ed by ordinary PostScript file operators such as file and run. There is a straight-
forward mapping from resource names to file names, though the details of this
mapping vary because of restrictions on file name syntax imposed by the under-
lying file system.
In some implementations, including many dedicated printers, the only access to
the file system is through the PostScript interpreter. In such environments, it is
important for PostScript programs to be able to access the underlying resource
files directly in order to install or remove them. Only resource installation or oth-
er system management software should do this. Page descriptions should never
attempt to access resources as files; they should use only resource operators, such
as findresource.
The implementation dictionary for a category can contain an optional entry,
ResourceFileName, which is a procedure that translates from a resource name to
a file name. If the procedure exists, a program can call it as follows:
1. Push the category implementation dictionary on the dictionary stack. The
ResourceFileName procedure requires this step in order to obtain category-
specific information, such as Category.
2. Push the instance name and a scratch string on the operand stack. The scratch
string must be long enough to accept the complete file name for the resource.
3. Execute ResourceFileName.
4. Pop the dictionary stack.
ResourceFileName builds a complete file name in the scratch string and returns
on the operand stack the substring that was used. This string can then be used as
the filename operand of file operators such as file, deletefile, status, and so on.
For example, the following program fragment obtains the file name for the Times-
Roman font:
/Font /Category findresource
begin
/Times-Roman scratch ResourceFileName
end
If a ResourceFileName procedure for a particular category and instance name ex-
ists and executes without a PostScript error, it will leave a string on the stack. If
that category maintains all of its instances as named files, this string is the name
of the file for that instance. This file name may or may not contain the %device%
prefix. Use of this file name with file operators may not succeed for a variety of
reasons, including:
• The category does not maintain all of its instances as named files.
• The operator tried to delete a file from a read-only file system.
• The operator tried to write to a file system with insufficient space.
There may be a limit on the length of a resource file name, which in turn imposes
a length limit on the instance name. The inherent limit on resource instance
names is the same as that on name objects in general (see Appendix B). By con-
vention, font names are restricted to fewer than 40 characters. This convention is
recommended for other resource categories as well. Note that the resource file
name may be longer or shorter than the resource instance name, depending on
details of the name-mapping algorithm. When calling ResourceFileName, it is
prudent to provide a scratch string at least 100 characters long.
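The four-step calling sequence can be sketched as follows. This is a minimal sketch, assuming the Font category's implementation dictionary may or may not contain the optional ResourceFileName entry and that a 100-character scratch string is long enough. The printed messages are illustrative only.

```postscript
% Sketch: obtain and test the file name for the Times-Roman font resource.
/Font /Category findresource begin        % step 1: category dictionary on the dictionary stack
  currentdict /ResourceFileName known {   % the entry is optional, so test for it first
    /Times-Roman 100 string               % step 2: instance name and scratch string
    ResourceFileName                      % step 3: leaves the file name substring
    dup status {                          % status pushes four values and true if the file exists
      pop pop pop pop
      (Resource file found: ) print =
    } {
      (No file with this name: ) print =
    } ifelse
  } {
    (No ResourceFileName procedure for this category) =
  } ifelse
end                                       % step 4: pop the dictionary stack
```

Testing with known before the call avoids an undefined error on implementations that omit the entry. The status check reflects the caveat that the returned name need not correspond to an existing file.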
Some implementations provide additional control over the behavior of
ResourceFileName; see Section C.3.6, “Resource File Location.”
A resource file contains a PostScript program that can be executed to load the re-
source instance into VM. The last action the program should take is to execute
defineresource or an equivalent operator, such as definefont, to associate the
resource instance with a category and a name. In other words, each resource file
must be self-identifying and self-defining. The resource file must be well behaved:
it must leave the stacks in their original state and it must not execute any opera-
tors (graphics operators, for instance) that are not directly related to creating the
resource instance.

For most resource categories, including Generic, the category’s FindResource
procedure executes true setglobal before executing the resource file and restores
the previous VM allocation mode afterward. As a result, the resource instance is
loaded into global VM and defineresource defines the resource instance globally,
regardless of the VM allocation mode at the time findresource is invoked. Unfor-
tunately, certain resource instances behave incorrectly if they reside in global VM.
Some means are required to defeat the automatic loading into global VM. Two
methods are currently used:
• Some implementations of the Font category’s FindResource procedure omit
  executing true setglobal before executing the font file. This causes fonts to be
  defined in the VM allocation mode in effect when findresource is invoked,
  rather than always in global VM. Details of this policy are
  implementation-dependent.
• If a particular resource instance is known not to work in global VM, the
  resource file should begin with an explicit false setglobal.
A resource file can contain header comments, as specified in Adobe Technical
Note #5001, PostScript Language Document Structuring Conventions Specification.
If there is a header comment of the form
%%VMusage: int int
then the resourcestatus operator returns the larger of the two integers as its size
result. If the %%VMusage: comment is not present, resourcestatus may not be
able to determine the VM consumption for the resource instance, in which case it
will return a size of −1.
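As an illustration of a self-identifying, self-defining resource file, the following sketch defines a ProcSet instance. The instance name HypotheticalProcs, its contents, and the two %%VMusage numbers are placeholders chosen for this example, not values taken from any real resource.

```postscript
%%VMusage: 2000 4000
% Sketch of the body of a resource file for a hypothetical ProcSet instance.
/HypotheticalProcs            % instance name (hypothetical)
<<
  /double { 2 mul } bind      % contents of the instance
>>
/ProcSet defineresource       % last action: associate the instance with a category and name
pop                           % leave the operand stack as it was found
```

The file ends with defineresource and pops the returned instance, so the stacks are left in their original state, as the text requires.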
The definition of an entire resource category—that is, an instance of the
Category category—can come from a resource file in the normal way. If any re-
source operator is presented with an unknown category name, it automatically
executes
category /Category findresource
in an attempt to cause the resource category to become defined. Only if that fails
will the resource operator generate an undefined error to report that the resource
category is unknown.

3.10 Functions
The PostScript language includes operators and procedures that take arguments
off the operand stack and put their results back on the stack. The add operator,
for example, pops two arguments, which must be numbers, and pushes the sum
of those numbers back on the stack. add could be viewed as a function with two
input values and one output value:
$$ f(x_0, x_1) = x_0 + x_1 $$
Similarly, the following procedure computes the average and the square root of
the product of two numbers:
{
2 copy add
2 div
3 1 roll mul
sqrt
}
This could be viewed as a function of two input values and two output values:
$$ f(x_0, x_1) = \frac{x_0 + x_1}{2},\ \sqrt{x_0 \times x_1} $$
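The procedure can be exercised directly to see the two-in, two-out behavior. In this sketch the name avgAndRoot and the input values 3 and 12 are chosen for illustration only.

```postscript
/avgAndRoot {          % x0 x1  avgAndRoot  average sqrt(x0*x1)
  2 copy add
  2 div
  3 1 roll mul
  sqrt
} def

3 12 avgAndRoot        % two inputs consumed, two outputs left on the stack
exch = =               % prints 7.5 (the average), then 6.0 (the square root of 36)
```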
In general, a function can take any number (m) of input values and produce any
number (n) of output values:
$$ f(x_0, \ldots, x_{m-1}) = y_0, \ldots, y_{n-1} $$
LanguageLevel 3 supports an explicit, static representation for functions, known
as function dictionaries. Functions are less general than PostScript procedures: all
the input values and all the output values are numbers, and functions have no
side effects. On the other hand, functions can be considerably more efficient than
procedures, since they entail no PostScript operator execution.
At present, there is only one use for functions in the PostScript language: they are
used to define the color values in a shading pattern (see Section 4.9.3, “Shading
Patterns,” and the shfill operator in Chapter 8). There is no operator like exec
that explicitly calls a function. Functions are also used extensively in PDF, where
there are no procedures; for more information, see the Portable Document Format
Reference Manual.


Each function definition includes a domain, the set of legal values for the input.
Some types of function also define a range, the set of legal values for the output.
Values passed to the function are clipped to the domain, and values produced by
the function are clipped to the range. For example, suppose the function
f(x) = x + 2 is defined with a domain of [−1 1]. If the function is called with the
value 6, that value is replaced with the nearest value in the defined domain, 1,
before the function is evaluated, and the result is therefore 3. Similarly, if the
function f(x0, x1) = 3 × x0 + x1 is defined with a range of [0 100], and if the values
−6 and 4 are passed to the function (and are within its domain), then the value
produced by the function, −14, is replaced with 0, the nearest value in the defined
range.
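The two clipping examples can be reproduced with ordinary procedures. This is only a model of the rule, since function dictionaries are not executed this way. The helper names max2, min2, and clipTo are assumptions made for this sketch; the language has no built-in clipping operator.

```postscript
/max2 { 2 copy lt { exch } if pop } def      % a b  max2  larger of a and b
/min2 { 2 copy gt { exch } if pop } def      % a b  min2  smaller of a and b
/clipTo { 3 1 roll max2 min2 } def           % value lo hi  clipTo  clipped value

% f(x) = x + 2 with Domain [-1 1], called with 6
6 -1 1 clipTo 2 add =                        % input is clipped to 1, so this prints 3

% f(x0, x1) = 3 * x0 + x1 with Range [0 100], called with -6 and 4
-6 3 mul 4 add 0 100 clipTo =                % result -14 is clipped, so this prints 0
```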
3.10.1 Function Dictionaries
A function dictionary specifies a function’s representation, the set of attributes
that parameterize that representation, and the additional data needed by that
representation. Three types of function are available, as indicated by the diction-
ary’s FunctionType entry:
• A sampled function (type 0) uses a table of sample values to represent the
  function. Various techniques are used to interpolate values between the sample
  values.
• An exponential interpolation function (type 2) defines a set of coefficients for an
  exponential function.
• A stitching function (type 3) is a combination of other functions, partitioned
  across a domain.
All function dictionaries share the entries listed in Table 3.12. In addition, each
type of function dictionary must include attributes appropriate to the particular
function type. The number of output values can usually be inferred from other
attributes of the function; if not (as is always the case for type 0 functions), the
Range attribute is required. The dimensionality of the function implied by the
Domain and Range attributes must be consistent with the dimensionality implied
by other attributes of the function; otherwise, a rangecheck error will occur.

TABLE 3.12 Entries common to all function dictionaries

KEY           TYPE     VALUE
FunctionType  integer  (Required) The function type:
                           0   Sampled function
                           2   Exponential interpolation function
                           3   Stitching function
Domain        array    (Required) An array of 2 × m numbers, where m is the number of
                       input values. For each i from 0 to m − 1, $\mathrm{Domain}_{2i}$
                       must be less than or equal to $\mathrm{Domain}_{2i+1}$, and the ith
                       input value, $x_i$, must lie in the interval
                       $\mathrm{Domain}_{2i} \le x_i \le \mathrm{Domain}_{2i+1}$. Input
                       values outside the declared domain are clipped to the nearest
                       boundary value.
Range         array    (Required for type 0 functions, optional otherwise; see below) An
                       array of 2 × n numbers, where n is the number of output values.
                       For each j from 0 to n − 1, $\mathrm{Range}_{2j}$ must be less than
                       or equal to $\mathrm{Range}_{2j+1}$, and the jth output value,
                       $y_j$, must lie in the interval
                       $\mathrm{Range}_{2j} \le y_j \le \mathrm{Range}_{2j+1}$. Output
                       values outside the declared range are clipped to the nearest
                       boundary value. If the Range entry is absent, no clipping is done.
Type 0 Function Dictionaries (Sampled Functions)
Type 0 function dictionaries use a sequence of sample values to provide an ap-
proximation for functions whose domains and ranges are bounded. The samples
are organized as an m-dimensional table in which each entry has n components.
Sampled functions are highly general and offer reasonably accurate repre-
sentations of arbitrary analytic functions at low expense. For example, a 1-input
sinusoidal function can be represented over the range [0 180] with an average
error of only 1 percent, using just ten samples and linear interpolation. Two-
input functions require significantly more samples, but usually not a prohibitive
number, so long as the function does not have high frequency variations.
The dimensionality of a sampled function is restricted only by implementation
limits. However, the number of samples required to represent high-dimensionality
functions multiplies rapidly unless the sampling resolution is very low. Also, the
process of multilinear interpolation becomes computationally intensive if m is
greater than 2. The multidimensional spline interpolation is even more computa-
tionally intensive.

In addition to the entries in Table 3.12, a type 0 function dictionary includes the
entries listed in Table 3.13.
TABLE 3.13 Additional entries specific to a type 0 function dictionary

KEY            TYPE            VALUE
Order          integer         (Optional) The order of interpolation between samples.
                               Allowed values are 1 and 3, specifying linear and cubic
                               spline interpolation, respectively. Default value: 1.
DataSource     string or file  (Required) A string or positionable file providing the
                               sequence of sample values that specifies the function. (A
                               file object derived from a ReusableStreamDecode filter may
                               be used here.)
BitsPerSample  integer         (Required) The number of bits used to represent each
                               component of each sample value. The number must be 1, 2,
                               4, 8, 12, 16, 24, or 32.
Encode         array           (Optional) An array of 2 × m numbers specifying the linear
                               mapping of input values into the domain of the function’s
                               sample table. Default value:
                               [0 ($\mathrm{Size}_0$ − 1) 0 ($\mathrm{Size}_1$ − 1) …].
Decode         array           (Optional) An array of 2 × n numbers specifying the linear
                               mapping of sample values into the range of values
                               appropriate for the function’s output values. Default
                               value: Same as the value of Range.
Size           array           (Required) An array of m positive integers specifying the
                               number of samples in each input dimension of the sample
                               table.
The Domain, Encode, and Size attributes determine how the function’s input
variable values are mapped into the sample table. For example, if Size is [21 31],
the default Encode array is [0 20 0 30], which maps the entire domain into the full
set of sample table entries. Other values of Encode may be used.
To explain the relationship between Domain, Encode, Size, Decode, and Range,
we use the following notation:
$$ y = \mathrm{Interpolate}(x, x_{\min}, x_{\max}, y_{\min}, y_{\max}) = (x - x_{\min}) \times \frac{y_{\max} - y_{\min}}{x_{\max} - x_{\min}} + y_{\min} $$
For a given value of x, Interpolate calculates the y value on the line defined by the
two points $(x_{\min}, y_{\min})$ and $(x_{\max}, y_{\max})$.

When a sampled function is called, each input value $x_i$, for $0 \le i < m$, is clipped to
the domain:
$$ x'_i = \min(\max(x_i, \mathrm{Domain}_{2i}), \mathrm{Domain}_{2i+1}) $$
That value is encoded:
$$ e_i = \mathrm{Interpolate}(x'_i, \mathrm{Domain}_{2i}, \mathrm{Domain}_{2i+1}, \mathrm{Encode}_{2i}, \mathrm{Encode}_{2i+1}) $$
That value is clipped to the size of the sample table in that dimension:
$$ e'_i = \min(\max(e_i, 0), \mathrm{Size}_i - 1) $$
The encoded input values are real numbers, not restricted to integers. Interpolation
is then used to determine output values from the nearest surrounding values
in the sample table. Each output value $r_j$, for $0 \le j < n$, is then decoded:
$$ r'_j = \mathrm{Interpolate}(r_j, 0, 2^{\mathrm{BitsPerSample}} - 1, \mathrm{Decode}_{2j}, \mathrm{Decode}_{2j+1}) $$
Finally, each decoded value is clipped to the range:
$$ y_j = \min(\max(r'_j, \mathrm{Range}_{2j}), \mathrm{Range}_{2j+1}) $$
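The clip, encode, interpolate, decode, clip sequence can be modeled with ordinary procedures. The sketch below is not how an interpreter evaluates function dictionaries. It assumes one input, one output, Order 1, 8-bit samples held in a string, and explicit Encode and Decode entries. The names max2, min2, clipTo, evalSampled1D, and ramp are hypothetical helpers for this example.

```postscript
/max2 { 2 copy lt { exch } if pop } def
/min2 { 2 copy gt { exch } if pop } def
/clipTo { 3 1 roll max2 min2 } def               % value lo hi  clipTo  clipped value
/Interpolate {                                   % x xmin xmax ymin ymax  Interpolate  y
  5 dict begin
    /ymax exch def  /ymin exch def
    /xmax exch def  /xmin exch def  /x exch def
    x xmin sub  ymax ymin sub mul  xmax xmin sub div  ymin add
  end
} def

/evalSampled1D {                                 % x funcdict  evalSampled1D  y
  begin
    Domain 0 get Domain 1 get clipTo                                   % x'
    Domain 0 get Domain 1 get Encode 0 get Encode 1 get Interpolate    % e
    0 Size 0 get 1 sub clipTo                                          % e'
    dup floor cvi                                                      % e' i0
    dup DataSource exch get                                            % e' i0 s0
    1 index 1 add Size 0 get 1 sub min2 DataSource exch get            % e' i0 s0 s1
    4 2 roll sub                                                       % s0 s1 frac
    3 1 roll 1 index sub                                               % frac s0 (s1-s0)
    3 -1 roll mul add                                                  % r (linear interpolation)
    0 2 BitsPerSample exp 1 sub Decode 0 get Decode 1 get Interpolate  % r'
    Range 0 get Range 1 get clipTo                                     % y
  end
} def

/ramp <<
  /FunctionType 0  /Domain [0 1]  /Size [3]  /Encode [0 2]
  /BitsPerSample 8  /Range [0 1]  /Decode [0 1]
  /DataSource <0080FF>
>> def

0.25 ramp evalSampled1D =     % halfway between samples 0 and 128, so about 64/255
5 ramp evalSampled1D =        % input is clipped to 1, so the result equals 1
```

Local variables are kept in a temporary dictionary inside Interpolate so that nothing is written into the function dictionary itself.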
Sample data is represented as a stream of unsigned 8-bit bytes (integers in the
range 0 to 255). The bytes constitute a continuous bit stream, with the high-order
bit of each byte first. Each sample value is represented as a sequence of
BitsPerSample bits. Successive values are adjacent in the bit stream; there is no
padding at byte boundaries.
For a function with multidimensional input (more than one input variable), the
sample values in the first dimension vary fastest, and the values in the last dimen-
sion vary slowest. For example, for a function f(a, b, c), where a, b, and c vary
from 0 to 9 in steps of 1, the sample values would appear in this order: f(0, 0, 0),
f(1, 0, 0), …, f(9, 0, 0), f(0, 1, 0), f(1, 1, 0), …, f(9, 1, 0), f(0, 2, 0), f(1, 2, 0), …,
f(9, 9, 0), f(0, 0, 1), f(1, 0, 1), and so on.
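The ordering rule for this example amounts to a simple index computation. In the sketch below, sampleIndex is a hypothetical helper that assumes each of the three inputs takes exactly the ten values 0 through 9.

```postscript
% Position of f(a, b, c) in the sample sequence when a, b, and c each
% take the 10 values 0 through 9 (first dimension varies fastest).
/sampleIndex {                % a b c  sampleIndex  index
  100 mul exch 10 mul add add % index = a + 10*b + 100*c
} def

1 0 0 sampleIndex =           % prints 1
9 9 0 sampleIndex =           % prints 99
0 0 1 sampleIndex =           % prints 100
```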
For a function with multidimensional output (more than one output value), the
values are stored in the same order as Range.
The DataSource string or file must be long enough to contain the entire sample
array, as indicated by Size, Range, and BitsPerSample; otherwise, a rangecheck

error will occur. If DataSource is a file, the sample data begins at file position 0.
The operators that use the function will reposition this file at unpredictable
times; a PostScript program should not attempt to access the same file. A
ReusableStreamDecode filter is required if in-line data or a subfile is to be used as
data for a sampled function.
Example 3.8 illustrates a sampled function with 4-bit samples in an array con-
taining 21 columns and 31 rows. The function takes two arguments, x and y, in
the domain [−1 1], and returns one value, z, in that same range.
Example 3.8
<< /FunctionType 0
   /Domain [-1 1 -1 1]
   /Size [21 31]
   /Encode [0 20 0 30]
   /BitsPerSample 4
   /Range [-1 1]
   /Decode [-1 1]
   /DataSource < … >
>>
The x argument is linearly transformed by the encoding to the domain [0 20] and
the y argument to the domain [0 30]. Using bilinear interpolation between sam-
ple points, the function computes a value for z, which (because BitsPerSample is
4) will be in the range [0 15], and the decoding transforms z to a number in the
range [−1 1] for the result. The sample array is stored in a string of 326 bytes, cal-
culated as follows (rounded up):
326 bytes = 31 rows × 21 samples/row × 4 bits/sample ÷ 8 bits/byte
The first byte contains the sample for the point (−1, −1) in the high-order 4 bits
and the sample for the point (−0.9, −1) in the low-order 4 bits.
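The byte count and the packing of two 4-bit samples per byte can be checked with a short sketch. The procedure name sample4 and the two-byte test string are made up for illustration. The sketch handles only the BitsPerSample 4 case.

```postscript
% Bytes needed for Example 3.8: 31 rows x 21 samples x 4 bits, rounded up.
31 21 mul 4 mul 7 add 8 idiv =            % prints 326

% Fetch 4-bit sample number k (counting from 0) from a packed string.
/sample4 {                                % string k  sample4  value
  dup 2 idiv                              % string k byteindex
  3 -1 roll exch get                      % k byte
  exch 2 mod 0 eq
  { -4 bitshift }                         % even k: high-order 4 bits
  { 15 and }                              % odd k: low-order 4 bits
  ifelse
} def

<A3 7F> 0 sample4 =                       % prints 10
<A3 7F> 1 sample4 =                       % prints 3
<A3 7F> 3 sample4 =                       % prints 15
```

Because the first dimension varies fastest, the sample for column i and row j of the 21-column table is sample number j × 21 + i.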
The Decode entry can be used creatively to increase the accuracy of encoded
samples corresponding to certain values in the range. For example, if the desired
range of the function is [−1 1] and BitsPerSample is 4, the usual value of Decode
would be [−1 1] and the sample values would be integers in the interval [0 15] (as
shown in Figure 3.1). But if these values were used, the midpoint of the range (0)
would not be represented exactly by any sample value, since it would fall halfway
between 7 and 8. On the other hand, if the Decode array were [−1 +1.1428571]
(or more precisely, [−1 16 14 div]) and the sample values supplied were in the
interval [0 14], then the desired effective range of [−1 1] would be achieved, and the
range value 0 would be represented by the sample value 7.
[Figure: two plots of Range (−1 to +1) against Samples (0 to 15), one for
/Decode [−1 1] and one for /Decode [−1 1.1429].]
FIGURE 3.1 Mapping with the Decode array
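The effect of the two Decode arrays on 4-bit samples can be checked numerically. In this sketch, decode4 is a hypothetical helper that applies the decoding formula with BitsPerSample fixed at 4, so the largest sample value is 15.

```postscript
/decode4 {                       % sample dmin dmax  decode4  value
  1 index sub                    % sample dmin (dmax - dmin)
  3 -1 roll mul 15 div add       % dmin + sample * (dmax - dmin) / 15
} def

7 -1 1 decode4 =                 % -1/15, just below 0
8 -1 1 decode4 =                 % +1/15, just above 0
7 -1 16 14 div decode4 =         % 0, within floating-point rounding
14 -1 16 14 div decode4 =        % 1, within floating-point rounding
```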
The Size value for an input dimension can be 1, in which case all input values in
that dimension will be mapped to the single allowed value. If Size is less than 4,
cubic spline interpolation is not possible and Order 3 will be ignored if specified.
Type 2 Function Dictionary (Exponential Interpolation Functions)
Type 2 function dictionaries include a set of parameters that define an exponen-
tial interpolation of one input value and n output values:
$$ f(x) = y_0, \ldots, y_{n-1} $$
In addition to the entries in Table 3.12 on page 108, a type 2 function dictionary
includes the entries listed in Table 3.14.
Values of Domain must constrain x in such a way that if N is not an integer, all
values of x must be greater than or equal to 0, and if N is negative, no value of x
may be 0.
For typical use as an interpolation function, Domain will be declared as [0 1], and
N will be a number greater than 0. The Range parameter is optional and can be
used to clip the output to a desired range.

TABLE 3.14 Additional entries specific to a type 2 function dictionary

KEY  TYPE    VALUE
C0   array   (Optional) An array of n numbers defining the function result when x = 0
             (hence the “0” in the name). Default value: [0].
C1   array   (Optional) An array of n numbers defining the function result when x = 1
             (hence the “1” in the name). Default value: [1].
N    number  (Required) The interpolation exponent. Each input value x will return n
             values, given by
             $y_j = \mathrm{C0}_j + x^{N} \times (\mathrm{C1}_j - \mathrm{C0}_j)$,
             for $0 \le j < n$.
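The formula for N can be modeled with a procedure that returns the n output values as an array. This sketch assumes that C0 and C1 are both present, and it omits Domain and Range clipping. The names evalExponential and redToBlue are hypothetical.

```postscript
/evalExponential {             % x funcdict  evalExponential  [y0 ... yn-1]
  begin
    N exp                      % x raised to the power N
    [ exch                     % mark xN
      0 1 C0 length 1 sub {    % mark y0 ... xN j
        dup C0 exch get        % ... xN j c0
        exch C1 exch get       % ... xN c0 c1
        1 index sub            % ... xN c0 (c1-c0)
        2 index mul add        % ... xN yj
        exch                   % ... yj xN
      } for
      pop                      % discard xN
    ]
  end
} def

/redToBlue << /FunctionType 2 /Domain [0 1] /C0 [1 0 0] /C1 [0 0 1] /N 1 >> def
0.25 redToBlue evalExponential ==     % an array holding 0.75, 0, and 0.25
```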
Type 3 Function Dictionaries (Stitching Functions)
Type 3 function dictionaries define a “stitching” of the subdomains of several
1-input functions to produce a single new 1-input function. Since the resulting
stitching function is a 1-input function, the domain is given by a two-element
array, [Domain0 Domain1]. This domain is partitioned into k subdomains, as in-
dicated by the dictionary’s Bounds entry, which is an array of k − 1 numbers that
obey the following inequality:
$$ \mathrm{Domain}_0 < \mathrm{Bounds}_0 < \mathrm{Bounds}_1 < \cdots < \mathrm{Bounds}_{k-2} < \mathrm{Domain}_1 $$
The value of the Functions entry is an array of k functions. The first function
applies to x values in the first subdomain, Domain0 ≤ x < Bounds0; the second
function applies to x values in the second subdomain, Bounds0 ≤ x < Bounds1;
and so on. The last function applies to x values in the last subdomain, which in-
cludes the upper bound: Boundsk−2 ≤ x ≤ Domain1.
The Encode array contains 2 × k numbers. A value x from the ith subdomain is
encoded as follows:
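$$ x' = \mathrm{Interpolate}(x, \mathrm{Bounds}_{i-1}, \mathrm{Bounds}_i, \mathrm{Encode}_{2i}, \mathrm{Encode}_{2i+1}) $$
for $0 \le i < k$. In this equation, $\mathrm{Bounds}_{-1}$ means $\mathrm{Domain}_0$, and
$\mathrm{Bounds}_{k-1}$ means $\mathrm{Domain}_1$.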
The value of k may be 1, in which case the Bounds array is empty and the single
item in the Functions array applies to all x values, Domain0 ≤ x ≤ Domain1.

In addition to the entries in Table 3.12 on page 108, a type 3 function dictionary
includes the entries listed in Table 3.15.
TABLE 3.15 Additional entries specific to a type 3 function dictionary

KEY        TYPE   VALUE
Functions  array  (Required) An array of k 1-input functions making up the stitching
                  function. The output dimensionality of all functions must be the
                  same, and compatible with the value of Range if Range is present.
Bounds     array  (Required) An array of k − 1 numbers that, in combination with
                  Domain, define the intervals to which each function from the
                  Functions array applies. Bounds elements must be in order of
                  increasing value, and each value must be within the limits specified
                  by Domain.
Encode     array  (Required) An array of 2 × k numbers that, taken in pairs, map each
                  subset of the domain defined by Domain and the Bounds array to the
                  domain of the corresponding function.
Domain must be of size 2 (that is, m = 1). Note that Domain0 must be strictly less
than Domain1 unless k = 1.
The stitching function is designed to make it easy to combine several functions to
be used within one shading pattern, over different parts of the shading’s domain.
The same effect could be achieved by creating separate shading dictionaries for
each of the functions, with adjacent domains. However, since each shading would
have similar parameters, and because the overall effect is one shading, it is more
convenient to have a single shading with multiple function definitions.
Also, function type 3 provides a general mechanism for inverting the domains of
1-input functions. For example, consider a function f with a Domain of [0 1], and
a stitching function g with a Domain of [0 1], a Functions array containing f, and
an Encode array of [1 0]. In effect, g(x) = f(1 − x).
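The subdomain selection and encoding step of a stitching function can be modeled as follows. This sketch computes only the index i of the selected function and the encoded input x′; it does not call the functions themselves. It assumes x has already been clipped to Domain. The name stitchEncode and the dictionaries shown are hypothetical.

```postscript
/stitchEncode {                                       % x funcdict  stitchEncode  i x'
  begin
    dup 0 Bounds { 2 index le { 1 add } if } forall   % x x i  (i = number of Bounds <= x)
    exch pop                                          % x i
    dup 0 eq { Domain 0 get } { Bounds 1 index 1 sub get } ifelse            % x i lower
    1 index Bounds length eq { Domain 1 get } { Bounds 2 index get } ifelse  % x i lower upper
    Encode 3 index 2 mul get                          % x i lower upper e0
    Encode 4 index 2 mul 1 add get                    % x i lower upper e0 e1
    6 dict begin                                      % temporary dictionary for local names
      /e1 exch def  /e0 exch def  /upper exch def
      /lower exch def  /i exch def  /x exch def
      i
      x lower sub  e1 e0 sub mul  upper lower sub div  e0 add   % Interpolate
    end
  end
} def

% Inversion: one subdomain with Encode [1 0], so g(x) = f(1 - x)
0.25 << /Domain [0 1] /Bounds [] /Encode [1 0] >> stitchEncode
= =                                                   % prints 0.75 (x'), then 0 (i)

% Two subdomains split at 0.5, each mapped onto [0 1]
0.75 << /Domain [0 1] /Bounds [0.5] /Encode [0 1 0 1] >> stitchEncode
= =                                                   % prints 0.5 (x'), then 1 (i)
```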
3.11 Errors
Various sorts of errors can occur during execution of a PostScript program. Some
errors are detected by the PostScript interpreter, such as overflow of one of the in-
terpreter’s stacks. Others are detected during execution of the built-in operators,
such as occurrence of the wrong type of operand.

Errors are handled in a uniform fashion that is under the control of the Post-
Script program. Each error is associated with a name, such as stackoverflow or
typecheck. Each error name appears as a key in a special dictionary called
errordict and is associated with a value that is the handler for that error. The
complete set of error names appears in Section 8.1, “Operator Summary.”
3.11.1 Error Initiation
When an error occurs, the interpreter does the following:
1. Restores the operand stack to the state it was in when it began executing the
current object.
2. Pushes that object on the operand stack.
3. Looks up the error’s name in errordict and executes the associated value,
which is the error handler for that error.
This is everything the interpreter itself does in response to an error. The error
handler in errordict is responsible for all other actions. A PostScript program can
modify error behavior by defining its own error-handling procedures and associ-
ating them with the names in errordict.
The interrupt and timeout errors, which are initiated by events external to the
PostScript interpreter, are treated specially. The interpreter merely executes
interrupt or timeout from errordict, sandwiched between execution of two ob-
jects being interpreted in normal sequence. It does not push the object being exe-
cuted, nor does it alter the operand stack in any other way. In other words, it
omits steps 1 and 2 above.
3.11.2 Error Handling
The errordict dictionary present in the initial state of VM provides standard
handlers for all errors. However, errordict is a writeable dictionary; a program
can replace individual error handlers selectively. errordict is in local VM, so
changes are subject to save and restore; see Section 3.7, “Memory Management.”
The default error-handling procedures all operate in a standard way. They record
information about the error in a special dictionary named $error, set the VM
allocation mode to local, and invoke the stop operator. They do not print anything
or generate any text messages to %stdout or %stderr.
Execution of stop exits the innermost enclosing context established by the
stopped operator. Assuming the user program has not invoked stopped, inter-
pretation continues in the job server, which invoked the user program with
stopped.
As part of error recovery, the job server executes the name handleerror from
errordict. The default handleerror procedure accesses the error information in
the $error dictionary and reports the error in an installation-dependent fashion.
In some environments, handleerror simply writes a text message to the standard
output file. In other environments, it invokes more elaborate error reporting
mechanisms.
After an error occurs and one of the default error-handling procedures is exe-
cuted, $error contains the entries shown in Table 3.16.
TABLE 3.16 Entries in the $error dictionary

KEY           TYPE           VALUE
newerror      boolean        A flag that is set to true to indicate that an error has
                             occurred. handleerror sets it to false.
errorname     name           The name of the error that occurred.
command       any            The operator or other object being executed by the
                             interpreter at the time the error occurred.
errorinfo     array or null  (LanguageLevel 2) If the error arose from an operator that
                             takes a parameter dictionary as an operand (such as
                             setpagedevice or setdevparams), this array contains the key
                             and value of the incorrect parameter. (If a required entry
                             was missing, this array contains the expected key with a
                             null value.) handleerror sets errorinfo to null.
ostack        array          A snapshot of the entire operand stack immediately before
                             the error, stored as if by the astore operator.
estack        array          A snapshot of the execution stack, stored as if by the
                             execstack operator.
dstack        array          A snapshot of the dictionary stack, stored as if by the
                             dictstack operator.
recordstacks  boolean        (LanguageLevel 2) A flag that controls whether the standard
                             error handlers record the ostack, estack, and dstack
                             snapshots. Default value: true.
binary        boolean        (LanguageLevel 2) A flag that controls the format of error
                             reports produced by the standard handleerror procedure.
                             false produces a text message; true produces a binary object
                             sequence (see Section 3.14.6, “Structured Output”).
                             Default value: false.
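A program can catch an error with stopped and then read the entries recorded in $error. The following is a minimal sketch, assuming the default error handlers are still installed. The division by zero is a deliberately provoked error. Clearing newerror afterward is an illustrative cleanup step, not a requirement stated above.

```postscript
mark                                    % remember the operand stack depth
{ 1 0 div } stopped {                   % the default handler records the error, then stops
  $error /errorname get ==              % the name undefinedresult
  $error /command get ==                % the object being executed (the div operator)
  $error /newerror false put            % illustrative cleanup: treat the error as handled
} if
cleartomark                             % discard whatever the failed operator left behind
```

The mark and cleartomark pair avoids depending on exactly how many objects are left on the operand stack after the error.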
A program that wishes to modify the behavior of error handling can do so in one
of two ways:
• It can change the way errors are reported simply by redefining handleerror in
  errordict. For example, a revised error handler might report more information
  about the context of the error, or it might produce a printed page containing
  the error information instead of reporting it to the standard output file.
• It can change the way errors are invoked by redefining the individual error
  names in errordict. There is no restriction on what an error-handling procedure
  can do. For example, in an interactive environment, an error handler
  might invoke a debugging facility that would enable the user to examine or
  alter the execution environment and perhaps resume execution.
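The second approach can be sketched by replacing a single handler in errordict. This example is illustrative only. It assumes the error comes from a two-operand operator such as div, and it substitutes the result 0 so that execution resumes. A real handler would need to inspect the failing object before deciding what to discard.

```postscript
save                                          % errordict is in local VM, so restore undoes the change
errordict /undefinedresult {
  % On entry the operand stack holds the operands as they were before the
  % failing operator ran, with the failing object pushed on top.
  pop pop pop 0                               % assumes a two-operand operator such as div
} put
1 0 div =                                     % prints 0; execution continues after div
restore                                       % reinstate the previous handler
```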
3.12 Early Name Binding
Normally, when the PostScript language scanner encounters an executable name
in the program being scanned, it simply produces an executable name object; it
does not look up the value of the name. It looks up the name only when the name
object is executed by the interpreter. The lookup occurs in the dictionaries that
are on the dictionary stack at the time of execution.
A name object contained in a procedure is looked up each time the procedure is
executed. For example, given the definition
/average {add 2 div} def
the names add and div are looked up, yielding operators to be executed, every
time the average procedure is invoked.
This so-called late binding of names is an important feature of the PostScript lan-
guage. However, there are situations in which early binding is advantageous.

There are two facilities for looking up the values of names before execution: the
bind operator and the immediately evaluated name.
3.12.1 bind Operator
The bind operator takes a procedure operand and returns a possibly modified
procedure. There are two kinds of modification: operator substitution and idiom
recognition.
Operator Substitution
The bind operator first systematically replaces names with operators in a proce-
dure. For each executable name whose value is an operator (not an array, pro-
cedure, or other type), it replaces the name with the operator object. This lookup
occurs in the dictionaries that are on the dictionary stack at the time bind is exe-
cuted. The effect of bind applies not only to the procedure being bound but to all
subsidiary procedures (executable arrays or executable packed arrays) contained
within it, nested to arbitrary depth.
When the interpreter subsequently executes this procedure, it encounters the
operator objects, not the names of operators. For example, if the average proce-
dure has been defined as
/average {add 2 div} bind def
then during the execution of average, the interpreter executes the add and div
operators directly, without looking up the names add and div.
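The difference between late and early binding becomes visible when an operator name is redefined after the procedures are created. In this sketch the names averageLate and averageEarly are hypothetical, and the redefinition of add exists only to show the effect.

```postscript
/averageLate  { add 2 div } def          % names are looked up at each execution
/averageEarly { add 2 div } bind def     % add and div are replaced by operator objects

4 8 averageLate  =                       % prints 6.0
4 8 averageEarly =                       % prints 6.0

/add { sub } def                         % hypothetical redefinition, for illustration only
4 8 averageLate  =                       % now computes (4 - 8) / 2: prints -2.0
4 8 averageEarly =                       % unaffected: prints 6.0
currentdict /add undef                   % remove the redefinition again
```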
There are two main benefits to using bind:
• A procedure that has been bound will execute the sequence of operators that
  were intended when the procedure was defined, even if one or more of the
  operator names have been redefined in the meantime. This benefit is mainly of
  interest in procedures that are part of the PostScript implementation, such as
  findfont and =. Those procedures are expected to behave correctly and
  uniformly, regardless of how a user program may have altered its name
  environment.
• A bound procedure executes somewhat faster than one that has not been
  bound, since the interpreter need not look up the operator names each time,
  but can execute the operators directly. This benefit is of interest in most
  PostScript programs, particularly in the prologs of page descriptions. It is
  worthwhile to apply bind to any procedure that will be executed more than a few
  times.
It is important to understand that bind replaces only those names whose values
are operators at the time bind is executed. Names whose values are of other types,
particularly procedures, are not disturbed. If an operator name has been rede-
fined in some dictionary above systemdict on the dictionary stack before the exe-
cution of bind, occurrences of that name in the procedure will not be replaced.
Note: Certain standard language features, such as findfont, are implemented as
built-in procedures rather than as operators. Also, certain names, such as true, false,
and null, are associated directly with literal values in systemdict. Occurrences of such
names in a procedure are not altered by bind.
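Which names bind leaves alone can be seen by inspecting the elements of a bound procedure. In this sketch, half is a hypothetical user procedure. The bound procedure is only inspected, never executed.

```postscript
/half { 2 div } def                      % a user procedure, not an operator
{ true half add } bind                   % bind replaces only names whose values are operators
dup 0 get type ==                        % nametype: true is a literal value in systemdict
dup 1 get type ==                        % nametype: half is a procedure
    2 get type ==                        % operatortype: add was replaced
```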
Idiom Recognition
In LanguageLevel 3, the bind operator performs an additional task, known as
idiom recognition, following the replacement of names in the bound procedure
with operators. The goal of idiom recognition is to replace certain procedures
(“idioms”) with other procedures, typically ones that have equivalent behavior
but produce better-quality results or execute more efficiently. Performing such
substitution on procedures in an application’s prolog can take advantage of new
language features without changing the application.
The idioms and their replacements are stored as instances of the IdiomSet re-
source category. An IdiomSet instance is a procedure substitution dictionary,
which typically contains idioms for a particular application’s prolog. The keys in
this dictionary are arbitrary. Each value in this dictionary is an array containing
two procedures, a template procedure and a substitute procedure.
The bind operator first tests the value of the user parameter IdiomRecognition to
see whether idiom recognition is enabled. If so, the bound procedure is compared
to every template procedure in every IdiomSet instance. If a match is found, bind
returns the associated substitute procedure; otherwise, it returns the bound pro-
cedure.
Two arrays or procedures are considered to match if corresponding elements
either are equal (in the sense of the eq operator) or are both arrays whose
corresponding elements match in turn. The objects’ attributes are disregarded during
this comparison, just as they are by eq. Nested arrays or procedures are compared
to a maximum depth of ten levels.
If substitutions may have an undesirable effect, idiom recognition can be disabled
by setting the value of the user parameter IdiomRecognition to false before in-
voking the bind operator. For example, IdiomRecognition should be set to false
during the construction of instances of the IdiomSet resource category, so that
the template and substitute procedures are not themselves recognized as idioms.
Example 3.9 demonstrates how to construct an instance of the IdiomSet resource
category.
Example 3.9
% Temporarily turn off idiom recognition so that bind does not change our template.
currentuserparams /IdiomRecognition get          % Save current value on stack
<</IdiomRecognition false>> setuserparams

% Define an IdiomSet resource named AdobeWinDriver containing a single substitution.
/AdobeWinDriver
<<
  /snap                                          % Name of this particular idiom (any name)
  [ % The template procedure.
    % This is a common method in LanguageLevel 1 for aligning points
    % on a grid in device space.
    { transform
      0.25 sub round 0.25 add exch
      0.25 sub round 0.25 add exch
      itransform
    } bind
    % The substitute procedure.
    % This procedure does not change the coordinates.
    % Assume that setstrokeadjust is on.
    { } bind
  ]
>>
/IdiomSet defineresource pop

<</IdiomRecognition 3 -1 roll>> setuserparams    % Return idiom recognition to its previous state
% If the restored value was true, bind will now replace occurrences of the template
% procedure with the substitute procedure.

121
3 . 1 2
Early Name Binding
The template and substitute procedures should be bound explicitly during the
definition of the IdiomSet instance, since no automatic binding occurs on either
of these procedures during idiom recognition. The comparison during idiom
recognition occurs after the candidate procedure is bound; a successful match de-
pends on the template also being bound. Generally, the substitute procedure
should be bound, unless lookup of operator names during each execution of the
substitute procedure is specifically desired.
Instances of the IdiomSet resource category reside in VM, either local or global; if
local, they are subject to the save and restore operators. The bind operator
follows the usual rules about visibility of resources according to the current VM
allocation mode. That is, if the current VM allocation mode is global, only glo-
bally defined instances of IdiomSet are considered, whereas if the current alloca-
tion mode is local, both locally and globally defined instances are considered.
Additionally, substitution will not occur if the candidate procedure is in global
VM but the proposed substitute procedure is in local VM.
Multiple instances of the IdiomSet resource category may contain identical
template procedures, but only one will be in effect when idiom recognition is
enabled. The instance that takes precedence is not predictable.
As mentioned earlier, idiom recognition is performed by matching the template
procedures in the IdiomSet resource instances. This is unlike all other resource
categories, whose instances are selected according to their keys. This matching by
value occurs only for IdiomSet instances that are defined in VM; bind does not
consider instances that are not in VM but only in external storage.
To ensure that the instances in VM are consistent with the external ones, the in-
terpreter automatically invokes findresource to load external IdiomSet instances
into VM at the beginning of each job and at certain other times. If a PostScript
program installs an external IdiomSet instance, it should then execute
undefineresource to ensure that any existing instance of IdiomSet in VM with the
same key is removed and replaced by the external instance.
3.12.2 Immediately Evaluated Names
LanguageLevels 2 and 3, as well as most LanguageLevel 1 implementations (see
Appendix A), include a syntax feature called immediately evaluated names. When
the PostScript language scanner encounters a token of the form //name (a name
preceded by two slashes with no intervening spaces), it immediately looks up the
name and substitutes the corresponding value. This lookup occurs in the diction-
aries on the dictionary stack at the time the scanner encounters the token. If it
cannot find the name, an undefined error occurs.
The substitution occurs immediately—even inside an executable array delimited
by { and }, where execution is deferred. Note that this process is a substitution and
not an execution; that is, the name’s value is not executed, but rather is substituted
for the name itself, just as if the load operator were applied to the name.
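The substitution can be observed directly. The following is a minimal sketch; the names x, p, and result are hypothetical and serve only to illustrate the scan-time lookup.

```postscript
/x 10 def
% The scanner replaces //x with the integer 10 while it is still
% scanning the braces, before the procedure is ever executed.
/p { //x } def
/x 20 def        % redefining x afterward does not affect p
/result p def    % p pushes the integer captured at scan time
result =         % prints 10
```

Had the procedure been written as { x }, the name would be looked up at each execution and the result would be 20.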
The most common use of immediately evaluated names is to perform early bind-
ing of objects (other than operators) in procedure definitions. The bind operator,
described in the previous section, performs early binding of operators; binding
objects of other types requires the explicit use of immediately evaluated names.
Example 3.10 illustrates the use of an immediately evaluated name to bind a ref-
erence to a dictionary.
Example 3.10
/mydict << … >> def
/proc
{
//mydict begin

end
} bind def
In the definition of proc, //mydict is an immediately evaluated name. At the mo-
ment the scanner encounters the name, it substitutes the name’s current value,
which is the dictionary defined earlier in the example. The first element of the
executable array proc is a dictionary object, not a name object. When proc is exe-
cuted, it will access that dictionary, even if in the meantime mydict has been rede-
fined or the definition has been removed.
Another use of immediately evaluated names is to refer directly to permanent ob-
jects: standard dictionaries, such as systemdict, and constant literal objects, such
as the values of true, false, and null. On the other hand, it does not make sense to
treat the names of variables as immediately evaluated names. Doing so would
cause a procedure to be irrevocably bound to particular values of those variables.
A word of caution: Indiscriminate use of immediately evaluated names may
change the behavior of a program. As discussed in Section 3.5, “Execution,” the
behavior of a procedure differs depending on whether the interpreter encounters
it directly or as the result of executing some other object (a name or an operator).
Execution of the program fragments
{… b …}
{… //b …}
will have different effects if the value of the name b is a procedure. So it is inad-
visable to treat the names of operators as immediately evaluated names. A pro-
gram that does so will malfunction in an environment in which some operators
have been redefined as procedures. This is why bind applies only to names whose
values are operators, not procedures or other types.
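The difference between the two fragments can be sketched as follows. The names b, viaName, and viaValue are hypothetical, chosen only for this illustration.

```postscript
/b { 1 2 add } def
/viaName  { b } def      % the element is the executable name b
/viaValue { //b } def    % the element is the procedure object itself

% Executing the name looks b up and runs the procedure.
viaName type =           % prints integertype (the result 3)
% A procedure encountered directly inside another procedure is
% pushed on the operand stack instead of being executed.
viaValue type =          % prints arraytype
viaValue exec =          % an explicit exec runs it: prints 3
```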
3.13 Filtered Files Details
LanguageLevels 2 and 3 define a special kind of file called a filter, which reads or
writes an underlying file and transforms the data in some way. Filters are intro-
duced in Section 3.8.4, “Filters.” This section describes the semantics of filters in
more detail. It includes information about:
• The use of files, procedures, and strings as data sources and targets
• End-of-data conventions
• Details of individual filters
• Specifications of encoding algorithms for some filters
All features described in this section are LanguageLevel 2 features except for those
labeled as LanguageLevel 3.
3.13.1 Data Sources and Targets
As stated in Section 3.8.4, “Filters,” there are two kinds of filters: decoding filters
and encoding filters. A decoding filter is an input file that reads from an underly-
ing data source and produces transformed data as it is read. An encoding filter is
an output file that takes the data written to it and writes transformed data to an
underlying data target. Data sources and data targets may be files, procedures, or
strings.
Files
A file is the most common data source or target for a filter. A file used as a data
source must be an input file, and one used as a data target must be an output file;
otherwise, an invalidaccess error occurs.
If a file is a data source for a decoding filter, the filter reads from it as necessary to
satisfy demands on the filter, until either the filter reaches its end-of-data (EOD)
condition or the data source reaches end-of-file. If a file is a data target for an en-
coding filter, the filter writes to it as necessary to dispose of data that has been
written to the filter and transformed.
Closing a filter file does not close the underlying file, unless explicitly directed by
the CloseSource or CloseTarget filter parameter (LanguageLevel 3). A program
typically creates a decoding filter to process data embedded in the program file it-
self—the one designated by currentfile. When the filter reaches EOD, execution
of the underlying file resumes. Similarly, a program can embed the output of an
encoding filter in the middle of an arbitrary data stream being written to the un-
derlying output file.
Once a program has begun reading from or writing to a filter, it should not
attempt to access the underlying file in any way until the filter has been closed.
Doing so could interfere with the operation of the filter and leave the underlying
file in an unpredictable state. However, it is safe to access the underlying file after
execution of filter but before the first read or write of the filter file, except in cer-
tain uses of the ReusableStreamDecode filter. The method for establishing a filter
pipeline in Example 3.5 on page 84 depends on this.
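A small sketch of a decoding filter whose data source is the program file itself. The names buf and decoded are hypothetical. Note that the encoded data must begin immediately after the token that triggers the first read, so no comment may follow readstring on its line; the buffer is deliberately larger than the data so that the read runs up to EOD.

```postscript
/buf 16 string def
% Create the filter, then read from it. The hexadecimal data on the
% next line is consumed by the filter, not by the interpreter.
currentfile /ASCIIHexDecode filter
buf readstring
48656C6C6F>
pop /decoded exch def
% The > marker is EOD. The filter stopped reading there, so execution
% of the program file resumed on the line after the data.
decoded =        % prints Hello
```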
Procedures
The data source or target can be a procedure. When the filter file is read or writ-
ten, it calls the procedure to obtain input data to be decoded or to dispose of out-
put data that has been encoded. This enables the data to be supplied or consumed
by an arbitrary program.
If a procedure is a data source, the filter calls it whenever it needs to obtain input
data. The procedure must return on the operand stack a readable string contain-
ing any number of bytes of data. The filter pops this string from the stack and
uses its contents as input to the filter. This process repeats until the filter
encounters end-of-data (EOD). Any leftover data in the final string is discarded. The
procedure can return a string of length 0 to indicate that no more data is avail-
able.
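As an illustration, a data source procedure might hand out encoded data one piece per call. In this sketch the names chunks, pos, source, decoder, and decoded are all hypothetical.

```postscript
% Hexadecimal text for (Hello), split into three pieces.
/chunks [ (4865) (6C6C) (6F>) ] def
/pos 0 def
/source {
  pos chunks length lt
    { chunks pos get  /pos pos 1 add def }   % return the next piece
    { () }                                   % length 0: no more data
  ifelse
} def

/decoder /source load /ASCIIHexDecode filter def
/decoded decoder 16 string readstring pop def
decoded =        % prints Hello
```

The filter calls source only as often as it needs to; once it sees the > marker in the third piece it stops asking for data.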
If a procedure is a data target, the filter calls it whenever it needs to dispose of
output data. Before calling the procedure, it pushes two operands on the stack: a
string and a boolean flag. It expects the procedure to consume these operands
and return a string. The filter calls the procedure in the following three situations:
• On the first write to the filter after the filter operator creates it, the filter calls
the data target procedure with an empty string and the boolean value true. The
procedure must return a writeable string of nonzero length, into which the
filter can write filtered data.
• Whenever the filter needs to dispose of accumulated output data, it calls the
procedure again, passing it a string containing the data and the boolean value
true. This string is either the same string that was returned from the previous
call or a substring of that string. The procedure must now do whatever is
appropriate with the data, then return either the same string or another string
into which the filter can write additional filtered data.
• When the filter file is closed, it calls the procedure a final time, passing it a
string or substring containing the remaining output data, if any, and the
boolean value false. The procedure must now do whatever is appropriate with the
data and perform any required end-of-data actions, then return a string. Any
string (including one of length 0) is acceptable. The filter does not use this
string, but merely pops it off the stack.
It is normal for the data source or target procedure to return the same string each
time. The string is allocated once at the beginning and serves simply as a buffer
that is used repeatedly. Each time a data source procedure is called, it fills the
string with one buffer’s worth of data and returns it. Similarly, each time a data
target procedure is called, it first disposes of any buffered data passed to it, then
returns the original string for reuse.
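The calling convention for a data target procedure can be sketched as below. The names buf, collected, used, target, and encoder are hypothetical, and copying the output into a second string is just one possible way to dispose of it.

```postscript
/buf 64 string def           % one buffer, reused on every call
/collected 256 string def    % hypothetical place to keep the output
/used 0 def
/target {                    % string boolean -> string
  pop                        % the flag is false only on the final call
  collected used 2 index putinterval   % copy the data (may be empty)
  /used exch length used add def
  buf                        % hand the same buffer back to the filter
} def

/encoder /target load /ASCIIHexEncode filter def
encoder (Hi) writestring
encoder closefile            % final call delivers remaining data and EOD
collected 0 used getinterval =   % hex digits 4869 followed by >
```

An implementation may insert line breaks in the encoded output, so the exact bytes collected can vary slightly between interpreters.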
Between successive calls to the data source or target procedure, a program should
not do anything that would alter the contents of the string returned by that pro-
cedure. The filter reads or writes the string at unpredictable times, so altering it
could disrupt the operation of the filter. If the string returned by the procedure is
reclaimed by a restore operation before the filter becomes closed, the results are
unpredictable. Typically, an ioerror occurs.
Note: If a filter file object is reclaimed by restore or garbage collection before being
closed, it is closed automatically; however, the data target procedure is not called.
One use of procedures as data sources or targets is to run filters “backward.” Fil-
ters are organized so that decoding filters are input files and encoding filters are
output files. Normally, a PostScript program obtains encoded data from some ex-
ternal source, decodes it, and uses the decoded data; or it generates some data,
encodes it, and sends it to some external destination. The organization of filters
supports this model. However, if a program must provide the input to a decoding
filter or consume the output of an encoding filter, it can do so by using proce-
dures as data sources or targets.
Strings
If a string is a data source, the filter simply uses its contents as data to be decoded.
If the filter encounters EOD, it ignores the remainder of the string. Otherwise, it
continues until it has exhausted the string data. Until the filter is closed, the string
should be treated as read-only. Writing into such a string will have unpredictable
consequences for the data read from the filter.
If a string is a data target, the filter writes encoded data into it. This continues un-
til the filter is closed. The contents of the string are not dependable until that
time. If the filter exhausts the capacity of the string, an ioerror occurs. There is no
way to determine how much data the filter has written into the string; if a pro-
gram needs to know, it should use a procedure as the data target.
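A short sketch of both uses of a string. The names decoded, out, and enc are hypothetical. The amount written to the target string is known here only because the encoding of two bytes is predictable, not because the filter reports it.

```postscript
% A string as data source: its contents are the encoded data.
/decoded (87cURDZ~>) /ASCII85Decode filter 16 string readstring pop def
decoded =                  % prints Hello

% A string as data target: the contents are dependable only after
% the filter has been closed.
/out 16 string def
/enc out /ASCIIHexEncode filter def
enc (Hi) writestring
enc closefile
out 0 4 getinterval =      % prints 4869
```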
3.13.2 End-of-Data and End-of-File
A filter can reach a state in which it cannot continue filtering data. This is called
the end-of-data (EOD) condition. Most decoding (input) filters can detect an
EOD marker encoded in the data they are reading. The nature of this marker de-
pends on the filter. Most encoding (output) filters append an EOD marker to the
data they are writing. This generally occurs automatically when the filter file is
closed. In a few instances, the EOD condition is based on predetermined infor-
mation, such as a byte count or a line count, rather than on an explicit marker in
the encoded data.
A file object, including a filter, can be closed at an arbitrary time, and a readable
file can run out of data. This is called the end-of-file (EOF) condition. When a
decoding filter detects EOD and all the decoded data has been read, the filter
reaches the EOF condition. The underlying data source or target for a filter can it-
self reach EOF. This usually results in the filter reaching EOF, perhaps after some
delay.
For efficient operation, filters must be buffered. The PostScript interpreter auto-
matically provides buffering as part of the filter file object. Because of the effects
of buffering, the filter reads from its data source or writes to its data target at ir-
regular times, not necessarily each time the filter file itself is read or written. Also,
many filtering algorithms require an unpredictable amount of state to be held
within the filter object.
Decoding Filters
Before encountering EOD, a decoding filter reads an unpredictable amount of
data from its data source. However, when it encounters EOD, it stops reading
from its data source. If the data source is a file, encoded data that is properly ter-
minated by EOD can be followed by additional unencoded data, which a pro-
gram can then read directly from that file.
When a filter reaches EOD and all the decoded data has been read from it, the
filter file reaches EOF and is closed automatically. Automatic closing of input files
at EOF is a standard feature of all file objects, not just of filters. (The
ReusableStreamDecode filter is an exception; see “ReusableStreamDecode Filter”
on page 153.) Unlike other file objects, a filter reaches EOF and is closed im-
mediately after the last data character is read from it, rather than at the following
attempt to read a character. A filter also reaches EOF if its data source runs out of
data by reaching EOF.
Note: Data for a filter must be terminated by an explicit EOD, even if the program
reading from the filter (executing the image operator, for example) reads only the
exact amount of data that is present.
Applying flushfile to a decoding filter causes data to be drawn from the data
source until the filter reaches EOD or the source runs out of data, whichever oc-
curs first. This operator can be used to flush the remainder of the encoded data
from the underlying file when the reading of filtered data must be terminated
prematurely. After the flushfile operation, the underlying file is positioned so that
the next read from that file will begin immediately following the EOD of the
encoded data. If a program closes a decoding filter prematurely before it reaches
EOD and without explicitly flushing it, the data source will be in an indeter-
minate state. Because of buffering, there is no dependable way to predict how
much data will have been consumed from the data source.
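The following sketch shows flushfile used to abandon a decoding filter cleanly. The names firsttwo and head are hypothetical. The reading is wrapped in a procedure so that flushfile runs before the interpreter resumes scanning the program file.

```postscript
% Decode only the first two bytes, then skip the rest of the data.
/firsttwo {
  currentfile /ASCIIHexDecode filter
  dup 2 string readstring pop     % read two decoded bytes, drop the flag
  exch flushfile                  % consume remaining data through EOD
} def

/head firsttwo
48656C6C6F>
def
head =           % prints He
```

Because of the flushfile, the underlying file is positioned just after the > marker, and the def on the following line is scanned normally.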
Encoding Filters
As stated earlier, writing to an encoding (output) filter causes it to write encoded
data to its data target. However, because of the effects of buffering, the writes to
the data target occur at unpredictable times. The only way to ensure that all en-
coded data has been written is to close the filter.
Most encoding filters can accept an indefinite amount of data to be encoded. The
amount usually is not specified in advance. Closing the filter causes an EOD
marker to be written to the data target at the end of the encoded data. The nature
of the EOD marker depends on the filter being used; it is sometimes under the
control of parameters specified when the filter is created.
The DCTEncode filter requires the amount of data to be specified in advance,
when the filter is created. When that amount of data has been encoded, the filter
reaches the EOD condition automatically. Attempting to write additional data to
the filter causes an ioerror, possibly after some delay.
Some data targets can become unable to accept further data. For instance, if the
data target is a string, the string may become full. If the data target is a file, the file
may become closed. Attempting to write to a filter whose data target cannot ac-
cept data causes an ioerror.
Applying flushfile to an encoding filter file causes the filter to flush buffered data
to its data target to the extent possible. If the data target is a file, flushfile is also
invoked for it. The effect of flushfile will propagate all the way down a filter pipe-
line. However, because of the nature of filter algorithms, it is not possible to guar-
antee that all data stored as part of a filter’s internal state will be flushed.
On the other hand, applying closefile to an encoding filter flushes both the buff-
ered data and the filter’s internal state. This causes all encoded data to be written
to the data target, followed by an EOD marker, if appropriate.
When a program closes a pipeline consisting of two or more encoding filters, it
must close each component filter file in sequence, starting with the one that was
created last (in other words, the one farthest upstream). This ensures that all
buffered data and all appropriate EOD markers are written in the proper order.
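A minimal sketch of closing a two-stage encoding pipeline in the required order. The names out, hex, and lzw are hypothetical; the encoded result is written to the standard output file.

```postscript
/out (%stdout) (w) file def
/hex out /ASCIIHexEncode filter def      % created first: downstream
/lzw hex /LZWEncode filter def           % created last: farthest upstream
lzw (aaaaaaaaaaaaaaaa) writestring
lzw closefile    % first: flush the LZW data and its EOD code into hex
hex closefile    % then: flush the hex digits and write the > marker
```

Closing hex first would write the > marker before the LZW filter had delivered its remaining data.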
If a filter file object is reclaimed by restore or garbage collection before being
closed, it is closed automatically (as is the case for all file objects); however, no at-
tempt is made to close a filter pipeline in the correct order. Errors arising from
closing in the wrong order are ignored, and filter target procedures are not called.
CloseSource and CloseTarget
CloseSource and CloseTarget (both LanguageLevel 3) are optional boolean
parameters in the parameter dictionary for decoding and encoding filters, re-
spectively. These parameters govern the disposition of the filter’s data source or
target when the closefile operator is applied to the filter explicitly, or implicitly in
one of the following ways: by the restore operator, by garbage collection, or (ex-
cept for the ReusableStreamDecode filter) by reaching EOD.
If CloseSource or CloseTarget is false (as they are by default), no additional action
is taken on the data source or target; this is the behavior in LanguageLevel 2.
However, if the parameter is true, then after closefile has been applied to the filter,
it is also applied to the filter’s data source or target. This process propagates
through an entire pipeline, unless a filter is reached whose CloseSource or
CloseTarget parameter is false; that filter is closed, but its source or target is not.
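The propagation can be sketched as follows, assuming a LanguageLevel 3 interpreter. The names sink, hex, and a85 are hypothetical.

```postscript
/sink 64 string def
/hex sink /ASCIIHexEncode filter def                      % CloseTarget defaults to false
/a85 hex << /CloseTarget true >> /ASCII85Encode filter def
a85 (Hi) writestring
a85 closefile        % also closes hex, because CloseTarget is true
hex status =         % prints false: hex is no longer open
```

Without the CloseTarget entry, hex would remain open after a85 is closed and would have to be closed separately.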
3.13.3 Details of Individual Filters
As stated in Section 3.8.4, “Filters,” the PostScript language supports three cate-
gories of standard filters: ASCII encoding and decoding filters, compression and
decompression filters, and subfile filters. The following sections document the in-
dividual filters.
Some of the encoded formats these filters support are the same as or similar to
those supported by applications or utility programs on many computer systems.
It should be straightforward to make those programs compatible with the filters.
Also, C language implementations of some filters are available from the Adobe
Developers Association.
ASCIIHexDecode Filter
source /ASCIIHexDecode filter
source dictionary /ASCIIHexDecode filter
The ASCIIHexDecode filter decodes data encoded as ASCII hexadecimal and pro-
duces binary data. For each pair of ASCII hexadecimal digits (0–9 and either A–F
or a–f), it produces one byte of binary data. All white-space characters—space,
tab, carriage return, line feed, form feed, and null—are ignored. The character >
indicates EOD. Any other characters will cause an ioerror.
If the filter encounters EOD when it has read an odd number of hexadecimal
digits, it will behave as if it had read an additional 0 digit.
The parameter dictionary can be used to specify the CloseSource parameter
(LanguageLevel 3).
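A brief sketch of the rules above, using a string as the data source; the name decoded is hypothetical. The input has seven digits and embedded spaces.

```postscript
% White space is ignored, and the trailing odd digit 7 is read as 70.
/decoded (48 65 78 7>) /ASCIIHexDecode filter 8 string readstring pop def
decoded =        % prints Hexp
```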
ASCIIHexEncode Filter
target /ASCIIHexEncode filter
target dictionary /ASCIIHexEncode filter
The ASCIIHexEncode filter encodes binary data as ASCII hexadecimal. For each
byte of binary data, it produces two ASCII hexadecimal digits (0–9 and either A–F
or a–f). It inserts a newline in the encoded output at least once every 80 charac-
ters, thereby limiting the lengths of lines.
When the ASCIIHexEncode filter is closed, it writes a > character as an EOD
marker.
The parameter dictionary can be used to specify the CloseTarget parameter
(LanguageLevel 3).
ASCII85Decode Filter
source /ASCII85Decode filter
source dictionary /ASCII85Decode filter
The ASCII85Decode filter decodes data encoded in the ASCII base-85 encoding
format and produces binary data. See the description of the ASCII85Encode filter
for a definition of the ASCII base-85 encoding.
The ASCII base-85 data format uses the characters ! through u and the character
z. All white-space characters—space, tab, carriage return, line feed, form feed,
and null—are ignored. If the ASCII85Decode filter encounters the character ~ in
its input, the next character must be > and the filter will reach EOD. Any other
characters will cause the filter to issue an ioerror. Also, any character sequences
that represent impossible combinations in the ASCII base-85 encoding will cause
an ioerror.
The parameter dictionary can be used to specify the CloseSource parameter
(LanguageLevel 3).
ASCII85Encode Filter
target /ASCII85Encode filter
target dictionary /ASCII85Encode filter
The ASCII85Encode filter encodes binary data in the ASCII base-85 encoding.
Generally, for every 4 bytes of binary data, it produces 5 ASCII printing charac-
ters in the range ! through u. It inserts a newline in the encoded output at least
once every 80 characters, thereby limiting the lengths of lines.
When the ASCII85Encode filter is closed, it writes the 2-character sequence ~> as
an EOD marker.
Binary data bytes are encoded in 4-tuples (groups of 4). Each 4-tuple is used to
produce a 5-tuple of ASCII characters. If the binary 4-tuple is (b1 b2 b3 b4) and
the encoded 5-tuple is (c1 c2 c3 c4 c5), then the relation between them is
$$(b_1 \times 256^3) + (b_2 \times 256^2) + (b_3 \times 256^1) + b_4 = (c_1 \times 85^4) + (c_2 \times 85^3) + (c_3 \times 85^2) + (c_4 \times 85^1) + c_5$$
In other words, 4 bytes of binary data are interpreted as a base-256 number and
then converted into a base-85 number. The five “digits” of this number,
(c1 c2 c3 c4 c5), are then converted into ASCII characters by adding 33, which is
the ASCII code for !, to each. ASCII characters in the range ! to u are used, where !
represents the value 0 and u represents the value 84. As a special case, if all five
digits are 0, they are represented by a single character z instead of by !!!!!.
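The conversion of one 4-tuple can be worked through with ordinary integer arithmetic. This is a sketch only: the names src, n, and tuple are hypothetical, and it assumes the first byte is below 128 so that the base-256 value fits in a 32-bit integer.

```postscript
/src (Hell) def
% Interpret the four bytes as one base-256 number.
/n 0 def
src { /n exch n 256 mul add def } forall
% Peel off five base-85 digits, least significant first, adding 33 to each.
/tuple 5 string def
4 -1 0 { tuple exch n 85 mod 33 add put  /n n 85 idiv def } for
tuple =          % prints 87cUR
```

Here the base-256 value is 1214606444, whose base-85 digits are 23, 22, 66, 52, and 49.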
If the ASCII85Encode filter is closed when the number of characters written to it is
not a multiple of 4, it uses the characters of the last, partial 4-tuple to produce a
last, partial 5-tuple of output. Given n (1, 2, or 3) bytes of binary data, it first ap-
pends 4 − n zero bytes to make a complete 4-tuple. Then, it encodes the 4-tuple
in the usual way, but without applying the z special case. Finally, it writes the first
n + 1 bytes of the resulting 5-tuple. Those bytes are followed immediately by the
~> EOD marker. This information is sufficient to correctly encode the number of
final bytes and the values of those bytes.
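The handling of a final partial group can be sketched the same way. The names src, v, and tuple are hypothetical, and the same assumption applies: the value must fit in a 32-bit integer.

```postscript
/src (o) def                        % n = 1 byte left over
/v 0 def
src { /v exch v 256 mul add def } forall
4 src length sub { /v v 256 mul def } repeat   % append 4 - n zero bytes
/tuple 5 string def
4 -1 0 { tuple exch v 85 mod 33 add put  /v v 85 idiv def } for
tuple 0 src length 1 add getinterval =         % first n + 1 characters: DZ
```

Together with the full group for (Hell), this gives the complete encoding 87cURDZ~> for the string (Hello).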
The following conditions constitute encoding violations:
• The value represented by a 5-tuple is greater than $2^{32} - 1$.
• A z character occurs in the middle of a 5-tuple.
• A final partial 5-tuple contains only one character.
These conditions never occur in the output produced by the ASCII85Encode
filter. Their occurrence in the input to the ASCII85Decode filter causes an ioerror.
The ASCII base-85 encoding is similar to one used by the public domain utilities
btoa and atob, which are widely available on workstations. However, it is not ex-
actly the same; in particular, it omits the begin-data and end-data marker lines,
and it uses a different convention for marking end-of-data.
The parameter dictionary can be used to specify the CloseTarget parameter
(LanguageLevel 3).
LZWDecode Filter
source /LZWDecode filter
source dictionary /LZWDecode filter
The LZWDecode filter decodes data that is encoded in a Lempel-Ziv-Welch
compressed format. See the description of the LZWEncode filter for details of the
format and a description of the filter parameters.
LZWEncode Filter
target /LZWEncode filter
target dictionary /LZWEncode filter
The LZWEncode filter encodes ASCII or binary data according to the basic LZW
(Lempel-Ziv-Welch) data compression method. LZW is a variable-length, adap-
tive compression method that has been adopted as one of the standard compres-
sion methods in the tag image file format (TIFF) standard. The output produced
by the LZWEncode filter is always binary, even if the input is ASCII text.
LZW compression can discover and exploit many patterns in its input data. In its
basic form, it is especially well suited to natural-language and PostScript-
language text. The filter also supports optional pretransformation by a predictor
function, as described in the section “Predictor Functions” on page 139; this im-
proves compression of sampled image data.
Note: The LZW compression method is the subject of United States patent number
4,558,302 and corresponding foreign patents owned by the Unisys Corporation.
Adobe Systems has licensed this patent for use in its products. Independent software
vendors (ISVs) may be required to license this patent to develop software using the
LZW method to compress PostScript programs or data for use with Adobe products.
Unisys has agreed that ISVs may obtain such a license for a modest one-time fee.
Additional information can be obtained on the World Wide Web at
<http://www.unisys.com/LeadStory/lzwfaq.html>.
An LZWDecode or LZWEncode parameter dictionary may contain any of the
entries listed in Table 3.17. Unless otherwise noted, a decoding filter’s parameters
must match the parameters used by the encoding filter that generated its input
data.
TABLE 3.17 Entries in an LZWEncode or LZWDecode parameter dictionary
KEY               TYPE     VALUE
EarlyChange       integer  (Optional) A code indicating when to increase the code word
                           length. The TIFF specification can be interpreted to imply that
                           code word length increases are postponed as long as possible.
                           However, some existing implementations of LZW increase the code
                           word length one code word earlier than necessary. The PostScript
                           language supports both interpretations. If EarlyChange is 0, code
                           word length increases are postponed as long as possible. If it is
                           1, they occur one code word early. Default value: 1.
UnitLength        integer  (Optional; LanguageLevel 3) The size of the units encoded, in
                           bits. The allowed values are 3 through 8. See “UnitLength and
                           LowBitFirst” on page 136. Default value: 8. A value other than
                           the default is permitted only for LZWDecode and should not be
                           used in combination with a predictor (specified by a Predictor
                           value greater than 1; see Table 3.20).
LowBitFirst       boolean  (Optional; LanguageLevel 3) A flag that determines whether the
                           code words are packed into the encoded data stream low-order bit
                           first (true) or high-order bit first (false). See “UnitLength and
                           LowBitFirst” on page 136. Default value: false. A value other
                           than the default is permitted only for LZWDecode.
Predictor         integer  (Optional) See Table 3.20 on page 141.
Columns           integer  (Optional) See Table 3.20 on page 141.
Colors            integer  (Optional) See Table 3.20 on page 141.
BitsPerComponent  integer  (Optional) See Table 3.20 on page 141.
CloseSource       boolean  (Optional; LanguageLevel 3; LZWDecode only) A flag specifying
                           whether closing the filter should also close its data source.
                           Default value: false.
CloseTarget       boolean  (Optional; LanguageLevel 3; LZWEncode only) A flag specifying
                           whether closing the filter should also close its data target.
                           Default value: false.
In LanguageLevel 3, the size of the units encoded is determined by the optional
UnitLength entry in the LZWDecode parameter dictionary; its default value is 8.
The following general discussion of the encoding scheme refers to this
LanguageLevel 3 parameter; for LanguageLevel 2, assume a unit size of 8.
The encoded data consists of a sequence of codes that can be from
(UnitLength + 1) to a maximum of 12 bits long. Each code denotes a single
character of input data (0 to $2^{\mathrm{UnitLength}} - 1$), a clear-table marker
($2^{\mathrm{UnitLength}}$), an EOD marker ($2^{\mathrm{UnitLength}} + 1$), or a table entry
representing a multicharacter sequence that has been encountered previously in
the input ($2^{\mathrm{UnitLength}} + 2$ and greater). In the normal case where UnitLength
is 8, the clear-table marker is 256 and the EOD marker is 257.
Initially, the code length is (UnitLength + 1) bits and the table contains only
entries for the ($2^{\mathrm{UnitLength}} + 2$) fixed codes. As encoding proceeds, entries are
appended to the table, associating new codes with longer and longer input
character sequences. The encoding and decoding filters maintain identical copies of
this table.
Whenever both the encoder and decoder independently (but synchronously) re-
alize that the current code length is no longer sufficient to represent the number
of entries in the table, they increase the number of bits per code by 1. For a
UnitLength of 8, the first output code that is 10 bits long is the one following the
creation of table entry 511, and so on for 11 (1023) and 12 (2047) bits. Codes are
never longer than 12 bits, so entry 4095 is the last entry of the LZW table.
The encoder executes the following sequence of steps to generate each output
code:
1. Accumulate a sequence of one or more input characters matching some
sequence already present in the table. For maximum compression, the encoder
should find the longest such sequence.
2. Emit the code corresponding to that sequence.
3. Create a new table entry for the first unused code. Its value is the sequence
found in step 1 followed by the next input character.
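The three steps above can be sketched as follows. This is a minimal illustration, assuming a UnitLength of 8 and an input short enough that the table never fills; it returns the output codes without packing them into bits. The function name lzw_encode_codes and its return shape are hypothetical.

```python
def lzw_encode_codes(data: bytes):
    """Return (codes, added) for `data` using the three encoder steps.

    `codes` is the list of output codes, beginning with clear-table (256)
    and ending with EOD (257). `added` maps each new code to the byte
    sequence it represents. Code widths are not modelled here.
    """
    clear, eod = 256, 257
    table = {bytes([i]): i for i in range(256)}
    next_code = eod + 1
    codes, added = [clear], {}
    seq = b""
    for value in data:
        candidate = seq + bytes([value])
        if candidate in table:
            seq = candidate              # step 1: keep extending the match
            continue
        codes.append(table[seq])         # step 2: emit the longest match
        if next_code > 4095:
            raise ValueError("table full; this sketch does not re-clear it")
        table[candidate] = next_code     # step 3: match plus next character
        added[next_code] = candidate
        next_code += 1
        seq = bytes([value])
    if seq:
        codes.append(table[seq])         # flush the pending match
    codes.append(eod)
    return codes, added


codes, added = lzw_encode_codes(bytes([45, 45, 45, 45, 45, 65, 45, 45, 45, 66]))
print(codes)
print({code: list(seq) for code, seq in added.items()})
```

For the ten-value input 45 45 45 45 45 65 45 45 45 66, the sketch adds entries 258 to 262 for the sequences 45 45, 45 45 45, 45 45 65, 65 45, and 45 45 45 66. Because it flushes the last pending match, its output also contains a code for the final 66 immediately before EOD.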
For example, suppose UnitLength is 8 and the input consists of the following
sequence of ASCII character codes:
45 45 45 45 45 65 45 45 45 66
Starting with an empty table, the encoder proceeds as shown in Table 3.18.
TABLE 3.18 Typical LZW encoding sequence
INPUT SEQUENCE   OUTPUT CODE         CODE ADDED TO TABLE   SEQUENCE REPRESENTED BY NEW CODE
-                256 (clear-table)   -                     -
45               45                  258                   45 45
45 45            258                 259                   45 45 45
45 45            258                 260                   45 45 65
65               65                  261                   65 45
45 45 45         259                 262                   45 45 45 66
-                257 (EOD)           -                     -

Codes are packed into a continuous bit stream, high-order bit first (assuming
that LowBitFirst is false). This stream is then divided into 8-bit bytes, high-order
bit first. Thus, codes can straddle byte boundaries arbitrarily. After the EOD
marker (code value of 257), any leftover bits in the final byte are set to 0.
In the example above, all the output codes are 9 bits long; they would pack into
bytes as follows (represented in hexadecimal):
80 0B 60 50 22 0C 0E 02
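The packing rule can be checked with a short sketch. It assumes every code has the same width (9 bits, as in the example) and that LowBitFirst is false; the function name pack_codes_msb_first is hypothetical.

```python
def pack_codes_msb_first(codes, width=9):
    """Pack fixed-width codes into bytes, high-order bit first."""
    acc = 0        # bit accumulator
    nbits = 0      # number of not-yet-written bits held in acc
    out = bytearray()
    for code in codes:
        acc = (acc << width) | code
        nbits += width
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1                    # keep only unwritten bits
    if nbits:
        out.append((acc << (8 - nbits)) & 0xFF)    # zero-fill the final byte
    return bytes(out)


codes = [256, 45, 258, 258, 65, 259, 257]   # output codes listed in Table 3.18
print(" ".join(f"{b:02X}" for b in pack_codes_msb_first(codes)))
# prints: 80 0B 60 50 22 0C 0E 02
```

The seven 9-bit codes occupy 63 bits, so the last byte carries one zero fill bit.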
To adapt to changing input sequences, the encoder may at any point issue a clear-
table code, which causes both the encoder and decoder to restart with initial
tables and 9-bit codes. By convention, the encoder begins by issuing a clear-table
code. It must issue a clear-table code when the table becomes full; it may do so
sooner.
UnitLength and LowBitFirst
As indicated earlier, the default value of UnitLength is 8 and of LowBitFirst is
false. These are the only values supported in LanguageLevel 2. Moreover, even in
LanguageLevel 3, values other than the default are permitted only for
LZWDecode, not for LZWEncode. This support is provided as a convenience for
decoding images from other sources (principally GIF files) that use
representations other than the default. The default values are recommended for
general document interchange.
Data that has been LZW-encoded with a UnitLength less than 8 consists only of
codes in the range 0 to 2^UnitLength − 1; consequently, the LZWDecode filter
produces only codes in that range when read. UnitLength also affects the encoded
representation, as described above.
LZW is a bit-stream protocol, and the codes of compressed data do not necessar-
ily fall on byte boundaries. The LowBitFirst parameter controls how these codes
get packed into a byte stream.
• If LowBitFirst is false (the default), codes are packed into bytes high-order
  bit first. That is, bits of a code are stored into the available bits of a byte
  starting with the highest-order bit. When a code straddles a byte boundary,
  the high-order portion of the code appears in the low-order bits of one byte;
  the low-order portion of the code appears in the high-order bits of the next
  byte. For example, the sequence of 9-bit output codes in Table 3.18 is
  encoded as
  80 0B 60 50 22 0C 0E 02
• If LowBitFirst is true, codes are packed into bytes low-order bit first. That
  is, bits of a code are stored into the available bits of a byte starting with
  the lowest-order bit. When a code straddles a byte boundary, the low-order
  portion of the code appears in the high-order bits of one byte; the high-order
  portion of the code appears in the low-order bits of the next byte. For
  example, the sequence of 9-bit output codes in Table 3.18 would be encoded as
  00 5B 08 14 18 64 60 40
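The low-order-first packing can be sketched in the same way. This is an illustration only, assuming fixed 9-bit codes; the function name pack_codes_lsb_first is hypothetical.

```python
def pack_codes_lsb_first(codes, width=9):
    """Pack fixed-width codes into bytes, low-order bit first."""
    acc = 0        # bit accumulator; new codes enter above the pending bits
    nbits = 0      # number of pending bits in acc
    out = bytearray()
    for code in codes:
        acc |= code << nbits
        nbits += width
        while nbits >= 8:
            out.append(acc & 0xFF)     # the low 8 bits form the next byte
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)         # unused high bits of the last byte are 0
    return bytes(out)


codes = [256, 45, 258, 258, 65, 259, 257]   # output codes listed in Table 3.18
print(" ".join(f"{b:02X}" for b in pack_codes_lsb_first(codes)))
# prints: 00 5B 08 14 18 64 60 40
```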
FlateDecode Filter
source /FlateDecode filter
source dictionary /FlateDecode filter
The FlateDecode filter (LanguageLevel 3) decodes data encoded in zlib/deflate
compressed format. See the description of the FlateEncode filter for details of the
format.
FlateEncode Filter
target /FlateEncode filter
target dictionary /FlateEncode filter
The FlateEncode filter (LanguageLevel 3) encodes ASCII or binary data. Encoding
is based on the public-domain zlib/deflate compression method, which is a
variable-length Lempel-Ziv adaptive compression method cascaded with adap-
tive Huffman coding. This method is referred to below as the Flate method. It is
fully defined in Internet Engineering Task Force Requests for Comments (IETF
RFCs) 1950 and 1951. The output produced by the FlateEncode filter is always bi-
nary, even if the input is ASCII text.
Flate compression can discover and exploit many patterns in its input data. In its
basic form, it is especially well suited to natural-language and PostScript-
language text. The filter also supports optional pretransformation by a predictor
function, as described in the section “Predictor Functions” on page 139; this im-
proves compression of sampled image data.

A FlateDecode or FlateEncode parameter dictionary may contain any of the en-
tries listed in Table 3.19. Unless otherwise noted, a decoding filter’s parameters
must match the parameters used by the encoding filter that generated its input
data.
TABLE 3.19 Entries in a FlateEncode or FlateDecode parameter dictionary
KEY
TYPE
VALUE
Effort
integer
(Optional; FlateEncode only) A code controlling the amount of memory
used and the execution speed for Flate compression. Allowed values range
from −1 to 9. A value of 0 compresses rapidly but not tightly, using little
auxiliary memory. A value of 9 compresses slowly but as tightly as possi-
ble, using a large amount of auxiliary memory. A value of −1 is mapped to
a value within the range 0 to 9 that is a “reasonable” default for the imple-
mentation. Default value: −1.
Predictor
integer
(Optional) See Table 3.20 on page 141.
Columns
integer
(Optional) See Table 3.20 on page 141.
Colors
integer
(Optional) See Table 3.20 on page 141.
BitsPerComponent
integer
(Optional) See Table 3.20 on page 141.
CloseSource
boolean
(Optional; FlateDecode only) A flag specifying whether closing the filter
should also close its data source. Default value: false.
CloseTarget
boolean
(Optional; FlateEncode only) A flag specifying whether closing the filter
should also close its data target. Default value: false.
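Outside a PostScript interpreter, data of this kind can be produced or checked with any zlib implementation. The sketch below uses Python's standard zlib module and assumes that the filter's stream is the RFC 1950 zlib wrapper around RFC 1951 deflate data. Mapping Effort directly onto the zlib compression level is an assumption made for illustration; the names flate_encode and flate_decode are hypothetical.

```python
import zlib


def flate_encode(data: bytes, effort: int = -1) -> bytes:
    """Compress `data` into a zlib stream; `effort` is -1 or 0 to 9."""
    if not -1 <= effort <= 9:
        raise ValueError("effort must be in the range -1 to 9")
    return zlib.compress(data, effort)


def flate_decode(data: bytes) -> bytes:
    """Recover the original bytes from a zlib stream."""
    return zlib.decompress(data)


sample = b"/Helvetica findfont 12 scalefont setfont\n" * 50
for effort in (0, -1, 9):
    encoded = flate_encode(sample, effort)
    assert flate_decode(encoded) == sample
    print(effort, len(sample), len(encoded))
```

In the zlib library, level 0 stores the data without compressing it, which is one possible reading of "rapidly but not tightly"; an interpreter may choose differently.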
Comparison of LZW and Flate Encoding
Flate encoding, like LZW encoding, discovers and exploits many patterns in its
input data, whether text or images. Thanks to its cascaded adaptive Huffman
coding, Flate-encoded output is usually substantially more compact than LZW-
encoded output for the same input. Flate and LZW decoding speeds are com-
parable, but Flate encoding is considerably slower than LZW encoding.
Usually, the FlateEncode and LZWEncode filters compress their inputs
substantially. In the worst case, however, the FlateEncode filter expands its
input by no more than a factor of 1.003, plus the effects of algorithm tags
added by PNG predictors (described below) and the effects of any explicit
flushfile operations. LZW compression has a worst-case expansion of at least a
factor of 1.125, which can increase to nearly 1.5 in some implementations (plus
the effects of PNG tags).
Predictor Functions
LZWEncode and FlateEncode filters compress more compactly if their input data
is highly predictable. One way to increase the predictability of many continuous-
tone sampled images is to replace each sample with the difference between that
sample and some predictor function applied to earlier neighboring samples. If
the predictor function works well, the postprediction data will cluster toward 0.
The parameter dictionary for the LZW and Flate filters may contain any of the
four entries Predictor, Columns, Colors, and BitsPerComponent to specify a pre-
dictor function. When a predictor is selected, the encoding filter applies the
predictor function to the data before compression; the decoding filter applies the
complementary predictor function after decompression. Unless otherwise noted,
a decoding filter’s parameters must match the parameters used by the encoding
filter that generated its input data.
Two groups of predictor functions are supported. The first, the TIFF group, con-
sists of the single function that is Predictor 2 in the TIFF standard. (In the TIFF
standard, Predictor 2 applies only to LZW compression, but here it applies to
Flate compression as well.) TIFF Predictor 2 predicts that each color component
of a sample will be the same as the corresponding color component of the sample
immediately to the left.
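For the common case of 8 bits per component, the TIFF prediction can be sketched as a per-row difference. This is an illustration under that assumption, treating components to the left of the first sample as 0; the function names are hypothetical, and other component sizes would need bit-level handling that is not shown.

```python
def tiff_predictor2_encode_row(row: bytes, colors: int = 1) -> bytes:
    """Replace each 8-bit component by its difference from the same
    component of the sample immediately to the left (modulo 256)."""
    out = bytearray(len(row))
    for i, value in enumerate(row):
        left = row[i - colors] if i >= colors else 0
        out[i] = (value - left) & 0xFF
    return bytes(out)


def tiff_predictor2_decode_row(row: bytes, colors: int = 1) -> bytes:
    """Undo tiff_predictor2_encode_row by accumulating the differences."""
    out = bytearray(len(row))
    for i, delta in enumerate(row):
        left = out[i - colors] if i >= colors else 0
        out[i] = (delta + left) & 0xFF
    return bytes(out)


rgb_row = bytes([10, 20, 30, 12, 22, 33, 12, 22, 33])   # three RGB samples
encoded = tiff_predictor2_encode_row(rgb_row, colors=3)
print(list(encoded))   # [10, 20, 30, 2, 2, 3, 0, 0, 0]
```

The small and repeated values in the encoded row are what make the data easier for the LZW or Flate stage to compress.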
The second supported group of predictor functions, the PNG group, consists of
the “filters” of the World Wide Web Consortium’s Portable Network Graphics
recommendation, documented in IETF RFC 2083. The term predictors is used
here instead of filters to avoid confusion. There are five basic PNG predictor algo-
rithms:
None      No prediction
Sub       Predicts the same as the sample to the left
Up        Predicts the same as the sample above
Average   Predicts the average of the sample to the left and the sample above
Paeth     A nonlinear function of the sample above, the sample to the left,
          and the sample to the upper-left
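The five algorithms can be sketched for a single row as shown below. The tag values 0 to 4 and the Paeth selection rule follow the PNG specification; the function names, and the argument bpp (bytes per complete sample, at least 1), are naming choices made here for illustration.

```python
def paeth(left: int, above: int, upper_left: int) -> int:
    """Pick whichever neighbour is closest to left + above - upper_left."""
    estimate = left + above - upper_left
    d_left = abs(estimate - left)
    d_above = abs(estimate - above)
    d_upper_left = abs(estimate - upper_left)
    if d_left <= d_above and d_left <= d_upper_left:
        return left
    if d_above <= d_upper_left:
        return above
    return upper_left


def png_predict_row(kind: int, row: bytes, prior: bytes, bpp: int = 1) -> bytes:
    """Return the tag byte `kind` followed by the predicted row.

    kind: 0 None, 1 Sub, 2 Up, 3 Average, 4 Paeth.
    prior: the previous unpredicted row (all zeros for the first row).
    """
    out = bytearray([kind])
    for i, value in enumerate(row):
        left = row[i - bpp] if i >= bpp else 0
        above = prior[i]
        upper_left = prior[i - bpp] if i >= bpp else 0
        prediction = (0, left, above, (left + above) // 2,
                      paeth(left, above, upper_left))[kind]
        out.append((value - prediction) & 0xFF)
    return bytes(out)


row, prior = bytes([10, 12, 15]), bytes([9, 12, 14])
for kind in range(5):
    print(kind, list(png_predict_row(kind, row, prior)))
```

Each output row is one byte longer than its input because of the leading tag byte, which is how the decoder learns which algorithm was applied to that row.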

The two groups of predictor functions have some common features. Both assume
the following:
• Data is presented in order, from the top row to the bottom row and from left
  to right within a row.
• A row occupies a whole number of bytes, rounded up if necessary.
• Samples and their components are packed into bytes from high- to low-order
  bits.
• All color components of samples outside the image (which are necessary for
  predictions near the boundaries) are 0.
The two groups differ in the following ways:
• With PNG predictors, the encoded data explicitly identifies the predictor
  function used for each row, so different rows can be predicted with different
  algorithms to improve compression. The TIFF predictor has no such identifier;
  the same algorithm applies to all rows.
• The TIFF function group predicts each color component from the prior instance
  of that color component, taking into account the bits per component and the
  number of components per sample. In contrast, the PNG function group predicts
  each byte of data as a function of the corresponding byte of one or more
  previous image samples, regardless of whether there are multiple color
  components in a byte, or whether a single color component spans multiple
  bytes. This can yield significantly better speed but with somewhat worse
  compression.
Table 3.20 describes the predictor-related entries in a parameter dictionary for an
LZW or Flate filter.

TABLE 3.20 Predictor-related entries in an LZW or Flate filter parameter dictionary
KEY
TYPE
VALUE
Predictor
integer
(Optional) A code that selects the predictor function:
1      No predictor (normal encoding or decoding)
2      TIFF Predictor 2
≥10    (LanguageLevel 3) PNG predictor. For LZWEncode and FlateEncode, this
       selects the specific PNG predictor function(s) to be used, as indicated
       below. For LZWDecode and FlateDecode, any of these values merely
       indicates that PNG predictors are in use; the predictor function is
       explicitly encoded in the incoming data. The values of Predictor for the
       encoding and decoding filters need not match if they are both greater
       than or equal to 10.
       10   PNG predictor, None function
       11   PNG predictor, Sub function
       12   PNG predictor, Up function
       13   PNG predictor, Average function
       14   PNG predictor, Paeth function
       15   PNG predictor in which the encoding filter automatically chooses
            the optimum function separately for each row
Default value: 1.
Columns
integer
(Optional; used only if Predictor is greater than 1) The number of samples in
each row. Default value: 1.
Colors
integer
(Optional; used only if Predictor is greater than 1) The number of interleaved
color components per sample; must be 1 or greater. Default value: 1.
BitsPerComponent
integer
(Optional; used only if Predictor is greater than 1) The number of bits used to
represent each color component of a sample; must be 1, 2, 4, or 8. Default
value: 8.

RunLengthDecode Filter
source /RunLengthDecode filter
source dictionary /RunLengthDecode filter
The RunLengthDecode filter decodes data encoded in the run-length encoding
format. The encoded data consist of pairs of run-length bytes and data. See the
description of the RunLengthEncode filter for details of the format. A run length
of 128 indicates EOD.
The parameter dictionary may be used to specify the CloseSource parameter
(LanguageLevel 3).
RunLengthEncode Filter
target recordsize /RunLengthEncode filter
target dictionary recordsize /RunLengthEncode filter
The RunLengthEncode filter encodes data in a simple byte-oriented format based
on run length. The compressed data format is a sequence of runs, where each run
consists of a length byte followed by 1 to 128 bytes of data. If the length byte is in
the range 0 to 127, the following length + 1 bytes (1 to 128 bytes) are to be copied
literally upon decompression. If length is in the range 129 to 255, the following
single byte is to be replicated 257 − length times (2 to 128 times) upon decom-
pression.
When the RunLengthEncode filter is closed, it writes a final byte, with value 128
as an EOD marker.
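The format just described can be decoded with a few lines of code. This is a minimal sketch that assumes well-formed input and does not report truncated runs; the function name run_length_decode is hypothetical.

```python
def run_length_decode(data: bytes) -> bytes:
    """Expand run-length encoded data up to the EOD marker (128)."""
    out = bytearray()
    i = 0
    while i < len(data):
        length = data[i]
        i += 1
        if length == 128:                # EOD marker: stop decoding
            break
        if length < 128:                 # copy the next length + 1 bytes
            out += data[i:i + length + 1]
            i += length + 1
        else:                            # repeat the next byte 257 - length times
            out += data[i:i + 1] * (257 - length)
            i += 1
    return bytes(out)


print(run_length_decode(bytes([2, 65, 66, 67, 253, 88, 128])))   # b'ABCXXXX'
```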
recordsize is a nonnegative integer specifying the number of bytes in a “record” of
source data. The RunLengthEncode filter will not create a run that contains data
from more than one source record. If recordsize is 0, the filter does not treat its
source data as records. The notion of a “record” is irrelevant in the context of the
PostScript interpreter (in particular, the image operator does not require its data
to be divided into records). A nonzero recordsize is useful only if the encoded data
is to be sent to some application program that requires it.
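The effect of recordsize on the encoder can be sketched as follows. The run-selection strategy shown (greedy repeats, with literal runs ending just before the next repeat) is only one of many valid choices and is an assumption, as are the function and parameter names.

```python
def run_length_encode(data: bytes, record_size: int = 0) -> bytes:
    """Encode `data`; no run spans a record boundary when record_size > 0."""
    out = bytearray()
    step = record_size or max(len(data), 1)
    for start in range(0, len(data), step):
        record = data[start:start + step]
        i = 0
        while i < len(record):
            run = 1
            while (i + run < len(record) and run < 128
                   and record[i + run] == record[i]):
                run += 1
            if run >= 2:                 # replicated run: 257 - run, then the byte
                out += bytes([257 - run, record[i]])
                i += run
                continue
            j = i + 1                    # literal run: stop before the next repeat
            while (j < len(record) and j - i < 128
                   and not (j + 1 < len(record) and record[j] == record[j + 1])):
                j += 1
            out += bytes([j - i - 1]) + record[i:j]
            i = j
    out.append(128)                      # EOD marker written when the filter closes
    return bytes(out)


print(list(run_length_encode(b"AAAABC")))                  # [253, 65, 1, 66, 67, 128]
print(list(run_length_encode(b"AAAABC", record_size=2)))   # [255, 65, 255, 65, 1, 66, 67, 128]
```

With a record size of 2, the four A bytes are split into two runs of two, because a single run of four would cross a record boundary.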
This encoding is very similar to that used by the Apple® Macintosh® PackBits
routine and by TIFF Data Compression scheme #32773. Output from PackBits is
acceptable as input to the RunLengthDecode filter if an EOD marker (byte value
128) is appended to it. Output from the RunLengthEncode filter is acceptable to
UnpackBits if the recordsize parameter is equal to the length of one scan line
for the image being encoded.
The parameter dictionary can be used to specify the CloseTarget parameter
(LanguageLevel 3). Note that there is no means for specifying recordsize in the pa-
rameter dictionary; it must be an explicit operand of the RunLengthEncode filter.
CCITTFaxDecode Filter
source /CCITTFaxDecode filter
source dictionary /CCITTFaxDecode filter
The CCITTFaxDecode filter decodes image data that has been encoded according
to the CCITT facsimile standard. See the description of the CCITTFaxEncode filter
for details of the filter parameters.
If the CCITTFaxDecode filter encounters improperly encoded source data, it will
issue an ioerror. It will not perform any error correction or resynchronization,
except as noted for DamagedRowsBeforeError in Table 3.21.
CCITTFaxEncode Filter
target /CCITTFaxEncode filter
target dictionary /CCITTFaxEncode filter
The CCITTFaxEncode filter encodes image data according to the CCITT facsimile
(fax) standard. This encoding is defined by an international standards organiza-
tion, the International Telecommunication Union (ITU), formerly known as the
Comité Consultatif International Téléphonique et Télégraphique (International
Coordinating Committee for Telephony and Telegraphy). The encoding is de-
signed to achieve efficient compression of monochrome (1 bit per pixel) image
data at relatively low resolutions. The encoding algorithm is not described in this
book, but rather in the ITU standard (see the Bibliography). We refer to that
standard as the CCITT standard for historical reasons.
Note: PostScript language support for the CCITT standard is limited to encoding
and decoding of image data. It does not include initial connection and handshaking
protocols that would be required to communicate with a fax machine. The purpose of
these filters is to enable efficient interchange of bilevel sampled images between an
application program and a PostScript interpreter.


The CCITTFaxDecode and CCITTFaxEncode filters support two encoding schemes,
Group 3 and Group 4, and various optional features of the CCITT standard.
Table 3.21 describes the contents of the parameter dictionary for these filters.
TABLE 3.21 Entries in a CCITTFaxEncode or CCITTFaxDecode parameter dictionary
KEY
TYPE
VALUE
Uncompressed
boolean
(Optional) A flag indicating whether the CCITTFaxEncode filter is per-
mitted to use uncompressed encoding when advantageous. Uncom-
pressed encoding is an optional part of the CCITT fax encoding
standard. Its use can prevent significant data expansion when encoding
certain image data, but many fax machine manufacturers and software
vendors do not support it. The CCITTFaxDecode filter always accepts
uncompressed encoding. Default value: false.
K
integer
(Optional) An integer that selects the encoding scheme to be used:
<0    Pure two-dimensional encoding (Group 4)
0     Pure one-dimensional encoding (Group 3, 1-D)
>0    Mixed one- and two-dimensional encoding (Group 3, 2-D), in which a
      line encoded one-dimensionally can be followed by at most K − 1 lines
      encoded two-dimensionally
The CCITTFaxEncode filter uses the value of K to determine how to encode
the data. The CCITTFaxDecode filter distinguishes among negative, zero,
and positive values of K to determine how to interpret the encoded data;
however, it does not distinguish between different positive K values.
Default value: 0.
EndOfLine
boolean
(Optional) A flag indicating whether the CCITTFaxEncode filter prefixes
an end-of-line bit pattern to each line of encoded data. The
CCITTFaxDecode filter always accepts end-of-line bit patterns, but re-
quires them to be present only if EndOfLine is true. Default value: false.
EncodedByteAlign
boolean
(Optional) A flag indicating whether the CCITTFaxEncode filter inserts
extra 0 bits before each encoded line so that the line begins on a byte
boundary. If true, the CCITTFaxDecode filter skips over encoded bits to
begin decoding each line at a byte boundary. If false, the filters neither
generate nor expect extra bits in the encoded representation. Default
value: false.
Columns
integer
(Optional) The width of the image in pixels. If Columns is not a multiple
of 8, the filters adjust the width of the unencoded image to the next
multiple of 8. This adjustment is necessary for consistency with the
image operator, which requires that each line of source data start on a
byte boundary. Default value: 1728.
Rows
integer
(Optional; CCITTFaxDecode only) The height of the image in scan lines.
If Rows is 0 or absent, the image’s height is not predetermined; the en-
coded data must be terminated by an end-of-block bit pattern or by
the end of the filter’s data source. Default value: 0.
EndOfBlock
boolean
(Optional) A flag indicating whether the CCITTFaxEncode filter ap-
pends an end-of-block pattern to the encoded data. If true, the
CCITTFaxDecode filter expects the encoded data to be terminated by
end-of-block, overriding the Rows parameter. If false, the
CCITTFaxDecode filter stops when it has decoded the number of lines
indicated by Rows or when its data source is exhausted, whichever hap-
pens first. Default value: true.
The end-of-block pattern is the CCITT end-of-facsimile-block
(EOFB) or return-to-control (RTC) appropriate for the K parameter.
BlackIs1
boolean
(Optional) A flag indicating whether 1 bits are to be interpreted as
black pixels and 0 bits as white pixels, the reverse of the normal Post-
Script language convention for image data. Default value: false.
DamagedRowsBeforeError
integer
(Optional; CCITTFaxDecode only) The number of damaged rows of data
to be tolerated before an ioerror is generated; applies only if EndOfLine
is true and K is nonnegative. Tolerating a damaged row means locating
its end in the encoded data by searching for an EndOfLine pattern, then
substituting decoded data from the previous row if the previous row
was not damaged, or a white scan line if the previous row was also
damaged. Default value: 0.
CloseSource
boolean
(Optional; LanguageLevel 3; CCITTFaxDecode only) A flag specifying
whether closing the filter should also close its data source. Default
value: false.
CloseTarget
boolean
(Optional; LanguageLevel 3; CCITTFaxEncode only) A flag specifying
whether closing the filter should also close its data target. Default
value: false.
The CCITT fax standard specifies a bilevel picture encoding in terms of black and
white pixels. It does not define a representation for the unencoded image data in
terms of 0 and 1 bits in memory. However, the PostScript language (specifically,
the image operator) does impose a convention: normally, 0 means black and 1
means white. Therefore, the CCITTFaxEncode filter normally encodes 0 bits as
black pixels and 1 bits as white pixels. Similarly, the CCITTFaxDecode filter
normally produces 0 bits for black pixels and 1 bits for white pixels. The BlackIs1
parameter can be used to reverse this convention if necessary.
The fax encoding method is bit-oriented, not byte-oriented. This means that, in
principle, encoded or decoded data might not end at a byte boundary. The
CCITTFaxEncode and CCITTFaxDecode filters deal with this problem in the fol-
lowing ways:
• Unencoded data is treated as complete scan lines, with unused bits inserted
  at the end of each scan line to fill out the last byte. This is compatible
  with the convention the image operator uses.
• Encoded data is ordinarily treated as a continuous, unbroken bit stream. The
  EncodedByteAlign parameter can be used to cause each encoded scan line to be
  filled to a byte boundary; this method is not prescribed by the CCITT
  standard, and fax machines never do this, but some software packages find it
  convenient to encode data this way.
• When a filter reaches EOD, it always skips to the next byte boundary
  following the encoded data.
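The scan-line convention for unencoded data, together with the meaning of K from Table 3.21, can be summarized in a short sketch. The helper names are hypothetical, and the functions only compute sizes and labels; they do not perform any fax encoding.

```python
def encoding_scheme(k: int = 0) -> str:
    """Name the encoding scheme selected by the K entry."""
    if k < 0:
        return "Group 4, pure two-dimensional"
    if k == 0:
        return "Group 3, pure one-dimensional"
    return "Group 3, mixed one- and two-dimensional"


def unencoded_row_bytes(columns: int = 1728) -> int:
    """Bytes in one unencoded scan line, filled out to a byte boundary."""
    return (columns + 7) // 8


def unencoded_image_bytes(columns: int, rows: int) -> int:
    """Total bytes of unencoded data for an image of the given size."""
    return unencoded_row_bytes(columns) * rows


print(encoding_scheme(-1), unencoded_row_bytes(), unencoded_image_bytes(1001, 10))
```

A 1001-pixel row occupies 126 bytes, of which the last 7 bits are unused fill.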
DCTDecode Filter
source /DCTDecode filter
source dictionary /DCTDecode filter
The DCTDecode filter decodes grayscale or color image data in JPEG baseline
encoded format. The description of the DCTEncode filter provides details of the
format and the related filter parameters. All of the DCTEncode parameters (except
CloseTarget) are allowed for DCTDecode; however, usually no parameters are
needed except ColorTransform (and possibly CloseSource, in LanguageLevel 3),
because all information required for decoding an image is normally contained in
the JPEG signalling parameters, which accompany the encoded data in the com-
pressed data stream.
The decoded data is a stream of image samples, each of which consists of 1, 2, 3,
or 4 color components, interleaved on a per-sample basis. Each component value
occupies one 8-bit byte. The dimensions of the image and the number of com-
ponents per sample depend on parameters that were specified when the image
was encoded. Given suitable parameters, the image operator can consume data
directly from a DCTDecode filter.

Note: The JPEG standard also allows an image’s components to be sent as separate
scans instead of interleaved; however, that format is not useful with the image
operator, because image requires that components from separate sources be read in
parallel.

DCTEncode Filter
target dictionary /DCTEncode filter
The DCTEncode filter encodes grayscale or color image data in JPEG baseline for-
mat. JPEG is the ISO Joint Photographic Experts Group, an organization respon-
sible for developing an international standard for compression of color image
data (see the Bibliography). Another informal abbreviation for this standard is
JFIF, for JPEG File Interchange Format. DCT refers to the primary technique
(discrete cosine transform) used in the encoding and decoding algorithms. The
algorithm can compress color images substantially. For example, at a compression
ratio of 10 to 1, there is little or no perceptible degradation in quality.
Note: The compression algorithm is “lossy,” meaning that the data produced by the
DCTDecode filter is not exactly the same as the data originally encoded by the
DCTEncode filter. These filters are designed specifically for compression of sampled
continuous-tone images, not for general data compression.

Input to the DCTEncode filter is a stream of image samples, each of which consists
of 1, 2, 3, or 4 color components, interleaved on a per-sample basis. Each com-
ponent value occupies one 8-bit byte. The dimensions of the image and the num-
ber of components per sample must be specified in the filter’s parameter
dictionary. The dictionary can also contain other optional parameters that con-
trol the operation of the encoding algorithm. Table 3.22 describes the contents of
this dictionary.

TABLE 3.22 Entries in a DCTEncode parameter dictionary
KEY
TYPE
VALUE
Columns
integer
(Required) The width of the image in samples per scan line.
Rows
integer
(Required) The height of the image in scan lines.
Colors
integer
(Required) The number of color components in the image; must be 1, 2, 3,
or 4.
HSamples
array, packed array, or string
(Optional) A sequence of horizontal sampling factors (one per color
component). If HSamples is an array or a packed array, the elements must
be integers; if it is a string, the elements are interpreted as integers in the
range 0 to 255. The ith element of the sequence specifies the sampling
factor for the ith color component. Allowed sampling factors are 1, 2, 3, and
4. The default value is an array containing 1 for all components, meaning
that all components are to be sampled at the same rate.
When the sampling factors are not all the same, DCTEncode subsamples
the image for those components whose sampling factors are less than the
largest one. For example, if HSamples is [4 3 2 1] for a 4-color image,
then for every 4 horizontal samples of the first component, DCTEncode
sends only 3 samples of the second component, 2 of the third, and 1 of the
fourth. However, DCTDecode inverts this sampling process so that it pro-
duces the same amount of data as was presented to DCTEncode. In other
words, this parameter affects only the encoded, and not the unencoded or
decoded, representation. The filters deal correctly with the situation in
which the width or height of the image is not a multiple of the corre-
sponding sampling factor.
VSamples
array, packed array, or string
(Optional) A sequence of vertical sampling factors (one per color
component). Interpretation and default value are the same as for
HSamples.
The JPEG standard imposes a restriction on the values in the HSamples
and VSamples sequences, taken together: For each color component, mul-
tiply its HSamples value by its VSamples value, then add all of the prod-
ucts together. The result must not exceed 10.
QuantTables
array or packed array
(Optional) An array or packed array of quantization tables (one per color
component). The ith element of QuantTables is the table to be used, after
scaling by QFactor, for quantization of the ith color component. As many
as four unique quantization tables can be specified, but several elements
of the QuantTables array can refer to the same table.

Each table must be an array, a packed array, or a string. If it is an array or a
packed array, the elements must be numbers; if it is a string, the elements
are interpreted as integers in the range 0 to 255. In either case, each table
must contain 64 numbers organized according to the zigzag pattern
defined by the JPEG standard. After scaling by QFactor, every element is
rounded to the nearest integer in the range 1 to 255. Default value:
implementation-dependent.
QFactor
number
(Optional) A scale factor applied to the elements of QuantTables. This
parameter enables straightforward adjustment of the tradeoff between
image compression and image quality without respecifying the quantiza-
tion tables. Valid values are in the range 0 to 1,000,000. A value less than 1
improves image quality but decreases compression; a value greater than 1
increases compression but degrades image quality. Default value: 1.0.
HuffTables
array or packed array
(Optional) An array or packed array of at least 2 × Colors encoding tables.
The pair of tables at indices 2 × i and 2 × i + 1 in HuffTables are used to
construct Huffman tables for coding the ith color component. The first
table in each pair is used for the DC coefficients, the second for the AC
coefficients. At most two DC tables and two AC tables can be specified,
but several elements of the HuffTables array can refer to the same tables.
Default value: implementation-dependent.
Each table must be an array, a packed array, or a string. If it is an array or a
packed array, the elements must be numbers; if it is a string, the elements
are interpreted as integers in the range 0 to 255. The first 16 values specify
the number of codes of each length from 1 to 16 bits. The remaining
values are the symbols corresponding to each code; they are given in order
of increasing code length. This information is sufficient to construct a
Huffman coding table according to an algorithm given in the JPEG stan-
dard. A QFactor value other than 1.0 may alter this computation.
ColorTransform
integer
(Optional) A code specifying a transformation to be performed on the
sample values:
0    No transformation.
1    If Colors is 3, transform RGB values to YUV before encoding and
     from YUV to RGB after decoding. If Colors is 4, transform CMYK
     values to YUVK before encoding and from YUVK to CMYK after
     decoding. This option is ignored if Colors is 1 or 2.
If performed, these transformations occur entirely within the DCTEncode
and DCTDecode filters. The RGB and YUV used here have nothing to do
with the color spaces defined as part of the Adobe imaging model. The
purpose of converting from RGB to YUV is to separate luminance and
chrominance information (see below).

The default value of ColorTransform is 1 if Colors is 3 and 0 otherwise. In
other words, conversion between RGB and YUV is performed for all
three-component images unless explicitly disabled by setting
ColorTransform to 0. Additionally, the DCTEncode filter inserts an
Adobe-defined marker code in the encoded data indicating the ColorTransform
value used. If present, this marker code overrides the ColorTransform
value given to DCTDecode. Thus it is necessary to specify ColorTransform
only when decoding data that does not contain the Adobe-defined marker
code.
CloseTarget
boolean
(Optional; LanguageLevel 3) A flag specifying whether closing the filter
should also close its data target. Default value: false.
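Two of the numeric constraints in Table 3.22 lend themselves to a quick check before a parameter dictionary is built. The sketch below is an illustration only: the function names are hypothetical, and rounding ties upward in scale_quant_table is an assumption, since the table says only that values are rounded to the nearest integer.

```python
import math


def check_sampling(h_samples, v_samples) -> int:
    """Validate HSamples and VSamples; return the sum of their products."""
    if len(h_samples) != len(v_samples):
        raise ValueError("need one horizontal and one vertical factor per component")
    for factor in list(h_samples) + list(v_samples):
        if factor not in (1, 2, 3, 4):
            raise ValueError("sampling factors must be 1, 2, 3, or 4")
    total = sum(h * v for h, v in zip(h_samples, v_samples))
    if total > 10:
        raise ValueError(f"sum of products is {total}; it must not exceed 10")
    return total


def scale_quant_table(table, q_factor: float = 1.0):
    """Scale a 64-entry quantization table by QFactor, keeping 1 to 255."""
    if len(table) != 64:
        raise ValueError("a quantization table must contain 64 numbers")
    return [min(255, max(1, math.floor(value * q_factor + 0.5))) for value in table]


print(check_sampling([2, 1, 1], [2, 1, 1]))      # 6
print(scale_quant_table([16] * 64, 0.5)[:4])     # [8, 8, 8, 8]
```

The first call models a three-component image whose first component is sampled at twice the rate of the others in both directions.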
Specifying the optional parameters properly requires understanding the details of
the encoding algorithm, which is described in the JPEG standard. The
DCTDecode and DCTEncode filters do not support certain features of the stan-
dard that are irrelevant to images following PostScript language conventions; in
particular, progressive JPEG is not supported. Additionally, Adobe has made cer-
tain choices regarding reserved marker codes and other optional features of the
standard; contact the Adobe Developers Association for further information.
The default values for QuantTables and HuffTables in a DCTEncode parameter
dictionary are chosen without reference to the image color space and without
specifying any particular tradeoff between image quality and compression.
Although they will work, they will not produce optimal results for most
applications. For superior compression, applications should provide custom
QuantTables and HuffTables arrays rather than relying on the default values.
Better compression is often possible for color spaces that treat luminance and
chrominance separately than for those that do not. The RGB to YUV conversion
provided by the filters is one attempt to separate luminance and chrominance; it
conforms to CCIR recommendation 601-1. Other color spaces, such as the CIE
1976 L*a*b* space, may also achieve this objective. The chrominance
components can then be compressed more than the luminance by using coarser
sampling or quantization, with less degradation in quality.
Unlike other encoding filters, the DCTEncode filter requires that a specific
amount of data be written to it: Columns × Rows samples of Colors bytes each.
The filter reaches EOD at that point. It cannot accept further data, so attempting
to write to it will cause an ioerror. The program must now close the filter file to
cause the buffered data and EOD marker to be flushed to the data target.
SubFileDecode Filter
source EODCount EODString /SubFileDecode filter
source dictionary EODCount EODString /SubFileDecode filter
source dictionary /SubFileDecode filter (LanguageLevel 3)
The SubFileDecode filter does not perform data transformation, but it can detect
an EOD condition. Its output is always identical to its input, up to the point
where EOD occurs. The data preceding the EOD is called a subfile of the
underlying data source.
The SubFileDecode filter can be used in a variety of ways:
• A subfile can contain data that should be read or executed conditionally,
  depending on information that is not known until execution. If a program
  decides to ignore the information in a subfile, it can easily skip to the end of
  the subfile by invoking flushfile on the filter file.
• Subfiles can help recover from errors that occur in encapsulated programs. If
  the encapsulated program is treated as a subfile, the enclosing program can
  regain control if an error occurs, flush to the end of the subfile, and resume
  execution from the underlying data source. The application, not the PostScript
  interpreter, must provide such error handling; it is not the default error
  handling provided by the PostScript interpreter.
• The SubFileDecode filter enables an arbitrary data source (procedure or string)
  to be treated as an input file. This use of subfiles does not require detection of
  an EOD marker.
The SubFileDecode filter requires two parameters, EODCount and EODString,
which specify the condition under which the filter is to recognize EOD. The filter
will allow data to pass through the filter until it has encountered exactly
EODCount instances of the EODString; then it will reach EOD.
In LanguageLevel 2, EODCount and EODString are specified as operands on the
stack. In LanguageLevel 3, they may alternatively be specified in the
SubFileDecode parameter dictionary (as shown in Table 3.23). They must be
specified in the parameter dictionary if the SubFileDecode filter is used as one of
the filters in a ReusableStreamDecode filter (described in the next section).

TABLE 3.23 Entries in a SubFileDecode parameter dictionary (LanguageLevel 3)

KEY          TYPE     VALUE
EODCount     integer  (Required) The number of occurrences of EODString that will be
                      passed through the filter and made available for reading.
EODString    string   (Required) The end-of-data string.
CloseSource  boolean  (Optional) A flag specifying whether closing the filter should
                      also close its data source. Default value: false.

EODCount must be a nonnegative integer. If it is greater than 0, all input data up
to and including that many occurrences of EODString will be passed through the
filter and made available for reading. If EODCount is 0, the first occurrence of
EODString will be consumed by the filter, but it will not be passed through the
filter.
EODString is ordinarily a string of nonzero length. It is compared with successive
subsequences of the data read from the data source. This comparison is based on
equality of 8-bit character codes, so matching is case-sensitive. Each occurrence
of EODString in the data is counted once. Overlapping instances of EODString will
not be recognized. For example, an EODString of eee will be recognized only once
in the input XeeeeX.
EODString may also be of length 0, in which case the SubFileDecode filter will
simply pass EODCount bytes of arbitrary data. This is dependable only for binary
data, when suitable precautions have been taken to protect the data from any
modification by communication channels or operating systems. Ordinary ASCII
text is subject to modifications such as translation between different end-of-line
conventions, which can change the byte count in unpredictable ways.
A recommended value for EODString is a document structuring comment, such as
%%EndBinary. Including newline characters in EODString is not recommended;
translating the data between different end-of-line conventions could subvert the
string comparisons.
If EODCount is 0 and EODString is of length 0, detection of EOD markers is
disabled; the filter will not reach EOD. This is useful primarily when using
procedures or strings as data sources. EODCount is not allowed to be negative.
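
The EODCount and EODString rules above can be illustrated with a small model. The following Python sketch is not part of the PostScript language and is not how any interpreter implements the filter; the function name subfile_decode is hypothetical, and the model assumes the whole data source is available as a byte string and that matches are found by a simple left-to-right search. It returns the bytes that would be delivered to the reader before EOD.

```python
def subfile_decode(data: bytes, eod_count: int, eod_string: bytes) -> bytes:
    """Return the bytes a SubFileDecode-style filter would deliver before EOD."""
    if eod_count < 0:
        raise ValueError("EODCount must be nonnegative")
    if not eod_string:
        # Zero-length EODString: pass exactly EODCount bytes. When EODCount is
        # also 0, EOD detection is disabled and all data passes through.
        return data if eod_count == 0 else data[:eod_count]
    end = 0    # position just past the most recent match
    hit = -1
    for _ in range(max(eod_count, 1)):  # EODCount 0 still stops at the first match
        hit = data.find(eod_string, end)
        if hit < 0:
            return data  # assumption: the source ran out before EOD was recognized
        end = hit + len(eod_string)  # resume after the match, so no overlaps
    # EODCount 0: the marker is consumed but not delivered to the reader.
    return data[:hit] if eod_count == 0 else data[:end]
```

With the input XeeeeX and an EODString of eee, the model finds one match only, so a call with EODCount 2 never reaches EOD and returns the whole input, in line with the overlap rule described above.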

ReusableStreamDecode Filter
source /ReusableStreamDecode filter
source dictionary /ReusableStreamDecode filter
Certain PostScript features require that large blocks of data be available, in their
entirety, for use one or more times during the invocation of those features.
Examples of such data blocks include:
• Data for a sampled function (see Section 3.10, “Functions”)
• Image data or encapsulated PostScript (EPS) referenced from the PaintProc
  procedure of a form dictionary (see Section 4.7, “Forms”)
• Mesh data for shading dictionaries (see Section 4.9.3, “Shading Patterns”)
Such data can be stored in strings, but only if the amount of data is less than the
implementation limit imposed on string objects. (See Appendix B for
implementation limits.) To overcome this limit, LanguageLevel 3 defines reusable
streams. A reusable stream is a file object produced by a ReusableStreamDecode
filter. Conceptually, this filter consumes all of its source data at the time the filter
operator is invoked and then makes the data available as if it were contained in a
temporary file. The filter file can be positioned as if it were a random-access disk
file; its capacity is limited only by the amount of storage available.
Except for ReusableStreamDecode filters, a decoding filter is an input file that
can be read only once. When it reaches EOF, it is automatically closed, and no
further data can be read from it. No data is read from the filter’s source during the
execution of the filter operator.
In contrast, a ReusableStreamDecode filter is an input file that can be read many
times. When it reaches EOF, it does not automatically close, but merely stays at
EOF. It can be repositioned, when it reaches EOF or at any other time, for further
reading. In some cases, all of the data is read from the filter’s source during the
execution of the filter operator.
A reusable stream has a length, which is the total number of bytes in its data
source. The stream can be positioned anywhere from 0, which denotes the
beginning of the stream, to length, which denotes the EOF.

When data is read from the filter’s source, it may or may not be buffered in
memory or written to a temporary disk file, depending on the type of data source,
the availability of storage, and details of the implementation and system memory
management.
The AsyncRead flag in the filter’s parameter dictionary specifies whether all of the
data should be read from the data source during the execution of the filter
operator (AsyncRead false, the default), or whether this may be postponed until
the data is needed (AsyncRead true). Asynchronous reading may require less
memory or have better performance, but caution is required: attempts to read
from the same data source through a separate stream may produce incorrect
results.
Regardless of the value of AsyncRead, a string or file that is used as the data
source for a reusable stream, as for any other decoding filter, should be
considered read-only until the stream is closed. Writing into such a string or file
will have unpredictable consequences for the data read from the stream.
A reusable stream’s parameter dictionary can also specify additional filters that
are to be applied to the data source before it is read by the
ReusableStreamDecode filter. This has an effect equivalent to supplying the same
filter pipeline as the data source of the ReusableStreamDecode filter. However,
specifying those filters in the ReusableStreamDecode filter dictionary can improve
efficiency by allowing the implementation more flexibility in determining how to
read and buffer the data.
The following operators can be applied to a reusable stream:
• closefile closes the file. This occurs implicitly when the file is reclaimed by the
  restore operator or garbage collection. Closing the file reclaims any temporary
  memory or disk space that was used to buffer the file’s contents.
• fileposition returns the current file position. The result is always in the range 0
  to length.
• setfileposition sets the file position to a value in the range 0 to length.
• resetfile sets the file position to 0.
• flushfile sets the file position to length.
• bytesavailable returns length minus the current file position.
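
The positioning behavior of these operators can be modeled outside PostScript. The Python class below is only a sketch: the class name ReusableStreamModel is hypothetical, the data is assumed to be fully buffered in memory, and an out-of-range position raises a Python ValueError as a stand-in for whatever error the interpreter would report. Its method names mirror the operators listed above.

```python
class ReusableStreamModel:
    """In-memory model of a reusable stream's positioning behavior."""

    def __init__(self, data: bytes):
        self._data = bytes(data)  # assumption: all source data buffered up front
        self._pos = 0

    @property
    def length(self) -> int:
        return len(self._data)

    def read(self, count: int) -> bytes:
        # Reading at EOF returns no data but leaves the stream open.
        chunk = self._data[self._pos:self._pos + count]
        self._pos += len(chunk)
        return chunk

    def fileposition(self) -> int:
        return self._pos

    def setfileposition(self, position: int) -> None:
        if not 0 <= position <= self.length:
            raise ValueError("position must be in the range 0 to length")
        self._pos = position

    def resetfile(self) -> None:
        self._pos = 0

    def flushfile(self) -> None:
        self._pos = self.length

    def bytesavailable(self) -> int:
        return self.length - self._pos

    def closefile(self) -> None:
        # Closing releases the buffered contents.
        self._data = b""
        self._pos = 0
```

Unlike an ordinary decoding filter, the model stays usable after reaching EOF: calling resetfile or setfileposition makes the same data readable again.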
Table 3.24 lists the entries in the ReusableStreamDecode parameter dictionary.

TABLE 3.24 Entries in a ReusableStreamDecode parameter dictionary

KEY          TYPE                 VALUE
Filter       array or name        (Optional) An array of names of decoding filters that are to be
                                  applied before delivering data to the reader. The names must be
                                  specified in the order they should be applied to decode the data.
                                  For example, data encoded using LZW and then ASCII base-85
                                  encoding filters should be decoded with the Filter value
                                  [/ASCII85Decode /LZWDecode]. If only one filter is required, the
                                  value of Filter may be the name of that filter.
DecodeParms  array or dictionary  (Optional) An array of parameter dictionaries used by the
                                  decoding filters that are specified by the Filter parameter, listed
                                  in the same order. If a filter requires no parameters, the
                                  corresponding item in the DecodeParms array must be null. If the
                                  value of Filter is a name, DecodeParms must be the parameter
                                  dictionary for that filter. If no parameters are required for any
                                  of the decoding filters, DecodeParms may be omitted.
                                  Note that the SubFileDecode filter requires a parameter dictionary
                                  with entries for both EODCount and EODString.
                                  All occurrences of CloseSource in the parameter dictionaries are
                                  ignored. When the reusable stream is closed, all the filters are
                                  also closed, independent of the value of CloseSource in the
                                  reusable stream itself. The original source of the reusable stream
                                  is closed only if the value of CloseSource in the reusable stream
                                  is true.
Intent       integer              (Optional) A code representing the intended use of the reusable
                                  stream, which may help optimize the data storage or caching
                                  strategy. If the value is omitted or is not one of the following
                                  values, the default value of 0 is used.
                                  0   Image data
                                  1   Image mask data
                                  2   Sequentially accessed lookup table data (such as a threshold
                                      array)
                                  3   Randomly accessed lookup table data (such as the table of
                                      values for a sampled function)
AsyncRead    boolean              (Optional) A flag that controls when data from the source is to be
                                  read. If false, all the data from the source is read when the
                                  filter is created. If true, data from the source may or may not be
                                  read when the filter is created; reading may be postponed until
                                  the data is needed. Any operation on the filter may cause all of
                                  the data to be read. Default value: false.
CloseSource  boolean              (Optional) A flag specifying whether closing the filter should also
                                  close its data source. Default value: false.

NullEncode Filter
target /NullEncode filter
target dictionary /NullEncode filter
The NullEncode filter is an encoding filter that performs no data transformation;
its output is always identical to its input. The purpose of this filter is to allow an
arbitrary data target (procedure or string) to be treated as an output file, as
described in Section 3.13.1, “Data Sources and Targets.” Note that there is no
NullDecode filter as such, because the SubFileDecode filter can be configured to
serve that function.
The parameter dictionary can be used to specify the CloseTarget parameter
(LanguageLevel 3).
3.14 Binary Encoding Details
In LanguageLevels 2 and 3, the scanner recognizes two encoded forms of the
PostScript language in addition to ASCII. These are binary token encoding and
binary object sequence encoding. All three encoding formats can be mixed in any
program.
The binary token encoding represents elements of the PostScript language as
individual syntactic entities. This encoding emphasizes compactness over
efficiency of generation or interpretation. Still, the binary token encoding is
usually more efficient than ASCII. Most elements of the language, such as integers,
real numbers, and operator names, are represented by fewer characters in the
binary encoding than in the ASCII encoding. Binary encoding is most suitable for
environments in which communication bandwidth or storage space is the scarce
resource.
The binary object sequence encoding represents a sequence of one or more
PostScript objects as a single syntactic entity. This encoding is not compact, but it
can be generated and interpreted very efficiently. In this encoding, most elements
of the language are in a natural machine representation or something very close to
one. Also, this encoding is oriented toward sending fully or partially precompiled
sequences of objects, as opposed to sequences generated “on the fly.” Binary
object sequence encoding is most suitable for environments in which execution
costs dominate communication costs.

Use of the binary encodings requires that the communication channel between
the application and the PostScript interpreter be fully transparent. That is, the
channel must be able to carry an arbitrary sequence of 8-bit character codes, with
no characters reserved for communications functions, no “line” or “record”
length restrictions, and so on. If the communication channel is not transparent,
an application must use the ASCII encoding. Alternatively, it can make use of the
filters that encode binary data as ASCII text (see Section 3.13, “Filtered Files
Details”).
The various language encodings apply only to characters the PostScript language
scanner consumes. Applying exec to an executable file or string object invokes the
scanner, as does the token operator. File operators such as read and readstring,
however, read the incoming sequence of characters as data, not as encoded
PostScript programs.
The first character of each token determines what encoding is to be used for that
token. If the character code is in the range 128 to 159 (that is, one of the first 32
codes with the high-order bit set), one of the binary encodings is used. For binary
encodings, the character code is treated as a token type: it determines which
encoding is used and sometimes also specifies the type and representation of the
token.
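
The range test described above reduces to a check on the top three bits of the character code. The Python sketch below only illustrates that observation; the function name starts_binary_token is hypothetical, and the check applies to the first character of a token, not to bytes inside one.

```python
def starts_binary_token(code: int) -> bool:
    """True if a character code in 0..255 selects one of the binary encodings."""
    if not 0 <= code <= 255:
        raise ValueError("character codes are 8-bit values")
    # 128..159 is exactly the set of codes whose top three bits are 100.
    return (code & 0xE0) == 0x80
```

A scanner written this way would consult the test once per token; the bytes that follow are then interpreted according to the selected encoding.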
Note: The codes 128 to 159 are control characters in most standard character sets,
such as ISO and JIS; they do not have glyphs assigned to them and are unlikely to be
used to construct names in PostScript programs. Interpretation of binary encodings
can be disabled; see the setobjectformat operator in Chapter 8.
Characters following the token type character are interpreted according to the
same encoding until the end of the token is reached, regardless of character codes.
A character code outside the range 128 to 159 can appear within a multiple-byte
binary encoding. A character code in the range 128 to 159 can appear within an
ASCII string literal or a comment. However, a binary token type character
terminates a preceding ASCII name or number token.
In the following descriptions, the term byte is synonymous with character but
emphasizes that the information represents binary data instead of ASCII text.

3.14.1 Binary Tokens
Binary tokens are variable-length binary encodings of certain types of PostScript
objects. A binary token represents an object that can also be represented in the
ASCII encoding, but it can usually represent the object with fewer characters. The
binary token encoding is usually the most compact representation of a program.
Semantically, a binary token is equivalent to some corresponding ASCII token.
When the scanner encounters the binary encoding for the integer 123, it
produces the same result as when it encounters an ASCII token consisting of the
characters 1, 2, and 3. That is, it produces an integer object whose value is 123;
the object is the same and occupies the same amount of space if stored in VM,
whether it came from a binary or an ASCII token.
Unlike the ASCII and binary object sequence encodings, the binary token
encoding is incomplete; not everything in the language can be expressed as a binary
token. For example, it does not make sense to have binary token encodings of
{ and }, because their ASCII encodings are already compact. It also does not make
sense to have binary encodings for the names of operators that are rarely used,
because their contribution to the overall length of a PostScript program is
negligible. The incompleteness of the binary token encoding is not a problem,
because ASCII and binary tokens can be mixed.
The binary token encoding is summarized in Table 3.25. A binary token begins
with a token type byte. A majority of the token types (132 to 149) are used for
binary tokens; the remainder are used for binary object sequences or are
unassigned. The token type determines how many additional bytes constitute the
token and how the token is interpreted.
TABLE 3.25 Binary token interpretation

TOKEN TYPE(S)  ADDITIONAL BYTES  INTERPRETATION
128–131        -                 Binary object sequence (see Section 3.14.2, “Binary Object
                                 Sequences”).
132            4                 32-bit integer, high-order byte first.
133            4                 32-bit integer, low-order byte first.
134            2                 16-bit integer, high-order byte first.
135            2                 16-bit integer, low-order byte first.
136            1                 8-bit integer, treating the byte after the token type as a signed
                                 number n; −128 ≤ n ≤ 127.
137            3 or 5            16- or 32-bit fixed-point number. The number representation
                                 (size, byte order, and scale) is encoded in the byte immediately
                                 following the token type; the remaining 2 or 4 bytes constitute
                                 the number itself. The representation parameter is treated as an
                                 unsigned integer r:
                                 0 ≤ r ≤ 31      32-bit fixed-point number, high-order byte first.
                                                 scale (the number of bits of fraction) is equal
                                                 to r.
                                 32 ≤ r ≤ 47     16-bit fixed-point number, high-order byte first;
                                                 scale equals r − 32.
                                 128 ≤ r ≤ 175   Same as r − 128, except that all numbers are
                                                 given low-order byte first.
138            4                 32-bit IEEE standard real, high-order byte first.
139            4                 32-bit IEEE standard real, low-order byte first.
140            4                 32-bit native real.
141            1                 Boolean. The byte following the token type gives the value 0 for
                                 false, 1 for true.
142            1 + n             String of length n. The parameter n is in the byte following the
                                 token type; 0 ≤ n ≤ 255. The n characters of the string follow the
                                 parameter.
143            2 + n             Long string of length n. The 16-bit parameter n is contained in
                                 the two bytes following the token type, represented high-order
                                 byte first; 0 ≤ n ≤ 65,535. The n bytes of the string follow the
                                 parameter.
144            2 + n             Same as 143 except that n is encoded low-order byte first.
145 or 146     1                 Literal (145) or executable (146) encoded system name. The system
                                 name index (in the range 0 to 255) is contained in the byte
                                 following the token type. This is described in detail in
                                 Section 3.14.3, “Encoded System Names.”
147–148        -                 Reserved (Display PostScript extension).
149            3 + data          Homogeneous number array, which consists of a 4-byte header,
                                 including the token type, followed by a variable-length array of
                                 numbers whose size and representation are specified in the
                                 header. The header is described in detail below.
150–159        -                 Unassigned. Occurrence of a token with any of these types will
                                 cause a syntaxerror.
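
As an illustration of Table 3.25, the following Python sketch produces a few binary tokens in their high-order-byte-first forms. It is not taken from any Adobe software; the function names are hypothetical, and it assumes that integers are stored in twos-complement form and that struct's big-endian float format matches the IEEE standard real (Section 3.14.4 gives the authoritative number representations).

```python
import struct


def encode_integer(value: int) -> bytes:
    """Pick the shortest integer token: type 136, 134, or 132."""
    if -128 <= value <= 127:
        return struct.pack(">Bb", 136, value)
    if -32768 <= value <= 32767:
        return struct.pack(">Bh", 134, value)
    if -2**31 <= value <= 2**31 - 1:
        return struct.pack(">Bi", 132, value)
    raise ValueError("value does not fit in a 32-bit integer token")


def encode_real(value: float) -> bytes:
    """Type 138: 32-bit IEEE real, high-order byte first."""
    return struct.pack(">Bf", 138, value)


def encode_boolean(flag: bool) -> bytes:
    """Type 141: one byte, 0 for false and 1 for true."""
    return bytes([141, 1 if flag else 0])


def encode_string(text: bytes) -> bytes:
    """Type 142 for up to 255 bytes, type 143 for up to 65,535 bytes."""
    if len(text) <= 255:
        return bytes([142, len(text)]) + text
    if len(text) <= 65535:
        return struct.pack(">BH", 143, len(text)) + text
    raise ValueError("string too long for a binary string token")
```

Under these assumptions the integer 123 occupies two bytes (136 followed by 123), compared with three characters plus a delimiter in ASCII.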

The encodings for integer, real, and boolean objects are straightforward; they are
explained in Section 3.14.4, “Number Representations.” The other token types
require additional discussion.
Fixed-Point Numbers
A fixed-point number is a binary number having integer and fractional parts. The
position of the binary point is specified by a separate scale value. In a fixed-point
number of n bits, the high-order bit is the sign, the next n − scale − 1 bits are the
integer part, and the low-order scale bits are the fractional part. For example, if
the number is 16 bits wide and scale is 5, it is interpreted as a sign, a 10-bit integer
part, and a 5-bit fractional part. A negative number is represented in
twos-complement form.
There are both 16- and 32-bit fixed-point numbers, enabling an application to
make a tradeoff between compactness and precision. Regardless of the token’s
length, the object produced by the scanner for a fixed-point number is an integer
if scale is 0; otherwise it is a real number. A 32-bit fixed-point number takes more
bytes to represent than a 32-bit real number. It is useful only if the application
already represents numbers that way. Using this representation makes somewhat
more sense in homogeneous number arrays, described below.
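
The interpretation of a fixed-point token (type 137) can be sketched in Python as follows. The function name decode_fixed_point_token is hypothetical and the code is only a model of the rules above, not an excerpt from an interpreter. It returns a Python int when scale is 0 and a float otherwise, mirroring the integer/real distinction.

```python
import struct


def decode_fixed_point_token(token: bytes):
    """Decode a complete type 137 token into an int (scale 0) or a float."""
    if len(token) < 2 or token[0] != 137:
        raise ValueError("not a fixed-point token")
    r = token[1]
    low_first = r >= 128           # r >= 128 means low-order byte first
    if low_first:
        r -= 128
    if 0 <= r <= 31:
        size, scale, code = 4, r, "i"        # 32-bit, scale = r
    elif 32 <= r <= 47:
        size, scale, code = 2, r - 32, "h"   # 16-bit, scale = r - 32
    else:
        raise ValueError("undefined representation parameter")
    if len(token) != 2 + size:
        raise ValueError("wrong number of additional bytes")
    order = "<" if low_first else ">"
    (raw,) = struct.unpack(order + code, token[2:])  # twos-complement value
    # Dividing by 2**scale places the binary point above the fraction bits.
    return raw if scale == 0 else raw / (1 << scale)
```

For the 16-bit, scale 5 example in the text, the bit pattern 0x0030 (decimal 48) decodes to 48 / 32 = 1.5.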
String Tokens
A string token specifies the string’s length as a 1- or 2-byte unsigned integer. The
specified number of characters of the string follow immediately. All characters are
treated literally. There is no special treatment of \ (backslash) or other characters.
Encoded System Names
An encoded system name token selects a name object from the system name table
and uses it as either a literal or an executable name. This mechanism is described
in Section 3.14.3, “Encoded System Names.”

Homogeneous Number Arrays
A homogeneous number array is a single binary token that represents a literal array
object whose elements are all numbers. Figure 3.2 illustrates the organization of
the homogeneous number array.
[Figure: a 4-byte header (token type 149, representation, and array length in
number of elements) followed by the array of numbers (2 or 4 bytes each, all the
same size). The number representations are shown high byte first and low byte
first for the 2-byte integer/fixed, 4-byte integer/fixed, and IEEE real formats.
Note: First byte is at top in all diagrams.]
FIGURE 3.2 Homogeneous number array
The token consists of a 4-byte header, including the token type, followed by an
arbitrarily long sequence of numbers. All of the numbers are represented in the
same way, which is specified in the header. The header consists of the token type
byte (149, denoting a homogeneous number array), a byte that describes the
number representation, and two bytes that specify the array length (number of
elements). The number representation is treated as an unsigned integer r in the
range 0 to 255 and is interpreted as shown in Table 3.26.
TABLE 3.26 Number representation in header for a homogeneous number array

REPRESENTATION   INTERPRETATION
0 ≤ r ≤ 31       32-bit fixed-point number, high-order byte first. scale (the number
                 of bits of fraction) is equal to r.
32 ≤ r ≤ 47      16-bit fixed-point number, high-order byte first. scale equals r − 32.
48               32-bit IEEE standard real, high-order byte first.
49               32-bit native real.
128 ≤ r ≤ 177    Same as r − 128, except that all numbers are given low-order byte
                 first.

This interpretation is similar to that of the representation parameter r in
individual fixed-point number tokens.
The array length is given by the last two bytes of the header, treated as an
unsigned 16-bit number n. The byte order in this field is specified by the number
representation: r < 128 indicates high-order byte first; r ≥ 128 indicates low-order
byte first. Following the header are 2 × n or 4 × n bytes, depending on
representation, that encode successive numbers of the array.
When the homogeneous number array is consumed by the PostScript language
scanner, the scanner produces a literal array object. The elements of this array are
all integers if the representation parameter r is 0, 32, 128, or 160, specifying fixed-
point numbers with a scale of 0. Otherwise, they are all real numbers. Once
scanned, such an array is indistinguishable from an array produced by other
means and occupies the same amount of space.
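
A homogeneous number array of 16-bit fixed-point numbers can be assembled as in the following Python sketch. The function name encode_hna_fixed16 and its parameters are hypothetical; the sketch covers only the 16-bit representations (r from 32 to 47, or 160 to 175 for low-order byte first) and rounds each value to the nearest representable number, which is one possible choice rather than a rule stated in the text.

```python
import struct


def encode_hna_fixed16(values, scale=0, low_first=False) -> bytes:
    """Build a type 149 token holding 16-bit fixed-point numbers."""
    if not 0 <= scale <= 15:
        raise ValueError("scale for 16-bit numbers must be in 0..15")
    if len(values) > 65535:
        raise ValueError("array length must fit in 16 bits")
    r = 32 + scale + (128 if low_first else 0)
    order = "<" if low_first else ">"
    # Scale each value so the low-order `scale` bits hold the fraction.
    raws = [round(v * (1 << scale)) for v in values]
    # Header: token type, representation, 16-bit length in the same byte order.
    header = struct.pack(order + "BBH", 149, r, len(raws))
    return header + struct.pack(f"{order}{len(raws)}h", *raws)
```

With scale 0 the scanner would deliver integers; with any other scale it would deliver reals, as described above.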
Although the homogeneous number array representation is useful in its own
right, it is particularly useful with operators that take an encoded number string
as an operand. This is described in Section 3.14.5, “Encoded Number Strings.”

3.14.2 Binary Object Sequences
A binary object sequence is a single token that describes an executable array of
objects, each of which may be a simple object, a string, or another array nested to
arbitrary depth. The entire sequence can be constructed, transmitted, and
scanned as a single, self-contained syntactic entity.
Semantically, a binary object sequence is an ordinary executable array, as if the
objects in the sequence were surrounded by { and }, but with one important
difference: its execution is immediate instead of deferred. That is, when the
PostScript interpreter encounters a binary object sequence in a file being executed
directly, the interpreter performs an implicit exec operation instead of pushing
the array on the operand stack, as it ordinarily would do. This special treatment
does not apply when a binary object sequence appears in a context where
execution is already deferred (for example, nested in ASCII-encoded { and } or
consumed by the token operator).
Because a binary object sequence is syntactically a single token, the scanner
processes it completely before the interpreter executes it. The VM allocation mode
in effect at the time the binary object sequence is scanned determines whether the
entire array and all of its composite objects are allocated in local or in global VM.
The encoding emphasizes ease of construction and interpretation over
compactness. Each object is represented by 8 successive bytes. In the case of simple
objects, these 8 bytes describe the entire object: type, attributes, and value. In the
case of composite objects, the 8 bytes include a reference to some other part of
the binary object sequence where the value of the object resides. The entire
structure is easy to describe using the data type definition facilities of
implementation languages, such as C and Pascal.
Figure 3.3 shows the organization of the binary object sequence.

[Figure: the header is either a normal header (4 bytes: token type, top-level
array length in number of objects, overall length in bytes) or an extended header
(8 bytes: token type, 0, top-level array length in number of objects, overall
length in bytes). It is followed by the top-level array of objects (8 bytes each),
the subsidiary arrays of objects (8 bytes each), and the string values (variable
length). Each 8-byte object holds a literal/executable flag (0 = literal,
1 = executable) and type, a 0 byte, a length, and a value.
Note: First byte is at top in all diagrams.]
FIGURE 3.3 Binary object sequence
A binary object sequence consists of four parts, in the following order:
1. Header. 4 or 8 bytes of information about the binary object sequence as a
   whole
2. Top-level array. A sequence of objects, 8 bytes each, which constitute the value
   of the main array object
3. Subsidiary arrays. More 8-byte objects, which constitute the values of nested
   array objects
4. String values. An unstructured sequence of bytes, which constitute the values
   of string objects and the text of name objects
The first byte of the header is the token type, mentioned earlier. Four token types
denote a binary object sequence and select a number representation for all
integers and real numbers embedded within it (see Section 3.14.4, “Number
Representations”):
128   High-order byte first, IEEE standard real format
129   Low-order byte first, IEEE standard real format
130   High-order byte first, native real format
131   Low-order byte first, native real format
There are two forms of header, normal and extended, as shown in Figure 3.3. The
normal header can describe a binary object sequence that has no more than 255
top-level objects and 65,535 bytes overall. The extended header is required for
sequences that exceed these limits.
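
The choice between the two header forms can be sketched in Python. The field widths used here are read from Figure 3.3 and should be treated as assumptions of this sketch: a normal header with a 1-byte token type, a 1-byte top-level array length, and a 2-byte overall length; an extended header with a 1-byte token type, a zero byte, a 2-byte top-level array length, and a 4-byte overall length. The sketch also assumes that the overall length counts the header itself and that a sequence with no top-level objects needs the extended form, because a zero in the second byte is what marks that form. The function name build_header is hypothetical.

```python
import struct


def build_header(token_type: int, top_level_count: int, body_length: int) -> bytes:
    """Return a normal (4-byte) or extended (8-byte) header for a sequence.

    body_length is the number of bytes that follow the header.
    """
    if token_type not in (128, 129, 130, 131):
        raise ValueError("not a binary object sequence token type")
    if not 0 <= top_level_count <= 65535:
        raise ValueError("top-level array length must fit in 16 bits")
    # Types 129 and 131 select low-order byte first for all multi-byte fields.
    order = "<" if token_type in (129, 131) else ">"
    if 1 <= top_level_count <= 255 and 4 + body_length <= 65535:
        return struct.pack(order + "BBH", token_type, top_level_count,
                           4 + body_length)
    return struct.pack(order + "BBHI", token_type, 0, top_level_count,
                       8 + body_length)
```

An encoder would normally compute body_length only after laying out all objects and string values, so the header is typically the last part to be written.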
Following the header is an uninterrupted sequence of 8-byte objects that
constitute the top-level array and subsidiary arrays. The length of this sequence is
not explicit. It continues until the earliest string value referenced from an object in
the sequence, or until the end of the entire token.
The first byte of each object in the sequence gives the object’s literal/executable
attribute in the high-order bit and its type in the low-order 7 bits. The attribute
values are:
0   Literal
1   Executable
The meaning of the type field is given in Table 3.27.
The second byte of an object is unused; its value must be 0. The third and fourth
bytes constitute the object’s length field; the fifth through eighth bytes constitute
its value field. The interpretation of the length and value fields depends on the
object’s type and is given in Table 3.27. Again, the byte order within these fields is
determined by the number representation for the binary object sequence overall.

TABLE 3.27 Object type, length, and value fields

TYPE CODE  OBJECT TYPE                 LENGTH FIELD                     VALUE FIELD
0          null                        Unused                           Unused
1          integer                     Unused                           32-bit signed integer
2          real                        Selects representation of value  Floating- or fixed-point number
3          name                        Selects interpretation of value  Offset or index
4          boolean                     Unused                           0 for false, 1 for true
5          string                      Number of elements               Offset of first element
6          immediately evaluated name  Selects interpretation of value  Offset or index
9          array                       Number of elements               Offset of first element
10         mark                        Unused                           Unused

For a real number, the length field selects the representation of the number in the
value field: if the length n is 0, the value is a floating-point number; otherwise, the
value is a fixed-point number, using n as its scale factor (see Section 3.14.1,
“Binary Tokens”).
For strings and arrays, the length field specifies the number of elements
(characters in a string or objects in an array). It is treated as a 16-bit unsigned
integer. The value field specifies the offset, in bytes, of the start of the object’s
value relative to the first byte of the first object in the top-level array. An array
offset must refer somewhere within the top-level or subsidiary arrays; it must be a
multiple of 8. A string offset must refer somewhere within the string values. The
strings have no alignment requirement and need not be null-terminated or
otherwise delimited. If the length of a string or array object is 0, its value is
disregarded.

For name objects, the length field is treated as a 16-bit signed integer n that
selects one of three interpretations of the value field:
n > 0    The value is an offset to the text of the name, just as with a string. n is
         the name’s length, which must be within the implementation limit for
         names.
n = 0    Reserved (Display PostScript extension).
n = −1   The value is a system name index (see Section 3.14.3, “Encoded System
         Names”).
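
The object layout described above can be tied together with a small encoder. The Python sketch below builds a flat binary object sequence (token type 128, no subsidiary arrays) from integers, strings, and executable names whose text is stored among the string values. All function and constant names are hypothetical. The normal header layout (1-byte token type, 1-byte top-level array length, 2-byte overall length that includes the header) is an assumption read from Figure 3.3, and only the n > 0 form of name objects is produced.

```python
import struct

LITERAL, EXECUTABLE = 0x00, 0x80          # high-order bit of an object's first byte
TYPE_INTEGER, TYPE_NAME, TYPE_STRING = 1, 3, 5


def pack_object(type_code: int, attribute: int, length: int, value: int) -> bytes:
    """Pack one 8-byte object: type/attribute, unused 0, length, value."""
    return struct.pack(">BBHi", attribute | type_code, 0, length, value)


def build_sequence(items) -> bytes:
    """Encode ints, bytes (strings), and str (executable names) as one token."""
    objects_size = 8 * len(items)  # string values start right after the objects
    objects, strings = [], b""
    for item in items:
        if isinstance(item, int):
            objects.append(pack_object(TYPE_INTEGER, LITERAL, 0, item))
            continue
        is_name = isinstance(item, str)
        text = item.encode("ascii") if is_name else item
        if is_name and not text:
            raise ValueError("a name length of 0 is reserved")
        # Offsets are relative to the first byte of the first top-level object.
        offset = objects_size + len(strings)
        type_code = TYPE_NAME if is_name else TYPE_STRING
        attribute = EXECUTABLE if is_name else LITERAL
        objects.append(pack_object(type_code, attribute, len(text), offset))
        strings += text
    body = b"".join(objects) + strings
    if not 1 <= len(items) <= 255 or 4 + len(body) > 65535:
        raise ValueError("sequence needs an extended header (not modeled here)")
    header = struct.pack(">BBH", 128, len(items), 4 + len(body))
    return header + body
```

For example, build_sequence([7, b"hi", "show"]) yields a 34-byte token: a 4-byte header, three 8-byte objects, and the 6 bytes hishow, with the string and name objects pointing at offsets 24 and 26 respectively.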
An immediately evaluated name object corresponds to the //name syntax of the
ASCII encoding (see Section 3.12.2, “Immediately Evaluated Names”). Aside
from the type code, its representation is the same as a name. However, with an
immediately evaluated name object, the scanner immediately looks up the name
in the environment of the current dictionary stack and substitutes the
corresponding value for that name. If the name is not found, an undefined error
occurs.
For the composite objects, there are no enforced restrictions against multiple ref-
erences to the same value or to recursive or self-referential arrays. However, such
structures cannot be expressed directly in the ASCII or binary token encodings of
the language; their use violates the interchangeability of the encodings. The rec-
ommended structure of a binary object sequence is for each composite object to
refer to a distinct value. There is one exception: references from multiple name
objects to the same string value are encouraged, because name objects are unique
by definition.
The scanner generates a syntaxerror when it encounters a binary object sequence
that is malformed in any way. Possible causes include:
• An object type that is undefined
• An “unused” field that is not 0
• Lengths and offsets that, combined, would refer outside the bounds of the
  binary object sequence
• An array offset that is not a multiple of 8 or that refers beyond the earliest
  string offset
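An application that generates binary object sequences might run similar checks before transmitting them. The sketch below is an assumption-laden illustration, not the scanner's actual algorithm: the function find_problems and its parameters are hypothetical, body_size stands for the number of bytes following the header, only byte 1 of the object is treated as an “unused” field, and “refers beyond the earliest string offset” is read as the array's extent passing that offset.

```python
DEFINED_TYPES = {0, 1, 2, 3, 4, 5, 6, 9, 10}
STRING, ARRAY = 5, 9
OBJECT_SIZE = 8  # every object occupies 8 bytes

def find_problems(type_code, unused_byte, length, offset,
                  body_size, earliest_string_offset):
    """Return a list of reasons one object would make the sequence malformed."""
    problems = []
    if type_code not in DEFINED_TYPES:
        problems.append("undefined object type")
    if unused_byte != 0:
        problems.append("unused field is not 0")
    # A length of 0 means the value (offset) is disregarded.
    if length > 0 and type_code == STRING and offset + length > body_size:
        problems.append("string extends outside the sequence")
    if length > 0 and type_code == ARRAY:
        end = offset + OBJECT_SIZE * length
        if offset % OBJECT_SIZE != 0:
            problems.append("array offset is not a multiple of 8")
        if end > body_size:
            problems.append("array extends outside the sequence")
        elif end > earliest_string_offset:
            problems.append("array refers beyond the earliest string offset")
    return problems
```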

When a syntaxerror occurs, the PostScript interpreter pushes the object that
caused the error onto the operand stack. For an error detected by the scanner,
however, there is no such object, because the error occurs before the scanner has
finished creating one. Instead, the scanner fabricates a string object consisting of
the characters encountered so far in the current token. If a binary token or binary
object sequence was being scanned, the string object produced is a description of
the token, such as
(bin obj seq, type=128, elements=23, size=234, array out of bounds)
rather than the literal characters, which would be gibberish if printed as part of
an error message.
3.14.3 Encoded System Names
Both the binary token and binary object sequence encodings provide optional
means for representing certain names as small integers rather than as full text
strings. Such an integer is referred to as a system name index.
A name index is a reference to an element of a name table already known to the
PostScript interpreter. When the scanner encounters a name token that specifies a
name index rather than a text name, it immediately substitutes the corresponding
element of the table. This substitution occurs at scan time, not at execution time.
The result of the substitution is an ordinary PostScript name object.
The system name table contains standard operator names, single-letter names,
and miscellaneous other useful names. The contents of this table are documented
in Appendix F. They are also available as a machine-readable file for use by driv-
ers, translators, and other programs that deal with binary encodings; contact the
Adobe Developers Association.
If there is no name associated with a specified system name index, the scanner
generates an undefined error. The offending command is reported as systemn, that
is, the word system followed directly by n, the decimal representation of the
index.
An encoded system name specifies, as part of the encoding, whether the name is
to be literal or executable. A given element of the system name table can be treat-
ed as either literal or executable when referenced from a binary token or object
sequence.

In the binary object sequence encoding, an immediately evaluated name object
analogous to //name can be specified. When such an object specifies a name in-
dex, there are two substitutions: the first obtains a name object from the table,
and the second looks up that name object in the current dictionary stack. The lit-
eral or executable attribute of the immediately evaluated name object is disre-
garded; it has no influence on the corresponding attribute of the resulting object.
A program can depend on a given system name index representing a particular
name object. Applications that generate binary-encoded PostScript programs are
encouraged to take advantage of encoded system names, because they save both
space and time.
Note: The binary token encoding can reference only the first 256 elements of the sys-
tem name table. Therefore, this table is organized so that the most commonly used
names are in the first 256 elements. The binary object sequence encoding does not
have this limitation.
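The 256-element limit follows from the binary token form, which carries the index in a single byte. A minimal sketch of how a generating application might emit such a token is shown below; it assumes the token type codes 145 (literal) and 146 (executable) listed in Section 3.14.1, and the function name system_name_token is hypothetical. The index itself must be taken from Appendix F.

```python
LITERAL_SYSTEM_NAME = 145     # token type assumed from Section 3.14.1
EXECUTABLE_SYSTEM_NAME = 146  # token type assumed from Section 3.14.1

def system_name_token(index, executable=True):
    """Return the 2-byte binary token for a system name index."""
    if not 0 <= index <= 255:
        # Larger indices need a binary object sequence (or the text name).
        raise ValueError("binary tokens can reference only indices 0 to 255")
    token_type = EXECUTABLE_SYSTEM_NAME if executable else LITERAL_SYSTEM_NAME
    return bytes([token_type, index])
```

A generator would typically fall back to the ordinary text form of the name when the index is out of range or unknown.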

3.14.4 Number Representations
Binary tokens and binary object sequences use various representations for num-
bers. Some numbers are the values of number objects (integer and real). Others
provide structural information, such as lengths and offsets within binary object
sequences.
Different machine architectures use different representations for numbers. The
two most common variations are the byte order within multiple-byte integers
and the format of real (floating-point) numbers.
Rather than specify a single convention for representing numbers, the language
provides a choice of representations. The application program chooses whichever
convention is most appropriate for the machine on which it is running. The Post-
Script language scanner accepts numbers conforming to any of the conventions,
translating to its own internal representation when necessary. This translation is
needed only when the application and the PostScript interpreter are running on
machines with different architectures.
The number representation to be used is specified as part of the token type—the
initial character of the binary token or binary object sequence. There are two
independent choices, one for byte order and one for real format. The byte order
choices are:
• High-order byte first in a multiple-byte integer or fixed-point number. The
  high-order byte comes first, followed by successively lower-order bytes.
• Low-order byte first in a multiple-byte integer or fixed-point number. The
  low-order byte comes first, followed by successively higher-order bytes.
The real format choices are:
• IEEE standard. A real number is represented in the 32-bit floating-point format
  defined in the IEEE Standard 754-1985 for Binary Floating-Point Arithmetic.
  The order of bytes is the same as the integer byte order. For example, if the
  high-order byte of an integer comes first, then the byte containing the sign and
  first 7 exponent bits of an IEEE standard real number comes first.
• Native. A real number is represented in the native format for the machine on
  which the PostScript interpreter is running. This may be a standard format or
  something completely different. The choice of byte order is not relevant. The
  application program is responsible for finding out the correct format. In
  general, this choice is useful only in environments where the application and
  the PostScript interpreter are running on the same machine or on machines with
  compatible architectures. PostScript programs that use this real number
  representation are not portable.
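The two byte orders can be illustrated with Python's struct module. This is a sketch from the application's point of view; the helper names are hypothetical, and the mapping of the two choices onto binary object sequence token types 128 to 131 is assumed from Sections 3.14.1 and 3.14.2 rather than stated in this passage.

```python
import struct

def pack_int32(n, low_order_first=False):
    """Encode a 32-bit signed integer in the chosen byte order."""
    return struct.pack("<i" if low_order_first else ">i", n)

def pack_ieee_real(x, low_order_first=False):
    """Encode a 32-bit IEEE real; byte order follows the integer byte order."""
    return struct.pack("<f" if low_order_first else ">f", x)

def object_sequence_token_type(low_order_first, native_real):
    """Assumed mapping: 128 and 129 are IEEE, 130 and 131 are native."""
    return 128 + (1 if low_order_first else 0) + (2 if native_real else 0)

# 1.0 is 0x3F800000: with the high-order byte first, the leading byte 0x3F
# holds the sign bit and the first 7 exponent bits.
high_first = pack_ieee_real(1.0)
low_first = pack_ieee_real(1.0, low_order_first=True)
```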
Because each binary token or binary object sequence specifies its own number
representation, binary encoded programs with different number representations
can be mixed. This is a convenience for applications that obtain portions of Post-
Script programs from different sources.
The ByteOrder and RealFormat system parameters indicate the native byte order
and real number representation of the machine on which the PostScript inter-
preter is running (see Appendix C). An interactive application can query
RealFormat to determine whether the interpreter’s native real number format is
the same as that of the application. If so, translation to and from IEEE format can
be avoided.

3.14.5 Encoded Number Strings
Several operators require as operands an indefinitely long sequence of numbers
to be used as coordinate values, either absolute or relative. The operators include
those dealing with user paths, rectangles, and explicitly positioned text. In the
most common use of these operators, all of the numbers are provided as literal
values by the applications rather than being computed by the PostScript pro-
gram.
To facilitate this common use and to streamline the generation and interpretation
of numeric operand sequences, these operators permit their operands to be pre-
sented in either of two ways:
• As an array object whose elements are numbers to be used successively
• As a string object to be interpreted as an encoded number string
An encoded number string is a string that contains a single homogeneous number
array according to the binary token encoding. That is, the first 4 bytes are
treated as a header. The remaining bytes are treated as a sequence of numbers
encoded as described in the header. (See Figure 3.2 on page 161.)
An encoded number string is a compact representation of a number sequence
both in its external form and in VM. Syntactically, it is simply a string object. It
remains in that form after being scanned and placed in VM. It is interpreted as a
sequence of numbers only when it is used as an operand of an operator that is ex-
pecting a number array. Furthermore, even then it is neither processed by the
scanner nor expanded into an array object; instead, the numbers are consumed
directly by the operator. This arrangement is compact and efficient, particularly
for large number sequences.
Example 3.11 shows equivalent ways of invoking rectfill, which is one of the
LanguageLevel 2 operators that expect number sequences as operands.
Example 3.11
[100 200 40 50] rectfill
<95 20 0004 0064 00c8 0028 0032> rectfill
The first line constructs an ordinary PostScript array object containing the
numbers and passes it to rectfill. This is the most general form, because the
[ and ] could enclose an arbitrary computation that produces the numbers and
pushes them on the stack.
On the second line, a string object appears in the program. When rectfill notices
that it has been given a string object, it interprets the value of the string, expect-
ing to find the binary token encoding of a homogeneous number array.
Example 3.11 does not use encoded number strings to best advantage. In this
example, the encoded number string is written as an ASCII-encoded hexadecimal
string enclosed in < and >. A real application would use a more efficient
encoding, such as a binary string token or an ASCII base-85 string literal. An
ordinary ASCII string enclosed in ( and ) is unsuitable because of the need to
quote special characters.
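The bytes of the string in Example 3.11 can be produced by a few lines of application code. The sketch below covers only one representation: 16-bit integers, written as 16-bit fixed-point numbers with scale 0. It assumes the header layout of Section 3.14.1 (token type 149, a representation byte of 32, plus 128 for low-order byte first, and a 2-byte element count); the function name is hypothetical.

```python
import struct

HOMOGENEOUS_NUMBER_ARRAY = 149  # 0x95, the first byte in Example 3.11

def encode_int16_number_string(numbers, low_order_first=False):
    """Encode integers as an encoded number string of 16-bit values."""
    for n in numbers:
        if not -32768 <= n <= 32767:
            raise ValueError("%r does not fit in 16 bits" % (n,))
    # Representation 32: 16-bit fixed point, scale 0, high-order byte first.
    representation = 32 + (128 if low_order_first else 0)
    order = "<" if low_order_first else ">"
    header = struct.pack(order + "BBH", HOMOGENEOUS_NUMBER_ARRAY,
                         representation, len(numbers))
    body = struct.pack(order + "%dh" % len(numbers), *numbers)
    return header + body

rect = encode_int16_number_string([100, 200, 40, 50])
hex_literal = "<" + rect.hex() + "> rectfill"
```

The same bytes could instead be sent as a binary string token or wrapped as an ASCII base-85 literal (for instance with Python's base64.a85encode and adobe=True), which is the more compact choice suggested above.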
Operators that use encoded number strings include rectfill, rectstroke, rectclip,
xshow, yshow, and xyshow. An encoded user path can represent its numeric op-
erands as an encoded number string; the relevant operators are ufill, ueofill,
ustroke, uappend, inufill, inueofill, and inustroke.
3.14.6 Structured Output
In some environments, a PostScript program can transmit information back to
the application program that generated it. This information includes the values
of objects produced by queries, error messages, and unstructured text generated
by the print operator.
A PostScript program writes all of this data to its standard output file. The appli-
cation requires a way to distinguish among these different kinds of information
received from the PostScript interpreter. To serve this need, the language includes
operators to write output in a structured output format. This format is basically
the same as the binary object sequence representation for input, described in
Section 3.14.2, “Binary Object Sequences.”
A program that writes structured output should take care when using unstructured
output operators, such as print and =. Because the start of a binary object
sequence is indicated by a character whose code is in the range 128 to 131,
unstructured output should consist only of character codes outside that range;
otherwise, the application may mistake unstructured text for the start of a
binary object sequence. Of course, this is only a convention. By prior
arrangement, a program can send arbitrary unstructured data to the application.
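An application can check text against this convention before a program prints it. The following is a minimal sketch with a hypothetical function name.

```python
def collides_with_object_sequences(data):
    """True if any byte could be mistaken for the start of a binary object sequence."""
    return any(128 <= byte <= 131 for byte in data)
```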

The operator printobject writes an object as a binary object sequence to the stan-
dard output file. A similar operator, writeobject, writes to any file. The binary ob-
ject sequence contains a top-level array consisting of one element that is the
object being written (see Section 3.14.2, “Binary Object Sequences”). That object,
however, can be composite, so the binary object sequence may include subsidiary
arrays and strings.
In the binary object sequences produced by printobject and writeobject, the
number representation is controlled by the setobjectformat operator. The binary
object sequence has a token type that identifies the representation used.
Accompanying the top-level object in the object sequence is a 1-byte tag, which is
specified as an operand of printobject and writeobject. This tag is carried in the
second byte of the object, which is otherwise unused (see Figure 3.3 on page 164).
Only the top-level object receives a tag; the second byte of subsidiary objects is 0.
Despite its physical position, the tag is logically associated with the object se-
quence as a whole.
The purpose of the tag is to enable the PostScript program to specify the intended
disposition of the object sequence. A few tag values are reserved for reporting
errors (see below). The remaining tag values may be used arbitrarily.
Tag values 0 through 249 are available for general use. Tag values 250 through 255
are reserved for identifying object sequences that have special significance. Of
these, only tag value 250 is presently defined; it is used to report errors.
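On the receiving side, an application might pull the tag out of each object sequence and dispatch on it. The sketch below is illustrative only: the function names are hypothetical, and the header sizes (4 bytes normally, 8 bytes when the second header byte is 0) are assumed from Figure 3.3 rather than from this passage. The sample bytes are what an integer 42 written with tag 7 might look like with high-order byte first.

```python
def classify_tag(tag):
    """Map a tag value to the ranges described in the text."""
    if not 0 <= tag <= 255:
        raise ValueError("a tag is a single byte")
    if tag <= 249:
        return "general"
    if tag == 250:
        return "error"
    return "reserved"

def read_tag(sequence):
    """Return the tag carried in the second byte of the top-level object."""
    if not 128 <= sequence[0] <= 131:
        raise ValueError("not a binary object sequence")
    # Assumed: a zero in byte 1 selects the extended 8-byte header.
    header_size = 4 if sequence[1] != 0 else 8
    return sequence[header_size + 1]

# Header (4 bytes), then one 8-byte integer object whose second byte is the tag.
sample = bytes.fromhex("8001000c" "0107" "0000" "0000002a")
```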
Errors are initiated as described in Section 3.11, “Errors.” Normally, when an
error occurs, control automatically passes from the PostScript program to a built-
in procedure that catches errors. That procedure invokes handleerror. Subse-
quent behavior depends on the definition of handleerror. The following descrip-
tion applies to the standard definition of handleerror.
If the value of binary in the $error dictionary is true and binary encoding is en-
abled, handleerror writes a binary object sequence with a tag value of 250. But if
binary is false or binary encoding is disabled, handleerror writes a human-
readable text message whose format is product-dependent.

The binary object sequence that reports an error contains a four-element array as
its top-level object. The array elements, ordered as they appear, are:
1. The name Error, which indicates an ordinary error detected by the PostScript
interpreter. A different name could indicate another class of errors, in which
case the meanings of the other array elements might be different.
2. The name that identifies the specific error—for example, typecheck.
3. The object that was being executed when the error occurred. If the object that
raised the error is not printable, some suitable substitute is provided—for ex-
ample, an operator name in place of an operator object.
4. A boolean object (used in the Display PostScript extension), whose normal
value is false.

CHAPTER 4
Graphics
THE POSTSCRIPT GRAPHICS OPERATORS describe the appearance of pages
that are to be reproduced on a raster output device. The facilities described here
are intended for both printer and display applications.
The graphics operators form seven main groups:
• Graphics state operators. These operators manipulate the data structure called
  the graphics state, which is the global framework within which the other
  graphics operators execute.
• Coordinate system and matrix operators. The graphics state includes the current
  transformation matrix (CTM), which maps coordinates specified by the PostScript
  program into output device coordinates. The operators in this group manipulate
  the CTM to achieve any combination of translation, scaling, rotation,
  reflection, and skewing of user coordinates onto device coordinates.
• Path construction operators. The graphics state includes the current path, which
  defines shapes and line trajectories. Path construction operators begin a new
  path, add line segments and curves to the current path, and close the current
  path. All of these operators implicitly reference the CTM parameter in the
  graphics state.
• Painting operators. The operators in this group paint graphical elements, such
  as lines, filled areas, and sampled images, into the raster memory of the output
  device. These operators are controlled by the current path, current color, and
  many other parameters in the graphics state.
• Glyph and font operators. These operators select and paint character glyphs
  from fonts (descriptions of typefaces for representing text characters). Because
  the PostScript language treats glyphs as general graphical shapes, many of the
  font operators should be grouped with the path construction or painting
  operators. However, the data structures and mechanisms for dealing with glyph
  and font descriptions are sufficiently specialized that Chapter 5 focuses on
  them.
• Device setup operators. These operators establish the association between raster
  memory and a physical output device, such as a printer or a display. They are
  discussed in detail in Chapter 6.
• Output operators. Once a page has been completely described, executing an
  output operator transmits the page to the output device.
This chapter presents general information about device-independent graphics in
the PostScript language: how a program describes the abstract appearance of a
page. Rendering—the device-dependent part of graphics—is covered in
Chapter 7.
4.1 Imaging Model
The Adobe imaging model is a simple and unified view of two-dimensional
graphics borrowed from the graphic arts. A PostScript program builds an image
by placing “paint” on a “page” in selected areas.
• The painted figures may be in the form of letter shapes, general filled shapes,
  lines, or digitally sampled representations of photographs.
• The paint may be in color or in black, white, or any shade of gray.
• The paint may take the form of a repeating pattern (LanguageLevel 2) or a
  smooth transition between colors (LanguageLevel 3).
• Any of these elements may be clipped to appear within other shapes as they are
  placed onto the page.
Once a page has been built up to the desired form, it may be transmitted to an
output device.
The PostScript interpreter maintains an implicit current page that accumulates
the marks made by the painting operators. When a program begins, the current
page is completely blank. As each painting operator executes, it places marks on
the current page. Each new mark completely obscures any marks it may overlay
(subject to the effects of the overprint parameter in the graphics state; see
Section 4.8.5). This method is known as a painting model: no matter what color a
mark has—white, black, gray, or color—it is put onto the current page as if it
were applied with opaque paint. Once the page has been completely composed,
invoking the showpage operator renders the accumulated marks on the output
media and then clears the current page to white again.
The principal painting operators (among many others) are as follows:
• fill paints an area.
• stroke paints lines.
• image paints a sampled image.
• show paints glyphs representing character shapes.
The painting operators require various parameters, some explicit and others im-
plicit. Chief among the implicit parameters is the current path used by fill, stroke,
and show. A path consists of a sequence of connected and disconnected points,
lines, and curves that together describe shapes and their positions. It is built up
through the sequential application of the path construction operators, each of
which modifies the current path in some way, usually by appending one new ele-
ment.
Path construction operators include newpath, moveto, lineto, curveto, arc, and
closepath. None of the path construction operators places marks on the current
page; the painting operators do that. Path construction operators create the
shapes that the painting operators paint. Some operators, such as ufill and
ustroke, combine path construction and painting in a single operation for effi-
ciency.
Implicit parameters to the painting operators include the current color, current
line width, current font (typeface and size), and many others. There are operators
that examine and set each implicit parameter in the graphics state. The values
used for implicit parameters are those in effect at the time an operator is invoked.
PostScript programs contain many instances of the following typical sequence of
steps:
1. Build a path using path construction operators.
2. Set any implicit parameters if their values need to change.
3. Perform a painting operation.
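The three steps above can be sketched from the side of an application that generates page descriptions. The Python function below is a hypothetical helper that emits PostScript text for a filled polygon: it builds a path, sets one implicit parameter (the color, using setgray), and then paints with fill.

```python
def filled_polygon(points, gray):
    """Return PostScript text that builds a path, sets the color, and fills it."""
    if len(points) < 3:
        raise ValueError("a polygon needs at least three points")
    (x0, y0), rest = points[0], points[1:]
    lines = ["newpath", "%s %s moveto" % (x0, y0)]         # step 1: build the path
    lines += ["%s %s lineto" % (x, y) for x, y in rest]
    lines.append("closepath")
    lines.append("%s setgray" % gray)                      # step 2: implicit parameter
    lines.append("fill")                                   # step 3: paint
    return "\n".join(lines)

triangle = filled_polygon([(100, 100), (200, 100), (150, 180)], 0.5)
```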

There is one additional implicit element in the Adobe imaging model that modi-
fies this description: the current clipping path outlines the area of the current page
on which paint may be placed. Initially, this path outlines the entire imageable
area of the current page. By using the clip operator, a PostScript program can
shrink the path to any shape desired. Although painting operators may attempt
to place marks anywhere on the current page, only those marks falling within the
current clipping path will affect the page; those falling outside it will not.
4.2 Graphics State
The PostScript interpreter maintains an internal data structure called the
graphics state that holds current graphics control parameters. These parameters
define the
global framework within which the graphics operators execute. For example, the
stroke operator implicitly uses the current line width parameter from the graphics
state, and the fill operator implicitly uses the current color parameter.
Most graphics state parameters are ordinary PostScript objects that can be read
and altered by the appropriate graphics state operators. For example, the opera-
tor setlinewidth changes the current line width parameter, and currentlinewidth
reads that parameter from the graphics state. In general, the operators that set
graphics state parameters simply store them unchanged for later use by other
graphics operators. However, certain parameters have special properties or be-
havior:
• Most parameters must be of the correct type or have values that fall into a
  certain range.
• Parameters that are numeric values, such as color, line width, and miter limit,
  are forced into legal range, if necessary, and stored as real numbers. If they
  are later read out, they are always real, regardless of how they were originally
  specified. However, they are not adjusted to reflect capabilities of the raster
  output device, such as resolution or number of distinguishable colors. Graphics
  rendering operators perform such adjustments, but the adjusted values are not
  stored back into the graphics state.
• Certain parameters are composite objects, such as arrays or dictionaries.
  Graphics operators consult the values of these objects at unpredictable times
  and may cache them for later use, so altering them can have unpredictable
  results. A PostScript program should treat the values of graphics state
  parameters (including those in saved graphics states) as if they were read-only.
• The current path, clipping path, and device parameters are internal objects that
  are not directly accessible to a PostScript program.
Table 4.1 lists those graphics state parameters that are device-independent and
are appropriate to specify in page descriptions. The parameters listed in Table 4.2
control details of the rendering (scan conversion) process and are device-
dependent. A page description that is intended to be device-independent should
not alter these parameters.
TABLE 4.1 Device-independent parameters of the graphics state

PARAMETER            TYPE         VALUE

CTM                  array        The current transformation matrix, which maps positions from
                                  user coordinates to device coordinates. This matrix is
                                  modified by each application of the coordinate system
                                  operators. Initial value: a matrix that transforms default
                                  user coordinates to device coordinates.
position             two numbers  The coordinates of the current point in user space, the last
                                  element of the current path. Initial value: undefined.
path                 (internal)   The current path as built up by the path construction
                                  operators. Used as an implicit argument by operators such as
                                  fill, stroke, and clip. Initial value: empty.
clipping path        (internal)   A path defining the current boundary against which all
                                  output is to be cropped. Initial value: the boundary of the
                                  entire imageable portion of the output page.
clipping path stack  (internal)   (LanguageLevel 3) A stack holding clipping paths that have
                                  been saved with the clipsave operator and not yet restored
                                  with cliprestore.
color space          array        (LanguageLevel 2) The color space in which color values are
                                  to be interpreted. Initial value: DeviceGray.
color                (various)    The color to use during painting operations. The type and
                                  interpretation of this parameter depend on the current color
                                  space. For most color spaces, a color value consists of one
                                  to four numbers. Initial value: black.
font                 dictionary   The set of graphic shapes (glyphs) that represent characters
                                  in the current typeface. Initial value: an invalid font
                                  dictionary.
line width           number       The thickness (in user coordinate units) of lines to be
                                  drawn by the stroke operator. Initial value: 1.0.
line cap             integer      A code that specifies the shape of the endpoints of any open
                                  path that is stroked. Initial value: 0 for a square butt end.
line join            integer      A code that specifies the shape of joints between connected
                                  segments of a stroked line. Initial value: 0 for mitered
                                  joins.
miter limit          number       The maximum length of mitered line joins for the stroke
                                  operator. This limits the length of “spikes” produced when
                                  line segments join at sharp angles. Initial value: 10.0 for
                                  a miter cutoff below 11 degrees.
dash pattern         array and    A description of the dash pattern to be used when lines are
                     number       painted by the stroke operator. Initial value: a normal
                                  solid line.
stroke adjustment    boolean      (LanguageLevel 2) A flag that specifies whether to
                                  compensate for resolution effects that may be noticeable
                                  when line thickness is a small number of device pixels.
                                  Initial value: false.
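The relation between the default miter limit of 10.0 and the cutoff angle of about 11 degrees quoted in Table 4.1 can be checked numerically. The sketch below assumes the relation given with the setmiterlimit operator, in which the ratio of miter length to line width equals 1/sin(φ/2) for segments meeting at angle φ; the function name is hypothetical.

```python
import math

def miter_cutoff_angle(miter_limit):
    """Angle in degrees below which a mitered join would exceed the limit."""
    if miter_limit < 1:
        raise ValueError("a miter limit is at least 1")
    # Solve miter_limit = 1 / sin(angle / 2) for the angle.
    return math.degrees(2 * math.asin(1 / miter_limit))

default_cutoff = miter_cutoff_angle(10.0)  # roughly 11.5 degrees
```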
TABLE 4.2 Device-dependent parameters of the graphics state

PARAMETER            TYPE         VALUE

color rendering      dictionary   (LanguageLevel 2) A collection of parameters that determine
                                  how to transform CIE-based color specifications to device
                                  color values. Initial value: installation-dependent.
overprint            boolean      (LanguageLevel 2) A flag that specifies (on output devices
                                  that support the overprint control feature) whether painting
                                  in one set of colorants causes the corresponding areas of
                                  other colorants to be erased (false) or left unchanged
                                  (true). Initial value: false.
black generation     procedure    (LanguageLevel 2) A procedure that calculates the amount of
                                  black to use when converting RGB colors to CMYK. Initial
                                  value: installation-dependent.
undercolor removal   procedure    (LanguageLevel 2) A procedure that calculates the reduction
                                  in the amount of cyan, magenta, and yellow components to
                                  compensate for the amount of black added by black
                                  generation. Initial value: installation-dependent.
transfer             procedure    A transfer function that adjusts device gray or color
                                  component values to correct for nonlinear response in a
                                  particular device. Support for four transfer functions is a
                                  LanguageLevel 2 feature. Initial value:
                                  installation-dependent.
halftone             (various)    A halftone screen for gray and color rendering, specified
                                  either as frequency, angle, and spot function or as a
                                  halftone dictionary. Halftone dictionaries, as well as
                                  support for four halftone screens, are LanguageLevel 2
                                  features. Initial value: installation-dependent.
flatness             number       The precision with which curves are to be rendered on the
                                  output device. This number gives the maximum error
                                  tolerance, measured in output device pixels. Smaller numbers
                                  give smoother curves at the expense of more computation and
                                  memory use. Initial value: 1.0.
smoothness           number       (LanguageLevel 3) The precision with which color gradients
                                  are to be rendered on the output device. This number gives
                                  the maximum error tolerance between a shading approximated
                                  by piecewise linear interpolation and the true value of a
                                  (possibly nonlinear) shading function, expressed as a
                                  fraction of the range of each color component. Smaller
                                  numbers give smoother color transitions at the expense of
                                  more computation and memory use. Initial value:
                                  installation-dependent.
device               (internal)   An internal data structure representing the current output
                                  device. Initial value: installation-dependent.
Although it contains many objects, the graphics state is not itself a PostScript ob-
ject and cannot be accessed directly from within a PostScript program. However,
there are two mechanisms for saving and later restoring the entire graphics state.
One is the graphics state stack, managed by the following operators:
• gsave pushes a copy of the entire graphics state onto the stack.
• grestore restores the entire graphics state to its former value by popping it
  from the stack.
The graphics state stack, with its LIFO (last in, first out) organization, serves the
needs of PostScript programs that are page descriptions. A well-structured docu-
ment typically contains many graphical elements that are essentially independent
of each other and sometimes nested to multiple levels. The gsave and grestore
operators can be used to encapsulate these elements so that they can make local
changes to the graphics state without disturbing the graphics state of the sur-
rounding environment.
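The nesting behavior described above can be illustrated with a toy model. The Python class below is not how an interpreter is implemented; it is a hypothetical sketch that keeps a few parameters from Table 4.1 in a dictionary and treats gsave and grestore as push-a-copy and pop. As a simplification, grestore on an empty stack does nothing here.

```python
import copy

class GraphicsStateModel:
    """Toy model of the graphics state stack (illustration only)."""

    def __init__(self):
        # A few device-independent parameters with their initial values.
        self.current = {"line width": 1.0, "line cap": 0,
                        "line join": 0, "miter limit": 10.0}
        self._stack = []

    def gsave(self):
        # Push a copy, so later changes do not affect the saved state.
        self._stack.append(copy.deepcopy(self.current))

    def grestore(self):
        # Pop the most recently saved state (last in, first out).
        if self._stack:
            self.current = self._stack.pop()

g = GraphicsStateModel()
g.gsave()                       # outer element begins
g.current["line width"] = 4.0
g.gsave()                       # nested element begins
g.current["line cap"] = 1
g.grestore()                    # nested element ends
inner_restored = dict(g.current)
g.grestore()                    # outer element ends
outer_restored = dict(g.current)
```

Each element's local changes disappear when its matching grestore executes, which is what lets independent graphical elements be nested without disturbing one another.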
In some interactive applications, however, a program must switch its attention
among multiple, more-or-less independent imaging contexts in an unpredictable
order. The second mechanism, available in LanguageLevels 2 and 3, uses gstate

182
C H A P T E R 4
Graphics
objects in virtual memory that contain saved copies of the graphics state. The fol-
lowing LanguageLevel 2 operators manipulate gstate objects:
gstate creates a new gstate object.
currentgstate copies the entire current graphics state into a gstate object.
setgstate replaces the entire current graphics state by the value of a gstate ob-
ject.
Interactive programs can use these operators to create a separate gstate object for
each imaging context and switch among them dynamically as needed.
Note: Saving a graphics state, with either gsave or currentgstate, captures every
parameter, including such things as the current path and current clipping path. For
example, if a nonempty current path exists at the time that gsave, gstate, or
currentgstate is executed, that path will be reinstated by the corresponding grestore
or setgstate. Unless this effect is specifically desired, it is best to minimize storage
demands by saving a graphics state only when the current path is empty and the
current clipping path is in its default state.

4.3 Coordinate Systems and Transformations
Paths and shapes are defined in terms of pairs of coordinates on the Cartesian
plane. A coordinate pair is a pair of real numbers x and y that locate a point hori-
zontally and vertically within a Cartesian (two-axis) coordinate system superim-
posed on the current page. The PostScript language defines a default coordinate
system that PostScript programs can use to locate any point on the page.
4.3.1 User Space and Device Space
Coordinates specified in a PostScript program refer to locations within a coordi-
nate system that always bears the same relationship to the current page, regardless
of the output device on which printing or displaying will be done. This coordi-
nate system is called user space.
Output devices vary greatly in the built-in coordinate systems they use to address
pixels within their imageable areas. A particular device’s coordinate system is a
device space. A device space origin can be anywhere on the output page. This is
because the paper moves through different printers and imagesetters in different
directions. On displays, the origin can vary depending on the window system.
Different devices have different resolutions. Some devices even have resolutions
that are different in the horizontal and vertical directions.
The operands of the path operators are coordinates expressed in user space. The
PostScript interpreter automatically transforms user space coordinates into de-
vice space. For the most part, this transformation is hidden from the PostScript
program. A program must consider device space only rarely, for certain special
effects. This independence of user space from device space is essential to the
device-independent nature of PostScript page descriptions.
A coordinate system can be defined with respect to the current page by stating:
• The location of the origin
• The orientation of the x and y axes
• The lengths of the units along each axis
Initially, the user space origin is located at the lower-left corner of the output
page or display window, with the positive x axis extending horizontally to the
right and the positive y axis extending vertically upward, as in standard mathe-
matical practice. The length of a unit along both the x and y axes is 1⁄72 inch.
This coordinate system is the default user space. In default user space, all points
within the current page have positive x and y coordinate values.
Note: The default unit size (1⁄72 inch) is approximately the same as a “point,” a unit
widely used in the printing industry. It is not exactly the same as a point, however;
there is no universal definition of a point.

The default user space origin coincides with the lower-left corner of the physical
page. Portions of the physical page may not be imageable on certain output de-
vices. For example, many laser printers cannot place marks at the extreme edges
of their physical page areas. It may not be possible to place marks at or near the
default user space origin. The physical correspondence of page corner to default
origin ensures that marks within the imageable portion of the output page will be
consistently positioned with respect to the edges of the page.

Coordinates in user space may be specified as either integers or real numbers.
Therefore, the unit size in default user space does not constrain locations to any
arbitrary grid. The resolution of coordinates in user space is not related in any way
to the resolution of pixels in device space.
The default user space provides a consistent, dependable starting place for Post-
Script programs regardless of the output device used. If necessary, the PostScript
program may then modify user space to be more suitable to its needs by applying
coordinate transformation operators, such as translate, rotate, and scale.
What may appear to be absolute coordinates in a PostScript program are not ab-
solute with respect to the current page, because they are expressed in a coordinate
system that may slide around and shrink or expand. Coordinate system transfor-
mation not only enhances device independence but is a useful tool in its own
right. For example, a page description originally composed to occupy an entire
page can be incorporated without change as an element of another page descrip-
tion by shrinking the coordinate system in which it is drawn.
Conceptually, user space is an infinite plane. Only a small portion of this plane
corresponds to the imageable area of the output device: a rectangular area above
and to the right of the origin in default user space. The actual size and position of
the area is device- and media-dependent. An application can request a particular
page size or other media properties by using the LanguageLevel 2 operator
setpagedevice, described in Section 6.1.1, “Page Device Dictionary.”
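As a small sketch of such a request, the fragment below asks for a 612-by-792-unit page (US Letter, chosen here only as an example) and then reads back the size actually in effect. Whether the request can be honored depends on the device and its policies; pageWidth and pageHeight are hypothetical names.

```postscript
% LanguageLevel 2: request a page size in default user space units.
<< /PageSize [612 792] >> setpagedevice

% Read back the page size the device actually established.
currentpagedevice /PageSize get   % pushes the array [width height]
aload pop                         % leaves width and height on the stack
/pageHeight exch def
/pageWidth exch def
```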
4.3.2 Transformations
A transformation matrix specifies how to transform the coordinate pairs of one
coordinate space into another coordinate space. The graphics state includes the
current transformation matrix (CTM), which describes the transformation from
user space to device space.
The elements of a matrix specify the coefficients of a pair of linear equations that
transform the values of coordinates x and y. However, in graphical applications,
matrices are not often thought of in this abstract mathematical way. Instead, a
matrix is considered to capture some sequence of geometric manipulations:
translation, rotation, scaling, reflection, and so forth. Most of the PostScript lan-
guage’s matrix operators are organized according to this latter model.

The most commonly used matrix operators are those that modify the current
transformation matrix in the graphics state. Instead of creating a new transfor-
mation matrix from nothing, these operators change the existing transformation
matrix in some specific way. Operators that modify user space include the follow-
ing:
• translate moves the user space origin to a new position with respect to the current
  page, leaving the orientation of the axes and the unit lengths unchanged.
• rotate turns the user space axes about the current user space origin by some
  angle, leaving the origin location and unit lengths unchanged.
• scale modifies the unit lengths independently along the current x and y axes,
  leaving the origin location and the orientation of the axes unchanged.
• concat applies an arbitrary linear transformation to the user coordinate system.
Such modifications have a variety of uses:
• Changing the user coordinate system conventions for an entire page. For example,
  in some applications it might be convenient to express user coordinates in
  centimeters rather than in 72nds of an inch, or it might be convenient to have the
  origin in the center of the page rather than in the lower-left corner.
• Defining each graphical element of a page in its own coordinate system, independent
  of any other element. The program can then position, orient, and scale
  each element to the desired location on the page by temporarily modifying the
  user coordinate system. This allows the description of an element to be
  decoupled from its placement on the page.
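The first kind of modification, changing conventions for a whole page, might be sketched as follows. The page is assumed to be A4 (21 by 29.7 cm); that size is an assumption of this sketch, not something the fragment queries.

```postscript
gsave
  72 2.54 div dup scale     % one user space unit is now 1 cm (72/2.54 default units)
  10.5 14.85 translate      % origin to the center of an assumed 21 x 29.7 cm page

  0.05 setlinewidth         % line width is now measured in centimeters as well
  newpath                   % a 4 cm square centered on the new origin
  -2 -2 moveto
   4  0 rlineto
   0  4 rlineto
  -4  0 rlineto
  closepath stroke
grestore                    % back to the previous user space
```

Note that every length, including the line width, is interpreted in the new units once the scale is in effect.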
Example 4.1 may aid in understanding the second type of modification. Com-
ments explain what each operator does.
Example 4.1
/box                 % Define a procedure to construct a unit-square path in the
  { newpath          % current user coordinate system, with its lower-left corner at
    0 0 moveto       % the origin.
    0 1 lineto
    1 1 lineto
    1 0 lineto
    closepath
  } def
gsave                % Save the current graphics state and create a new one that we
                     % can modify.
72 72 scale          % Modify the current transformation matrix so that everything
                     % subsequently drawn will be 72 times larger; that is, each unit
                     % will represent an inch instead of 1⁄72 inch.
box fill             % Draw a unit square with its lower-left corner at the origin and
                     % fill it with black. Because the unit size is now 1 inch, this box
                     % is 1 inch on a side.
2 2 translate        % Change the transformation matrix again so that the origin is
                     % displaced 2 inches in from the left and bottom edges of the
                     % page.
box fill             % Draw the box again. This box has its lower-left corner 2 inches
                     % up from and 2 inches to the right of the lower-left corner of
                     % the page.
grestore             % Restore the saved graphics state. Now we are back to default
                     % user space.
FIGURE 4.1 The two squares produced by Example 4.1 (origin at (0, 0); axes marked in inches)
Figure 4.1 is a reduction of the entire page containing the two squares painted by
Example 4.1, along with scales indicating x and y positions in inches. This shows
how coordinates, such as the ones given to the moveto and lineto graphics opera-
tors, are transformed by the current transformation matrix. By combining trans-
lation, scaling, and rotation, very simple prototype graphics procedures—such as
box in the example—can be used to generate an infinite variety of instances.
4.3.3 Matrix Representation and Manipulation
This section presents a brief introduction to the representation and manipulation
of matrices. Some knowledge of this topic will make the descriptions of the coor-
dinate system and matrix operators in Chapter 8 easier to understand. It is not
essential to understand the details of matrix arithmetic on first reading, but only
to obtain a clear geometrical model of the effects of the various transformations.
A two-dimensional transformation is described mathematically by a 3-by-3
matrix:
\[ \begin{bmatrix} a & b & 0 \\ c & d & 0 \\ t_x & t_y & 1 \end{bmatrix} \]
In the PostScript language, this matrix is represented as a six-element array object
[a b c d tx ty]
omitting the matrix elements in the third column, which always have constant
values.
This matrix transforms a coordinate pair (x, y) into another coordinate pair
(x′, y′) according to the linear equations
\[ x' = a x + c y + t_x \]
\[ y' = b x + d y + t_y \]
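These equations can be checked with the transform operator, which accepts an explicit matrix operand. The matrices below are arbitrary values chosen for illustration.

```postscript
% [a b c d tx ty] = [2 0 0 2 10 20]
3 4 [2 0 0 2 10 20] transform    % x' = 2*3 + 0*4 + 10 = 16
                                 % y' = 0*3 + 2*4 + 20 = 28

% [a b c d tx ty] = [1 2 3 4 5 6]
3 4 [1 2 3 4 5 6] transform      % x' = 1*3 + 3*4 + 5 = 20
                                 % y' = 2*3 + 4*4 + 6 = 28

% dtransform applies the same matrix to a distance, so tx and ty are ignored.
3 4 [2 0 0 2 10 20] dtransform   % dx' = 6, dy' = 8

pstack                           % show the six results (as real numbers)
clear
```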
The common transformations are easily described in this matrix notation.
Translation by a specified displacement (t_x, t_y) is described by the matrix
\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ t_x & t_y & 1 \end{bmatrix} \]
Scaling by the factor s_x in the horizontal dimension and s_y in the vertical
dimension is accomplished by the matrix
\[ \begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
Rotation counterclockwise about the origin by an angle θ is described by the
matrix
\[ \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
Figure 4.2 illustrates the effects of these common transformations.
FIGURE 4.2 Effects of coordinate transformations (panels: Translation by tx, ty; Scaling by sx, sy; Rotation)
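The three matrices can also be produced directly by the forms of translate, scale, and rotate that take a matrix operand. These forms fill in the supplied matrix and leave the CTM untouched. The numeric values are arbitrary, and == prints the elements as real numbers.

```postscript
10 20 matrix translate ==    % elements 1 0 0 1 10 20   (tx = 10, ty = 20)
2 3   matrix scale     ==    % elements 2 0 0 3 0 0     (sx = 2,  sy = 3)
90    matrix rotate    ==    % approximately 0 1 -1 0 0 0
                             % (cos 90 = 0, sin 90 = 1; tiny rounding errors may appear)
```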
A PostScript program can describe any desired transformation as a sequence of
these operations performed in some order. An important property of the matrix
notation is that a program can concatenate a sequence of operations to form a
single matrix that embodies all of them in combination. That is, transforming
any pair of coordinates by the single concatenated matrix produces the same re-
sult as transforming them by all of the original matrices in sequence. Any linear
transformation from user space to device space can be described by a single
transformation matrix, the CTM.
Note: Concatenation is performed by matrix multiplication. The order in which
transformations are concatenated is significant (technically, matrix operations are
associative, but not commutative). The requirement that matrices conform during
multiplication is what leads to the use of 3-by-3 matrices. Otherwise, 2-by-3 matri-
ces would suffice to describe transformations.

The operators translate, scale, and rotate each concatenate the CTM with a
matrix describing the desired transformation, producing a new matrix that com-
bines the original and additional transformations. This matrix is then established
as the new CTM:
newCTM = transformation × originalCTM
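The sketch below illustrates why the order of concatenation matters. It first multiplies two explicit matrices with concatmatrix, then repeats the experiment on the CTM itself. Setting the CTM to the identity matrix with matrix setmatrix is done here only to make the numbers device-independent; it is not something a page description would normally do. T and S are hypothetical names.

```postscript
/T 10 20 matrix translate def       % translation by (10, 20)
/S 2 2   matrix scale     def       % uniform scaling by 2

T S matrix concatmatrix ==          % T x S: elements 2 0 0 2 20 40
S T matrix concatmatrix ==          % S x T: elements 2 0 0 2 10 20

gsave
  matrix setmatrix                  % identity CTM, for demonstration only
  10 20 translate 2 2 scale         % CTM = S x (T x identity)
  1 1 transform                     % (1, 1) is scaled first, then translated: 12 22
  pop pop
grestore

gsave
  matrix setmatrix
  2 2 scale 10 20 translate         % CTM = T x (S x identity)
  1 1 transform                     % (1, 1) is translated first, then scaled: 22 42
  pop pop
grestore
```

In both cases the operator invoked last is the first to act on coordinates supplied afterward, which follows from the transformation being placed on the left of the original CTM.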
It is sometimes necessary to perform the inverse of a transformation—that is, to
find the user space coordinates that correspond to a specific pair of device space
coordinates. PostScript programs explicitly do this only occasionally, but it oc-
curs commonly in the PostScript interpreter itself.
Not all transformations are invertible in the way just described. For example, if a
matrix contains a, b, c, and d elements that are all 0, all user coordinates map to
the same device coordinates and there is no unique inverse transformation. Such
noninvertible transformations are not very useful and generally arise from unin-
tentional operations, such as scaling by 0. A noninvertible CTM can sometimes
cause an undefinedresult error to occur during the execution of graphics and
font operators.
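A brief sketch of inverse transformation, using an explicit matrix so that the results do not depend on the device. The matrix M is an arbitrary example, and the stopped context is one possible way to guard against a noninvertible matrix.

```postscript
/M [2 0 0 2 10 20] def

16 28 M itransform             % inverse mapping: leaves 3 4, since M maps (3, 4) to (16, 28)
pop pop

M matrix invertmatrix ==       % elements 0.5 0 0 0.5 -5 -10

% A matrix whose a, b, c, and d elements are all 0 has no inverse.
mark
{ [0 0 0 0 5 5] matrix invertmatrix } stopped
  { cleartomark (matrix is not invertible) = }   % undefinedresult was raised
  { == pop }                                     % print the inverse, then drop the mark
ifelse
```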
4.4 Path Construction
In the PostScript language, paths define shapes, trajectories, and regions of all
sorts. Programs use paths to draw lines, define the shapes of filled areas, and
specify boundaries for clipping other graphics.
A path is composed of straight and curved line segments, which may connect to
one another or may be disconnected. A pair of segments are said to connect only if
they are defined consecutively, with the second segment starting where the first
one ends. Thus the order in which the segments of a path are defined is signifi-
cant. Nonconsecutive segments that meet or intersect fortuitously are not consid-
ered to connect.
A path is made up of one or more disconnected subpaths, each comprising a se-
quence of connected segments. The topology of the path is unrestricted: it may be
concave or convex, may contain multiple subpaths representing disjoint areas,
and may intersect itself in arbitrary ways. There is an operator, closepath, that ex-
plicitly connects the end of a subpath back to its starting point; such a subpath is
said to be closed. A subpath that has not been explicitly closed is open.
Paths are represented by data structures internal to the PostScript interpreter. Al-
though a path is not directly accessible as an object, its construction and use are
under program control. A path is constructed by sequential application of one or
more path construction operators. PostScript programs can read out the path or,
more commonly, use it to control the application of one of the painting operators
described in Section 4.5, “Painting.”
Note: Because the entire set of points defining a path must exist as data simulta-
neously, there is a limit to the number of segments it may have. Because several paths
may also exist simultaneously (the current path and the clipping path, both discussed
below, as well as any paths saved by the save, gsave, clipsave, gstate, and
currentgstate operators), this limit applies to the total amount of storage occupied by
all paths. If a path exhausts the available storage, a limitcheck error occurs.
LanguageLevel 1 has a fixed limit for path storage that is implementation-
dependent; see Appendix B for more information. In LanguageLevels 2 and 3, there
is no such fixed limit; path storage competes with other uses of memory.

As a practical matter, the limits on path storage are large enough not to impose an
unreasonable restriction. It is important, however, that each distinct element of a
page be constructed as a separate path, painted, and then discarded before
constructing the next element. Attempting to describe an entire page as a single path is
likely to exceed the path storage limit.

4.4.1 Current Path
The current path is part of the graphics state. The path construction operators
modify the current path, usually by appending to it, and the painting operators
implicitly refer to the current path. The gsave and grestore operators respectively
save and restore the current path, as they do all components of the graphics state.

A program begins a new path by invoking the newpath operator. This initializes
the current path to be empty. (Some of the painting operators also reinitialize the
current path at the end of their execution.) The program then builds up the defi-
nition of the path by applying one or more of the operators that add segments to
the current path. These operators may be invoked in any sequence, but the first
one invoked must be moveto.
The trailing endpoint of the segment most recently added is referred to as the
current point. If the current path is empty, the current point is undefined. Most
operators that add a segment to the current path start at the current point. If the
current point is undefined, they generate the error nocurrentpoint.
Following is a list of the most common path construction operators. There are
other, less common ones as well; see Chapter 8 for complete details.
• moveto establishes a new current point without adding a segment to the current
  path, thereby beginning a new subpath.
• lineto adds a straight line segment to the current path, connecting the previous
  current point to the new one.
• arc, arcn, arct, and arcto add an arc of a circle to the current path.
• curveto adds a section of a cubic Bézier curve to the current path.
• rmoveto, rlineto, and rcurveto perform the moveto, lineto, and curveto operations,
  but specify new points via displacements in user space relative to the current
  point, rather than by absolute coordinates.
• closepath adds a straight line segment connecting the current point to the
  starting point of the current subpath (usually the point most recently specified
  by moveto), thereby closing the current subpath.
Note: Remember that the path construction operators do not place any marks on the
page; only the painting operators do that. The usual procedure for painting a graph-
ical element on the page is to define that element as a path and then invoke one of the
painting operators. This is repeated for each element on the page.
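The following sketch builds a path with one closed and one open subpath and then paints it. All coordinates are arbitrary. Because fill discards the current path, the fill is bracketed by gsave and grestore so that the same path is still available to stroke.

```postscript
newpath
100 100 moveto                     % begin the first subpath
200 100 lineto                     % straight segment
200 200 250 250 300 200 curveto    % cubic Bezier segment ending at (300, 200)
closepath                          % straight segment back to (100, 100)

350 100 moveto                     % begin a second subpath, left open
50 50 rlineto                      % relative line; current point is now (400, 150)
400 150 30 0 180 arc               % line to the arc's start (430, 150), then a half circle

gsave 0.8 setgray fill grestore    % paint the interior; grestore brings the path back
2 setlinewidth stroke              % paint the outline
showpage
```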

All of the points used to describe the path are specified in user space. All coordi-
nates are transformed by the CTM into device space at the time the program adds
the point to the current path. Changing the CTM does not affect the coordinates
of existing points in device space.
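One consequence, sketched below under arbitrary coordinates, is that a path can be constructed under one CTM and stroked under another. Here an ellipse is built in a nonuniformly scaled coordinate system, and the CTM is put back before stroking so that the line has uniform thickness. A matrix is saved and reinstated instead of using gsave and grestore, because grestore would also restore the old (empty) path. The name savedCTM is hypothetical.

```postscript
/savedCTM matrix currentmatrix def   % remember the CTM in a new matrix object

200 300 translate
2 1 scale                            % stretch x by 2: a circle becomes an ellipse
newpath
0 0 50 0 360 arc closepath           % points are converted to device space now

savedCTM setmatrix                   % restore the CTM; the path already built is unaffected
4 setlinewidth
stroke                               % uniform 4-unit line around a 200 x 100 ellipse
showpage
```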

A path that is to be used more than once in a page description can be defined by a
PostScript procedure that invokes the operators for constructing the path. Each
instance of the path can then be constructed and painted on the page by a three-
step sequence:
1. Modify the CTM, if necessary, by invoking coordinate transformation opera-
tors to locate, orient, and scale the path to the desired place on the page.
2. Call the procedure to construct the path.
3. Invoke a painting operator to mark the path on the page in the desired
manner.
In the common situation that the path description is constant, the
LanguageLevel 2 user path operators (described in Section 4.6, “User Paths”) can
be used to combine steps 2 and 3. The entire sequence can be encapsulated by
surrounding it with the operators gsave and grestore. See Example 4.1 in
Section 4.3.2, “Transformations,” for a simple illustration of this technique.
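The three steps, wrapped in gsave and grestore, can be packaged as a procedure. In this sketch placebox is a hypothetical helper and its operand order (sx sy angle x y) is simply a convenient choice; box is the unit-square procedure in the style of Example 4.1.

```postscript
/box { newpath 0 0 moveto 0 1 lineto 1 1 lineto 1 0 lineto closepath } def

% sx sy angle x y  placebox  -
/placebox {
  gsave
    translate rotate scale   % step 1: consume x y, then angle, then sx sy
    box                      % step 2: construct the path
    fill                     % step 3: paint it
  grestore                   % undo the CTM changes for the next instance
} def

72 72  0  72  72 placebox    % 1-inch square, unrotated, at (72, 72)
36 36 45 288 144 placebox    % half-inch square, rotated 45 degrees, at (288, 144)
showpage
```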
4.4.2 Clipping Path
The graphics state also contains a clipping path that limits the regions of the page
affected by the painting operators. The closed subpaths of this path define the
area that can be painted. Marks falling inside this area will be applied to the page;
those falling outside it will not. (Precisely what is considered to be “inside” a path
is discussed in Section 4.5.2, “Filling.”) The clipping path affects current painting
operations only; it has no effect on paths being constructed with the path con-
struction operators listed in Section 4.4.1. When such a path is eventually paint-
ed, the results will be limited only by the clipping path current at that time, and
not by the one in effect when the path was constructed.
In LanguageLevel 3, the graphics state can also contain a subsidiary stack of saved
clipping paths, which are pushed and popped by the clipsave and cliprestore op-
erators. This enables a program to save and restore just the clipping path without
affecting the rest of the graphics state. Because the clipping path stack is an ele-
ment of the graphics state, wholesale replacement of the graphics state by
grestore or setgstate will replace the entire clipping path stack.

The following operators manage the clipping path:
• clip computes a new clipping path from the intersection of the current path
  with the existing clipping path.
• clippath replaces the current path with a copy of the current clipping path.
• clipsave (LanguageLevel 3) pushes a copy of the current clipping path onto the
  clipping path stack.
• cliprestore (LanguageLevel 3) pops the topmost element off the clipping path
  stack and makes it the current clipping path.
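A sketch of typical usage follows, with arbitrary coordinates. The first fragment restricts painting to a circle and relies on gsave and grestore to undo the clip. The second uses the LanguageLevel 3 operators to restore only the clipping path. In both, newpath follows clip because clip leaves the current path in place.

```postscript
gsave
  newpath 200 200 100 0 360 arc closepath
  clip newpath                           % circle becomes the clipping path
  0 10 400 {                             % y = 0, 10, ..., 400
    0 exch moveto 400 0 rlineto          % horizontal line at height y
  } for
  2 setlinewidth stroke                  % only the parts inside the circle appear
grestore                                 % previous clipping path is back

% LanguageLevel 3 only: save and restore just the clipping path.
clipsave
  newpath 50 50 moveto 100 0 rlineto 0 100 rlineto -100 0 rlineto closepath
  clip newpath
  0 0 moveto 200 200 lineto stroke       % diagonal, visible only inside the square
cliprestore
showpage
```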
4.5 Painting
The painting operators mark graphical shapes on the current page. This section
describes the principal, general-purpose painting operators, stroke and fill. Vari-
ants of these operators combine path construction and painting in a single oper-
ation; see Section 4.6, “User Paths.” More specialized operators include shfill,
described in Section 4.9.3, “Shading Patterns”; image, described in Section 4.10,
“Images”; and the glyph and font operators, described in Chapter 5.
The operators and graphics state parameters described here control the abstract
appearance of graphical shapes and are device-independent. Additional, device-
dependent facilities for controlling the rendering of graphics in raster memory
are described in Chapter 7.
4.5.1 Stroking
The stroke operator draws a line along the current path. For each straight or
curved segment in the path, the stroked line is centered on the segment with sides
parallel to the segment. Each of the path’s subpaths is treated separately.
The results of the stroke operator depend on the current settings of various
parameters in the graphics state. See Section 4.2, “Graphics State,” for further in-
formation on these parameters, and Chapter 8 for detailed descriptions of the
operators that set them.
• The width of the stroked line is determined by the line width parameter (see
  setlinewidth).
• The color or pattern of the line is determined by the color parameter (see
  setgray, setrgbcolor, sethsbcolor, setcmykcolor, setcolor, and setpattern; the
  last three are LanguageLevel 2 operators).
• The line can be drawn either solid or with a program-specified dash pattern,
  depending on the dash pattern parameter (see setdash).
• If the subpath is open, the unconnected ends are treated according to the line
  cap parameter, which may be butt, rounded, or square (see setlinecap).
• Wherever two consecutive segments are connected, the joint between them is
  treated according to the line join parameter, which may be mitered, rounded, or
  beveled (see setlinejoin). Mitered joins are also subject to the miter limit
  parameter (see setmiterlimit).
Note: Points at which unconnected segments happen to meet or intersect receive no
special treatment. In particular, “closing” a subpath with an explicit lineto rather
than with closepath may result in a messy corner, because line caps will be applied
instead of a line join.

• The stroke adjustment parameter (LanguageLevel 2) requests that coordinates
  and line widths be adjusted automatically to produce strokes of uniform
  thickness despite rasterization effects (see setstrokeadjust and Section 7.5.2,
  “Automatic Stroke Adjustment”).
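The fragment below is a sketch of how these parameters interact; all values are arbitrary. The first two triangles differ only in how the subpath returns to its starting point, which is the difference between a join and a pair of caps at that corner.

```postscript
gsave
  10 setlinewidth
  0 setlinejoin 4 setmiterlimit     % mitered joins
  0 setlinecap                      % butt caps
  [] 0 setdash                      % solid line

  % Closed with closepath: the starting corner receives a line join.
  newpath 100 100 moveto 200 100 lineto 150 200 lineto closepath stroke

  % "Closed" with lineto: the subpath is still open, so the starting
  % corner receives two butt caps and shows a notch.
  newpath 300 100 moveto 400 100 lineto 350 200 lineto 300 100 lineto stroke

  % Dashed line with round caps and a round join.
  1 setlinecap 1 setlinejoin
  [20 15] 0 setdash                 % 20 units on, 15 units off
  newpath 100 300 moveto 400 300 lineto 400 400 lineto stroke
grestore
showpage
```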
4.5.2 Filling
The fill operator uses the current color or pattern to paint the entire region en-
closed by the current path. If the path consists of several disconnected subpaths,
fill paints the insides of all subpaths, considered together. Any subpaths that are
open are implicitly closed before being filled.
For a simple path, it is intuitively clear what region lies inside. However, for a
more complex path—for example, a path that intersects itself or has one subpath
that encloses another—the interpretation of “inside” is not always obvious. The
path machinery uses one of two rules for determining which points lie inside a
path: the nonzero winding number rule and the even-odd rule, both discussed in
detail below.
The nonzero winding number rule is more versatile than the even-odd rule and is
the standard rule the fill operator uses. Similarly, the clip operator uses this rule
to determine the inside of the current clipping path. The even-odd rule is occa-
sionally useful for special effects or for compatibility with other graphics systems.
The eofill and eoclip operators invoke this rule.
Nonzero Winding Number Rule
The nonzero winding number rule determines whether a given point is inside a
path by conceptually drawing a ray from that point to infinity in any direction
and then examining the places where a segment of the path crosses the ray. Start-
ing with a count of 0, the rule adds 1 each time a path segment crosses the ray
from left to right and subtracts 1 each time a segment crosses from right to left.
After counting all the crossings, if the result is 0 then the point is outside the path;
otherwise it is inside.
Note: The method just described does not specify what to do if a path segment coin-
cides with or is tangent to the chosen ray. Since the direction of the ray is arbitrary,
the rule simply chooses a ray that does not encounter such problem intersections.

For simple convex paths, the nonzero winding number rule defines the inside and
outside as one would intuitively expect. The more interesting cases are those in-
volving complex or self-intersecting paths like the ones in Figure 4.3. For a path
consisting of a five-pointed star, drawn with five connected straight line segments
intersecting each other, the rule considers the inside to be the entire area enclosed
by the star, including the pentagon in the center. For a path composed of two
concentric circles, the areas enclosed by both circles are considered to be inside,
provided that both are drawn in the same direction. If the circles are drawn in op-
posite directions, only the “doughnut” shape between them is inside, according to
the rule; the “doughnut hole” is outside.
FIGURE 4.3 Nonzero winding number rule

Even-Odd Rule
An alternative to the nonzero winding number rule is the even-odd rule. This rule
determines the “insideness” of a point by drawing a ray from that point in any di-
rection and simply counting the number of path segments that cross the ray, re-
gardless of direction. If this number is odd, the point is inside; if even, the point is
outside. This yields the same results as the nonzero winding number rule for
paths with simple shapes, but produces different results for more complex
shapes.
Figure 4.4 shows the effects of applying the even-odd rule to complex paths. For
the five-pointed star, the rule considers the triangular points to be inside the path,
but not the pentagon in the center. For the two concentric circles, only the
“doughnut” shape between the two circles is considered inside, regardless of the
directions in which the circles are drawn.
FIGURE 4.4 Even-odd rule
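The concentric-circle cases described for both rules can be reproduced with the sketch below. Positions and radii are arbitrary. Each circle is preceded by a moveto to its starting point so that arc does not add a connecting line from the previous subpath.

```postscript
% Both circles counterclockwise, nonzero winding number rule: solid disc.
newpath
230 400 moveto 150 400 80 0 360 arc closepath
190 400 moveto 150 400 40 0 360 arc closepath
fill

% Same construction, even-odd rule: only the "doughnut" is painted.
newpath
430 400 moveto 350 400 80 0 360 arc closepath
390 400 moveto 350 400 40 0 360 arc closepath
eofill

% Inner circle clockwise (arcn), nonzero winding number rule:
% the winding numbers cancel inside the inner circle, leaving a hole.
newpath
230 200 moveto 150 200 80 0 360 arc  closepath
190 200 moveto 150 200 40 360 0 arcn closepath
fill
showpage
```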
4.5.3 Insideness Testing
It is sometimes useful for a program to test whether a point lies inside a path, or
whether a path intersects another path, without actually painting anything. The
LanguageLevel 2 insideness-testing operators can be used for this purpose. They
are useful mainly for interactive applications, where they can assist in hit detec-
tion; however, they have other uses as well.
There are several insideness-testing operators that vary according to how the
paths to be tested are specified. All of the operators return a single boolean result.
What it means for a point to be inside a path is that painting the path (by fill or
stroke) would cause the device pixel lying under that point to be marked. The in-
sideness tests disregard the current clipping path.
• infill tests the current path in the graphics state. There are two forms of this
  operator. One returns true if painting the current path with the fill operator
  would result in marking the device pixel corresponding to a specific point in
  user space. The second tests whether any pixels within a specified aperture
  would be marked. The aperture is specified by a user path supplied as an
  operand (see Section 4.6, “User Paths”).
• instroke is similar to infill, but it tests pixels that would be marked by applying
  the stroke operator to the current path, using the current settings of the
  stroke-related parameters in the graphics state (line width, dash pattern, and so forth).
• inufill and inustroke are similar to infill and instroke, but they test a user path
  supplied as a separate operand, rather than the current path in the graphics
  state.
• ineofill and inueofill are similar to infill and inufill, but they use the even-odd
  rule instead of the nonzero winding number rule for insideness testing; see
  Section 4.5.2, “Filling,” for more information.
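A short sketch of point testing follows, using a square as the current path and a triangle as a user path. The coordinates are arbitrary and were chosen to lie well away from any boundary, where the result could depend on device resolution.

```postscript
newpath                            % a square from (100, 100) to (200, 200)
100 100 moveto 200 100 lineto 200 200 lineto 100 200 lineto closepath

150 150 infill ==                  % true:  the point is inside the square
250 150 infill ==                  % false: the point is outside

10 setlinewidth
100 150 instroke ==                % true:  the point lies on the left edge
150 150 instroke ==                % false: 50 units from the nearest edge

% Test a user path instead of the current path.
180 120
{ 100 100 200 200 setbbox
  100 100 moveto 200 100 lineto 200 200 lineto closepath
} inufill ==                       % true: inside the triangle
```

None of these operators paints anything or alters the current path, so the tests can be repeated freely before deciding what to draw.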
4.6 User Paths
A user path is a procedure that is a completely self-contained description of a path
in user space. It consists entirely of path construction operators and their coordi-
nate operands expressed as literal numbers. User paths are a LanguageLevel 2 fea-
ture.
Special user path painting operators, such as ustroke and ufill, combine the ex-
ecution of a user path description with painting operations such as stroking or
filling the resulting path. Although these operators can be fully expressed in terms
of other path construction and painting operators, they offer a number of advan-
tages in efficiency and convenience:
• They closely match the needs of many application programs.
• Because a user path consists solely of path construction operators and numeric
  operands, rather than arbitrary computations, it is entirely self-contained: its
  behavior is guaranteed not to depend on an unpredictable execution environment.
• Every user path carries information about its own bounding box, ensuring that
  its coordinates will fall within predictable bounds.
• Most of the user path painting operators have no effect on the graphics state.
  The absence of side effects is a significant reason for the efficiency of the
  operations. There is no need to build up an explicit current path only to discard it
  after one use. Although the operators behave as if the path were built up,
  painted, and discarded in the usual way, their actual implementation is optimized to
  avoid unnecessary work.
• Because a user path is represented as a self-contained procedure object, the
  PostScript interpreter can save its output in a cache. This eliminates redundant
  interpretation of paths that are used repeatedly.
As a result of all these factors, interpreting a user path may be much more effi-
cient than executing an arbitrary PostScript procedure.
4.6.1 User Path Construction
A user path is an array or packed array object consisting of only the following op-
erators and their operands:
                           ucache
llx lly urx ury            setbbox
x y                        moveto
dx dy                      rmoveto
x y                        lineto
dx dy                      rlineto
x1 y1 x2 y2 x3 y3          curveto
dx1 dy1 dx2 dy2 dx3 dy3    rcurveto
x y r angle1 angle2        arc
x y r angle1 angle2        arcn
x1 y1 x2 y2 r              arct
                           closepath
In addition to the special operators ucache and setbbox, which are used only in
constructing user paths, this list includes all standard PostScript operators that
append to the current path, with two exceptions: arcto is not allowed because it
would push results onto the operand stack, and charpath is not allowed because
the resulting user path would depend on the current font and so would not be
self-contained.

Note: The operators in a user path may be represented either as name objects or as
operator objects (such as those associated with the operator names in systemdict).
The latter might result, for example, from applying the bind operator to the user path
or to a procedure containing it. Either form of operator designation is acceptable; no
advantage is gained by using one in favor of the other.

The only operands permitted in a user path are literal integers and real numbers.
The correct number of operands must be supplied to each operator. The user
path must be structured as follows:
1. The optional ucache operator places the user path in a special cache, speeding
up execution for paths that a program uses frequently. If present, ucache must
be the first operator invoked in the user path. See Section 4.6.3, “User Path
Cache.”
2. The next operator invoked must be setbbox, which establishes a bounding box
in user space enclosing the entire path. Every user path must include a call to
setbbox.
3. The remainder of the user path must consist entirely of path construction op-
erators and their operands. The path is assumed to be empty initially, so the
first operator after setbbox must be an absolute positioning operator (moveto,
arc, or arcn).
All coordinates specified as operands must fall within the bounds specified by
setbbox, or a rangecheck error will occur when the path definition is executed.
Any other deviation from the rules above will result in a typecheck error.
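The structural rules above can be sketched as a small checker. The following is a minimal Python model, not part of the PostScript language: the function names, the flat-list representation of a user path, and the mapping of typecheck and rangecheck to Python's TypeError and ValueError are all assumptions made for illustration. For brevity it range-checks only the absolute operands of moveto, lineto, and curveto.

```python
# Sketch only: a user path is modeled as a flat Python list of numbers and
# operator-name strings, e.g. [100, 200, 400, 500, "setbbox", 150, 200, "moveto"].

# Number of operands each permitted operator consumes.
OPERAND_COUNTS = {
    "ucache": 0, "setbbox": 4, "moveto": 2, "rmoveto": 2, "lineto": 2,
    "rlineto": 2, "curveto": 6, "rcurveto": 6, "arc": 5, "arcn": 5,
    "arct": 5, "closepath": 0,
}
ABSOLUTE_STARTERS = {"moveto", "arc", "arcn"}


def split_steps(tokens):
    """Group tokens into (operator, operands) pairs, checking operand counts."""
    steps, pending = [], []
    for tok in tokens:
        if isinstance(tok, str):
            if tok not in OPERAND_COUNTS:
                raise TypeError(f"typecheck: {tok!r} is not a user path operator")
            if len(pending) != OPERAND_COUNTS[tok]:
                raise TypeError(f"typecheck: wrong number of operands for {tok}")
            steps.append((tok, tuple(pending)))
            pending = []
        elif isinstance(tok, (int, float)) and not isinstance(tok, bool):
            pending.append(tok)
        else:
            raise TypeError("typecheck: only numbers and operator names are allowed")
    if pending:
        raise TypeError("typecheck: operands without an operator")
    return steps


def check_user_path(tokens):
    """Return True if the path obeys rules 1-3; raise TypeError or ValueError otherwise."""
    steps = split_steps(tokens)
    if steps and steps[0][0] == "ucache":       # rule 1: optional, but must come first
        steps = steps[1:]
    if not steps or steps[0][0] != "setbbox":   # rule 2: setbbox is mandatory
        raise TypeError("typecheck: setbbox must follow the optional ucache")
    rest = steps[1:]
    if any(name in ("ucache", "setbbox") for name, _ in rest):
        raise TypeError("typecheck: ucache or setbbox out of place")
    if rest and rest[0][0] not in ABSOLUTE_STARTERS:   # rule 3
        raise TypeError("typecheck: path must start with moveto, arc, or arcn")
    llx, lly, urx, ury = steps[0][1]
    for name, operands in rest:
        # Simplification: only absolute x/y pairs are range-checked here.
        if name in ("moveto", "lineto", "curveto"):
            for x, y in zip(operands[0::2], operands[1::2]):
                if not (llx <= x <= urx and lly <= y <= ury):
                    raise ValueError(f"rangecheck: ({x}, {y}) is outside the bounding box")
    return True
```

A real interpreter performs these checks while the painting operator interprets the path; the sketch only mirrors the order of the rules.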
The user path painting operators interpret a user path as if systemdict were the
current dictionary. This guarantees that all path construction operators invoked
within the path definition will have their standard meanings. To ensure that the
definition is self-contained and its meaning independent of its execution envi-
ronment, aliases are prohibited within a user path definition: it is illegal to use
names other than the standard path construction operator names listed above.
To illustrate the construction and use of a user path, Example 4.2 defines a path
and paints its interior with the current color.

Example 4.2
{ ucache                              % This is optional
  100 200 400 500 setbbox             % This is required
  150 200 moveto
  250 200 400 390 400 460 curveto
  400 480 350 500 250 500 curveto
  100 400 lineto
  closepath
} ufill
4.6.2 Encoded User Paths
An encoded user path is a very compact representation of a user path. It is an array
consisting of two string objects or an array and a string, representing the oper-
ands and operators of an equivalent user path definition in a compact binary en-
coding. Encoded user paths are not actually “executed” directly in the same sense
as an ordinary PostScript procedure. Rather, user path painting operators such as
ufill interpret the encoded data structure and perform the operations it encodes.
Note: The form of operator encoding used in encoded user paths is unrelated to the
alternative external encodings of the PostScript language described in Section 3.14,
“Binary Encoding Details.”

The elements of an encoded user path are:
• A data string or data array containing numeric operands. If a string, it is
  interpreted as an encoded number string according to the representation described
  in Section 3.14.5, “Encoded Number Strings”; if an array, its elements must all
  be numbers and are simply used in sequence.
• An operator string containing a sequence of encoded path construction operators,
  one operation code (opcode) per character. Table 4.3 shows the allowed
  opcode values.
This two-part organization is for the convenience of application programs that
generate encoded user paths. In particular, operands always fall on natural
addressing boundaries. All characters in both the data and operator strings are
interpreted as binary numbers, rather than as ASCII character codes.

TABLE 4.3 Operation codes for encoded user paths
OPCODE          OPERATOR        OPCODE          OPERATOR
0               setbbox         6               rcurveto
1               moveto          7               arc
2               rmoveto         8               arcn
3               lineto          9               arct
4               rlineto         10              closepath
5               curveto         11              ucache
32 < n ≤ 255    repetition count: repeat next code n − 32 times
Associated with each opcode in the operator string are zero or more operands in
the data string or data array. The order of the operands is the same as in an ordi-
nary user path; for example, the lineto operator (opcode 3) consumes an x and a
y operand from the data sequence.
Note: If the encoded user path does not conform to the rules described above, a
typecheck error will occur when the path is interpreted. Possible errors include in-
valid opcodes in the operator string or premature end of the data sequence.
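To make the opcode scheme of Table 4.3 concrete, the following Python sketch expands an operator string into operator names. The function name is hypothetical, and two behaviors are assumptions rather than statements from the text: a repetition count that is not followed by an opcode is treated as an error, and the typecheck error is modeled as Python's TypeError.

```python
# Opcode n maps to OPCODES[n], following Table 4.3.
OPCODES = ["setbbox", "moveto", "rmoveto", "lineto", "rlineto", "curveto",
           "rcurveto", "arc", "arcn", "arct", "closepath", "ucache"]


def decode_operator_string(data):
    """Expand an operator string (a bytes object) into a list of operator names."""
    operators = []
    repeat = None                       # pending repetition count, if any
    for code in data:
        if 32 < code <= 255:
            repeat = code - 32          # applies to the next opcode
        elif code < len(OPCODES):
            operators.extend([OPCODES[code]] * (repeat or 1))
            repeat = None
        else:
            raise TypeError(f"typecheck: invalid opcode {code}")
    if repeat is not None:
        # Assumption: a dangling repetition count is malformed.
        raise TypeError("typecheck: repetition count with no opcode after it")
    return operators


# The operator string <0B 00 01 22 05 03 0A>: ucache, setbbox, moveto,
# then curveto twice (0x22 = 34, and 34 - 32 = 2), lineto, closepath.
print(decode_operator_string(bytes.fromhex("0B 00 01 22 05 03 0A")))
```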

Example 4.3 shows an encoded version of the user path from Example 4.2, speci-
fying its operands as an ordinary data array encoded in ASCII. Example 4.4
shows the same user path with the operands given as an encoded number string.
Example 4.3
{ { 100 200 400 500
    150 200
    250 200 400 390 400 460
    400 480 350 500 250 500
    100 400
  }
  < 0B 00 01 22 05 03 0A >
} ufill

Example 4.4
{ < 95200014
    0064 00C8 0190 01F4
    0096 00C8
    00FA 00C8 0190 0186 0190 01CC
    0190 01E0 015E 01F4 00FA 01F4
    0064 0190
  >
  < 0B 00 01 22 05 03 0A >
} ufill
Example 4.4 illustrates how encoded user paths are likely to be used. Although it
does not appear to be more compact than Example 4.3 in its ASCII representa-
tion, it occupies less space in virtual memory and executes considerably faster.
For clarity of exposition, the example shows the operand as a hexadecimal literal
string; an ASCII base-85 string literal or a binary string token would be more
compact.
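The data string in Example 4.4 can be reproduced with a short Python sketch. It assumes the layout visible in the example and described in Section 3.14.5: a header byte 149 (hexadecimal 95), a representation byte 32 (hexadecimal 20, taken here to mean 16-bit fixed-point values with scale 0, high-order byte first), a 2-byte count, and then the values. The function name is hypothetical and only this one representation is handled.

```python
import struct


def encode_int16_number_string(numbers):
    """Encode integers as an encoded number string (header 149, representation 32)."""
    count = len(numbers)
    header = struct.pack(">BBH", 149, 32, count)      # 95 20 followed by the count
    body = struct.pack(f">{count}h", *numbers)        # big-endian signed 16-bit values
    return header + body


# Operands of the user path from Example 4.2, in order of use.
operands = [100, 200, 400, 500,
            150, 200,
            250, 200, 400, 390, 400, 460,
            400, 480, 350, 500, 250, 500,
            100, 400]
encoded = encode_int16_number_string(operands)
print(encoded.hex().upper())
```

The 20 operands occupy 44 bytes in this form (4 header bytes plus 2 bytes per number).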
4.6.3 User Path Cache
Some PostScript programs define paths that are repeated many times. To opti-
mize the interpretation of such paths, the PostScript language provides a facility
called the user path cache. This cache, analogous to the font cache, retains the re-
sults from previously interpreted user path definitions. When the PostScript
interpreter encounters a user path that is already in the cache, it substitutes the
cached results instead of reinterpreting the path definition.
There is a nontrivial cost associated with caching a user path: extra computation
is required and existing paths may be displaced from the cache. Because most
user paths are used once and immediately thrown away, it does not make sense to
place every user path in the cache. Instead, the application program must explic-
itly identify which user paths are to be cached. It does so by invoking the ucache
operator as the first operation in a user path definition, before setbbox, as shown
in Example 4.5.

Example 4.5
/Circle1
  { ucache
    -1 -1 1 1 setbbox
    0 0 1 0 360 arc
  } cvlit def
Circle1 ufill
The ucache operator notifies the PostScript interpreter that the enclosing user
path should be placed in the cache if it is not already there, or retrieved from the
cache if it is. (Invoking ucache outside a user path has no effect.) This cache man-
agement is not performed directly by ucache; rather, it is performed by the paint-
ing operator applied to the user path (ufill in Example 4.5). This is because the
results retained in the cache differ according to what painting operation is per-
formed. User path painting operators produce the same effects on the current
page whether the cache is accessed or not.
Caching is based on the value of a user path object. That is, two user paths are
considered the same for caching purposes if all of their corresponding elements
are equal, even if the objects themselves are not. A user path placed in the cache
need not be explicitly retained in virtual memory. An equivalent user path ap-
pearing literally later in the program can take advantage of the cached informa-
tion. Of course, if it is known that a given user path will be used many times,
defining it explicitly in VM avoids creating it multiple times.
User path caching, like font caching, is effective across translations of the user co-
ordinate system, but not across other transformations, such as scaling or rota-
tion. In other words, multiple instances of a given user path painted at different
places on the page will take advantage of the user path cache when the current
transformation matrix has been altered only by translate. If the CTM has been al-
tered by scale or rotate, the instances will be treated as if they were described by
different user paths.
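The caching behavior described above can be pictured as a lookup keyed on the painting operation, the value of the user path, and the non-translation part of the CTM. The Python class below is a toy model under that assumption; the class and method names are hypothetical, and it says nothing about how an actual interpreter stores or evicts cached results.

```python
class UserPathCacheModel:
    """Toy model of value-based user path caching."""

    def __init__(self):
        self._results = {}

    def paint(self, painting_op, user_path, ctm):
        """Return 'cached' on a cache hit, otherwise 'interpreted'."""
        if list(user_path[:1]) != ["ucache"]:
            return "interpreted"                  # path did not ask to be cached
        a, b, c, d, _tx, _ty = ctm                # translation is left out of the key
        key = (painting_op, tuple(user_path), (a, b, c, d))
        if key in self._results:
            return "cached"
        self._results[key] = True                 # stand-in for the retained results
        return "interpreted"


cache = UserPathCacheModel()
circle = ["ucache", -1, -1, 1, 1, "setbbox", 0, 0, 1, 0, 360, "arc"]
print(cache.paint("ufill", circle, [1, 0, 0, 1, 0, 0]))           # first use
print(cache.paint("ufill", list(circle), [1, 0, 0, 1, 72, 144]))  # equal value, translated
print(cache.paint("ufill", circle, [2, 0, 0, 2, 0, 0]))           # scaled CTM
```

Because the key is built from element values, an equal but distinct path object finds the same entry, while a different painting operator or a scaled CTM does not.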
Two other features of Example 4.5 are important to note:
• The user path object is explicitly saved for later use (as the value of Circle1 in
  this example). This is done in anticipation of painting the same path multiple
  times.
• The cvlit operator is applied to the user path object to remove its executable
  attribute. This ensures that the subsequent reference to Circle1 pushes the object
  on the operand stack rather than inappropriately executing it as a procedure. It
  is unnecessary to do this if the user path is to be consumed immediately by a
  user path painting operator and not saved for later use.
Note: It is necessary to build the user path as an executable array with { and }, rather
than as a literal array with [ and ], so that the user path construction operators are
not executed while the array is being built. Executable arrays have deferred execution.

4.6.4 User Path Operators
There are three categories of user path operator:
• User path painting operators such as ustroke, ufill, and ueofill, which combine
  interpretation of a user path with a standard painting operation (stroke or fill)
• Some of the insideness-testing operators (see Section 4.5.3, “Insideness Testing”)
• Miscellaneous operators involving user paths, such as uappend, upath, and
  ustrokepath
The userpath operand to any of these operators is one of the following:
• For an ordinary user path, an array (not necessarily executable) whose length is
  at least 5.
• For an encoded user path, an array of two elements. The first element is either
  an array whose elements are all numbers or a string that can be interpreted as
  an encoded number string (see Section 3.14.5, “Encoded Number Strings”).
  The second element is a string that encodes a sequence of operators, as
  described in Table 4.3 on page 201.
In either case, the value of the object must conform to the rules for constructing
user paths, as detailed in preceding sections. If the user path is malformed, a
typecheck error will occur.
The user path painting operators ustroke, ufill, and ueofill interpret a user path as
if it were an ordinary PostScript procedure being executed with systemdict as the
current dictionary; they then perform the corresponding standard painting operation
(stroke, fill, or eofill). The user path operators implicitly invoke newpath
before interpreting the user path, and enclose the entire operation with gsave
and grestore. The overall effect is to define a path and paint it, leaving no side
effects in the graphics state or anywhere else except in raster memory.
Several of the operators take an optional matrix as their final operand. This is a
six-element array of numbers describing a transformation matrix. A matrix is
distinguished from a user path (which is also an array) by the number and types
of its elements.
There is no user path clipping operator. Because the whole purpose of the clip-
ping operation is to alter the current clipping path, there is no way to avoid build-
ing the path. The best way to clip with a user path is
newpath userpath uappend clip newpath
Under favorable conditions, this operation can still take advantage of informa-
tion in the user path cache.
Note: The uappend operator and the user path painting operators perform a temporary
adjustment to the current transformation matrix as part of their execution,
rounding the tx and ty components of the CTM to the nearest integer values. This
ensures that scan conversion of the user path produces uniform results when it is placed
at different positions on the page through translation. This adjustment is especially
important if the user path is cached. The adjustment is not ordinarily visible to a
PostScript program, and is not mentioned in the descriptions of the individual
operators.

4.6.5 Rectangles
Because rectangles are used very frequently, it is useful to have a few operators to
paint them directly as a convenience to application programs. Also, knowing that
the figure will be a rectangle allows execution to be significantly optimized. The
rectangle operators are similar to the user path painting operators in that they
combine path construction with painting, but their operands are considerably
simpler in form.

A rectangle is defined in the user coordinate system, just as if it were constructed
as an ordinary path. The LanguageLevel 2 rectangle operators rectfill, rectstroke,
and rectclip accept their operands in any of three different forms:
• Four numbers x, y, width, and height, describing a single rectangle. The
  rectangle’s sides are parallel to the axes in user space. It has corners located at
  coordinates (x, y), (x + width, y), (x + width, y + height), and (x, y + height).
  Note that width and height can be negative.
• An arbitrarily long sequence of numbers represented as an array.
• An arbitrarily long sequence of numbers represented as an encoded number
  string, as described in Section 3.14.5, “Encoded Number Strings.”
The sequence in the latter two operand forms must contain a multiple of four
numbers. Each group of four consecutive numbers is interpreted as the x, y,
width, and height values defining a single rectangle. The effect produced is equiv-
alent to specifying all the rectangles as separate subpaths of a single combined
path, which is then operated on by a single stroke, fill, or clip operator.
The PostScript interpreter draws all rectangles in a counterclockwise direction in
user space, regardless of the signs of the width and height operands. This ensures
that when multiple rectangles overlap, all of their interiors are considered to be
inside the path according to the nonzero winding number rule.
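The counterclockwise rule can be checked with a little arithmetic. The Python sketch below is an illustration only: the function names are hypothetical, and normalizing negative width and height before listing the corners is one assumed way of obtaining the stated orientation, not a description of how an interpreter does it. A positive signed area (shoelace formula) indicates a counterclockwise corner order in user space.

```python
def rectangle_corners(x, y, width, height):
    """Return the four corners in counterclockwise order, whatever the signs."""
    if width < 0:
        x, width = x + width, -width
    if height < 0:
        y, height = y + height, -height
    return [(x, y), (x + width, y), (x + width, y + height), (x, y + height)]


def signed_area(points):
    """Shoelace formula: a positive result means counterclockwise order."""
    total = 0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return total / 2


def rectangles_from_sequence(numbers):
    """Split a flat sequence into rectangles, four numbers per rectangle."""
    if len(numbers) % 4 != 0:
        raise ValueError("sequence length must be a multiple of four")
    return [rectangle_corners(*numbers[i:i + 4]) for i in range(0, len(numbers), 4)]


for corners in rectangles_from_sequence([0, 0, 10, 5, 10, 10, -4, 5]):
    print(corners, signed_area(corners))
```

Since every rectangle has the same orientation, overlapping rectangles add their winding numbers instead of canceling, which is why their interiors all count as inside under the nonzero winding number rule.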
4.7 Forms
A form is a self-contained description of any arbitrary graphics, text, or sampled
images that are to be painted multiple times, either on several pages or at several
locations on the same page. The appearance of a form is described by a PostScript
procedure that invokes graphics operators. Language support for forms is a
LanguageLevel 2 feature.
What distinguishes a form from an ordinary procedure is that it is self-contained
and behaves according to certain rules. By defining a form, a program declares
that each execution of the form will produce the same output, which depends
only on the graphics state at the time the form is executed. The form’s definition
does not refer to variable information in virtual memory, and its execution has
no side effects in VM.

These rules permit the PostScript interpreter to save the graphical output of the
form in a cache. Later, when the same form is used again, the interpreter substi-
tutes the saved output instead of reexecuting the form’s definition. This can sig-
nificantly improve performance when the form is used many times.
There are various uses for forms:
• As its name suggests, a form can serve as the template for an entire page. For
  example, a program that prints filled-in tax forms can first paint the fixed
  template as a form, then paint the variable information on top of it.
• A form can also be any graphical element that is to be used repeatedly. For
  example, in output from computer-aided design systems, it is common for
  certain standard components to appear many times. A company’s logo can be
  treated as a form.
4.7.1 Using Forms
The use of forms requires two steps:
1. Describe the appearance of the form. Create a form dictionary containing de-
scriptive information about the form. A crucial element of the dictionary is
the PaintProc procedure, a PostScript procedure that can be executed to paint
the form.
2. Invoke the form. Invoke the execform operator with the form dictionary as the
operand. Before doing so, a program should set appropriate parameters in the
graphics state; in particular, it should alter the current transformation matrix
to control the position, size, and orientation of the form in user space.
Every form dictionary must contain a FormType entry, which identifies the par-
ticular form type that the dictionary describes and determines the format and
meaning of its remaining entries. At the time of publication, only one form type,
type 1, has been defined. Table 4.4 shows the contents of the form dictionary for
this form type. (The dictionary can also contain any additional entries that its
PaintProc procedure may require.)

TABLE 4.4 Entries in a type 1 form dictionary
KEY             TYPE        VALUE
FormType        integer     (Required) A code identifying the form type that this
                            dictionary describes. The only valid value defined at the
                            time of publication is 1.
XUID            array       (Optional) An extended unique ID that uniquely identifies
                            the form (see Section 5.6.2, “Extended Unique ID
                            Numbers”). The presence of an XUID entry in a form
                            dictionary enables the PostScript interpreter to save cached
                            output from the form for later use, even when the form
                            dictionary is loaded into virtual memory multiple times
                            (for instance, by different jobs). To ensure correct
                            behavior, XUID values must be assigned from a central
                            registry. This is particularly appropriate for forms treated
                            as named resources. Forms that are created dynamically by
                            an application program should not contain XUID entries.
BBox            array       (Required) An array of four numbers in the form coordinate
                            system, giving the coordinates of the left, bottom, right,
                            and top edges, respectively, of the form’s bounding box.
                            These boundaries are used to clip the form and to determine
                            its size for caching.
Matrix          matrix      (Required) A transformation matrix that maps the form’s
                            coordinate space into user space. This matrix is
                            concatenated with the current transformation matrix before
                            the PaintProc procedure is called.
PaintProc       procedure   (Required) A PostScript procedure for painting the form.
Implementation  any         An additional entry inserted in the dictionary by the
                            execform operator, containing information used by the
                            interpreter to support form caching. The type and value of
                            this entry are implementation-dependent.
The form is defined in its own form coordinate system, defined by concatenating
the matrix specified by the form dictionary’s Matrix entry with the current trans-
formation matrix each time the execform operator is executed. The form diction-
ary’s BBox value is interpreted in the form coordinate system, and the PaintProc
procedure is executed within that coordinate system.
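The concatenation that defines the form coordinate system can be worked through numerically. The Python sketch below uses the six-element matrix form [a b c d tx ty] and assumes the usual PostScript convention that a point transforms as x′ = ax + cy + tx, y′ = bx + dy + ty, with concatenation applying the new matrix first and the existing CTM second. The function names and the sample values are hypothetical.

```python
def concat(m, ctm):
    """Return the matrix product m x ctm: apply m first, then ctm."""
    a, b, c, d, tx, ty = m
    A, B, C, D, TX, TY = ctm
    return [a * A + b * C, a * B + b * D,
            c * A + d * C, c * B + d * D,
            tx * A + ty * C + TX, tx * B + ty * D + TY]


def transform(matrix, x, y):
    """Map the point (x, y) through a six-element matrix."""
    a, b, c, d, tx, ty = matrix
    return (a * x + c * y + tx, b * x + d * y + ty)


form_matrix = [2, 0, 0, 2, 0, 0]        # hypothetical Matrix entry: scale by 2
ctm = [1, 0, 0, 1, 100, 50]             # hypothetical CTM: translate by (100, 50)
form_ctm = concat(form_matrix, ctm)

# Corners of a hypothetical BBox [0 0 10 10], mapped out of form space.
print(transform(form_ctm, 0, 0), transform(form_ctm, 10, 10))
```

Mapping a point through the combined matrix gives the same result as mapping it through the form's Matrix and then through the CTM, which is what it means for BBox and PaintProc to be interpreted in form space.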
The execform operator first checks whether the form dictionary has previously
been used as an operand to execform. If not, it verifies that the dictionary con-
tains the required elements and makes the dictionary read-only. It then paints the
form, either by invoking the form’s PaintProc procedure or by substituting
cached output produced by a previous execution of the same form.

Whenever execform needs to execute the form definition, it does the following:
1. Invokes gsave
2. Concatenates the matrix from the form dictionary’s Matrix entry with the
CTM
3. Clips according to the BBox entry
4. Invokes newpath
5. Pushes the form dictionary on the operand stack
6. Executes the form’s PaintProc procedure
7. Invokes grestore
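The seven steps can be traced with a small mock. The Python sketch below is purely illustrative: MockInterpreter, paint_proc, and the log strings are hypothetical names, the CTM is modeled as a simple list of applied matrices rather than a real matrix product, and caching, clipping, and the read-only marking of the dictionary are not modeled.

```python
REQUIRED_KEYS = ("FormType", "BBox", "Matrix", "PaintProc")


class MockInterpreter:
    """Records the order of operations performed when a form is executed."""

    def __init__(self):
        self.ctm = []                 # stand-in: matrices applied so far
        self.gstate_stack = []
        self.operand_stack = []
        self.log = []

    def execform(self, form):
        missing = [key for key in REQUIRED_KEYS if key not in form]
        if missing:
            raise KeyError(f"form dictionary lacks {missing}")
        self.gstate_stack.append(list(self.ctm))
        self.log.append("gsave")                       # step 1
        self.ctm = self.ctm + [form["Matrix"]]
        self.log.append("concat Matrix")               # step 2
        self.log.append("clip to BBox")                # step 3
        self.log.append("newpath")                     # step 4
        self.operand_stack.append(form)                # step 5
        form["PaintProc"](self)                        # step 6
        self.ctm = self.gstate_stack.pop()
        self.log.append("grestore")                    # step 7


def paint_proc(interp):
    """A well-behaved PaintProc: consumes its dictionary operand, then paints."""
    interp.operand_stack.pop()
    interp.log.append("PaintProc")


interp = MockInterpreter()
form = {"FormType": 1, "BBox": [0, 0, 10, 10], "Matrix": [1, 0, 0, 1, 0, 0],
        "PaintProc": paint_proc}
interp.execform(form)
print(interp.log)
```

After execform returns, the operand stack is empty and the saved CTM is back in place, matching the expectation that a form leaves nothing behind except marks on the page.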
The PaintProc procedure is expected to consume its dictionary operand and to
use the information at hand to paint the form. It must obey certain guidelines to
avoid disrupting the environment in which it is executed:
• It should not invoke any of the operators listed in Appendix G as unsuitable for
  use in encapsulated PostScript files.
• It should not invoke showpage, copypage, or any device setup operator.
• Except for removing its dictionary operand, it should leave the stacks
  unchanged.
• It should have no side effects beyond painting the form. It should not alter
  objects in virtual memory or anywhere else. Because of the effects of caching, the
  PaintProc procedure is called at unpredictable times and in unpredictable
  environments. It should depend only on information in the form dictionary and
  should produce the same effect every time it is called.
Form caching is most effective when the graphics state does not change between
successive invocations of execform for a given form. Changes to the translation
components of the CTM usually do not influence caching behavior; other chang-
es may require the interpreter to reexecute the PaintProc procedure.

4.8 Color Spaces
The PostScript language includes powerful facilities for specifying the colors of
graphical objects to be marked on the current page. The color facilities are divid-
ed into two parts:
• Color specification. A PostScript program can specify abstract colors in a
  device-independent way. Colors can be described in any of a variety of color systems,
  or color spaces. Some color spaces are related to device color representation
  (grayscale, RGB, CMYK), others to human visual perception (CIE-based).
  Certain special features are also modeled as color spaces: patterns, color mapping,
  separations, and high-fidelity and multitone color.
• Color rendering. The PostScript interpreter reproduces colors on the raster
  output device by a multiple-step process that includes color conversion, gamma
  correction, halftoning, and scan conversion. Certain aspects of this process are
  under PostScript language control. However, unlike the facilities for color
  specification, the color rendering facilities are device-dependent and ordinarily
  should not be accessed from a page description.
This section describes the color specification facilities of the PostScript language.
It covers everything that most PostScript programs need in order to specify col-
ors. Chapter 7 describes the facilities for controlling color rendering; a program
should use those facilities only to configure or calibrate an output device or to
achieve special device-dependent effects.
Figures 4.5 and 4.6 on pages 212 and 213 illustrate the organization of the Post-
Script language features for dealing with color, showing the division between
(device-independent) color specification and (device-dependent) color render-
ing.
4.8.1 Types of Color Space
As described in Section 4.5, “Painting,” marks placed on the page by operators
such as fill and stroke have a color that is determined by the current color parame-
ter of the graphics state. A color value consists of one or more color components,
which are usually numbers. For example, a gray level can be specified by a single
number ranging from 0.0 (black) to 1.0 (white). Full color values can be specified
in any of several ways; a common method uses three numbers to specify red,
green, and blue components.

In LanguageLevels 2 and 3, color values are interpreted according to the current
color space, another parameter of the graphics state. A PostScript program first
selects a color space by invoking the setcolorspace operator. It then selects color
values within that color space with the setcolor operator. There are also
convenience operators—setgray, setrgbcolor, sethsbcolor, setcmykcolor, and
setpattern—that select both a color space and a color value in a single step.
In LanguageLevel 1, this distinction between color spaces and color values is not
explicit, and the set of color spaces is limited. Colors can be specified only by
setgray, setrgbcolor, sethsbcolor, and (in some implementations) setcmykcolor.
However, in those color spaces that are supported, the color values produce con-
sistent results from one LanguageLevel to another.
The image and colorimage operators, introduced in Section 4.10, “Images,” en-
able sampled images to be painted on the current page. Each individual sample in
an image is a color value consisting of one or more components to be interpreted
in some color space. Since the color values come from the image itself, the cur-
rent color in the graphics state is not used.
Whether color values originate from the graphics state or from a sampled image,
all later stages of color processing treat them the same way. The following sections
describe the semantics of color values that are specified as operands to the
setcolor operator, but the same semantics also apply to color values originating as
image samples.
Color spaces can be classified into color space families. Spaces within a family
share the same general characteristics; they are distinguished by parameter values
supplied at the time the space is specified. The families, in turn, fall into three cat-
egories:
• Device color spaces directly specify colors or shades of gray that the output
  device is to produce. They provide a variety of color specification methods,
  including gray level, RGB (red-green-blue), HSB (hue-saturation-brightness),
  and CMYK (cyan-magenta-yellow-black), corresponding to the color space
  families DeviceGray, DeviceRGB, and DeviceCMYK. (HSB is merely an alternate
  convention for specifying RGB colors.) Since each of these families consists of
  just a single color space with no parameters, they are sometimes loosely
  referred to as the DeviceGray, DeviceRGB, and DeviceCMYK color spaces.

[Figure: diagram relating the sources of color values (setcolor, setgray,
setrgbcolor, sethsbcolor, setcmykcolor, setpattern, image, colorimage) to the color
spaces and the color values they yield. CIE-based spaces (CIEBasedABC, CIEBasedA,
CIEBasedDEF, CIEBasedDEFG) are converted to internal X, Y, Z values. Device spaces
(DeviceRGB, DeviceCMYK, DeviceGray) yield R, G, B; C, M, Y, K; or gray values, or
are remapped to another (CIE-based) color space when UseCIEColor is true; HSB
values pass through an HSB to RGB conversion. Special spaces: Separation (tint) and
DeviceN (n components) pass through an alternative color transform to another color
space, Indexed passes through a table lookup, and Pattern uses a pattern dictionary.]
FIGURE 4.5 Color specification

[Figure: diagram of the color rendering pipeline. CIE-based X, Y, Z values pass
through the color rendering dictionary (setcolorrendering) to device color values
(R, G, B; C, M, Y, K; or gray, depending on the contents of the rendering
dictionary). Device color values are converted from the input device color space to
the device’s process color model (setundercolorremoval, setblackgeneration). The
resulting components, together with tint values for any single device colorant and n
components for any n device colorants, pass through transfer functions per component
(settransfer, setcolortransfer, sethalftone) and halftones per component (setscreen,
setcolorscreen, sethalftone) to the device’s process colorants.]
FIGURE 4.6 Color rendering

• CIE-based color spaces are based on an international standard for color
  specification created by the Commission Internationale de l’Éclairage (International
  Commission on Illumination). These spaces allow colors to be specified in a
  way that is independent of the characteristics of any particular output device.
  Color space families in this category include CIEBasedABC, CIEBasedA,
  CIEBasedDEF, and CIEBasedDEFG. Individual color spaces within these families
  are specified by means of dictionaries containing the parameter values needed
  to define the space.
• Special color spaces add features or properties to an underlying color space.
  They include facilities for patterns, color mapping, separations, and
  high-fidelity and multitone color. The corresponding color space families are
  Pattern, Indexed, Separation, and DeviceN. Individual color spaces within
  these families are specified by means of additional parameters.
Whatever type of color space a PostScript program uses to specify a color, the pro-
cess of rendering that color on a particular output device is under separate con-
trol. Color rendering is discussed in Chapter 7.
The following operators control the selection of color spaces and color values:
• setcolorspace sets the color space parameter in the graphics state;
  currentcolorspace returns the current color space parameter.
  The operand to setcolorspace is an array object containing as its first element a
  name object identifying the desired color space. The remaining array elements,
  if any, are parameters that further characterize the color space; their number
  and types vary according to the particular color space selected. For color spaces
  that do not require parameters, the operand to setcolorspace can simply be the
  color space name itself instead of an array; currentcolorspace always returns an
  array.
  The following color space families are standard in LanguageLevel 2:
      DeviceGray        CIEBasedABC       Pattern
      DeviceRGB         CIEBasedA         Indexed
      DeviceCMYK                          Separation
  LanguageLevel 3 supports the following additional families:
      CIEBasedDEF       DeviceN
      CIEBasedDEFG
• setcolor sets the current color parameter in the graphics state; currentcolor
  returns the current color parameter. Depending on the color space, setcolor
  requires one or more operands, each specifying one component of the color
  value.
• setgray, setrgbcolor, sethsbcolor, setcmykcolor, and setpattern set the color
  space implicitly and the current color value as specified by the operands.
• currentgray, currentrgbcolor, currenthsbcolor, and currentcmykcolor return
  the current color according to an implicit color space; in certain limited cases,
  the latter operators also perform conversions if the current color space differs
  from the implicit one.
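The relationship among these operators can be modeled in a few lines. The Python class below is a sketch under stated assumptions: ColorState and its method names merely mirror the operator names, only the three parameterless device color spaces are handled, and the use of ValueError for bad input is a placeholder rather than the PostScript error that an interpreter would raise.

```python
# Components expected by setcolor in each device color space.
COMPONENT_COUNTS = {"DeviceGray": 1, "DeviceRGB": 3, "DeviceCMYK": 4}


class ColorState:
    """Minimal model of the color space and color parameters of the graphics state."""

    def __init__(self):
        self.color_space = ["DeviceGray"]
        self.color = None

    def setcolorspace(self, operand):
        # A bare name is accepted for spaces without parameters.
        space = [operand] if isinstance(operand, str) else list(operand)
        if not space or space[0] not in COMPONENT_COUNTS:
            raise ValueError("this sketch models only the device color spaces")
        self.color_space = space
        self.color = None

    def currentcolorspace(self):
        return list(self.color_space)          # always an array, never a bare name

    def setcolor(self, *components):
        expected = COMPONENT_COUNTS[self.color_space[0]]
        if len(components) != expected:
            raise ValueError(f"expected {expected} components, got {len(components)}")
        self.color = tuple(components)

    def setrgbcolor(self, red, green, blue):
        # Convenience form: selects the color space and the color in one step.
        self.setcolorspace("DeviceRGB")
        self.setcolor(red, green, blue)


gs = ColorState()
gs.setcolorspace("DeviceCMYK")
gs.setcolor(0, 1, 1, 0)
print(gs.currentcolorspace(), gs.color)
```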
Note: Color specification operators such as setcolorspace, setcolor, and setpattern
sometimes install composite objects, such as arrays or dictionaries, as parameters in
the graphics state. To ensure predictable behavior, a PostScript program should
thereafter treat all such objects as if they were read-only.

In certain circumstances, it is illegal to invoke operators that specify colors or
other color-related parameters in the graphics state. This restriction occurs when
defining graphical figures whose colors are to be specified separately each time
they are used. Specifically, the restriction applies:
• After execution of setcachedevice or setcachedevice2 in a BuildGlyph,
  BuildChar, or CharStrings procedure of a font dictionary (see Sections 5.7,
  “Type 3 Fonts”; “Type 1 CIDFonts” on page 376; and 5.9.3, “Replacing or
  Adding Individual Glyphs”)
• In the PaintProc procedure of an uncolored tiling pattern (see Section 4.9,
  “Patterns”)
In these circumstances, invoking any of the following operators will cause an
undefined error:
    colorimage              setcolorscreen          setpattern
    image                   setcolorspace           setrgbcolor
    setblackgeneration      setcolortransfer        setscreen
    setcmykcolor            setgray                 settransfer
    setcolor                sethalftone             setundercolorremoval
    setcolorrendering       sethsbcolor             shfill

Note that the imagemask operator is not restricted. This is because it does not
specify colors, but rather designates places where the current color is to be paint-
ed.
4.8.2 Device Color Spaces
The device color spaces enable a page description to specify color values that are
directly related to their representation on an output device. Color values in these
spaces map directly—or via simple conversions—to the application of device col-
orants, such as quantities of ink or intensities of display phosphors. This enables
a PostScript program to control colors precisely for a particular device, but the re-
sults may not be consistent between different devices.
Output devices form colors either by adding light sources together or by sub-
tracting light from an illuminating source. Computer displays and film recorders
typically add colors, while printing inks typically subtract them. These two ways
of forming colors give rise to two complementary forms of color specification:
the additive RGB specification and the subtractive CMYK specification. The cor-
responding device color spaces are as follows:
• DeviceRGB controls the intensities of red, green, and blue light, the three
  additive primary colors used in displays. Colors in this space can alternatively be
  specified by hue, saturation, and brightness values.
• DeviceCMYK controls the concentrations of cyan, magenta, yellow, and black
  inks, the four subtractive process colors used in printing.
• DeviceGray controls the intensity of achromatic light, on a scale from black to
  white.
Although the notion of explicit color spaces is a LanguageLevel 2 feature, the op-
erators for specifying colors in the DeviceRGB and DeviceGray color spaces—
setrgbcolor, sethsbcolor, and setgray—are available in all LanguageLevels. The
setcmykcolor operator is also supported by some (but not all) LanguageLevel 1
implementations.

DeviceRGB Color Space
Colors in the DeviceRGB color space can be specified according to either of two
color models: red-green-blue (RGB) and hue-saturation-brightness (HSB). Each of
these models can specify any reproducible color with three numeric parameters,
but the numbers have different meanings in the two models. Example 4.6 shows
different ways to select the DeviceRGB color space and a specific color within that
space.
Example 4.6
[/DeviceRGB] setcolorspace red green blue setcolor
/DeviceRGB setcolorspace red green blue setcolor
red green blue setrgbcolor
hue saturation brightness sethsbcolor
In the RGB model, a color is described as a combination of the three additive pri-
mary colors (red, green, and blue) in particular concentrations. The intensity of
each primary color is specified by a number in the range 0.0 to 1.0, where 0.0 de-
notes no contribution at all and 1.0 denotes maximum intensity of that color.
If all three primary colors have equal intensity, the perceived result theoretically is
a pure gray on the scale from black to white. If the intensities are not all equal, the
result is some color other than a pure gray.
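The RGB model above can be tried with concrete operands. The following is a minimal sketch; the particular component values are illustrative choices, not values taken from the text.

```postscript
% Illustrative operands for setrgbcolor; each component lies in 0.0 to 1.0.
1.0 0.0 0.0 setrgbcolor            % maximum red, no green or blue
0.0 0.0 0.0 setrgbcolor            % no contribution from any primary: black
0.5 0.5 0.5 setrgbcolor            % equal intensities: a mid gray
currentrgbcolor 3 array astore ==  % prints the three current components
```

Only the last setrgbcolor determines the current color; the earlier lines are there to show the operand order.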
In the HSB model, a color is described by a combination of three parameters
called hue, saturation, and brightness:
Hue corresponds to the property that is intuitively meant by the term “color,”
such as yellow or blue-green.
Saturation indicates how pure the color is. A saturation of 0.0 means that none
of the color’s hue is visible; the result is a shade of gray. A saturation of 1.0 de-
notes a pure color, consisting entirely of the specified hue. Intermediate values
represent a mixture between pure hue and pure gray.
Brightness determines how light the color determined by the hue and satura-
tion will be. A brightness of 0.0 is always black. A brightness of 1.0 denotes the
lightest color that the given combination of hue and saturation can allow. (For
example, pure red can never be as light as the brightest white, because it is
missing two components.)

HSB colors are often illustrated as arranged around a color wheel. The hue param-
eter determines the angular position of a color on this wheel, with 0.0 corre-
sponding to pure red, 1⁄3 (0.333) to pure green, 2⁄3 (0.666) to pure blue, and 1.0
to red again. The saturation parameter denotes the color’s radial position be-
tween the center of the wheel (saturation = 0.0) and the edge (saturation = 1.0).
The brightness parameter controls the brightness of the colors displayed on the
wheel itself.
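The positions on the color wheel can be sketched with sethsbcolor. The operand values below are illustrative, and the comments restate the model described above rather than documenting measured output.

```postscript
% hue saturation brightness sethsbcolor
0.0   1.0 1.0 sethsbcolor   % hue 0.0 at full saturation and brightness: pure red
0.333 1.0 1.0 sethsbcolor   % about one third of the way around the wheel: green
0.666 1.0 1.0 sethsbcolor   % about two thirds of the way around the wheel: blue
0.666 0.0 0.5 sethsbcolor   % saturation 0.0: a mid gray, whatever the hue
0.666 1.0 0.0 sethsbcolor   % brightness 0.0: black, whatever the hue and saturation
0.0   1.0 1.0 sethsbcolor   % back to pure red
currentrgbcolor 3 array astore ==   % the same red, reported as RGB components
```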
Note: HSB is not a color space in its own right. It is simply an alternative convention
for specifying color values in the DeviceRGB color space.
As shown in Example 4.6, the setcolorspace and setcolor operators select the col-
or space and color value separately; setrgbcolor and sethsbcolor set them in
combination. When the specified color space is DeviceRGB, setcolorspace sets the
red, green, and blue components of the current color to 0.0.
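A short sketch of the reset behavior just described: selecting DeviceRGB with setcolorspace discards the previous color and starts from 0.0 in every component. The gray level and RGB values used here are arbitrary.

```postscript
0.5 setgray                      % start in DeviceGray with a mid gray
/DeviceRGB setcolorspace         % select the space only; the color is reset
currentcolor 3 array astore ==   % expected to print three zero components
0.2 0.4 0.6 setcolor             % now supply the color value separately
```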
When DeviceRGB is the current color space, both currentcolor and currentrgb-
color return the current color value in the form of its red, green, and blue compo-
nents, regardless of how it was originally specified. currenthsbcolor returns the
current color value as a hue, saturation, and brightness, converting among color
models as necessary. When one of the other device color spaces (DeviceCMYK or
DeviceGray) is current, currentcolor returns the current color value expressed in
that space; currentrgbcolor and currenthsbcolor convert it to DeviceRGB. (The
conversions are described in Section 7.2, “Conversions among Device Color
Spaces.”) These operators cannot convert from CIE-based or special color spaces.
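The difference between currentcolor and the converting operators can be seen when DeviceGray is current. This sketch assumes the gray-to-RGB rule from Section 7.2, under which all three RGB components equal the gray level.

```postscript
0.5 setgray                          % DeviceGray becomes the current color space
currentcolor ==                      % one component: the gray level itself
currentrgbcolor 3 array astore ==    % converted to DeviceRGB: three equal components
currenthsbcolor 3 array astore ==    % same color as hue, saturation, brightness
```

A gray has no visible hue, so the saturation reported by currenthsbcolor is 0.0 and the brightness carries the gray level.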
Note: Of the operators just described, only setrgbcolor, sethsbcolor, currentrgbcolor,
and currenthsbcolor are supported in LanguageLevel 1.
DeviceCMYK Color Space
The DeviceCMYK color space allows colors to be specified according to the sub-
tractive CMYK model typical of printers and other paper-based output devices.
Each color component in a DeviceCMYK color value specifies the amount of light
that the corresponding ink or other colorant absorbs. In theory, each of the three
standard process colors used in printing (cyan, magenta, and yellow) absorbs one
of the additive primary colors (red, green, and blue, respectively). Black, a fourth
standard process color, absorbs all additive primaries in equal amounts. Each of
the four components in a CMYK color specification is a number between 0.0 and
1.0, where 0.0 represents no ink (that is, absorbs no light) and 1.0 represents the
maximum quantity of ink (absorbs all the light it can). Note that the sense of
these numbers is opposite to that of RGB color components.
Example 4.7 shows different ways to select the DeviceCMYK color space and a
specific color within that space.
Example 4.7
[/DeviceCMYK] setcolorspace cyan magenta yellow black setcolor
/DeviceCMYK setcolorspace cyan magenta yellow black setcolor
cyan magenta yellow black setcmykcolor
The setcolorspace and setcolor operators select the color space and color value
separately; setcmykcolor sets them in combination. When the specified color
space is DeviceCMYK, setcolorspace sets the cyan, magenta, and yellow compo-
nents of the current color to 0.0 and the black component to 1.0.
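As a sketch of the initial value just described, the following selects DeviceCMYK and inspects the color before setting one. The ink amounts in the later lines are illustrative.

```postscript
/DeviceCMYK setcolorspace
currentcolor 4 array astore ==   % expected: cyan, magenta, yellow 0.0 and black 1.0
0.0 1.0 1.0 0.0 setcolor         % full magenta and yellow ink, no cyan or black
0.0 1.0 1.0 0.0 setcmykcolor     % same color, selecting DeviceCMYK in the same step
```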
When DeviceCMYK is the current color space, both currentcolor and current-
cmykcolor return the current color value in the form of its cyan, magenta, yellow,
and black components. When one of the other device color spaces (DeviceRGB or
DeviceGray) is current, currentcolor returns the current color value expressed in
that space; currentcmykcolor converts it to DeviceCMYK. (The conversions are
described in Section 7.2, “Conversions among Device Color Spaces.”) This opera-
tor cannot convert from CIE-based or special color spaces.
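The conversion performed by currentcmykcolor can be sketched from DeviceGray, where the rule in Section 7.2 is simple: cyan, magenta, and yellow are 0.0 and black is 1.0 minus the gray level. Conversion from DeviceRGB is not shown because it also depends on the black generation and undercolor removal functions in effect.

```postscript
0.25 setgray                          % a dark gray in DeviceGray
currentcolor ==                       % still a single gray component
currentcmykcolor 4 array astore ==    % converted: 0.0 0.0 0.0 and black 0.75
```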
Note: The setcmykcolor and currentcmykcolor operators are supported by some, but
not all, LanguageLevel 1 implementations.

DeviceGray Color Space
Black, white, and intermediate shades of gray are special cases of full color. A
grayscale value is represented by a single number in the range 0.0 to 1.0, where
0.0 corresponds to black, 1.0 to white, and intermediate values to different gray
levels. Example 4.8 shows different ways to select the DeviceGray color space and
a specific gray level within that space.

Example 4.8
[/DeviceGray] setcolorspace gray setcolor
/DeviceGray setcolorspace gray setcolor
gray setgray
The setcolorspace and setcolor operators select the color space and color value
separately; setgray sets them in combination. When the specified color space is
DeviceGray, setcolorspace sets the current color to 0.0.
When DeviceGray is the current color space, both currentcolor and currentgray
return the current color value in the form of a single gray component. When one
of the other device color spaces (DeviceRGB or DeviceCMYK) is current,
currentcolor returns the current color value expressed in that space; currentgray
converts it to DeviceGray. (The conversions are described in Section 7.2, “Con-
versions among Device Color Spaces.”) This operator cannot convert from CIE-
based or special color spaces.
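A brief sketch of currentgray converting from the other device color spaces. It relies on the Section 7.2 rules: gray = 0.3 red + 0.59 green + 0.11 blue for DeviceRGB, and full black ink mapping to gray 0.0 for DeviceCMYK.

```postscript
1.0 0.0 0.0 setrgbcolor        % pure red in DeviceRGB
currentgray ==                 % about 0.3 under the rule quoted above
0.0 0.0 0.0 1.0 setcmykcolor   % black ink only in DeviceCMYK
currentgray ==                 % 0.0, that is, black
```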
Note: The setgray and currentgray operators are supported by all PostScript imple-
mentations.

4.8.3 CIE-Based Color Spaces
CIE-based color is defined relative to an international standard used in the
graphic arts, television, and printing industries. It enables a page description to
specify color values in a way that is related to human visual perception. The goal
of this standard is for a given CIE-based color specification to produce consistent
results on different output devices, up to the limitations of each device.
Note: The detailed semantics of the CIE colorimetric system and the theory on which
it is based are beyond the scope of this book. See the Bibliography for sources of fur-
ther information.

The semantics of the CIE-based color spaces are defined in terms of the relation-
ship between the space’s components and the tristimulus values X, Y, and Z of the
CIE 1931 XYZ space. LanguageLevel 2 supports two CIE-based color space fami-
lies, named CIEBasedABC and CIEBasedA; LanguageLevel 3 adds two more such
families, CIEBasedDEF and CIEBasedDEFG. CIE-based color spaces are normally
selected by
[name dictionary] setcolorspace

where name is the name of one of the CIE-based color space families and
dictionary is a dictionary containing parameters that further characterize the
color space. The entries in this dictionary have specific interpretations that vary
depending on the color space; some entries are required and some are optional.
Having selected a color space, a PostScript program can then specify color values
using the setcolor operator. Color values consist of a single component in a
CIEBasedA color space, three components in a CIEBasedABC or CIEBasedDEF col-
or space, and four components in a CIEBasedDEFG color space. The interpreta-
tion of these values varies depending on the specific color space.
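The selection sequence described above might look as follows for a CIEBasedABC space. This is a sketch under assumptions: WhitePoint is the required dictionary entry (it appears in the full version of Table 4.5), the white point numbers are illustrative, and every other entry is left at its default.

```postscript
% [name dictionary] setcolorspace, then setcolor with one value per component
[ /CIEBasedABC
  << /WhitePoint [0.9505 1.0 1.0890] >>   % illustrative diffuse white point
] setcolorspace
0.25 0.5 0.125 setcolor        % three components, because the family is CIEBasedABC
currentcolorspace 0 get ==     % prints the family name
```

A CIEBasedA space would take a single operand to setcolor, and a CIEBasedDEFG space four.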
Note: To use any of the CIE-based color spaces with the image operator requires
using the one-operand (dictionary) form of that operator, which interprets sample
values according to the current color space. See Section 4.10.5, “Image Dictionaries.”

CIE-based color spaces are a feature of LanguageLevel 2 (CIEBasedABC,
CIEBasedA) and LanguageLevel 3 (CIEBasedDEF, CIEBasedDEFG). Such spaces are
entirely separate from device color spaces. Operators that refer to device color
spaces implicitly, such as setrgbcolor and currentrgbcolor, have no connection
with CIE-based color spaces; they do not perform conversions between CIE-
based and device color spaces. (Note, however, that the PostScript interpreter
may perform such conversions internally under the control of the UseCIEColor
parameter in the page device dictionary; see “Remapping Device Colors to CIE”
on page 237.) The setrgbcolor operator changes the color space to DeviceRGB.
When the current color space is CIE-based, currentrgbcolor returns the initial
value of the DeviceRGB color space, which has no relation to the current color in
the graphics state.
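The separation between CIE-based and device color operators can be sketched as follows. The color space dictionary is a minimal illustrative one (only WhitePoint, with assumed values), and the comments restate the behavior described above.

```postscript
[ /CIEBasedABC << /WhitePoint [0.9505 1.0 1.0890] >> ] setcolorspace
0.25 0.5 0.125 setcolor
currentrgbcolor 3 array astore ==   % expected 0.0 0.0 0.0: unrelated to the color just set
0.25 0.5 0.125 setrgbcolor          % implicitly replaces the CIE-based space
currentcolorspace ==                % expected to name DeviceRGB
```

The same three numbers mean different things in the two calls: components A, B, and C in the first, and device red, green, and blue in the second.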
CIEBasedABC Color Spaces
A CIEBasedABC color space (LanguageLevel 2) is defined in terms of a two-stage,
nonlinear transformation of the CIE 1931 XYZ space. The formulation of
CIEBasedABC color spaces models a simple zone theory of color vision, consisting
of a nonlinear trichromatic first stage combined with a nonlinear opponent-color
second stage. This formulation allows colors to be digitized with minimum loss
of fidelity, an important consideration in sampled images.

The CIEBasedABC family includes a variety of interesting and useful color spaces,
such as the CIE 1931 XYZ space, a class of calibrated RGB spaces, and a class of
opponent-color spaces such as the CIE 1976 L*a*b* space and the NTSC,
SECAM, and PAL television spaces.
Color values in CIEBasedABC color spaces have three components, arbitrarily
named A, B, and C. They can represent a variety of independent color compo-
nents, depending on how the space is parameterized. For example, A, B, and C
may represent:
X, Y, and Z in the CIE 1931 XYZ space
R, G, and B in a calibrated RGB space
L*, a*, and b* in the CIE 1976 L*a*b* space
Y, I, and Q in the NTSC television space
Y, U, and V in the SECAM and PAL television spaces
The initial values of A, B, and C are 0.0 unless the range of valid values for a color
component does not include 0.0, in which case the nearest valid value is substi-
tuted.
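To illustrate the rule for initial values, the following sketch defines a space whose A and C ranges exclude 0.0. The range and white point numbers are invented for the example, and the expected result is derived from the rule above rather than from a particular interpreter.

```postscript
[ /CIEBasedABC
  << /RangeABC   [0.25 1.0  0.0 1.0  -1.0 -0.5]   % A and C cannot be 0.0
     /WhitePoint [0.9505 1.0 1.0890]
  >>
] setcolorspace
currentcolor 3 array astore ==   % by the rule above: 0.25 for A, 0.0 for B, -0.5 for C
```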
The parameters for a CIEBasedABC color space must be provided in a dictionary
that is the second element of the array operand to the setcolorspace operator.
Table 4.5 describes the contents of this dictionary; Figure 4.7 illustrates the trans-
formations involved.
[Figure: components A, B, and C pass through DecodeABC and MatrixABC to yield L, M, and N, which then pass through DecodeLMN and MatrixLMN]
FIGURE 4.7 Component transformations in the CIEBasedABC color space
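The first stage of this transformation, governed by the DecodeABC and MatrixABC entries of Table 4.5, can be emulated in ordinary PostScript code. The sketch below is only an illustration of the arithmetic: abcToLMN is a hypothetical helper, not an operator, and the DecodeABC and MatrixABC values are invented so that the result is easy to check by hand.

```postscript
% Illustrative parameters (the defaults would be identity procedures and matrix).
/DecodeABC [ {0.5 mul} {} {} ] def           % D_A halves A; D_B and D_C are identity
/MatrixABC [ 1 1 0  0 1 0  0 0 1 ] def       % [L_A M_A N_A  L_B M_B N_B  L_C M_C N_C]

% Hypothetical helper:  A B C  abcToLMN  L M N
/abcToLMN {
  4 dict begin
    DecodeABC 2 get exec /c exch def         % c = D_C(C)
    DecodeABC 1 get exec /b exch def         % b = D_B(B)
    DecodeABC 0 get exec /a exch def         % a = D_A(A)
    0 1 2 {                                  % i = 0, 1, 2 yields L, M, N in turn
      /i exch def
      a MatrixABC i get mul                  % decoded A times L_A, M_A, or N_A
      b MatrixABC i 3 add get mul add        % plus decoded B times L_B, M_B, or N_B
      c MatrixABC i 6 add get mul add        % plus decoded C times L_C, M_C, or N_C
    } for
  end
} def

0.5 0.25 0.75 abcToLMN 3 array astore ==     % L = 0.25, M = 0.5, N = 0.75
```

With these values D_A(A) is 0.25, so L = 0.25, M = 0.25 + 0.25, and N = 0.75. An interpreter performs the equivalent computation internally; a page description only supplies the dictionary entries.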

TABLE 4.5 Entries in a CIEBasedABC color space dictionary
KEY  TYPE  VALUE
RangeABC  array  (Optional) An array of six numbers [A_0 A_1 B_0 B_1 C_0 C_1] specifying the range
of valid values for the A, B, and C components of the color space; that is,
A_0 ≤ A ≤ A_1, B_0 ≤ B ≤ B_1, and C_0 ≤ C ≤ C_1. Component values falling outside
the specified range will be adjusted to the nearest valid value without error
indication. Default value: [0.0 1.0 0.0 1.0 0.0 1.0].
DecodeABC  array  (Optional) An array of three PostScript procedures [D_A D_B D_C] that decode
the A, B, and C components of the color space into values that are linear with
respect to an intermediate LMN representation; see MatrixABC below for further
explanation. Default value: the array of identity procedures [{} {} {}].
Each of the three procedures is called with an encoded A, B, or C component
on the operand stack and must return the corresponding decoded value. The
result must be a monotonically nondecreasing function of the operand. The
procedures must be prepared to accept operand values outside the ranges
specified by the RangeABC entry and to deal with such values in a robust way.
Because these procedures are called at unpredictable times and in unpredictable
environments, they must operate as pure functions without side effects.
MatrixABC  array  (Optional) An array of nine numbers [L_A M_A N_A L_B M_B N_B L_C M_C N_C]
specifying the linear interpretation of the decoded A, B, and C components of the
color space with respect to the intermediate LMN representation. Default
value: the identity matrix [1 0 0 0 1 0 0 0 1].
The transformation defined by the DecodeABC and MatrixABC entries is
\[
\begin{aligned}
L &= D_A(A) \times L_A + D_B(B) \times L_B + D_C(C) \times L_C \\
M &= D_A(A) \times M_A + D_B(B) \times M_B + D_C(C) \times M_C \\
N &= D_A(A) \times N_A + D_B(B) \times N_B + D_C(C) \times N_C
\end{aligned}
\]
In other words, the A, B, and C components of the color space are first de-
coded individually by the DecodeABC procedures. The results ar