- <pre>
- DRAFT TIFF Technical Note #2 17-Mar-95
- ============================
- This Technical Note describes serious problems that have been found in
- TIFF 6.0's design for embedding JPEG-compressed data in TIFF (Section 22
- of the TIFF 6.0 spec of 3 June 1992). A replacement TIFF/JPEG
- specification is given. Some corrections to Section 21 are also given.
- To permit TIFF implementations to continue to read existing files, the 6.0
- JPEG fields and tag values will remain reserved indefinitely. However,
- TIFF writers are strongly discouraged from using the 6.0 JPEG design. It
- is expected that the next full release of the TIFF specification will not
- describe the old design at all, except to note that certain tag numbers
- are reserved. The existing Section 22 will be replaced by the
- specification text given in the second part of this Tech Note.
- Problems in TIFF 6.0 JPEG
- =========================
- Abandoning a published spec is not a step to be taken lightly. This
- section summarizes the reasons that have forced this decision.
- TIFF 6.0's JPEG design suffers from design errors and limitations,
- ambiguities, and unnecessary complexity.
- Design errors and limitations
- -----------------------------
- The fundamental design error in the existing Section 22 is that JPEG's
- various tables and parameters are broken out as separate fields which the
- TIFF control logic must manage. This is bad software engineering: that
- information should be treated as private to the JPEG codec
- (compressor/decompressor). Worse, the fields themselves are specified
- without sufficient thought for future extension and without regard to
- well-established TIFF conventions. Here are some of the significant
- problems:
- * The JPEGxxTable fields do not store the table data directly in the
- IFD/field structure; rather, the fields hold pointers to information
- elsewhere in the file. This requires special-purpose code to be added to
- *every* TIFF-manipulating application, whether it needs to decode JPEG
- image data or not. Even a trivial TIFF editor, for example a program to
- add an ImageDescription field to a TIFF file, must be explicitly aware of
- the internal structure of the JPEG-related tables, or else it will probably
- break the file. Every other auxiliary field in the TIFF spec contains
- data, not pointers, and can be copied or relocated by standard code that
- doesn't know anything about the particular field. This is a crucial
- property of the TIFF format that must not be given up.
- * To manipulate these fields, the TIFF control logic is required to know a
- great deal about JPEG details, for example such arcana as how to compute
- the length of a Huffman code table --- the length is not supplied in the
- field structure and can only be found by inspecting the table contents.
- This is again a violation of good software practice. Moreover, it will
- prevent easy adoption of future JPEG extensions that might change these
- low-level details.
- * The design neglects the fact that baseline JPEG codecs support only two
- sets of Huffman tables: it specifies a separate table for each color
- component. This implies that encoders must waste space (by storing
- duplicate Huffman tables) or else violate the well-founded TIFF convention
- that prohibits duplicate pointers. Furthermore, baseline decoders must
- test to find out which tables are identical, a waste of time and code
- space.
- * The JPEGInterchangeFormat field also violates TIFF's proscription against
- duplicate pointers: the normal strip/tile pointers are expected to point
- into the larger data area pointed to by JPEGInterchangeFormat. All TIFF
- editing applications must be specifically aware of this relationship, since
- they must maintain it or else delete the JPEGInterchangeFormat field. The
- JPEGxxTables fields are also likely to point into the JPEGInterchangeFormat
- area, creating additional pointer relationships that must be maintained.
- * The JPEGQTables field is fixed at a byte per table entry; there is no
- way to support 16-bit quantization values. This is a serious impediment
- to extending TIFF to use 12-bit JPEG.
- * The 6.0 design cannot support using different quantization tables in
- different strips/tiles of an image (so as to encode some areas at higher
- quality than others). Furthermore, since quantization tables are tied
- one-for-one to color components, the design cannot support table switching
- options that are likely to be added in future JPEG revisions.
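- The table-length computation mentioned in the list above can be sketched
- as follows. This is only an illustration, assuming the 6.0 layout in which
- a Huffman table is stored as 16 code-length counts followed by the code
- values; the function name and arguments are hypothetical.

```python
def huffman_table_length(data: bytes, offset: int) -> int:
    """Return the byte length of a 6.0-style Huffman table starting at offset.

    Assumed layout: 16 count bytes (number of codes of each length 1..16)
    followed by one value byte per code. The length is not stored anywhere,
    so it has to be derived from the counts.
    """
    counts = data[offset:offset + 16]
    if len(counts) != 16:
        raise ValueError("truncated Huffman table")
    return 16 + sum(counts)
```

- Even a program that only relocates the table must perform this
- JPEG-specific calculation, which is the objection raised above.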
- Ambiguities
- -----------
- Several incompatible interpretations are possible for 6.0's treatment of
- JPEG restart markers:
- * It is unclear whether restart markers must be omitted at TIFF segment
- (strip/tile) boundaries, or whether they are optional.
- * It is unclear whether the segment size is required to be chosen as
- a multiple of the specified restart interval (if any); perhaps the
- JPEG codec is supposed to be reset at each segment boundary as if
- there were a restart marker there, even if the boundary does not fall
- at a multiple of the nominal restart interval.
- * The spec fails to address the question of restart marker numbering:
- do the numbers begin again within each segment, or not?
- That last point is particularly nasty. If we make numbering begin again
- within each segment, we give up the ability to impose a TIFF strip/tile
- structure on an existing JPEG datastream with restarts (which was clearly a
- goal of Section 22's authors). But the other choice interferes with random
- access to the image segments: a reader must compute the first restart
- number to be expected within a segment, and must have a way to reset its
- JPEG decoder to expect a nonzero restart number first. This may not even
- be possible with some JPEG chips.
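- To illustrate the computation a reader would face under the "numbering
- continues across segments" interpretation, here is a hedged sketch. It
- assumes that RSTn numbers cycle 0..7 as in ISO JPEG and that each segment
- starts on a multiple of the restart interval; the function name is
- hypothetical, and 6.0 does not say that this interpretation is the
- intended one.

```python
def first_restart_number(mcus_before_segment: int, restart_interval: int) -> int:
    """Number n of the first RSTn expected inside a segment.

    Assumes restart numbering runs continuously through the whole image
    (it is NOT reset per segment) and that the segment begins exactly on
    a restart-interval boundary.
    """
    if restart_interval <= 0:
        raise ValueError("restart interval must be positive")
    if mcus_before_segment % restart_interval != 0:
        raise ValueError("segment does not start on a restart boundary")
    # The marker that follows interval j (counting from 0) is RST(j mod 8).
    return (mcus_before_segment // restart_interval) % 8
```

- A decoder would then have to be told to expect this nonzero number first,
- which is the capability some JPEG hardware may lack.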
- The tile height restriction found on page 104 contradicts Section 15's
- general description of tiles. For an image that is not vertically
- downsampled, page 104 specifies a tile height of one MCU or 8 pixels; but
- Section 15 requires tiles to be a multiple of 16 pixels high.
- This Tech Note does not attempt to resolve these ambiguities, so
- implementations that follow the 6.0 design should be aware that
- inter-application compatibility problems are likely to arise.
- Unnecessary complexity
- ----------------------
- The 6.0 design creates problems for implementations that need to keep the
- JPEG codec separate from the TIFF control logic --- for example, consider
- using a JPEG chip that was not designed specifically for TIFF. JPEG codecs
- generally want to produce or consume a standard ISO JPEG datastream, not
- just raw compressed data. (If they were to handle raw data, a separate
- out-of-band mechanism would be needed to load tables into the codec.)
- With such a codec, the TIFF control logic must parse JPEG markers emitted
- by the codec to create the TIFF table fields (when writing) or synthesize
- JPEG markers from the TIFF fields to feed the codec (when reading). This
- means that the control logic must know a great deal more about JPEG details
- than we would like. The parsing and reconstruction of the markers also
- represents a fair amount of unnecessary work.
- Quite a few implementors have proposed writing "TIFF/JPEG" files in which
- a standard JPEG datastream is simply dumped into the file and pointed to
- by JPEGInterchangeFormat. To avoid parsing the JPEG datastream, they
- suggest not writing the JPEG auxiliary fields (JPEGxxTables etc) nor even
- the basic TIFF strip/tile data pointers. This approach is incompatible
- with implementations that handle the full TIFF 6.0 JPEG design, since they
- will expect to find strip/tile pointers and auxiliary fields. Indeed this
- is arguably not TIFF at all, since *all* TIFF-reading applications expect
- to find strip or tile pointers. A subset implementation that is not
- upward-compatible with the full spec is clearly unacceptable. However,
- the frequency with which this idea has come up makes it clear that
- implementors find the existing Section 22 too complex.
- Overview of the solution
- ========================
- To solve these problems, we adopt a new design for embedding
- JPEG-compressed data in TIFF files. The new design uses only complete,
- uninterpreted ISO JPEG datastreams, so it should be much more forgiving of
- extensions to the ISO standard. It should also be far easier to implement
- using unmodified JPEG codecs.
- To reduce overhead in multi-segment TIFF files, we allow JPEG overhead
- tables to be stored just once in a JPEGTables auxiliary field. This
- feature does not violate the integrity of the JPEG datastreams, because it
- uses the notions of "tables-only datastreams" and "abbreviated image
- datastreams" as defined by the ISO standard.
- To prevent confusion with the old design, the new design is given a new
- Compression tag value, Compression=7. Readers that need to handle
- existing 6.0 JPEG files may read both old and new files, using whatever
- interpretation of the 6.0 spec they did before. Compression tag value 6
- and the field tag numbers defined by 6.0 section 22 will remain reserved
- indefinitely, even though detailed descriptions of them will be dropped
- from future editions of the TIFF specification.
- Replacement TIFF/JPEG specification
- ===================================
- [This section of the Tech Note is expected to replace Section 22 in the
- next release of the TIFF specification.]
- This section describes TIFF compression scheme 7, a high-performance
- compression method for continuous-tone images.
- Introduction
- ------------
- This TIFF compression method uses the international standard for image
- compression ISO/IEC 10918-1, usually known as "JPEG" (after the original
- name of the standards committee, Joint Photographic Experts Group). JPEG
- is a joint ISO/CCITT standard for compression of continuous-tone images.
- The JPEG committee decided that because of the broad scope of the standard,
- no one algorithmic procedure was able to satisfy the requirements of all
- applications. Instead, the JPEG standard became a "toolkit" of multiple
- algorithms and optional capabilities. Individual applications may select
- a subset of the JPEG standard that meets their requirements.
- The most important distinction among the JPEG processes is between lossy
- and lossless compression. Lossy compression methods provide high
- compression but allow only approximate reconstruction of the original
- image. JPEG's lossy processes allow the encoder to trade off compressed
- file size against reconstruction fidelity over a wide range. Typically,
- 10:1 or more compression of full-color data can be obtained while keeping
- the reconstructed image visually indistinguishable from the original. Much
- higher compression ratios are possible if a low-quality reconstructed image
- is acceptable. Lossless compression provides exact reconstruction of the
- source data, but the achievable compression ratio is much lower than for
- the lossy processes; JPEG's rather simple lossless process typically
- achieves around 2:1 compression of full-color data.
- The most widely implemented JPEG subset is the "baseline" JPEG process.
- This provides lossy compression of 8-bit-per-channel data. Optional
- extensions include 12-bit-per-channel data, arithmetic entropy coding for
- better compression, and progressive/hierarchical representations. The
- lossless process is an independent algorithm that has little in
- common with the lossy processes.
- It should be noted that the optional arithmetic-coding extension is subject
- to several US and Japanese patents. To avoid patent problems, use of
- arithmetic coding processes in TIFF files intended for inter-application
- interchange is discouraged.
- All of the JPEG processes are useful only for "continuous tone" data,
- in which the difference between adjacent pixel values is usually small.
- Low-bit-depth source data is not appropriate for JPEG compression, nor
- are palette-color images good candidates. The JPEG processes work well
- on grayscale and full-color data.
- Describing the JPEG compression algorithms in sufficient detail to permit
- implementation would require more space than we have here. Instead, we
- refer the reader to the References section.
- What data is being compressed?
- ------------------------------
- In lossy JPEG compression, it is customary to convert color source data
- to YCbCr and then downsample it before JPEG compression. This gives
- 2:1 data compression with hardly any visible image degradation, and it
- permits additional space savings within the JPEG compression step proper.
- However, these steps are not considered part of the ISO JPEG standard.
- The ISO standard is "color blind": it accepts data in any color space.
- For TIFF purposes, the JPEG compression tag is considered to represent the
- ISO JPEG compression standard only. The ISO standard is applied to the
- same data that would be stored in the TIFF file if no compression were
- used. Therefore, if color conversion or downsampling are used, they must
- be reflected in the regular TIFF fields; these steps are not considered to
- be implicit in the JPEG compression tag value. PhotometricInterpretation
- and related fields shall describe the color space actually stored in the
- file. With the TIFF 6.0 field definitions, downsampling is permissible
- only for YCbCr data, and it must correspond to the YCbCrSubSampling field.
- (Note that the default value for this field is not 1,1; so the default for
- YCbCr is to apply downsampling!) It is likely that future versions of TIFF
- will provide additional PhotometricInterpretation values and a more general
- way of defining subsampling, so as to allow more flexibility in
- JPEG-compressed files. But that issue is not addressed in this Tech Note.
- Implementors should note that many popular JPEG codecs
- (compressor/decompressors) provide automatic color conversion and
- downsampling, so that the application may supply full-size RGB data which
- is nonetheless converted to downsampled YCbCr. This is an implementation
- convenience which does not excuse the TIFF control layer from its
- responsibility to know what is really going on. The
- PhotometricInterpretation and subsampling fields written to the file must
- describe what is actually in the file.
- A JPEG-compressed TIFF file will typically have PhotometricInterpretation =
- YCbCr and YCbCrSubSampling = [2,1] or [2,2], unless the source data was
- grayscale or CMYK.
- Basic representation of JPEG-compressed images
- ----------------------------------------------
- JPEG compression works in either strip-based or tile-based TIFF files.
- Rather than repeating "strip or tile" constantly, we will use the term
- "segment" to mean either a strip or a tile.
- When the Compression field has the value 7, each image segment contains
- a complete JPEG datastream which is valid according to the ISO JPEG
- standard (ISO/IEC 10918-1). Any sequential JPEG process can be used,
- including lossless JPEG, but progressive and hierarchical processes are not
- supported. Since JPEG is useful only for continuous-tone images, the
- PhotometricInterpretation of the image shall not be 3 (palette color) nor
- 4 (transparency mask). The bit depth of the data is also restricted as
- specified below.
- Each image segment in a JPEG-compressed TIFF file shall contain a valid
- JPEG datastream according to the ISO JPEG standard's rules for
- interchange-format or abbreviated-image-format data. The datastream shall
- contain a single JPEG frame storing that segment of the image. The
- required JPEG markers within a segment are:
-     SOI   (must appear at very beginning of segment)
-     SOFn
-     SOS   (one for each scan, if there is more than one scan)
-     EOI   (must appear at very end of segment)
- The actual compressed data follows SOS; it may contain RSTn markers if DRI
- is used.
- Additional JPEG "tables and miscellaneous" markers may appear between SOI
- and SOFn, between SOFn and SOS, and before each subsequent SOS if there is
- more than one scan. These markers include:
-     DQT
-     DHT
-     DAC   (not to appear unless arithmetic coding is used)
-     DRI
-     APPn  (shall be ignored by TIFF readers)
-     COM   (shall be ignored by TIFF readers)
- DNL markers shall not be used in TIFF files. Readers should abort if any
- other marker type is found, especially the JPEG reserved markers;
- occurrence of such a marker is likely to indicate a JPEG extension.
- The tables/miscellaneous markers may appear in any order. Readers are
- cautioned that although the SOFn marker refers to DQT tables, JPEG does not
- require those tables to precede the SOFn, only the SOS. Missing-table
- checks should be made when SOS is reached.
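- The marker rules above might be checked along the following lines. This
- is a minimal sketch, not a complete validator: it walks only the markers
- from SOI up to the first SOS, does not handle optional 0xFF fill bytes or
- later scans, and does not inspect marker contents. The marker codes are
- those of ISO JPEG; the function and constant names are hypothetical.

```python
import struct

SOI, EOI, SOS = 0xD8, 0xD9, 0xDA
SOF_ALLOWED = {0xC0, 0xC1, 0xC3, 0xC9, 0xCB}          # SOF0, SOF1, SOF3, SOF9, SOF11
TABLES_MISC = {0xDB, 0xC4, 0xCC, 0xDD, 0xFE} | set(range(0xE0, 0xF0))
#              DQT   DHT   DAC   DRI   COM     APP0..APP15

def header_markers(segment: bytes) -> list:
    """Return the marker codes from SOI through the first SOS.

    Raises ValueError on any marker outside the permitted set (for
    example DNL, 0xDC), since that probably indicates a JPEG extension.
    """
    if segment[:2] != b"\xff\xd8":
        raise ValueError("segment must begin with SOI")
    if segment[-2:] != b"\xff\xd9":
        raise ValueError("segment must end with EOI")
    seen = [SOI]
    pos = 2
    while True:
        if pos + 4 > len(segment) or segment[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        code = segment[pos + 1]
        if code in SOF_ALLOWED:
            if any(c in SOF_ALLOWED for c in seen):
                raise ValueError("more than one SOFn in segment")
        elif code not in TABLES_MISC and code != SOS:
            raise ValueError("unsupported marker 0xFF%02X" % code)
        # Marker lengths are MSB-first and include the 2 length bytes.
        (length,) = struct.unpack_from(">H", segment, pos + 2)
        seen.append(code)
        if code == SOS:
            if not any(c in SOF_ALLOWED for c in seen):
                raise ValueError("SOS found before SOFn")
            return seen
        pos += 2 + length
```

- Missing-table checks are deliberately left out here; as noted above, they
- belong at the point where SOS is reached.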
- If no JPEGTables field is used, then each image segment shall be a complete
- JPEG interchange datastream. Each segment must define all the tables it
- references. To allow readers to decode segments in any order, no segment
- may rely on tables being carried over from a previous segment.
- When a JPEGTables field is used, image segments may omit tables that have
- been specified in the JPEGTables field. Further details appear below.
- The SOFn marker shall be of type SOF0 for strict baseline JPEG data, of
- type SOF1 for non-baseline lossy JPEG data, or of type SOF3 for lossless
- JPEG data. (SOF9 or SOF11 would be used for arithmetic coding.) All
- segments of a JPEG-compressed TIFF image shall use the same JPEG
- compression process, in particular the same SOFn type.
- The data precision field of the SOFn marker shall agree with the TIFF
- BitsPerSample field. (Note that when PlanarConfiguration=1, this implies
- that all components must have the same BitsPerSample value; when
- PlanarConfiguration=2, different components could have different bit
- depths.) For SOF0 only precision 8 is permitted; for SOF1, precision 8 or
- 12 is permitted; for SOF3, precisions 2 to 16 are permitted.
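- The precision rule can be expressed as a small lookup. This sketch covers
- only the three SOFn types whose precisions are listed above; the
- arithmetic-coding types are deliberately left out because no precisions
- are given for them here. All names are hypothetical.

```python
# Permitted sample precisions per SOFn marker code, as listed in the text.
ALLOWED_PRECISIONS = {
    0xC0: {8},                  # SOF0: strict baseline
    0xC1: {8, 12},              # SOF1: non-baseline lossy
    0xC3: set(range(2, 17)),    # SOF3: lossless, 2..16 bits
}

def precision_ok(sof_code: int, sof_precision: int, bits_per_sample: int) -> bool:
    """True if the SOFn precision is permitted and matches BitsPerSample."""
    allowed = ALLOWED_PRECISIONS.get(sof_code)
    if allowed is None:
        raise ValueError("SOFn type 0xFF%02X not covered by this sketch" % sof_code)
    return sof_precision == bits_per_sample and sof_precision in allowed
```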
- The image dimensions given in the SOFn marker shall agree with the logical
- dimensions of that particular strip or tile. For strip images, the SOFn
- image width shall equal ImageWidth and the height shall equal RowsPerStrip,
- except in the last strip; its SOFn height shall equal the number of rows
- remaining in the ImageLength. (In other words, no padding data is counted
- in the SOFn dimensions.) For tile images, each SOFn shall have width
- TileWidth and height TileHeight; adding and removing any padding needed in
- the edge tiles is the concern of some higher level of the TIFF software.
- (The dimensional rules are slightly different when PlanarConfiguration=2,
- as described below.)
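- For strip images with PlanarConfiguration=1, the expected SOFn dimensions
- follow directly from the TIFF fields. A minimal sketch, with hypothetical
- function and parameter names:

```python
def strip_sof_dimensions(image_width, image_length, rows_per_strip, strip_index):
    """Expected (width, height) in the SOFn marker of the given strip."""
    # RowsPerStrip may exceed ImageLength in a single-strip image.
    rows_per_strip = min(rows_per_strip, image_length)
    strip_count = -(-image_length // rows_per_strip)   # ceiling division
    if not 0 <= strip_index < strip_count:
        raise IndexError("no such strip")
    # The last strip holds only the rows that remain; padding is not counted.
    height = min(rows_per_strip, image_length - strip_index * rows_per_strip)
    return image_width, height
```

- For example, an image 50 rows long with RowsPerStrip = 16 has four strips,
- and the SOFn height of the last one is 2.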
- The ISO JPEG standard only permits images up to 65535 pixels in width or
- height, due to 2-byte fields in the SOFn markers. In TIFF, this limits
- the size of an individual JPEG-compressed strip or tile, but the total
- image size can be greater.
- The number of components in the JPEG datastream shall equal SamplesPerPixel
- for PlanarConfiguration=1, and shall be 1 for PlanarConfiguration=2. The
- components shall be stored in the same order as they are described at the
- TIFF field level. (This applies both to their order in the SOFn marker,
- and to the order in which they are scanned if multiple JPEG scans are
- used.) The component ID bytes are arbitrary so long as each component
- within an image segment is given a distinct ID. To avoid any possible
- confusion, we require that all segments of a TIFF image use the same ID
- code for a given component.
- In PlanarConfiguration 1, the sampling factors given in SOFn markers shall
- agree with the sampling factors defined by the related TIFF fields (or with
- the default values that are specified in the absence of those fields).
- When DCT-based JPEG is used in a strip TIFF file, RowsPerStrip is required
- to be a multiple of 8 times the largest vertical sampling factor, i.e., a
- multiple of the height of an interleaved MCU. (For simplicity of
- specification, we require this even if the data is not actually
- interleaved.) For example, if YCbCrSubSampling = [2,2] then RowsPerStrip
- must be a multiple of 16. An exception to this rule is made for
- single-strip images (RowsPerStrip >= ImageLength): the exact value of
- RowsPerStrip is unimportant in that case. This rule ensures that no data
- padding is needed at the bottom of a strip, except perhaps the last strip.
- Any padding required at the right edge of the image, or at the bottom of
- the last strip, is expected to occur internally to the JPEG codec.
- When DCT-based JPEG is used in a tiled TIFF file, TileLength is required
- to be a multiple of 8 times the largest vertical sampling factor, i.e.,
- a multiple of the height of an interleaved MCU; and TileWidth is required
- to be a multiple of 8 times the largest horizontal sampling factor, i.e.,
- a multiple of the width of an interleaved MCU. (For simplicity of
- specification, we require this even if the data is not actually
- interleaved.) All edge padding required will therefore occur in the course
- of normal TIFF tile padding; it is not special to JPEG.
- Lossless JPEG does not impose these constraints on strip and tile sizes,
- since it is not DCT-based.
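- The strip and tile size rules above can be sketched as follows. Sampling
- factors are passed as a list of (horizontal, vertical) pairs, one per
- component; this representation and all names are assumptions made for the
- example.

```python
def interleaved_mcu_size(sampling):
    """(width, height) in pixels of an interleaved MCU for DCT-based JPEG."""
    return (8 * max(h for h, _ in sampling),
            8 * max(v for _, v in sampling))

def strip_rows_ok(rows_per_strip, image_length, sampling, lossless=False):
    """Check RowsPerStrip against the MCU-height rule."""
    if lossless or rows_per_strip >= image_length:
        return True          # lossless JPEG and single-strip images are exempt
    _, mcu_height = interleaved_mcu_size(sampling)
    return rows_per_strip % mcu_height == 0

def tile_size_ok(tile_width, tile_length, sampling, lossless=False):
    """Check TileWidth and TileLength against the MCU-size rule."""
    if lossless:
        return True
    mcu_width, mcu_height = interleaved_mcu_size(sampling)
    return tile_width % mcu_width == 0 and tile_length % mcu_height == 0
```

- With YCbCrSubSampling = [2,2], the sampling list is [(2, 2), (1, 1),
- (1, 1)], the MCU is 16 by 16 pixels, and RowsPerStrip = 8 is rejected
- unless the image fits in a single strip.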
- Note that within JPEG datastreams, multibyte values appear in the MSB-first
- order specified by the JPEG standard, regardless of the byte ordering of
- the surrounding TIFF file.
- JPEGTables field
- ----------------
- The only auxiliary TIFF field added for Compression=7 is the optional
- JPEGTables field. The purpose of JPEGTables is to predefine JPEG
- quantization and/or Huffman tables for subsequent use by JPEG image
- segments. When this is done, these rather bulky tables need not be
- duplicated in each segment, thus saving space and processing time.
- JPEGTables may be used even in a single-segment file, although there is no
- space savings in that case.
- JPEGTables:
-     Tag  = 347 (15B.H)
-     Type = UNDEFINED
-     N    = number of bytes in tables datastream, typically a few hundred
- JPEGTables provides default JPEG quantization and/or Huffman tables which
- are used whenever a segment datastream does not contain its own tables, as
- specified below.
- Notice that the JPEGTables field is required to have type code UNDEFINED,
- not type code BYTE. This is to cue readers that expanding individual bytes
- to short or long integers is not appropriate. A TIFF reader will generally
- need to store the field value as an uninterpreted byte sequence until it is
- fed to the JPEG decoder.
- Multibyte quantities within the tables follow the ISO JPEG convention of
- MSB-first storage, regardless of the byte ordering of the surrounding TIFF
- file.
- When the JPEGTables field is present, it shall contain a valid JPEG
- "abbreviated table specification" datastream. This datastream shall begin
- with SOI and end with EOI. It may contain zero or more JPEG "tables and
- miscellaneous" markers, namely:
-     DQT
-     DHT
-     DAC   (not to appear unless arithmetic coding is used)
-     DRI
-     APPn  (shall be ignored by TIFF readers)
-     COM   (shall be ignored by TIFF readers)
- Since JPEG defines the SOI marker to reset the DAC and DRI state, these two
- markers' values cannot be carried over into any image datastream, and thus
- they are effectively no-ops in the JPEGTables field. To avoid confusion,
- it is recommended that writers not place DAC or DRI markers in JPEGTables.
- However, readers must properly skip over them if they appear.
- When JPEGTables is present, readers shall load the table specifications
- contained in JPEGTables before processing image segment datastreams.
- Image segments may simply refer to these preloaded tables without defining
- them. An image segment can still define and use its own tables, subject to
- the restrictions below.
- An image segment may not redefine any table defined in JPEGTables. (This
- restriction is imposed to allow readers to process image segments in random
- order without having to reload JPEGTables between segments.) Therefore, use
- of JPEGTables divides the available table slots into two groups: "global"
- slots are defined in JPEGTables and may be used but not redefined by
- segments; "local" slots are available for local definition and use in each
- segment. To permit random access, a segment may not reference any local
- tables that it does not itself define.
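- The "global" versus "local" slot rule might be checked as sketched below.
- The code collects the table slots defined by DQT and DHT markers in a
- datastream, looking only at markers before the first SOS or EOI, and
- reports any slot that a segment redefines. It ignores fill bytes and
- tables placed before later scans, and the names are hypothetical.

```python
import struct

def defined_slots(stream: bytes) -> set:
    """Table slots defined by DQT/DHT markers before the first SOS or EOI."""
    if stream[:2] != b"\xff\xd8":
        raise ValueError("datastream must begin with SOI")
    slots, pos = set(), 2
    while pos + 2 <= len(stream):
        if stream[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        code = stream[pos + 1]
        if code in (0xD9, 0xDA):                 # EOI or SOS: stop scanning
            break
        if pos + 4 > len(stream):
            raise ValueError("truncated marker segment")
        (length,) = struct.unpack_from(">H", stream, pos + 2)   # MSB-first
        body = stream[pos + 4 : pos + 2 + length]
        i = 0
        if code == 0xDB:                         # DQT: Pq/Tq byte + 64 entries
            while i < len(body):
                precision, slot = body[i] >> 4, body[i] & 0x0F
                slots.add(("Q", slot))
                i += 1 + 64 * (2 if precision else 1)
        elif code == 0xC4:                       # DHT: Tc/Th, 16 counts, values
            while i < len(body):
                table_class, slot = body[i] >> 4, body[i] & 0x0F
                slots.add(("DC" if table_class == 0 else "AC", slot))
                i += 17 + sum(body[i + 1 : i + 17])
        pos += 2 + length
    return slots

def redefined_slots(jpeg_tables: bytes, segment: bytes) -> set:
    """Slots that the segment defines although JPEGTables already did."""
    return defined_slots(jpeg_tables) & defined_slots(segment)
```

- A conforming file yields an empty set for every segment. A complete
- checker would also verify that each slot a segment references is either
- global or defined in that same segment.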
- Special considerations for PlanarConfiguration 2
- ------------------------------------------------
- In PlanarConfiguration 2, each image segment contains data for only one
- color component. To avoid confusing the JPEG codec, we wish the segments
- to look like valid single-channel (i.e., grayscale) JPEG datastreams. This
- means that different rules must be used for the SOFn parameters.
- In PlanarConfiguration 2, the dimensions given in the SOFn of a subsampled
- component shall be scaled down by the sampling factors compared to the SOFn
- dimensions that would be used in PlanarConfiguration 1. This is necessary
- to match the actual number of samples stored in that segment, so that the
- JPEG codec doesn't complain about too much or too little data. In strip
- TIFF files the computed dimensions may need to be rounded up to the next
- integer; in tiled files, the restrictions on tile size make this case
- impossible.
- Furthermore, all SOFn sampling factors shall be given as 1. (This is
- merely to avoid confusion, since the sampling factors in a single-channel
- JPEG datastream have no real effect.)
- Any downsampling will need to happen externally to the JPEG codec, since
- JPEG sampling factors are defined with reference to the full-precision
- component. In PlanarConfiguration 2, the JPEG codec will be working on
- only one component at a time and thus will have no reference component to
- downsample against.
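- The scaled-down SOFn dimensions for a subsampled component can be computed
- with a ceiling division, as in this small sketch. The inputs are the SOFn
- dimensions that would apply in PlanarConfiguration 1 and the subsampling
- factors of the component; the function name is hypothetical.

```python
def planar2_sof_dimensions(width, height, h_subsampling, v_subsampling):
    """SOFn (width, height) for one component in PlanarConfiguration 2.

    Pass 1, 1 for a component that is not subsampled. Results are rounded
    up, which can matter in strip files.
    """
    return (-(-width // h_subsampling),      # ceiling division
            -(-height // v_subsampling))
```

- For instance, a chroma strip of a 101 by 16 image segment with
- YCbCrSubSampling = [2,2] would carry SOFn dimensions of 51 by 8.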
- Minimum requirements for TIFF/JPEG
- ----------------------------------
- ISO JPEG is a large and complex standard; most implementations support only
- a subset of it. Here we define a "core" subset of TIFF/JPEG which readers
- must support to claim TIFF/JPEG compatibility. For maximum
- cross-application compatibility, we recommend that writers confine
- themselves to this subset unless there is very good reason to do otherwise.
- Use the ISO baseline JPEG process: 8-bit data precision, Huffman coding,
- with no more than 2 DC and 2 AC Huffman tables. Note that this implies
- BitsPerSample = 8 for each component. We recommend deviating from baseline
- JPEG only if 12-bit data precision or lossless coding is required.
- Use no subsampling (all JPEG sampling factors = 1) for color spaces other
- than YCbCr. (This is, in fact, required with the TIFF 6.0 field
- definitions, but may not be so in future revisions.) For YCbCr, use one of
- the following choices:
-     YCbCrSubSampling field        JPEG sampling factors
-     ----------------------        ---------------------
-     1,1                           1h1v, 1h1v, 1h1v
-     2,1                           2h1v, 1h1v, 1h1v
-     2,2  (default value)          2h2v, 1h1v, 1h1v
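- The table above can be captured as a simple lookup. In this sketch each
- entry lists (horizontal, vertical) JPEG sampling factors for Y, Cb and Cr
- in that order; the names are hypothetical.

```python
# YCbCrSubSampling value -> JPEG sampling factors (h, v) for Y, Cb, Cr.
RECOMMENDED_SAMPLING = {
    (1, 1): [(1, 1), (1, 1), (1, 1)],
    (2, 1): [(2, 1), (1, 1), (1, 1)],
    (2, 2): [(2, 2), (1, 1), (1, 1)],
}

def jpeg_sampling_factors(ycbcr_subsampling=(2, 2)):
    """Sampling factors for a recommended YCbCrSubSampling value.

    The default argument mirrors the TIFF default of 2,2.
    """
    try:
        return RECOMMENDED_SAMPLING[tuple(ycbcr_subsampling)]
    except KeyError:
        raise ValueError("subsampling %r is outside the recommended set"
                         % (tuple(ycbcr_subsampling),))
```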
- We recommend that RGB source data be converted to YCbCr for best compression
- results. Other source data colorspaces should probably be left alone.
- Minimal readers need not support JPEG images with colorspaces other than
- YCbCr and grayscale (PhotometricInterpretation = 6 or 1).
- A minimal reader also need not support JPEG YCbCr images with nondefault
- values of YCbCrCoefficients or YCbCrPositioning, nor with values of
- ReferenceBlackWhite other than [0,255,128,255,128,255]. (These values
- correspond to the RGB<=>YCbCr conversion specified by JFIF, which is widely
- implemented in JPEG codecs.)
- Writers are reminded that a ReferenceBlackWhite field *must* be included
- when PhotometricInterpretation is YCbCr, because the default
- ReferenceBlackWhite values are inappropriate for YCbCr.
- If any subsampling is used, PlanarConfiguration=1 is preferred to avoid the
- possibly-confusing requirements of PlanarConfiguration=2. In any case,
- readers are not required to support PlanarConfiguration=2.
- If possible, use a single interleaved scan in each image segment. This is
- not legal JPEG if there are more than 4 SamplesPerPixel or if the sampling
- factors are such that more than 10 blocks would be needed per MCU; in that
- case, use a separate scan for each component. (The recommended color
- spaces and sampling factors will not run into that restriction, so a
- minimal reader need not support more than one scan per segment.)
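- Whether a single interleaved scan is possible can be tested as sketched
- here, using the two limits stated above (at most 4 components and at most
- 10 blocks per MCU). Sampling factors are given as (horizontal, vertical)
- pairs, one per component; the function name is hypothetical.

```python
def single_interleaved_scan_allowed(sampling) -> bool:
    """True if all components fit in one interleaved scan."""
    # Each component contributes h * v blocks to an interleaved MCU.
    blocks_per_mcu = sum(h * v for h, v in sampling)
    return len(sampling) <= 4 and blocks_per_mcu <= 10
```

- The recommended choice of 2h2v, 1h1v, 1h1v gives 6 blocks per MCU, well
- within the limit.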
- To claim TIFF/JPEG compatibility, readers shall support multiple-strip TIFF
- files and the optional JPEGTables field; it is not acceptable to read only
- single-datastream files. Support for tiled TIFF files is strongly
- recommended but not required.
- Other recommendations for implementors
- --------------------------------------
- The TIFF tag Compression=7 guarantees only that the compressed data is
- represented as ISO JPEG datastreams. Since JPEG is a large and evolving
- standard, readers should apply careful error checking to the JPEG markers
- to ensure that the compression process is within their capabilities. In
- particular, to avoid being confused by future extensions to the JPEG
- standard, it is important to abort if unknown marker codes are seen.
- The point of requiring that all image segments use the same JPEG process is
- to ensure that a reader need check only one segment to determine whether it
- can handle the image. For example, consider a TIFF reader that has access
- to fast but restricted JPEG hardware, as well as a slower, more general
- software implementation. It is desirable to check only one image segment
- to find out whether the fast hardware can be used. Thus, writers should
- try to ensure that all segments of an image look as much "alike" as
- possible: there should be no variation in scan layout, use of options such
- as DRI, etc. Ideally, segments will be processed identically except
- perhaps for using different local quantization or entropy-coding tables.
- Writers should avoid including "noise" JPEG markers (COM and APPn markers).
- Standard TIFF fields provide a better way to transport any non-image data.
- Some JPEG codecs may change behavior if they see an APPn marker they
- think they understand; since the TIFF spec requires these markers to be
- ignored, this behavior is undesirable.
- It is possible to convert an interchange-JPEG file (e.g., a JFIF file) to
- TIFF simply by dropping the interchange datastream into a single strip.
- (However, designers are reminded that the TIFF spec discourages huge
- strips; splitting the image is somewhat more work but may give better
- results.) Conversion from TIFF to interchange JPEG is more complex. A
- strip-based TIFF/JPEG file can be converted fairly easily if all strips use
- identical JPEG tables and no RSTn markers: just delete the overhead markers
- and insert RSTn markers between strips. Converting tiled images is harder,
- since the data will usually not be in the right order (unless the tiles are
- only one MCU high). This can still be done losslessly, but it will require
- undoing and redoing the entropy coding so that the DC coefficient
- differences can be updated.
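The strip-to-interchange case can be sketched as follows. This is only an outline under stated assumptions: the entropy-coded data of each strip has already been extracted (SOI, table, frame, and scan headers and the trailing EOI removed), all strips share identical tables, and none contains RSTn markers of its own. The function name is hypothetical. The caller is still responsible for writing a single header whose frame dimensions describe the whole image and whose DRI restart interval equals the number of MCUs per strip.

```python
def join_with_restart_markers(entropy_segments):
    """Concatenate per-strip entropy-coded data, separated by RSTn markers.

    entropy_segments: list of bytes objects, one per strip, in image order.
    """
    out = bytearray()
    for i, segment in enumerate(entropy_segments):
        if i > 0:
            # Restart markers cycle RST0..RST7 (0xFFD0..0xFFD7),
            # starting with RST0 after the first restart interval.
            out += bytes([0xFF, 0xD0 + ((i - 1) % 8)])
        out += segment
    return bytes(out)
```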
- There is no default value for JPEGTables: standard TIFF files must define all
- tables that they reference. For some closed systems in which many files will
- have identical tables, it might make sense to define a default JPEGTables
- value to avoid actually storing the tables. Better still, one could define a
- private field selecting one of N default JPEGTables settings, so as to allow
- for future expansion. Either of these must be regarded as a private
- extension that will render the files unreadable by other applications.
- References
- ----------
- [1] Wallace, Gregory K. "The JPEG Still Picture Compression Standard",
- Communications of the ACM, April 1991 (vol. 34 no. 4), pp. 30-44.
- This is the best short technical introduction to the JPEG algorithms.
- It is a good overview but does not provide sufficiently detailed
- information to write an implementation.
- [2] Pennebaker, William B. and Mitchell, Joan L. "JPEG Still Image Data
- Compression Standard", Van Nostrand Reinhold, 1993, ISBN 0-442-01272-1.
- 638pp.
- This textbook is by far the most complete exposition of JPEG in existence.
- It includes the full text of the ISO JPEG standards (DIS 10918-1 and draft
- DIS 10918-2). No would-be JPEG implementor should be without it.
- [3] ISO/IEC IS 10918-1, "Digital Compression and Coding of Continuous-tone
- Still Images, Part 1: Requirements and guidelines", February 1994.
- ISO/IEC DIS 10918-2, "Digital Compression and Coding of Continuous-tone
- Still Images, Part 2: Compliance testing", final approval expected 1994.
- These are the official standards documents. Note that the Pennebaker and
- Mitchell textbook is likely to be cheaper and more useful than the official
- standards.
- Changes to Section 21: YCbCr Images
- ===================================
- [This section of the Tech Note clarifies section 21 to make clear the
- interpretation of image dimensions in a subsampled image. Furthermore,
- the section is changed to allow the original image dimensions not to be
- multiples of the sampling factors. This change is necessary to support use
- of JPEG compression on odd-size images.]
- Add the following paragraphs to the Section 21 introduction (p. 89),
- just after the paragraph beginning "When a Class Y image is subsampled":
- In a subsampled image, it is understood that all TIFF image
- dimensions are measured in terms of the highest-resolution
- (luminance) component. In particular, ImageWidth, ImageLength,
- RowsPerStrip, TileWidth, TileLength, XResolution, and YResolution
- are measured in luminance samples.
- RowsPerStrip, TileWidth, and TileLength are constrained so that
- there are an integral number of samples of each component in a
- complete strip or tile. However, ImageWidth/ImageLength are not
- constrained. If an odd-size image is to be converted to subsampled
- format, the writer should pad the source data to a multiple of the
- sampling factors by replication of the last column and/or row, then
- downsample. The luminance width and length actually stored in the
- file will then be multiples of the sampling factors. Conversely,
- readers must ignore any extra data (outside the specified image
- dimensions) after upsampling.
- When PlanarConfiguration=2, each strip or tile covers the same
- image area despite subsampling; that is, the total number of strips
- or tiles in the image is the same for each component. Therefore
- strips or tiles of the subsampled components contain fewer samples
- than strips or tiles of the luminance component.
- If there are extra samples per pixel (see field ExtraSamples),
- these data channels have the same number of samples as the
- luminance component.
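The padding rule for odd-size images can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, and a component plane is represented as a plain list of rows rather than any particular image type.

```python
def padded_size(width, length, horiz, vert):
    """Round the image dimensions up to multiples of the sampling factors."""
    padded_width = -(-width // horiz) * horiz    # ceiling division
    padded_length = -(-length // vert) * vert
    return padded_width, padded_length


def pad_plane_by_replication(rows, horiz, vert):
    """Pad one full-resolution component plane before downsampling.

    rows: non-empty list of equal-length lists of samples.
    The last column and then the last row are replicated as needed.
    """
    length, width = len(rows), len(rows[0])
    padded_width, padded_length = padded_size(width, length, horiz, vert)
    padded = [row + [row[-1]] * (padded_width - width) for row in rows]
    padded += [list(padded[-1]) for _ in range(padded_length - length)]
    return padded
```

With 2x2 subsampling, a 5x3 image is padded to 6x4 luminance samples, so each chroma component stores 3x2 samples. A reader performs the inverse step: after upsampling it discards everything outside ImageWidth by ImageLength.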
- Rewrite the YCbCrSubSampling field description (pp. 91-92) as follows
- (largely to eliminate possibly-misleading references to
- ImageWidth/ImageLength of the subsampled components):
- (first paragraph unchanged)
- The two elements of this field are defined as follows:
- Short 0: ChromaSubsampleHoriz:
- 1 = there are equal numbers of luma and chroma samples horizontally.
- 2 = there are twice as many luma samples as chroma samples
- horizontally.
- 4 = there are four times as many luma samples as chroma samples
- horizontally.
- Short 1: ChromaSubsampleVert:
- 1 = there are equal numbers of luma and chroma samples vertically.
- 2 = there are twice as many luma samples as chroma samples
- vertically.
- 4 = there are four times as many luma samples as chroma samples
- vertically.
- ChromaSubsampleVert shall always be less than or equal to
- ChromaSubsampleHoriz. Note that Cb and Cr have the same sampling
- ratios.
- In a strip-based TIFF file, RowsPerStrip is required to be an integer
- multiple of ChromaSubsampleVert (unless RowsPerStrip >=
- ImageLength, in which case its exact value is unimportant).
- If ImageWidth and ImageLength are not multiples of
- ChromaSubsampleHoriz and ChromaSubsampleVert respectively, then the
- source data shall be padded to the next integer multiple of these
- values before downsampling.
- In a tiled TIFF file, TileWidth must be an integer multiple of
- ChromaSubsampleHoriz and TileLength must be an integer multiple of
- ChromaSubsampleVert. Padding will occur to tile boundaries.
- The default values of this field are [ 2,2 ]. Thus, YCbCr data is
- downsampled by default!
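The constraints in this field description lend themselves to a simple consistency check. The sketch below is an assumption-laden illustration, not a required algorithm: the function name, the argument layout, and the wording of the messages are all hypothetical.

```python
def check_subsampling_layout(image_length, subsampling=(2, 2),
                             rows_per_strip=None, tile_width=None,
                             tile_length=None):
    """Return a list of problems found; an empty list means none.

    subsampling is (ChromaSubsampleHoriz, ChromaSubsampleVert) and
    defaults to the field's default value of [2, 2].
    """
    horiz, vert = subsampling
    if horiz not in (1, 2, 4) or vert not in (1, 2, 4):
        return ["sampling factors must each be 1, 2, or 4"]
    problems = []
    if vert > horiz:
        problems.append("ChromaSubsampleVert exceeds ChromaSubsampleHoriz")
    if rows_per_strip is not None:
        # The exact value is unimportant once one strip covers the image.
        if rows_per_strip < image_length and rows_per_strip % vert != 0:
            problems.append("RowsPerStrip is not a multiple of "
                            "ChromaSubsampleVert")
    if tile_width is not None and tile_width % horiz != 0:
        problems.append("TileWidth is not a multiple of "
                        "ChromaSubsampleHoriz")
    if tile_length is not None and tile_length % vert != 0:
        problems.append("TileLength is not a multiple of "
                        "ChromaSubsampleVert")
    return problems
```

Note that the default argument mirrors the field default: a file that omits YCbCrSubSampling is checked as 2x2 subsampled, which is exactly the case the closing remark warns about.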
- </pre>
|