DocScape: 100% automatic print publishing to satisfy even the most sophisticated standards
DocScape: database publishing at its best

Integration of different data sources

Data do not require preprocessing for publishing; DocScape takes care of this job automatically.

DocScape Publisher, the main component of the DocScape system, takes a structured XML dataset file as input (split modularly across several interconnected XML files if required) and generates a PDF file from it.

The DocScape Data Extractor component integrates a wide range of data sources into this dataset file. It is configured through an XML formula for the dataset file in which, alongside the data structure, the data sources and additional aggregation and structuring rules are defined through special DocScape annotations. Because the various data sources are linked through unique key criteria, the resulting document presents an integrated view of all of them.
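To make the idea of key-based linking concrete, here is a minimal Python sketch. It is not DocScape's implementation or its XML formula syntax; the function name `build_dataset`, the element names and the sample records are assumptions chosen purely for illustration. It merges records from two sources that share a product code and writes them into one XML structure.

```python
import xml.etree.ElementTree as ET


def build_dataset(erp_rows, pim_rows, key="product_code"):
    """Merge rows from two sources into one XML tree, one element per key value.

    Hypothetical helper: the element and field names are illustrative only.
    """
    merged = {}
    for source in (erp_rows, pim_rows):
        for row in source:
            # Rows with the same key value are folded into one record.
            merged.setdefault(row[key], {}).update(row)

    root = ET.Element("dataset")
    for code in sorted(merged):
        product = ET.SubElement(root, "product", {"code": code})
        for field, value in merged[code].items():
            if field != key:
                ET.SubElement(product, field).text = str(value)
    return root


# Sample data standing in for an ERP export and a PIM export.
erp = [
    {"product_code": "A100", "price": 12.5},
    {"product_code": "A200", "price": 8.0},
]
pim = [{"product_code": "A100", "description": "Hex bolt M8"}]

root = build_dataset(erp, pim)
print(ET.tostring(root, encoding="unicode"))
```

Product A100 ends up with both its price and its description, while A200 carries only the fields its single source provided.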

A number of scenarios:
  • Generating data via a leading ERP system. Grouping into product groups, product families and catalog chapters through predefined grouping keys. Linked via the product code, product description texts (RTF) and figures (image format) are integrated as files.
  • Generating data via a leading PIM system. Editorial contents relating to product families (introduction, project illustrations as eye-catchers) are stored in the CMS and linked through the family key.
  • Generating data via a leading CMS system. Contents up to product group level (incl. text descriptions and product illustrations) are administered and structured, and the sequence of contents is defined for the catalog. Product data, technical features and price information are taken directly from the inventory management system and linked through the product code.
  • Linking to output interface of ERP system (e.g. for offers). Output files from the ERP system (e.g. RDI) are converted into XML, linked through the product code, mixed with data from the PIM system and published, e.g. as illustrated offers, under application of the full catalog regulations.

Access to all data sources, conversion into XML (e.g. from RDI, RTF, XLS) if required, set-up and standardization of XML structures, and the consolidation of all contents in one joint dataset structure are performed 100% automatically by DocScape Data Extractor.

For the implementation of DocScape Data Extractor, QuinScape relies on the standard technologies Java and XSLT, which support long-term portability and maintainability.

Automatic data compression and aggregation

In addition to the definition of data sources and structuring criteria for data extraction from a variety of sources, the XML formula allows the definition of compression and aggregation rules for the dataset file, which enables the grouping and summarizing of data according to a selection of criteria.

Possible applications:
  • Grouping of consecutive products with identical product photo into product groups.
  • Collating, linking and summarizing of accessory products.
  • Generation of symbol lists.
  • Compilation of detailed images for generated diagrams (such as exploded-view drawings).
  • Inclusion/exclusion rule: should a product feature be displayed generally for the whole product group and only exceptions be listed, or should it be featured at product level?
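The first application in this list, grouping consecutive products that share a product photo, can be sketched in a few lines of Python. This is only an illustration under assumed data: the record layout, the field names `code` and `photo`, and the function `group_by_photo` are hypothetical and do not reflect how DocScape expresses such a rule.

```python
from itertools import groupby


def group_by_photo(products):
    """Collapse runs of consecutive products with the same photo into groups.

    Only neighbouring products are merged; a photo that reappears later in
    the sequence starts a new group.
    """
    return [
        {"photo": photo, "products": [p["code"] for p in run]}
        for photo, run in groupby(products, key=lambda p: p["photo"])
    ]


# Sample product sequence in catalog order (illustrative values).
products = [
    {"code": "A100", "photo": "bolt.jpg"},
    {"code": "A101", "photo": "bolt.jpg"},
    {"code": "B200", "photo": "nut.jpg"},
    {"code": "A102", "photo": "bolt.jpg"},
]

groups = group_by_photo(products)
for group in groups:
    print(group)
```

Because the grouping respects the catalog sequence, A102 is not pulled forward into the first group even though it uses the same photo.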
By selecting the relevant structuring, compression and aggregation rules, very different document structures may be generated from the same dataset, such as
  • Main catalogs, summarized catalogs, price lists;
  • Specialized category catalogs (featuring a carefully selected range of information, such as detailed images, exploded-view drawings, generated diagrams, feature tables);
  • Value-added offers for premium customers;
  • Personalized catalogs/brochures: emphasized representations of products which - based on the customer profile - may be of interest to a specific customer.
Thanks to formula-based data modeling, the rule-based translation of these aggregation and compression tasks is no problem for DocScape Data Extractor.
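As one illustration of such a rule, the inclusion/exclusion decision named among the possible applications could look like the following Python sketch. The threshold `min_share`, the function `place_feature` and the sample values are assumptions made for this example; the source does not state which criterion DocScape applies.

```python
from collections import Counter


def place_feature(values_by_product, min_share=0.75):
    """Decide whether a feature is shown at group level or at product level.

    Returns ("group", common_value, exceptions) when the most frequent value
    covers at least min_share of the products, otherwise
    ("product", None, all_values).
    """
    if not values_by_product:
        return "product", None, {}

    counts = Counter(values_by_product.values())
    common_value, hits = counts.most_common(1)[0]

    if hits / len(values_by_product) >= min_share:
        # Show the common value once and list only the deviating products.
        exceptions = {
            code: value
            for code, value in values_by_product.items()
            if value != common_value
        }
        return "group", common_value, exceptions
    return "product", None, dict(values_by_product)


material = {"A100": "steel", "A101": "steel", "A102": "steel", "A103": "brass"}
length = {"A100": 10, "A101": 20, "A102": 30, "A103": 40}

print(place_feature(material))
print(place_feature(length))
```

Here the material would be stated once for the whole group with A103 listed as an exception, while the length, which differs for every product, stays at product level.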

Other compression rules are subject to the available space or other layout-dependent criteria (symmetry of double spreads, chapter structuring, spread optimization) and cannot be applied before the layout is generated:
  • Display of product features by individual products or compilation in a table at product group level.
  • Selection of product images from the total volume of available images (with different sizes/shapes).
  • Omission of less significant product features to save space.
  • Selection of space-saving or more complex layouts for premium products (in cases of multi-level premium classification) to optimize space utilization.
  • Aggregation of product texts.
  • Table structuring.

If a compression or aggregation rule may depend on the layout, it should be defined in the DocScape Publisher regulations rather than in the DocScape Data Extractor regulations. Implementing compression and aggregation rules in DocScape Data Extractor is much more efficient, but it does not allow any interaction with the layout engine.
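The difference is that a layout-dependent rule needs a measurement that only exists once the layout is being built, such as the space left on the page. The Python sketch below illustrates this for the omission of less significant product features. It is a simplified stand-in, not the DocScape Publisher rule language: the function `fit_features`, the priority values and the line counts are all assumptions.

```python
def fit_features(features, available_lines):
    """Drop the least significant features until the rest fit the space.

    features: list of (name, priority, lines) tuples, where a higher
    priority means a more significant feature and lines is the space the
    feature would occupy.
    Returns the names of the kept features in their original order.
    """
    kept = list(features)
    while kept and sum(lines for _, _, lines in kept) > available_lines:
        # Remove the feature with the lowest priority and try again.
        least_significant = min(kept, key=lambda feature: feature[1])
        kept.remove(least_significant)
    return [name for name, _, _ in kept]


# Illustrative feature list: (name, priority, lines needed).
features = [("Material", 3, 1), ("Weight", 1, 1), ("Thread", 2, 2)]

print(fit_features(features, available_lines=3))
print(fit_features(features, available_lines=1))
```

The same product yields different feature lists depending on `available_lines`, which is why a rule of this kind cannot be evaluated during data extraction.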


Inclusion of external documents

Not all contents of a document have to be generated 100% data-based. Manually generated contents may be integrated in several ways:
  • Generation of manually designed pages with a DTP program, storage as a PDF file, integration through DocScape.
    Pagination, column titles etc. may be added by DocScape if required, as well as the accurate positioning on left/right pages. Multi-page documents are integrated into a generated document as a sequence of pages. If different languages or other versions are included in the DocScape-generated document as PDF levels, external documents featuring several levels may also be integrated accurately into the levels of the generated document.
  • Generation of page sections (such as advertisements, eye-catchers or other contents featuring components which are not available fully structured in the database) with a DTP program, storage as cropped PDF, integration through DocScape. 
    At every level of the document structure, manually designed content may be added, or may replace content that would otherwise be generated from data. DocScape’s rule-based approach takes care of its positioning on the page. An external PDF document whose contents do not fill a full page may be split into several “PDF pages”, which are positioned individually and distributed across the actual document pages, guaranteeing a thoroughly optimized layout.

 

Integration of structured text contents

If no content management system is integrated, the recording of text contents for print publishing must be planned carefully. On the one hand, certain formats such as emphasis, headlines, lists and, if required, tables must be supported. On the other hand, media-independent recording is desirable so that the same text can be reused in different font sizes, text widths and layout designs. Not all file input interfaces allow the recording of structured texts in text input fields, but most CMS systems provide this option.
Several options are available for use with DocScape:
  • Integration as HTML
    Editors that support formatted recording of text contents in HTML are available for integration into web-based data administration interfaces. One advantage is that almost any content from other applications may be transferred via the clipboard. For the sake of media-independence, this kind of text should not carry certain HTML attributes (switching of font or color, defined width of table columns). DocScape’s filter components remove such formatting and replace it with media-independent alternatives. HTML is converted into XML during the integration with DocScape.
  • Integration as RTF
    RTF is a standardized text format for the recording of text contents via commercially available text processing programs (and their integration as individual files). If integrated with DocScape, RTF is converted into XML. From the point of view of media-independence, there is a wide range of RTF features (font or color switching, table features) which must be filtered out and replaced by media-independent alternatives. The DocScape component which converts RTF into XML includes a configurable filter feature which fulfils the following tasks:
    • Conversion from RTF into XML without referring to Office software.
    • Filtering out of undesired formats (font type and color changes, paragraph formats).
    • Conversion of visual structures (such as tables with x columns, or changing to larger, bolder fonts) into logical structures (such as load tables and headings).
    • Analysis of meta-information (change tracking).
  • Integration as DocScape XML
    For structured texts, DocScape defines its own XML dialect, which serves as conversion target for all other text formats and may be recorded by DocScape’s DocEdit component, if required. DocEdit is a browser-based text editor which is configured via an XML formula and provides the following functions: 
    • WYSIWYG editor for structured XML texts.
    • Integration into any web data administration mask.
    • Look and feel are familiar from Office products.
    • Individually adaptable.
    • Specification of admissible text structures through XML formula: at any point of the text, only those structural elements are provided which are admissible at that point.
    • Templates, text/table components.
    • Transferring contents from Office software via the clipboard is possible, while all structures which are not admissible at that specific point will be filtered out and converted.
    • Data-based generation of content parameters.
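The HTML filtering described under "Integration as HTML" can be sketched with Python's standard library. This is not DocScape's filter component: the class name `MediaNeutralFilter` and the lists of dropped tags and attributes are assumptions, and the sketch only removes media-specific formatting, whereas the real component is described as also substituting media-independent alternatives.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Assumed lists of purely visual markup; a real filter would be configurable.
DROPPED_TAGS = {"font", "span"}
DROPPED_ATTRS = {"style", "color", "face", "width", "bgcolor"}
VOID_TAGS = {"br", "hr", "img"}


class MediaNeutralFilter(HTMLParser):
    """Re-emit HTML as XML-style markup without media-specific formatting."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in DROPPED_TAGS:
            return  # drop the wrapper, keep its text content
        kept = "".join(
            f" {name}={quoteattr(value or '')}"
            for name, value in attrs
            if name not in DROPPED_ATTRS
        )
        # Void elements are written self-closed so the output stays well-formed.
        closing = "/>" if tag in VOID_TAGS else ">"
        self.out.append(f"<{tag}{kept}{closing}")

    def handle_endtag(self, tag):
        if tag not in DROPPED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data))


def filter_html(fragment):
    """Return the fragment with visual tags and attributes stripped."""
    parser = MediaNeutralFilter()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)


sample = '<p style="color:red">Hello <font face="Arial">world</font> &amp; more<br></p>'
filtered = filter_html(sample)
print(filtered)
```

The sketch assumes reasonably well-nested input; markup pasted from Office applications is often messier and would need additional repair before it can be treated as XML.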

Image processing

  • Processing of complete file trees.
  • Conversion into PDF.
  • Extraction of cropping and other paths.
  • Generation of drop and outline shadows.
  • Make functionality.
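The source does not explain "Make functionality" further. Assuming it means that, as with the `make` build tool, only images whose converted output is missing or out of date are reprocessed, the check over a complete file tree could be sketched in Python as follows. The function `stale_images`, the directory layout and the `.pdf` target suffix are assumptions for illustration.

```python
from pathlib import Path


def stale_images(source_root, target_root, target_suffix=".pdf"):
    """Yield (source, target) pairs that need (re)conversion.

    A target is considered stale when it does not exist or is older than
    its source file, mirroring the rule used by make-style tools.
    """
    source_root, target_root = Path(source_root), Path(target_root)
    for source in sorted(source_root.rglob("*")):
        if not source.is_file():
            continue
        # Mirror the source tree below target_root, swapping the suffix.
        relative = source.relative_to(source_root)
        target = (target_root / relative).with_suffix(target_suffix)
        if not target.exists() or target.stat().st_mtime < source.stat().st_mtime:
            yield source, target
```

A caller would loop over `stale_images("images", "converted")` and run the actual conversion only for the pairs it yields, so unchanged images are skipped on repeated runs.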


Data do not require processing in the systems responsible for print publishing: DocScape takes care of this task automatically, from data compression to automatic image processing.  
Phone:
+49 231 / 533 831 0
info@docscape.de