Documents for Word .NET Edition
    Product Architecture

    Packaging

    GcWord is a collection of .NET Standard 2.0 class libraries written in C# that provides an API for creating DOCX/DOCM MS Word files from scratch. The library can also load, analyze and modify existing Word documents.

    GcWord works on all platforms supported by .NET Standard, including .NET Core, ASP.NET Core, .NET Framework and so on.

    GcWord and supporting packages are available on nuget.org:

    Package                               Description
    GrapeCity.Documents.Word              Main package, which automatically pulls in the other required infrastructure packages.
    GrapeCity.Documents.Layout            Enables saving Word documents as PDF.
    GrapeCity.Documents.Imaging           Provides image handling.
    GrapeCity.Documents.Common            An infrastructure package used by other packages.
    GrapeCity.Documents.Common.Windows    Provides support for font linking specified in the Windows registry. On a non-Windows system this library can be referenced, but will do nothing.
    GrapeCity.Documents.DX.Windows        Provides access to the native graphics APIs when running on a Windows system.

    Document Overview

    A Word document in GcWord is represented by an instance of the GrapeCity.Documents.Word.GcWordDocument class.

    The object model of the GcWordDocument class corresponds to the structure of a Word document, with the following properties corresponding to major parts of the document:

    Property            Description
    Body                The main document story
    Styles              A collection of document styles to format document content
    ListTemplates       A collection of list templates to format list content in the document
    Settings            Provides options to control view, compatibility and other settings
    Theme               Provides the different formatting options available to a document through a theme
    CustomXMLParts      Provides the collection of CustomXMLPart objects
    GlossaryDocument    Provides the supplementary document storage which stores the content for future insertion
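
    As a minimal sketch of how these parts are reached from code, the following creates an empty document, adds one paragraph to the main body, and saves the result. The class name, method name and file path are hypothetical, and the calls assume the GrapeCity.Documents.Word package is referenced; check the signatures against the GcWord API Reference before relying on them.

```csharp
using GrapeCity.Documents.Word;

public static class DocumentOverviewSketch
{
    // Hypothetical helper: builds a one-paragraph document and saves it to 'path'.
    public static GcWordDocument CreateAndSave(string path)
    {
        // The document object is the root of the object model.
        var doc = new GcWordDocument();

        // Body is the main document story. Content is added through a range,
        // here the range of the first section of the main body.
        doc.Body.Sections.First.GetRange().Paragraphs.Add("Hello, World!");

        // Styles, ListTemplates, Settings and the other parts listed above
        // are reached through properties of the same 'doc' object.
        doc.Save(path);
        return doc;
    }
}
```

    A call such as DocumentOverviewSketch.CreateAndSave("overview.docx") would be expected to produce a DOCX file containing the single paragraph.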

    Body

    Body is the place where the content elements (representing the actual content of a document) are stored. GcWordDocument.Body represents the main content of the document, but other parts of the document (such as headers/footers, comments, footnotes/endnotes) also have bodies to store their content. The specific body type is indicated by the GrapeCity.Documents.Word.BodyType enumeration, which has the following members:

    Member                           Description
    Main                             Body of main document part
    Header                           Body of section header
    Footer                           Body of section footer
    Comment                          Body of comment
    BuildingBlock                    Body of building block
    Footnote                         Body of footnote
    FootnoteSeparator                Body of footnote separator
    FootnoteContinuationSeparator    Body of footnote continuation separator
    FootnoteContinuationNotice       Body of footnote continuation notice
    Endnote                          Body of endnote
    EndnoteSeparator                 Body of endnote separator
    EndnoteContinuationSeparator     Body of endnote continuation separator
    EndnoteContinuationNotice        Body of endnote continuation notice

    Unlike other body types, the main body has Sections as the top level content elements. It also contains comments, footnotes and endnotes collections. There are three types of content elements that can be stored in a body:

    Content Element Type Description Content Elements
    Block elements Top-level elements
    Inline elements Elements that must be placed inside other elements
    • Runs
    • Texts
    • Pictures
    • Simple fields
    • Hyperlinks
    • Footnotes
    • Endnotes
    Reference elements Elements that do not have their own content in the body (except for complex fields, see Complex Fields) but are represented by start/end markers.
    • Bookmarks
    • Comments
    • Complex fields

    The following sections explain how to access and work with various content elements of a body.

    Range

    A range is a sequence of content elements in a body. The body itself is a kind of range that holds all the content elements. In GcWord, the Range class is the main feature providing access to the various content elements in a document.

    All content elements have the GetRange() method, which gives access to collections of elements of specific types inside the content element's range: the Range object has properties that return collections of the specific types of objects included in the range. These collections let you add or insert elements using the Add() and Insert() methods.

    Please note that adding or inserting always occurs at one or both (e.g. when replacing a range) of a range's boundaries. It is not possible to insert something in the middle of a range without first creating a range with a boundary at that position.

    A range provides the following two overloads to get new ranges based on it:

    Method                                               Description
    GetRange(ContentObject first, ContentObject last)    Gets a range that extends from the 'first' content object to the 'last'
    GetRange(Marker start, Marker end)                   Gets a range providing fine-grained control over the range's bounds, e.g. GetRange(first.End, last.Start). For more information, see GcWord API Reference.

    To clear all content in a range, use the Range.Clear() method. Range, being a collection of ContentObject, lets you enumerate the content elements included in it.
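
    The range operations described above can be sketched roughly as follows. The class and method names are hypothetical; the GetRange(), GetRange(first, last), Add() and Clear() members are the ones named in this topic, but their exact signatures and return types are assumptions to verify against the GcWord API Reference.

```csharp
using GrapeCity.Documents.Word;

public static class RangeSketch
{
    // Hypothetical helper: returns how many content objects the sub-range enumerated.
    public static int Run()
    {
        var doc = new GcWordDocument();

        // The range of the first section exposes typed collections such as Paragraphs.
        var sectionRange = doc.Body.Sections.First.GetRange();
        Paragraph first = sectionRange.Paragraphs.Add("First paragraph.");
        Paragraph second = sectionRange.Paragraphs.Add("Second paragraph.");
        sectionRange.Paragraphs.Add("Third paragraph.");

        // Adding happens at a range boundary: this run is added at the end
        // of the second paragraph's own range.
        second.GetRange().Runs.Add(" Appended run.");

        // A new range extending from 'first' to 'second'.
        var sub = sectionRange.GetRange(first, second);

        // A range is a collection of ContentObject, so it can be enumerated.
        int count = 0;
        foreach (ContentObject element in sub)
            count++;

        // Remove all content covered by the sub-range.
        sub.Clear();
        return count;
    }
}
```

    To insert in the middle of existing content, the same pattern applies: first obtain a range whose boundary lies at the desired position, then add to that range.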

    ContentObject

    Block and inline elements are derived from the ContentObject class, which provides access to the start and end positions of an element in a document. It also lets you get the parent content element and enumerate the element's children.

    In addition, all content objects have the Next and Previous properties, which let you enumerate objects of the same content type through the whole body.

    The Delete() method of the ContentObject class removes the element itself and all its inner content from the body.

    ContentRange

    Reference elements (bookmarks, comments, and complex fields) differ slightly from simple ContentObject elements. Elements of this kind do not have a parent content element, since they can start and end anywhere; for example, one can start in one section and end in another. Instead, a reference element provides a pair of ContentObject instances of type ContentMark that define its start and end. A ContentMark has an Owner property that points to the ContentRange element. Removing a ContentMark from the body also removes its owner element. The Delete() method on a ContentRange usually removes only its ContentMarks. Complex fields are an exception: their actual internal content is also deleted.

    Complex Fields

    Despite the fact that the complex field inherits from ContentRange, it actually is a combination of ContentRange and ContentObject. Bounds of a complex field are defined by special field characters (see the FieldChar class and the associated enum that defines the type of the field character as Begin, Separator or End values). The complex field can contain two ranges, code range and result range, separated by a Separator field character.

    The code range usually contains one or several codes (see the FieldCode class) that in turn contain instructions on how to calculate the field's result. The result range contains the cached result of those instructions. In the current version, GcWord does not yet calculate instructions, so it does not update the result.

    As mentioned above, unlike other ContentRange elements, the Delete() method on a complex field removes not only the field characters from the body but the field codes and the result too.

    Section

    Sections can only be present in the main body, and any document must have at least one section.

    Sections let you change page formatting for parts of the document; the PageSetup property and the headers and footers collections of a section provide the means to do that. Each section can have its own headers, footers and page formatting.

    Headers and footers display on each page of the section and they have their own bodies to store their content. There are several types of headers or footers in a section (see HeaderFooterType enum) and each header or footer can be linked to the same type from a previous section, so you do not have to create identical headers or footers for each section.
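
    A rough sketch of working with a section's page setup and its header and footer bodies is shown below. The class and method names are hypothetical. The Margins members, the assumption that margin values are in points, and the HeaderFooterType.Primary member are not taken from this topic and should be confirmed in the GcWord API Reference.

```csharp
using GrapeCity.Documents.Word;

public static class SectionSketch
{
    // Hypothetical helper: builds a document whose first section has
    // custom margins plus a header and a footer.
    public static GcWordDocument Build()
    {
        var doc = new GcWordDocument();

        // Every document has at least one section in its main body.
        Section section = doc.Body.Sections.First;

        // Page formatting for this section (values assumed to be in points).
        section.PageSetup.Margins.Left = 72;
        section.PageSetup.Margins.Right = 72;

        // Headers and footers have their own bodies that store their content.
        section.Headers[HeaderFooterType.Primary].Body.Paragraphs.Add("Header text");
        section.Footers[HeaderFooterType.Primary].Body.Paragraphs.Add("Footer text");

        // Content of the section itself goes into the main body.
        section.GetRange().Paragraphs.Add("Body text of the first section.");
        return doc;
    }
}
```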

    Run

    A run is a contiguous fragment of body content with uniform formatting, which makes it the primary means of changing character formatting. It is also a container for all other inline elements (excluding simple fields and hyperlinks).
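
    As an illustration of runs carrying character formatting, the sketch below builds one paragraph out of three differently formatted runs. The class and method names are hypothetical, and the Font.Bold, Font.Italic and Font.Size members are assumptions to check against the GcWord API Reference.

```csharp
using GrapeCity.Documents.Word;

public static class RunSketch
{
    // Hypothetical helper: returns a document with one paragraph made of three runs.
    public static GcWordDocument Build()
    {
        var doc = new GcWordDocument();

        // The text passed to Add() is assumed to become the paragraph's first run.
        Paragraph p = doc.Body.Sections.First.GetRange().Paragraphs.Add("Plain text, ");

        // Each additional run can carry its own character formatting.
        Run bold = p.GetRange().Runs.Add("bold text, ");
        bold.Font.Bold = true;

        Run large = p.GetRange().Runs.Add("and larger italic text.");
        large.Font.Italic = true;
        large.Font.Size = 16; // font size, assumed to be in points

        return doc;
    }
}
```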

    Nesting elements

    The top elements in the main body are the sections. For other body types, the top elements can be paragraphs, tables and content marks (see ContentRange).

    Usually elements with the same type cannot be nested (for example, a Run cannot be nested within another Run). Only SimpleField and Hyperlink can be nested. Also, a cell in a table can contain another table within its own cells.

    Styles

    Styles are the main means of applying formatting to a document's content. GcWord provides 375 built-in styles. There are different style types (see the StyleType enumeration), and each type of style can be applied only to the corresponding content type. You can get any built-in style using the BuiltInStyleId enumeration.

    The StyleCollection class has default styles which can be fetched or set using its GetDefaultStyle(StyleType) or SetDefaultStyle(StyleType, Style) methods. These styles are applied to content that does not have an explicitly specified style. StyleCollection provides the DefaultFont and DefaultParagraphFormat properties which are used by default for the default styles.

    Some styles are linked. A linked style is a pairing of a paragraph style and a character style, which a user interface uses to allow the same set of formatting properties to be applied to both paragraphs and runs. For example, if you want to apply the Heading 1 paragraph style to a run, you can apply it using Document.Styles[BuiltInStyleId.Heading1].LinkStyle.
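
    The sketch below shows one way a built-in paragraph style and its linked character style might be applied. The class and method names are hypothetical; the Paragraphs.Add(text, style) overload and the settable Run.Style property are assumptions, while Styles[BuiltInStyleId.Heading1].LinkStyle comes from the text above.

```csharp
using GrapeCity.Documents.Word;

public static class StylesSketch
{
    // Hypothetical helper: applies Heading 1 to a paragraph and its linked style to a run.
    public static GcWordDocument Build()
    {
        var doc = new GcWordDocument();
        var pars = doc.Body.Sections.First.GetRange().Paragraphs;

        // A paragraph style applies to paragraph content.
        pars.Add("Chapter title", doc.Styles[BuiltInStyleId.Heading1]);

        // A run needs a character style, so the linked style of
        // Heading 1 is used instead of the paragraph style itself.
        Paragraph p = pars.Add("Body text with ");
        Run run = p.GetRange().Runs.Add("a heading-styled run.");
        run.Style = doc.Styles[BuiltInStyleId.Heading1].LinkStyle;

        return doc;
    }
}
```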

    Formatting inheritance

    GcWord lets you get the actual formatting values of elements. It takes into account formatting inherited from the default document formatting, base style formatting, applied style formatting and parent content formatting, as well as the direct formatting of the element.

    ListTemplates

    GcWord provides 21 built-in list templates to create lists in the document. The formatting of these templates is the same as in Microsoft Word built-in list templates. There is no "list" class in GcWord. To create a list you need to set ListFormat.Template and ListFormat.LevelNumber (for multilevel lists) properties on each paragraph that should be in the list.  
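
    A minimal sketch of building a two-level numbered list this way follows. The class, method and template names are hypothetical. The ListTemplates.Add(BuiltInListTemplateId, name) call, the NumberArabicDot member and the zero-based level numbering are assumptions to verify against the GcWord API Reference; ListFormat.Template and ListFormat.LevelNumber are the properties named above.

```csharp
using GrapeCity.Documents.Word;

public static class ListSketch
{
    // Hypothetical helper: returns a document containing a two-level numbered list.
    public static GcWordDocument Build()
    {
        var doc = new GcWordDocument();
        var pars = doc.Body.Sections.First.GetRange().Paragraphs;

        // Create a list template in this document from a built-in one.
        ListTemplate template =
            doc.ListTemplates.Add(BuiltInListTemplateId.NumberArabicDot, "sketchList");

        // There is no list object: each paragraph is marked as a list item.
        void AddItem(string text, int level)
        {
            Paragraph p = pars.Add(text);
            p.ListFormat.Template = template;
            p.ListFormat.LevelNumber = level; // assumed zero-based
        }

        AddItem("First item", 0);
        AddItem("Nested item", 1);
        AddItem("Second item", 0);
        return doc;
    }
}
```

    Paragraphs that share the same template are expected to be numbered as one list, so the list continues until a paragraph without that template is reached.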

    Settings

    The Settings class lets you set properties that apply to the whole document, add custom document properties, control document variables, detect and remove document macros, and change view options.