We are pleased to announce the release of GrapeCity Documents v3.2. This release offers significant document editing and text extraction features.
The main highlights of this release are:
In addition to searching and extracting lines of text, GcPdf now allows extracting entire text paragraphs from PDF documents.
Highlights of this new feature include:
Text search now works across line breaks. Text split between separate PDF text rendering operators (but appearing in the same paragraph) can also be searched and extracted. Because a match can span more than one line, the FindText() method now returns an array of Quadrilaterals.
In many PDF documents, text renders multiple times in the same location, but visually it does not appear repeated in the PDF. Previously, extracting such text returned these redundant lines in the search results. With the latest release, search results will return only unique occurrences of the lines that are logically duplicated in the file.
Tables with multi-line text in cells are sometimes rendered as single lines across all cells in a table row.
So a table row may look like this:
But a PDF operator renders it as:
Column 1 .. Column 2,
Row 1 .. Row 1
With GcPdf v3.2, "Column 1, Row 1" is recognized correctly as a paragraph.
A new member is added to ITextMap: the Paragraphs property. It returns a collection of ITextParagraph objects associated with this text map.
The following code shows how to extract text paragraphs from a page of a PDF document by iterating over the Paragraphs collection of the page's text map.
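using System;
using System.IO;
using GrapeCity.Documents.Pdf;

// Print all paragraphs found on a page:
var doc = new GcPdfDocument();
using (var fs = File.OpenRead("Wetlands.pdf"))
{
    // Load the document from the open file stream.
    doc.Load(fs);
    // Get the text map of the first page (assume that the document has at least one page).
    var map = doc.Pages[0].GetTextMap();
    foreach (var par in map.Paragraphs)
        Console.WriteLine(par.GetText());
}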
When a print supplier does not have access to fonts used in the PDF, the fonts must be removed before printing. GcPdf v3.2 includes page.Graphics.DrawTextAsPath. This Boolean property outlines the text and removes the fonts. The resulting PDF will look exactly like the original but with the glyphs rendered as graphics paths. This property can be used to manipulate the paths or to make it impossible to copy or search the text.
Visit Demo
Searching long documents for specific search terms or patterns is made easier with GrapeCity's JavaScript PDF Viewer. It displays the total number of search results.
Here is an example:
The words you are looking for may not be adjacent in a document. The proximity search in GcPdfViewer helps find two or more words that aren't next to each other with the new AROUND(N) operator. This support is consistent with Google's proximity search.
For example, a search for information such as 'graduation' at 'Duke University' in the year '2015' becomes: graduation AROUND(3) Duke University AROUND(3) 2015. This finds text in which these terms appear no more than N words apart, where N is the value given in the AROUND(N) operator.
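The proximity rule can be illustrated with a small standalone sketch. This is not the GcPdfViewer implementation: `matchesAround` is a hypothetical helper, and it assumes single-word terms, case-insensitive matching, and that the terms must occur in the order given.

```javascript
// Hypothetical helper, not part of the GcPdfViewer API.
// Returns true if the terms occur in order with at most maxGap words
// between each consecutive pair of terms.
function matchesAround(text, terms, maxGap) {
  // Split the text into lowercase words (runs of letters and digits).
  const words = text.toLowerCase().match(/[\p{L}\p{N}]+/gu) || [];
  const wanted = terms.map((t) => t.toLowerCase());

  // Look for term k between word indices `from` and `limit` (inclusive).
  function search(k, from, limit) {
    if (k === wanted.length) return true;
    const last = Math.min(limit, words.length - 1);
    for (let i = from; i <= last; i++) {
      // The next term may start right after this one, or up to maxGap words later.
      if (words[i] === wanted[k] && search(k + 1, i + 1, i + 1 + maxGap)) {
        return true;
      }
    }
    return false;
  }

  return search(0, 0, words.length - 1);
}

const sample = "After graduation from Duke University in May 2015";
console.log(matchesAround(sample, ["graduation", "Duke", "2015"], 3)); // true
console.log(matchesAround(sample, ["graduation", "Duke", "2015"], 1)); // false
```

In the sample sentence, three words separate "Duke" from "2015", so the query matches with a gap of 3 but not with a gap of 1.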
Help | Demo (Open Search button from left panel)
Search for words starting with or ending with certain combinations of letters. Specify the query to search at the beginning or end of the word. The search will return words starting or ending with the search query.
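As a rough illustration of the matching rule only, the sketch below filters the words of a string by prefix or suffix. `findWords` and its `mode` parameter are hypothetical names, not viewer options, and case-insensitive matching is an assumption.

```javascript
// Hypothetical helper, not part of the GcPdfViewer API.
// mode is "startsWith" or "endsWith"; matching is case-insensitive (an assumption).
function findWords(text, query, mode) {
  // Split the text into words (runs of letters and digits).
  const words = text.match(/[\p{L}\p{N}]+/gu) || [];
  const q = query.toLowerCase();
  return words.filter((word) => {
    const lower = word.toLowerCase();
    return mode === "startsWith" ? lower.startsWith(q) : lower.endsWith(q);
  });
}

const sentence = "Wetlands protect wildlife and water";
console.log(findWords(sentence, "w", "startsWith")); // [ 'Wetlands', 'wildlife', 'water' ]
console.log(findWords(sentence, "s", "endsWith"));   // [ 'Wetlands' ]
```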
Help | Demo (Open Search button from left panel)
Specifying words with a wildcard is an advanced search technique that broadens the search results. Supported wildcards are "*", matching any number of characters, and "?", matching a single character once or zero times.
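One way to picture the two wildcards is as a translation into a regular expression. The sketch below is an illustration under stated assumptions, not the viewer's code: `wildcardToRegExp` is a hypothetical helper, wildcards are assumed to match non-space characters only, and matching is assumed to be case-insensitive.

```javascript
// Hypothetical helper, not part of the GcPdfViewer API.
// Converts a wildcard query into a global, case-insensitive RegExp.
function wildcardToRegExp(query) {
  const pattern = Array.from(query, (ch) => {
    if (ch === "*") return "\\S*"; // any number of non-space characters
    if (ch === "?") return "\\S?"; // a single non-space character, once or zero times
    // Escape characters that have a special meaning in regular expressions.
    return ch.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  }).join("");
  return new RegExp(pattern, "gi");
}

console.log("Wetlands and wet meadows".match(wildcardToRegExp("wet*"))); // [ 'Wetlands', 'wet' ]
console.log("color and colour".match(wildcardToRegExp("colo?r")));       // [ 'color', 'colour' ]
```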
Help | Demo (Open Search button from left panel)
GcPdfViewer can now highlight all search results at once using the Highlight All option. You can also change the default highlight color using the 'useCanvasForSelection' option:
var viewer = new GcPdfViewer('#root', {
  useCanvasForSelection: {
    selectionColor: 'rgba(0, 0, 195, 0.25)',
    highlightColor: 'rgba(255, 0, 0, 0.35)',
    inactiveHighlightColor: 'rgba(180, 0, 170, 0.35)'
  }
});
Visit Help | Demo (Open Search button from left panel)
Review PDF documents in collaboration with other members. The PDF Editor now includes the Text Annotation Comment/Reply tool to add review comments, user names, and status of the comment.
Features included in this support:
To add replies or to edit/remove existing replies, configure the SupportApi project and make sure it is connected to GrapeCity Documents for PDF (GcPdf) on the server.
If the annotation comment/reply tool is enabled without SupportApi, the tool works in read-only mode. Here’s a Demo.
For complex form design, reuse the same properties across form fields without manually setting them in each field. Copy and paste fields on the PDF form using the shortcut keys, or clone a field using the Clone button in the Properties panel. You can also copy, paste, or clone annotations in the PDF Editor.
GrapeCity Documents PDF Editor now includes snap lines and snap margins to check the alignment of form fields and annotations in relation to each other. This allows users to align two elements (fields/annotations) to the same location within the document while designing PDF forms. Other features include:
Customize the context menu of GcPdfViewer and search selected text with any search engine.
Visit Demo
We have also added functionality to the existing API to help you generate more secure Excel reports with digital signatures and convert them to other formats. The main highlights of this release are:
Add and edit shapes. Choose from a wide range of geometric shapes, shape presets, and theme-based shape styles. Change the shape of a company logo, or remove the borders of shapes in multiple places in a document. Modify the size, color, or fill type of shapes across several Word documents.
Many Word documents use shapes to emphasise ideas or highlight important points. The Word API supports the object model of the shapes supported in MS Word. That makes it possible to load and edit shapes and save Word documents without losing shapes or shape properties.
GrapeCity Documents for Word (GcWord) adds new API to the existing shapes support to create and manipulate shapes in Word documents.
Use the following features while working with shapes:
You can now add any shape supported in MS Word to a Word document through the GcWord API. You can also load any Word document containing shapes into the GcWord object model, modify the shape properties, and save the document back. The following shape types are supported:
Add a Shape Type with a GeometryType enum and set shape properties.
using GrapeCity.Documents.Word;

// Draw a yellow arc.
var doc = new GcWordDocument();
// A shape is created inside a Run element.
var run = doc.Body.Paragraphs.Add().GetRange().Runs.Add();
// Add an arc shape that is 200 wide and 300 high.
var shape = run.GetRange().Shapes.Add(200, 300, GeometryType.Arc);
// Give the shape outline a solid fill that uses a theme color.
shape.Line.Fill.Type = FillType.Solid;
shape.Line.Fill.SolidFill.ThemeColor = ThemeColorId.Accent4;
The GroupShape class in GcWord represents a group shape element in the body content. It helps set unified properties for child shapes and can be viewed as a "shape container."
Ink shapes help users draw lines or curves in a document. GcWord currently supports loading a Word file that contains ink shapes and modifying them through the Ink class. Providing the outer XML that defines an ink shape also makes it possible to create new ink shapes.
When a Word document consists of several shapes, adding a drawing canvas helps arrange the shapes in a container. Add a canvas shape with the Add and Insert methods of the CanvasShapeCollection class.
Visit Help
Create a picture shape using the Picture class in a Word document.
Add a picture to a Word document. Set the outline width, fill type, and location properties. Load Word documents with pictures, modify and save them. This allows modifications to several documents containing the same picture image.
Visit Help
Fill and line formats are essential to make illustrations complete. These properties help distinguish between various shapes.
Use the FillFormat class to set various types of fill formats like PatternFill, SolidFill, ImageFill, and GradientFill. The FillType enum defines the active fill type.
Similarly, the LineFormat class contains various properties that define the appearance of shape lines. FillFormat defines fill properties.
Note: If FillFormat or LineFormat is not defined on the shape, its values are taken from the ShapeStyle class (if present in the document).
Read more about the ShapeStyle class here.
GcWord supports 42 themed styles. Add these to any shape in a Word document. Use the ApplyThemedStyle method overloads to add themed styles to a shape. Themed styles are the easiest way to give a shape a consistent set of desired properties.
Sometimes presets are used more often than themed styles. Some of these styles are perfect for adding a transparent or semi-transparent shape with one click. GcWord supports about 29 shape preset types through the ApplyPreset method overloads.
Compare images for software testing, image manipulation detection, or even different frames of security footage. Perform fuzzy image comparisons and generate difference images. This sample provides an effective way of comparing images (full source code is provided, as with all demos).
Have a look at these samples:
Find Differences | Invisible Text | PNG vs JPEG | Font Hinting
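The idea behind a fuzzy comparison can be sketched on raw pixel data. The code below is a minimal illustration and not the API used by the samples: `fuzzyDiff` is a hypothetical helper that assumes two RGBA buffers of equal size and a per-channel tolerance, and it marks differing pixels in opaque red.

```javascript
// Hypothetical helper, not the API used by the samples.
// Compares two same-sized RGBA buffers and builds a difference image.
function fuzzyDiff(a, b, tolerance) {
  if (a.length !== b.length) {
    throw new Error("Images must have the same size");
  }
  // The difference image starts out as fully transparent black.
  const diff = new Uint8ClampedArray(a.length);
  let changed = 0;
  for (let i = 0; i < a.length; i += 4) {
    let differs = false;
    // A pixel differs if any channel is further apart than the tolerance.
    for (let c = 0; c < 4; c++) {
      if (Math.abs(a[i + c] - b[i + c]) > tolerance) differs = true;
    }
    if (differs) {
      diff[i] = 255;     // red channel
      diff[i + 3] = 255; // fully opaque
      changed++;
    }
  }
  return { diff, changed };
}

// Two pixels: the first differs only slightly, the second differs clearly.
const before = Uint8ClampedArray.from([10, 10, 10, 255, 200, 0, 0, 255]);
const after = Uint8ClampedArray.from([12, 9, 10, 255, 0, 0, 0, 255]);
console.log(fuzzyDiff(before, after, 5).changed); // 1
```

A larger tolerance hides small differences such as compression noise, which is why a PNG and a JPEG of the same picture can compare as nearly equal.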
What do you think about the new features in v3.2? Leave a comment below. Thanks!